Paper: Blind Speaker Counting in Highly Reverberant Environments by Clustering Coherence Features

In this work, we show how to count the number of people talking in a meeting scenario using only two microphones, like those commonly found on mobile devices. This paper has been accepted for publication and presentation at the 2017 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) in Kuala Lumpur, Malaysia.



An even faster, parallel, MATLAB executable (MEX) compilation of the PESQ measure

In a previous post of mine (which you should read now if you haven’t already) I explained how to create a MATLAB executable for the widely used PESQ algorithm. The main reason for doing this was to save time when running a large number of speech quality tests, and the speed increases obtained from using a PESQ MEX function were amazing! At the time of writing that post, the MEX function was approximately 8 times faster than anything else I could find online. In fact, it is still the fastest implementation I can find. However, I think there is room for improvement, and I finally found some time to get it working. In this post, I will show step-by-step how you can compile the PESQ MEX function to accept audio vectors directly from MATLAB, which should give great speed increases when run on parallel cores.


Paper: Active Speech Control using Wave-Domain Processing with a Linear Wall of Dipole Secondary Sources

Ever wondered if you could cancel someone's voice without the need for a physical wall or partition? In this work, presented at ICASSP 2017 in New Orleans, USA, we investigate the possibilities of cancelling speech over a loudspeaker wall. The method is not limited to speech. In fact, it works much better for periodic signals, as the non-stationarity of speech degrades the performance.

Figure: Scenario layout (ICASSP 2017)


Paper: Towards Real-Time Source Counting by Estimation of Coherent-to-Diffuse Ratios from Ad-Hoc Microphone Array Recordings

Blindly counting the number of speech sources (talkers) in a meeting room can be a difficult task. This paper, presented at HSCMA 2017 at the Google offices in San Francisco, shows how coherent-to-diffuse ratios could allow real-time source counting.

Figure: Example layout
Figure: Success rate


Paper: Reproducing Personal Sound Zones Using a Hybrid Synthesis of Dynamic and Parametric Loudspeakers

A hybrid loudspeaker system consisting of dynamic point sources and parametric loudspeaker models shows great results above the generally inevitable soundfield aliasing frequencies.
This work was presented at APSIPA ASC 2016 in Jeju, Korea.

Figure: Loudspeaker layout (APSIPA 2016)
Figure: L16 results (APSIPA 2016)
