In this work we show how to count the number of people talking in a meeting using only two microphones, such as those commonly found on mobile devices. This paper has been accepted for publication and presentation at the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) in Kuala Lumpur, Malaysia.
S. Pasha, J. Donley and C. Ritz, “Blind Speaker Counting in Highly Reverberant Environments by Clustering Coherence Features,” presented at the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1-4, 2017.
This paper proposes the use of the frequency domain Magnitude Squared Coherence (MSC) between two ad-hoc recordings of speech as a reliable speaker discrimination feature for source counting applications in highly reverberant environments. The proposed source counting method does not require knowledge of the microphone spacing and does not assume any relative distance between the sources and the microphones. Source counting is based on clustering the frequency domain MSC of the speech signals derived from short time segments. Experiments show that the frequency domain MSC is speaker-dependent and the method was successfully used to obtain highly accurate source counting results for up to six active speakers for varying levels of reverberation and microphone spacing.
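To make the feature in the abstract concrete: the magnitude squared coherence between two signals x and y is conventionally defined as |Pxy(f)|^2 / (Pxx(f) Pyy(f)), where Pxy is the cross-spectral density and Pxx, Pyy are the power spectral densities, so it lies between 0 and 1 at every frequency. The following is a minimal sketch, not the paper's implementation, of how one MSC vector per short time segment might be computed from a two-channel recording using SciPy's Welch-based estimator. The function name `msc_features`, the 1 s segment length, the 256-sample window and the synthetic noise signals are all assumptions made for illustration. The clustering step is left out because the abstract does not say which algorithm is used.

```python
import numpy as np
from scipy.signal import coherence


def msc_features(ch1, ch2, fs, segment_s=1.0, nperseg=256):
    """Return (freqs, features): one MSC vector per short time segment.

    segment_s and nperseg are illustrative defaults, not values from the paper.
    """
    seg_len = int(segment_s * fs)
    n_segments = min(len(ch1), len(ch2)) // seg_len
    if n_segments == 0:
        raise ValueError("recording is shorter than one segment")
    features = []
    for i in range(n_segments):
        start, stop = i * seg_len, (i + 1) * seg_len
        # Welch-averaged magnitude squared coherence for this segment
        freqs, msc = coherence(ch1[start:stop], ch2[start:stop],
                               fs=fs, nperseg=nperseg)
        features.append(msc)
    return freqs, np.array(features)


# Synthetic stand-in for two microphone signals (assumption: real speech
# recordings would be loaded here instead).
rng = np.random.default_rng(0)
fs = 16000
source = rng.standard_normal(4 * fs)
mic1 = source + 0.1 * rng.standard_normal(source.size)
mic2 = np.roll(source, 5) + 0.1 * rng.standard_normal(source.size)

freqs, feats = msc_features(mic1, mic2, fs)
print(feats.shape)  # (number of segments, number of frequency bins)
```

Each row of `feats` could then be passed to a clustering algorithm of choice, with the number of clusters found serving as the speaker count estimate.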