BUT researchers develop device for recognizing speaker’s age, sex and emotional state


18. 10. 2012

Identifying a speaker, analysing their speech, or working out their age, sex or emotional state has traditionally meant listening to the speaker directly, a method that is generally regarded as reliable. However, it is extremely inefficient in terms of time and cost, since a recording cannot be listened to any faster than it was spoken.

Each minute of speech requires one minute of listening. The subsequent analysis must also be accounted for, and it often consumes considerable time and resources.

According to the researchers' technical description and a survey of equipment currently on the market, existing technology cannot yet determine a speaker's age, sex and emotional state from voice analysis as reliably as direct listening can. Devices based on simplified principles of acoustic signal analysis do exist. However, those that simulate automatic emotion detection give results of limited relevance, cannot analyse speech in multiple languages, and have no capacity to determine the speaker's sex or age.

These drawbacks are largely eliminated by an invention developed by BUT researchers at the Faculty of Electrical Engineering and Communication. The device records voices (acoustic information) using a recording unit. A computing unit then analyses the recordings by comparing them with voice patterns stored in the basic memory unit. The result is probabilistic information on the emotional state, sex and age of the speaker, derived from the recorded voice. Another clear advantage of the solution is that it can process one hundred per cent of the traffic on all lines in real time, whether they use IP, digital or analogue technology.
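The comparison step described above can be illustrated with a minimal sketch. It is not the researchers' method: the article does not say which acoustic features or matching algorithm the device uses. The sketch assumes, purely for illustration, that each recording is reduced to a short feature vector, that the memory unit holds one reference vector per category, and that similarity is turned into probabilities with a softmax over negative Euclidean distances. All names and numbers (`PATTERNS`, `classify`, the feature values) are hypothetical.

```python
import math

# Hypothetical stored voice patterns: label -> feature vector.
# The values are invented; the article does not describe the real features.
PATTERNS = {
    "calm": [0.2, 0.1, 0.7],
    "angry": [0.9, 0.8, 0.1],
    "sad": [0.1, 0.3, 0.2],
}


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, patterns=PATTERNS):
    """Return label -> probability; closer stored patterns score higher.

    Assumption: a softmax over negative distances stands in for whatever
    probabilistic model the actual device uses.
    """
    scores = {label: -euclidean(features, vec) for label, vec in patterns.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# Features of a hypothetical recorded utterance.
probs = classify([0.85, 0.75, 0.15])
best = max(probs, key=probs.get)
```

Here `probs` maps each category to a probability and `best` names the most likely one. The same comparison could be run against separate pattern sets for sex and age, which is one way to read the "probabilistic information" the article mentions.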

The device currently exists as a functional model, and no further research or development is planned at present. Nevertheless, likely licensees include companies that automate the voice analysis of phone calls, call centre operators, helplines and emergency call lines. The device is protected as a utility model registered with the IPO CZ.
