The Science behind Empathic

Expressions of emotion can include sighs, grunts, shouts and cries. These emotional sounds come from an area of the brain called the limbic system. Fluent speech and language processing, on the other hand, come from a different set of brain structures that includes Broca’s area and Wernicke’s area. Empathic is the first app to interpret these emotional sounds from non-verbal vocalisations alone.

A variety of products help to detect emotions in fluent speech; this is called ‘sentiment analysis’. It is typically based on analysing voice recordings from people who are speaking fluently in specific languages. However, this approach focuses on interpreting words; it does not recognise non-verbal vocalisations. Empathic is different: our AI is trained only on voice recordings of non-verbal people in real-world settings. Since 2019 we have been working with individuals who use fewer than 20 reliable, functional words, along with their families, service providers, and health care and home care professionals. After meeting our research team, these volunteers painstakingly record and tag each emotion as the non-verbal person vocalises. Over the years we have measured the acoustic characteristics common to each emotion and identified meaningful patterns. When we had gathered enough data, we began building, evaluating and testing AI models to deliver the most accurate interpretations. So far we can identify 10 emotions by analysing just a few seconds of a non-verbal person’s voice. This number is set to increase as Empathic’s AI continues to learn.
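As a rough illustration of this kind of pipeline, the sketch below trains a simple classifier on short, labelled audio clips and predicts an emotion for a new recording. It is a minimal example only, assuming hypothetical file names and labels, generic MFCC features and an off-the-shelf classifier (librosa and scikit-learn); it is not Empathic’s actual model, feature set or data.

    # Minimal sketch: classify an emotion from a few seconds of audio.
    # Assumptions: hypothetical clip file names and labels; MFCC features and a
    # random forest stand in for whatever models Empathic actually uses.
    import numpy as np
    import librosa
    from sklearn.ensemble import RandomForestClassifier

    def extract_features(path, duration=3.0, sr=16000):
        """Summarise a short clip as a fixed-length acoustic feature vector."""
        audio, _ = librosa.load(path, sr=sr, duration=duration)
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
        # Averaging over time yields one vector per clip, whatever its length.
        return np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])

    # Tagged recordings of non-verbal vocalisations (paths and labels are illustrative).
    training_clips = [("clip_001.wav", "happy"), ("clip_002.wav", "confused"),
                      ("clip_003.wav", "excited")]

    X = np.array([extract_features(path) for path, _ in training_clips])
    y = np.array([label for _, label in training_clips])

    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X, y)

    # Interpret a new vocalisation from just a few seconds of voice.
    print(model.predict(extract_features("new_clip.wav").reshape(1, -1))[0])

In practice, far more tagged clips and richer acoustic features would be needed, and the 10-emotion label set described above would replace the three labels shown here.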


We have found that it does not matter why a person is non-verbal, whether because of severe autism, dementia, an acquired brain injury or another condition. In almost every case, their voice carries a set of characteristics that can be identified as specific emotions such as ‘happy’, ‘excited’ and ‘confused’. We also found these vocal patterns to be similar across the languages we tested.

We are preparing research papers for publication, and our findings are consistent with a great deal of published work in this field. However, the majority of publications focus on fluent speech. Some papers do address non-verbal vocalisations, but our data sets are many times larger than those used in most of them. Finally, it is unusual to find data derived from non-verbal vocalisations gathered in real-world settings. To learn more about this area, please see the “Further Reading” section below.

We would like to thank the universities, health care professionals and non-verbal individuals and their families who have participated in, and continue to contribute to, Empathic’s research and development.

* “Non-verbal” is also known as ‘minimally verbal’, ‘minimally speaking’ and ‘non-speaking’, among other accepted terms. For the sake of clarity, we use the term ‘non-verbal’ to refer to people who can use fewer than 20 functional words, i.e. words they choose to say that others can understand.

Further Reading
    • Huang, Kun-Yi, et al. (2019). Speech Emotion Recognition Using Deep Neural Network Considering Verbal and Nonverbal Speech Sounds. In: ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, UK, pp. 5866-5870.