Artificial neural networks make life easier for hearing aid users

Credit: Oticon

For people with hearing loss, it can be very difficult to understand and separate voices in noisy environments. This problem may soon be history thanks to a groundbreaking new algorithm that is designed to recognise and separate voices efficiently in unknown sound environments.

People with normal hearing are usually able to understand each other without effort when communicating in noisy environments. However, for people with hearing loss, it is very challenging to understand and separate voices in noisy environments, and a hearing aid may really help. But there's still some way to go when it comes to general sound processing in hearing aids, explains Morten Kolbæk:

"When the scenario is known in advance, as in certain clinical test setups, existing algorithms can already beat human performance when it comes to recognising and distinguishing speakers. However, in normal listening situations without any prior knowledge, the human auditory brain remains the best machine."

But this is exactly what Morten Kolbæk has worked on changing with his new algorithm.

"Because of its ability to function in unknown environments with unknown voices, the applicability of this algorithm is so much stronger than what we have seen with previous technology. It's an important step forward when it comes to solving challenging listening situations in everyday life," says one of Morten Kolbæk's two supervisors, Jesper Jensen, Senior Researcher at Oticon and Professor at the Centre for Acoustic Signal Processing Research (CASPR) at AAU.

Professor Zheng-Hua Tan, who is also affiliated with CASPR and is a supervisor of the project, agrees that the algorithm has major potential within sound research.

"The key to success for this algorithm is its ability to learn from data and then construct powerful statistical models that are able to represent complex listening situations. This leads to solutions that work very well even in new and unknown listening situations," explains Zheng-Hua Tan.

Noise reduction and speech separation

Specifically, Morten Kolbæk's Ph.D. project has dealt with two different but well-known listening scenarios.

The first track sets out to solve the challenges of one-to-one conversations in noisy spaces such as car cabins. Hearing aid users face such challenges on a regular basis.

"To solve them, we have developed algorithms that can amplify the sound of the speaker while reducing noise significantly without any prior knowledge about the listening situation. Current hearing aids are pre-programmed for a number of different situations, but in real life, the environment is constantly changing and requires a hearing aid that is able to read the specific situation instantly," explains Morten Kolbæk.

Demo of a single-microphone speech enhancement and separation system based on deep learning. The system is trained using utterance-level permutation invariant training (uPIT) and the system is speaker independent. That is, the speakers in the demo have not been “seen” by the system during training. Furthermore, the system is designed to handle up to three speakers and does not need knowledge about the number of speakers at test time. In other words, the system automatically identifies the number of speakers in the input. Credit: Oticon
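The idea behind utterance-level permutation invariant training can be illustrated with a minimal sketch. A separation system has no fixed rule for which output carries which speaker, so the training error is computed for every possible assignment of outputs to reference speakers over the whole utterance, and the lowest one is used. The sketch below makes several assumptions that are not from the article: it works on short lists of raw samples rather than the features a real system would use, it uses mean squared error, and the names mse and upit_loss are hypothetical.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def upit_loss(estimates, targets):
    """Return (loss, permutation) for the best output-to-speaker assignment.

    estimates[i] is the i-th output stream for the whole utterance and
    targets[j] is the j-th reference speaker. permutation[i] is the index
    of the target assigned to output i. One permutation is chosen for the
    entire utterance, not per frame.
    """
    n = len(targets)
    best_loss, best_perm = None, None
    for perm in permutations(range(n)):
        # Average error over all outputs under this assignment.
        loss = sum(mse(estimates[i], targets[perm[i]]) for i in range(n)) / n
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy data (assumed): two reference speakers, and system outputs that
# happen to come out in the opposite order.
targets = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
estimates = [[-0.9, -1.1, -1.0, -1.0], [1.0, 0.9, 1.1, 1.0]]

loss, perm = upit_loss(estimates, targets)
print(perm, loss)
```

In this toy case the swapped assignment gives the lowest error, so the system is not penalised for emitting the speakers in a different order than the references. Trying every permutation grows factorially with the number of speakers, which is manageable for the two or three speakers mentioned in the caption.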

The second track of the project revolves around speech separation. This scenario involves several speakers, and the hearing aid user may be interested in hearing some or all of them. The solution is an algorithm that can separate voices while reducing noise. This track can be considered an extension of the first track, but now with two or more voices.

"You can say that Morten figured out that by tweaking a few things here and there, the algorithm works with several unknown speakers in noisy environments. Both of Morten's research tracks are significant and have attracted a great deal of attention," says Jesper Jensen.

Deep neural networks

The method used in creating the algorithms is called "deep learning," which falls under the machine learning category. More specifically, Morten Kolbæk has worked with deep neural networks, a type of algorithm that you train by feeding it examples of the signals it will encounter in the real world.

"If, for instance, we talk about speech-in-noise, you provide the algorithm with an example of a voice in a noisy environment and one of the voice without any noise. In this way, the algorithm learns how to process the noisy signal in order to achieve a clear voice signal. You feed the network with thousands of examples, and during this process, it will learn how to process a given voice in a realistic environment," Jesper Jensen explains.
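The training principle described in the quote can be sketched in a few lines. This is only an illustration under stated assumptions: a five-tap linear filter stands in for the deep neural network, sine waves stand in for voices, and Gaussian noise stands in for a noisy environment. None of the names or parameter values (make_pair, apply_filter, train, TAPS, the learning rate) come from the project itself.

```python
import math
import random

random.seed(0)
TAPS = 5          # filter length (assumed for illustration)
HALF = TAPS // 2  # the output is aligned with the centre tap


def make_pair(n=200):
    """Return (noisy, clean): a sine 'voice' with and without added noise."""
    freq = random.uniform(0.01, 0.03)
    phase = random.uniform(0.0, 2.0 * math.pi)
    clean = [math.sin(2.0 * math.pi * freq * t + phase) for t in range(n)]
    noisy = [c + random.gauss(0.0, 0.5) for c in clean]
    return noisy, clean


def apply_filter(w, noisy):
    """Filter the noisy signal; output t is built from noisy[t : t + TAPS]."""
    return [sum(w[k] * noisy[t + k] for k in range(TAPS))
            for t in range(len(noisy) - TAPS + 1)]


def mse(w, pairs):
    """Mean squared error between filtered noisy signals and clean targets."""
    total, count = 0.0, 0
    for noisy, clean in pairs:
        for t, y in enumerate(apply_filter(w, noisy)):
            total += (y - clean[t + HALF]) ** 2
            count += 1
    return total / count


def train(pairs, epochs=30, lr=0.05):
    """Gradient descent on the squared error, one example pair per step."""
    w = [0.0] * TAPS
    w[HALF] = 1.0  # start as a pass-through: output equals noisy input
    for _ in range(epochs):
        for noisy, clean in pairs:
            out = apply_filter(w, noisy)
            grad = [0.0] * TAPS
            for t, y in enumerate(out):
                err = y - clean[t + HALF]
                for k in range(TAPS):
                    grad[k] += 2.0 * err * noisy[t + k]
            w = [wk - lr * g / len(out) for wk, g in zip(w, grad)]
    return w


train_pairs = [make_pair() for _ in range(20)]
test_pairs = [make_pair() for _ in range(5)]  # signals not seen in training

passthrough = [0.0] * TAPS
passthrough[HALF] = 1.0
before = mse(passthrough, test_pairs)
after = mse(train(train_pairs), test_pairs)
print(f"error before training: {before:.3f}, after: {after:.3f}")
```

The error is measured on pairs that were not used for training, which mirrors the point made in the article: what matters is how well the learned processing carries over to signals the system has not encountered before. A deep network replaces the single filter with many layers, but the loop of comparing the output with the clean example and adjusting the parameters is the same in spirit.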

"The power of deep learning comes from its hierarchical structure that is capable of transforming noisy or mixed voice signals into clean or separated voices through layer-by-layer processing. The widespread use of deep learning today is due to three major factors: ever-increasing computation power, increasing amount of big data for training algorithms and novel methods for training deep neural networks," says Zheng-Hua Tan.

A computer behind the ear

Developing the algorithm is one thing; making it work in an actual hearing aid is another. Currently, Morten Kolbæk's algorithm for speech separation only runs on hardware much larger than a hearing aid.

"When it comes to hearing aids, the challenge is always to make the technology work on a small computer behind the ear. And right now, Morten's algorithm requires too much space for this. Even if Morten's algorithm can separate several unknown voices from each other, it isn't able to choose which voice to present to the hearing aid user. So there are some practical issues that we need to solve before we can introduce it in a hearing aid solution. However, the most important thing is that these issues now seem solvable."

The cocktail party phenomenon

People with normal hearing are often capable of focusing on one speaker of interest, even in acoustically difficult situations where other people are speaking simultaneously. Known as the cocktail party phenomenon, this ability has given rise to a very active research area on how the human brain solves the problem so well. With this Ph.D. project, we're one step closer to solving this problem, Jesper Jensen explains:

"You sometimes hear that the cocktail party problem has been solved. This is not yet the case. If the environment and voices are completely unknown, which is often the case in the real world, current technology simply cannot match the human brain, which works extremely well in unknown environments. But Morten's algorithm is a major step toward getting machines to function and help people with normal hearing and those with hearing loss in such environments," he says.


More information: Single-Microphone Speech Enhancement and Separation Using Deep Learning. arxiv.org/abs/1808.10620