Fujitsu develops world's first wearable, hands-free speech translation device

Figure 1: The newly developed wearable, hands-free speech translation device. Credit: Fujitsu

Fujitsu Laboratories today announced the development of the world's first wearable, hands-free speech translation device, suitable for tasks in which users' hands are often occupied, such as diagnosis or treatment in healthcare.

In recent years, with an increase in the number of visitors to Japan, more and more non-Japanese patients are going to hospitals, creating issues in supporting communication in multiple languages. In 2016, Fujitsu Laboratories developed hands-free technology that recognizes people's voices and the locations of speakers, and that automatically changes to the appropriate language without physical manipulation of the device. That same year, it also worked with the University of Tokyo Hospital and the National Institute of Information and Communications Technology (NICT) to conduct a field trial of multilingual speech translation in the medical field using stationary-type tablets. Based on the results, Fujitsu Laboratories learned that, as there are many situations in which healthcare providers have their hands full, such as when providing care in a hospital ward, there was a great need for a wearable speech translation device that could be used without being physically touched.

To expand the usability of multilingual speech translation, Fujitsu Laboratories has developed the world's first compact, wearable, hands-free speech translation device, built on technology that differentiates speakers using small omnidirectional microphones. This was made possible by modifying the shape of the sound channel and by improving the accuracy of speech detection technology so that it is highly resistant to background noise. Use of this device is expected to reduce the burden on healthcare providers whose hands are often occupied by other tasks.

Fujitsu Laboratories will evaluate the effectiveness of these newly developed translation devices in healthcare situations as part of a multilingual speech translation clinical trial being carried out jointly with Fujitsu Limited, the University of Tokyo Hospital, and NICT, with the new devices being deployed in November 2017.

Development Background

With the increase in the number of visitors to Japan in recent years, there has been demand for the commercialization of a multilingual speech translation system that helps to overcome communication problems. The Multilingual Speech Translation Technology Promotion Consortium has been conducting a variety of R&D and carrying out trials in various fields on the basis of the "Promotion of Global Communications Plan: Research, Development, and Social Demonstration of Multilingual Speech Translation Technology - (I. Research & Development of Multilingual Speech Translation Technology) Basic Plan" from the Ministry of Internal Affairs and Communications.

In 2016, Fujitsu Laboratories developed hands-free technology that recognizes people's voices and the locations of speakers, and that automatically changes to the appropriate language without physically touching the device. That same year, it also worked with the University of Tokyo Hospital and NICT to conduct a field trial of multilingual speech translation in the medical field using stationary-type tablets. As a result, it learned that healthcare providers do not just speak with patients in set locations, such as reception desks and diagnostic rooms, but also in a variety of situations when providing care throughout a hospital ward, leading to significant demand for a wearable speech translation device that could be used without physical manipulation.

Issues

The hands-free speech translation technology developed in 2016 ran on tablets and used an external directional microphone to identify the direction of the speaker. Creating a wearable speech translation device, however, required developing a miniature directional microphone.

In addition, healthcare situations involve a great deal of background noise, such as the sounds of air conditioners and diagnostic devices, which lowered speech detection accuracy when the healthcare provider was far from the patient.
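The effect of distance on detection can be illustrated with a rough signal-to-noise calculation. The sketch below is not Fujitsu's method: it assumes an idealized free field in which speech level falls by about 6 dB for each doubling of distance (the inverse-square law), and the speech level of 65 dB at 1 m and the noise level of 60 dB are hypothetical values chosen only for illustration. Real rooms are reverberant, so the actual drop is usually smaller.

```python
import math


def speech_level_db(level_at_1m_db, distance_m):
    """Speech level at the microphone under an idealized inverse-square law.

    Assumption: free-field propagation from a point source, so the level
    drops by 20*log10(distance) relative to the level measured at 1 m.
    """
    return level_at_1m_db - 20.0 * math.log10(distance_m)


def snr_db(level_at_1m_db, distance_m, noise_db):
    """Signal-to-noise ratio in dB for a given speaker distance."""
    return speech_level_db(level_at_1m_db, distance_m) - noise_db


if __name__ == "__main__":
    # Hypothetical values: 65 dB speech at 1 m, 60 dB ambient noise.
    for distance in (0.4, 0.8, 1.6):
        print(f"{distance:.1f} m: SNR = {snr_db(65.0, distance, 60.0):+.1f} dB")
```

Under these assumptions the signal-to-noise ratio shrinks steadily as the speaker moves away, which is why a detector tuned for close-range speech can struggle at longer range.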

Figure 2: Usage scenario for the wearable, hands-free speech translation device and relationship to directivity. Credit: Fujitsu

About the Newly Developed Technology

Now, Fujitsu Laboratories has developed the world's first wearable, hands-free speech translation device that can be used in a variety of situations, including healthcare environments (Figure 1, Figure 2). Features of the technology are as follows:

1. Miniaturization through sound channel configuration utilizing sound diffraction and miniature omnidirectional microphones

Fujitsu Laboratories miniaturized the device by combining miniature omnidirectional microphones with an L-shaped sound channel that enhances directivity toward the target direction by dampening sound from other directions. As shown in Figure 2, sounds from the direction of the healthcare provider are diffracted once, while sounds from other directions are diffracted twice. Because sound is dampened each time it is diffracted, sounds from other directions are attenuated more strongly, which enhances directivity toward the healthcare provider.

2. Improved speech detection accuracy

Fujitsu Laboratories adopted a high-sensitivity microphone element for the patient's direction (outward-facing), increasing the recording levels for the patient's voice. In addition, it suppressed ambient noise, such as from air conditioners and diagnostic devices, through the use of noise suppression technology.

3. Structure and unit design for ease of use in healthcare situations

In developing this wearable, hands-free speech translation technology, Fujitsu Laboratories miniaturized and optimized the sound channel configuration with ease of use in healthcare situations in mind, drawing on miniaturization and weight reduction techniques that Fujitsu Connected Technologies Limited developed for smartphones and other mobile phones. Fujitsu Laboratories chose a hanging name-badge form factor that leaves both of the healthcare provider's hands free, with button icons, shapes, and markings that enable intuitive operation, and a rounded shape that gives a pleasant and unobtrusive impression to both the healthcare provider and the patient.
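The directivity idea in feature 1 can be sketched with a toy model. The announcement does not give attenuation figures or describe how speakers are told apart, so everything below is an assumption for illustration: each diffraction is modeled as a fixed loss (a hypothetical 6 dB), and the speaker is guessed by comparing the levels at a provider-facing and a patient-facing microphone. The function names are hypothetical as well.

```python
def level_after_channel_db(source_db, diffractions, loss_per_diffraction_db=6.0):
    """Level reaching the microphone after a number of diffractions.

    Assumption: every diffraction removes the same fixed number of dB.
    The 6 dB default is a placeholder, not a measured value.
    """
    return source_db - diffractions * loss_per_diffraction_db


def directivity_advantage_db(loss_per_diffraction_db=6.0):
    """Level difference between on-target and off-target sound.

    On-target sound (healthcare provider) is diffracted once, off-target
    sound twice, as described for the L-shaped sound channel.
    """
    on_target = level_after_channel_db(0.0, 1, loss_per_diffraction_db)
    off_target = level_after_channel_db(0.0, 2, loss_per_diffraction_db)
    return on_target - off_target


def likely_speaker(provider_mic_db, patient_mic_db):
    """Guess who is speaking from the louder of two microphone channels.

    Hypothetical decision rule; the actual device logic is not published
    in the announcement.
    """
    return "provider" if provider_mic_db >= patient_mic_db else "patient"


if __name__ == "__main__":
    print("Directivity advantage (dB):", directivity_advantage_db())
    print("Likely speaker:", likely_speaker(-20.0, -30.0))
```

In this simplified model the directivity advantage equals the loss contributed by the one extra diffraction, which matches the qualitative explanation above: the more strongly each bend dampens sound, the more the channel favors the target direction.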

Effects

With this newly developed technology, Fujitsu Laboratories achieved a speech detection accuracy of 95% at a distance of about 80 cm, a natural distance for a face-to-face conversation between a healthcare provider and a patient, in an environment with noise levels comparable to an examination room in a large hospital (about 60 decibels). By freeing up healthcare providers' hands during tasks that often require both hands, such as providing care in a ward, the device reduces the burden of using speech translation.
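The announcement does not say how speech was detected or how the 95% figure was measured. As a rough illustration only, the sketch below uses a simple energy-threshold detector as a stand-in and assumes that accuracy means the fraction of audio frames classified correctly. The frame length, margin, and synthetic test signal are all hypothetical.

```python
import math


def tone(amplitude, n_samples, cycles_per_sample=0.05):
    """Synthetic sine wave standing in for recorded audio."""
    return [amplitude * math.sin(2 * math.pi * cycles_per_sample * i)
            for i in range(n_samples)]


def frame_levels_db(samples, frame_len):
    """RMS level in dB (relative to full scale) of each complete frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        levels.append(20.0 * math.log10(max(rms, 1e-12)))
    return levels


def detect_speech(levels_db, noise_floor_db, margin_db=6.0):
    """Mark a frame as speech when it exceeds the noise floor by a margin."""
    return [level > noise_floor_db + margin_db for level in levels_db]


def detection_accuracy(predicted, truth):
    """Fraction of frames whose speech/non-speech label is correct."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


def demo():
    """Quiet frame, loud frame, quiet frame: only the middle one is 'speech'."""
    frame_len = 160
    signal = tone(0.01, frame_len) + tone(0.1, frame_len) + tone(0.01, frame_len)
    truth = [False, True, False]
    levels = frame_levels_db(signal, frame_len)
    predicted = detect_speech(levels, noise_floor_db=min(levels))
    return detection_accuracy(predicted, truth)


if __name__ == "__main__":
    print("Frame-level detection accuracy:", demo())
```

A detector this simple would degrade quickly in real hospital noise, which is one reason practical systems pair detection with noise suppression, as the device does.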

Fujitsu Laboratories will be carrying out clinical trials in healthcare institutions across Japan, including the University of Tokyo Hospital, beginning in November 2017, using both this newly developed wearable, hands-free speech translation device and a speech translation system that supports accurate translations between Japanese and English or Chinese in healthcare situations, developed by NICT. In addition, based on the results of these clinical trials, the number of supported languages and the scope of usage will be expanded.

Going forward, Fujitsu Laboratories aims to expand speech translation systems using this technology to a variety of fields, such as assisting guests in tourism and supporting public services provided by local governments, with the goal of commercialization in fiscal 2018.
