What is Speech Recognition & Synthesis from Google?

Speech Recognition & Synthesis from Google is an application that provides text-to-speech and speech-to-text functionality for Android devices. It lets applications read on-screen text aloud in many languages and convert the user's voice to text. The app is powered by WaveNet, software created by DeepMind, Google's UK-based AI subsidiary, which Google bought in 2014. Google uses WaveNet to set its speech products apart from those of competitors such as Amazon and Microsoft. DeepMind's voice synthesis technology is notably advanced and realistic: rather than relying on concatenative synthesis, which pieces together pre-recorded fragments such as individual phonemes to form words and sentences, WaveNet is a neural network that generates the raw audio waveform one sample at a time.
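As a rough illustration of what "one sample at a time" means, here is a toy sketch in Python. It is not WaveNet: WaveNet uses a deep neural network to predict each new audio sample from the samples before it, whereas this sketch swaps in a fixed two-term formula, purely as an assumption for the sake of a small example. The function name `generate_autoregressive` and its parameters are hypothetical.

```python
import math


def generate_autoregressive(seed, coeffs, n_samples):
    """Generate samples one by one; each new sample is computed
    from the most recent previous samples (a toy linear rule)."""
    samples = list(seed)
    order = len(coeffs)
    for _ in range(n_samples):
        # Most recent sample first, so coeffs[0] pairs with x[t-1].
        history = samples[-1:-order - 1:-1]
        samples.append(sum(c * x for c, x in zip(coeffs, history)))
    return samples


# Toy rule: x[t] = 2*cos(omega)*x[t-1] - x[t-2], which continues a sine wave.
omega = 2 * math.pi / 16              # one cycle every 16 samples
seed = [0.0, math.sin(omega)]         # first two samples of sin(omega * t)
coeffs = [2 * math.cos(omega), -1.0]
wave = generate_autoregressive(seed, coeffs, 30)
```

Every value in `wave` after the two seed samples is produced only from the values generated before it. A real neural model follows the same loop, but replaces the fixed formula with a learned prediction.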

Google Cloud Text-to-Speech is another Google service. It converts text into natural-sounding speech through an API powered by Google's machine learning technology. It offers over 220 voices across 40+ languages and variants, so communication can be personalized to each user's preferred voice and language.
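To give a feel for how the API is used, the sketch below builds the JSON body for the service's REST `text:synthesize` method and decodes the base64 audio that comes back in the `audioContent` field. It is a minimal sketch, not a complete client: it does not send the request, authentication is left out, and the helper names `build_synthesize_request` and `decode_audio` are hypothetical.

```python
import base64
import json

# REST endpoint for the v1 synthesize method (shown for reference; not called here).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, language_code="en-US",
                             voice_name=None, audio_encoding="MP3"):
    """Build the JSON body: the text, the voice selection, and the audio format."""
    voice = {"languageCode": language_code}
    if voice_name:
        # Optional: a specific voice name from the service's voice list.
        voice["name"] = voice_name
    return {
        "input": {"text": text},
        "voice": voice,
        "audioConfig": {"audioEncoding": audio_encoding},
    }


def decode_audio(response_json):
    """The response carries the audio as base64 text in 'audioContent'."""
    return base64.b64decode(response_json["audioContent"])


body = build_synthesize_request("Hello, world", language_code="en-GB")
print(json.dumps(body, indent=2))
```

Choosing a different `language_code` or voice name is how an application would match the speech to a user's preference. Sending `body` to the endpoint requires a Google Cloud project and credentials, which are outside this sketch.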

In addition, Google is testing an Android app called Project Relate, which provides voice transcription and synthesis for people with speech impairments. The app grew out of Project Euphonia, which was spearheaded by Google research scientist Dimitri Kanevsky, who has impaired speech himself and brought firsthand knowledge to the AI-based solution. To enable these capabilities, Google researchers built a database of over a million speech samples recorded by volunteers and used it to train the baseline speech-recognition AI.