Microsoft’s AI group debuts customizable speech-to-text technology, rapidly expanding ‘cognitive services’ for … – GeekWire

Microsofts Artificial Intelligence and Research Group, a major new engineering and research division formed last year inside the Redmond company, is debutinganew technology that lets developers customizeMicrosofts speech-to-text engine for use in their own apps and online services.

Thenew Custom Speech Service is set for releasetoday asa public preview. Microsoft says itletsdevelopers upload a unique vocabulary such as alien names in Human Interacts VR game Starship Commander to produce a sophisticated language model for recognizing voice commands and other speech from users.

Its the latest in a series of cognitive services from Microsofts Artificial Intelligence and Research Group, a 5,000-person division led by Microsoft Research chief Harry Shum. The company says it has expandedfrom four to 25 cognitive services in the last two years, including 19 in preview and six that are generally available.

The company says it will bring two more cognitive services,Content ModeratorandBing Speech API, out of preview and make them generally available next month. Content Moderator analyze images and videowith technology including optical-character and objectrecognition, helping companiesfilter out unwanted content. The Bing Speech API converts audio into text, interprets the intent of the language and converts text back to speech.

Microsoft formed the group to accelerate its artificial intelligenceadvances, aiming get more of its technologies out of the labs and into itsown products as well as its services for third partydevelopers. TheAI and Research Groupalso includes MicrosoftsCortana voice-based assistant and Bing search engine.

The company is competing against rivals including Amazon, Google and others in the booming field of artificial intelligence. AI and machine learningare increasingly becoming integralparts of their cloud platforms, as well.

Microsoftsnew Custom Speech Servicealso includesan acoustic model that cancels out background noise to improvespeech recognition. Microsoft citedthe example of using Custom Speech Service at anairport kiosk where the environmental noise would otherwise make speech recognitionvery difficult.

The combination of a language model and this acoustic model in a single API that is customizable for your vocabulary is truly unique in the market, said Irving Kwong, group program manager, in an interview. In going from a private preview to a public preview, the servicewill be able to take on tens of thousands of new customers.

See the original post:

Microsoft's AI group debuts customizable speech-to-text technology, rapidly expanding 'cognitive services' for ... - GeekWire

Related Posts

Comments are closed.