Tuesday, February 22, 2011

Speech technology at Google: teaching machines to talk and listen

from Official Google Blog 

This is the latest post in our series profiling entrepreneurial Googlers working on products across the company and around the world. Here, you’ll get a behind-the-scenes look at how one Googler built an entire R&D team around voice technology that has gone on to power products like YouTube transcriptions and Voice Search. - Ed.

When I first interviewed at Google during the summer of 2004, mobile was just making its way onto the company’s radar. My passion was speech technology, the field in which I’d already worked for 20 years. After 10 years of speech research at SRI, followed by 10 years helping build Nuance Communications, the company I co-founded in 1994, I was ready for a new challenge. I felt that mobile was an area ripe for innovation, with a need for speech technology, and destined to be a key platform for delivery of services.

During my interview, I shared my desire to pursue the mobile space and mentioned that if Google didn’t have any big plans for mobile, then I probably wouldn’t be a good fit for the company. Well, I got the job, and I started soon after, without a team or even a defined role. In classic Google fashion, I was encouraged to explore the company, learn about what various teams were working on and figure out what was needed.

After a few months, I presented an idea to senior management to build a telephone-based spoken interface to local search. Although there was a diversity of opinion at the meeting about what applications made the most sense for Google, all agreed that I should start to build a team focused on speech technology. With help from a couple of Google colleagues who also had speech backgrounds, I began recruiting, and within a few months people were busily building our own speech recognition system.

Six years later, I’m excited by how far we’ve come and, in turn, how our long-term goals have expanded. When I started, I had to sell other teams on the value of speech technology to Google's mission. Now, I’m constantly approached by other teams with ideas and needs for speech. The biggest challenge is scaling our effort to meet the opportunities. We've advanced from GOOG-411, our first speech-driven service, to Voice Search, Voice Input, Voice Actions, a Voice API for Android developers, automatic captioning of YouTube videos, automatic transcription of voicemail for Google Voice and speech-to-speech translation, amongst others. In the past year alone, we’ve ported our technology to more than 20 languages.

Speech technology requires an enormous amount of data to feed our statistical models and lots of computing power to train our systems—and Google is the ideal place to pursue such technical approaches. With large amounts of data, computing power and an infrastructure focused on supporting large-scale services, we’re encouraged to launch quickly and iterate based on real-time feedback.

I’ve been exploring speech technology for nearly three decades, yet I see huge potential for further innovation. We envision a comprehensive interface for voice and text communication that defies all barriers of modality and language and makes information truly universally accessible. And it’s here at Google that I think we have the best chance to make this future a reality.

Update 9:39 PM: Changed title of post to clarify that speech technology is not only used on mobile phones but also for transcription tasks like YouTube captioning and voicemail transcription. - Ed.

Posted by Mike Cohen, Manager, Speech Technology