Microsoft presents speech recognition breakthrough

Have you ever been to Interspeech, the annual Conference of the International Speech Communication Association, held this year in beautiful Florence, Italy?

And you call yourself a hardcore speech technology nut?! Tell me if these PDFs make sense to you.

Well, it’s good to know that while some companies are busy buying up smaller competitors, there are brainiacs all over the world who actually fawn over speech recognition. Let’s thank Microsoft for pouring billions of dollars into its Research arms (yes, it has R&D facilities around the globe) so that we may one day face a Terminator who won’t mix up “Go fetch me a beer!” with “Gopher meets a deer!” and blow us into pieces because it thinks we are too dumb.

Microsoft researchers are improving large vocabulary speech recognition by enhancing neural network models of “senones” (so cutting edge that even Wikipedia doesn’t offer an explanation):

Earlier work on DNNs had used phonemes. The research took a leap forward when Yu, after discussions with principal researcher Li Deng and Alex Acero, principal researcher and manager of the Speech group, proposed modeling the thousands of senones, much smaller acoustic-model building blocks, directly with DNNs. The resulting paper, Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition by Dahl, Yu, Deng, and Acero, describes the first hybrid context-dependent DNN-HMM (CD-DNN-HMM) model applied successfully to large-vocabulary speech-recognition problems.

“Others have tried context-dependent ANN models,” Yu observes, “using different architectural approaches that did not perform as well. It was an amazing moment when we suddenly saw a big jump in accuracy when working on voice-based Internet search. We realized that by modeling senones directly using DNNs, we had managed to outperform state-of-the-art conventional CD-GMM-HMM large-vocabulary speech-recognition systems by a relative error reduction of more than 16 percent. This is extremely significant when you consider that speech recognition has been an active research area for more than five decades.”
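For anyone wondering what a "relative error reduction" means: it is the share of the baseline system's errors that the new system eliminates, not the drop in percentage points. Here is a minimal Python sketch of the arithmetic. The function name and the error rates are made up for illustration and are not figures from the Microsoft paper.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error rate that the new system removes."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates, chosen only to illustrate the arithmetic.
baseline = 0.30  # made-up error rate for a conventional baseline system
improved = 0.25  # made-up error rate for an improved system

rer = relative_error_reduction(baseline, improved)
print(f"absolute reduction: {(baseline - improved) * 100:.1f} points")
print(f"relative reduction: {rer:.1%}")
```

With these invented numbers, a drop of 5 percentage points works out to a relative reduction of roughly 16.7 percent, which is why a relative figure can sound larger than the raw change in error rate.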

Kudos, Microsoft. We cannot wait to see this breakthrough commercialized because, Lord knows, you could use some help:

(h/t Jason Hersh)


Genesys Mobile Customer Engagement strategy unveiled in the land Down Under

I do like how Genesys has evolved its narrative over the years. In the beginning it was all about voice calls. Then it became interactions. Now Genesys labels them conversations.

Conversations are two-way, engaging, and personal — exactly the expectation of customers in today’s multichannel and mobile world when dealing with an organization.

It’s no surprise that Genesys continues to focus on mobile and multichannel conversations. I also think that the Groupama iPhone app opened the eyes of many in the industry, finally giving them a glimpse of how today’s technologies can shape the customer service experiences of tomorrow.

The company is taking advantage of the interest in mobile applications to unveil its Mobile Customer Engagement strategy during G-Force Melbourne (Australia’s second most populous city, not the sunny town in Florida). Its latest G8 Suite offers solutions to help companies make this mobile leap as seamless and as painless as possible.

However, I think the bigger news came from ZDNet Australia:

Contact centre software licensing will be a thing of the past, because contact centre services will move to the cloud within five years, according to Alcatel-Lucent Enterprise’s Asia Pacific senior vice president Michael McBrien.

The company, which offers its Genesys call centre services both in the cloud and as an on-premise software-based service, has around 1000 customers in the cloud already, McBrien told journalists at the G-Force 2011 conference in Melbourne today. He said that it would encompass Alcatel-Lucent Enterprises’ entire customer base by 2016.

Having 1,000 cloudy customers is quite impressive, especially in the contact center space. Genesys, of course, does not offer the infrastructure for cloud-based solutions and has no interest in becoming a massive cloud company, but I have to think that its cloud partners are grinning ear to ear with 1,000 customers. I also wonder how the Genesys licensing scheme will evolve with the cloud paradigm shift and how that will affect the company financially, since software licenses are very much its bread and butter. McBrien told ZDNet that the company should “go to a different market, a mid market.” That reminds me of the Genesys Express days, which didn’t work out too well. Maybe we’ll see it resurrected in the near future?

In the meantime, there’s still a license.dat file lurking somewhere between the premise and cloud, and the sacred port of 7260 listens intently amidst the hum of a data center, each ping echoing through the vast networks like a drop of gold coins.

Read the official Mobile Customer Engagement strategy press release:

Melbourne, Australia, August 24, 2011 - Alcatel-Lucent (Euronext Paris and NYSE: ALU) is calling upon companies to take advantage of today’s powerful, multi-channel smart phones by creating a new model for mobile customer engagement. With today’s mobile approach mostly limited to self-service and limited transactions, Alcatel-Lucent is prescribing a strategy that brings conversations to mobile customer service applications by intelligently linking contact center agents and customer care resources from across the enterprise, including those in the back office and branch locations.

While many companies already offer their customers mobile service apps — consider the thousands available for banking, retail and travel — these are often poorly integrated within a company’s existing customer service strategy and contact center technology platform. As a result, today’s mobile customers suffer from a disconnected experience that often delivers frustrating hold times or no way to contact an agent or resource for additional support. This disconnected approach also fails to unleash the power of today’s smart phones in transforming customer engagement with proactive contact, personalized applications and location-based services.

During the G-Force Melbourne 2011 customer event, Alcatel-Lucent outlined its Genesys Mobile Customer Engagement strategy, which focuses on helping companies move from transactional applications to mobile conversations and recommends the following best practices:

  • Contact Me – provide seamless and secure ‘click-to-call’ capabilities with context from smart phone applications with immediate agent support or scheduled call backs.
  • Connect Me – deliver mobile customers to best resource from contact center to back office departments and branch locations, across any channel – voice, SMS, chat.
  • Know Me – provide personalized mobile experience based on current service tasks and proactive contact with targeted offers and location-based services.

“Today’s consumers rely on their smart phones and tablets to be their ‘windows to the world.’ Businesses need to be creative in offering apps that integrate into all areas of the enterprise, from sales and marketing to customer care,” said Tom Burns, President, Alcatel-Lucent Enterprise. “Our mobile solutions featured in the G8 suite are bringing our core cross-channel routing and application openness together with the power of our Genesys Conversation Manager to provide the context and presence information needed to deliver the next generation mobile experience.”

At G-Force Melbourne, Genesys is bringing together the solutions companies need to deliver the next generation mobile customer experience. The Genesys G8 suite will be on display, including:

  • Conversation Manager – bring agent conversation to an iPhone
  • Integrated Mobile Customer Care Apps – mobile customer service applications
  • UC Connect – linking mobile customers to back office and mobile experts

About Alcatel-Lucent (Euronext Paris and NYSE: ALU)

The long-trusted partner of service providers, enterprises, strategic industries and governments around the world, Alcatel-Lucent is a leader in mobile, fixed, IP and Optics technologies, and a pioneer in applications and services. Alcatel-Lucent includes Bell Labs, one of the world’s foremost centres of research and innovation in communications technology.

With operations in more than 130 countries and one of the most experienced global services organizations in the industry, Alcatel-Lucent is a local partner with global reach.

The Company achieved revenues of Euro 16 billion in 2010 and is incorporated in France and headquartered in Paris.

For more information, visit Alcatel-Lucent online, read the latest posts on the Alcatel-Lucent blog and follow the Company on Twitter.


Exodus Software releases License Audit Service for Genesys

Exodus Software has some interesting product ideas brewing across the pond (HERA, for example), and today it debuted a License Audit Service (LAS) for the Genesys CTI platform.

Licensing typically takes up a big chunk of the overall expenditures in a CTI deployment, and as with almost any other software licensing procurement, the recommendation is to “buy into the future” to accommodate growth. However, this being the real world, there are times when the growth projection is off, or the economic climate requires re-evaluating license costs, or frankly you just want to know for sure how the licenses are being consumed.

Exodus claims that LAS can help “reduce (annual) license support costs by as much as 30%”… That’s quite a staggering number, especially when some big contact centers house thousands of agents.

Details (e.g. demo, pricing, etc.) are available from the brochure (PDF).

Nuance picks up Loquendo

Earlier this month, there was a rumor about Telecom Italia offloading its speech services arm Loquendo. Then on August 13, The Washington Post reported Loquendo being sold to Nuance for $75.5 million.

However, as of the date and time of this post, neither Loquendo nor Nuance has this information on its website. The press release came from Telecom Italia:

Telecom Italia has announced the sale of its 99.98% stake in Loquendo to U.S. company Nuance Communications, Inc. on the basis of an enterprise value of €53 million.

The sale of Loquendo, a 2001 voice technology spin-off from Telecom Italia’s research labs with a workforce of around 100, is part of a process of rationalization of the Group’s shareholdings and a shift of focus toward its core business.

Nuance is committed to keeping the company’s headquarters in Turin, and creating a global centre of excellence in voice technology R&D and reinforcing its collaboration with Italian universities.

The deal is expected to close around the end of September.

Rome, 13 August 2011

So there you have it, the almighty Nuance becoming even stronger. One less player in the speech industry. Is this good or bad for the industry?

SpeechTEK: What’s the buzz?

I wasn’t able to attend SpeechTEK NYC this year, but here’s what I gathered based on a couple of on-site sources and my own digging…

  • Thumbs down on the first keynote speech, “Responding to the Voice of the Constituent/Customer” by David Gergen. Inviting a senior political analyst from CNN to open an event tailored to speech tech may not have been the best idea. Echoing throughout the conference were comments such as “Ten minutes of useful information with 40 minutes of fluff.” (Sounds like CNN…)
  • Nuance Dragon Go! was a hit. This mobile app for the Apple iOS was released in mid-July and combines Nuance’s Dragon voice recognition engine with natural language understanding to deliver the most relevant content based on a user’s voice query.
  • Microsoft Xbox with Kinect crashed. From what I’ve been told, it crashed “a few times” during the Microsoft Tellme VIP Event. Attendees got to witness what a core dump on a projection screen looks like. (Not good.)
  • Vendors continue to push cloud and voice biometrics. Everybody’s still talking about the cloud. It’s here and it’s here to stay. Companies are utilizing it, and cloud vendors are making money from it. Get ready for the next big thing: voice biometrics.
  • Interactions, Inc. — the best IVR nobody’s heard of? This Boston-based company just secured $12 million in new funding. Check out their online demos. Almost too good to be true?
  • Smaller venue, fewer attendees, fewer tweets. Is the economy taking a toll on the industry?

SpeechTEK: Picks for Wednesday

Please also check out the first and second parts of this series. Remember to contact me if you’d like to share your SpeechTEK experience as a guest blogger.

Wednesday, August 10 is the last day of the conference. Are the best sessions saved for last?

SD301 – When Clients Become Designers – Carrie Nelson (Sr. Speech Software Engineering Consultant, Avaya)

An interesting topic for consultants. Is there a delicate balance, and how do you achieve it?

A302 – Recent Changes in Speech Patent Law – K.W. “Bill” Scholz (President, NewSpeech), Mark Webbink (Visiting Professor & Executive Dir., New York Law School), Steven Hoffberg (Partner, Ostrolenk Faber LLP), Marie Meteer (Executive Dir., Speech Technology Consortium), Mark Powell (Dir. Communications Technology Area, USPTO)

There has been lots of news about patents lately: from “patent trolls” targeting independent Apple App Store developers to Google losing out on acquiring Nortel patents to Microsoft’s involvement in Android patents. It appears that tech companies are boosting their firepower in the Patent Battles. Definitely an area to pay attention to in the speech industry as well.

D301 – Innovative Uses of Speech Technology – David Thomson (PMTS, AT&T Labs), Brad Kayton (CEO, Vgo Communications), K.W. “Bill” Scholz (President, NewSpeech)

IVRs, smartphone apps, dictation software, and so on: but what else can speech tech be applied to? This session explores speech’s role in robotics and language learning. How cool is that?