Microsoft Tellme talks itself into embarrassment against Siri

The Xbox 360 is great. And with Kinect it’s phenomenal. But for Microsoft Chief Research and Strategy Officer Craig Mundie to dismiss Apple’s Siri and brag about Tellme in Windows smartphones?

Utter embarrassment, especially coming from an executive who’s supposed to be the technology visionary of the company. (Heck, we all know Ballmer isn’t the guy.)

Tellme was rumored to have cost $800 million in 2007, but has the technology been stuck in the past?

It’s now evident that the software giant does not consider speech technology important and will likely pay a price for it, much like when it played catch-up during the Web boom. While Microsoft wasn’t looking, Nuance had scooped up dozens of other companies and partnered with Apple to give Siri a voice in the cloud.

And now the blogosphere is abuzz with this:

What does Tellme tell you?

Nuance, nuisance to some competitors

Speech tech leader Nuance is basking in some positively stellar publicity lately, mostly riding on the buzz about Apple’s upcoming iOS 5, Mac OS X “Lion” and the mega data center situated in a remote North Carolina town. The latest Apple mobile and desktop operating systems are said to have some deep integration with speech services from Nuance.

After years of mergers, acquisitions, and lawsuits, Nuance has finally struck gold with the Apple partnership. Forget telecom and IVRs: mobile consumer technology is where the bling is. Smartphone and tablet sales will no doubt exceed those of IVRs and speech servers.

Nuance and the speech industry in general have an intriguing history.

Things really took shape when four major players emerged in the 1990s: Lernout & Hauspie, Nuance, ScanSoft, and SpeechWorks. Back then the Internet was just blossoming, CPU, RAM, and storage were still expensive, and nobody had heard of cloud computing yet. Speech innovation was highly dependent on Ph.D. talent and R&D money. So naturally, the companies with the thicker cash stash gained an advantage.

ScanSoft picked up some notable deals: Lernout & Hauspie (December 2001, filed for bankruptcy; Dragon Systems was acquired previously by L&H), SpeechWorks (August 2003), and LocusDialog (January 2004). In the span of three years the industry consolidated to just two big players: Nuance and ScanSoft.

Then in September 2005 a merger between them was announced, with the new entity to be called Nuance Communications.

Then came an acquisition binge, thanks to CEO Paul Ricci. Since 2000 there have been 43 acquisitions. Some of the better-known buys include Dictaphone (March 2006), Tegic Communications (August 2007), Jott Networks (July 2009), SpinVox (December 2009), and MacSpeech (February 2010).

Growth through M&A was just part of the story. Nuance was also busy in the courts fighting its up-and-coming, lesser-known competitors. One such unlucky competitor was Vlingo. According to a BusinessWeek article in May by author Peter Burrows:

“Competing with Nuance is like having a venereal disease that’s in remission,” says Dave Grannan, CEO of Vlingo, a speech-recognition startup that’s involved in five Ricci-related lawsuits. (Nuance has four suits against Vlingo; Vlingo has one against Nuance.) “We crush them whenever we go head-to-head with them. But just when you’re thinking life is great—boom, there’s a sore on your lip.”

Vlingo’s adventures with Ricci began in 2008, soon after Yahoo! (YHOO) chose Vlingo software over Nuance. Three months later, Grannan learned from a Boston Globe reporter that Nuance had filed a patent suit—without contacting the company to discuss royalties. “It was clearly an effort to hurt our business,” says Grannan, who expects to spend $15 million on legal fees. Nuance spokesperson Rebecca Paquette said neither Ricci nor the company would comment on specific lawsuits against Vlingo or others. “In these highly technical fields, many companies attempt to gain advantage by simply using Nuance’s inventions rather than developing their own,” she wrote in an e-mail. “We have a duty to our stockholders to preserve the value of the company and its assets.”

By summer 2009, with Vlingo running out of cash, according to Grannan, Ricci approached him about an acquisition. On Sept. 21, they met in San Francisco for a 14-hour negotiation. No agreement. Two days later, Ricci surprised Grannan and Vlingo co-founder Mike Phillips by calling to offer two more alternatives. First, Ricci promised to pay them and co-founder John Nguyen $5 million each if they could persuade their board to sell at his price. If that failed, and the three execs agreed to jump ship to Nuance, he’d pay them the amount they would have received in an acquisition—plus another $5 million if they stayed with the company for two years. As Ricci talked over speakerphone, Grannan says he looked at Phillips, mouthed “What the f—?”, and asked Ricci to repeat. Ricci, who speaks in the measured tones of an academic, obliged. After notifying Vlingo’s board of the offer, Grannan called Ricci back to express the board’s displeasure. “I was flabbergasted,” says Vlingo board member Bob Davoli. “I’ve been on 55 boards in my career and been a CEO twice—but I’ve never heard of anything like this.”

Vlingo’s board later accepted a $15 million investment in the company, after Ricci suggested that such a deal would align their interests and lead to a cessation of hostilities, says Davoli. “That was wishful thinking,” he says. Rather than drop the lawsuits, Ricci stopped taking Grannan’s phone calls. When Vlingo’s board stopped admitting a Nuance-appointed director to its board meetings, Nuance sued for the right to attend.

And guess what? Nuance is still knocking at Vlingo’s door: TechCrunch recently reported a new patent infringement lawsuit.

Tellme Networks also found itself enduring the wrath of Ricci in 2006:

In late 2006, Ricci took a run at a customer—Tellme Networks, which made an automated telephone-response system. Ricci had just purchased a company that made speech software used by Tellme. Mike McCue, Tellme’s then-CEO, says he was contacted by Ricci, who declared he’d sue Tellme, introduce a competing product, and refuse to sell it more software unless Tellme’s board agreed to sell to Nuance at Ricci’s price. “It was an all-out attack on every front,” says McCue, who now runs Flipboard, maker of a popular news app. McCue did sell the company in 2007—to Microsoft (MSFT), for a far higher price. (Press reports had it at around $800 million.) A court later dismissed Nuance’s patent claims as invalid. “We were able to outlast Nuance,” says McCue. “But a lot of companies can’t handle the pressure and give in. It’s happened time and again.” Ricci declined to comment on dealings with Tellme or Vlingo.

Nuance’s aggressive tactics aren’t just reserved for U.S. companies, either. The founder of an Indian speech company told me a story about his ordeal with Nuance. After contracting his company to develop an application, Nuance poached his co-founder to work as an employee. (The co-founder left for Nuance within two months, after nine years as his business partner.) To add insult to injury, Nuance pre-emptively served legal notice in fear of a lawsuit against this co-founder and his new employer. Efforts to negotiate a more reasonable, smoother transition for the co-founder failed.

It’s David vs. Goliath in the speech industry, and David usually loses, from bleeding money and resources.

Indeed, from the BusinessWeek article:

Nuance’s Paul Ricci built the dominant speech-recognition company with engineering, acquisitions—and a lot of lawsuits.

 

What Tellme (Microsoft) is up to

It was big news when Microsoft scooped up Tellme in 2007 for a rumored $800 million. The acquisition highlighted not only the Redmond software giant’s foray into speech recognition technology, but also its willingness to pour money into modern UI research and development. Speech recognition, in any application, is just another way for the user to interact with a system. Unlike other speech recognition vendors, Tellme did not concentrate on contact center applications, but provided solutions for telephony, Web, and mobile applications. It’s no wonder that Microsoft, being a software shop, was interested.

Fast forward to 2010 and TMCnet has an interview with Grant Shirk, Director of Industry Solutions at Tellme Business Solutions, who shares some information on what the future holds for Tellme. Some tidbits of particular interest to me:

In addition to the emergence of distributed computing platforms for speech recognition, we expect to see more IVR services moving into the network as businesses seek the performance improvements and lower costs that on-demand platforms can provide. Virtualization of queuing and routing is a logical next step that can drive higher agent utilization within the contact center and reduce the cost and necessity of standalone CTI services. Tellme expects this virtualization to also improve the customer experience by getting callers to the right agent at the right time (avoiding unnecessary transfers) and enabling the growth of innovative services like virtual hold and scheduled call backs.

It appears that all divisions within Microsoft are aligning to its Azure cloud platform, from Office to SQL Server to Tellme speech recognition. But is Microsoft playing catch-up? Google has had its cloud-based Apps forever. Voxeo and several other IVR vendors lead in cloud-based IVRs. And unfortunately for Tellme, virtual hold and scheduled call backs are old, old news…

But there’s hope:

Together with Ford Motor Company and Kia, Microsoft and the Tellme platform pioneered the use of network-based speech to drive in-car experiences with the Ford SYNC product, and Kia with UVO (your voice), both profiled at CES and SpeechTEK 2009. The Ford SYNC service accesses the Tellme platform to provide drivers with hands-free access to local business search, driving directions, and other information. We expect to see more manufacturers moving toward a network-based model in the near future.

In addition, speech is quickly becoming an integral part of the mobile device interface. A great example that showcases the power of speech and language processing technologies is the recently launched Bing Mobile client. To provide mobile users the best possible speech performance for these advanced tasks, the speech features need to take advantage of network-based (rather than embedded) recognition capabilities.

Now this is where I think Microsoft Tellme will be the leader of the pack. Ford SYNC is one of the features that distinguish Ford automobiles from their competitors. Considering how Ford has come back to be profitable again, perhaps it also has Microsoft to thank. Although some geeks have joked that they’d rather not encounter the BSOD in their cars, SYNC is clearly something that car manufacturers believe in, and we should not be surprised to see such in-car voice-activated driver assistance platforms become ubiquitous in the near future.

The future seems bright:

Tellme continues to drive momentum and significant interest in the adoption of the Microsoft speech engine. In 2009, we answered over 1.3 billion calls (nearly 50 percent of our total annual traffic) on this engine, and our clients are observing significant improvements in recognition accuracy, automation, and task completion across their applications. The average task completion improvement when moving to the new engine has been three percent on average, out of the box.

The close relationship between the delivery and R&D teams within the Speech at Microsoft group allows us to continually influence the evolution and enhancement of the speech engine to best meet the evolving needs of our customers.

The key is the close relationship with R&D. If anything, Microsoft is a gigantic R&D machine: it has consistently spent over $6 billion annually since 2005. With that type of R&D backing, it’s all but certain that Tellme will continue to have a major impact on speech recognition technology in general, not just contact center applications.