REDMOND, Wash., July 9, 2003 -- Microsoft Corp. today announced several key milestones for bringing speech technology to the mainstream, including the first public beta release of the highly anticipated Microsoft® Speech Server, a Windows Server System (TM) product, and the beta 3 release of the Speech Application Software Development Kit (SASDK). In addition, Microsoft established several new resources to engage with enterprises looking to adopt Microsoft’s Speech Application Language Tags (SALT)-based speech offerings. The resources include the Microsoft Speech Server Beta Program; an Early Adopter Program (EAP); and specialized training courses on Speech Server, the SASDK, and voice user interface (VUI) and speech application design. To build a solid foundation for companies interested in becoming partners, Microsoft today also introduced the Microsoft Speech Partner Program.
“Speech technology is on the cusp of reaching its full potential, and we are committed to bringing it to the mainstream,” said Kai-Fu Lee, corporate vice president of the Speech Technologies group at Microsoft. “With the beta release of Microsoft Speech Server and the beta 3 of the SASDK, we are making it easier for enterprise companies and their customers to access information. We are excited to deliver this open-standards-based technology to the market as a common platform on which developers, partners and enterprises can create great speech applications.”
In addition to inviting enterprise customers to participate in the Microsoft Speech Server Beta Program for application development and feedback, Microsoft is expanding the list of marquee customers that are excited about the business value and competitive advantages of Microsoft Speech Server. The EAP is an extension of Microsoft’s current Joint Development Program (JDP), in which partners and customers have been working on the technical preview of Microsoft Speech Server. Microsoft is working closely with these companies to build highly effective enterprise-grade speech applications.
“This announcement clearly shows that Microsoft understands the fundamental measures for enabling enterprises to accrue the significant business benefits of speech technology,” said Brian Strachman, senior analyst with InStat/MDR. “From providing customers with an open, standards-based speech server, to developing a solid partner and beta customer ecosystem, to creating developer excitement around the Speech Application SDK, Microsoft is well-positioned to achieve its vision of making speech a mainstream and pervasive technology.”
Microsoft is providing companies with a comprehensive speech package for developing, testing, deploying and managing telephony and multimodal speech applications. Used in conjunction with the SASDK, Microsoft Speech Server enables enterprises to deploy speech applications that can improve employee productivity, increase customer satisfaction, create new revenue opportunities, and reduce costs through streamlining Web and call-center infrastructures.
Microsoft Speech Server Beta v1.0
Designed to run on the Windows Server (TM) 2003 operating system, Microsoft Speech Server is the most flexible and integrated platform for delivering low total cost of ownership for speech deployments. Taking advantage of the improved secure architecture and new security-aware features of Windows Server 2003, Microsoft Speech Server includes additional security features to help protect and defend systems, resources and users from potential security threats. Built on SALT, an open industry standard, Microsoft Speech Server extends existing Web markup languages by adding speech recognition and prompt functionality to both telephony and multimodal applications.
For connectivity into the enterprise telephony infrastructure and call-control functionality, Intel Corp. and Intervoice Inc. will provide a Telephony Interface Manager (TIM) that supports Microsoft Speech Server. The TIM will provide fast and easy integration of the speech server with the Intel NetStructure communications boards, enabling deployment of robust speech processing applications. Multimodal applications do not require a TIM.
The following are additional key components of the Microsoft Speech Server:
Speech Engine Services (SES)
Speech Recognition Engine. This component includes the state-of-the-art Microsoft Speech Recognition Engine for accurately handling users’ speech inputs.
Prompt Engine. The Prompt Engine joins prerecorded prompts from a database and plays them back so that users hear a human voice.
Text-to-Speech Engine. When prerecorded prompts are unavailable, SpeechWorks’ Speechify Text-to-Speech Engine synthesizes audio output from a text string.
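The relationship between the Prompt Engine and the Text-to-Speech Engine described above can be illustrated with a minimal Python sketch. It is not Microsoft Speech Server code: the names `PromptDatabase`, `synthesize` and `render_prompt` are hypothetical, the audio is stand-in bytes, and the phrase-by-phrase fallback is an assumption about how recorded and synthesized output might be combined.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PromptDatabase:
    """Hypothetical store of prerecorded prompts, keyed by phrase text."""
    recordings: dict[str, bytes] = field(default_factory=dict)

    def lookup(self, phrase: str) -> bytes | None:
        # Returns None when no recording exists for the phrase.
        return self.recordings.get(phrase)


def synthesize(text: str) -> bytes:
    """Stand-in for a text-to-speech engine; returns fake audio bytes."""
    return b"TTS:" + text.encode("utf-8")


def render_prompt(phrases: list[str], db: PromptDatabase) -> list[tuple[str, bytes]]:
    """Join recorded audio where available; fall back to synthesis otherwise."""
    segments = []
    for phrase in phrases:
        audio = db.lookup(phrase)
        if audio is not None:
            segments.append(("recorded", audio))
        else:
            segments.append(("tts", synthesize(phrase)))
    return segments


db = PromptDatabase({"Your balance is": b"REC:balance"})
segments = render_prompt(["Your balance is", "forty-two dollars"], db)
```

In this sketch the fixed part of the prompt comes from the recording, and only the variable part is synthesized.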
Telephony Application Services (TAS)
SALT Interpreter. This component deals with all the speech interface and presentation logic (input and output). In addition, the SALT Interpreter handles interactions between the speech application and the telephony components of the architecture.
Media and Speech Manager. The Media and Speech Manager handles requests made by SALT Interpreters to SES for speech recognition and prompt playback, and manages interfaces with the third-party TIM to deliver audio to and from the telephone user.
SALT Interpreter Controller. The SALT Interpreter Controller manages creation, deletion and resetting of the multiple instances of the SALT Interpreter that are managing dialogs with individual callers.
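The controller's role of creating, resetting and deleting one interpreter instance per caller can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the actual Speech Server design: `InterpreterInstance`, `InterpreterController` and the `call_id` key are hypothetical names, and dialog state is reduced to a list of caller utterances.

```python
class InterpreterInstance:
    """Hypothetical per-caller interpreter holding simple dialog state."""

    def __init__(self, call_id: str) -> None:
        self.call_id = call_id
        self.turns: list[str] = []

    def handle(self, utterance: str) -> None:
        # Record one caller utterance as part of the dialog.
        self.turns.append(utterance)

    def reset(self) -> None:
        # Discard dialog state so the instance can start over.
        self.turns.clear()


class InterpreterController:
    """Hypothetical controller tracking one interpreter per active call."""

    def __init__(self) -> None:
        self._instances: dict[str, InterpreterInstance] = {}

    def create(self, call_id: str) -> InterpreterInstance:
        if call_id in self._instances:
            raise ValueError(f"call {call_id!r} already has an interpreter")
        instance = InterpreterInstance(call_id)
        self._instances[call_id] = instance
        return instance

    def reset(self, call_id: str) -> None:
        self._instances[call_id].reset()

    def delete(self, call_id: str) -> None:
        del self._instances[call_id]

    def active_calls(self) -> list[str]:
        return sorted(self._instances)


controller = InterpreterController()
first = controller.create("call-1")
controller.create("call-2")
first.handle("check my balance")
controller.reset("call-1")
controller.delete("call-2")
```

Keying instances by call keeps each caller's dialog isolated, which is the property the component description emphasizes.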
“Microsoft Speech Server is unique to the marketplace in that it is the only speech server that supports both unified telephony and multimodal applications. By building our speech technology offerings upon the open, industry-standard SALT specification, customers can use speech to access information from standard telephones and cell phones as well as GUI-based devices like PDAs, Tablet PCs and ‘smart’ phones,” said Xuedong Huang, general manager of the Speech Technologies group at Microsoft.
Microsoft Speech Application SDK Beta v3.0
The Microsoft Speech Application SDK is a set of tools and ASP.NET controls based on the SALT specification that enables developers to build both telephony and multimodal applications. Developers can incorporate speech functionality into Web applications quickly and easily, and can learn the concepts necessary to build a speech application with the familiar Microsoft Visual Studio® .NET 2003 development environment. Users can access these applications across a variety of devices, from the desktop to the telephone, using speech as a possible mode of interaction. New features in beta 3 of the SASDK include the following:
Pocket Internet Explorer Bits. This feature allows Pocket PC access to Microsoft Speech Server applications.
Speech Application Wizard. This wizard enables developers to jump-start application development by creating a new project in Visual Studio .NET 2003 that contains all the necessary objects.
Telephony Application Simulator. This simulation of the Speech Server allows developers to deploy telephony applications on the desktop and interact with the application.
Enhanced dual-tone multifrequency (DTMF) support.
Speech Application Controls. Preset controls manage responses containing digits and letters, for example, credit card numbers and expiration dates, currency amounts, ZIP codes and Social Security numbers.
Enhancements to Grammar Authoring. The enhancements provide a flowchart view of grammars, the ability to type text for grammar phrases into grammar files, a Pronunciation Editor for unusual words, and integration into the Visual Studio .NET 2003 environment.
Speech Controls Outline Panel. A dockable Visual Studio menu shows users the sequence of controls in the speech application.
Microsoft is offering three five-day, instructor-led courses for companies interested in building the skills necessary to support enterprise-grade solutions. The courses include the following:
“Speech Applications: Planning, VUI Design and Maintenance”
“Developing Speech Applications With the Microsoft Speech Application Software Development Kit”
“Deploying and Administering Microsoft Speech Server”
Additional information about Microsoft Speech Server and the Speech Application SDK is available at http://www.microsoft.com/speech/ . Pricing and availability details for Microsoft Speech Server will be announced at a later date. Enterprise companies interested in taking advantage of the Microsoft Speech Server beta program can find more information at http://www.microsoft.com/speech/beta/ .
Founded in 1975, Microsoft (Nasdaq “MSFT”) is the worldwide leader in software, services and Internet technologies for personal and business computing. The company offers a wide range of products and services designed to empower people through great software -- any time, any place and on any device.
Microsoft, Windows Server System, Windows Server and Visual Studio are either registered trademarks or trademarks of Microsoft Corp. in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Note to editors: If you are interested in viewing additional information on Microsoft, please visit the Microsoft Web page at http://www.microsoft.com/presspass/ on Microsoft’s corporate information pages. Web links, telephone numbers and titles were correct at time of publication, but may since have changed. For additional assistance, journalists and analysts may contact Microsoft’s Rapid Response Team or other appropriate contacts listed at http://www.microsoft.com/presspass/contactpr.asp .