Speech
Introduction to Speech Technology-based Systems

The goal of research and development in Speech Technology has been to build automatic systems that users can converse with as if they were conversing with another human being. The motivation for such systems was initially academic, and the work was pursued largely in academic institutions. In recent years, however, a strong business case has evolved for such speech-based user-machine interfaces, especially in telephony-based systems. We shall refer to such speech-based user interfaces as Spoken Language Interfaces (SLI) in the sequel.

In an SLI, an Automatic Speech Recognition (ASR) system listens to, recognizes and understands what the user says. The system then responds in a natural-sounding voice, using either an unrestricted-vocabulary Text-to-Speech (TTS) synthesis system or restricted-vocabulary pre-recorded (canned) human voice prompts.

Since ASR technology has not matured to the stage where it can understand anything that the user might say, telephony-based SLI systems (also referred to as dialog or conversational systems) are restricted to a specific task, for example a credit card billing enquiry in an automatic system for banks. In such task-specific systems, the active vocabulary is restricted, and so is the so-called Perplexity, which in ASR parlance is a measure of the amount of branching possible from one word to any of the other available words. The active vocabulary is narrowed down to a finite set of key words, or a set of sequences of key words occurring in fluent speech, with each sequence constrained by the grammar of the language (the so-called Language Model). The automatic system infers the intention of the user from the key word, or the particular sequence of key words, spotted in the user's fluent speech, and initiates the appropriate action. The action could be reading out the last five transactions on the credit card, or sending the account statement by fax or email. Such task-specific ASR systems have come to be known as Key Word Spotting (KWS) systems.
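The mapping from spotted key words to an action can be sketched as follows. This is only an illustration, assuming the recognizer has already produced a text transcript; the key word sequences, the action names and the function `spot_intent` are hypothetical and are not taken from any actual system described in this article.

```python
import re

# Hypothetical key word sequences and the actions they trigger (assumed for illustration).
INTENTS = {
    ("last", "transactions"): "read_last_transactions",
    ("statement", "fax"): "send_statement_by_fax",
    ("statement", "email"): "send_statement_by_email",
}


def spot_intent(transcript):
    """Return the action for the first key word sequence found in order, else None."""
    # Split the fluent utterance into lowercase words, ignoring punctuation.
    words = re.findall(r"[a-z']+", transcript.lower())
    for keywords, action in INTENTS.items():
        position = 0  # index of the next key word we are waiting for
        for word in words:
            if word == keywords[position]:
                position += 1
                if position == len(keywords):
                    return action
    return None


if __name__ == "__main__":
    print(spot_intent("Could you please read out my last five transactions"))
    print(spot_intent("Please send the statement to me by email"))
```

Words that are not key words ("could", "please", "five") are simply skipped, which is what allows the user to speak fluently while the system reacts only to the restricted active vocabulary. A real KWS system performs this spotting on the acoustic signal rather than on text.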

This article is organized as follows. In Sub-Section 2, we highlight the practical issues encountered in the development of ASR systems. In Sub-Section 3, we develop a case for SLIs and highlight their tangible and intangible benefits. In Sub-Section 4, we present a case study of an automatic credit card billing enquiry system. In Sub-Section 5, we elaborate on what SST can do for you, such as Indian language support and SLI design. We summarise the article in Sub-Section 6.

 