Speech
What can SST do for you

SST has developed its ASR modeling algorithms in house and owns the IPR; hence, the ASR models can be trained for any language. The specific problems mentioned in Sub-Section 2 of this article are all addressed by SST's ASR algorithms and by the methods used to train the ASR systems. Some of the techniques used to counter these problems are:

  • Speech data is collected from speakers in different geographical locations, and the speech models are trained on it using clustering techniques, to achieve robustness against pronunciation variations.
  • Speaker normalization is applied to achieve accurate speaker-independent recognition.
  • Channel normalization techniques and noise-robust algorithms are used to combat the degradations caused by the telephone channel. The echo cancellation algorithms are developed in house and ported onto computer telephony interface cards that are also developed in house, giving a better match between the ASR, the TTS, the echo cancellation algorithms and the telephony interface cards.
  • In a voice barge-in situation, when a user interrupts the system prompt, the SLI system detects the user's speech and immediately stops the playback of its own prompt. At the same time, the SLI system must be insensitive to background (ambient) noise and to the speech of people talking in the background. Otherwise, false alarms caused by background noise or speech lead to frequent drop-outs in the system prompts, which irritates the user.
  • SST has designed Spoken Language Interfaces for several IVR systems, both ASR-based and non-ASR-based, and live systems are used by thousands of people every day (see the projects section on the web site).
  • A Call Completion Rate (CCR) parameter is used in all of SST's SLI systems to measure the percentage of calls successfully completed by the SLI system. If the CCR is not satisfactory, the user interface is modified repeatedly until the CCR improves. The user interface design has to serve both novice and experienced users, and their requirements conflict. A novice user needs as much help as possible to use the system, which calls for relatively long prompts. An experienced user, on the other hand, wants to finish the task and end the call as quickly as possible, and is naturally irritated by long prompts. This is where the voice barge-in feature is extremely useful: the experienced user can interrupt the system prompts and get the task done.
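
The CCR described in the last item is a simple percentage, and a minimal sketch of it is given below. The names CallRecord, call_completion_rate and needs_redesign, the call log layout and the 90% target are assumptions made for illustration only; the article does not describe how SST records calls or what CCR value it considers satisfactory.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """One logged call (hypothetical layout, not SST's actual log format)."""
    call_id: str
    completed: bool  # True if the caller's task was finished by the SLI system


def call_completion_rate(calls):
    """Return the CCR: the percentage of calls completed by the SLI system.

    An empty log yields 0.0 so that callers do not have to special-case it.
    """
    if not calls:
        return 0.0
    completed = sum(1 for call in calls if call.completed)
    return 100.0 * completed / len(calls)


def needs_redesign(calls, target_percent=90.0):
    """Flag the user interface for another design pass when the CCR is low.

    The 90% default is an assumed target, chosen only for this example.
    """
    return call_completion_rate(calls) < target_percent


if __name__ == "__main__":
    log = [
        CallRecord("c1", True),
        CallRecord("c2", True),
        CallRecord("c3", False),
        CallRecord("c4", True),
    ]
    print(f"CCR: {call_completion_rate(log):.1f}%")
    print("Revise the interface:", needs_redesign(log))
```

Recomputing the CCR after each change to the prompts makes it possible to tell whether a revision actually helped callers complete their calls.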
 