Using Microsoft Speech Engine

Download Source Code: speech

Introduction

Speech recognition is one of the fastest growing and commercially most promising applications of natural language technology. Speech is the most natural communicative medium for humans in many situations, including applications such as giving dictation; querying database or information-retrieval systems; or generally giving commands to a computer or other device.

Motivation

I planned to build a program that helps users perform common operations on their computer not only by keyboard and mouse but also by voice.

Purpose

The purpose of this application is to let users give commands to the computer by voice, directing it merely by speaking. Using only your voice, you can start programs, open menus, click buttons and other objects on the screen, dictate text into documents, and write and send e-mails. Just about everything you do with your keyboard and mouse can be done with your voice.

Background

I read several articles about how to use Text to Speech and Speech to Text, but when I wanted to find out how to do it the opposite way, I realized that there is a lack of easily understandable articles covering this theme. So I decided to write a very basic one of my own and share my experience with you.

Requirements

We need SAPI, the Microsoft Speech API. Its managed wrapper, System.Speech, is now part of the .NET Framework.

The easiest way to check whether SAPI is installed on your system is to open Control Panel -> Speech. There you should see both the “Text to Speech” tab and the “Speech Recognition” tab. If the “Speech Recognition” tab is missing, download it from the Microsoft site.

Methodology

The implementation of the application is based on the artificial intelligence techniques “Natural Language Processing” and “Digital Signal Processing”.

[Figure: methodology]

How it Works

The project’s interface is shown below (fig-2).

[Figure: application interface]

To start talking right away, press the “Let’s Talk” button. Microsoft Speech Recognition starts in the OFF state, and the application will not work until you switch Windows Speech Recognition to “Listening Mode”.

How you code it

First of all, you need to add a reference to the System.Speech assembly in your application.

[Figure: adding the System.Speech reference]

System.Speech contains the following namespaces and classes.

[Figure: System.Speech namespaces and classes]

Now you can add the following namespaces to your application.

[Figure: namespace declarations]

Then declare the SpeechRecognizer and SpeechSynthesizer objects and initialize them in the constructor.

[Figure: class declarations]
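Since the original screenshots are not available here, the following is only a minimal sketch of what the namespace imports and the two declarations might look like. The class name `VoiceAssistant` is hypothetical; in the article's project these fields would sit in the form class.

```csharp
using System;
using System.Speech.Recognition; // SpeechRecognizer, Choices, GrammarBuilder, Grammar
using System.Speech.Synthesis;   // SpeechSynthesizer

// Hypothetical holder class (an assumption, not the article's form class).
public sealed class VoiceAssistant : IDisposable
{
    private readonly SpeechRecognizer recognizer;
    private readonly SpeechSynthesizer synthesizer;

    public VoiceAssistant()
    {
        // SpeechRecognizer is the shared Windows recognizer, which is why
        // Windows Speech Recognition must be in "Listening Mode" to hear you.
        recognizer = new SpeechRecognizer();
        synthesizer = new SpeechSynthesizer();
    }

    public void Dispose()
    {
        // Both objects hold native speech resources and should be released.
        recognizer.Dispose();
        synthesizer.Dispose();
    }
}
```

Keeping both objects as fields lets the button handler and the recognition handler share them for the lifetime of the window.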

The “talk_click” event of the “Let’s Talk” button

Follow these steps to implement the recognition:

1- Create a simple grammar that recognizes the following sample strings:

“red”, “green”, “blue”, “Hi”, “standard”, “ON”, “Open My Computer”, “documents”, “hello”, “close”, “How are you”, “Good Bye”.

[Figure: Choices declaration]

2- Create a GrammarBuilder object and append the Choices object.

[Figure: GrammarBuilder code]

3- Create the Grammar instance and load it into the speech recognition engine.

[Figure: Grammar loading code]

4- Register a handler for the SpeechRecognized event.

[Figure: event registration code]

The SpeechRecognized event is raised each time you speak a phrase that matches the loaded grammar.
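The four steps above can be sketched roughly as follows. This is an illustration under assumptions, not the article's original code: the `GrammarSetup` class and its `Attach` method are hypothetical names, while `Choices`, `GrammarBuilder`, `Grammar`, `LoadGrammar` and `SpeechRecognized` are the System.Speech members the steps refer to.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical helper class; the name and layout are assumptions.
public static class GrammarSetup
{
    // Step 1: the sample phrases listed in the article.
    public static readonly string[] Phrases =
    {
        "red", "green", "blue", "Hi", "standard", "ON", "Open My Computer",
        "documents", "hello", "close", "How are you", "Good Bye"
    };

    public static void Attach(SpeechRecognizer recognizer,
                              EventHandler<SpeechRecognizedEventArgs> onRecognized)
    {
        // Step 1 (continued): the recognizer will accept any one of these phrases.
        Choices choices = new Choices(Phrases);

        // Step 2: append the choices to a GrammarBuilder.
        GrammarBuilder builder = new GrammarBuilder();
        builder.Append(choices);

        // Step 3: build the Grammar and load it into the recognizer.
        Grammar grammar = new Grammar(builder);
        recognizer.LoadGrammar(grammar);

        // Step 4: subscribe the handler to the SpeechRecognized event.
        recognizer.SpeechRecognized += onRecognized;
    }
}
```

A call such as `GrammarSetup.Attach(recognizer, OnSpeechRecognized)` from the button's click handler would wire everything up, assuming `OnSpeechRecognized` is a method with the standard `(object sender, SpeechRecognizedEventArgs e)` signature.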

Behind the scenes, the engine uses digital signal processing and natural language processing techniques to convert speech into text and vice versa.

The SpeechRecognized handler then executes the matching instruction, as shown in the figures.

[Figure: SpeechRecognized handler, part 1]

[Figure: SpeechRecognized handler, part 2]

You can write a separate case or condition for each word in the dictionary/grammar.
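One way to organize such cases is to keep the phrase-to-reply mapping in a small function and call it from the event handler. The sketch below assumes this layout; `CommandReplies`, `ReplyFor` and the reply strings are all hypothetical placeholders, not the article's original code.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical class; names and reply strings are assumptions.
public static class CommandReplies
{
    // Returns a reply for a recognized phrase, or an empty string
    // when the phrase has no reply attached.
    public static string ReplyFor(string recognizedText)
    {
        switch (recognizedText.ToLowerInvariant())
        {
            case "hi":
            case "hello":
                return "Hello";
            case "how are you":
                return "I am fine";
            case "good bye":
                return "Good bye";
            default:
                return string.Empty;
        }
    }

    // Handler with the signature expected by SpeechRecognizer.SpeechRecognized.
    public static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // e.Result.Text holds the phrase the engine matched against the grammar.
        string reply = ReplyFor(e.Result.Text);
        if (reply.Length > 0)
        {
            Console.WriteLine("Heard \"{0}\", reply \"{1}\"", e.Result.Text, reply);
        }
    }
}
```

Separating the mapping from the handler keeps the decision logic easy to check without a microphone.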

You can also give the computer commands to perform specific tasks, such as opening My Computer or playing songs.

[Figure: command handling code]

When you say “documents”, e.Result.Text.Contains(“documents”) is true and Process.Start() opens the Documents folder.
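The “documents” command could look roughly like this. The class and method names are hypothetical, and launching `explorer.exe` on the folder returned by `Environment.GetFolderPath` is an assumption about how the folder gets opened; the article only states that Process.Start() is used.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; names are assumptions made for illustration.
public static class DocumentCommand
{
    // True when the recognized text mentions "documents" (case-insensitive).
    public static bool IsDocumentsCommand(string recognizedText)
    {
        return recognizedText.IndexOf("documents", StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Opens the current user's Documents folder in Windows Explorer.
    public static void OpenDocuments()
    {
        string path = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
        // Quote the path so folder names containing spaces are passed as one argument.
        Process.Start("explorer.exe", "\"" + path + "\"");
    }
}
```

Inside the SpeechRecognized handler this would be used as `if (DocumentCommand.IsDocumentsCommand(e.Result.Text)) DocumentCommand.OpenDocuments();`.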

SpeechSynthesizer

Using the SpeechSynthesizer class from the System.Speech.Synthesis namespace, you can convert text to speech. Call its Speak(“text”) method for this purpose.

[Figure: Speak call]

You can set the voice gender and voice age using SpeechSynthesizer.SelectVoiceByHints().
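A minimal text-to-speech sketch, assuming a console program rather than the article's form: the `SpeakDemo` class, the chosen gender and age, and the spoken sentence are illustrative choices only.

```csharp
using System.Speech.Synthesis;

// Hypothetical console demo; the article itself calls these from a form.
public static class SpeakDemo
{
    public static void Main()
    {
        using (SpeechSynthesizer synthesizer = new SpeechSynthesizer())
        {
            // Send the audio to the default speakers or headphones.
            synthesizer.SetOutputToDefaultAudioDevice();

            // Hint at the kind of installed voice we would like to use.
            synthesizer.SelectVoiceByHints(VoiceGender.Female, VoiceAge.Adult);

            // Speak blocks until the sentence has been spoken.
            synthesizer.Speak("Hello, how are you?");
        }
    }
}
```

In a form application, `SpeakAsync` may be the better choice, since `Speak` blocks the calling thread while the audio plays.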

I hope this article serves you well. Let me know if you find any mistake or if an update is required. Do good, have good!

Download Source File:  speech

About the Author

Umer Asif

Computer Science Dept.

Govt. College University Lahore Pakistan.