Yesterday I gave you a brief introduction to speech recognition using C#. Today I would like to expand on that tutorial by showing you how to listen for specific words. To give this tutorial a real-world scenario, I am going to show you how to listen for specific commands such as application names, and how to launch those external programs once the speech recognition engine has detected the command to start them. If you haven’t read yesterday’s article, I would advise you to do that now, as we’ll be using the same code from that tutorial and adding a few small pieces. Once you’ve read that article, come back here and continue on.

In yesterday’s example, we used a new default DictationGrammar object in our recognition engine. That allowed our engine to listen for any speech provided and type it into our rich text box. Today, we’re going to create ourselves a new Grammar object that will include a few different choices for the engine to listen for. To do that, the first thing we need to do is to add a new method that will return our new Grammar object.

private Grammar CreateGrammarObject() { … }

Inside that method, you will need to create a new “Choices” object which accepts an array of commands to listen for. As I mentioned above, we will stick to application names so that we can launch those programs just by saying “start (application name)”. To keep things simple, we will add choices for “Calculator”, “Notepad”, “Internet Explorer”, and “Paint”.

Choices commandChoices = new Choices("Calculator", "Notepad", "Internet Explorer", "Paint");

Next, we will need to add a new GrammarBuilder which will listen for the word “Start” before it will look for words we added to our Choices object. Instead of first listening for the word “Start”, you can change this to listen for whatever you want. If this was Star Trek, you would put “Computer” in place of “Start” so that the computer knows you are talking to it and not to some Klingon. :-)

GrammarBuilder grammarBuilder = new GrammarBuilder("Start");

Once you’ve built your new GrammarBuilder object, you will need to pass it the list of Choices you created earlier by calling the “Append” function.

grammarBuilder.Append(commandChoices);

Now you need to create a new Grammar object which will be built using the GrammarBuilder you just added. You can also give the new Grammar object a name such as “Available Programs”, but this step isn’t really necessary. Once you have your new Grammar object built, go ahead and return it from the method so that we can use it in just a minute.

Grammar g = new Grammar(grammarBuilder);
//g.Name = "Available Programs";
return g;

That’s it for the grammar function. It’s now time to put it into action. To do that, in your form constructor, you should have a line that reads “recognitionEngine.LoadGrammar(new DictationGrammar());” Remove the piece that reads “new DictationGrammar()” and replace it with a call to your CreateGrammarObject method you just built. If done correctly, your line should now look like this:

recognitionEngine.LoadGrammar(CreateGrammarObject());

You can also assign the result of CreateGrammarObject to a Grammar variable first and pass that variable to the LoadGrammar function, but that isn’t really necessary for the scope of this example. For now, go ahead and test your application by running it, clicking the Start button, and saying commands like “Start Notepad” or “Start Paint”. If everything worked, you should see something like this:

[Screenshot: C# Speech Recognition In Action]

Pretty cool, huh? You can say anything you want, but if you don’t say “Start” followed by a word from your Choices object, the engine won’t recognize it and will not try to process it as a command. If you want to actually run these applications, there are a couple of things you need to do. The first is to bring in the “System.Diagnostics” namespace with a “using” statement like this:

using System.Diagnostics;

While you’re at it, go ahead and add a “using” statement for “System.Text.RegularExpressions” as well. You will need regular expressions because your commands come through like “Start Paint” and “Start Notepad”. Since you probably don’t have any applications whose names begin with the word “Start”, you will need to remove it from your command. If you don’t want to mess with regex, I’ll show you another way to handle this in just a second. For now, add a new string called “line” just above the foreach loop in your form constructor.

string line = "";

Next, inside of your foreach loop, instead of calling txtOutput.Text += word.Text + " ";, you will need to replace “txtOutput.Text” with “line”. This will append all recognized words to your “line” string, which we’ll pass to txtOutput.Text after our foreach loop. Here’s what that code should look like:

string line = "";
foreach (RecognizedWordUnit word in args.Result.Words)
{
    if (word.Confidence > 0.5f)
        line += word.Text + " ";
}
txtOutput.Text += line;

This is where regex comes in, so that your app knows which application to launch. Just before txtOutput.Text += line;, add a new string called “command” and, using Regex.Replace, remove the word “Start” from the “line” string. Make sure to trim any leading and trailing whitespace.

string command = Regex.Replace(line, "Start", "").Trim();
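
If you would rather skip regex, a plain string check can do the same job. The sketch below is one way to do it, not the only one; the names CommandParser and StripPrefix are made up for this example and are not part of the tutorial code.

```csharp
using System;

// Hypothetical helper, not part of the original tutorial code.
public static class CommandParser
{
    // Removes a leading prefix word (for example "Start") from the recognized
    // line and trims the rest. If the line does not begin with the prefix
    // followed by a space, the trimmed line is returned unchanged.
    public static string StripPrefix(string line, string prefix)
    {
        string trimmed = line.Trim();
        if (trimmed.StartsWith(prefix + " ", StringComparison.OrdinalIgnoreCase))
        {
            return trimmed.Substring(prefix.Length).Trim();
        }
        return trimmed;
    }
}
```

With that in place, the line above would become string command = CommandParser.StripPrefix(line, "Start");. Unlike the Regex.Replace call, which removes every occurrence of “Start”, this version only touches the word at the beginning of the line.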

Now that you have the name of the application you want to run, you can use either an if-else combination or switch-case to launch the application that corresponds to the name you said. To launch the application, you can do that easily using “Process.Start” and passing it the name of the application you want to run. Here is the switch-case code I used for this tutorial.

switch (command)
{
    case "Notepad":
        Process.Start("notepad.exe");
        break;
    case "Calculator":
        Process.Start("calc.exe");
        break;
    case "Paint":
        Process.Start("mspaint.exe");
        break;
    case "Internet Explorer":
        Process.Start(@"C:\Program Files\Internet Explorer\iexplore.exe");
        break;
}
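
If the list of programs grows, a lookup table can be easier to maintain than a long switch. Here is a minimal sketch of that idea; AppLauncher, Resolve, and TryLaunch are names I made up for illustration, and the entries simply mirror the switch above. It also guards the Process.Start call, since a hard-coded path like the Internet Explorer one may not exist on every machine.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Diagnostics;

// Hypothetical helper, not part of the original tutorial code.
public static class AppLauncher
{
    // Spoken command -> executable. Same entries as the switch statement.
    private static readonly Dictionary<string, string> Programs =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "Notepad", "notepad.exe" },
            { "Calculator", "calc.exe" },
            { "Paint", "mspaint.exe" },
            { "Internet Explorer", @"C:\Program Files\Internet Explorer\iexplore.exe" }
        };

    // Returns the executable for a command, or null if the command is unknown.
    public static string Resolve(string command)
    {
        string path;
        return Programs.TryGetValue(command, out path) ? path : null;
    }

    // Tries to launch the program. Returns false instead of throwing when the
    // command is unknown or the executable cannot be started.
    public static bool TryLaunch(string command)
    {
        string path = Resolve(command);
        if (path == null) return false;
        try
        {
            Process.Start(path);
            return true;
        }
        catch (Win32Exception)
        {
            // Raised, for example, when the file cannot be found or opened.
            return false;
        }
    }
}
```

In the SpeechRecognized handler, the whole switch could then be replaced with a single call to AppLauncher.TryLaunch(command).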

That’s all folks! You now have yourself a fully functional speech recognition application that is capable of launching external applications based on the command you tell it. To wrap things up, here is the complete code I used for this tutorial. Feel free to leave any questions and / or suggestions in the comments below. And, as always, HAPPY CODING!

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Diagnostics;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Windows.Forms;
using System.Speech.Recognition;

namespace SpeechRecognitionExample
{
    public partial class Form1 : Form
    {
        private SpeechRecognitionEngine recognitionEngine;

        public Form1()
        {
            InitializeComponent();

            recognitionEngine = new SpeechRecognitionEngine();
            recognitionEngine.SetInputToDefaultAudioDevice();
            recognitionEngine.SpeechRecognized += (s, args) =>
            {
                string line = "";
                foreach (RecognizedWordUnit word in args.Result.Words)
                {
                    if (word.Confidence > 0.5f)
                        line += word.Text + " ";
                }

                string command = Regex.Replace(line, "Start", "").Trim();

                switch (command)
                {
                    case "Notepad":
                        Process.Start("notepad.exe");
                        break;
                    case "Calculator":
                        Process.Start("calc.exe");
                        break;
                    case "Paint":
                        Process.Start("mspaint.exe");
                        break;
                    case "Internet Explorer":
                        Process.Start(@"C:\Program Files\Internet Explorer\iexplore.exe");
                        break;
                }

                txtOutput.Text += line;
                txtOutput.Text += Environment.NewLine;
            };
            recognitionEngine.LoadGrammar(CreateGrammarObject());
        }

        private void btnStart_Click(object sender, EventArgs e)
        {
            recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
        }

        private void btnStop_Click(object sender, EventArgs e)
        {
            recognitionEngine.RecognizeAsyncStop();
        }

        private Grammar CreateGrammarObject()
        {
            Choices commandChoices = new Choices("Calculator", "Notepad", "Internet Explorer", "Paint");
            GrammarBuilder grammarBuilder = new GrammarBuilder("Start");
            grammarBuilder.Append(commandChoices);
            Grammar g = new Grammar(grammarBuilder);
            //g.Name = "Available Programs";
            return g;
        }
    }
}

34 Responses to Simple Speech Recognition Using C# – Part 2

  1. shuvro says:

    LuCuS, how can i make the mouse pointer interact with speech recognition in C#? Suppose i say “Left” and the mouse moves towards the left. Similarly, it moves right when i say “Right”. If it is possible, can u give a tutorial on this topic or suggest any link about it? Thanks in advance.

  2. vetinari says:

    Hi LuCus, not sure if you still monitor these comments or not but I have a problem with the code, it doesnt print anything or run the programs! Everything builds fine and runs but when i try to use the program nothing happens. Im from the UK so I had an error that said “The language for the grammar does not match the language of the speech recognizer.” I went into my control panel and changed my speech recognition language from en-GB to en-US and it fixed the error but the program still doesnt function properly.
    Any Ideas whats wrong? Thanks for the great tutorial man just wish it would work :)

    • LuCuS says:

      Do you have a heavy English accent? It could be that the application isn’t picking up on your accent. I’ve actually seen this happen before with other users that have a heavy accent. To test this theory, you will need to add your culture to the app. First, you will need to bring in the System.Globalization namespace (at the top, add “using System.Globalization;”). In the Form1 constructor, just before the line that creates the recognition engine, add the line “CultureInfo culture = new CultureInfo("en-GB");” Then, on the next line, pass the culture into the constructor for your SpeechRecognitionEngine like this: “recognitionEngine = new SpeechRecognitionEngine(culture);” When it’s all done, you should end up with something like this:

      using System;
      using System.Globalization;
      ...
      public Form1()
      {
      InitializeComponent();
      CultureInfo culture = new CultureInfo("en-GB");
      recognitionEngine = new SpeechRecognitionEngine(culture);
      ...

      • vetinari says:

        Now the error re-appears where it says “The language for the grammar does not match the language of the speech recognizer”. Last time i fixed it in control panel but not this time. Are there any settings i need to change?

        • LuCuS says:

          Did you change your locale back to en-GB in Control Panel?

        • LuCuS says:

          You might also want to put the following line after the line that built your CultureInfo.

          System.Threading.Thread.CurrentThread.CurrentCulture = culture;

          • vetinari says:

            Thanks again for the reply, i have done both of your suggestions and i still get the error. In speech recognition in the control panel my language is set to English – UK. Does this match up with en-GB? the only other option in there is English-US. Sorry to be a pain and keep bothering you

          • vetinari says:

            Scratch that last reply i managed to find the solution. Under the System.Threading line you just gave me I added another one for CurrentThread.CurrentUICulture and it worked perfectly, thankyou for all your help!
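
            For anyone hitting the same “language for the grammar does not match” error, the combination that ended up working in this thread can be sketched roughly as follows. The class and method names (SpeechCultureSketch, BuildCommandBuilder, CreateEngine) are made up for illustration. Setting GrammarBuilder.Culture explicitly is an extra step that was not discussed above; as far as I understand it, the builder otherwise picks up the thread's UI culture, which would explain why the CurrentUICulture line made the difference.

            ```csharp
            using System.Globalization;
            using System.Speech.Recognition;
            using System.Threading;

            // Hypothetical sketch, not the exact code used in this thread.
            public static class SpeechCultureSketch
            {
                // Builds the "Start <application>" phrase list for a given culture.
                public static GrammarBuilder BuildCommandBuilder(CultureInfo culture)
                {
                    Choices commandChoices = new Choices("Calculator", "Notepad", "Internet Explorer", "Paint");
                    GrammarBuilder grammarBuilder = new GrammarBuilder("Start");
                    // Assumption: stating the culture here keeps the grammar and
                    // the recognizer in agreement.
                    grammarBuilder.Culture = culture;
                    grammarBuilder.Append(commandChoices);
                    return grammarBuilder;
                }

                // Creates an engine for the culture (for example "en-GB") and loads
                // the grammar. The constructor throws if no recognizer for that
                // culture is installed on the machine.
                public static SpeechRecognitionEngine CreateEngine(string cultureName)
                {
                    CultureInfo culture = new CultureInfo(cultureName);
                    Thread.CurrentThread.CurrentCulture = culture;
                    Thread.CurrentThread.CurrentUICulture = culture;

                    SpeechRecognitionEngine engine = new SpeechRecognitionEngine(culture);
                    engine.LoadGrammar(new Grammar(BuildCommandBuilder(culture)));
                    return engine;
                }
            }
            ```

            Calling SpeechCultureSketch.CreateEngine("en-GB") would stand in for the engine creation and LoadGrammar lines in the constructor.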

          • vetinari says:

            Also, i’m trying to implement the speech recognition into a simple c# game in visual studio, any ideas on how i could transfer this code to the game, not using a Form?

          • LuCuS says:

            Glad you got it working. Are you working on an XNA game? If so, remove the line “using System.Windows.Forms;” and it will show you all of the stuff that can be removed to get rid of the form. In the class declaration (“public partial class Form1 : Form”), you will also need to remove the words “partial” and “ : Form”. In the switch-case section, instead of calling “Process.Start”, you will need to replace those calls with the code that does whatever you want your game to do based on the commands passed in.

          • vetinari says:

            Yes its an XNA game. Basically what i want to do is make it so that my character jumps when i say the word jump, or up or some variant. The game is going to be a 2D parallax side scroller where you have to jump obstacles but i want to use Speech controls rather than buttons (i know buttons would be easier but i want to make a speech program). I did what you told me to do but i have a few errors with the InitializeComponent(); line [error: method must have a return type]. Also i have an error for recognitionEngine which says [recognition engine is a field but is used like a type] any suggestions? Thankyou for your help with this

          • LuCuS says:

            You’ll also need to remove the InitializeComponent(); line and the btnStart_Click and btnStop_Click methods. You’ll then need to put this line “recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);” after your “recognitionEngine.LoadGrammar(CreateGrammarObject());” line. Which line is telling you that the recognition engine is used like a type?

          • LuCuS says:

            In the end, you’ll only really need this one method. Then, in your “initialize” method for XNA, you will need to call the following method:

            public void initializeSpeechEngine()
            {
                SpeechRecognitionEngine recognitionEngine = new SpeechRecognitionEngine();
                recognitionEngine.SetInputToDefaultAudioDevice();
                recognitionEngine.SpeechRecognized += (s, args) =>
                {
                    // Keep only the words the engine is reasonably sure about.
                    string line = "";
                    foreach (RecognizedWordUnit word in args.Result.Words)
                    {
                        if (word.Confidence > 0.5f)
                            line += word.Text + " ";
                    }

                    // No "Start" prefix in this grammar, so trimming is enough.
                    string command = line.Trim();

                    switch (command)
                    {
                        case "jump":
                            // Add code to jump
                            break;
                        case "left":
                            // Add code to turn left
                            break;
                        case "right":
                            // Add code to turn right
                            break;
                        case "down":
                            // Add code to duck down
                            break;
                    }
                };

                // Listen for the four game commands only.
                Choices commandChoices = new Choices("jump", "left", "right", "down");
                GrammarBuilder grammarBuilder = new GrammarBuilder();
                grammarBuilder.Append(commandChoices);
                Grammar g = new Grammar(grammarBuilder);
                recognitionEngine.LoadGrammar(g);
                recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
            }

          • vetinari says:

            Okay i’m a little confused, I understand that the code in your last reply goes in the initialize method, however i’m unsure as to what goes in the speech recognition class?

          • LuCuS says:

            You will no longer have a speech recognition class. All of that is being moved into your Initialize method.

          • vetinari says:

            Okay thanks a lot for your help, one last thing assuming this works. Now that ive moved that to the initialize, where do i put my culture lines to recognise en-GB? still in the initialize method?

          • LuCuS says:

            Yes. All of your speech recog. code will go in your initialize method.

        • vetinari says:

          Also, Where is the code that tells the program to start listening for my input? I can’t seem to see it unless im being dumb as usual :)

          • LuCuS says:

            The last line of that code snippet I sent you “recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);” tells the engine to begin listening for input. You might want to move this line to another place within your game so that it doesn’t begin listening as soon as you run it. Instead, it should wait until the level has been loaded and the game has begun. The new location for that line will of course depend on how you’ve designed your game and where the gameplay actually begins.

          • vetinari says:

            Okay ive set everything up as requested but it doesn’t do what i ask (draw a string to screen for now) any ideas? it compiles and runs fine just no output from the recognition, sorry to keep bothering you about this, once i get it working it will be sorted i just dont understand it very well atm

          • LuCuS says:

            It’s not a problem at all. That’s what I’m here for. However, I don’t understand your question. Can you explain it a little more? Maybe it’s just too early for my brain to be engaged.

          • vetinari says:

            Sorry i forget sometimes that its a different time zone for you! Okay here goes. I’ve set the code for the speech recognition as you suggested and everything compiles fine. However when i speak the word Jump nothing happens. I have entered some draw code to write the word Jump to the screen when i speak (just so i know it works) but that doesnt happen. I’ve put the spriteBatch.DrawString method in the initialize as well under the case “jump”: line and before break;. Is this the problem? I don’t have a working game yet as I wanted to build the game around the speech aspect, hence why i am not making my character jump yet. Any suggestions as to how to make the jump command write to the screen for now? Sorry its long winded but hope this makes more sense. Thanks again for your help

          • LuCuS says:

            Ah, I get it now. Yes, that is the problem. Every time you want to draw something to the screen (including text), you have to do it in your “draw” method. Parts of the speech code will need to be moved into your “update” and “draw” methods. Later tonight, I’ll try to put together a working example for you. I’ve done a ton of XNA programming, but don’t have the time to put it together for you right this minute. I used to own a website called EverydayXNA.com, but I didn’t have time to keep up with that site and everything else I had going on. So, the domain name expired. But, I do still remember how to do a lot of that stuff. I wish I still had that site running, as I had an article there for a game that worked very similarly to what you’re wanting to do.

          • vetinari says:

            Thanks a lot that would be really helpful :) yeah your tutorials are really useful and good to follow i wish i would have been able to see that site

          • LuCuS says:

            Sorry for the late reply. I’ve had a lot going on lately and not much time for anything else. However, I did finally find time to put together a starter app for you. It currently listens for the commands “up”, “down”, “left”, and “right”. But, you should get the basic idea easy enough to implement your own commands to get your game going. You can download the example source code from http://www.prodigyproductionsllc.com/downloads/GameWithSpeechCommands.zip. Good luck with your game and be sure to share your project story with us. I’d like to see your game when it’s ready.

  3. vetinari says:

    Hi LuCus, Sorry I havent replied i’ve been away from my computer. Thanks a lot for the help its a great starting point, one question, is it right that the code for the movement is in the draw method? this seems really strange to me, thanks.

    • LuCuS says:

      You can and should move the piece for your commands into the Update method. Your Update method should look like this:

      protected override void Update(GameTime gameTime)
      {
          // Allows the game to exit
          if (GamePad.GetState(PlayerIndex.One).Buttons.Back == ButtonState.Pressed)
              this.Exit();

          // TODO: Add your update logic here
          switch (command)
          {
              case "up":
                  // Add code to move up
                  break;
              case "down":
                  // Add code to move down
                  break;
              case "left":
                  // Add code to move left
                  break;
              case "right":
                  // Add code to move right
                  break;
          }

          base.Update(gameTime);
      }

      • vetinari says:

        Thanks, that’s what I thought just wanted to double check.
        I have some problems with the game. Basically my game is like pong now, I command the bat to move up and down and it does, but it moves right to the screen limits rather than incrementally. This should be easily fixable in my move code. However my main problem is that I want keyboard control as well as speech control and when I use my first speech command it seems to disable the keyboard commands that otherwise work fine. Not quite sure why, any ideas?

        • vetinari says:

          Also, I would prefer to be able to give multiple commands, i.e. down, down , down, so that it moves a little bit at a time however if i speak the command too quickly it doesn’t recognise it very well, do i have to change the way the recognition engine works to make this possible?

        • LuCuS says:

          Can you share with me your code? It’ll make it easier to assist you.
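
          In the meantime, here is a guess, and only a guess until I see the code. If the last recognized command stays stored in a variable, the switch in Update will run it again on every frame, which could explain the bat sliding all the way to the edge of the screen. One way around that is to hand each command to Update exactly once. The sketch below assumes a small helper class that I am calling CommandQueue (a made-up name, not part of the downloadable example).

          ```csharp
          using System.Collections.Concurrent;

          // Hypothetical helper, not part of the downloadable example.
          public class CommandQueue
          {
              // Thread-safe, in case the speech event and Update run on
              // different threads.
              private readonly ConcurrentQueue<string> pending = new ConcurrentQueue<string>();

              // Call this from the SpeechRecognized handler.
              public void Push(string command)
              {
                  pending.Enqueue(command);
              }

              // Call this from Update. Returns false when nothing new was spoken,
              // so each recognized command is acted on exactly once.
              public bool TryTake(out string command)
              {
                  return pending.TryDequeue(out command);
              }
          }
          ```

          In Update, you would call TryTake in a loop and run the switch only for commands it hands back, moving the bat a fixed step for each one. Saying “down, down, down” would then produce three small moves instead of one long slide.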

  4. Sniderman22 says:

    Would it be wise to create a different grammar/recognition engine when you want a single program to be able to recognize a wider variety of words? Take, for example, this program here that recognizes and starts apps. If I wanted to also have it perform predefined functions for note taking or anything else not really relevant to starting an app, would it be wise to just create another separate recognition engine and related grammar objects and such?
    Oh, and could string command = Regex.Replace(line, "Start", "").Trim(); also recognize two words instead of just “Start”, or a phrase? Same for the Choices object that recognizes the names of the apps: could it recognize a small phrase per choice as well?

    • LuCuS says:

      Creating separate grammars is completely up to you. You can create them separately, or you can create one grammar object with different Choices. As for the line “string command = …”, remove “Start” from your GrammarBuilder in the CreateGrammarObject function. Then, change the line “string command = …” to “string command = line.Trim();”. The only thing those lines do is force the engine to recognize only phrases that begin with the word “Start”. You can then add as many Choices to listen for as you want. And yes, you can listen for phrases as well.
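
      To make the “separate grammars” option a bit more concrete, here is a rough sketch of one engine with two named grammars. The class name, the “Note Commands” grammar, and its phrases are all made up for illustration; only the “Start” grammar comes from the article.

      ```csharp
      using System.Speech.Recognition;

      // Hypothetical sketch; names and note phrases are illustrative only.
      public static class MultiGrammarSketch
      {
          // Same "Start <application>" grammar as in the article, now named.
          public static Grammar BuildLaunchGrammar()
          {
              Choices apps = new Choices("Calculator", "Notepad", "Internet Explorer", "Paint");
              GrammarBuilder builder = new GrammarBuilder("Start");
              builder.Append(apps);
              Grammar g = new Grammar(builder);
              g.Name = "Available Programs";
              return g;
          }

          // A second grammar whose choices are whole phrases.
          public static Grammar BuildNoteGrammar()
          {
              Choices notes = new Choices("take a note", "read my notes", "clear my notes");
              Grammar g = new Grammar(new GrammarBuilder(notes));
              g.Name = "Note Commands";
              return g;
          }

          // One engine can hold both grammars at the same time.
          public static void LoadAll(SpeechRecognitionEngine engine)
          {
              engine.LoadGrammar(BuildLaunchGrammar());
              engine.LoadGrammar(BuildNoteGrammar());
          }
      }
      ```

      Inside the SpeechRecognized handler, args.Result.Grammar.Name tells you which grammar matched, so you can send “Available Programs” results to the launch code and “Note Commands” results to your note-taking code.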
