It seems as though speech recognition is once again becoming a very popular topic. Over the last few weeks, I’ve received hundreds of emails from readers asking how to do different things using speech recognition and C#. One request came from a reader interested in controlling Firefox using speech recognition and C#. Curious about this myself, I put together a simple application to get him started, and now I’m going to share that app with the rest of you. I’m not going to explain in detail how any of this works because I’ve already covered the basics in my other speech recognition articles (1, 2, 3). Instead, I’m just going to post the code here for you to enjoy. The only commands I have implemented so far are “home” and “back”, but the example still gives you enough information to implement more. As always, if you have any questions or comments, please leave them in the comments section below. You can download the complete project at http://www.prodigyproductionsllc.com/downloads/FirefoxAutomation.zip.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Reflection;

using System.Windows.Automation;

using Automation = System.Windows.Automation;

using System.Speech.Recognition;

namespace FirefoxAutomation
{
    public partial class MainForm : Form
    {
        private SpeechRecognitionEngine recognitionEngine;

        public MainForm()
        {
            InitializeComponent();
        }

        private void MainForm_FormClosing(object sender, FormClosingEventArgs e)
        {
            if (recognitionEngine != null)
                recognitionEngine.RecognizeAsyncStop();
        }

        private void btnStart_Click(object sender, EventArgs e)
        {
            Initialize();
        }

        #region Cross Thread Control Delegates
        delegate void SetValueDelegate(Object obj, Object val, Object[] index);
        private void SetControlProperty(Control ctrl, String propName, Object val)
        {
            PropertyInfo propInfo = ctrl.GetType().GetProperty(propName);
            Delegate dgtSetValue = new SetValueDelegate(propInfo.SetValue);
            ctrl.Invoke(dgtSetValue, new Object[3] { ctrl, val, /*index*/ null });
        }

        delegate object GetText(Control ctrl, String propName);
        private object GetControlProperty(Control ctrl, String propName)
        {
            return ctrl.GetType().GetProperty(propName).GetValue(ctrl, null);
        }
        #endregion

        private void LogMessage(String message)
        {
            string tempText = txtConsole.Invoke(new GetText(GetControlProperty), txtConsole, "Text").ToString();
            SetControlProperty(txtConsole, "Text", tempText + message + Environment.NewLine);
        }

        private void Initialize()
        {
            AutomationElement rootElement = AutomationElement.RootElement;

            if (rootElement != null)
            {
                Automation.Condition condition = new PropertyCondition(AutomationElement.ClassNameProperty, "MozillaWindowClass");

                LogMessage("Searching for Firefox Window...");
                AutomationElement appElement = rootElement.FindFirst(TreeScope.Children, condition);

                if (appElement != null)
                {
                    Automation.Condition btnBackCondition = new PropertyCondition(AutomationElement.NameProperty, "Back");
                    LogMessage("Searching for the back button...");
                    AutomationElement btnBack = appElement.FindFirst(TreeScope.Subtree, btnBackCondition);
                    if (btnBack == null)
                    {
                        LogMessage("Error finding back button");
                        return;
                    }
                    InvokePattern btnBackPattern = btnBack.GetCurrentPattern(InvokePattern.Pattern) as InvokePattern;
                    LogMessage("Found the back button");

                    Automation.Condition btnHomeCondition = new PropertyCondition(AutomationElement.NameProperty, "Home");
                    LogMessage("Searching for the home button...");
                    AutomationElement btnHome = appElement.FindFirst(TreeScope.Subtree, btnHomeCondition);
                    if (btnHome == null)
                    {
                        LogMessage("Error finding Home button");
                        return;
                    }
                    InvokePattern btnHomePattern = btnHome.GetCurrentPattern(InvokePattern.Pattern) as InvokePattern;
                    LogMessage("Found the home button");

                    LogMessage("Beginning speech recognition...");
                    recognitionEngine = new SpeechRecognitionEngine();
                    recognitionEngine.SetInputToDefaultAudioDevice();
                    recognitionEngine.SpeechRecognized += (s, args) =>
                    {
                        string line = "";
                        foreach (RecognizedWordUnit word in args.Result.Words)
                        {
                            if (word.Confidence > 0.5f)
                                line += word.Text + " ";
                        }

                        string command = line.Trim();

                        switch (command)
                        {
                            case "back":
                                LogMessage("Clicking the back button");
                                btnBackPattern.Invoke();
                                LogMessage("Clicked the back button");
                                break;
                            case "home":
                                LogMessage("Clicking the home button");
                                btnHomePattern.Invoke();
                                LogMessage("Clicked the home button");
                                break;
                        }

                        LogMessage(line);
                    };

                    recognitionEngine.UnloadAllGrammars();
                    recognitionEngine.LoadGrammar(CreateGrammars());
                    recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
                }
                else
                {
                    LogMessage("Error locating Firefox window");
                }
            }
        }

        private Grammar CreateGrammars()
        {
            Choices commandChoices = new Choices("back", "home");
            GrammarBuilder grammarBuilder = new GrammarBuilder();
            grammarBuilder.Append(commandChoices);
            return new Grammar(grammarBuilder);
        }
    }
}

29 Responses to Control Firefox with Speech Recognition and C#

  1. Visitha says:

    Hi LuCuS,

    Can you guide me on how to scroll up, scroll down, select a link, and browse to identified URLs in Firefox? I have to implement more functionality in your application. Thanks for your guidance.
    If you have time, please give me some guidance on this.

    • LuCuS says:

      To scroll, you will need to create a new condition where AutomationElement.LocalizedControlTypeProperty = “document”. Then, you’ll need to get a ScrollPattern from the element it finds and add “scroll up” and “scroll down” commands to your grammar. When those are recognized, you’ll need to tell your document pattern to “ScrollVertical(ScrollAmount.SmallDecrement)” or “ScrollVertical(ScrollAmount.SmallIncrement)”. For clicking links, you’ll need to search the document automation element you located for scrolling for any element where AutomationElement.LocalizedControlTypeProperty = “hyperlink”. Once you’ve found those, you’ll invoke them the same way you did your buttons. You’ll also need to implement “next link” and “previous link” commands, and draw a rectangle or something similar around the current link to show which one will get invoked when the user issues a “click link” command.

      • Visitha says:

        If you can, please give me the code for scroll up and scroll down. I got an error when writing that code. =(

      • Visitha says:

        If you can, please improve your application to scroll a Firefox page down and up. It would be really helpful to me.
        Thanks in advance.

        • LuCuS says:

          I really don’t have the time to work on this project too much. But, you’re basically looking at something like this:

          Automation.Condition documentCondition = new PropertyCondition(AutomationElement.LocalizedControlTypeProperty, "document");
          LogMessage("Searching for the document...");
          AutomationElement document = appElement.FindFirst(TreeScope.Subtree, documentCondition);
          if (document == null)
          {
              LogMessage("Error finding the document");
              return;
          }
          ScrollPattern documentPattern = document.GetCurrentPattern(ScrollPattern.Pattern) as ScrollPattern;
          LogMessage("Found the document");

          In your switch case:
          case "scroll up":
              LogMessage("Scrolling up");
              documentPattern.ScrollVertical(ScrollAmount.SmallDecrement);
              LogMessage("Scrolled up");
              break;
          case "scroll down":
              LogMessage("Scrolling down");
              documentPattern.ScrollVertical(ScrollAmount.SmallIncrement);
              LogMessage("Scrolled down");
              break;

          • Visitha says:

            Thank you very much, Mr. LuCuS. That is the code I was looking for. But it gives a runtime error on the line “ScrollPattern documentPattern = document.GetCurrentPattern(ScrollPattern.Pattern) as ScrollPattern;” saying “Unsupported Pattern”.

            This error message was also displayed when I wrote a method to create a new tab. That was solved when I first opened a new tab in Firefox and then ran the application.
            But this time I couldn’t find a way to solve it. Is the only requirement to have Firefox running in the background, or is there anything else? Is there anything I should do or run before running this application? I’m working with Firefox version 10.

          • Visitha says:

            private const int WM_VSCROLL = 277; // Vertical scroll
            private const int SB_PAGEUP = 2; // Scrolls one page up
            private const int SB_PAGEDOWN = 3; // Scrolls one page down

            [DllImport("user32.dll", CharSet = CharSet.Auto)]
            private static extern int SendMessage(IntPtr hWnd, int wMsg, IntPtr wParam, IntPtr lParam);

            // controlHandle is a placeholder for the handle of the control to scroll
            SendMessage(controlHandle, WM_VSCROLL, (IntPtr)SB_PAGEDOWN, IntPtr.Zero);

            Can’t we use the above method to scroll the Firefox window up or down? The above function works perfectly with a text box, but I couldn’t scroll Firefox up or down using it.
            Please share any ideas you have.
            Thanks in advance.

          • LuCuS says:

            There are ways of calling directly into Firefox like you mentioned using the DllImport. There are also other tools such as Selenium – browser automation framework (http://code.google.com/p/selenium/?redir=1) which can help with things like this. I’ve never tried to automate Firefox that much using external tools such as C#. So, everything I’ve suggested is all theoretical.

          • suman says:

            I tried the above code to handle the scrollbar when automating Firefox,
            but I am getting an error in the scroll pattern. Please help me with the code.
            (ScrollPattern documentPattern = document.GetCurrentPattern(ScrollPattern.Pattern) as ScrollPattern;)

          • suman says:

            But the above code gives an error on this line:
            ScrollPattern documentPattern = document.GetCurrentPattern(ScrollPattern.Pattern) as ScrollPattern;

            The error message is “unspecified pattern”.

  2. Visitha says:

    Thanks LuCuS,
    Can you explain why I am getting a runtime error on the line “ScrollPattern documentPattern = document.GetCurrentPattern(ScrollPattern.Pattern) as ScrollPattern;” saying “Unsupported Pattern”? Did I miss anything before running the application, or anything I need to import into it? Is it necessary to run UISpy.exe while running the application? The only thing I am doing is running Firefox in the background before running the application. When I got that error while creating a “new tab”, it was solved by opening a new tab and then running the application.

    • LuCuS says:

      You don’t need to be running UISpy when running your application. But, you will need to use UISpy to walk down the application dom until you reach the document. When you get there, look on the right of UISpy to see what control patterns it supports. I haven’t actually tried the scroll pattern in Firefox. I’ve used it with several other apps and it has always worked great. You might need to change your automation condition to point to another element in the dom besides document.
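One way to run the same check from code, rather than by eye in UISpy, is to ask an element which patterns it reports before requesting one. The following is a minimal sketch and has not been tested against Firefox. It assumes the “MozillaWindowClass” window from the article is open, and the class name PatternProbe and its two helper methods are hypothetical names chosen for illustration. It uses GetSupportedPatterns, IsScrollPatternAvailableProperty, and TryGetCurrentPattern from System.Windows.Automation, and needs references to the UIAutomationClient and UIAutomationTypes assemblies.

```csharp
using System;
using System.Windows.Automation;

// Hypothetical helper for illustration; not part of the original project.
static class PatternProbe
{
    // Prints the control patterns that an element reports as supported.
    public static void DumpPatterns(AutomationElement element)
    {
        foreach (AutomationPattern pattern in element.GetSupportedPatterns())
            Console.WriteLine(pattern.ProgrammaticName);
    }

    // Returns the ScrollPattern of the first element under root that
    // advertises scroll support, or null if no such element exists.
    public static ScrollPattern FindScrollPattern(AutomationElement root)
    {
        var scrollable = new PropertyCondition(
            AutomationElement.IsScrollPatternAvailableProperty, true);
        AutomationElement element = root.FindFirst(TreeScope.Subtree, scrollable);
        if (element == null)
            return null;

        // TryGetCurrentPattern returns false instead of throwing when the
        // pattern is unavailable.
        object pattern;
        return element.TryGetCurrentPattern(ScrollPattern.Pattern, out pattern)
            ? (ScrollPattern)pattern
            : null;
    }

    static void Main()
    {
        var firefox = new PropertyCondition(
            AutomationElement.ClassNameProperty, "MozillaWindowClass");
        AutomationElement app =
            AutomationElement.RootElement.FindFirst(TreeScope.Children, firefox);
        if (app == null)
        {
            Console.WriteLine("Error locating Firefox window");
            return;
        }

        DumpPatterns(app);
        ScrollPattern scroll = FindScrollPattern(app);
        Console.WriteLine(scroll == null
            ? "No element under the Firefox window reports ScrollPattern"
            : "Found an element that supports ScrollPattern");
    }
}
```

If FindScrollPattern returns null, no element under that window exposes ScrollPattern, which would be consistent with the “Unsupported Pattern” error reported above.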

  3. Visitha says:

    Hi Lucus,
    It has been a long time. I was very busy with my Firefox automation project at university, and last week I successfully finalized it. Thank you very much for your help, LuCuS. Without your help, I might not have been able to finish it successfully. Please continue your service for newcomers all over the world, and please share your great knowledge with us by publishing more and more articles like this.

    Thanks again Lucus..!!

    • LuCuS says:

      Awesome! I’m glad I could help. I also hope you continue following and contributing to my blog. And, when you turn your project into a commercial product and start making money with it, I’d like to hear about it. Keep being innovative!

  4. shuvro says:

    Hi LuCuS ,
    It has been a long time since I contacted you, as I was very busy with my website project at university. But I never lost my interest in speech recognition, which is why I came back here. I think this is one of the best blogs on speech recognition for newcomers.
    Now to the topic. How can I use text-to-speech to read the contents of a webpage? A webpage may contain various items. How can I decide which ones the speech engine should read first? I would appreciate a tutorial from you if you have time.

    • LuCuS says:

      I’m currently in the middle of several projects, including several new articles for this site. However, I will share some simple code with you that will help you get started with text-to-speech. As soon as time permits, I’ll write up an article that shows how to read web pages. Basically, you’ll need to scrape a webpage and strip out anything that is not visible text (such as titles, meta data, keywords, html, javascript, css, etc…). Once you’re down to the “meat” of the web page, it’s extremely easy to read it out loud. First, add a new reference to your app for “System.Speech” by right-clicking on “References” in your “Solution Explorer” and selecting “Add Reference…”. Go to the “.NET” tab and select “System.Speech”. Then, add a reference to that library with a “using” statement. After that, add a global “SpeechSynthesizer”. This is what does all of the work for you. All you have to do is pass the text you want read to the “Speak” or “SpeakAsync” methods. I prefer to use the “SpeakAsync” method so that it doesn’t freeze my application until it has finished reading the text. Also, you want to make sure you Dispose of your SpeechSynthesizer every time you close your app. Otherwise your system sound controls and speech synth can get all out of whack and you’ll have to restart your computer. You can download the example app I just threw together from http://www.prodigyproductionsllc.com/downloads/TextToSpeech.zip. For a quick reference, here is all of the code I used in the example to read text from a rich text box. Hopefully this will get you started with TTS.

      using System;
      using System.Windows.Forms;
      using System.Speech.Synthesis;

      namespace TextToSpeech
      {
          public partial class Form1 : Form
          {
              private SpeechSynthesizer _synth;

              public Form1()
              {
                  InitializeComponent();
                  this._synth = new SpeechSynthesizer();
              }

              private void btnReadIt_Click(object sender, EventArgs e)
              {
                  this._synth.SpeakAsync(txtInput.Text);
              }

              private void Form1_FormClosing(object sender, FormClosingEventArgs e)
              {
                  if (this._synth != null)
                      this._synth.Dispose();
              }
          }
      }

      • shuvro says:

        Yes LuCuS, I have done that in the past, and you also advised me at that time. What I actually want is to implement it for a webpage. I’m not in a hurry now, so I hope you will give an example when you get time. I will wait for your tutorial. Thanks again.

  5. Deepak says:

    Hi,

    Your article is very nice. I tried it and it works.
    I was also trying it with another application, but I got stuck finding the appropriate ClassNameProperty for it.
    Automation.Condition condition = new PropertyCondition(AutomationElement.ClassNameProperty, "MozillaWindowClass");

    I tried different names, such as the class name, the main form name, and the application’s namespace, but could not find it. Can you please tell me how to get the ClassNameProperty, like the “MozillaWindowClass” you mentioned?

  6. Deepak says:

    Thanks for the reply, LuCuS. It helped a lot.

  7. shuvro says:

    If I want to control Internet Explorer, what criteria should I follow?

  8. shuvro says:

    Anyone here to answer me..??

    • LuCuS says:

      Sorry for the late reply. I approved your comment from my cellphone with the intention of replying to you when I got to a computer. But, I guess I forgot to get back to you.

      Anyways, I don’t know the exact conditions and patterns you will need. But, you can find those the same way I did when automating Firefox. In this article (http://www.prodigyproductionsllc.com/articles/automation/windows-automation-with-c/), I mention a tool called “UISpy”. I also have a download for UISpy in that article. Use UISpy and follow the instructions in that article to walk the Internet Explorer DOM to see which conditions and patterns you will need to use.

      To get you started, the first thing you will want to locate and add a reference to is the IE window. You can do that at line 67 of the code above, where the condition for the Firefox window is created. Just change the condition to look for a ClassNameProperty of “IEFrame”.
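As a rough sketch of that change (untested against Internet Explorer), the window lookup from Initialize could be pulled into a helper that takes the window class name. WindowFinder and FindTopLevelWindow are hypothetical names used only for illustration; “IEFrame” is the class name mentioned above. The project needs references to the UIAutomationClient and UIAutomationTypes assemblies.

```csharp
using System;
using System.Windows.Automation;

// Hypothetical helper for illustration; not part of the original project.
static class WindowFinder
{
    // Finds the first top-level window whose class name matches, or null.
    public static AutomationElement FindTopLevelWindow(string className)
    {
        var condition = new PropertyCondition(
            AutomationElement.ClassNameProperty, className);
        return AutomationElement.RootElement.FindFirst(TreeScope.Children, condition);
    }

    static void Main()
    {
        AutomationElement ie = FindTopLevelWindow("IEFrame");
        Console.WriteLine(ie == null
            ? "Error locating Internet Explorer window"
            : "Found window: " + ie.Current.Name);
    }
}
```

The button names and control patterns under that window still have to be discovered with UISpy, as described above.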

  9. wheels says:

    I cannot seem to get the code to work. There are no errors, but no text appears in the Rich TextBox on the WinForm. I am reading from a .wav file.

    using System;
    using System.Text;
    using System.Windows.Forms;
    using System.Speech.Recognition;

    namespace SpeechRecognition
    {
        public partial class MainForm : Form
        {
            public MainForm()
            {
                InitializeComponent();
                SpeechRecognitionEngine recognitionEngine = new SpeechRecognitionEngine();
                recognitionEngine.SetInputToWaveFile(@"c:\Users\ID\Desktop\Transcription\overhere.wav");
                //recognitionEngine.SetInputToDefaultAudioDevice();
                recognitionEngine.SpeechRecognized += (s, args) =>
                {
                    foreach (RecognizedWordUnit word in args.Result.Words)
                    {
                        // You can change the minimum confidence level here
                        //if (word.Confidence > 0.8f)
                        if (word.Confidence > 0)
                            freeTextBox.Text += word.Text + " ";
                    }
                    freeTextBox.Text += Environment.NewLine;
                };
                Grammar g = new Grammar(@"c:\\Users\\ID\\Desktop\\Transcription\\grammar.xml", "command1");
                recognitionEngine.LoadGrammar(g);
                recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
            }
        }
    }

    Can anyone see where I may be going astray?

    • LuCuS says:

      Are you expecting this to listen for specific commands mentioned in the WAV file? If so, you have to make sure that the commands you are looking for in the WAV file are also listed in your grammar.xml file. If the only thing you are wanting to do is dictate everything that is said in the WAV file, replace the following 2 lines:

      Grammar g = new Grammar(@"c:\\Users\\ID\\Desktop\\Transcription\\grammar.xml", "command1");
      recognitionEngine.LoadGrammar(g);

      with this:

      recognitionEngine.LoadGrammar(new DictationGrammar());

  10. wheels says:

    I have a bunch of commonly spoken words in the XML file that I want the code to ‘listen’ for. I replaced my code with recognitionEngine.LoadGrammar(new DictationGrammar()); and I finally got some output (for some .wav files but not others). For example, I have a .wav file that says “I’ve fallen and I can’t get up”. With the XML grammar (all of the words are in there) I get ‘that’. With the recognitionEngine.LoadGrammar(new DictationGrammar()); method, I get ‘then high’. I originally added the XML to increase the success rate.

    When I researched online, I saw references to Natural Speak. WHEELS

    • LuCuS says:

      Does the voice in the WAV file have an accent? Is your computer’s language set to English? You can check it by going to Control Panel > Region and Language. If the language / format is not set to English, you have 2 choices. You can either change the language to English or you can incorporate a “CultureInfo” where you will need to include the language / culture code for the language / dialect being spoken in the WAV file. Here is the code that shows how to do that.

      System.Globalization.CultureInfo culture = new System.Globalization.CultureInfo("en-US");
      recognitionEngine = new SpeechRecognitionEngine(culture);

  11. wheels says:

    No accent in the .wav files. My computer language is set to English. I added the CultureInfo, but there was no difference in the output. Could you set up a grammar file (I would be happy to send you mine) and try ‘reading’ a .wav file?
