We are going to implement speech-to-text in Python. To do this, we first have to install the following packages:
- pip install SpeechRecognition
- pip install PyAudio
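Before going further, it can help to confirm that both packages are importable. The following is a minimal sketch using only the standard library. The helper name missing_packages and the REQUIRED_MODULES mapping are made up for this illustration. Note that the module names used in import statements (speech_recognition and pyaudio) differ from the pip package names.

```python
import importlib.util

# pip package name -> module name used in an import statement
REQUIRED_MODULES = {
    "SpeechRecognition": "speech_recognition",
    "PyAudio": "pyaudio",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of the packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, run: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```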
Next, we import the SpeechRecognition library (its module name is speech_recognition; the code in this article refers to it through the alias linuxHint) and initialize a recognizer. Without an initialized Recognizer object, we cannot pass audio in as input, and nothing will be recognized.
There are two ways to pass the input audio to the recognizer:
- Recorded audio
- Using the default Microphone
This time, we implement the default option (the microphone). That is why we use the Microphone class, as shown below:
with linuxHint.Microphone() as microphone:
But if we want to use pre-recorded audio as the input source, the syntax looks like this:
with linuxHint.AudioFile(filename) as source:
Now, we use the record method of the recognizer. The syntax of the record method is:
record(source, duration)
Here, source is our microphone and duration is the number of seconds to record. We pass duration=10, which tells the recognizer to capture the user's voice from the microphone for 10 seconds and then stop automatically.
Then we use the recognize_google() method, which accepts the recorded audio and converts it to text. It sends the audio to Google's Web Speech API, so an internet connection is required.
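Putting the steps above together, a minimal sketch of the microphone flow might look like this. The function names capture and transcribe_from_microphone are invented for this example. The alias linuxHint follows the article. The error handling relies on the UnknownValueError and RequestError exceptions that the SpeechRecognition library raises.

```python
def capture(recognizer, source, duration=10):
    """Open the source, record `duration` seconds, and return the audio."""
    with source as opened_source:
        return recognizer.record(opened_source, duration=duration)


def transcribe_from_microphone(duration=10):
    """Record from the default microphone and return the text, or None."""
    # Imported inside the function so that capture() above can be reused
    # without the third-party package being installed.
    import speech_recognition as linuxHint

    recognizer = linuxHint.Recognizer()
    audio = capture(recognizer, linuxHint.Microphone(), duration)
    try:
        return recognizer.recognize_google(audio)
    except linuxHint.UnknownValueError:
        # The service could not make out any speech in the recording.
        return None
    except linuxHint.RequestError as error:
        # The request failed, for example because there is no connection.
        raise RuntimeError(f"Speech service request failed: {error}") from error
```

Calling print(transcribe_from_microphone()) would then record for 10 seconds and print the recognized text, assuming a working microphone and an internet connection.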
The above code accepts input from the microphone. But sometimes we want to take the input from pre-recorded audio instead. The code for that is given below; its syntax was already explained above.
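For the pre-recorded case, one possible sketch is the following. The helper names read_clip and transcribe_file and the file name recording.wav are placeholders for this example. It assumes the file is in a format that AudioFile can read, such as WAV, AIFF, or FLAC. When duration is left as None, record reads until the end of the file, and the optional offset skips that many seconds at the start.

```python
def read_clip(recognizer, audio_file, offset=None, duration=None):
    """Read audio from a file source; by default the whole file is read."""
    with audio_file as source:
        return recognizer.record(source, duration=duration, offset=offset)


def transcribe_file(filename, offset=None, duration=None):
    """Convert a pre-recorded audio file to text."""
    # Imported here so that read_clip() has no hard dependency on the package.
    import speech_recognition as linuxHint

    recognizer = linuxHint.Recognizer()
    audio = read_clip(recognizer, linuxHint.AudioFile(filename), offset, duration)
    return recognizer.recognize_google(audio)
```

For example, transcribe_file("recording.wav", duration=10) would convert only the first 10 seconds of the placeholder file.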
We can also change the language option in the recognize_google method. For example, we can change the language from English to Hindi, as shown below:
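A minimal sketch of the language switch, assuming the language tag "hi-IN" for Hindi (recognize_google defaults to "en-US"). The function names recognize_in_language and transcribe_hindi_from_microphone are made up for this illustration.

```python
def recognize_in_language(recognizer, audio, language="hi-IN"):
    """Convert the audio to text in the given language (Hindi by default)."""
    return recognizer.recognize_google(audio, language=language)


def transcribe_hindi_from_microphone(duration=10):
    """Record from the default microphone and return the Hindi text."""
    # Imported here so the helper above can be used on its own.
    import speech_recognition as linuxHint

    recognizer = linuxHint.Recognizer()
    with linuxHint.Microphone() as microphone:
        audio = recognizer.record(microphone, duration=duration)
    return recognize_in_language(recognizer, audio, language="hi-IN")
```

Other languages work the same way by passing a different tag, as long as the recognition service supports it.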