Speech recognition is a difficult but interesting task for many beginners. In this post, we describe an easy way to tackle this tough task using PocketSphinx. The SpeechRecognition package used later in this post also supports several engines besides CMU Sphinx (works offline):
Google Speech Recognition and Google Cloud Speech API (among the best known)
Wit.ai
Microsoft Bing Voice Recognition
Snowboy Hotword Detection (works offline)
Installation
If you prefer the command line, first install the prerequisites with the following command, as explained in this file.
sudo apt-get install python python-all-dev python-pip build-essential swig git libpulse-dev
If you are on a Mac, you can use brew to install them.
brew install swig git python
On Linux with Python 3, use this command instead:
sudo apt-get install python3 python3-all-dev python3-pip build-essential swig git libpulse-dev
After that, install the PocketSphinx package using pip:
pip install pocketsphinx
Speech Recognition Python Library
After installing all the prerequisites, you can use the Python SpeechRecognition library to work with Sphinx easily. Install it using pip with this command:
pip install SpeechRecognition
On Linux, you may need to install the following packages as well:
sudo apt-get install libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0
sudo apt-get install ffmpeg libav-tools
sudo apt-get install python-pyaudio
PyAudio
PyAudio is a package for handling audio streams in Python; SpeechRecognition needs it for microphone input. If you have not installed PyAudio already, install it now. On a Mac, follow these steps:
brew install portaudio
pip install pyaudio
Alternatively, follow the instructions on the official PyAudio website.
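Before moving on, it can help to confirm that Python actually sees the packages installed so far. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is made up for this post. Note that the import names differ from the pip names: `SpeechRecognition` is imported as `speech_recognition`.

```python
import importlib.util

# Import names (not pip names) of the packages installed in this post.
REQUIRED_MODULES = ["speech_recognition", "pocketsphinx", "pyaudio"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that the import system cannot find."""
    # find_spec() locates a top-level module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All speech recognition dependencies are importable.")
```

If a name is reported as not installed, rerun the matching install step with the same Python interpreter you use to run your scripts.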
Permission Denied Error
If you are on Windows and see an error like this:
IOError: [Errno 13] Permission denied: 'c:\\python27\\Lib\\site-packages\\_portaudio.pyd'
or something similar, close your Python IDE and any other Python code running in the background, then try again. These kinds of errors are explained in this StackOverflow question.
Quick Speech Recognition Test
If the SpeechRecognition library is installed properly, you can check that it works by typing the following command in a command prompt:
python -m speech_recognition
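If the quick test does not seem to hear you, it may be listening on the wrong input device. As a rough sketch, the snippet below reports the installed library version and the input devices it can see. The two helper names are hypothetical; `Microphone.list_microphone_names()` is part of the library and needs PyAudio.

```python
def describe_audio_setup():
    """Return the library version and the names of the detected audio devices."""
    # Imported here so that a missing install fails at call time, not load time.
    import speech_recognition as sr

    # One name per device; a name's position in the list is its device_index.
    names = sr.Microphone.list_microphone_names()
    return sr.__version__, names


def format_audio_setup(version, names):
    """Build a short, readable report from the values above."""
    lines = [f"SpeechRecognition {version}"]
    lines += [f"  device_index={i}: {name}" for i, name in enumerate(names)]
    return "\n".join(lines)
```

Calling `print(format_audio_setup(*describe_audio_setup()))` prints one line per device. The index shown can be passed as `device_index` when creating a `Microphone`.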
Finally, here is source code to test Sphinx.
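What follows is a minimal sketch rather than a definitive implementation. It assumes the SpeechRecognition, PocketSphinx, and PyAudio packages from the steps above are installed. The function names are made up for this post; `Recognizer`, `AudioFile`, `Microphone`, `record`, `listen`, `adjust_for_ambient_noise`, and `recognize_sphinx` come from the library.

```python
def _decode_with_sphinx(sr, recognizer, audio):
    """Run the offline Sphinx engine; return the text, or None if unintelligible."""
    try:
        return recognizer.recognize_sphinx(audio)
    except sr.UnknownValueError:
        # Sphinx ran but could not make sense of the audio.
        return None
    # sr.RequestError (for example, PocketSphinx not installed) is left to
    # propagate so that setup problems stay visible.


def recognize_file(path):
    """Transcribe a WAV, AIFF, or FLAC file (hypothetical helper)."""
    # Imported here so that a missing install fails at call time, not load time.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file
    return _decode_with_sphinx(sr, recognizer, audio)


def recognize_microphone():
    """Listen for one phrase on the default microphone (hypothetical helper)."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:  # needs PyAudio
        # Calibrate the energy threshold against background noise.
        recognizer.adjust_for_ambient_noise(source)
        print("Say something!")
        audio = recognizer.listen(source)  # returns once the phrase ends
    return _decode_with_sphinx(sr, recognizer, audio)
```

To try it, call `print(recognize_microphone())` and speak a short phrase. The result is the recognized text, or `None` when Sphinx could not understand the audio.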
If you wish to see all the examples and ways of using the other available APIs, you may follow this GitHub link. That's it for today. You can comment if you face any problem, and you may also want to subscribe to our YouTube channel.