DeepSpeech is an open speech-to-text engine by Mozilla. Speech synthesis and speech-to-text are fun to try out, and I read that it could run on a Raspberry Pi 4 with ease on one core, so I decided to give it a try.
The Raspberry Pi version uses Google’s TensorFlow Lite for its implementation of Baidu’s Deep Speech architecture.
Installing it on a Raspberry Pi 4 with the Buster distribution was not straightforward. First I read the instructions on the GitHub page and tried to download and install the git version, but I ran into problems. It was taking ages and I hit the famous `wheels` problem:
Failed building wheel for scipy
After tweaking and retrying a few times, I gave up on the GitHub version and tried the instructions here, but that was a bumpy road too. Success waits at the end, though.
Let’s go, how to install DeepSpeech on the RPI4
Create a dev directory:
mkdir dev
cd dev
Create a Python Virtual environment.
python3 -m venv deepspeech-train-venv
Activate the virtual environment
source deepspeech-train-venv/bin/activate
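To confirm that the activation worked before installing anything, here is a minimal sketch that asks Python itself. The helper name `in_virtualenv` is my own; the check relies on `sys.prefix` and `sys.base_prefix` differing inside a venv.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the system Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

Run with the `python3` on your path, the prefix should end in `deepspeech-train-venv` when the environment is active.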
Create the deepspeech directory
mkdir deepspeech
cd deepspeech
Install deepspeech
pip install deepspeech
Download pre-trained English model
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.0/deepspeech-0.6.0-models.tar.gz
tar xvf deepspeech-0.6.0-models.tar.gz
Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.0/audio-0.6.0.tar.gz
tar xvf audio-0.6.0.tar.gz
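If you want to feed it your own recordings later, it helps to know what the example files look like. A minimal sketch using only Python's standard `wave` module; the function names are my own, and the 16 kHz, mono, 16-bit expectation is an assumption about the released English model that you should check against the DeepSpeech documentation.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, sample_width_bytes, duration_s)."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        width = wav.getsampwidth()
        # Duration is the number of frames divided by frames per second.
        duration = wav.getnframes() / rate
    return channels, rate, width, duration


def looks_like_deepspeech_input(path, expected_rate=16000):
    """Check for mono, 16-bit audio at the expected sample rate.

    The defaults are an assumption about the pre-trained English model,
    not something read from the model file.
    """
    channels, rate, width, _ = describe_wav(path)
    return channels == 1 and rate == expected_rate and width == 2
```

For example, `describe_wav("audio/2830-3980-0043.wav")` returns the channel count, sample rate, sample width and length of that example file.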
Done. Time to run it, or so I thought. I tried to run the example from the instruction page:
deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav
Errors!?! I installed a missing dependency:
sudo apt install libatlas3-base
Still errors
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
So I checked whether I had numpy installed:
pip install numpy
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: numpy in /home/pi/dev/deepspeech-train-venv/lib/python3.7/site-packages (1.15.4)
I decided to update numpy:
pip install --upgrade numpy
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting numpy
Using cached https://www.piwheels.org/simple/numpy/numpy-1.18.0-cp37-cp37m-linux_armv7l.whl
tensorboard 2.0.2 has requirement setuptools>=41.0.0, but you'll have setuptools 40.8.0 which is incompatible.
Installing collected packages: numpy
Found existing installation: numpy 1.15.4
Uninstalling numpy-1.15.4:
Successfully uninstalled numpy-1.15.4
Successfully installed numpy-1.18.0
So I decided to update setuptools too:
pip install --upgrade setuptools
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting setuptools
Using cached https://files.pythonhosted.org/packages/f9/d3/955738b20d3832dfa3cd3d9b07e29a8162edb480bf988332f5e6e48ca444/setuptools-44.0.0-py2.py3-none-any.whl
Installing collected packages: setuptools
Found existing installation: setuptools 40.8.0
Uninstalling setuptools-40.8.0:
Successfully uninstalled setuptools-40.8.0
Successfully installed setuptools-44.0.0
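Instead of rerunning the whole example to find out whether the numpy error is gone, you can ask Python whether the module named in the error message can be found. A minimal sketch; the helper `module_available` is my own name, built on the standard `importlib.util.find_spec`.

```python
import importlib.util


def module_available(name):
    """Return True if the dotted module name can be located by the import system."""
    try:
        # find_spec imports parent packages as needed and returns None
        # when the final module cannot be found.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or malformed.
        return False
```

Inside the activated virtual environment, `module_available("numpy.core._multiarray_umath")` is the check that matches the error above. As far as I know that module only exists from numpy 1.16 on, which would explain why 1.15.4 failed.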
I tried to run the example on the instruction page again
# Transcribe an audio file
deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav
Another error
Loading model from file deepspeech-0.6.0-models/output_graph.pbmm
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
ERROR: Model provided has model identifier '='+;', should be 'TFL3'
Didn’t work. The Raspberry Pi build runs on TensorFlow Lite, so I needed to switch the model file from `output_graph.pbmm` to the `tflite` one:
deepspeech --model deepspeech-0.6.0-models/output_graph.tflite --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav
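The `TFL3` in the error message is the file identifier that TensorFlow Lite models carry; they are FlatBuffers files, and the identifier sits in bytes 4 to 7. A minimal sketch, assuming that layout, for checking a model file before handing it to deepspeech. The function names are my own and not part of DeepSpeech.

```python
def model_identifier(path):
    """Return the 4-byte FlatBuffers file identifier stored at offset 4."""
    with open(path, "rb") as f:
        header = f.read(8)
    # Bytes 0-3 hold the root table offset, bytes 4-7 the identifier.
    return header[4:8]


def is_tflite_model(path):
    """Return True if the file carries the TensorFlow Lite identifier."""
    return model_identifier(path) == b"TFL3"
```

With the files from this post, `is_tflite_model("deepspeech-0.6.0-models/output_graph.tflite")` should be true and the same call on `output_graph.pbmm` false, matching the error.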
Success in the end!
Loading model from file deepspeech-0.6.0-models/output_graph.tflite
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
Loaded model in 0.0019s.
Running inference.
why should one hault on the way
Inference took 4.091s for 2.735s audio file.
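The last line of the output tells you how the Pi keeps up: dividing inference time by audio length gives the real-time factor, where anything above 1 means slower than real time. A minimal sketch that parses that line; the regular expression assumes the exact wording shown above, and the names are my own.

```python
import re

# Matches the timing line printed by the deepspeech command in this post.
TIMING_RE = re.compile(r"Inference took ([\d.]+)s for ([\d.]+)s audio file")


def real_time_factor(log_line):
    """Return inference seconds divided by audio seconds."""
    match = TIMING_RE.search(log_line)
    if match is None:
        raise ValueError("no timing information found in: %r" % log_line)
    inference_s, audio_s = (float(g) for g in match.groups())
    return inference_s / audio_s


rtf = real_time_factor("Inference took 4.091s for 2.735s audio file.")
print("real-time factor: %.2f" % rtf)
```

For the run above that works out to roughly 1.5, so this transcription took about one and a half times as long as the audio itself.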
Then I played the audio file:
aplay audio/4507-16021-0012.wav
I must say DeepSpeech is much smarter than me; I couldn’t understand it:
why should one hault on the way
BTW, good question. Now I need another engine to answer that!
Way to go, folks.
Tags: rpi
June 10th, 2020 at 9:03 pm
Hey,
thank you for describing your installation process.
Are you actually using the setup now and how well does it work on the RPi4 and how well is the speech recognition currently?
Thank You!
Max
June 17th, 2020 at 11:17 am
To be honest, I haven’t had much time to play around with it, because I lent my RPI3 to a friend (which I need for recording audio)
The examples are working fine though 😉