Trying out DeepSpeech on a Raspberry Pi 4

{}
January 3rd, 2020

Deep Speech is an open speech-to-text engine by Mozilla. Speech synthesis and Speech to text are fun to try out, and I read that it could run on a Raspberry Pi4 with ease on one core, so I decided to give it a try.

The Raspberry Pi version is using Google’s TensorFlow Lite for an implementation of Baidu’s DeepSpeech architecture.

Installing it on a Raspberry 4 Buster distribution was not straightforward. First I read instructions on the Github page and tried to download and install the git version and, but I ran into problems. It was taking ages and I ran into the famous `wheels` problem.

Failed building wheel for scipy

After tweaking and trying a few times, i gave up on the Github version and tried the instructions here, but also that was a bumpy road. But success waits in the end.

Let’s go, how to install DeepSpeech on the RPI4

Create a dev directory:

mkdir dev
cd dev

Create a Python Virtual environment.

python3 -m venv deepspeech-train-venv

Activate the virtual environment

source dev/deepspeech-train-venv/

Create the deepspeech directory

mkdir deepspeech
cd deepspeech

Install deepspeech

pip install deepspeech

Download pre-trained English model

curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.0/deepspeech-0.6.0-models.tar.gz
tar xvf deepspeech-0.6.0-models.tar.gz

Download example audio files

curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.0/audio-0.6.0.tar.gz
tar xvf audio-0.6.0.tar.gz

Done, run, well , eh, I tried to run the example on the instruction page

deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav

Errors!?!  I installed a missing dependency:

sudo apt install libatlas3-base

Still errors

ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'

So I check if I had numpy installed

pip install numpy
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: numpy in /home/pi/dev/deepspeech-train-venv/lib/python3.7/site-packages (1.15.4)

I decided to update numpy:

pip install --upgrade numpy
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting numpy
Using cached https://www.piwheels.org/simple/numpy/numpy-1.18.0-cp37-cp37m-linux_armv7l.whl
tensorboard 2.0.2 has requirement setuptools>=41.0.0, but you'll have setuptools 40.8.0 which is incompatible.
Installing collected packages: numpy
Found existing installation: numpy 1.15.4
Uninstalling numpy-1.15.4:
Successfully uninstalled numpy-1.15.4
Successfully installed numpy-1.18.0

So i decided to update setuptools too:

pip install --upgrade setuptools
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting setuptools
Using cached https://files.pythonhosted.org/packages/f9/d3/955738b20d3832dfa3cd3d9b07e29a8162edb480bf988332f5e6e48ca444/setuptools-44.0.0-py2.py3-none-any.whl
Installing collected packages: setuptools
Found existing installation: setuptools 40.8.0
Uninstalling setuptools-40.8.0:
Successfully uninstalled setuptools-40.8.0
Successfully installed setuptools-44.0.0

I tried to run the example on the instruction page again
# Transcribe an audio file

deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav

Another error

Loading model from file deepspeech-0.6.0-models/output_graph.pbmm
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
ERROR: Model provided has model identifier '='+;', should be 'TFL3'

Didn’t work. I needed to change the model to `tflite`

deepspeech --model deepspeech-0.6.0-models/output_graph.tflite --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav

Success in the end!

Loading model from file deepspeech-0.6.0-models/output_graph.tflite
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
Loaded model in 0.0019s.
Running inference.
why should one hault on the way
Inference took 4.091s for 2.735s audio file.

Then I played the audio-file:

aplay audio/4507-16021-0012.wav

Must say DeepSpeech is much smarter then me, I couldn’t understand it:
why should one hault on the way

BTW good question. No I need another engine to answer that!

Way to go, folks.

Tags:

One Response to “Trying out DeepSpeech on a Raspberry Pi 4”

  1. Trying out DeepSpeech on a Raspberry Pi 4 - dev.webonomic.nl Says:

    Kris Jungbluth

    I found a great…

Leave a Reply