Ubuntu 18.04 and VOSK Speech Recognition API


Just some quick notes on how to install and use VOSK on Ubuntu 18.04 LTS.

  1. Install VOSK with pip: pip3 install vosk
  2. Get the sample scripts from https://github.com/alphacep/vosk-api/tree/master/python/example
  3. Use ffmpeg to extract the audio track as a mono 16-bit PCM WAV file: ffmpeg -i video.mkv -c:a pcm_s16le -ac 1 output.wav
    Running file output.wav on the result should report something like: output.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 48000 Hz
  4. Run the sample recognizer on the file; it prints the recognized text as JSON: python3 ./test_simple.py output.wav
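
For reference, here is a rough Python sketch of steps 3 and 4: it checks that the WAV file is mono 16-bit PCM, then feeds it to the recognizer in chunks. It is not a copy of test_simple.py, only an approximation of the same idea. The function names (describe_wav, is_vosk_ready, transcribe) are made up for this note, and the model_dir default assumes a VOSK model has been downloaded and unpacked into a folder called model in the current directory.

```python
import json
import wave


def describe_wav(path):
    """Return (channels, sample_width_bytes, sample_rate, compression) for a WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getnchannels(), wf.getsampwidth(), wf.getframerate(), wf.getcomptype()


def is_vosk_ready(path):
    """True if the file is mono, 16-bit, uncompressed PCM (what the ffmpeg command produces)."""
    channels, width, _rate, comp = describe_wav(path)
    return channels == 1 and width == 2 and comp == "NONE"


def transcribe(path, model_dir="model"):
    """Feed the WAV file to VOSK in chunks and return the recognized text.

    model_dir is an assumption: a VOSK model unpacked into ./model.
    """
    # Imported here so the format checks above work even without vosk installed.
    from vosk import Model, KaldiRecognizer

    if not is_vosk_ready(path):
        raise ValueError("expected a mono 16-bit PCM WAV file")

    pieces = []
    with wave.open(path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns true when an utterance has been completed.
            if rec.AcceptWaveform(data):
                pieces.append(json.loads(rec.Result()).get("text", ""))
        # Flush whatever is still buffered at the end of the file.
        pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

Calling is_vosk_ready("output.wav") before transcribe("output.wav") is a quick way to catch a stereo or compressed file early, without waiting for the model to load.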
