Fast speech recognition in Python

I'm building a voice assistant in Python with SpeechRecognition, but every time I say a command, converting it to text takes a long time (5-15 seconds), which is very unpleasant. Is there any way to make the process faster? Or can you suggest another library?

Here is the recognition code.

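import speech_recognition as sr


def recognize_cmd():
    r = sr.Recognizer()

    m = sr.Microphone(device_index=1)

    with m as source:
        print("---------")
        r.pause_threshold = 0.5
        # Calibrates against background noise for 1 second on every call.
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)

    try:
        # Sends the audio to Google's web API, so this needs a network connection.
        cmd = r.recognize_google(audio, language='en-US').lower()
        print("[log]User - " + cmd + '\n---------')
    except sr.UnknownValueError:
        # talk() is my own text-to-speech helper, defined elsewhere.
        talk("Voice is not recognized!")
        cmd = recognize_cmd()

    return cmd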

P.S. Ideally the library should have a function like adjust_for_ambient_noise(), and, if possible, it should work offline.

Author: insolor, 2020-02-25

1 answer

You can try Vosk. It works offline.

Sample code:

#!/usr/bin/python3

import os
import sys

import pyaudio
from vosk import Model, KaldiRecognizer

# The folder name checked here must match the one passed to Model() below.
if not os.path.exists("model-en"):
    print("Please download the model from https://github.com/alphacep/kaldi-android-demo/releases and unpack as 'model-en' in the current folder.")
    sys.exit(1)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=8000)
stream.start_stream()

model = Model("model-en")
rec = KaldiRecognizer(model, 16000)

while True:
    data = stream.read(2000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        print(rec.Result())
    else:
        print(rec.PartialResult())

print(rec.FinalResult())
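
To get a drop-in replacement for recognize_cmd() from the question, the loop above could be wrapped roughly like this. This is only a sketch: it assumes Result() and FinalResult() return a JSON string with a "text" field, and both recognize_cmd(rec, chunks) and mic_chunks() are names made up for illustration, not part of Vosk or PyAudio.

```python
import json


def recognize_cmd(rec, chunks):
    """Return the first non-empty utterance as lowercase text.

    rec    -- an object with the KaldiRecognizer methods used in the sample
              (AcceptWaveform, Result, FinalResult)
    chunks -- any iterable of raw 16-bit mono PCM byte strings
    """
    for data in chunks:
        if not data:
            break
        # AcceptWaveform returns a truthy value once an utterance has ended.
        if rec.AcceptWaveform(data):
            # Assumption: the result is a JSON string with a "text" field.
            text = json.loads(rec.Result()).get("text", "")
            if text:
                return text.lower()
    # The audio ended mid-utterance: flush whatever is still buffered.
    return json.loads(rec.FinalResult()).get("text", "").lower()


def mic_chunks(stream, frames=2000):
    """Hypothetical helper: yield blocks from an open PyAudio input stream forever."""
    while True:
        yield stream.read(frames)
```

With rec and stream created as in the sample, a call would look like cmd = recognize_cmd(rec, mic_chunks(stream)). The model is loaded once at startup and audio is decoded while you are still speaking, so the text should be available shortly after the command ends.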

Other examples are here.

Install with

pip install vosk

On Windows

pip install https://github.com/dtreskunov/tiny-kaldi/releases/download/0.3.1.2/vosk-0.3.1.2-cp37-cp37m-win_amd64.whl
Author: Nikolay Shmyrev, 2020-02-26 15:04:36