Fast speech recognition in Python
I am building a voice assistant in Python with the SpeechRecognition library, but every time I say a command, converting it to text takes a long time (5-15 seconds), which is very unpleasant. Is there any way to make this faster? Or can you suggest another library?
Here is the recognition code.
import speech_recognition as sr

def recognize_cmd():
    r = sr.Recognizer()
    m = sr.Microphone(device_index=1)
    with m as source:
        print("---------")
        r.pause_threshold = 0.5
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)
    try:
        cmd = r.recognize_google(audio, language='en-US').lower()
        print("[log]User - " + cmd + '\n---------')
    except sr.UnknownValueError:
        # talk() is defined elsewhere in the assistant
        talk("Voice is not recognized!")
        cmd = recognize_cmd()
    return cmd
P.S. Ideally the library would have a function like adjust_for_ambient_noise(). If possible, please also suggest a library that works offline.
1 answer
You can try Vosk, which runs offline.
Sample code:
#!/usr/bin/python3
import os
import sys

import pyaudio
from vosk import Model, KaldiRecognizer

if not os.path.exists("model-en"):
    print("Please download the model from https://github.com/alphacep/kaldi-android-demo/releases and unpack as 'model-en' in the current folder.")
    sys.exit(1)

# Open the default microphone: 16 kHz, mono, 16-bit samples.
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=8000)
stream.start_stream()

model = Model("model-en")
rec = KaldiRecognizer(model, 16000)

while True:
    data = stream.read(2000)
    if len(data) == 0:
        break
    # AcceptWaveform returns True once an utterance is complete.
    if rec.AcceptWaveform(data):
        print(rec.Result())
    else:
        print(rec.PartialResult())

print(rec.FinalResult())
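
The results printed above are JSON strings rather than plain text, so a voice assistant will usually want to pull the transcript out before matching commands. A minimal sketch, assuming final results carry the transcript under a "text" key and partial results under a "partial" key (check this against the Vosk version you install); the helper name extract_text is hypothetical and not part of Vosk:

```python
import json


def extract_text(result_json: str) -> str:
    """Return the lowercased transcript from a Vosk-style JSON result string.

    Assumes final results use the "text" key and partial results use
    "partial"; any other payload yields an empty string.
    """
    payload = json.loads(result_json)
    transcript = payload.get("text") or payload.get("partial") or ""
    return transcript.strip().lower()


if __name__ == "__main__":
    # Hand-written strings in the assumed result shape, used here
    # in place of a live microphone.
    print(extract_text('{"text": "Turn on the light"}'))
    print(extract_text('{"partial": "turn on"}'))
```

In the loop above, this would replace print(rec.Result()) with something like cmd = extract_text(rec.Result()).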
Other examples are here.
Install with
pip install vosk
On Windows (this wheel targets 64-bit CPython 3.7)
pip install https://github.com/dtreskunov/tiny-kaldi/releases/download/0.3.1.2/vosk-0.3.1.2-cp37-cp37m-win_amd64.whl
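
Regarding the P.S. in the question: as far as I know, Vosk has no direct counterpart to adjust_for_ambient_noise(), which in SpeechRecognition calibrates an energy threshold from a short stretch of background audio. A rough sketch of the same idea using only the standard library, assuming 16-bit mono PCM chunks in native byte order such as those returned by stream.read() above; the names rms, calibrate_threshold and is_speech and the margin of 1.5 are illustrative choices, not part of either library:

```python
import array
import math


def rms(chunk: bytes) -> float:
    """Root-mean-square amplitude of a chunk of 16-bit mono PCM audio."""
    samples = array.array("h", chunk)  # "h" = signed 16-bit integers
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(noise_chunks, margin: float = 1.5) -> float:
    """Estimate a speech threshold from chunks recorded while nobody speaks."""
    levels = [rms(chunk) for chunk in noise_chunks]
    if not levels:
        return 0.0
    return margin * sum(levels) / len(levels)


def is_speech(chunk: bytes, threshold: float) -> bool:
    """True when the chunk is louder than the calibrated noise threshold."""
    return rms(chunk) > threshold


if __name__ == "__main__":
    # Synthetic audio standing in for microphone input.
    quiet = array.array("h", [10, -10] * 100).tobytes()
    loud = array.array("h", [1000, -1000] * 100).tobytes()
    threshold = calibrate_threshold([quiet, quiet])
    print(threshold, is_speech(quiet, threshold), is_speech(loud, threshold))
```

One way to use it would be to read about a second of audio at startup, calibrate once, and then pass only chunks for which is_speech() is true to rec.AcceptWaveform(); whether this helps in practice depends on the microphone and the room.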
Author: Nikolay Shmyrev, 2020-02-26 15:04:36