Adding words in trained model dictionary #382
@Tortoise17 this is one of the bigger problems: you need to update the vocabulary. Read this: https://alphacephei.com/vosk/adaptation
On the adaptation page, we can read:
So I'm still looking for a solution by exploring the Kaldi docs. So far, that seems to be the way, though it doesn't look easy. If anyone finds a good tutorial, article, or documentation on this, I'd appreciate it.
@dazzzed lmao, read past this line; "later" means two lines below.
This might get you on track.
Thank you! You said there is also a methodology to update that graph, but these are static graphs, which is what I am interested in. What makes them static, and is there any chance to make them dynamic, or to retrain from this accuracy point?
If you have all the necessary model files (tree, phonemes), you can build both a dynamic graph and a static graph with the mkgraph.sh and mkgraph_lookahead.sh Kaldi scripts.
Accuracy is the same; speed is slightly slower.
Yes, you can build dynamic graphs from these models with the mkgraph_lookahead script, which you can find in the Kaldi repo.
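For reference, the invocations look roughly like this. The directory paths below are assumptions for illustration; both scripts live under `egs/wsj/s5/utils/` in the Kaldi repo, and the exact options depend on your recipe and Kaldi version, so treat this as a sketch, not a ready-made command:

```shell
# Static graph: HCLG composed and compiled in one piece (assumed paths).
utils/mkgraph.sh data/lang exp/model exp/model/graph

# Dynamic (lookahead) graph: composition deferred to decode time
# via mkgraph_lookahead.sh (assumed paths; check the script's usage
# message for the options your Kaldi version expects).
utils/mkgraph_lookahead.sh data/lang exp/model exp/model/graph_lookahead
```

The trade-off mentioned above applies: the lookahead graph is much cheaper to build and update, at the cost of slightly slower decoding.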
OMG! You are a super great professor. I should really be your student; there is a lot to learn from you. Thank you so much for this great effort, and I will keep asking some more questions. This is really great work.
Thank you, @Tortoise17, I hope it is useful for you. You always have an opportunity to join the Vosk project and learn more ;).
How do I join the project? Please let me know.
Pick up any issue and try to solve it. Like this one:
I will have a look, and I will be in contact with you with my updates as well.
I am still working on this issue. Did anyone resolve it without using static graphs?
We have recently added documentation on the proper process:
Hello. The main issue is that the compilation took a hell of a long time, about 24 minutes. It was using only 1 of my 8 cores and not the GPU at all.
It is not yet easy to speed things up. Maybe something on the SRILM side, but in general it is going to take about that long. If you update once a day, it is OK.
It's true that it's a "non-problem" for the average application workflow. But now let's say you could click on a word with low confidence and correct it by adding a new word (with its pronunciation). Obviously, I'm not expecting seconds-only compilation times, but reducing it by a few minutes would be great. Don't misunderstand me: even a 1-hour compilation time is good enough, but the faster we go (without losing accuracy), the wider the feature spectrum becomes.
You also need n-gram probabilities for that word, not just a pronunciation. So it is not that straightforward.
You can check the kaldi-active-grammar project; it can do that.
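To illustrate why the n-gram probabilities matter: a word the language model has never seen needs at least a unigram entry in the ARPA file before the graph is recompiled, otherwise the decoder can never hypothesize it. A toy sketch of that edit follows; the function name is ours, and the default log-probability and backoff values are arbitrary placeholders, not a recommendation (real adaptation re-estimates them from text):

```python
def add_unigram(arpa_text, word, logprob=-4.0, backoff=-0.3):
    """Insert a unigram entry for `word` into a small ARPA LM string.

    Bumps the "ngram 1=N" count in the \\data\\ header and appends the
    new entry at the end of the \\1-grams: section.
    """
    out = []
    in_unigrams = False
    for line in arpa_text.splitlines():
        if line.startswith("ngram 1="):
            n = int(line.split("=")[1])
            line = f"ngram 1={n + 1}"
        if line.strip() == "\\1-grams:":
            in_unigrams = True
        elif in_unigrams and (line.startswith("\\") or not line.strip()):
            # End of the unigram section: emit the new entry before it closes.
            out.append(f"{logprob}\t{word}\t{backoff}")
            in_unigrams = False
        out.append(line)
    return "\n".join(out)
```

This is only the string-level part of the problem; the resulting ARPA model still has to be renormalized and recompiled into the graph, which is what tools like kaldi-active-grammar automate.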
I was looking at extending the vocabulary as well, but the only source I have managed to find on how to do this was https://chrisearch.wordpress.com/2017/03/11/speech-recognition-using-kaldi-extending-and-using-the-aspire-model/. Can anyone recommend a good starting model for this? The Aspire chain model used there apparently worked, but besides taking days to build a graph, it results in a graph that is too big for applications like the vosk-android library. Is there perhaps a better starting point than the one mentioned in the article?
Unbelievably well-put-together solution. Thanks for the link.
How do I configure the model for use after I proceed with adaptation? I want to use the learned model in Python. |
Same as #1687 |
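For the Python side of the question above: once adaptation produces a model directory, it is loaded like any other Vosk model, just by pointing `Model` at that directory. A sketch using the standard vosk-api calls; the model directory and WAV file paths are placeholders you would replace with your own:

```python
# Sketch: decode a file with an adapted model (paths are placeholders).
import json
import wave

from vosk import KaldiRecognizer, Model

model = Model("path/to/adapted-model")   # directory produced by adaptation
wf = wave.open("test.wav", "rb")         # mono 16-bit PCM audio expected
rec = KaldiRecognizer(model, wf.getframerate())

while True:
    data = wf.readframes(4000)
    if len(data) == 0:
        break
    rec.AcceptWaveform(data)

print(json.loads(rec.FinalResult())["text"])
```

Nothing else changes compared with a stock model; newly added words will simply appear in the recognition results.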
I want to ask: what is the way to add some words to the trained Vosk engine dictionary?
Is there any function which can add or customize it?