Expanding Portuguese Model #339

Closed
GabrieldeAS opened this issue Jan 5, 2021 · 3 comments

Comments

@GabrieldeAS

I'm trying to upgrade the Portuguese language model by expanding its lexicon and grammar.

Following the instructions in the "Updating the language model" section of https://alphacephei.com/vosk/adaptation, I successfully expanded the grammar. That is not enough, though, because I also need to add new words.

To expand the lexicon while preserving the phoneme set, I will need either:
1 - The original word-phoneme lexicon, to serve as a training reference for new words
2 - The tools/process used to generate the original lexicon

To go with option (1), I will need some way to extract the word-phoneme lexicon from HCLr.fst. Is there a tool for this?
Otherwise, could you please point me to the toolchain for generating a new lexicon with the same phoneme set?
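
To make the "same phoneme set" constraint concrete, here is a minimal sketch of the check I have in mind once a word-phoneme lexicon is available. It assumes a Kaldi-style plain-text lexicon (one `word ph1 ph2 ...` entry per line); the function names and the phoneme symbols in the example are hypothetical and are not taken from the actual Portuguese model.

```python
def parse_lexicon(lines):
    """Parse lexicon lines of the assumed form 'word ph1 ph2 ...'."""
    lexicon = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, pron = parts[0], tuple(parts[1:])
        prons = lexicon.setdefault(word, [])
        if pron not in prons:
            prons.append(pron)  # keep alternative pronunciations
    return lexicon


def phoneme_set(lexicon):
    """Collect every phoneme symbol used by any pronunciation."""
    return {p for prons in lexicon.values() for pron in prons for p in pron}


def merge_entries(base, new_entries):
    """Add new pronunciations, rejecting any that use unknown phonemes."""
    allowed = phoneme_set(base)
    merged = {word: list(prons) for word, prons in base.items()}
    rejected = {}
    for word, prons in new_entries.items():
        for pron in prons:
            unknown = sorted(set(pron) - allowed)
            if unknown:
                rejected.setdefault(word, []).append(unknown)
                continue
            target = merged.setdefault(word, [])
            if pron not in target:
                target.append(pron)
    return merged, rejected


# Hypothetical entries, only to illustrate the check.
base = parse_lexicon(["casa k a z a", "gato g a t u"])
new = parse_lexicon(["taca t a k a", "gota g o t a"])
merged, rejected = merge_entries(base, new)

for word in sorted(merged):
    for pron in merged[word]:
        print(word, " ".join(pron))
print("rejected:", rejected)
```

In this sketch, an entry whose pronunciation uses a symbol outside the base lexicon's phoneme set is reported instead of merged, so the merged lexicon never introduces a phoneme the acoustic model was not trained on.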

Are there any important precautions to take with this expansion so that the system does not slow down significantly or lose accuracy? Is it also necessary to provide acoustic examples and training for the added words?

@nshmyrev
Collaborator

nshmyrev commented Jan 6, 2021

It is not possible now. You have to mail [email protected] and describe the details of the project to get support for a vocabulary update.

@nshmyrev
Collaborator

nshmyrev commented Mar 4, 2021

Same as #382, let's track it there.

@nshmyrev nshmyrev closed this as completed Mar 4, 2021
@ramleda

ramleda commented Jun 18, 2022

@GabrieldeAS did you succeed?


3 participants