
Possible memory leak executing inference multiple times #1939

Open
insanebytes opened this issue Feb 27, 2025 · 5 comments
insanebytes commented Feb 27, 2025

Hello, when I run inference with this code, I see that memory usage keeps increasing without being freed.

```csharp
using NAudio.Wave;
using SherpaOnnx;

var cwd = Directory.GetCurrentDirectory();
string modelDirPath = Path.Join(cwd, "Assets", "Voice");
string modelPath = Path.Join(modelDirPath, "vits-vctk.int8.onnx");

OfflineTtsVitsModelConfig modelConfigVits = new OfflineTtsVitsModelConfig();
modelConfigVits.Model = modelPath;
modelConfigVits.Lexicon = Path.Join(modelDirPath, "lexicon.txt");
modelConfigVits.Tokens = Path.Join(modelDirPath, "tokens.txt");

OfflineTtsModelConfig modelConfig = new OfflineTtsModelConfig();
modelConfig.Vits = modelConfigVits;
modelConfig.Provider = "cuda";

OfflineTtsConfig config = new OfflineTtsConfig();
config.Model = modelConfig;

var offlineTts = new SherpaOnnx.OfflineTts(config);

var audioDevice = new WasapiOut();
var waveFormat = WaveFormat.CreateIeeeFloatWaveFormat(offlineTts.SampleRate, 1);
// Pre-allocate 30 seconds of audio.
var waveFileStream = new MemoryStream(waveFormat.ConvertLatencyToByteSize(30 * 1000));
var rawSourceWaveStream = new RawSourceWaveStream(waveFileStream, waveFormat);

audioDevice.Init(rawSourceWaveStream);

while (true)
{
    var input = Console.ReadLine();

    waveFileStream.Seek(0, SeekOrigin.Begin);
    waveFileStream.SetLength(0);

    Task.Run(() =>
    {
        var result = offlineTts.Generate(input, 1f, 1);
        if (result.Samples.Length > 0)
        {
            var wave = new byte[result.NumSamples * sizeof(float)];
            Buffer.BlockCopy(result.Samples, 0, wave, 0, wave.Length);

            waveFileStream.Write(wave, 0, wave.Length);
            waveFileStream.Position = 0;

            // Stop any in-progress playback before restarting.
            if (audioDevice.PlaybackState != PlaybackState.Stopped)
            {
                audioDevice.Stop();
            }
            audioDevice.Play();

            result.Dispose();
        }
    });
}
```
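As a side note, a `using` declaration can make the disposal more robust than the manual `result.Dispose()` call: if `Write` or `Play` throws, the manual call is skipped and the generated audio leaks, whereas `using` disposes on every exit path. A minimal sketch of the same `Task.Run` body (a fragment of the loop above, not a standalone program; it assumes only the SherpaOnnx and NAudio calls already shown):

```csharp
Task.Run(() =>
{
    // 'using' guarantees result.Dispose() runs when the lambda exits,
    // even if an exception is thrown while writing or playing audio.
    using var result = offlineTts.Generate(input, 1f, 1);
    if (result.Samples.Length > 0)
    {
        var wave = new byte[result.NumSamples * sizeof(float)];
        Buffer.BlockCopy(result.Samples, 0, wave, 0, wave.Length);

        waveFileStream.Write(wave, 0, wave.Length);
        waveFileStream.Position = 0;
        audioDevice.Play();
    }
});
```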

When first launched, waiting for input: 274 MB RAM

After executing "hello world how are you": 341 MB RAM

Second execution, same phrase: 342 MB RAM

Third execution, same phrase: 343 MB RAM

I am disposing the result struct.

Is there something more to dispose, or is that extra 1 MB per inference a memory leak?

Execution screenshot: (image attached)

Thank you

@insanebytes
Author

Any update on this?

@csukuangfj
Collaborator

Please run it for 10 minutes and post the result.

@insanebytes
Author

I ran it for 10 minutes and the memory stabilized, sorry. But on another machine I get:

D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\session.cc:GetSessionOptionsImpl:176 Please compile with -DSHERPA_ONNX_ENABLE_GPU=ON. Available providers: AzureExecutionProvider, CPUExecutionProvider, . Fallback to cpu!

And I have CUDA installed:

(screenshot attached)

@csukuangfj
Collaborator

csukuangfj commented Mar 3, 2025

Please follow our doc and search sherpa-onnx's issues about running sherpa-onnx with GPU.

@csukuangfj
Collaborator

> And I have CUDA installed:

@insanebytes Please see #1954
