[Mobile] Memory crash after repeated inference with dynamic shape input #22520
Labels
- `api:Java` — issues related to the Java API
- `platform:mobile` — issues related to ONNX Runtime mobile; typically submitted using template
Describe the issue
We recently changed the ONNX model used in production in our mobile app so that it includes the preprocessing steps, which were previously performed separately before inference. Because it is an image model, the model now takes an array of raw RGBA bytes of an image as input, which tends to be a lot of data. Since this change, we've found that memory consumption climbs steadily as the app performs more inference runs, eventually resulting in a crash.
Is there anything we can do in our Java/Kotlin code to make sure memory is properly released, aside from the `outputs.close()` and `inputTensor.close()` calls that we already have? It seems like GC cannot keep up with continued inference runs right now. Please see below for the crash logs. Thank you in advance for any and all help!
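For context, this is the cleanup pattern we are currently using, roughly sketched below. This is a minimal illustration, not our exact production code; the input/output names (`"input"`) and the shape are placeholders, and it assumes the `onnxruntime-android` package's Java API (`OnnxTensor.createTensor` with a direct `ByteBuffer`, and `OrtSession.Result` being `AutoCloseable`):

```kotlin
import ai.onnxruntime.OnnxJavaType
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.nio.ByteBuffer

// Sketch: run one inference over raw RGBA bytes, closing the tensor and
// the result set deterministically via Kotlin's `use`, so native memory
// is freed without waiting for GC. "input" and the NHWC shape are
// assumptions for illustration.
fun runInference(session: OrtSession, rgba: ByteBuffer, height: Long, width: Long): Any? {
    val env = OrtEnvironment.getEnvironment()
    val shape = longArrayOf(1, height, width, 4)
    // A *direct* ByteBuffer lets the runtime use the memory without an
    // extra on-heap copy; reusing one buffer across runs also reduces
    // allocation churn.
    return OnnxTensor.createTensor(env, rgba, shape, OnnxJavaType.UINT8).use { input ->
        session.run(mapOf("input" to input)).use { results ->
            // Copy what we need out of the result *before* `use` closes it,
            // since the underlying native buffers are released on close.
            results[0].value
        }
    }
}
```

We allocate the RGBA `ByteBuffer` once with `ByteBuffer.allocateDirect(...)` and refill it for each frame, rather than allocating per run.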
To reproduce
Urgency
Urgent, as this issue is happening in production, causing crashes and inconvenience for our mobile customers.
Platform
Android
OS Version
Android 14
ONNX Runtime Installation
Released Package
Compiler Version (if 'Built from Source')
No response
Package Name (if 'Released Package')
onnxruntime-android
ONNX Runtime Version or Commit ID
1.18
ONNX Runtime API
Java/Kotlin
Architecture
ARM64
Execution Provider
Default CPU
Execution Provider Library Version
No response