
[Mobile] Memory crash after repeated inference with dynamic shape input #22520

Open
laurenspriem opened this issue Oct 21, 2024 · 1 comment
Labels
api:Java (issues related to the Java API) · platform:mobile (issues related to ONNX Runtime mobile; typically submitted using template)

Comments

@laurenspriem

Describe the issue

We recently changed the ONNX model used in production in our mobile app so that it includes the preprocessing steps, which were previously performed separately before inference. Because it is an image model, the model now takes as input an array of raw RGBA bytes of an image, which tends to be a lot of data. Since this change, memory consumption climbs steadily as the app performs more inference runs, eventually resulting in a crash.

Is there anything we can do in our Java/Kotlin code to make sure memory is properly released, aside from the outputs.close() and inputTensor.close() calls that we already have? It seems the GC cannot keep up with continued inference runs right now.
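For context, here is a simplified sketch of our per-inference path (illustrative names, not the exact OnnxDartPlugin.kt code):

```kotlin
import ai.onnxruntime.OnnxJavaType
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.nio.ByteBuffer

// Simplified sketch of our per-inference path. rgba holds the raw image bytes.
fun runInference(env: OrtEnvironment, session: OrtSession, rgba: ByteArray, shape: LongArray) {
    // Wrapping a heap array yields a non-direct buffer, so createTensor
    // copies it into a freshly allocated direct ByteBuffer on each call
    // (the OrtUtil.prepareBuffer frame in the stack trace below).
    val inputTensor = OnnxTensor.createTensor(env, ByteBuffer.wrap(rgba), shape, OnnxJavaType.UINT8)
    try {
        val outputs = session.run(mapOf(session.inputNames.first() to inputTensor))
        try {
            // ... read the results ...
        } finally {
            outputs.close() // we already close the outputs
        }
    } finally {
        inputTensor.close() // and the input tensor
    }
}
```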

Please see below for the crash logs. Thank you in advance for any and all help!

I/dependent.debug(16354): Background concurrent mark compact GC freed 638KB AllocSpace bytes, 30(193MB) LOS objects, 29% free, 230MB/326MB, paused 762us,6.342ms total 308.671ms
I/dependent.debug(16354): Waiting for a blocking GC Alloc
I/dependent.debug(16354): Background concurrent mark compact GC freed 200KB AllocSpace bytes, 2(104KB) LOS objects, 21% free, 342MB/438MB, paused 453us,7.482ms total 133.554ms
I/dependent.debug(16354): WaitForGcToComplete blocked Alloc on Background for 53.726ms
I/dependent.debug(16354): Starting a blocking GC Alloc
I/dependent.debug(16354): Forcing collection of SoftReferences for 111MB allocation
I/dependent.debug(16354): Starting a blocking GC Alloc
I/dependent.debug(16354): Alloc concurrent mark compact GC freed 52KB AllocSpace bytes, 0(0B) LOS objects, 21% free, 342MB/438MB, paused 454us,3.273ms total 36.865ms
W/dependent.debug(16354): Throwing OutOfMemoryError "Failed to allocate a 117235219 byte allocation with 100630528 free bytes and 169MB until OOM, target footprint 460065312, growth limit 536870912" (VmSize 26713268 kB)
I/dependent.debug(16354): Starting a blocking GC Alloc
I/dependent.debug(16354): Alloc concurrent mark compact GC freed 64KB AllocSpace bytes, 0(0B) LOS objects, 21% free, 342MB/438MB, paused 501us,5.569ms total 47.679ms
I/dependent.debug(16354): Forcing collection of SoftReferences for 111MB allocation
I/dependent.debug(16354): Starting a blocking GC Alloc
I/dependent.debug(16354): Alloc concurrent mark compact GC freed 32KB AllocSpace bytes, 0(0B) LOS objects, 21% free, 342MB/438MB, paused 480us,3.199ms total 40.527ms
W/dependent.debug(16354): Throwing OutOfMemoryError "Failed to allocate a 117235219 byte allocation with 100663296 free bytes and 169MB until OOM, target footprint 460065312, growth limit 536870912" (VmSize 26713268 kB)
E/AndroidRuntime(16354): FATAL EXCEPTION: DefaultDispatcher-worker-1
E/AndroidRuntime(16354): Process: io.ente.photos.independent.debug, PID: 16354
E/AndroidRuntime(16354): java.lang.OutOfMemoryError: Failed to allocate a 117235219 byte allocation with 100663296 free bytes and 169MB until OOM, target footprint 460065312, growth limit 536870912
E/AndroidRuntime(16354): at dalvik.system.VMRuntime.newNonMovableArray(Native Method)
E/AndroidRuntime(16354): at java.nio.DirectByteBuffer$MemoryRef.<init>(DirectByteBuffer.java:73)
E/AndroidRuntime(16354): at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:347)
E/AndroidRuntime(16354): at ai.onnxruntime.OrtUtil.prepareBuffer(OrtUtil.java:507)
E/AndroidRuntime(16354): at ai.onnxruntime.OnnxTensor.createTensor(OnnxTensor.java:754)
E/AndroidRuntime(16354): at ai.onnxruntime.OnnxTensor.createTensor(OnnxTensor.java:610)
E/AndroidRuntime(16354): at ai.onnxruntime.OnnxTensor.createTensor(OnnxTensor.java:589)
E/AndroidRuntime(16354): at io.ente.photos.onnx_dart.OnnxDartPlugin$predict$2.invokeSuspend(OnnxDartPlugin.kt:205)
E/AndroidRuntime(16354): at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
E/AndroidRuntime(16354): at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
E/AndroidRuntime(16354): at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:115)
E/AndroidRuntime(16354): at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:100)
E/AndroidRuntime(16354): at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584)
E/AndroidRuntime(16354): at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:793)
E/AndroidRuntime(16354): at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:697)
E/AndroidRuntime(16354): at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:684)
E/AndroidRuntime(16354): Suppressed: kotlinx.coroutines.internal.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@4fa0031, Dispatchers.IO]
I/Process (16354): Sending signal. PID: 16354 SIG: 9
Lost connection to device.

To reproduce

  1. Take any model that has a dynamically shaped input (let me know if I should share ours, can do that)
  2. Continuously run inference on the model with different data using the Java API on Android (sketched below)
  3. Watch memory consumption climb until the app crashes
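To make step 2 concrete, a hypothetical driver using the runInference sketch from the description above (image sizes vary per run, hence the dynamic shape):

```kotlin
// Hypothetical repro driver: every iteration passes a differently sized
// image, so each createTensor call allocates a fresh large direct buffer.
fun reproduce(env: OrtEnvironment, session: OrtSession, images: List<Pair<ByteArray, LongArray>>) {
    for ((rgba, shape) in images) {
        runInference(env, session, rgba, shape) // closes its tensors, yet memory still climbs
    }
}
```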

Urgency

Urgent, as this issue is happening in production, causing crashes and inconvenience for our mobile customers.

Platform

Android

OS Version

Android 14

ONNX Runtime Installation

Released Package

Compiler Version (if 'Built from Source')

No response

Package Name (if 'Released Package')

onnxruntime-android

ONNX Runtime Version or Commit ID

1.18

ONNX Runtime API

Java/Kotlin

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@laurenspriem laurenspriem added the platform:mobile issues related to ONNX Runtime mobile; typically submitted using template label Oct 21, 2024
@github-actions github-actions bot added the api:Java issues related to the Java API label Oct 21, 2024
@Craigacp
Contributor

How are you constructing the tensors? For best performance you should use a cache of direct ByteBuffers that you manage yourself, rather than letting the JVM create and garbage-collect them, as the GC algorithm can get overwhelmed.
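A minimal sketch of that pattern, assuming you know an upper bound on the input size and a uint8 model input (names are illustrative):

```kotlin
import ai.onnxruntime.OnnxJavaType
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import java.nio.ByteBuffer
import java.nio.ByteOrder

// One direct buffer, allocated once and sized for the largest expected
// input, then reused across inference calls instead of allocated per call.
class InputBufferCache(maxBytes: Int) {
    private val buffer: ByteBuffer =
        ByteBuffer.allocateDirect(maxBytes).order(ByteOrder.nativeOrder())

    fun tensorFor(env: OrtEnvironment, rgba: ByteArray, shape: LongArray): OnnxTensor {
        buffer.clear()
        buffer.put(rgba)
        buffer.flip() // remaining() must match the element count implied by shape
        // Direct buffers are used in place by ORT, so no new allocation
        // happens here. Close the returned tensor before the next call,
        // as it reads from this shared buffer.
        return OnnxTensor.createTensor(env, buffer, shape, OnnxJavaType.UINT8)
    }
}
```

This keeps the large allocation out of the per-run path entirely and makes its lifetime deterministic instead of depending on the GC.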
