Tags: nubison/nubison-model
feat: add context support for parallel inference (#12)

* feat: enhance model loading with context support

- Updated the `load_model` method in `UserModel` and `NubisonModel` to accept a `ModelContext` argument, allowing worker-specific information to be used during model loading.
- Introduced the `ModelContext` type definition to encapsulate the worker index and total number of workers, supporting GPU initialization in parallel setups.
- Adjusted related code in the service and tests to accommodate the new context parameter.
- Updated `README.md` to document the changes to the `load_model` method and its parameters.

* feat: update Dockerfile to support parallel inference

- Added Open Container Initiative (OCI) labels for better image description and source tracking.
- Introduced a `NUM_WORKERS` environment variable with a default value of 4 to configure the number of workers for the application.

* refactor: reduce default worker count for improved resource management

- Updated the Dockerfile to change the `NUM_WORKERS` environment variable from 4 to 2, reducing resource consumption.
- Adjusted the default number of workers in `Service.py` from 4 to 1 to align with the new Docker configuration.
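The `load_model(context)` contract described above can be sketched as follows. This is a minimal illustration, not the library's actual code: the field names `worker_index` and `num_workers` are assumptions, since the commit only states that `ModelContext` carries the worker index and total number of workers.

```python
from dataclasses import dataclass


# Hypothetical shape of ModelContext; real field names in
# nubison-model may differ from these assumed ones.
@dataclass
class ModelContext:
    worker_index: int  # 0-based index of this worker
    num_workers: int   # total number of workers in the pool


class UserModel:
    def load_model(self, context: ModelContext) -> None:
        # In a parallel setup, each worker can derive its GPU
        # assignment from its index (assuming one GPU per worker).
        self.device = f"cuda:{context.worker_index % context.num_workers}"


# Example: the second of two workers pins itself to cuda:1.
model = UserModel()
model.load_model(ModelContext(worker_index=1, num_workers=2))
print(model.device)  # → cuda:1
```

The context parameter is what makes per-worker GPU initialization possible: without it, every worker running `load_model` would be indistinguishable and would contend for the same device.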