Is your feature request related to a problem? Please describe.
Feast currently lacks built-in support for Ray as an offline store, which could be beneficial for distributed data processing and efficient feature engineering at scale. I'm frustrated by the limited options available when working with large-scale feature transformations that require high parallelism and distributed computation.
Describe the solution you'd like
I would like to see Ray integrated as an official offline store in Feast. This integration should enable users to read from data sources (e.g., Parquet, CSV, databases) and perform distributed data processing using Ray's capabilities. It should support scalable batch feature retrieval and preprocessing while ensuring ease of configuration within Feast.
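To make the request a bit more concrete, here is a minimal sketch of the kind of work such a store would need to distribute. Batch feature retrieval in Feast is essentially a point-in-time join: for each entity row, pick the latest feature value at or before the event timestamp. The sketch below partitions that join by entity and, optionally, runs each partition as a Ray task. The function names (`join_one_entity`, `historical_features`), the tuple-based row format, and the `use_ray` flag are all hypothetical and exist only for illustration. This is not Feast's offline store interface, and it is not a proposal for the actual implementation.

```python
from bisect import bisect_right
from collections import defaultdict


def join_one_entity(requests, history):
    """Point-in-time join for a single entity (hypothetical helper).

    requests: list of event timestamps to look up.
    history:  list of (timestamp, value) pairs sorted by timestamp.
    Returns, for each request, the latest value at or before it, else None.
    """
    times = [ts for ts, _ in history]
    out = []
    for ts in requests:
        i = bisect_right(times, ts)  # count of history rows with timestamp <= ts
        out.append(history[i - 1][1] if i else None)
    return out


def historical_features(entity_rows, feature_rows, use_ray=False):
    """Join feature values onto entity rows, one partition per entity.

    entity_rows:  [(entity_id, event_ts)]
    feature_rows: [(entity_id, ts, value)]
    """
    # Group and sort the feature history per entity.
    history = defaultdict(list)
    for entity_id, ts, value in feature_rows:
        history[entity_id].append((ts, value))
    for rows in history.values():
        rows.sort(key=lambda r: r[0])

    # Group the requested timestamps per entity.
    requests = defaultdict(list)
    for entity_id, ts in entity_rows:
        requests[entity_id].append(ts)

    if use_ray:
        # Imported lazily so the sketch also runs where Ray is not installed.
        import ray

        ray.init(ignore_reinit_error=True)
        remote_join = ray.remote(join_one_entity)
        refs = {
            e: remote_join.remote(ts_list, history.get(e, []))
            for e, ts_list in requests.items()
        }
        joined = {e: ray.get(ref) for e, ref in refs.items()}
    else:
        joined = {
            e: join_one_entity(ts_list, history.get(e, []))
            for e, ts_list in requests.items()
        }

    # Re-emit results in the original request order.
    cursors = defaultdict(int)
    result = []
    for entity_id, ts in entity_rows:
        k = cursors[entity_id]
        result.append((entity_id, ts, joined[entity_id][k]))
        cursors[entity_id] += 1
    return result
```

Calling `historical_features(entity_rows, feature_rows, use_ray=True)` would submit one Ray task per entity. That granularity is chosen only to keep the sketch short; a real integration would presumably batch many entities per task and read Parquet or other sources through Ray's dataset layer rather than in-memory tuples.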
Describe alternatives you've considered
Using Dask as an offline store (though it may not offer the same scalability as Ray for certain workloads).
Writing custom scripts outside of Feast to process data with Ray and then load it manually into the offline store, which introduces extra complexity and maintenance overhead.
Relying on Spark, which can be heavier and harder to deploy for some users compared to Ray.
Additional context
Ray’s dynamic scaling and actor-based programming model could be a natural fit for distributed feature computation and preprocessing.
A Ray offline store could also integrate with Ray Serve and Ray Train for end-to-end machine learning pipelines.