Expected Behavior
Return a FileSource that is connected to the S3 endpoint.
Current Behavior
File "C:\Users\shakt\anaconda3\envs\feast_test_env\Lib\site-packages\feast\inference.py", line 180, in update_feature_views_with_inferred_features_and_entities
_infer_features_and_entities(
File "C:\Users\shakt\anaconda3\envs\feast_test_env\Lib\site-packages\feast\inference.py", line 230, in _infer_features_and_entities
provider.get_table_column_names_and_types_from_data_source(
File "C:\Users\shakt\anaconda3\envs\feast_test_env\Lib\site-packages\feast\infra\passthrough_provider.py", line 526, in get_table_column_names_and_types_from_data_source
return self.offline_store.get_table_column_names_and_types_from_data_source(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\shakt\anaconda3\envs\feast_test_env\Lib\site-packages\feast\infra\offline_stores\offline_store.py", line 390, in get_table_column_names_and_types_from_data_source
return data_source.get_table_column_names_and_types(config=config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\shakt\anaconda3\envs\feast_test_env\Lib\site-packages\feast\infra\offline_stores\file_source.py", line 181, in get_table_column_names_and_types
schema = ParquetDataset(path, **kwargs).schema
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\shakt\anaconda3\envs\feast_test_env\Lib\site-packages\pyarrow\parquet\core.py", line 1348, in __init__
finfo = filesystem.get_file_info(path_or_paths)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pyarrow\\_fs.pyx", line 590, in pyarrow._fs.FileSystem.get_file_info
File "pyarrow\\error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow\\error.pxi", line 92, in pyarrow.lib.check_status
OSError: [WinError 123] Failed querying information for path 'C:/Users/shakt/Documents/GIT/feast-artifact/feature_repo/s3:/bucket/flights.parquet'.
Detail: [Windows error 123] The filename, directory name, or volume label syntax is incorrect.
Steps to reproduce
Create a MinIO object store as a Docker container exposing port 9000.
Run the code below to try to connect to that MinIO container as a FileSource:
from feast import FileSource
from feast.data_format import ParquetFormat
bucket_name = "bucket"
file_name = "flights.parquet"
s3_endpoint = "https://localhost:9000"
# Define the data source for flight data
flight_stats_source = FileSource(
    path=f"s3://{bucket_name}/{file_name}",
    timestamp_field="FlightDate",
    file_format=ParquetFormat(),
    s3_endpoint_override="http://localhost:9000",  # Changed to http since use_ssl=False
)
@ShaktidharK1997 a temporary fix is to edit the dask.py file in your installed copy of Feast directly. On my machine it is at /home/alijoe/anaconda3/lib/python3.11/site-packages/feast/infra/offline_stores/dask.py
Look for the read_datasource function and change this line:
if not Path(data_source.path).is_absolute():
to
if not data_source.path.startswith("s3://"):
This lets it accept S3 (MinIO) paths. The problem is that Path(data_source.path).is_absolute() expects a real file path, not an S3 URL, so the S3 URL is treated as a relative path and gets joined onto the repo path. If you've got any other solutions, let me know.
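To illustrate why the is_absolute() check misfires, here is a minimal sketch using only the standard library. It does not touch Feast itself: has_uri_scheme and resolve_path are hypothetical helper names, and /repo is a made-up repo path. It shows that an s3:// URL is not considered absolute by pathlib, that joining it onto a repo path collapses the double slash into the same s3:/bucket/... shape seen in the traceback, and one possible scheme-aware check that is a bit more general than startswith("s3://").

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import urlparse

s3_path = "s3://bucket/flights.parquet"

# pathlib does not treat a URL as an absolute path on either platform flavour.
posix_absolute = PurePosixPath(s3_path).is_absolute()
windows_absolute = PureWindowsPath(s3_path).is_absolute()

# Joining the URL onto a (made-up) repo path collapses "//" into "/".
mangled = str(PurePosixPath("/repo") / s3_path)


def has_uri_scheme(path: str) -> bool:
    """Hypothetical helper: True for paths such as s3://... or gs://...

    A one-letter scheme is ignored so that Windows drive letters
    (for example C:/Users/...) are not mistaken for URLs.
    """
    return len(urlparse(path).scheme) > 1


def resolve_path(path: str, repo_path: str) -> str:
    """Hypothetical helper: only join local relative paths onto repo_path."""
    if has_uri_scheme(path) or PurePosixPath(path).is_absolute():
        return path
    return str(PurePosixPath(repo_path) / path)


if __name__ == "__main__":
    print(posix_absolute, windows_absolute)
    print(mangled)
    print(resolve_path(s3_path, "/repo"))
    print(resolve_path("data/flights.parquet", "/repo"))
```

This is only a sketch of the idea, not the actual Feast fix; the real change would need to handle Windows local paths and whatever other URL schemes Feast supports.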
Possible Solution
As mentioned in #4753, revert to the previous code.