-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support FlyteRemote.execute
interruptible flag override
#2885
Support FlyteRemote.execute
interruptible flag override
#2885
Conversation
Signed-off-by: redartera <reda@artera.ai>
Signed-off-by: redartera <reda@artera.ai>
Signed-off-by: redartera <reda@artera.ai>
Signed-off-by: redartera <reda@artera.ai>
Signed-off-by: redartera <reda@artera.ai>
@pingsutw Can we please merge this PR? |
Head branch was pushed to by a user without write access
Signed-off-by: redartera <reda@artera.ai>
aa09a04
to
e82b2ce
Compare
Signed-off-by: redartera <reda@artera.ai>
After merging Actually I synched the master branch of my |
yes this is my fault |
Signed-off-by: redartera <120470035+redartera@users.noreply.github.com>
Changelist by BitoThis pull request implements the following key changes.
|
Code Review Agent Run #fb24f7Actionable Suggestions - 0Additional Suggestions - 10
Review Details
|
Code Review Agent Run #b66645Actionable Suggestions - 0Additional Suggestions - 10
Review Details
|
do we really need or can we remove the wrapper in the model for the bool value? it's a oneof but i think just setting the bool should be enough. could we also add it to one of the unit tests in here
I know the model files are annoying and unnecessary to work with, thank you for bearing with us. |
Signed-off-by: redartera <120470035+redartera@users.noreply.github.com>
Signed-off-by: redartera <120470035+redartera@users.noreply.github.com>
47876c6
to
73cae9e
Compare
Removed the wrapper here bcbb991 edit: passing a raw |
Quick update - it seems to be indeed a Would you advise to stick with the google wrapper - or is there a better alternative? |
Code Review Agent Run #2ccc25Actionable Suggestions - 0Additional Suggestions - 1
Review Details
|
Signed-off-by: redartera <120470035+redartera@users.noreply.github.com>
2d668ea
to
e2083da
Compare
It looks like |
Code Review Agent Run #616c34Actionable Suggestions - 0Additional Suggestions - 10
Review Details
|
Code Review Agent Run #683644Actionable Suggestions - 0Additional Suggestions - 10
Review Details
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 but looking into the failing test.
The failing test in GH actions seems to be a wf execution timeout error. I tried to replicate locally on a codespace workstation - the integration test passes but fails in a different place, namely when trying to clean up the S3 file. Please see local test log below @redartera ➜ /workspaces/flytekit (flyteremote-interruptible-override) $ make integration_test_codecov
make CODECOV_OPTS="--cov=./ --cov-report=xml --cov-append" integration_test
make[1]: Entering directory '/workspaces/flytekit'
pytest -n auto --dist=loadfile tests/flytekit/integration --cov=./ --cov-report=xml --cov-append -m "not lftransfers"
============================================================================= test session starts =============================================================================
platform linux -- Python 3.12.1, pytest-8.3.4, pluggy-1.5.0
rootdir: /workspaces/flytekit
configfile: pyproject.toml
plugins: anyio-4.7.0, mock-3.14.0, hypothesis-6.125.2, asyncio-0.25.3, xdist-3.6.1, timeout-2.3.1, icdiff-0.9, cov-6.0.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=function
4 workers [47 items]
......................ss..............
...
FFF.ss<unknown>:18: SyntaxWarning: invalid escape sequence '\w'
<unknown>:19: SyntaxWarning: invalid escape sequence '\w'
/workspaces/flytekit/tests/flytekit/unit/models/admin/test_common.py:18: SyntaxWarning: invalid escape sequence '\w'
o = _common.Sort.from_python_std(' asc(my"\wackyk3y) ') # noqa: W605
/workspaces/flytekit/tests/flytekit/unit/models/admin/test_common.py:19: SyntaxWarning: invalid escape sequence '\w'
assert o.key == 'my"\wackyk3y' # noqa: W605
[100%]
================================================================================== FAILURES ===================================================================================
________________________________________________________________________________ test_open_ff _________________________________________________________________________________
[gw0] linux -- Python 3.12.1 /usr/local/python/3.12.1/bin/python3
def test_open_ff():
"""Test opening FlyteFile from a remote path."""
# Upload a file to minio s3 bucket
file_transfer = SimpleFileTransfer()
remote_file_path = file_transfer.upload_file(file_type="json")
execution_id = run("flytefile.py", "wf", "--remote_file_path", remote_file_path)
remote = FlyteRemote(Config.auto(config_file=CONFIG), PROJECT, DOMAIN)
execution = remote.fetch_execution(name=execution_id)
execution = remote.wait(execution=execution, timeout=datetime.timedelta(minutes=5))
assert execution.closure.phase == WorkflowExecutionPhase.SUCCEEDED, f"Execution failed with phase: {execution.closure.phase}"
# Delete the remote file to free the space
url = urlparse(remote_file_path)
bucket, key = url.netloc, url.path.lstrip("/")
> file_transfer.delete_file(bucket=bucket, key=key)
tests/flytekit/integration/remote/test_remote.py:889:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/flytekit/integration/remote/utils.py:100: in delete_file
res = self._s3_client.delete_object(Bucket=bucket, Key=key)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:569: in _api_call
return self._make_api_call(operation_name, kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:1005: in _make_api_call
http, parsed_response = self._make_request(
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:1029: in _make_request
return self._endpoint.make_request(operation_model, request_dict)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:119: in make_request
return self._send_request(request_dict, operation_model)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:196: in _send_request
request = self.create_request(request_dict, operation_model)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:132: in create_request
self._event_emitter.emit(
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:412: in emit
return self._emitter.emit(aliased_event_name, **kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:256: in emit
return self._emit(event_name, kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:239: in _emit
response = handler(**kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/signers.py:106: in handler
return self.sign(operation_name, request)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/signers.py:198: in sign
auth.add_auth(request)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <botocore.auth.S3SigV4Auth object at 0x7ec1e6e2ad20>, request = <botocore.awsrequest.AWSRequest object at 0x7ec1e6f6a660>
def add_auth(self, request):
if self.credentials is None:
> raise NoCredentialsError()
E botocore.exceptions.NoCredentialsError: Unable to locate credentials
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/auth.py:423: NoCredentialsError
_____________________________________________________________________________ test_attr_access_sd _____________________________________________________________________________
[gw0] linux -- Python 3.12.1 /usr/local/python/3.12.1/bin/python3
def test_attr_access_sd():
"""Test accessing StructuredDataset attribute from a dataclass."""
# Upload a file to minio s3 bucket
file_transfer = SimpleFileTransfer()
remote_file_path = file_transfer.upload_file(file_type="parquet")
execution_id = run("attr_access_sd.py", "wf", "--uri", remote_file_path)
remote = FlyteRemote(Config.auto(config_file=CONFIG), PROJECT, DOMAIN)
execution = remote.fetch_execution(name=execution_id)
execution = remote.wait(execution=execution, timeout=datetime.timedelta(minutes=5))
assert execution.closure.phase == WorkflowExecutionPhase.SUCCEEDED, f"Execution failed with phase: {execution.closure.phase}"
# Delete the remote file to free the space
url = urlparse(remote_file_path)
bucket, key = url.netloc, url.path.lstrip("/")
> file_transfer.delete_file(bucket=bucket, key=key)
tests/flytekit/integration/remote/test_remote.py:907:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/flytekit/integration/remote/utils.py:100: in delete_file
res = self._s3_client.delete_object(Bucket=bucket, Key=key)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:569: in _api_call
return self._make_api_call(operation_name, kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:1005: in _make_api_call
http, parsed_response = self._make_request(
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:1029: in _make_request
return self._endpoint.make_request(operation_model, request_dict)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:119: in make_request
return self._send_request(request_dict, operation_model)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:196: in _send_request
request = self.create_request(request_dict, operation_model)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:132: in create_request
self._event_emitter.emit(
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:412: in emit
return self._emitter.emit(aliased_event_name, **kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:256: in emit
return self._emit(event_name, kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:239: in _emit
response = handler(**kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/signers.py:106: in handler
return self.sign(operation_name, request)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/signers.py:198: in sign
auth.add_auth(request)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <botocore.auth.S3SigV4Auth object at 0x7ec1e637fb30>, request = <botocore.awsrequest.AWSRequest object at 0x7ec1e63141a0>
def add_auth(self, request):
if self.credentials is None:
> raise NoCredentialsError()
E botocore.exceptions.NoCredentialsError: Unable to locate credentials
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/auth.py:423: NoCredentialsError
________________________________________________________________________________ test_sd_attr _________________________________________________________________________________
[gw0] linux -- Python 3.12.1 /usr/local/python/3.12.1/bin/python3
def test_sd_attr():
"""Test correctness of StructuredDataset attributes.
This test considers only the following condition:
1. Check StructuredDataset (wrapped in a dataclass) file_format attribute
We'll make sure uri aligns with the user-specified one in the future.
"""
from workflows.basic.sd_attr import wf
@dataclass
class DC:
sd: StructuredDataset
FILE_FORMAT = "parquet"
# Upload a file to minio s3 bucket
file_transfer = SimpleFileTransfer()
remote_file_path = file_transfer.upload_file(file_type=FILE_FORMAT)
# Create a dataclass as the workflow input because `pyflyte run`
# can't properly handle input arg `dc` as a json str so far
dc = DC(sd=StructuredDataset(uri=remote_file_path, file_format=FILE_FORMAT))
remote = FlyteRemote(Config.auto(config_file=CONFIG), PROJECT, DOMAIN, interactive_mode_enabled=True)
wf_exec = remote.execute(
wf,
inputs={"dc": dc, "file_format": FILE_FORMAT},
wait=True,
version=VERSION,
image_config=ImageConfig.from_images(IMAGE),
)
assert wf_exec.closure.phase == WorkflowExecutionPhase.SUCCEEDED, f"Execution failed with phase: {wf_exec.closure.phase}"
assert wf_exec.outputs["o0"].file_format == FILE_FORMAT, (
f"Workflow output StructuredDataset file_format should align with the user-specified file_format: {FILE_FORMAT}."
)
# Delete the remote file to free the space
url = urlparse(remote_file_path)
bucket, key = url.netloc, url.path.lstrip("/")
> file_transfer.delete_file(bucket=bucket, key=key)
tests/flytekit/integration/remote/test_remote.py:950:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/flytekit/integration/remote/utils.py:100: in delete_file
res = self._s3_client.delete_object(Bucket=bucket, Key=key)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:569: in _api_call
return self._make_api_call(operation_name, kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:1005: in _make_api_call
http, parsed_response = self._make_request(
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/client.py:1029: in _make_request
return self._endpoint.make_request(operation_model, request_dict)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:119: in make_request
return self._send_request(request_dict, operation_model)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:196: in _send_request
request = self.create_request(request_dict, operation_model)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/endpoint.py:132: in create_request
self._event_emitter.emit(
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:412: in emit
return self._emitter.emit(aliased_event_name, **kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:256: in emit
return self._emit(event_name, kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/hooks.py:239: in _emit
response = handler(**kwargs)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/signers.py:106: in handler
return self.sign(operation_name, request)
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/signers.py:198: in sign
auth.add_auth(request)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <botocore.auth.S3SigV4Auth object at 0x7ec1e59c6db0>, request = <botocore.awsrequest.AWSRequest object at 0x7ec1e59c6450>
def add_auth(self, request):
if self.credentials is None:
> raise NoCredentialsError()
E botocore.exceptions.NoCredentialsError: Unable to locate credentials
/usr/local/python/3.12.1/lib/python3.12/site-packages/botocore/auth.py:423: NoCredentialsError
---------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------
18:26:13.275310 WARNING remote.py:289 - Jupyter notebook and interactive task
support is still alpha.
============================================================================== warnings summary ===============================================================================
../../home/codespace/.local/lib/python3.12/site-packages/pythonjsonlogger/__init__.py:24: 10 warnings
/home/codespace/.local/lib/python3.12/site-packages/pythonjsonlogger/__init__.py:24: DeprecationWarning: pythonjsonlogger.jsonlogger has been moved to pythonjsonlogger.json
warnings.warn(
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488: DeprecationWarning: Type google._upb._message.MessageMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488: DeprecationWarning: Type google._upb._message.ScalarMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.
../../home/codespace/.local/lib/python3.12/site-packages/jupyter_client/connect.py:22
../../home/codespace/.local/lib/python3.12/site-packages/jupyter_client/connect.py:22
../../home/codespace/.local/lib/python3.12/site-packages/jupyter_client/connect.py:22
../../home/codespace/.local/lib/python3.12/site-packages/jupyter_client/connect.py:22
/home/codespace/.local/lib/python3.12/site-packages/jupyter_client/connect.py:22: DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
given by the platformdirs library. To remove this warning and
see the appropriate new directories, set the environment variable
`JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
The use of platformdirs will be the default in `jupyter_core` v6
from jupyter_core.paths import jupyter_data_dir, jupyter_runtime_dir, secure_write
tests/flytekit/integration/remote/test_remote.py: 11 warnings
/workspaces/flytekit/flytekit/core/base_task.py:833: DeprecationWarning: `disable_deck` is deprecated and will be removed in the future.
Please use `enable_deck` instead.
warnings.warn(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
---------- coverage: platform linux, python 3.12.1-final-0 -----------
Coverage XML written to file coverage.xml
=========================================================================== short test summary info ===========================================================================
FAILED tests/flytekit/integration/remote/test_remote.py::test_open_ff - botocore.exceptions.NoCredentialsError: Unable to locate credentials
FAILED tests/flytekit/integration/remote/test_remote.py::test_attr_access_sd - botocore.exceptions.NoCredentialsError: Unable to locate credentials
FAILED tests/flytekit/integration/remote/test_remote.py::test_sd_attr - botocore.exceptions.NoCredentialsError: Unable to locate credentials
====================================================== 3 failed, 40 passed, 4 skipped, 35 warnings in 2063.87s (0:34:23) ======================================================
make[1]: *** [Makefile:94: integration_test] Error 1
make[1]: Leaving directory '/workspaces/flytekit'
make: *** [Makefile:90: integration_test_codecov] Error 2 |
it's succeeding locally for me too. the auth error you're seeing is a bit different, my guess is it's just failing because the minio password isn't set where you're running it. and different attempts are failing on different tests. |
they're succeeding remotely now too - I think something made them flaky and some executions timeout sometimes (I saw a flyte execution timeout error in the GH action logs) |
Code Review Agent Run #f887d7Actionable Suggestions - 0Review Details
|
Tracking issue
Closes flyteorg/flyte#5279
Why are the changes needed?
It is useful for users to invoke the same
LaunchPlan
but overriding at runtime whether they need executions to be interruptible or not. Currently this is possible through the Flyte UI, but notflytekit
.What changes were proposed in this pull request?
Support an
interruptible
option inFlyteRemote.execute
How was this patch tested?
Integration test added
Check all the applicable boxes
Summary by Bito
This PR implements comprehensive system improvements including authentication, domain management, and remote execution features, while enhancing Flyte's type system with strict type matching and improved dictionary handling. Key additions include interruptible flag override in FlyteRemote, configurable chunk sizes for S3/GCS operations, and enhanced pod spec generation. The implementation includes a new type_match_checking module, enhanced DictTransformer functionality, and improved error handling. Additional features comprise VLLM integration, Kubernetes data service plugin, and improved async operations with configurable batch sizes.Unit tests added: True
Estimated effort to review (1-5, lower is better): 5