MLflow Child Runs per Pipeline #448

Open
lemonhead94 opened this issue Aug 31, 2023 · 6 comments
Labels
enhancement New feature or request need-design-decision Several ways of implementation are possible and one must be chosen

Comments

@lemonhead94

Description

I was wondering if there is a clever way to start a nested MLflow run per executed pipeline, with the parent run being "default".

Context

This would allow for a clear separation of artifacts and metrics for the ETL, preprocessing, training and evaluation stages inside MLflow.

@Galileo-Galilei
Owner

Hi @lemonhead94, this is a good suggestion, but how do you expect kedro-mlflow to decide when to start and end a sub run? Using namespaced pipelines?

Note that you can "hack" it with a custom hook:

import mlflow

from kedro.framework.hooks import hook_impl


class MlflowSubRunHook:

    @hook_impl
    def before_node_run(self, node, catalog, inputs, is_async, session_id) -> None:
        # Open a nested (child) run just before the chosen node executes.
        if node.name == "<your-node-name>":
            mlflow.start_run(nested=True)

    @hook_impl
    def after_node_run(self, node, catalog, inputs, outputs, is_async, session_id) -> None:
        # Close the child run so later logging goes to the parent run again.
        if node.name == "<your-node-name>":
            mlflow.end_run()

and then in your settings.py:

from my_project.hooks import MlflowSubRunHook

HOOKS = (MlflowSubRunHook(),)

@lemonhead94
Author

Hi @Galileo-Galilei, thanks for the prompt response!

Yes, what I tried was simply caching the pipeline names in the pipeline registry file and then starting nested MLflow runs based on node.namespace.

However, I ran into two problems on my end.
First, your before_node_run hook was executed before my own, so the params were still logged to the parent MLflow run. What I ended up trying was hooking your hook, since you expose the instance here.
This worked; however, for some reason artifacts defined in the Kedro catalog are still logged to the parent run.
Second, I didn't know that Kedro doesn't execute pipelines linearly, so I saw sequences like preprocessing - training - preprocessing - training - evaluate ... which in hindsight makes sense from an efficiency perspective.

So what I will probably end up doing, since you look for an existing MLflow run, is simply using subprocesses, although that is rather ugly.

import mlflow
import subprocess

pipeline_names = ["preprocessing", "training", "evaluate", "inference"]

with mlflow.start_run(run_name="RUN_001") as parent_run:
    for pipeline_name in pipeline_names:
        with mlflow.start_run(run_name=pipeline_name, nested=True) as child_run:
            kedro_command = f"kedro run --pipeline {pipeline_name}"
            process = subprocess.Popen(kedro_command, shell=True)
            process.wait()
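
A possible variant of the loop above, sketched under assumptions rather than taken from the thread: MLflow's start_run() falls back to the MLFLOW_RUN_ID environment variable when no run_id is passed, so the child run's ID can be handed to each `kedro run` subprocess explicitly. The helper names (kedro_command, child_env, run_pipeline) are hypothetical. Whether kedro-mlflow then logs into that run depends on its version and on mlflow.yml, and the subprocess must point at the same tracking URI; neither is verified here.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def kedro_command(pipeline_name: str) -> List[str]:
    """Build the CLI call as an argument list, so no shell=True is needed."""
    return ["kedro", "run", "--pipeline", pipeline_name]


def child_env(run_id: str, base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and add the run ID for the subprocess.

    Assumption: the MLflow client in the subprocess resumes the run named by
    MLFLOW_RUN_ID when start_run() is called without an explicit run_id.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MLFLOW_RUN_ID"] = run_id
    return env


def run_pipeline(pipeline_name: str, run_id: str) -> None:
    """Run one Kedro pipeline in a subprocess; raise if it exits non-zero."""
    subprocess.run(kedro_command(pipeline_name), env=child_env(run_id), check=True)
```

Inside the loop, `run_pipeline(pipeline_name, child_run.info.run_id)` would replace the Popen call. Using check=True also surfaces a failed pipeline instead of silently continuing to the next stage.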

@Galileo-Galilei
Owner

Galileo-Galilei commented Sep 5, 2023

  • Regarding the artifacts defined in the Kedro catalog, I don't really know why this happens; I should investigate. I may have some "magic" that passes the original run because of the switch_catalog_logging function, but it sounds like a bug.

  • Regarding the nodes running in a different order, I am aware of this. It should be possible to use the namespace to perform checks and force everything to be logged in the "right run", but this is hard, and I don't think it would fit everyone's needs. I am pretty sure some people want a sub run per namespace, others for some namespaces but not all, and others would like to declare a start node and an end node through configuration... If you want to make a PR I can guide you, but this is a low priority for me, so it will likely take months or years unless this thread gets a lot of traction.

Your workaround works fine, but I understand it is a bit frustrating to have to tweak Kedro for this!
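
To make the namespace idea above concrete, here is a minimal sketch, not kedro-mlflow's implementation. It keeps one child run per namespace and resumes it by run ID whenever Kedro comes back to that namespace, so interleaved execution does not create duplicate child runs. NamespaceRunRouter and its enter/leave methods are hypothetical names. The sketch assumes a sequential runner and an already active parent run, and it does not address the hook-ordering or catalog-artifact problems mentioned earlier in the thread.

```python
from typing import Callable, Dict, Optional


class NamespaceRunRouter:
    """Hypothetical helper: one nested MLflow run per Kedro namespace."""

    def __init__(
        self,
        start_run: Optional[Callable] = None,
        end_run: Optional[Callable] = None,
    ) -> None:
        if start_run is None or end_run is None:
            # Imported lazily so the routing logic can be exercised without MLflow.
            import mlflow

            start_run = start_run or mlflow.start_run
            end_run = end_run or mlflow.end_run
        self._start_run = start_run
        self._end_run = end_run
        self.run_ids: Dict[str, str] = {}  # namespace -> child run ID

    def enter(self, namespace: Optional[str]) -> None:
        """Call from before_node_run with node.namespace."""
        if namespace is None:
            return  # nodes without a namespace keep logging to the parent run
        run_id = self.run_ids.get(namespace)
        if run_id is None:
            # First node of this namespace: create its child run.
            run = self._start_run(run_name=namespace, nested=True)
            self.run_ids[namespace] = run.info.run_id
        else:
            # Kedro came back to this namespace: resume the existing child run.
            self._start_run(run_id=run_id, nested=True)

    def leave(self, namespace: Optional[str]) -> None:
        """Call from after_node_run: close the child run again."""
        if namespace is not None:
            self._end_run()
```

In a hook, before_node_run would call `router.enter(node.namespace)` and after_node_run would call `router.leave(node.namespace)`. Closing the child run after every node keeps the parent run active between nodes, at the cost of one resume per node.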

@Galileo-Galilei Galileo-Galilei added enhancement New feature or request help wanted need-design-decision Several ways of implementation are possible and one must be chosen labels Sep 5, 2023
@lemonhead94
Author

lemonhead94 commented Sep 18, 2023

Sorry for the late response, I didn't have time to play around with it again until today.
I think I'm going with the hard-coded pipeline order solution; maybe I will have time in the future to work on a neatly integrated one.

For now I'll just leave this here for anybody stumbling across this issue.
This is a src/package_name/run.py file which simply runs all stages in order and creates the desired child run behaviour:

import os
from pathlib import Path

import mlflow
from kedro.config import OmegaConfigLoader
from kedro.framework.project import configure_project
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project
from kedro.utils import load_obj


def run(pipeline: str) -> None:
    runner = load_obj("SequentialRunner", "kedro.runner")
    with KedroSession.create() as session:
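        # Pass the pipeline name once, by keyword; passing it both positionally
        # and as pipeline_name would raise a TypeError.
        session.run(
            pipeline_name=pipeline,
            runner=runner(is_async=False),
        )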


def main() -> None:
    bootstrap_project(project_path=Path.cwd())
    configure_project(package_name=os.path.basename(os.path.dirname(__file__)))
    config_loader = OmegaConfigLoader(
        conf_source=f"{os.getcwd()}/conf",
        config_patterns={"mlflow": ["mlflow*", "mlflow/**", "**/mlflow*"]},
    )
    mlflow.set_experiment(
        experiment_name=config_loader["mlflow"]["tracking"]["experiment"]["name"]
    )

    pipeline_names = ["preprocess", "train", "evaluate", "inference"]
    with mlflow.start_run():
        for pipeline_name in pipeline_names:
            # The context manager ends the child run even if the pipeline raises.
            with mlflow.start_run(run_name=pipeline_name, nested=True):
                run(pipeline_name)


if __name__ == "__main__":
    main()

@Galileo-Galilei
Owner

Galileo-Galilei commented Sep 18, 2023

It's nice that you achieved your goal, but it is a pity that it comes at the price of redefining the run function and losing many Kedro advantages (e.g. hooks).

I'll try to come up with a solution one day, so I'll leave the issue open. Feel free to open a PR if you come up with a neat integration!

@Galileo-Galilei
Owner

Decision: I'll enable this using namespaced pipelines. See the rationale in the discussion here: kedro-org/kedro#4319
