MLflow Child Runs per Pipeline #448

lemonhead94 · 2023-08-31T16:03:52Z

Description

I was wondering whether there is a clever way to start a nested mlflow run for each executed pipeline, with the parent run being "default".

Context

This would allow a clear separation of artifacts and metrics for the ETL, preprocessing, training and evaluation stages inside MLflow.

Galileo-Galilei · 2023-09-03T19:16:05Z

Hi @lemonhead94, this is a good suggestion, but how do you expect kedro-mlflow to decide when to start and end a sub-run? Using namespaced pipelines?

Note that you can "hack" it with a custom hook:

import mlflow

from kedro.framework.hooks import hook_impl


class MlflowSubRunHook:

    @hook_impl
    def before_node_run(self, node, catalog, inputs, is_async, session_id) -> None:
        # Open a nested (child) run just before the chosen node executes.
        if node.name == "<your-node-name>":
            mlflow.start_run(nested=True)

    @hook_impl
    def after_node_run(self, node, catalog, inputs, outputs, is_async, session_id) -> None:
        # Close the child run once the node has finished.
        if node.name == "<your-node-name>":
            mlflow.end_run()

and then in your settings.py:

from my_project.hooks import MlflowSubRunHook

HOOKS = (MlflowSubRunHook(),)

lemonhead94 · 2023-09-03T19:51:48Z

Hi @Galileo-Galilei, thanks for the prompt response!

Yes, what I tried to do was simply cache the pipeline names in the pipeline registry file and then start nested mlflow runs using node.namespace.

However, I ran into two problems on my end.
Firstly, your before_node_run hook was being executed before my own, so the params were still logged to the parent mlflow run. What I ended up trying was hooking your hook, since you expose the instance here.
This worked; however, for some reason, artifacts defined in the kedro catalog are still logged to the parent run.
Secondly, I didn't know that kedro doesn't execute pipelines linearly, so I saw orders like preprocessing - training - preprocessing - training - evaluate ... which in hindsight makes sense from an efficiency perspective.
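
To make the second problem concrete, here is a minimal sketch in plain Python (no kedro or mlflow involved). It assumes a hook that naively opens a new child run every time the namespace of the executed node changes. The `contiguous_segments` helper and the `order` list are hypothetical and only mirror the interleaving described above.

```python
from itertools import groupby


def contiguous_segments(executed_namespaces):
    """Collapse an execution order into consecutive (namespace, node_count) segments."""
    return [(ns, len(list(group))) for ns, group in groupby(executed_namespaces)]


# Hypothetical node order, mirroring the interleaving described above.
order = ["preprocessing", "training", "preprocessing", "training", "evaluate"]
segments = contiguous_segments(order)

# A hook that opens a run on every namespace change would create one run per segment.
print(len(segments), "segments for", len(set(order)), "pipelines")
# prints "5 segments for 3 pipelines"
```

With this order the naive approach would yield five child runs for three pipelines, so a child run has to be reopened rather than recreated when its namespace comes back.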

So what I will probably end up doing, since you look for an existing mlflow run, is simply using subprocesses, although that is rather ugly.

import mlflow
import subprocess

pipeline_names = ["preprocessing", "training", "evaluate", "inference"]

with mlflow.start_run(run_name="RUN_001") as parent_run:
    for pipeline_name in pipeline_names:
        with mlflow.start_run(run_name=pipeline_name, nested=True) as child_run:
            kedro_command = f"kedro run --pipeline {pipeline_name}"
            process = subprocess.Popen(kedro_command, shell=True)
            process.wait()

Galileo-Galilei · 2023-09-05T21:10:58Z

Regarding the artifacts defined in the kedro catalog, I don't really know why this happens; I should investigate. I may have some "magic" that passes the original run because of the switch_catalog_logging function, but it sounds like a bug.
Regarding the nodes running in a different order, I am aware of this. It should be possible to use the namespace to perform checks and force everything to be logged in the "right run", but this is hard, and I don't think it would really fit everyone's needs. I am pretty sure some people want a sub-run per namespace, others for some namespaces but not all, and others would like to declare a start node and an end node through configuration... If you want to make a PR I can guide you, but this is a low priority for me to develop, so it will likely take months or years unless this thread gets a lot of traction.

Your workaround works fine, but I understand this is a bit frustrating to need to tweak kedro for this!
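
For readers who want to experiment with the namespace idea, here is a rough sketch of the routing logic only. It is not how kedro-mlflow works. `NamespaceRunRouter` and its `tracker` argument are hypothetical names. The tracker is assumed to behave like the `mlflow` module, and only `start_run(run_name=..., nested=True)`, `start_run(run_id=..., nested=True)` and `end_run()` are used. Grouping by top-level namespace is also an assumption.

```python
class NamespaceRunRouter:
    """Route each node to one child run per top-level namespace (hypothetical helper).

    `tracker` is assumed to expose the same `start_run` / `end_run` functions
    as the `mlflow` module; it is injected so the logic can be tried in isolation.
    """

    def __init__(self, tracker):
        self._tracker = tracker
        self._run_ids = {}  # top-level namespace -> child run id

    def enter(self, namespace):
        """Open or reopen the child run for `namespace`; return True if one was opened."""
        if not namespace:
            return False  # nodes without a namespace keep logging to the parent run
        key = namespace.split(".")[0]
        if key in self._run_ids:
            # The namespace was seen before: resume its existing child run.
            self._tracker.start_run(run_id=self._run_ids[key], nested=True)
        else:
            # First node of this namespace: create the child run and remember its id.
            run = self._tracker.start_run(run_name=key, nested=True)
            self._run_ids[key] = run.info.run_id
        return True

    def leave(self, namespace):
        """Close the child run opened by `enter`, leaving the parent run active."""
        if namespace:
            self._tracker.end_run()
```

In a custom hook like the one earlier in this thread, one could call `router.enter(node.namespace)` in before_node_run and `router.leave(node.namespace)` in after_node_run, with `router = NamespaceRunRouter(mlflow)`. This does not address the hook-ordering problem reported above, and it assumes nodes run one at a time (for example with SequentialRunner).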

lemonhead94 · 2023-09-18T14:23:54Z

Sorry for the late response; I didn't have time to play around with it again until today.
I think I'm going with the hard-coded pipeline order solution; maybe I will have time in the future to work on a neatly integrated one.

For now, I'll just leave this here for anybody stumbling across this issue.
This is a src/package_name/run.py file which simply runs all stages in order and creates the desired child run behaviour:

import os
from pathlib import Path

import mlflow
from kedro.config import OmegaConfigLoader
from kedro.framework.project import configure_project
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project
from kedro.utils import load_obj


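def run(pipeline: str) -> None:
    # Resolve the runner class by name, then run the named pipeline in a fresh session.
    runner = load_obj("SequentialRunner", "kedro.runner")
    with KedroSession.create() as session:
        # pipeline_name is passed once, by keyword; passing it positionally as well
        # would supply the same argument twice.
        session.run(
            pipeline_name=pipeline,
            runner=runner(is_async=False),
        )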


def main() -> None:
    bootstrap_project(project_path=Path.cwd())
    configure_project(package_name=os.path.basename(os.path.dirname(__file__)))
    config_loader = OmegaConfigLoader(
        conf_source=f"{os.getcwd()}/conf",
        config_patterns={"mlflow": ["mlflow*", "mlflow/**", "**/mlflow*"]},
    )
    mlflow.set_experiment(
        experiment_name=config_loader["mlflow"]["tracking"]["experiment"]["name"]
    )

    pipeline_names = ["preprocess", "train", "evaluate", "inference"]
    with mlflow.start_run():  # parent run
        for pipeline_name in pipeline_names:
            # One nested child run per pipeline; the context manager ends it even on failure.
            with mlflow.start_run(run_name=pipeline_name, nested=True):
                run(pipeline_name)


if __name__ == "__main__":
    main()

Galileo-Galilei · 2023-09-18T18:34:36Z

It's nice that you achieved your goal, but it is a pity that it comes at the price of redefining the run function and losing many kedro advantages (e.g. hooks).

I'll try to come up with a solution one day, so I'll leave the issue open. Feel free to open a PR if you come up with a neat integration!

Galileo-Galilei · 2024-11-30T21:11:49Z

Decision: I'll enable this using namespaced pipelines. See the rationale in the discussion here: kedro-org/kedro#4319

Galileo-Galilei added the enhancement (New feature or request), help wanted, and need-design-decision (Several ways of implementation are possible and one must be chosen) labels Sep 5, 2023

Galileo-Galilei removed the help wanted label Oct 28, 2023

Galileo-Galilei added this to kedro-mlflow roadmap Oct 28, 2023

Galileo-Galilei moved this to 🆕 New in kedro-mlflow roadmap Oct 28, 2023

Galileo-Galilei moved this from 🆕 New to 📋 Backlog in kedro-mlflow roadmap Oct 28, 2023
