MLflow Tracking

MLflow tracking is an optional, off-by-default projection of Daedalus training run records. It lives in daedalus.tracking and is wired through the training config. When disabled (the default), it imports no tracking backend and does nothing.

The contract

src/daedalus/tracking/base.py defines a small ExperimentTracker protocol with request dataclasses (StartRunRequest, EndRunRequest, LogParamsRequest, LogMetricsRequest, LogTagsRequest, LogArtifactsRequest). Two implementations satisfy it:

  • NullTracker — the default. Every method is a no-op and it deliberately imports no optional backend, so the common path stays dependency-free.
  • MLflowTracker (src/daedalus/tracking/mlflow.py) — backed by MLflow, which is imported lazily (inside the methods) so that importing daedalus.tracking is clean even when the optional extra is absent.
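The shape of this contract can be illustrated with a minimal sketch. It is not the code in src/daedalus/tracking/base.py: the field names on LogParamsRequest, the single-method protocol, and the LazyBackendTracker class (which imports json as a stand-in for the optional backend) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class LogParamsRequest:
    # Hypothetical fields; the real request dataclass may differ.
    run_id: str
    params: dict[str, str] = field(default_factory=dict)


@runtime_checkable
class ExperimentTracker(Protocol):
    # Only one method is shown; the real protocol covers more requests.
    def log_params(self, request: LogParamsRequest) -> None: ...


class NullTracker:
    """No-op tracker: imports nothing optional and records nothing."""

    def log_params(self, request: LogParamsRequest) -> None:
        return None


class LazyBackendTracker:
    """Stand-in for a backend tracker that defers its import to call time."""

    def log_params(self, request: LogParamsRequest) -> None:
        # The import sits inside the method, so merely importing this
        # module never requires the backend. json stands in for mlflow here.
        import json

        self.last_payload = json.dumps(request.params, sort_keys=True)
```

Because the protocol is structural, neither class has to inherit from ExperimentTracker; having the matching methods is enough.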

A get_tracker() factory (src/daedalus/tracking/__init__.py) selects between them:

python
from daedalus.tracking import get_tracker

tracker = get_tracker("null")     # default — no-op
tracker = get_tracker(
    "mlflow",
    tracking_uri="http://mlflow.internal:5000",
    experiment_name="dssm_ranking",
)

Wiring: default-off through the training config

The projection is configured on TrainingRunConfig.tracking (src/daedalus/pipelines/config.py) via a TrackingConfig:

python
class TrackingConfig(BaseModel):
    kind: Literal["null", "mlflow"] = "null"   # default: do nothing
    tracking_uri: str | None = None
    experiment_name: str | None = None
    run_name: str | None = None
    model_name: str | None = None
    model_version: str | None = None

Because kind defaults to "null", training does nothing tracking-related until you opt in. MLflow logging happens only after the Daedalus TrainingRunRecord sidecar has been written.
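The ordering described above (sidecar first, projection second, and only when opted in) can be sketched as follows. The finish_run function, the JSON sidecar format, and the step labels are hypothetical and exist only to show the sequence; they are not part of Daedalus.

```python
import json
import tempfile
from pathlib import Path


def finish_run(record: dict, sidecar_path: Path, kind: str = "null") -> list[str]:
    """Write the authoritative sidecar, then project it only if opted in."""
    steps: list[str] = []

    # Step 1: the sidecar is always written, regardless of tracking config.
    sidecar_path.write_text(json.dumps(record, sort_keys=True), encoding="utf-8")
    steps.append("sidecar_written")

    # Step 2: projection happens after the sidecar exists, and only for "mlflow".
    if kind == "mlflow":
        # A real implementation would call the tracker here.
        steps.append("projected_to_mlflow")
    return steps


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "run_record.json"
    default_steps = finish_run({"run_record_id": "demo"}, path)
    opted_in_steps = finish_run({"run_record_id": "demo"}, path, kind="mlflow")
```

With the default kind, only the sidecar step runs; the projection step appears solely in the opted-in call.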

What gets projected

The tracker projects a TrainingRunRecord into MLflow params, metrics, and tags (project_training_run_record / log_training_run_record). That includes:

  • the feature service @ version (daedalus.feature_service_at_version),
  • the git SHA of the run (daedalus.git_sha),
  • the config paths (runtime_config_path, aggregation_config_path),
  • the feed range / target date and the output root,
  • the run-record ids/paths (run_record_id, run_record_path),
  • plus daedalus.total_rows as a metric when known.
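As a rough illustration of the list above, a projection can be thought of as splitting a flat record into three dictionaries. This is a sketch under assumptions, not the body of project_training_run_record: the record is modelled as a plain dict, the input field names are guesses, and which keys become tags versus params is an arbitrary choice made here.

```python
def project_record(record: dict) -> tuple[dict, dict, dict]:
    """Split a flat run record into (params, tags, metrics)."""
    # Assumed split: config and run-record paths as params, identity fields as tags.
    param_keys = (
        "runtime_config_path",
        "aggregation_config_path",
        "run_record_id",
        "run_record_path",
    )
    tag_keys = {
        "git_sha": "daedalus.git_sha",
        "feature_service_at_version": "daedalus.feature_service_at_version",
    }

    params = {k: str(record[k]) for k in param_keys if record.get(k) is not None}
    tags = {
        out: str(record[src])
        for src, out in tag_keys.items()
        if record.get(src) is not None
    }

    # The row count is emitted as a metric only when it is known.
    metrics = {}
    if record.get("total_rows") is not None:
        metrics["daedalus.total_rows"] = float(record["total_rows"])
    return params, tags, metrics
```

Missing fields are skipped rather than logged as empty values, which mirrors the "when known" qualifier on daedalus.total_rows.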

Enabling it

MLflow is an optional extra that pulls in mlflow-skinny — the slim client with no pandas — to avoid dragging heavy deps into the engine (pyproject.toml: mlflow = ["mlflow-skinny>=3.14.0"]).

bash
# 1. Install the optional extra
uv sync --extra mlflow

yaml
# 2. Turn it on in config/training/runtime.yaml
tracking:
  kind: mlflow
  tracking_uri: http://mlflow.internal:5000
  experiment_name: dssm_ranking
  run_name: dssm_ranking_daily

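With that in place, a training run projects its run record to MLflow. Leaving kind at its default of "null" (or omitting the tracking block) keeps the NullTracker path and avoids the dependency entirely. If you set the default explicitly in YAML, quote it (kind: "null"), because a bare null parses as the YAML null value rather than the string "null".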

Projection, not a dependency

MLflow is a projection of run records, never a source of truth. The Daedalus TrainingRunRecord sidecar is authoritative; MLflow is an optional mirror for teams that already run a tracking server.

See also