
Deleting Experiments in MLflow

MLflow is an open-source platform for managing the machine learning lifecycle, including experiments, artifacts, models, and deployment. Machine learning projects tend to generate many experiments, which clutter a workspace over time and make the relevant ones harder to find. Deleting unnecessary experiments keeps the workspace tidy and easier to navigate.

Concept of Delete vs. Permanent Delete

MLflow offers two ways to remove an experiment: delete and permanent delete. A deleted experiment moves to the “Deleted” tab and can be restored until it is permanently deleted. Permanently deleting an experiment removes it from the MLflow backend storage and cannot be undone.
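MLflow records this distinction in each experiment's lifecycle_stage field, which is either "active" or "deleted". The following is a minimal sketch of grouping experiments by that field. The helper name group_by_lifecycle_stage and the sample objects are illustrative assumptions; against a real server, the list would come from MlflowClient().search_experiments(view_type=ViewType.ALL).

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_lifecycle_stage(experiments):
    """Group experiment names by their lifecycle_stage attribute."""
    groups = defaultdict(list)
    for exp in experiments:
        groups[exp.lifecycle_stage].append(exp.name)
    return dict(groups)


# Stand-in objects for illustration only (not real tracking data).
sample = [
    SimpleNamespace(name="Test-1", lifecycle_stage="deleted"),
    SimpleNamespace(name="Default", lifecycle_stage="active"),
]
stages = group_by_lifecycle_stage(sample)
print(stages)
```

Experiments listed under "deleted" can still be restored; once they are permanently deleted, they no longer appear in either group.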

Prerequisites to Delete an Experiment

  • Installed and configured MLflow
  • Access to the MLflow tracking server

Steps Involved in Deletion

  1. Identify the experiment to delete by its name or experiment ID.
  2. Mark the experiment for deletion using the MLflow API or CLI.
  3. Wait for the deletion to complete. This might take some time depending on the size of the experiment.
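The steps above can be sketched as one small helper. This is an illustration under stated assumptions, not the only way to do it: the function name delete_experiment_by_name is hypothetical, and client is assumed to be an mlflow.tracking.MlflowClient instance (or any object that exposes the same get_experiment_by_name and delete_experiment methods).

```python
def delete_experiment_by_name(client, name):
    """Resolve an experiment name to its ID and mark it for deletion.

    Returns the experiment ID, or None when there is nothing to delete.
    """
    # Step 1: identify the experiment (returns None when the name is unknown).
    experiment = client.get_experiment_by_name(name)
    if experiment is None or experiment.lifecycle_stage == "deleted":
        return None
    # Step 2: mark the experiment for deletion (a recoverable delete).
    client.delete_experiment(experiment.experiment_id)
    return experiment.experiment_id
```

With MLflow installed, a call could look like delete_experiment_by_name(MlflowClient(), "Test-1"), where MlflowClient is imported from mlflow.tracking.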

MLflow provides several ways to delete experiments. This article covers two of them.

Approach I: Deleting Experiments Using the MLflow Tracking API

This is the most straightforward and common approach for removing an MLflow experiment. The steps are as follows:

Step 1: Install MLflow

Make sure that MLflow is installed in the Python environment. If it is not, install it with pip:

pip install mlflow

Step 2: See and Delete the Experiments in the MLflow Tracking Server

There are two methods to see and delete the experiments, artifacts, and runs that exist on the MLflow tracking server:

Method 1: Using the MLflow Tracking UI

Open a terminal or command prompt and run the following command:

mlflow ui

Open http://127.0.0.1:5000 in the browser to load the MLflow Tracking UI.

Click on the “Experiments” tab.

A list of the experiments developed in the MLflow server will be shown.

Use any field to filter the experiments including the name, start time, end time, and others.

To see the details of an experiment, click on its name.

Click the trash or delete icon next to the name of the experiment that you want to remove from the MLflow Server.

When prompted, confirm the deletion.

Before confirming, double-check that the correct experiment is selected.

Method 2: Using the MLflow API

Import the MLflow API in the Python code.

Call the search_experiments() method. In earlier versions of MLflow, the equivalent method was list_experiments().

The method returns a list of experiments on the MLflow Server.

Iterate over the list of experiments with a “for” loop to see their details. In this example, only the experiment ID and name of each experiment are printed to the console.

When a remote server is utilized for experiment tracking, the set_tracking_uri method is used. In that scenario, substitute the necessary server for MLFLOW_SERVER_URI.

Here are the code details:

# Import mlflow API
import mlflow

# Set the MLflow Tracking Server URI if a remote server is being used.
# This step can be skipped when a local file-based tracking store is used (SKIPPED HERE).
# The default MLflow server is available at http://localhost:5000.
# mlflow.set_tracking_uri("http://MLFLOW_SERVER_URI")

# Obtain the list of experiments on the tracking server with mlflow's search_experiments function.
experiments = mlflow.search_experiments()

# Print the ID and name of each experiment.
for experiment in experiments:
    print(experiment.experiment_id + " -> " + experiment.name)

Running the script prints the ID and name of each experiment, one per line.

Step 3: Note the Experiment ID

Note down the experiment ID of the experiment to delete.

Step 4: Delete the Experiment Using the delete_experiment() Function

This function takes the experiment ID, in string format, as its input parameter. Pass the experiment ID that was noted in the previous step. In this example, the Test-1 experiment, with experiment ID 989185783576251785, is deleted. The code is as follows:

mlflow.delete_experiment("989185783576251785")

After the code runs, the Test-1 experiment is removed from the list of active experiments.
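One way to confirm the deletion programmatically is to read the experiment back and check its lifecycle stage. This is a minimal sketch: the function name delete_and_confirm is hypothetical, client is assumed to be an mlflow.tracking.MlflowClient instance, and get_experiment is assumed to still return the experiment after it has been marked for deletion.

```python
def delete_and_confirm(client, experiment_id):
    """Mark an experiment for deletion and report whether its stage is now 'deleted'."""
    client.delete_experiment(experiment_id)
    # Assumption: the tracking store still returns experiments that are marked as deleted.
    experiment = client.get_experiment(experiment_id)
    return experiment.lifecycle_stage == "deleted"
```

With MLflow installed, delete_and_confirm(MlflowClient(), "989185783576251785") would return True once the Test-1 experiment has been marked for deletion.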

Delete the Experiment

After an experiment is deleted, it is only marked for deletion. It is not permanently erased until the garbage collector picks it up. Typically, the garbage collection process takes a few minutes.

Delete the Experiment Permanently

To permanently remove the experiment together with its runs and artifacts, launch the garbage collector from the CLI (command-line interface). Deleting these unnecessary data files frees up disk space. Open a terminal or command prompt, go to the root directory of the MLflow project, and run the following command:

mlflow gc

Once the garbage collection process starts, MLflow permanently removes the runs that are in the deleted lifecycle stage, along with their artifacts and metadata.

Keep the following considerations in mind:

The data is deleted permanently during garbage collection and cannot be recovered.

Open-source MLflow does not run the garbage collector on a schedule; deleted data stays in the backend store until the aforementioned command is executed. Managed MLflow services may purge deleted experiments automatically after their own retention period.
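If garbage collection needs to be triggered from a script, for example in a scheduled job, the command can be assembled and run from Python. The sketch below is hedged: the helper names build_gc_command and run_gc are hypothetical, the store URI sqlite:///mlflow.db is a placeholder, and the --backend-store-uri and --older-than options should be checked against the output of mlflow gc --help for the installed version.

```python
import subprocess


def build_gc_command(backend_store_uri, older_than=None):
    """Assemble the argument list for the mlflow gc command."""
    cmd = ["mlflow", "gc", "--backend-store-uri", backend_store_uri]
    if older_than is not None:
        # Restrict the purge to deleted runs older than this limit, e.g. "7d".
        cmd += ["--older-than", older_than]
    return cmd


def run_gc(backend_store_uri, older_than=None):
    """Run garbage collection; raises CalledProcessError if the command fails."""
    subprocess.run(build_gc_command(backend_store_uri, older_than), check=True)


# Placeholder URI for illustration; this only prints the command and does not run it.
print(build_gc_command("sqlite:///mlflow.db", older_than="7d"))
```

Separating the command builder from the runner makes it possible to review the exact command before anything is deleted, which matters because garbage collection cannot be undone.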

Approach II: Delete the Experiment Using MLflow CLI (Command Line Interface)

The MLflow CLI can also remove an experiment with the experiments delete command. The command’s syntax is as follows:

mlflow experiments delete --experiment-id <EXPERIMENT_ID>

Open a terminal or Bash window and enter the aforementioned command, replacing <EXPERIMENT_ID> with the desired experiment ID. For example, 238769019307468458 is the ID of the MyFirstMLflowExperiment-2 experiment. When the command is executed, the experiment is moved to the deleted state.

Restore the Deleted Experiment

If an experiment is marked for deletion, it can be recovered with the restore command before it is permanently deleted. Here is the command:

mlflow experiments restore --experiment-id <EXPERIMENT_ID>

Running the restore command with the experiment ID 238769019307468458 recovers the experiment that was previously deleted through the command-line interface.
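The same recovery can be done from Python through the tracking client's restore_experiment method. This is a minimal sketch under assumptions: the function name restore_if_deleted is hypothetical, and client is assumed to be an mlflow.tracking.MlflowClient instance.

```python
def restore_if_deleted(client, experiment_id):
    """Restore an experiment only when it is currently marked as deleted.

    Returns True if a restore was performed, False otherwise.
    """
    experiment = client.get_experiment(experiment_id)
    if experiment.lifecycle_stage != "deleted":
        return False  # Already active, nothing to restore.
    client.restore_experiment(experiment_id)
    return True
```

Checking the lifecycle stage first keeps the helper safe to call on an experiment that is already active. A restore is only possible before the experiment has been permanently deleted.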

Conclusion

In MLflow, deleting experiments and their related runs is an important part of managing and maintaining the tracking data. Experiments are deleted to free up disk space, get rid of useless or outdated data, and keep MLflow projects organized. The MLflow Tracking API is the better fit when deletion needs to be programmatic or automated, while the CLI works well for one-off deletions and restores. Either way, regular cleanup saves storage and keeps the workflow for model management and machine learning experiments easy to follow.

About the author

Kalsoom Bibi

Hello, I am a freelance writer and usually write for Linux and other technology related content