Concept of Delete vs. Permanent Delete
MLflow offers two ways to remove an experiment: delete and permanent delete. A deleted experiment is only marked as deleted and moved to the “Deleted” tab, from which it can be restored as long as it has not been permanently removed. Permanently deleting an experiment removes it from the MLflow backend storage and cannot be undone.
Prerequisites to Delete an Experiment
- Installed and configured MLflow
- Access to the MLflow tracking server
Steps Involved in Deletion
- Identify the experiment to delete by its name or experiment ID.
- Mark the experiment for deletion using the MLflow UI, API, or CLI.
- Wait for the deletion to complete. This might take some time depending on the size of the experiment.
MLflow provides several ways to delete experiments. This article covers two of them.
Approach I: Deleting Experiments Using the MLflow Tracking API
This is the most straightforward and common way to remove an MLflow experiment. The steps for deleting experiments with the MLflow Tracking API are as follows:
Step 1: Install MLflow
Make sure that MLflow is installed in the Python environment. If it is not, it can be installed with pip by running pip install mlflow.
Step 2: See and Delete the Experiments in the MLflow Tracking Server
There are two methods to see and delete the experiments, artifacts, and runs that exist on the MLflow tracking server:
Method 1: Using the MLflow Tracking UI
Open a terminal or command prompt and start the MLflow Tracking UI (typically with the mlflow ui command).
Open http://127.0.0.1:5000 in the browser to load the MLflow Tracking UI. The MLflow Tracking Server interface appears in the browser window.
Click on the “Experiments” tab.
A list of the experiments created on the MLflow server is shown.
Use any field to filter the experiments including the name, start time, end time, and others.
To see the details of an experiment, click on its name.
Click the trash or delete icon next to the name of the experiment that you want to remove from the MLflow Server.
Before confirming the deletion, make sure that the right experiment is selected.
When prompted, confirm the deletion.
Method 2: Using the MLflow API
Import the MLflow API in the Python code.
Call the search_experiments() method (named list_experiments() in earlier MLflow versions).
The method returns a list of the experiments on the MLflow Server.
Iterate over the list with a “for” loop to see the details of each experiment. In this example, only the experiment ID and name of each experiment are printed to the console.
When a remote server is used for experiment tracking, call the set_tracking_uri method first and replace MLFLOW_SERVER_URI with the address of that server.
Here are the code details:
import mlflow
# Set the MLflow tracking server URI if a remote server is being used.
# This step can be skipped when a local file-based tracking store is used (skipped here).
# A locally started MLflow server is reachable at http://localhost:5000 by default.
# mlflow.set_tracking_uri("http://MLFLOW_SERVER_URI")
# Use mlflow's search_experiments function to list the experiments in the tracking server.
experiments = mlflow.search_experiments()
for experiment in experiments:
    # Print the ID and name of each experiment.
    print(f"{experiment.experiment_id} -> {experiment.name}")
When the script runs, the ID and name of each experiment are printed to the console.
Step 3: Note the Experiment ID
Note down the experiment ID of the experiment to delete.
Step 4: Delete the Experiment Using the delete_experiment Function
This function takes the experiment ID, in string format, as its input parameter. Pass the experiment ID that was noted in the previous step. In this example, the experiment Test-1, with experiment ID 989185783576251785, is deleted by calling mlflow.delete_experiment("989185783576251785").
Once the code has run, the Test-1 experiment no longer appears among the active experiments.
Delete the Experiment
After an experiment has been deleted, it is only marked for deletion. It is not permanently erased until the garbage collector picks it up. The garbage collection process itself typically takes a few minutes.
Delete the Experiment Permanently
To permanently remove the experiment together with its runs and artifacts, run the garbage collector from the CLI (command line interface). Deleting these unneeded data files frees up disk space. Open a terminal or command prompt, go to the root directory of the MLflow project, and run the mlflow gc command. If the backend store is not in the default location, point the command at it with the --backend-store-uri option.
Once the garbage collection process starts, MLflow permanently removes the runs that are marked as deleted, including those that belong to deleted experiments.
Keep the following considerations in mind:
The data is deleted permanently during garbage collection and cannot be recovered.
Open-source MLflow does not schedule garbage collection on its own, so deleted data stays in the backend store until the aforementioned command is run explicitly. If deleted experiments need to be cleaned up regularly, run the command periodically, for example from a scheduled job.
Approach II: Delete the Experiment Using MLflow CLI (Command Line Interface)
Users can remove an experiment from the command line with the mlflow experiments delete command. The command’s syntax is as follows:
mlflow experiments delete --experiment-id EXPERIMENT_ID
Open a terminal or Bash window and enter this command, replacing EXPERIMENT_ID with the ID of the experiment to delete. In the following scenario, the experiment ID 238769019307468458, which belongs to MyFirstMLflowExperiment-2, is used. When the command is executed, the experiment is moved to the trash.
Restore the Deleted Experiment
An experiment that has been marked for deletion can be recovered with the restore command until it is permanently deleted. Here is the command:
mlflow experiments restore --experiment-id EXPERIMENT_ID
For example, running the command with the experiment ID 238769019307468458 recovers the experiment that was previously deleted via the command line interface.
Conclusion
In MLflow, deleting experiments and their related runs is an important part of managing and maintaining the tracking data. Experiments are deleted to free up disk space, get rid of useless or outdated data, and keep MLflow projects organized. The MLflow Tracking API can be used when deletion needs to be done programmatically or automated, while the UI and CLI cover manual cleanup. Because a deleted experiment can be restored until garbage collection permanently removes it, check which experiments are marked as deleted before running the garbage collector.