Wednesday, January 7, 2026

Migrate MLflow tracking servers to Amazon SageMaker AI with serverless MLflow

Running a self-managed MLflow tracking server comes with administrative overhead, including server maintenance and resource scaling. As teams scale their ML experimentation, efficiently managing resources across peak usage and idle periods becomes a challenge. Organizations running MLflow on Amazon EC2 or on-premises can optimize costs and engineering resources by using Amazon SageMaker AI with serverless MLflow.

This post shows you how to migrate your self-managed MLflow tracking server to an MLflow App, a serverless tracking server on SageMaker AI that automatically scales resources based on demand and removes server patching and storage management tasks. Learn how to use the MLflow Export Import tool to transfer your experiments, runs, models, and other MLflow resources, along with instructions to validate your migration's success.

While this post focuses on migrating from self-managed MLflow tracking servers to SageMaker with MLflow, the MLflow Export Import tool offers broader utility. You can apply the same approach to migrate existing SageMaker managed MLflow tracking servers to the new serverless MLflow capability on SageMaker. The tool also helps with version upgrades and with establishing backup routines for disaster recovery.

Step-by-step guide: Tracking server migration to SageMaker with MLflow

The following guide provides step-by-step instructions for migrating an existing MLflow tracking server to SageMaker with MLflow. The migration process consists of three main phases: exporting your MLflow artifacts to intermediate storage, configuring an MLflow App, and importing your artifacts. You can choose to run the migration from an EC2 instance, your local computer, or a SageMaker notebook; whichever environment you select must maintain connectivity to both your source tracking server and your target tracking server. MLflow Export Import supports exports from both self-managed tracking servers and Amazon SageMaker MLflow tracking servers (from MLflow v2.16 onwards) to Amazon SageMaker serverless MLflow.

Figure 1: Migration process with MLflow Export Import tool

Prerequisites

To follow along with this post, make sure you have the following prerequisites:

Step 1: Verify MLflow version compatibility

Before starting the migration, keep in mind that not all MLflow features may be supported in the migration process. The MLflow Export Import tool supports different objects depending on your MLflow version. To prepare for a successful migration:

  1. Verify the current MLflow version of your existing MLflow tracking server (a quick way to do this is sketched after this list):
  2. Review the latest supported MLflow version in the Amazon SageMaker MLflow documentation. If you're running an older MLflow version in a self-managed environment, we recommend upgrading to the latest version supported by Amazon SageMaker MLflow before proceeding with the migration:
    pip install --upgrade mlflow=={supported_version}

  3. For an up-to-date list of MLflow resources that can be transferred using MLflow Export Import, refer to the MLflow Export Import documentation.
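
Item 1 above refers to checking the version on your existing server. A minimal sketch, assuming you have shell access to the Python environment that talks to that server:

# Print the MLflow client version installed in this environment
python -c "import mlflow; print(mlflow.__version__)"

# The MLflow CLI reports the same information
mlflow --version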

Step 2: Create a new MLflow App

To prepare your target environment, you first need to create a new SageMaker serverless MLflow App.

  1. After you've set up SageMaker AI (see also Guide to getting set up with Amazon SageMaker AI), you can access Amazon SageMaker Studio and, in the MLflow section, create a new MLflow App (if it wasn't automatically created during the initial domain setup). Follow the instructions outlined in the SageMaker documentation.
  2. Once your managed MLflow App has been created, it should appear in your SageMaker Studio console. Keep in mind that the creation process can take up to 5 minutes.
Figure 2: MLflow App in SageMaker Studio Console

Alternatively, you can view it by executing the following AWS Command Line Interface (AWS CLI) command:

aws sagemaker list-mlflow-tracking-servers

  3. Copy the Amazon Resource Name (ARN) of your tracking server to a document; it's needed in Step 6.
  4. Choose Open MLflow, which takes you to an empty MLflow dashboard. In the following steps, we import our experiments and related artifacts from our self-managed MLflow tracking server here.
Figure 3: MLflow user interface, landing page
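
If you prefer to capture the ARN programmatically instead of copying it from the console, you can extend the CLI call shown above with a JMESPath query. The response field names below are an assumption based on the ListMlflowTrackingServers API shape; adjust them if your output differs:

# Print only the ARNs of the listed tracking servers (field names assumed
# from the ListMlflowTrackingServers response; verify against your output)
aws sagemaker list-mlflow-tracking-servers \
    --query 'TrackingServerSummaries[].TrackingServerArn' \
    --output text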

Step 3: Install MLflow and the SageMaker MLflow plugin

To prepare your execution environment for the migration, you need to establish connectivity to your existing MLflow servers (see prerequisites) and install and configure the required MLflow packages and plugins.

  1. Before you can start with the migration, you need to establish connectivity and authenticate to the environment hosting your existing self-managed MLflow tracking server (for example, a virtual machine).
  2. Once you have access to your tracking server, you need to install MLflow and the SageMaker MLflow plugin in your execution environment. The plugin handles connection establishment and authentication to your MLflow App. Execute the following command (see also the documentation):
pip install mlflow sagemaker-mlflow
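
Before moving on, it can help to confirm that both packages are installed and that the execution environment can reach your source server. A small check, assuming your self-managed server listens on the placeholder address http://localhost:8080 and exposes MLflow's standard /health endpoint:

# Confirm both packages are present in the current environment
pip show mlflow sagemaker-mlflow

# Check that the source tracking server responds (placeholder URI)
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/health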

Step 4: Install the MLflow Export Import tool

Before you can export your MLflow resources, you need to install the MLflow Export Import tool.

  1. Familiarize yourself with the MLflow Export Import tool and its capabilities by visiting its GitHub page. In the following steps, we use its bulk tools (specifically export-all and import-all), which let you create a copy of your tracking server with its experiments and related artifacts. This approach maintains the referential integrity between objects. If you want to migrate only selected experiments or change the name of existing experiments, you can use the Single tools. Review the MLflow Export Import documentation for more information on supported objects and limitations.
  2. Install the MLflow Export Import tool in your environment by executing the following command:
pip install git+https://github.com/mlflow/mlflow-export-import/#egg=mlflow-export-import
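
To confirm that the installation registered the bulk tools on your PATH, you can call them with their help flag; both commands should print usage information:

# Verify the bulk tools installed by mlflow-export-import are available
export-all --help
import-all --help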

Step 5: Export MLflow resources to a directory

Now that your environment is configured, we can begin the actual migration by exporting your MLflow resources from your source environment.

  1. After you've installed the MLflow Export Import tool, create a target directory in your execution environment as a destination for the resources you extract in the next step.
  2. Inspect your existing experiments and the associated MLflow resources you want to export. In the following example, we want to export the currently stored objects (for example, experiments and registered models).
    Figure 4: Experiments stored in MLflow

  3. Start the migration by configuring the Uniform Resource Identifier (URI) of your tracking server as an environment variable and executing the following bulk export tool with the parameters of your existing MLflow tracking server and a target directory (see also the documentation):
# Set the tracking URI to your self-managed MLflow server
export MLFLOW_TRACKING_URI=http://localhost:8080

# Start export
export-all --output-dir mlflow-export

  4. Wait until the export has finished, then inspect the output directory (in the preceding case: mlflow-export); a quick way to do this is sketched below.
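
The exact contents of the output directory depend on your server's data and the tool version; treat the experiments/ and models/ layout mentioned in the comment as an assumption and compare it with what you actually see:

# Inspect the export output; bulk exports typically contain experiments/
# and models/ subdirectories with JSON metadata (assumed layout)
find mlflow-export -maxdepth 2 -type d
du -sh mlflow-export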

Step 6: Import MLflow resources to your MLflow App

During the import, user-defined attributes are retained, but system-generated tags (for example, creation_date) are not preserved by MLflow Export Import. To preserve the original system attributes, use the --import-source-tags option, which saves them as tags with the mlflow_exim prefix. For more information, see MLflow Export Import – Governance and Lineage. Be aware of additional limitations detailed here: Import Limitations.

The following procedure transfers your exported MLflow resources into your new MLflow App:

  1. Start the import by configuring the URI for your MLflow App. You can use the ARN, which you saved in Step 2, for this. The previously installed SageMaker MLflow plugin automatically translates the ARN into a valid URI and creates an authenticated request to AWS (remember to configure your AWS credentials as environment variables so the plugin can pick them up).

# Set the tracking URI to your MLflow App ARN
export MLFLOW_TRACKING_URI=arn:aws:sagemaker:::mlflow-app/app-

# Start import
import-all --input-dir mlflow-export
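
The plugin resolves credentials through the standard AWS credential chain. If your execution environment doesn't have an attached IAM role, one way to provide credentials before running import-all is via environment variables; the values below are placeholders:

# Placeholder credentials; prefer an IAM role or your usual credential provider
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_SESSION_TOKEN=<your-session-token>   # only needed for temporary credentials
export AWS_DEFAULT_REGION=<region-of-your-mlflow-app>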

Step 7: Validate your migration results

To confirm that your migration was successful, verify that your MLflow resources have been transferred correctly:

  1. Once the import-all script has migrated your experiments, runs, and other objects to the new tracking server, you can start verifying the success of the migration by opening the dashboard of your serverless MLflow App (which you opened in Step 2) and confirming that:
    • Exported MLflow resources are present with their original names and metadata
    • Run histories are complete with their metrics and parameters
    • Model artifacts are accessible and downloadable
    • Tags and notes are preserved
      Figure 5: MLflow user interface, landing page after migration

  2. You can verify programmatic access by starting a new SageMaker notebook and running the following code:
import mlflow

# Set the tracking URI to your MLflow App ARN
mlflow.set_tracking_uri('arn:aws:sagemaker:::mlflow-app/app-')

# List all experiments
experiments = mlflow.search_experiments()
for exp in experiments:
    print(f"Experiment Name: {exp.name}")
    # Get all runs for this experiment
    runs = mlflow.search_runs(experiment_ids=[exp.experiment_id])
    print(f"Number of runs: {len(runs)}")
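
If your migration included the model registry, a similar check with MLflow's fluent registry API can confirm that registered models and their versions arrived; this sketch assumes the tracking URI set in the previous cell is still active:

# List registered models and their latest versions on the new MLflow App
for model in mlflow.search_registered_models():
    print(f"Registered model: {model.name}")
    for version in model.latest_versions:
        print(f"  version {version.version} (stage: {version.current_stage})")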

Considerations

When planning your MLflow migration, verify that your execution environment (whether EC2, a local machine, or a SageMaker notebook) has sufficient storage and compute resources to handle your source tracking server's data volume. While the migration can run in various environments, performance may vary based on network connectivity and available resources. For large-scale migrations, consider breaking the process down into smaller batches (for example, individual experiments), as sketched below.
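
One way to batch the work is to use the tool's experiment-level bulk commands instead of export-all. The following is a sketch only; the --experiments flag and the experiment names are assumptions, so confirm the options with export-experiments --help for the version you installed:

# Export a subset of experiments (hypothetical experiment names)
export-experiments --experiments "churn-model,fraud-detection" --output-dir mlflow-export-batch1

# Import that batch into the MLflow App
import-experiments --input-dir mlflow-export-batch1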

Cleanup

A SageMaker managed MLflow tracking server will incur costs until you delete or stop it. Billing for tracking servers is based on how long the servers have been running, the size selected, and the amount of data logged to them. You can stop tracking servers when they're no longer in use to save costs, or you can delete them using the API or the SageMaker Studio UI. For more details on pricing, refer to Amazon SageMaker pricing.
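
For the API route, the SageMaker CLI exposes stop and delete operations for managed MLflow tracking servers; a minimal sketch with a placeholder server name:

# Stop a tracking server that is temporarily not in use (placeholder name)
aws sagemaker stop-mlflow-tracking-server --tracking-server-name my-tracking-server

# Or delete it entirely once the migration is confirmed
aws sagemaker delete-mlflow-tracking-server --tracking-server-name my-tracking-server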

Conclusion

In this post, we demonstrated how to migrate a self-managed MLflow tracking server to SageMaker with MLflow using the open source MLflow Export Import tool. Migrating to a serverless MLflow App on Amazon SageMaker AI reduces the operational overhead associated with maintaining MLflow infrastructure while providing seamless integration with the comprehensive AI/ML services in SageMaker AI.

To get started with your own migration, follow the preceding step-by-step guide and consult the referenced documentation for more details. You can find code samples and examples in our AWS Samples GitHub repository. For more information about Amazon SageMaker AI capabilities and other MLOps features, visit the Amazon SageMaker AI documentation.


About the authors

Rahul Easwar is a Senior Product Manager at AWS, leading managed MLflow and Partner AI Apps within the SageMaker AIOps organization. With over 20 years of experience spanning startups to enterprise technology, he leverages his entrepreneurial background and MBA from Chicago Booth to build scalable ML platforms that simplify AI adoption for organizations worldwide. Connect with Rahul on LinkedIn to learn more about his work in ML platforms and enterprise AI solutions.

Roland Odorfer is a Solutions Architect at AWS, based in Berlin, Germany. He works with German industry and manufacturing customers, helping them architect secure and scalable solutions. Roland is interested in distributed systems and security. He enjoys helping customers use the cloud to solve complex challenges.

Anurag Gajam is a Software Development Engineer on the Amazon SageMaker MLflow team at AWS. His technical interests span AI/ML infrastructure and distributed systems, and he is a recognized MLflow contributor who enhanced the mlflow-export-import tool by adding support for additional MLflow objects to enable seamless migration between SageMaker MLflow services. He focuses on solving complex problems and building reliable software that powers AI workloads at scale. In his free time, he enjoys playing badminton and going for hikes.


