Saturday, January 31, 2026

How Hapag-Lloyd improved schedule reliability with ML-powered vessel schedule predictions using Amazon SageMaker



This post is cowritten with Thomas Voss and Bernhard Hersberger from Hapag-Lloyd.

Hapag-Lloyd is one of the world's leading shipping companies, with more than 308 modern vessels, 11.9 million TEUs (twenty-foot equivalent units) transported per year, and 16,700 motivated employees in more than 400 offices in 139 countries. They connect continents, companies, and people through reliable container transportation services on the major trade routes across the globe.

In this post, we share how Hapag-Lloyd developed and implemented a machine learning (ML)-powered assistant that predicts vessel arrival and departure times and has transformed their schedule planning. By using Amazon SageMaker AI and implementing robust MLOps practices, Hapag-Lloyd has enhanced its schedule reliability, a key performance indicator in the industry and a quality promise to its customers.

For Hapag-Lloyd, accurate vessel schedule predictions are crucial for maintaining schedule reliability, defined as the percentage of vessels arriving within 1 calendar day (earlier or later) of their estimated arrival time, communicated around 3 to 4 weeks before arrival.
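The reliability metric defined above can be computed directly from pairs of communicated ETAs and actual arrivals. The following sketch is illustrative only (the field layout and tolerance logic are assumptions, not Hapag-Lloyd's actual code):

```python
from datetime import datetime

def schedule_reliability(arrivals):
    """Share of arrivals within +/-1 calendar day of the ETA
    that was communicated 3 to 4 weeks in advance."""
    on_time = 0
    for eta, actual in arrivals:
        # Compare calendar days, not exact hours: an arrival counts as
        # reliable if it lands on the ETA date, the day before, or the day after.
        if abs((actual.date() - eta.date()).days) <= 1:
            on_time += 1
    return on_time / len(arrivals)

arrivals = [
    (datetime(2025, 8, 18, 13), datetime(2025, 8, 18, 22)),  # same day
    (datetime(2025, 8, 22, 6), datetime(2025, 8, 23, 4)),    # 1 day late
    (datetime(2025, 9, 1, 8), datetime(2025, 9, 4, 10)),     # 3 days late
]
print(schedule_reliability(arrivals))  # 2 of 3 arrivals within tolerance
```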

Prior to developing the new ML solution, Hapag-Lloyd relied on simple rule-based and statistical calculations, based on historical transit patterns, for vessel schedule predictions. While this statistical method provided basic predictions, it couldn't effectively account for real-time conditions such as port congestion, requiring significant manual intervention from operations teams.

Developing a new ML solution to replace the existing system presented several key challenges:

  • Dynamic shipping conditions – The estimated time of arrival (ETA) prediction model needs to account for numerous variables that affect journey duration, including weather conditions, port-related delays such as congestion, labor strikes, and unexpected events that force route changes. For example, when the Suez Canal was blocked by the Ever Given container ship in March 2021, vessels had to be rerouted around Africa, adding approximately 10 days to their journey times.
  • Data integration at scale – The development of accurate models requires integration of large volumes of historical voyage data with external real-time data sources, including port congestion information and vessel position tracking (AIS). The solution needs to scale across 120 vessel services, or lines, and 1,200 unique port-to-port routes.
  • Robust MLOps infrastructure – A robust MLOps infrastructure is required to continuously monitor model performance and quickly deploy updates whenever needed. This includes capabilities for regular model retraining to adapt to changing patterns, comprehensive performance monitoring, and maintaining real-time inference capabilities for rapid schedule adjustments.

Hapag-Lloyd's previous approach to schedule planning couldn't effectively address these challenges. A comprehensive solution was needed, one that could handle the complexity of vessel schedule prediction and provide the infrastructure required to sustain ML operations at global scale.

The Hapag-Lloyd network consists of over 308 vessels and many more partner vessels that continuously circumnavigate the globe on predefined service routes, resulting in more than 3,500 port arrivals per month. Each vessel operates on a fixed service line, making regular round trips between a sequence of ports. For instance, a vessel might repeatedly sail a route from Southampton to Le Havre, Rotterdam, Hamburg, New York, and Philadelphia before starting the cycle again. For each port arrival, an ETA must be provided several weeks in advance to arrange necessary logistics, including berth windows at ports and onward transportation of containers by sea, land, or air. The following table shows an example where a vessel travels from Southampton to New York through Le Havre, Rotterdam, and Hamburg. The vessel's time until arrival at the New York port can be calculated as the sum of the ocean-to-port time to Southampton, plus the respective berth times and port-to-port times for the intermediate ports called while sailing to New York. If this vessel encounters a delay in Rotterdam, it affects its arrival in Hamburg and cascades through the entire schedule, impacting arrivals in New York and beyond, as shown in the following table. This ripple effect can disrupt carefully planned transshipment connections and require extensive replanning of downstream operations.

Port          Terminal call  Scheduled arrival  Scheduled departure
SOUTHAMPTON   1              2025-07-29 07:00   2025-07-29 21:00
LE HAVRE      2              2025-07-30 16:00   2025-07-31 16:00
ROTTERDAM     3              2025-08-03 18:00   2025-08-05 03:00
HAMBURG       4              2025-08-07 07:00   2025-08-08 07:00
NEW YORK      5              2025-08-18 13:00   2025-08-21 13:00
PHILADELPHIA  6              2025-08-22 06:00   2025-08-24 16:30
SOUTHAMPTON   7              2025-09-01 08:00   2025-09-02 20:00

When a vessel departs Rotterdam with a delay, new ETAs must be calculated for the remaining ports. For Hamburg, we only need to estimate the remaining sailing time from the vessel's current position. However, for subsequent ports like New York, the prediction requires several components: the remaining sailing time to Hamburg, the duration of port operations in Hamburg, and the sailing time from Hamburg to New York.
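The cascading arithmetic described above can be sketched as a simple sum over the remaining legs. The leg durations below are illustrative placeholders standing in for the model outputs, not real figures:

```python
def eta_hours(ocean_to_port, legs):
    """Time until arrival at the final port: the ocean-to-port time to
    the next port, plus the berth time and onward sailing time for each
    intermediate port called along the way."""
    total = ocean_to_port
    for berth_hours, sail_hours in legs:
        total += berth_hours + sail_hours
    return total

# Vessel at sea, 12 h from Rotterdam; remaining legs to New York as
# (berth time at port, sailing time to the next port), in hours.
legs = [
    (33.0, 52.0),   # Rotterdam berth, then Rotterdam -> Hamburg
    (24.0, 245.0),  # Hamburg berth, then Hamburg -> New York
]
print(eta_hours(12.0, legs))  # 366.0 hours until arrival in New York
# A 6 h delay reaching Rotterdam cascades one-to-one into New York:
print(eta_hours(18.0, legs))  # 372.0
```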

Solution overview

As input to the vessel ETA prediction, we process the following two data sources:

  • Hapag-Lloyd's internal data, which is stored in a data lake. This includes detailed vessel schedules and routes, port and terminal performance information, real-time port congestion and waiting times, and vessel characteristics datasets. This data is prepared for model training using AWS Glue jobs.
  • Automatic Identification System (AIS) data, which provides streaming updates on vessel movements. This AIS data ingestion is batched every 20 minutes using AWS Lambda and includes key information such as latitude, longitude, speed, and course of vessels. New batches are processed using AWS Glue and Apache Iceberg to update the existing AIS database, which currently holds around 35 million observations.
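Conceptually, each 20-minute AIS batch is merged into the existing table keyed by vessel and timestamp. The minimal in-memory sketch below illustrates that upsert pattern only; the real pipeline does this with AWS Glue and Apache Iceberg, and the record layout is an assumption:

```python
def upsert_ais(table, batch):
    """Merge a new AIS batch into the table, keyed by (vessel, timestamp).
    A later observation for the same key overwrites the earlier one."""
    for obs in batch:
        key = (obs["vessel"], obs["timestamp"])
        table[key] = {"lat": obs["lat"], "lon": obs["lon"],
                      "speed": obs["speed"], "course": obs["course"]}
    return table

table = {}
batch = [
    {"vessel": "V1", "timestamp": "2025-08-03T12:00", "lat": 51.9,
     "lon": 4.1, "speed": 14.2, "course": 230.0},
    {"vessel": "V1", "timestamp": "2025-08-03T12:20", "lat": 51.8,
     "lon": 3.9, "speed": 15.0, "course": 231.0},
]
upsert_ais(table, batch)
print(len(table))  # 2 observations stored
```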

These data sources are combined to create training datasets for the ML models. We carefully consider the timing of available data through temporal splitting to avoid data leakage. Data leakage occurs when using information that wouldn't be available at prediction time in the real world. For example, when training a model to predict the arrival time in Hamburg for a vessel currently in Rotterdam, we can't use actual transit times that were only known after the vessel reached Hamburg.
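A minimal illustration of temporal splitting: the train/validation split is made by a cutoff on the prediction timestamp rather than at random, so no voyage observed after the cutoff can influence training. Field names here are illustrative assumptions:

```python
from datetime import datetime

def temporal_split(examples, cutoff):
    """Split examples by prediction time: everything predicted before the
    cutoff trains the model, everything at or after it validates the model.
    A random split would leak future voyages into training."""
    train = [e for e in examples if e["predicted_at"] < cutoff]
    valid = [e for e in examples if e["predicted_at"] >= cutoff]
    return train, valid

examples = [
    {"predicted_at": datetime(2025, 5, 1), "label_hours": 310.0},
    {"predicted_at": datetime(2025, 6, 15), "label_hours": 298.0},
    {"predicted_at": datetime(2025, 7, 20), "label_hours": 305.0},
]
train, valid = temporal_split(examples, datetime(2025, 7, 1))
print(len(train), len(valid))  # 2 1
```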

A vessel's journey can be divided into different legs, which led us to develop a multi-step solution using specialized ML models for each leg, orchestrated as hierarchical models to retrieve the overall ETA:

  • The Ocean to Port (O2P) model predicts the time needed for a vessel to reach its next port from its current position at sea. The model uses features such as remaining distance to destination, vessel speed, voyage progress metrics, port congestion data, and historical sea leg durations.
  • The Port to Port (P2P) model forecasts sailing time between any two ports for a given date, considering key features such as ocean distance between ports, recent transit time trends, weather, and seasonal patterns.
  • The Berth Time model estimates how long a vessel will spend at port. The model uses vessel characteristics (such as tonnage and cargo capacity), planned container load, and historical port performance.
  • The Combined model takes as input the predictions from the O2P, P2P, and Berth Time models, together with the original schedule. Rather than predicting absolute arrival times, it computes the expected deviation from the original schedule by learning patterns in historical prediction accuracy and specific voyage conditions. These computed deviations are then used to update ETAs for the upcoming ports in a vessel's schedule.
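In spirit, the hierarchy composes like this: the base models score each leg, and the Combined model turns those scores plus the original schedule into a deviation. The sketch below uses trivial stand-in functions where the real system calls trained XGBoost models; all formulas and feature names are invented for illustration:

```python
def o2p_model(features):      # stand-in for the XGBoost O2P model
    return features["remaining_nm"] / features["speed_kn"]

def berth_model(features):    # stand-in for the Berth Time model
    return 20.0 + 0.002 * features["planned_load_teu"]

def p2p_model(features):      # stand-in for the P2P model
    return features["ocean_nm"] / 16.0

def combined_model(o2p_h, berth_h, p2p_h, scheduled_h):
    """Predict the deviation from schedule rather than the absolute ETA;
    here it is simply summed leg estimates minus the planned duration."""
    return (o2p_h + berth_h + p2p_h) - scheduled_h

o2p = o2p_model({"remaining_nm": 180.0, "speed_kn": 15.0})      # 12.0 h
berth = berth_model({"planned_load_teu": 5000})                 # 30.0 h
p2p = p2p_model({"ocean_nm": 3400.0})                           # 212.5 h
deviation = combined_model(o2p, berth, p2p, scheduled_h=250.0)
print(round(deviation, 1))  # 4.5 -> positive deviation means expected delay
```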


All four models are trained using the XGBoost algorithm built into SageMaker, chosen for its ability to handle complex relationships in tabular data and its strong performance with mixed numerical and categorical features. Each model has a dedicated training pipeline in SageMaker Pipelines, handling data preprocessing steps and model training. The following diagram shows the data processing pipeline, which generates the input datasets for ML training.

Amazon SageMaker Pipeline for data processing

As an example, the following diagram shows the training pipeline of the Berth model. The steps in the SageMaker training pipelines of the Berth, P2P, O2P, and Combined models are identical. Therefore, the training pipeline is implemented once as a blueprint and reused across the other models, enabling a fast turnaround time for the implementation.

Amazon SageMaker Pipeline for berth model training

Because the Combined model depends on outputs from the other three specialized models, we use AWS Step Functions to orchestrate the SageMaker pipelines for training. This helps ensure that the individual models are updated in the correct sequence and maintains prediction consistency across the system. The orchestration of the training pipelines is shown in the following pipeline architecture.

ETA model orchestration
The workflow begins with a data processing pipeline that prepares the input data (vessel schedules, AIS data, port congestion, and port performance metrics) and splits it into dedicated datasets. This feeds into three parallel SageMaker training pipelines for our base models (O2P, P2P, and Berth), each following a standardized process of feature encoding, hyperparameter optimization, model evaluation, and registration using SageMaker Processing and hyperparameter tuning jobs and the SageMaker Model Registry. After training, each base model runs a SageMaker batch transform job to generate predictions that serve as input features for the Combined model training. The performance of the latest Combined model version is tested on the last 3 months of data with known ETAs, and performance metrics (R², mean absolute error (MAE)) are computed. If the model's MAE exceeds a set threshold, the entire training process fails and the model version is automatically discarded, preventing the deployment of models that don't meet the minimum performance bar.
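The quality gate at the end of training can be sketched as a simple check on held-out MAE: if the new model version misses the threshold, the run raises and the version is never registered. Threshold and metric values below are purely illustrative:

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def quality_gate(y_true, y_pred, max_mae_hours):
    """Fail the training run if held-out MAE exceeds the threshold, so an
    underperforming model version is discarded instead of deployed."""
    mae = mean_absolute_error(y_true, y_pred)
    if mae > max_mae_hours:
        raise RuntimeError(f"MAE {mae:.1f} h exceeds threshold {max_mae_hours} h")
    return mae

# Last 3 months of known ETAs (hours) vs. the candidate model's predictions.
actuals = [310.0, 295.0, 420.0, 150.0]
preds = [305.0, 301.0, 410.0, 148.0]
print(quality_gate(actuals, preds, max_mae_hours=12.0))  # 5.75
```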

All four models are versioned and stored as separate model package groups in the SageMaker Model Registry, enabling systematic version control and deployment. This orchestrated approach helps ensure that our models are trained in the correct sequence using parallel processing, resulting in an efficient and maintainable training process.

The hierarchical model approach also helps maintain a degree of explainability comparable to the previous statistical and rule-based solution, avoiding ML black-box behavior. For example, it becomes possible to highlight unusually long berthing time predictions when discussing prediction results with business experts. This increases transparency and builds trust, which in turn increases acceptance across the company.

Inference solution walkthrough

The inference infrastructure implements a hybrid approach combining batch processing with real-time API capabilities, as shown in the inference architecture diagram at the end of this section. Because most data sources update daily and require extensive preprocessing, the core predictions are generated through nightly batch inference runs. These pre-computed predictions are complemented by a real-time API that implements business logic for schedule changes and ETA updates.

  1. Daily batch inference:
    • Amazon EventBridge triggers a Step Functions workflow daily.
    • The Step Functions workflow orchestrates the data and inference process:
      • Lambda copies internal Hapag-Lloyd data from the data lake to Amazon Simple Storage Service (Amazon S3).
      • AWS Glue jobs combine the different data sources and prepare inference inputs.
      • SageMaker inference executes in sequence:
        • Fallback predictions are computed from historical averages and written to Amazon Relational Database Service (Amazon RDS). Fallback predictions are used in case of missing data or a downstream inference failure.
        • Data is preprocessed for the four specialized ML models.
        • The O2P, P2P, and Berth model batch transforms run.
        • The Combined model batch transform generates final ETA predictions, which are written to Amazon RDS.
        • Input features and output files are stored in Amazon S3 for analytics and monitoring.
    • For operational reliability, any failures in the inference pipeline trigger immediate email notifications to the on-call operations team through Amazon Simple Email Service (Amazon SES).
  2. Real-time API:
    • Amazon API Gateway receives client requests containing the current schedule and an indication of which vessel-port combinations require an ETA update. By receiving the current schedule through the client request, we can handle intraday schedule updates while doing daily batch transform updates.
    • API Gateway triggers a Lambda function that calculates the response. The Lambda function constructs the response by linking the ETA predictions (stored in Amazon RDS) with the current schedule using custom business logic, so that we can handle short-term schedule changes unknown at inference time. Typical examples of short-term schedule changes are port omissions (for example, due to port congestion) and one-time port calls.
This architecture enables millisecond response times to client requests while achieving 99.5% availability (a maximum of 3.5 hours of downtime per month).

Inference architecture

Conclusion

Hapag-Lloyd's ML-powered vessel scheduling assistant outperforms the previous solution in both accuracy and response time. Typical API response times are on the order of hundreds of milliseconds, helping to ensure a real-time user experience and beating the previous solution by more than 80%. Low response times are crucial because, in addition to fully automated schedule updates, business experts need them to work with the schedule assistant interactively. In terms of accuracy, the MAE of the ML-powered ETA predictions outperforms the previous solution by approximately 12%, which translates into climbing two positions in the worldwide ranking of schedule reliability on average. Schedule reliability is one of the key performance metrics in liner shipping, so this is a significant improvement within the industry.

To learn more about architecting and governing ML workloads at scale on AWS, see the AWS blog post Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker and the accompanying AWS Multi-Account Data & ML Governance Workshop.

Acknowledgement

We acknowledge the significant and valuable work of Michal Papaj and Piotr Zielinski from Hapag-Lloyd in the data science and data engineering areas of the project.

About the authors

Thomas Voss
Thomas Voss works at Hapag-Lloyd as a data scientist. With his background in academia and logistics, he takes pride in leveraging data science expertise to drive business innovation and growth through the practical design and modeling of AI solutions.

Bernhard Hersberger
Bernhard Hersberger works as a data scientist at Hapag-Lloyd, where he heads the AI Hub team in Hamburg. He is passionate about integrating AI solutions across the company, taking responsibility for everything from identifying business problems to deploying and scaling AI solutions worldwide.

Gabija Pasiunaite
Gabija Pasiunaite was a Machine Learning Engineer at AWS Professional Services based in Zurich. She specialized in building scalable ML and data solutions for AWS Enterprise customers, combining expertise in data engineering, ML automation, and cloud infrastructure. Gabija has contributed to the AWS MLOps Framework used by AWS customers globally. Outside of work, Gabija enjoys exploring new places and staying active through hiking, skiing, and running.

Jean-Michel Lourier
Jean-Michel Lourier is a Senior Data Scientist at AWS Professional Services. He leads teams that implement data-driven applications side by side with AWS customers to generate business value out of their data. He's passionate about diving into tech and learning about AI, machine learning, and their business applications. He's also an enthusiastic cyclist.

Mousam Majhi
Mousam Majhi is a Senior ProServe Cloud Architect specializing in Data & AI within AWS Professional Services. He works with Manufacturing and Travel, Transportation & Logistics customers in DACH to achieve their business outcomes by leveraging data- and AI-powered solutions. Outside of work, Mousam enjoys hiking in the Bavarian Alps.
