2026 May Day Mathematical Modeling Problem B: MILP and CP-SAT Scheduling for Multi-Process Collaborative Operations - Devuly | Smart Analytics for Developers & Projects

This article breaks down the 2026 May Day Mathematical Modeling Problem B, “Multi-Process Collaborative Operations,” by showing how to model shop-floor operation sequences, machine usage, and inter-shop transportation in a unified framework, then solve for the minimum makespan with MILP and CP-SAT. It addresses three core challenges: multi-resource contention, operation dependencies, and transportation delays. Keywords: scheduling optimization, CP-SAT, mathematical modeling.

Table of Contents

Technical Specifications at a Glance

Parameter	Details
Domain	Mathematical Modeling / Production Scheduling / Combinatorial Optimization
Modeling Scope	5 workshops, multiple sequential operations, multiple machine types
Primary Objective	Minimize maximum completion time (Makespan)
Modeling Languages	Mathematical Programming, Graph Modeling, Constraint Programming
Solver Framework	MILP + Google OR-Tools CP-SAT
Input Data	Operation routing table, crew configuration table, workshop distance matrix
Key Constraints	Operation precedence, exclusive machine usage, dual-machine collaboration, inter-workshop transportation
Core Dependencies	Python, pandas, NumPy, OR-Tools, matplotlib
Original Source Format	CSDN blog post with many finished-document screenshots
Star Count	Not provided in the original content

This is a classic multi-resource constrained scheduling problem

The most useful information in the original article is concentrated in the problem statement and modeling sections, while much of the surrounding copy is promotional. Once extracted, the core problem is an extended Resource-Constrained Project Scheduling Problem (RCPSP): each operation has a fixed precedence order and may require two machine types at the same time.

Unlike a standard Job Shop Scheduling problem, this case also introduces inter-workshop transportation time. Switching between operations within the same workshop can be treated as zero cost, but once a machine must move across workshops, a transportation delay must be added. As a result, the problem combines both temporal constraints and spatial constraints.

The minimal mathematical abstraction looks like this

Let the set of operations be J and the set of machine types be M. For each operation i and machine type m, let Q(i, m) denote the workload and η(m) the machine efficiency. Then the processing time of that operation on that machine type is determined jointly by the workload and the number of parallel machines assigned.

import math

def process_time(workload, efficiency, machine_count):
    # If the operation does not require this machine type, the duration is 0
    if workload == 0:
        return 0
    # Round up to whole seconds to match the precision required by the problem
    return math.ceil(workload / (efficiency * machine_count))

This code converts the combination of workload, machine efficiency, and parallel machine count into a standard schedulable duration.

Data preprocessing must unify units and missing values first

The original article emphasizes that the operation table may contain missing machine-demand entries. These should not be dropped. Instead, they should be filled with 0 to indicate that the operation does not require that machine type. Otherwise, later matrix-based modeling will introduce artificial constraints.

Another critical step is normalization. Workload scales can differ significantly across machine types. If clustering or similarity analysis is performed directly, high-magnitude features will dominate the result. That is why z-score normalization should be applied.

High-dimensional operation features can reveal resource contention clusters

The original article uses t-SNE to map operation demand vectors into a two-dimensional plane. This step does not directly support optimization. Instead, it helps explain scheduling bottlenecks. Operations that cluster closely often have similar machine-type dependencies and are therefore more likely to compete for the same resources.

from sklearn.preprocessing import StandardScaler
from sklearn.manifold import TSNE

# X: operation-machine workload matrix
X_scaled = StandardScaler().fit_transform(X)  # Standardize workload features
X_2d = TSNE(n_components=2, random_state=42).fit_transform(X_scaled)  # Reduce dimensions to inspect operation clusters

This code projects high-dimensional machine-demand features into two dimensions to help identify potential bottleneck clusters.

The model must express both process precedence and exclusive machine usage

The core decision variables include operation start times, end times, machine assignment indicators, and the maximum completion time Cmax. If an operation requires two machine types, its completion time should be defined by the latest finishing time among all participating machines.

The original article defines the primary objective very clearly: minimize Makespan. This objective is well suited to total project-duration optimization in contest settings and also matches the efficiency goals of centralized manufacturing or overhaul scenarios.

The constraint structure can be compressed into four categories

The first category is precedence constraints. For operations in the same workshop with a predecessor relationship, a successor cannot start before its predecessor finishes.

The second category is machine exclusivity constraints. At any time, one machine can serve only one operation. If two operations are assigned to the same machine, their time intervals must not overlap.

The third category is collaborative completion constraints. If an operation requires both machine type A and machine type B, the operation is complete only after both machine types finish their respective workloads.

The fourth category is transportation-time constraints. When a machine moves across workshops, its next available time must include the transportation duration computed from distance and travel speed.

from ortools.sat.python import cp_model

model = cp_model.CpModel()
start = model.NewIntVar(0, 10**7, "start_A1")
end = model.NewIntVar(0, 10**7, "end_A1")
duration = 3600  # Operation duration in seconds
model.Add(end == start + duration)  # Constrain the relationship between start and end times

This code shows the basic CP-SAT pattern for modeling the time variables of a single operation.

CP-SAT is well suited for this kind of discrete temporal combinatorial optimization

The original article reaches an important conclusion: MILP can represent the problem completely, but CP-SAT is often more practical when the model includes many discrete machine-assignment and non-overlap constraints. It is especially well suited for Boolean variables, interval variables, and logical propagation.

For contest implementation, the recommended strategy is straightforward: first write the standard mathematical formulation in MILP form to make the paper rigorous, then implement the actual solver with OR-Tools CP-SAT to ensure computational efficiency and reproducibility.

A simplified illustration of a non-overlap constraint

task1 = model.NewIntervalVar(s1, d1, e1, "task1")
task2 = model.NewIntervalVar(s2, d2, e2, "task2")

# Two operations on the same machine cannot overlap
model.AddNoOverlap([task1, task2])

This code expresses exclusive operation usage on the same machine more directly than writing Big-M constraints by hand.

Result analysis should focus on bottleneck interpretation rather than raw numbers

The original article provides a sample result set: after setting a planning upper bound for one workshop, CP-SAT quickly returns an optimal solution with a makespan of approximately 68.2 minutes. More importantly, the result shows that machine type 2 and machine type 3 have significantly higher utilization.

This indicates that the true optimization value lies not only in computing an answer, but also in identifying bottleneck resources. When a machine type remains under high load for long periods, even small efficiency fluctuations can significantly amplify the overall schedule length.

Sensitivity analysis improves the credibility of the paper

The original article further perturbs machine efficiency and resolves 100 samples. The results suggest that in most cases the increase in makespan remains controllable, but when bottleneck machines are hit by simultaneous negative perturbations, the makespan increases noticeably.

import random

samples = []
for _ in range(100):
    # Add a ±5% perturbation to machine efficiency
    perturbed = [eta * (1 + random.uniform(-0.05, 0.05)) for eta in eta_list]
    samples.append(perturbed)

This code generates efficiency-perturbation samples for robustness analysis.

The original images mainly serve presentation rather than modeling proof

Insert image description here AI Visual Insight: This image shows thumbnail pages from a paper or finished report. Its primary value is to highlight layout quality, table of contents structure, and the organization of modeling chapters, rather than algorithmic results. It suggests that the original article emphasizes deliverable completeness, including an abstract, modeling assumptions, solution workflow, and result figures.

Insert image description here AI Visual Insight: This image presents report pages with mathematical formulas and charts, indicating that the solution includes formal modeling derivations. For technical readers, its value lies in signaling that the document contains key academic components such as an objective function, constraint definitions, and experimental visualizations.

Insert image description here AI Visual Insight: This image looks more like a results page or data presentation page, typically used to display scheduling plans, summary tables, or sensitivity-analysis outputs. From a technical communication perspective, it reinforces the message that the model is not only writable, but also capable of producing interpretable results.

High-quality submissions for this type of contest problem should follow a two-layer structure

The first layer is the paper layer: problem restatement, symbol definitions, assumptions, model construction, solution workflow, result analysis, strengths, and limitations. The second layer is the engineering layer: data-cleaning scripts, solver-modeling code, Gantt-chart visualization, and sensitivity-analysis scripts.

If a submission only piles up formulas without a runnable implementation, its credibility is limited. If it contains only code without mathematical abstraction, the paper lacks depth. The most valuable part of the original content is that it provides a bridge between these two layers.

FAQ

Why is CP-SAT a better fit for this problem than a pure genetic algorithm?

CP-SAT expresses operation precedence, exclusive machine usage, and Boolean assignment constraints more directly, and it can provide stable feasible solutions together with optimality bounds. Genetic algorithms are useful for large-scale approximate search, but contest papers usually still need an exact model as formal support.

What is the most common modeling mistake in dual-machine collaborative operations?

The most common mistake is to add the durations of the two machine types together. The correct approach is to compute the completion time for each machine type separately and use the latest finishing time as the operation completion time, because the two machine types participate in parallel and complete the operation jointly.

How do you incorporate inter-workshop transportation time into the model?

For each machine, maintain the last serviced workshop and the previous completion time. If the next operation is located in a different workshop, then its next start time must be greater than the previous end time plus the transportation duration computed as distance divided by travel speed. In this way, spatial movement is transformed into a time-feasibility constraint.

Core Summary: This article reconstructs the original contest material into a technical document for developers and mathematical modeling participants, focusing on the mathematical abstraction of multi-process collaborative operations, MILP modeling, CP-SAT solving, bottleneck-resource identification, and robustness analysis, while also adding structured image interpretation, code examples, and an FAQ.