Demand forecasting in supply-chain planning has historically been treated as a time-series problem:
- Every SKU is modeled independently.
- A rolling time window (say, the last 14 days) is used to predict tomorrow's sales.
- Seasonality is captured, promotions are added, and forecasts are reconciled downstream.
And yet, despite increasingly sophisticated models, the usual problems persist:
- Chronic over- and under-stocking
- Emergency production changes
- Excess inventory sitting in the wrong place
- High forecast accuracy on paper, but poor planning outcomes in practice
The issue is that demand in a supply chain is not independent. It is networked. For example, this is what just 12 SKUs from a typical supply chain look like when you map their shared plants, product groups, subgroups, and storage locations.
So when demand shifts in one corner of the network, the effects are felt throughout the network.
In this article, we step outside model-first thinking and look at the problem the way a supply chain actually behaves: as a connected operational system. Using a real FMCG dataset, we show why even a simple graph neural network (GNN) fundamentally outperforms traditional approaches, and what that means for both business leaders and data scientists.
A real supply chain experiment
We tested this idea on a real FMCG dataset (SupplyGraph) that combines two views of the business:
Static supply-chain relationships
The dataset has 40 active SKUs, 9 plants, 21 product groups, 36 sub-groups, and 13 storage locations. On average, each SKU has ~41 edge connections, implying a densely connected graph where most SKUs are linked to many others through shared plants or product groups.
From a planning standpoint, this network encodes institutional knowledge that often lives only in planners' heads:
"If this SKU spikes, these others will feel it."
Temporal operational signals and sales outcomes
The dataset has temporal data for 221 days. For each SKU and each day, the dataset includes:
- Sales orders (the demand signal)
- Deliveries to distributors
- Factory unit issues
- Production volumes
Here is an overview of the four temporal signals driving the supply chain model:
| Feature | Total Volume (Units) | Daily Avg | Sparsity (Zero-Activity Days) | Max Single Day |
| --- | --- | --- | --- | --- |
| Sales Order | 7,753,184 | 35,082 | 46.14% | 115,424 |
| Delivery To Distributor | 7,653,465 | 34,631 | 35.79% | 66,470 |
| Factory Issue | 7,655,962 | 34,642 | 43.94% | 75,302 |
| Production | 7,660,572 | 34,663 | 61.96% | 74,082 |
As can be observed, almost half of the SKU-day combinations have zero sales. The implication: a small fraction of SKUs drives most of the volume. This is a classic "intermittent demand" problem.
Also, production occurs in infrequent, large batches (lumpy production). Downstream delivery is much smoother and more frequent (low sparsity), implying the supply chain relies on significant inventory buffers.
To stabilize GNN learning and handle the extreme skew, all values are transformed using log1p, a standard practice in intermittent demand forecasting.
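As a quick illustration (the numbers below are made up, not values from the dataset), log1p maps zero-sales days to exactly zero while compressing large spikes:

```python
import numpy as np

# Hypothetical daily sales for one intermittent-demand SKU:
# mostly zeros, with occasional large spikes.
sales = np.array([0.0, 0.0, 120.0, 0.0, 3500.0, 0.0, 80.0, 41000.0])

# log1p(x) = log(1 + x): zero stays zero, spikes are compressed.
log_sales = np.log1p(sales)
print(log_sales.round(2))

# Predictions made on the log scale are mapped back with expm1,
# the exact inverse of log1p.
assert np.allclose(np.expm1(log_sales), sales)
```

Because zero maps to zero, the transform preserves the sparsity pattern that the model needs to learn.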
Key Business Metrics
What does a good demand forecast look like? We evaluate the model on two metrics: WAPE and bias.
WAPE: Weighted Absolute Percentage Error
WAPE measures how much of your total demand volume is being mis-allocated. Instead of asking "How wrong is the forecast on average across all SKUs?", WAPE asks the question supply-chain planners actually care about in the setting of intermittent demand: "Of all SKU units moved through the supply chain to meet demand, what fraction was mis-forecast?"
This matters because errors on high-volume SKUs cost far more than errors on long-tail items. A 10% miss on a top seller is costlier than a 50% miss on a slow mover. So WAPE weights SKU-days by volume sold, and aligns more naturally with revenue impact, inventory exposure, and plant and logistics utilization (and can be further weighted by cost per SKU if required).
That is why WAPE is widely preferred over MAPE for intermittent, high-skew demand.
\[
\text{WAPE} =
\frac{\sum_{s=1}^{S}\sum_{t=1}^{T} \left| \text{Actual}_{s,t} - \text{Forecast}_{s,t} \right|}
{\sum_{s=1}^{S}\sum_{t=1}^{T} \text{Actual}_{s,t}}
\]
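The formula is a one-liner in practice. Here is a minimal implementation (the array shapes and numbers are illustrative, not taken from the dataset):

```python
import numpy as np

def wape(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Sum of absolute errors divided by total actual volume,
    so high-volume SKU-days dominate the score."""
    return float(np.abs(actual - forecast).sum() / actual.sum())

# Toy example: 2 SKUs x 3 days (rows = SKUs, columns = days).
actual   = np.array([[100.0, 0.0, 50.0],
                     [ 10.0, 5.0,  0.0]])
forecast = np.array([[ 90.0, 0.0, 60.0],
                     [ 12.0, 5.0,  3.0]])

print(round(wape(actual, forecast), 4))  # 25 / 165, about 0.1515
```

Note how the 20-unit miss on the first SKU dominates the score even though the second SKU is proportionally wronger.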
WAPE can be calculated at different levels (product group, region, or total business) and over different periods, such as weekly or monthly.
It is important to note that here, WAPE is computed at the hardest possible level, per-SKU, per-day, on intermittent demand, not after aggregating volumes across products or time. In FMCG planning practice, a micro-level SKU-daily WAPE of 60-70% is often considered acceptable for intermittent demand, while below 60% is considered production-grade forecasting.
Forecast Bias: Directional Error
Bias measures whether your forecasts systematically push inventory up or down. While WAPE tells you how wrong the forecast is, bias tells you how operationally costly it is. It answers a simple but essential question: "Do we consistently over-forecast or under-forecast?" As we will see in the next section, it is possible to have zero bias while being wrong most of the time. In practice, positive bias results in excess inventory, higher holding costs, and write-offs, while negative bias leads to stock-outs, lost sales, and service penalties. A slight positive bias (2-5%) is generally considered production-safe.
\[ \text{Bias} = \frac{1}{S} \sum_{s=1}^{S} (\text{Forecast}_s - \text{Actual}_s) \]
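A short sketch of the bias calculation, and of why zero bias can hide large errors (the numbers are invented for illustration):

```python
import numpy as np

def forecast_bias(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Mean signed error; positive means systematic over-forecast."""
    return float(np.mean(forecast - actual))

# A forecast that is completely wrong on every SKU...
actual   = np.array([100.0,   0.0])
forecast = np.array([  0.0, 100.0])

# ...yet the signed errors cancel: bias is exactly zero,
# while WAPE says 200% of the volume is misallocated.
print(forecast_bias(actual, forecast))                 # 0.0
print(np.abs(actual - forecast).sum() / actual.sum())  # 2.0
```

This is exactly the failure mode the naïve baseline below exhibits: perfect bias, terrible allocation.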
Together, WAPE and bias determine whether a model is not just accurate, but whether its forecasts are operationally and financially usable.
The Baseline: Forecasting Without Structure
To establish a floor, we start with a naïve baseline: "tomorrow's sales equal today's sales".
\[ \hat{y}_{t+1} = y_t \]
This approach has:
- Zero bias
- No network awareness
- No understanding of operational context
Despite its simplicity, it is a strong benchmark, especially over the short term. If a model cannot beat this baseline, it is not learning anything meaningful.
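On a hypothetical intermittent series, the persistence baseline looks like this (the toy WAPE below is for illustration; the 0.86 figure in the next paragraph comes from the real dataset):

```python
import numpy as np

# Hypothetical sales history for 2 SKUs over 5 days.
sales = np.array([[0.0, 30.0, 0.0,  0.0, 45.0],
                  [5.0,  0.0, 0.0, 12.0,  0.0]])

# Persistence forecast: tomorrow's sales = today's sales.
forecast = sales[:, :-1]   # predicts days 1..4
actual   = sales[:, 1:]

wape = np.abs(actual - forecast).sum() / actual.sum()
print(round(wape, 2))  # 1.54: on spiky series, persistence is always one step behind
```

Every spike is missed on the way up and then wrongly repeated the day after, which is why persistence fails so badly on intermittent demand.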
In our experiments, the naïve approach produces a WAPE of 0.86, meaning nearly 86% of total volume is misallocated.
The bias of zero is not a good indicator in this case, since errors cancel out statistically while creating chaos operationally.
This leads to:
- Firefighting
- Emergency production changes
- Expediting costs
This aligns with what many practitioners experience: simple forecasts are stable, but wrong where it matters.
Adding the Network: Spatio-Temporal GraphSAGE
We use GraphSAGE, a graph neural network that allows each SKU to aggregate information from its neighbors.
Key characteristics:
- All relationships are treated uniformly.
- Information is shared across connected SKUs.
- Temporal dynamics are captured using a time-series encoder.
This model does not yet distinguish between plants, product groups, or storage locations. It simply answers the key question:
"What happens when SKUs stop forecasting in isolation?"
Implementation
While I will dive deeper into the data science behind the feature engineering, training, and evaluation of GraphSAGE in a subsequent article, here are some of the key ideas to understand:
- The graph, with its nodes and edges, forms the static spatial features.
- The spatial encoder component of GraphSAGE, with its convolutional layers, generates spatial embeddings of the graph.
- The temporal encoder (LSTM) processes the sequence of spatial embeddings, capturing the evolution of the graph over the last 14 days (using a sliding-window approach).
- Finally, a regressor predicts the log1p-transformed sales for the next day.
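To make the spatial encoder concrete, here is a single GraphSAGE-style mean-aggregation layer in plain NumPy. This is only a sketch of the idea: the graph, feature values, and weight shapes are invented, and the real model stacks several such layers and feeds each day's embeddings into the LSTM.

```python
import numpy as np

rng = np.random.default_rng(0)
n_skus, n_feats, n_hidden = 4, 3, 8

# One day of node features per SKU, e.g. log1p-transformed
# sales, delivery, and production (random values here).
x = rng.normal(size=(n_skus, n_feats))

# Adjacency matrix: SKUs 0-2 share a plant; SKU 3 shares
# a product group with SKU 2.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# GraphSAGE layer with mean aggregation:
#   h_i = ReLU(x_i @ W_self + mean_{j in N(i)} x_j @ W_neigh)
w_self  = rng.normal(size=(n_feats, n_hidden))
w_neigh = rng.normal(size=(n_feats, n_hidden))

deg = adj.sum(axis=1, keepdims=True)             # neighbor counts
neigh_mean = (adj @ x) / np.clip(deg, 1, None)   # mean neighbor features
h = np.maximum(0.0, x @ w_self + neigh_mean @ w_neigh)

print(h.shape)  # (4, 8): one spatial embedding per SKU for this day
```

The key property is visible in the math: each SKU's embedding mixes its own features with the averaged features of its graph neighbors, which is how shared-plant information reaches the forecast.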
An intuitive analogy
Imagine you are trying to predict the price of your house next month. The price is not just influenced by the history of your own house, like its age, maintenance, or ownership records. It is also influenced by what is happening in your neighborhood.
For example:
- The condition and prices of houses similar to yours (similar construction quality),
- How well-maintained other houses in your area are,
- The availability and quality of shared services like schools, parks, or local law enforcement.
In this analogy:
- Your house's history is like the temporal features of a particular SKU (e.g., sales, production, delivery history).
- Your neighborhood represents the graph structure (the edges connecting SKUs with shared attributes, like plants, product groups, etc.).
- The history of nearby houses is like the neighboring SKUs' features: it is how the behavior of other related houses/SKUs influences yours.
The goal of training the GraphSAGE model is for it to learn the function f that can be applied to each SKU based on its own historical features (like sales, production, factory issues, etc.) and the historical behavior of its connected SKUs, as determined by the edge relationships (e.g., shared plant, product group, etc.). To express it more precisely:

    embedding_i(t) =
        f( own_features_i(t),
           neighbors_features(t),
           relationships )

where these features come from the SKU's own operational history and the history of its connected neighbors.
The Result: A Structural Step-Change
The impact is quite remarkable:

| Model | WAPE |
| --- | --- |
| Naïve baseline | 0.86 |
| GraphSAGE | ~0.62 |

In practical terms:
- The naïve approach misallocates nearly 86% of total demand volume
- GraphSAGE reduces this error by ~27% in relative terms (from 0.86 down to ~0.62)
The following chart shows actual vs. predicted sales on the log1p scale. The diagonal red line depicts a perfect forecast, where predicted = actual. As can be seen, most of the high-volume SKUs cluster around the diagonal, which indicates good accuracy.

From a business perspective, this translates into:
- Fewer emergency production changes
- Better plant-level stability
- Less manual firefighting
- More predictable inventory positioning
Importantly, this improvement comes without any additional business rules, only by allowing information to flow across the network.
And the bias comparison is as follows:

| Model | Mean Forecast | Bias (Units) | Bias % |
| --- | --- | --- | --- |
| GraphSAGE | ~733 | +31 | ~4.5% |
| Naïve | ~701 | 0 | 0% |

At under 5%, the mild forecasting bias GraphSAGE introduces is well within production-grade limits. The following chart depicts the error in the predictions.

It can be observed that:
- Error is negligible for most of the forecasts. Recall from the temporal analysis that sparsity in sales is 46%. This shows that the model has learned this, and is correctly predicting zero (or very close to it) for those SKU-days, creating the peak at the center.
- The shape of the bell curve is tall and narrow, which indicates high precision. Most errors are tiny and clustered around zero.
- There is little skew of the bell curve away from the center line, confirming the low bias of 4.5% we calculated.
In practice, many organizations already bias forecasts deliberately to protect service levels, rather than risk stock-outs.
Let's look at the impact at the SKU level. The following chart shows the forecasts for the top 4 SKUs by volume, denoted by red dotted lines, against the actuals.

A few observations:
- The forecast is reactive in nature. As marked with green circles in the first chart, the forecast follows the actual on the way up, and also on the way down, without anticipating the next peak well. This is because GraphSAGE considers all relations to be homogeneous (equally important), which is not true in reality.
- The model under-predicts extreme spikes and compresses the upper tail aggressively. GraphSAGE prefers stability and smoothing.
Here is a chart showing the performance across SKUs with non-zero volumes. Two threshold lines are marked at WAPE of 60% and 75%. Three of the four highest-volume SKUs have a WAPE below 60%, with the fourth one just above. From a planning perspective, this is a solid and balanced forecast.

Takeaway
Graph neural networks do more than improve forecasts: they change how demand is understood. While not perfect, GraphSAGE demonstrates that structure matters more than model complexity.
Instead of treating each SKU as an independent problem, it allows planners to reason over the supply chain as a connected system.
In production, that shift, from isolated accuracy to network-aware decision-making, is where forecasting starts to create real economic value.
What's next? From Connections to Meaning
GraphSAGE showed us something powerful: SKUs do not live in isolation; they live in networks.
But in our current model, every relationship is treated as equal.
In reality, that is not how supply chains work.
A shared plant creates very different dynamics than a shared product group. A shared warehouse matters differently from a shared brand family. Some relationships propagate demand shocks. Others dampen them.
GraphSAGE can see that SKUs are connected, but it cannot learn how or why they are connected.
That is where Heterogeneous Graph Transformers (HGT) come in.
HGT allows the model to learn different behaviors for different types of relationships, letting it weigh, for example, whether plant capacity, product substitution, or logistics constraints should matter more for a given forecast.
In the next article, I will show how moving from "all edges are equal" to relationship-aware learning unlocks the next level of forecasting accuracy, and improves forecast quality by adding meaning to the network.
That is where graph-based demand forecasting becomes truly operational.
Connect with me and share your comments at www.linkedin.com/in/partha-sarkar-lets-talk-AI
Reference
SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks. Azmine Toushik Wasi, MD Shafikul Islam, Adipto Raihan Akib.
Images used in this article are synthetically generated. Charts and underlying code created by me.
