Wednesday, November 19, 2025

Introducing ShaTS: A Shapley-Based Method for Time-Series Models


Introduction

Shapley-based methods are among the most popular tools for explaining Machine Learning (ML) and Deep Learning (DL) models. However, for time-series data, these methods often fall short because they do not account for the temporal dependencies inherent in such datasets. In a recent article, we (Ángel Luis Perales Gómez, Lorenzo Fernández Maimó, and I) introduced ShaTS, a novel Shapley-based explainability method specifically designed for time-series models. ShaTS addresses the limitations of traditional Shapley methods by incorporating grouping strategies that improve both computational efficiency and explainability.

Shapley values: The foundation

Shapley values originate in cooperative game theory and fairly distribute the total gain among players based on their individual contributions to a collaborative effort. The Shapley value for a player is calculated by considering all possible coalitions of players and determining the marginal contribution of that player to each coalition.

Formally, the Shapley value φi for player i is:

\[ \varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (|N| - |S| - 1)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right) \]

where:

  • N is the set of all players.
  • S is a coalition of players not including i.
  • v(S) is the value function that assigns a value to each coalition (i.e., the total gain that coalition S can achieve).

This formulation averages the marginal contributions of player i across all possible coalitions, weighted by the probability of each coalition forming.
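
To make the formula concrete, here is a minimal sketch (my own illustration, not from the article) that computes exact Shapley values for a hypothetical three-player game by enumerating every coalition:

from itertools import combinations
from math import factorial

players = ["A", "B", "C"]

# Hypothetical value function: the total gain each coalition achieves.
v = {
    frozenset(): 0.0,
    frozenset("A"): 0.4, frozenset("B"): 0.1, frozenset("C"): 0.2,
    frozenset("AB"): 0.6, frozenset("AC"): 0.7, frozenset("BC"): 0.35,
    frozenset("ABC"): 1.0,
}

def shapley(i):
    # Average the marginal contribution of i over all coalitions S
    # that do not contain i, using the combinatorial weights above.
    n = len(players)
    others = [p for p in players if p != i]
    total = 0.0
    for size in range(n):
        for S in map(frozenset, combinations(others, size)):
            w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += w * (v[S | {i}] - v[S])
    return total

print({p: round(shapley(p), 3) for p in players})  # values sum to v(N) = 1.0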

From Game Theory to xAI: Shapley Values in Machine Learning

In the context of explainable AI (xAI), Shapley values attribute a model's output to its input features. This is particularly useful for understanding complex models, such as deep neural networks, where the relationship between input and output is not always clear.

Shapley-based methods can be computationally expensive, especially as the number of features increases, because the number of possible coalitions grows exponentially. However, approximation methods, particularly those implemented in the popular SHAP library, have made them feasible in practice. These methods estimate the Shapley values by sampling a subset of coalitions rather than evaluating all possible combinations, significantly reducing the computational burden.

Consider an industrial scenario with three components: a water tank, a thermometer, and an engine. Suppose we have an Anomaly Detection (AD) ML/DL model that detects malicious activity based on the readings from these components. Using SHAP, we can determine how much each component contributes to the model's prediction of whether the activity is malicious or benign.
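
As a rough sketch of how this looks in code (the detector below is a stand-in for a trained model, not the article's; only shap.KernelExplainer and shap_values are actual SHAP API), KernelSHAP can attribute one prediction to the three components:

import numpy as np
import shap

# Stand-in detector: maps [tank_level, temperature, engine_rpm] readings
# to an anomaly probability. In practice this would be the trained AD model.
def detector(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 0.5 * X[:, 1] + X[:, 2])))

background = np.random.randn(100, 3)     # typical benign readings
reading = np.array([[3.1, -0.2, 1.7]])   # the reading to explain

explainer = shap.KernelExplainer(detector, background)
contributions = explainer.shap_values(reading)
print(dict(zip(["water tank", "thermometer", "engine"], contributions[0])))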

Integration of SHAP in an industrial Anomaly Detection scenario. Image created by the authors.

However, in more realistic scenarios the model uses not only the current reading from each sensor but also previous readings (a temporal window) to make predictions. This approach allows the model to capture temporal patterns and trends, thereby improving its performance. Applying SHAP in this scenario to assign responsibility to each physical component becomes more challenging because there is no longer a one-to-one mapping between features and sensors. Each sensor now contributes multiple features associated with different time steps. The common approach here is to compute the Shapley value of each feature at each time step and then aggregate these values post hoc.
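
A minimal sketch of this common approach, assuming the window was flattened before being passed to a tabular explainer (the reshape and sum are my assumptions about a typical pipeline, not the article's code):

import numpy as np

window_len, n_sensors = 10, 3

# Per-feature Shapley values for one window, as returned by a tabular
# explainer applied to the flattened input of length window_len * n_sensors.
per_feature = np.random.randn(window_len * n_sensors)  # placeholder values

# Undo the flattening: rows are time steps, columns are sensors.
per_step = per_feature.reshape(window_len, n_sensors)

# Post-hoc aggregation: sum each sensor's attributions over all time steps.
per_sensor = per_step.sum(axis=0)
print(dict(zip(["tank", "thermometer", "engine"], per_sensor)))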

Integration of SHAP in an industrial Anomaly Detection scenario with windowed sensor data and post-hoc aggregation. Image created by the authors.

This approach has two main drawbacks:

  • Computational Complexity: The computational cost increases exponentially with the number of features, making it impractical for large time-series datasets.
  • Ignoring Temporal Dependencies: SHAP explainers are designed for tabular data without temporal dependencies. Post-hoc aggregation can lead to inaccurate explanations because it fails to capture the temporal relationships between features.

The ShaTS Approach: Grouping Before Computing Importance

In the Shapley framework, a player's value is determined solely by comparing the performance of a coalition with and without that player. Although the method is defined at the individual level, nothing prevents applying it to groups of players rather than to single individuals. Thus, if we consider a set of players N divided into p groups G = {G1, … , Gp}, we can compute the Shapley value for each group Gi by evaluating the marginal contribution of the entire group to all possible coalitions of the remaining groups. Formally, the Shapley value for group Gi can be expressed as:

\[ \varphi(G_i) = \sum_{T \subseteq G \setminus \{G_i\}} \frac{|T|! \, (|G| - |T| - 1)!}{|G|!} \left( v(T \cup \{G_i\}) - v(T) \right) \]

where:

  • G is the set of all groups.
  • T is a coalition of groups not including Gi.
  • v(T) is the value function that assigns a value to each coalition of groups.
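
Computationally, nothing changes except that the "players" are now groups of feature indices. Here is a minimal sketch under two assumptions of mine (the model takes a flat feature vector, and features outside the coalition are masked with the background mean, one common choice of value function):

import numpy as np
from itertools import combinations
from math import factorial

def group_shapley(model, x, background, groups):
    # groups: dict mapping group name -> list of feature indices.
    base = background.mean(axis=0)
    names = list(groups)
    n = len(names)

    def value(coalition):
        masked = base.copy()
        for g in coalition:
            masked[groups[g]] = x[groups[g]]  # reveal coalition groups only
        return float(model(masked[None, :])[0])

    scores = {}
    for i in names:
        others = [g for g in names if g != i]
        total = 0.0
        for size in range(n):
            for T in combinations(others, size):
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += w * (value(T + (i,)) - value(T))
        scores[i] = total
    return scores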

Building on this idea, ShaTS operates on time windows and provides three distinct levels of grouping, depending on the explanatory goal:

Temporal

Each group contains all measurements recorded at a specific instant within the time window. This strategy is useful for identifying critical instants that significantly influence the model's prediction.

Example of the temporal grouping strategy. Image created by the authors.

Feature

Each group represents the measurements of an individual feature over the time window. This strategy isolates the impact of specific features on the model's decisions.

Example of the feature grouping strategy. Image created by the authors.

Multi-Feature

Each group consists of the combined measurements over the time window of features that share a logical relationship or represent a cohesive functional unit. This approach analyzes the collective impact of interdependent features, ensuring their combined influence is captured.

Example of the multi-feature grouping strategy. Image created by the authors.

Once groups are defined, Shapley values are computed exactly as in the individual case, but using group-level marginal contributions instead of per-feature contributions.
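
To make the three strategies concrete, here is a small illustration (mine, not the library's internals) of how each one partitions the cells of a window of shape (window_len, n_features) into players:

window_len, n_features = 10, 3
feature_names = ["X", "Y", "Z"]

# Temporal: one group per instant -> window_len players.
temporal = {f"t{t}": [(t, f) for f in range(n_features)]
            for t in range(window_len)}

# Feature: one group per signal -> n_features players.
feature = {name: [(t, f) for t in range(window_len)]
           for f, name in enumerate(feature_names)}

# Multi-feature: logically related signals share a group -> 2 players here.
multi_feature = {
    "flow":  [(t, f) for t in range(window_len) for f in (0, 1)],  # X and Y
    "power": [(t, 2) for t in range(window_len)],                  # Z
}

print(len(temporal), len(feature), len(multi_feature))  # 10 3 2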

ShaTS methodology overview. Image created by the authors.

ShaTS custom visualization

ShaTS includes a visualization designed specifically for sequential data and for the three grouping strategies above. The horizontal axis shows consecutive windows. The left vertical axis lists the groups, and the right vertical axis overlays the model's anomaly score for each window. Each heatmap cell at (i, Gj) represents the importance of group Gj for window i. Warmer reds indicate a stronger positive contribution to the anomaly, cooler blues indicate a stronger negative contribution, and near-white means negligible influence. A red dashed line traces the anomaly score across windows, and a horizontal dashed line at 0.5 marks the decision threshold between anomalous and normal windows.
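
For orientation, here is a rough matplotlib sketch of that layout (an approximation of the figure's structure with placeholder data, not the library's plotting code):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
importances = rng.uniform(-1, 1, size=(3, 20))  # placeholder: 3 groups x 20 windows
scores = rng.uniform(0, 1, size=20)             # placeholder anomaly scores

fig, ax = plt.subplots()
ax.imshow(importances, cmap="coolwarm", vmin=-1, vmax=1, aspect="auto")
ax.set_yticks(range(3), labels=["X", "Y", "Z"])  # left axis: the groups
ax.set_xlabel("window")

ax2 = ax.twinx()                                 # right axis: anomaly score
ax2.plot(scores, "r--")                          # dashed anomaly-score trace
ax2.axhline(0.5, linestyle="--", color="gray")   # decision threshold at 0.5
ax2.set_ylim(0, 1)
plt.show()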

To illustrate, consider a model that processes windows of length 10 built from three features, X, Y, and Z. When an operator receives an alert and wants to know which signal triggered it, they inspect the feature grouping results. In the next figure, around windows 10–11 the anomaly score rises above the threshold, while the attribution for X intensifies. This pattern indicates that the decision is being driven primarily by X.

ShaTS custom visualization for the Feature strategy. Image generated by the ShaTS library.

If the next question is when, within each window, the anomaly occurs, the operator switches to the temporal grouping view. The next figure shows that the final instant of each window (t9) consistently carries the strongest positive attribution, revealing that the model has learned to rely on the last time step to classify the window as anomalous.

ShaTS custom visualization for the Temporal strategy. The left y-axis lists the window's time slots $t_0$ (earliest) to $t_9$ (most recent). Image generated by the ShaTS library.

Experimental Results: Testing ShaTS on the SWaT Dataset

In our recent publication, we validated ShaTS on the Secure Water Treatment (SWaT) testbed, an industrial water facility with 51 sensors/actuators organized into six plant stages (P1–P6). A stacked Bi-LSTM trained on windowed signals served as the detector, and we compared ShaTS with post hoc KernelSHAP using three viewpoints: Temporal (which instant in the window matters), Sensor/Actuator (which device), and Process (which of the six stages).

During attacks, ShaTS yielded tight, interpretable bands that pinpointed the true source, down to the sensor/actuator or plant stage, while post hoc SHAP tended to diffuse importance across many groups, complicating root-cause analysis. ShaTS was also faster and more scalable: grouping shrinks the player set, so the coalition space drops dramatically; run time stays nearly constant as the window length grows because the number of groups does not change; and GPU execution further accelerates the method, making near-real-time use practical.

Hands-on Example: Integrating ShaTS into Your Workflow

This walkthrough shows how to plug ShaTS into a typical Python workflow: import the library, choose a grouping strategy, initialize the explainer with your trained model and background data, compute group-wise Shapley values on a test set, and visualize the results. The example assumes a PyTorch time-series model and that your data is windowed (e.g., shape [window_len, n_features] per sample).

1. Import ShaTS and configure the Explainer

In your Python script or notebook, begin by importing the required components from the ShaTS library. While the repository exposes the abstract ShaTS class, you will typically instantiate one of its concrete implementations (e.g., FastShaTS).

import shats
from shats.grouping import TimeGroupingStrategy
from shats.grouping import FeaturesGroupingStrategy
from shats.grouping import MultifeaturesGroupingStrategy

2. Initialize the Model and Data

Assume you have a pre-trained time-series PyTorch model and a background dataset, which should be a list of tensors representing typical data samples that the model has seen during training. If you want to better understand the background dataset, check this blog post by Christoph Molnar.

import random

model = MyTrainedModel()  # your pre-trained PyTorch time-series model
random_samples = random.sample(range(len(trainDataset)), 100)
background = [trainDataset[idx] for idx in random_samples]

explainer = shats.FastShaTS(model,
    support_dataset=background,
    grouping_strategy=FeaturesGroupingStrategy(names=variable_names))

3. Compute Shapley Values

Once the explainer is initialized, compute the ShaTS values for your test dataset. The test dataset should be formatted in the same way as the background dataset.

shats_values = explainer.compute(testDataset)

4. Visualize Results

Finally, use the built-in visualization function to plot the ShaTS values. You can specify which class (e.g., anomalous or normal) you want to explain.

explainer.plot(shats_values, test_dataset=testDataset, class_to_explain=1)

Key Takeaways

  • Focused Attribution: ShaTS provides more focused attributions than post hoc SHAP, making it easier to identify the root cause in time-series models.
  • Efficiency: By reducing the number of players to groups, ShaTS significantly decreases the number of coalitions to evaluate, leading to faster computation times.
  • Scalability: ShaTS maintains consistent performance even as window size increases, thanks to its fixed group structure.
  • GPU Acceleration: ShaTS can leverage GPU resources, further enhancing its speed and efficiency.

Try it yourself

Interactive demo

Compare ShaTS with post hoc SHAP on synthetic time series here. You can find a tutorial in the following video.

Open source

The ShaTS module is fully documented and ready to plug into your ML/DL pipeline. Find the code on GitHub.

I hope you liked it! You are welcome to contact me if you have questions, want to share feedback, or simply feel like showcasing your own projects.
