Matching-Based Few-Shot Semantic Segmentation Models
Are Interpretable by Design

1University of Bari Aldo Moro, 2Jheronimus Academy of Data Science (JADS), 3Eindhoven University of Technology (TU/e)
Affinity Explainer Framework

Affinity Explainer computes attribution maps using the matching scores between query and support images in matching-based FSS models. It reveals the specific support regions that most influence the segmentation of the query.

Abstract

Explainable AI (XAI) has made significant progress in conventional vision tasks, yet interpretability in Few-Shot Learning (FSL), and specifically Few-Shot Semantic Segmentation (FSS), remains underexplored. Existing FSS models are primarily evaluated on accuracy, leaving their internal reasoning opaque. This opacity makes it difficult to diagnose failures or select optimal support examples.

In this paper, we introduce Affinity Explainer (AffEx), a framework designed to interpret matching-based FSS models. By leveraging the inherent matching mechanisms (similarity scores) of these architectures, AffEx derives contribution maps that highlight which support pixels most strongly influence the query prediction. We propose three variants: Unmasked, Masked, and Signed AffEx.

Furthermore, we extend the causal evaluation metrics Insertion/Deletion Area Under the Curve (IAUC/DAUC) to the FSS domain to rigorously quantify interpretability. Our extensive benchmark on COCO-20i and Pascal-5i demonstrates that AffEx provides actionable insights into model behavior and outperforms standard model-agnostic explanation methods.

Qualitative Results


Visualizing attributions generated by AffEx compared to other methods (Saliency, Blur IG, XRAI).



Affinity Explainer

Affinity Explainer is a novel interpretability framework tailored for matching-based Few-Shot Semantic Segmentation (FSS) models. Unlike generic explanation methods (e.g., Grad-CAM) that often struggle with the multi-image inputs of few-shot tasks, AffEx directly leverages the model's internal matching mechanism to reveal how prediction decisions are made.

Core Mechanism

Matching-based models compute dense correlations between Query and Support features. AffEx intercepts these raw similarity scores and transforms them into intuitive attribution maps through a three-step process:

  1. ROI Averaging: The high-dimensional 4D correlation tensors (Query $\times$ Support) are collapsed into 2D maps by averaging scores over the relevant Query Region of Interest (ROI). This isolates the support features that are most similar to the target object in the query image.
  2. Softmax Normalization: The raw scores are normalized into a probability distribution, allowing us to interpret them as relative contributions. High probabilities (red) indicate strong semantic correspondence, while low probabilities (blue) indicate irrelevance.
  3. Feature Ablation & Aggregation: Since modern FSS models extract features at multiple depths, AffEx calculates ablation weights ($w_j$) to measure the importance of each layer. The final attribution map is a weighted sum of these multi-scale contributions.
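The three steps above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the input names (`corrs`, `roi_mask`, `ablation_weights`) and shapes are assumptions, and the per-layer ablation weights are taken as given rather than computed by feature ablation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a flat score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def affex_attribution(corrs, roi_mask, ablation_weights):
    """Collapse per-layer 4D correlation tensors into one 2D support map.

    corrs            : list of arrays, each (Hq, Wq, Hs, Ws) -- dense
                       query-support similarity scores at one feature depth
    roi_mask         : (Hq, Wq) binary mask of the query region of interest
    ablation_weights : assumed per-layer importance weights w_j
    """
    maps = []
    roi = roi_mask.astype(bool)
    for corr in corrs:
        # 1. ROI averaging: mean similarity over query pixels inside the ROI
        support_map = corr[roi].mean(axis=0)            # -> (Hs, Ws)
        # 2. Softmax normalization: scores become relative contributions
        maps.append(softmax(support_map.ravel()).reshape(support_map.shape))
    # 3. Weighted aggregation across feature depths
    w = np.asarray(ablation_weights, dtype=float)
    w = w / w.sum()
    return sum(wj * m for wj, m in zip(w, maps))
```

Because each per-layer map is a probability distribution and the weights are normalized, the aggregated attribution map also sums to one over the support image.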

Process Visualization

Diagram: the query image (a cat) is matched against each support image (a black cat, a dog). For each query-support pair, per-layer match tensors are weighted by w1, w2, w3 and aggregated into the final attribution map.

AffEx Variants

We propose three variants to handle different interpretability needs:

Unmasked AffEx: A(Iq, Is)
Computes attributions using the full image context. Useful for identifying whether background clutter is distracting the model.

Masked AffEx: A(Iq, Is) ⊙ Ms
Strictly confines attention to the object mask. Applies Gaussian smoothing at boundaries to prevent artifacts.

Signed AffEx: A · Sign(Ms)
Models the background as negative. Red regions contribute positively; blue regions (background) contribute negatively.
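A minimal sketch of the three variants, assuming `attr` is an unmasked attribution map A(Iq, Is) over the support image and `support_mask` is the binary support mask Ms. The small separable Gaussian blur here is a stand-in for whatever boundary smoothing the model pipeline uses; its kernel size and sigma are assumptions.

```python
import numpy as np

def _smooth(mask, k=5, sigma=1.5):
    # Small separable Gaussian blur (stand-in for the boundary smoothing).
    ax = np.arange(k) - k // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, g, mode="same"), 1, mask.astype(float))
    return np.apply_along_axis(
        lambda c: np.convolve(c, g, mode="same"), 0, rows)

def unmasked_affex(attr):
    # Full support-image context: the attribution map is used as-is.
    return attr

def masked_affex(attr, support_mask):
    # Confine attribution to the object mask; smoothing the mask edge
    # avoids hard boundary artifacts.  A(Iq, Is) * Ms
    return attr * _smooth(support_mask)

def signed_affex(attr, support_mask):
    # Foreground contributes positively, background negatively.
    return attr * np.where(support_mask > 0, 1.0, -1.0)
```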

Causal Evaluation Metrics

To evaluate the faithfulness of our explanations, we extend Insertion AUC (IAUC) and Deletion AUC (DAUC) to FSS.

DAUC measures the drop in model performance (mIoU or Confidence) as we progressively remove the "most important" pixels from the support set. A good explanation method should cause a rapid drop. Conversely, IAUC measures performance gain as we introduce important pixels to an empty support set. We also introduce mIoULoss@p, which quantifies the accuracy loss when using only the top p% of relevant pixels.
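The deletion curve can be sketched as follows. This is a toy illustration under assumptions: `score_fn(support)` stands for an evaluation of the FSS model (mIoU or confidence) on the query given a perturbed support image, and removed pixels are simply filled with a constant. IAUC is the dual loop, starting from a blank support and inserting the most important pixels first.

```python
import numpy as np

def deletion_auc(support, attribution, score_fn, steps=10, fill=0.0):
    """Remove the most-attributed support pixels first, record how the
    score degrades, and return the area under that curve (lower is
    better for an explanation method)."""
    order = np.argsort(attribution.ravel())[::-1]   # most important first
    scores = [score_fn(support)]
    for i in range(1, steps + 1):
        n = int(round(i / steps * order.size))
        perturbed = support.copy().ravel()
        perturbed[order[:n]] = fill                 # delete top-n pixels
        scores.append(score_fn(perturbed.reshape(support.shape)))
    return sum(scores) / len(scores)                # Riemann approximation
```

With a score function that depends only on one region of the support, an attribution concentrated on that region yields a much lower deletion AUC than an attribution concentrated elsewhere, which is exactly the behavior the metric rewards.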


Interactive Causal Analysis

Use the slider to observe the effect of perturbing the support set.

Demo panels: the query prediction (segmentation and uncertainty heatmap) alongside the perturbed support set (shots 1-5), shown across perturbation steps 1 (start) to 24 (end).

Quantitative Benchmark

We evaluated our method against state-of-the-art attribution methods. Below are the results on the COCO-20i and Pascal-5i datasets using the DCAMA and DMTNet models in the 1-way 5-shot setting. Refer to the paper for more details and results.

DCAMA

| Method | IAUC (mIoU) | DAUC (mIoU) | Diff. | IAUC@1% (Conf) | mIoULoss@1% | mIoULoss@5% |
|---|---|---|---|---|---|---|
| Random | 47.50 | 46.85 | 0.65 | 43.36 | 41.42 | 31.62 |
| Gaussian Noise Mask | 52.32 | 15.46 | 36.86 | 53.80 | 31.08 | 17.71 |
| Saliency | 52.02 | 41.10 | 10.92 | 44.97 | 37.01 | 23.73 |
| Integrated Grad. | 49.23 | 45.97 | 3.27 | 45.63 | 34.18 | 22.25 |
| Guided IG | 50.77 | 45.10 | 5.67 | 51.25 | 30.17 | 21.65 |
| Blur IG | 52.40 | 41.63 | 10.78 | 44.25 | 36.69 | 22.84 |
| XRAI | 54.45 | 38.72 | 15.73 | 51.98 | 26.59 | 12.08 |
| Deep Lift | 49.78 | 48.51 | 1.27 | 44.80 | 36.67 | 24.07 |
| LIME | 53.18 | 48.44 | 4.73 | 47.98 | 32.44 | 15.29 |
| Unmasked AffEx | 54.85 | 28.26 | 26.59 | 60.82 | 17.41 | 8.85 |
| Masked AffEx | 52.32 | 15.06 | 37.26 | 62.60 | 16.43 | 12.27 |
| Signed AffEx | 53.24 | 23.77 | 29.48 | 60.49 | 19.40 | 13.67 |

DMTNet

| Method | IAUC (mIoU) | DAUC (mIoU) | Diff. | IAUC@1% (Conf) | mIoULoss@1% | mIoULoss@5% |
|---|---|---|---|---|---|---|
| Random | 21.21 | 20.98 | 0.23 | 60.36 | 24.93 | 31.88 |
| Gaussian Noise Mask | 39.08 | 15.71 | 23.37 | 59.35 | 20.69 | 13.26 |
| Saliency | 36.51 | 26.79 | 9.72 | 63.09 | 21.05 | 15.77 |
| Integrated Grad. | 26.25 | 19.47 | 6.78 | 62.51 | 21.63 | 20.32 |
| Blur IG | 35.04 | 20.32 | 14.72 | 62.17 | 21.54 | 20.20 |
| XRAI | 38.19 | 26.55 | 11.64 | 68.42 | 16.48 | 10.73 |
| LIME | 36.83 | 32.44 | 4.39 | 70.72 | 19.20 | 11.36 |
| Unmasked AffEx | 40.10 | 17.09 | 23.01 | 67.30 | 10.53 | 4.49 |
| Masked AffEx | 40.52 | 16.38 | 24.14 | 67.23 | 9.95 | 3.10 |
| Signed AffEx | 38.85 | 18.81 | 20.03 | 67.16 | 10.49 | 3.72 |

BibTeX

@misc{marinisMatchingBasedFewShotSemantic2025,
	title = {Matching-{Based} {Few}-{Shot} {Semantic} {Segmentation} {Models} {Are} {Interpretable} by {Design}},
	url = {http://arxiv.org/abs/2511.18163},
	doi = {10.48550/arXiv.2511.18163},
	publisher = {arXiv},
	author = {Marinis, Pasquale De and Kaymak, Uzay and Brussee, Rogier and Vessio, Gennaro and Castellano, Giovanna},
	year = {2025},
}