The Red Team track of the Re-Align Challenge 2026 asks participants to submit a set of 1000 images intended to maximize representational divergence across a fixed suite of vision models, with divergence evaluated on a hidden split using Centered Kernel Alignment (CKA) between model embeddings. [@realign_workshop_2026; @realign_leaderboard_space_2026; @kornblith2019cka] CKA is closely related to the Hilbert–Schmidt Independence Criterion (HSIC), providing a robust similarity measure that is invariant to isotropic scaling and commonly used for comparing representational geometry. [@gretton2005hsic; @kornblith2019cka] Our goal is therefore to select stimuli that systematically expose differences in representational geometry across models, rather than simply selecting images that are difficult in a behavioral sense.
We frame stimulus selection as set optimization over a candidate pool drawn from the datasets allowed by the challenge (ImageNet validation and ObjectNet). [@deng2009imagenet; @barbu2019objectnet; @objectnet_website] The approach has two stages: first, compute per-image cross-model divergence proxies and signatures from projected model features; second, perform diversity-aware greedy selection (optionally in a learned SSDE space) to pick the final set of 1000 images.
All experiments were run on a multi-node GPU cluster for representation extraction, with CPU- and GPU-based jobs for aggregation, SSDE training, and subset selection.
Let the candidate stimulus pool (of images) be (\mathcal{D}=\{x_i\}_{i=1}^N), where each (x_i) is an image from either ImageNet-val or ObjectNet. Let (\{f_m\}_{m=1}^M) denote the suite of (M) fixed vision models used by the challenge. For each model (m), we extract a feature vector at a designated layer: [ h_m(x_i)\in\mathbb{R}^{d_m}. ] The challenge evaluation computes a representational similarity (CKA) over sets of embeddings across images. [@kornblith2019cka] Because we do not have access to the hidden evaluation set, we build per-image proxies designed to predict which stimuli will decrease cross-model similarity when aggregated.
Models produce features of different dimensions (d_m). To standardize computations and reduce I/O and memory, we map each (h_m(x_i)) into a shared dimension (d) using a fixed random projection: [ e_m(x_i) = P_m\,\tilde{h}_m(x_i)\in\mathbb{R}^{d}, ] where (P_m\in\mathbb{R}^{d\times d_m}) is a fixed random projection matrix for model (m) and (\tilde{h}_m(x_i)) is the standardized (centered and L2-normalized) feature vector.
Random projections approximately preserve pairwise distances under mild conditions (Johnson–Lindenstrauss), enabling substantial dimensionality reduction while retaining geometric structure. [@johnson1984jl; @achlioptas2003random_projections]
In our runs we used (d=256) for projected model features (reported in Table 1).
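A minimal NumPy sketch of this step, assuming per-model feature matrices have already been extracted; the function name, seed handling, and the Gaussian projection scaling are illustrative assumptions, not the challenge code:

```python
import numpy as np

def project_features(H: np.ndarray, d: int = 256, seed: int = 0) -> np.ndarray:
    """H: (N, d_m) raw features from one model; returns (N, d) projected features."""
    rng = np.random.default_rng(seed)
    # Standardize: center over the pool, then L2-normalize each image's feature.
    H = H - H.mean(axis=0, keepdims=True)
    H = H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)
    # Fixed Gaussian random projection (Johnson-Lindenstrauss style), scaled by 1/sqrt(d).
    P = rng.standard_normal((H.shape[1], d)) / np.sqrt(d)
    return H @ P
```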
For each image (x_i), we stack the projected features across models into a matrix:
[
E_i =
\begin{bmatrix}
e_1(x_i)^\top \\
\vdots \\
e_M(x_i)^\top
\end{bmatrix}
\in\mathbb{R}^{M\times d}.
]
A simple proxy for cross-model disagreement is the mean pairwise cosine distance: [ u_i = \frac{2}{M(M-1)} \sum_{1\le a < b \le M} \left( 1-\frac{\langle e_a(x_i), e_b(x_i)\rangle}{\|e_a(x_i)\|_2\,\|e_b(x_i)\|_2} \right). ] This increases when different models place the same image in dissimilar directions in feature space, which we expect to reduce representational alignment when many such images are aggregated. This choice is aligned with the representational-similarity viewpoint that comparisons should reflect geometry, not raw coordinates. [@kriegeskorte2008rsa; @sucholutsky2023getting_aligned]
The diagnostic plot in Figure 2 visualizes this quantity as "hardness": [ \mathrm{hardness}(x_i) \equiv u_i. ]
Pairwise averages can over-select stimuli that drive disagreement in a single direction (e.g., a shared "failure mode"). We therefore include a multi-directional term based on the log-determinant of the model–model Gram matrix: [ G_i = \frac{1}{d}E_i E_i^\top \in \mathbb{R}^{M\times M}, \qquad \ell_i = \log\det\left(G_i + \varepsilon I\right). ] Here (\ell_i) is large when the model feature vectors for (x_i) span a high-volume simplex (i.e., are diverse and not confined to a low-dimensional subspace). Log-det objectives are widely used as diversity/coverage surrogates and appear in submodular sensor-placement and information-gain settings. [@krause2008sensor_placement]
We define a divergence signature (s_i\in\mathbb{R}^{D}) that concatenates the flattened upper triangle of the Gram matrix (G_i) with per-image scalar disagreement summaries (see the algorithm listing in the appendix).
A scalar divergence score is then computed as: [ d_i = \alpha\,u_i + (1-\alpha)\,\operatorname{clip}(\ell_i;\,\ell_{\min},\ell_{\max}), ] with (\alpha\in[0,1]) controlling the contribution of pairwise vs. multi-directional disagreement and (\varepsilon>0) stabilizing the log-det.
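A minimal NumPy sketch of the per-image proxies, using the Table 1 values for (\alpha) and (\varepsilon); the clip bounds (\ell_{\min}) and (\ell_{\max}) are illustrative assumptions:

```python
import numpy as np

def divergence_proxies(E: np.ndarray, alpha: float = 0.5, eps: float = 1e-3,
                       ell_min: float = -50.0, ell_max: float = 50.0):
    """E: (M, d) projected features of one image, one row per model."""
    M, d = E.shape
    En = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)
    C = En @ En.T                                # (M, M) pairwise cosine similarities
    iu = np.triu_indices(M, k=1)
    u = float(np.mean(1.0 - C[iu]))              # mean pairwise cosine distance u_i
    G = (E @ E.T) / d                            # model-model Gram matrix G_i
    _, logdet = np.linalg.slogdet(G + eps * np.eye(M))
    ell = float(np.clip(logdet, ell_min, ell_max))
    d_i = alpha * u + (1.0 - alpha) * ell        # combined divergence score
    return d_i, u, ell
```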
Selecting the top 1000 images by (d_i) alone tends to produce near-duplicates. We therefore treat selection as maximizing a score-diversity trade-off.
Given an embedding (v_i) used solely for measuring redundancy (either (s_i) or SSDE embeddings (z_i); see below), we use a greedy Maximal Marginal Relevance (MMR) criterion: [@carbonell1998mmr] [ i^\star = \arg\max_{i\notin S} \left[ d_i - \lambda\cdot \max_{j\in S}\mathrm{sim}(v_i, v_j) \right], \qquad \mathrm{sim}(a,b)=\frac{a^\top b}{\|a\|_2\,\|b\|_2}. ] Here (\lambda\ge 0) is the diversity weight. To keep runtime manageable, selection is performed on the top (P) candidates by (d_i) (a "prefilter").
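A minimal sketch of the greedy MMR loop under these definitions, assuming the redundancy embeddings are L2-normalized; the dataset quota used in the best run would be an additional check inside the loop:

```python
import numpy as np

def mmr_select(scores: np.ndarray, V: np.ndarray, K: int = 1000,
               lam: float = 0.25, P: int = 20000) -> np.ndarray:
    """scores: (N,) proxy scores d_i; V: (N, dim) L2-normalized redundancy embeddings."""
    order = np.argsort(-scores)[:P]              # prefilter: top-P by proxy score
    s, Vc = scores[order], V[order]
    n = len(order)
    picked: list[int] = []
    max_sim = np.full(n, -1.0)                   # cosine-similarity lower bound: uniform before first pick
    for _ in range(min(K, n)):
        marginal = s - lam * max_sim
        if picked:
            marginal[picked] = -np.inf           # never reselect
        i = int(np.argmax(marginal))
        picked.append(i)
        max_sim = np.maximum(max_sim, Vc @ Vc[i])  # update max similarity to the selected set
    return order[picked]                         # indices into the original candidate pool
```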
As an alternative diversity regularizer, one can use: [ \log\det(K_S+\delta I), ] where (K) is a positive semidefinite similarity kernel on candidate signatures. This is closely related to determinantal point processes (DPPs), a classic probabilistic model for diverse subset selection. [@kulesza2012dpp] For PSD (K), log-det exhibits diminishing returns and admits greedy approximation guarantees under standard submodularity conditions. [@nemhauser1978submodular] In practice, MMR offered a simpler and more scalable approximation, so it was used in all reported runs.
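For completeness, a sketch of the log-det marginal gain for one candidate, which is the quantity a greedy log-det selector would maximize; the kernel (K) and the jitter (\delta) here are illustrative, and this variant was not used in the reported runs:

```python
import numpy as np

def logdet_gain(K: np.ndarray, S: list, i: int, delta: float = 1e-3) -> float:
    """Marginal gain log det(K_{S+i} + delta I) - log det(K_S + delta I) for candidate i."""
    idx_s = np.asarray(S, dtype=int)
    idx_si = np.append(idx_s, i)
    _, ld_s = np.linalg.slogdet(K[np.ix_(idx_s, idx_s)] + delta * np.eye(len(idx_s)))
    _, ld_si = np.linalg.slogdet(K[np.ix_(idx_si, idx_si)] + delta * np.eye(len(idx_si)))
    return float(ld_si - ld_s)                   # diminishing in |S| for PSD K
```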
The baseline selector uses hand-designed signatures (s_i) both for scoring and for redundancy control. To obtain a more stable low-dimensional geometry for diversity selection, we learn a Self-Supervised Divergence Embedding: [ z_i = \frac{g_\theta(\tilde s_i)}{\|g_\theta(\tilde s_i)\|_2}\in\mathbb{R}^{p}, ] where (g_\theta) is a small MLP projection head and (\tilde s_i) is a stochastically perturbed view of (s_i) (feature dropout + Gaussian noise), analogous to "two-view" contrastive learning. [@chen2020simclr]
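A minimal PyTorch sketch of the projection head and the stochastic views; layer sizes follow Table 1, while the exact MLP depth and activation are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSDEHead(nn.Module):
    """Small MLP projection head g_theta producing unit-norm embeddings z."""
    def __init__(self, in_dim: int, hidden: int = 512, out_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(s), dim=-1)

def make_view(s: torch.Tensor, mask_p: float = 0.20, noise_std: float = 0.01) -> torch.Tensor:
    """One stochastic view of a batch of signatures: feature dropout + Gaussian noise."""
    keep = (torch.rand_like(s) > mask_p).float()
    return s * keep + noise_std * torch.randn_like(s)
```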
For an anchor image (x_i), we define training pairs and negatives as follows (in our implementation, this is done in signature space, for efficiency):
Let (\mathrm{sim}(u,v) = \frac{u^\top v}{\|u\|_2\,\|v\|_2}) (cosine similarity) and (\tau>0) be the temperature. For each anchor (i), define a mined negative set (\mathcal{N}_i) (the other examples in the batch), and a hard-negative subset (\mathcal{H}_i\subset\mathcal{N}_i) (top-(k) by similarity).
A generic weighted InfoNCE loss can be written:
[
\mathcal{L}_i =
-\log
\frac{
\exp(\mathrm{sim}(z_i^{(1)}, z_i^{(2)})/\tau)
}{
\exp(\mathrm{sim}(z_i^{(1)}, z_i^{(2)})/\tau)
+
\sum_{j\in\mathcal{N}_i}
w_{ij}\,\exp(\mathrm{sim}(z_i^{(1)}, z_j^{(2)})/\tau)
},
]
where the hard-negative weights are:
[
w_{ij}=
\begin{cases}
1+\alpha_{\mathrm{hard}}, & j\in\mathcal{H}_i, \\
1, & \text{otherwise.}
\end{cases}
]
In our implementation, this weighting is applied equivalently by adding (\log(1+\alpha_{\mathrm{hard}})) to the logits of hard negatives before the (\log\sum\exp) operation. [@robinson2021hard_negatives]
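A minimal PyTorch sketch of this loss in the logit-offset formulation, with in-batch negatives and top-(k) hard-negative mining; the score-weighting term from Table 1 is omitted, and masking details are illustrative:

```python
import math
import torch
import torch.nn.functional as F

def ssde_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.10,
              hard_k: int = 128, alpha_hard: float = 2.0) -> torch.Tensor:
    """z1, z2: (B, p) L2-normalized embeddings of two views of the same signatures."""
    B = z1.size(0)
    logits = (z1 @ z2.t()) / tau                 # (B, B); positives on the diagonal
    # Mine hard negatives: top-k most similar off-diagonal entries per anchor.
    neg = logits.clone()
    neg.fill_diagonal_(float("-inf"))            # exclude the positive pair from mining
    _, hard_idx = neg.topk(min(hard_k, B - 1), dim=1)
    # Upweighting w = 1 + alpha_hard applied as +log(1 + alpha_hard) on hard-negative logits.
    offset = torch.zeros_like(logits)
    offset.scatter_(1, hard_idx, math.log1p(alpha_hard))
    targets = torch.arange(B, device=z1.device)
    return F.cross_entropy(logits + offset, targets)
```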
After training SSDE, we perform diversity-aware greedy selection in the learned space. The SSDE-aware marginal score used at selection step (t) is: [ s_{\mathrm{SSDE}}(x_i \mid S_{t-1}) = d_i - \lambda\cdot \max_{j\in S_{t-1}} \langle z_i, z_j\rangle, ] where (S_{t-1}) is the set selected so far, and (\langle z_i,z_j\rangle) equals cosine similarity because (z) vectors are normalized.
This is exactly the MMR criterion, but the redundancy term is now measured in the SSDE geometry rather than in the raw hand-crafted signature geometry.
We report the divergence score ((1 - \mathrm{avg\ CKA})) for representative submissions using the methods above, as obtained from the hackathon's leaderboard.
Table 1 summarizes four representative runs: the first two are baseline (with and without MMR), and the last two are SSDE + MMR under two parameter settings (with the best achieving 0.5447). Additional submissions in the same range were repeats/ablations and are omitted for clarity.
Table 1. Red Team runs and complete hyperparameter settings.
| Run | Method | Selection objective & space | Parameters (all) | Score |
|---|---|---|---|---|
| 1 | Baseline (Top-(K)) | select Top-(K) by (d_i) | Proxy: (d{=}256), (\alpha{=}0.5), (\varepsilon{=}10^{-3}) • Selection: (K{=}1000), (\lambda{=}0), prefilter: N/A, quota: none | 0.4782 |
| 2 | Baseline + MMR | (d_i - \lambda\max_{j\in S}\mathrm{sim}(s_i,s_j)) | Proxy: (d{=}256), (\alpha{=}0.5), (\varepsilon{=}10^{-3}) • Selection: (K{=}1000), (\lambda{=}0.20), prefilter (P{=}20000), quota: none | 0.5099 |
| 3 | SSDE + MMR | (d_i - \lambda\max_{j\in S}\langle z_i,z_j\rangle) | Proxy: (d{=}256), (\alpha{=}0.5), (\varepsilon{=}10^{-3}) • SSDE: (p{=}128), hidden (=512), epochs (=8), batch (=2048), lr (=3\cdot10^{-4}), wd (=10^{-4}), (\tau{=}0.10), mask (=0.20), noise (=0.01), hard-(k{=}128), (\alpha_{\mathrm{hard}}{=}2.0), score_temp (=1.0), max_train (=0), amp: on, seed (=0) • Selection: (K{=}1000), (\lambda{=}0.25), prefilter (P{=}20000), quota: none | 0.5409 |
| 4 | SSDE + MMR + quota | same as Run 3 | Proxy: (d{=}256), (\alpha{=}0.5), (\varepsilon{=}10^{-3}) • SSDE: same as Run 3 • Selection: (K{=}1000), (\lambda{=}0.35), prefilter (P{=}20000), quota: ObjectNet 700 / ImageNet 300, seed (=0) | 0.5447 |
A plausible interpretation is that the pipeline decomposes the divergence objective into signal and coverage: the proxy score (d_i) identifies images on which the models disagree (signal), while diversity-aware selection over signatures or SSDE embeddings spreads the 1000 picks across distinct disagreement modes rather than repeating a single one (coverage).
This emphasis on probing representational geometry rather than only accuracy mirrors core themes in representational alignment research. [@sucholutsky2023getting_aligned; @kriegeskorte2008rsa]
BrainScore studies compare models by how well their internal representations predict neural/behavioral measurements and show that architectural and training differences can yield substantial representational differences even when models achieve similar task performance. [@kubilius2019brainscore; @schrimpf2020integrative_benchmarking] This supports the general intuition behind this divergence maximization task: stimuli that expose inductive-bias differences (including distribution shifts such as ObjectNet) are likely to magnify representational divergence across a heterogeneous model suite.
This work was done in a hackathon setting with limited time; the goal was to improve scores while understanding how algorithmic and modeling choices affect divergence.
We presented a Red Team stimulus selection pipeline grounded in established representational-analysis and diversity-selection ideas: per-image cross-model disagreement proxies (mean pairwise cosine distance and a log-det term over the model-model Gram matrix), greedy MMR selection for the score-diversity trade-off, and a self-supervised divergence embedding (SSDE) that provides a learned geometry for redundancy control.
Across the reported runs, SSDE + diversity selection produced the best score (0.5447). These results should be interpreted cautiously due to proxy mismatch and limited hyperparameter exploration. With more time and GPU budget, especially to broaden the candidate pool and to explore richer self-supervised image features, the same methodological framework could plausibly yield further improvements.
Given centered representation matrices (X\in\mathbb{R}^{n\times p}) and (Y\in\mathbb{R}^{n\times q}), linear CKA can be expressed in terms of HSIC: [ \mathrm{CKA}(X,Y)= \frac{\mathrm{HSIC}(X,Y)}{\sqrt{\mathrm{HSIC}(X,X)\,\mathrm{HSIC}(Y,Y)}}. ] [@kornblith2019cka; @gretton2005hsic]
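A minimal NumPy sketch of linear CKA computed via these HSIC terms, using the Frobenius-norm form for centered features (normalization constants cancel in the ratio):

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """X: (n, p), Y: (n, q) representation matrices over the same n stimuli."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic_xy = np.linalg.norm(Y.T @ X, "fro") ** 2   # HSIC(X, Y) up to a constant
    hsic_xx = np.linalg.norm(X.T @ X, "fro") ** 2
    hsic_yy = np.linalg.norm(Y.T @ Y, "fro") ** 2
    return float(hsic_xy / np.sqrt(hsic_xx * hsic_yy))
```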
INPUT:
Candidate images D = {x_i}_{i=1..N} from allowed datasets (ImageNet-val, ObjectNet)
Fixed model suite {f_m}_{m=1..M}
Target set size K = 1000
HYPERPARAMETERS (as used in Table 1):
Projection / proxy:
d = 256 # random projection dimension for model features
alpha = 0.5 # weight for pairwise disagreement u_i vs log-det term ell_i
eps = 1e-3 # log-det stabilization epsilon in log det(G_i + eps I)
Baseline selection (MMR):
lambda_diversity ∈ {0.0, 0.20} # diversity weight
P = 20000 # prefilter top-P by proxy score (for MMR runs)
SSDE training (Runs 3-4):
p = 128 # SSDE embedding dimension (proj-dim)
hidden = 512 # MLP hidden dimension
epochs = 8
batch_size = 2048
lr = 3e-4
weight_decay = 1e-4
tau = 0.10 # contrastive temperature
mask_p = 0.20 # feature dropout prob for signature views
noise_std = 0.01 # gaussian noise std for views
hard_k = 128 # top-k hardest negatives per anchor
hard_alpha = 2.0 # hard-negative weight: w = 1 + hard_alpha
score_temp = 1.0 # score weighting temperature (sigmoid) during training
max_train = 0 # 0 means train on all N signatures
amp = on # mixed precision
SSDE selection:
lambda_diversity ∈ {0.25, 0.35}
P = 20000
quota_best = {objectnet:700, imagenet:300} # used only for best run
ALGORITHM:
(1) Representation extraction:
For each image x_i and each model m:
extract layer feature h_m(x_i) ∈ R^{d_m}
(2) Standardize + random project:
Define standardized vector \tilde{h}_m(x_i) (e.g., centering + L2 norm)
e_m(x_i) = P_m * \tilde{h}_m(x_i) ∈ R^{d}
(3) Build per-image matrix:
E_i = [e_1(x_i); ...; e_M(x_i)] ∈ R^{M×d}
(4) Compute disagreement proxies:
u_i = mean pairwise cosine distance across rows of E_i
G_i = (E_i E_i^T)/d
ell_i = log det(G_i + eps I)
score d_i = alpha * u_i + (1-alpha) * clip(ell_i)
Construct signature s_i from flattened upper triangle of G_i + summaries
(5) Train SSDE on signatures (SSDE runs):
Create two signature views per sample with (mask_p, noise_std)
Train MLP g_theta with InfoNCE loss (temperature tau)
Mine hard negatives: top-k by similarity; upweight by (1 + hard_alpha)
Output normalized embeddings z_i = normalize(g_theta(s_i)) ∈ R^p
(6) Select K images (baseline or SSDE):
Prefilter to top P by d_i (if using MMR)
Greedy selection for t = 1..K:
choose i* = argmax_{i not in S} [ d_i - lambda_diversity * max_{j in S} sim(v_i, v_j) ]
where v_i = s_i (baseline MMR) or v_i = z_i (SSDE MMR)
enforce dataset quota if specified (best run)
(7) Output the collection of required images:
{"dataset_name": ..., "image_identifier": ...}