Robustness¶
The robustness module configures the assessors and visualisers that probe how a model behaves under input perturbations.
Each robustness entry defines one named assessor, its algorithm, and the
visualisers that should render its outputs. The current implementation supports
three complementary methods:
Empirical attacks (
worst_case) — try to find an adversarial example within a perturbation budget (torchattacks, foolbox).Formal verification (
worst_case) — prove that no adversarial example exists within the budget. The module shape already accommodates this; concrete adapters (auto_LiRPA, alpha-beta-CROWN) arrive in a follow-up release.Statistical sampling (
average_case) — measure accuracy under a perturbation distribution, e.g. ImageNet-C corruptions (imagecorruptions).
A "non-attack" outcome from an empirical assessor does not prove robustness; it just means the configured attack failed. Use a formal-verification assessor when you need a robustness proof rather than an attack attempt.
Providing ground-truth labels¶
Untargeted attacks need a per-sample reference label to push the model away
from. Without data.labels, raitap falls back to argmax(model(clean)),
which means the attack only confirms the model's current decision is
brittle — not that it disagrees with reality. Configure data.labels to
supply real labels; see Data configuration for
the labels.source, labels.column, labels.id_column, and
labels.encoding options.
When labels are missing, raitap emits a warning so the fallback target is clearly flagged.
Robustness module documentation