raitap.robustness¶
- class raitap.robustness.assessors.base_assessor.BaseAssessor¶
Bases: AdapterMixin, ABC
Root base class for all robustness assessors.
Concrete subclasses must declare algorithm_registry as a class-body ClassVar; pyright reports an error at the decoration site if it is missing (the @adapters.robustness decorator's algorithm_registry kwarg is Required).
- budget_kwarg_source = 'init_kwargs'¶
Which YAML block the underlying library actually consumes for budget kwargs (eps / alpha / steps). "init_kwargs" means the adapter forwards them at attack-instance construction (torchattacks); "call_kwargs" means they are read at attack-call time (foolbox). RobustnessSemantics.perturbation is derived from this source, so reported metadata always matches what the adapter executed.
- class raitap.robustness.assessors.base_assessor.EmpiricalAttackAssessor¶
Bases: BaseAssessor, ABC
Empirical attack adapter.
Subclasses implement only _default_invoke; the framework owns the rest.
- generate_adversarial(model, inputs, targets, *, backend=None, **kwargs)¶
Dispatch to the algorithm entry's invoker when one is set; otherwise use this adapter's default path.
- class raitap.robustness.assessors.base_assessor.FormalVerificationAssessor¶
Bases: BaseAssessor, ABC
Formal-verification adapter: subclasses implement per-sample verify_sample.
- abstractmethod verify_sample(model, sample, target, *, budget, backend=None, **kwargs)¶
Verify a single sample's robustness within budget.
- class raitap.robustness.assessors.base_assessor.StatisticalSamplingAssessor¶
Bases: BaseAssessor, ABC
Average-case adapter: estimate accuracy under a perturbation distribution.
The framework owns assess(): it perturbs every image in pixel space, forwards the corrupted batch, scores each sample as CORRECT / MISCLASSIFIED against its label, and aggregates the scores into corrupted accuracy plus a binomial CI. Subclasses implement only apply_perturbation on a single HWC uint8 image.
- abstractmethod apply_perturbation(image)¶
Corrupt one HWC uint8 [0,255] image; return the same shape and dtype.
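The aggregation step described in this entry can be illustrated with a minimal sketch. It is not RAITAP's implementation: the entry does not say which binomial interval the framework computes, so the Wilson score interval used here is an assumption, and corrupted_accuracy_with_ci is a hypothetical helper name.

```python
import math


def corrupted_accuracy_with_ci(correct_flags, z=1.96):
    """Return (accuracy, ci_low, ci_high) from per-sample correct/misclassified flags.

    Assumption: a Wilson score interval at roughly 95% confidence (z = 1.96).
    """
    n = len(correct_flags)
    if n == 0:
        raise ValueError("need at least one sample")
    n_correct = sum(1 for flag in correct_flags if flag)
    p_hat = n_correct / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4 * n * n)
    )
    return p_hat, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical outcomes: 8 of 10 corrupted images keep their ground-truth label.
flags = [True] * 8 + [False] * 2
accuracy, ci_low, ci_high = corrupted_accuracy_with_ci(flags)
print(f"corrupted accuracy = {accuracy:.2f}, CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With few samples the interval is wide, which is why the sample count is worth reporting next to the accuracy.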
RAITAP Robustness Module
Provides per-sample robustness assessments under a perturbation budget. The module distinguishes two complementary methods:
Empirical attacks — try to find an adversarial example (torchattacks, foolbox).
Formal verification — prove no adversarial example exists (Marabou complete SMT; auto_LiRPA sound+incomplete bound propagation).
Public Surface¶
- Assessor classes (_target_ values; live under raitap.robustness.assessors.): TorchattacksAssessor, FoolboxAssessor
- Visualiser classes (_target_ values; live under raitap.robustness.visualisers.): ImagePairVisualiser, PerturbationHeatmapVisualiser
Module layout (for contributors):
- phase.py: pipeline entry point. Holds RobustnessPhase (what the registry assembles), the assess_robustness work fn, and target resolution. Start here.
- factory.py: builds assessor instances from config.
- results.py: RobustnessResult (owns its .visualisations) and RobustnessVisualisationResult.
- report.py: RobustnessPhaseResult and report-section builders.
- assessors/: the attack / verification adapters.
- visualisers/: the figure renderers.
- class raitap.robustness.AssessmentKind(*values)¶
Bases: StrEnum
LEVEL 1: procedure-level taxonomy; coarsens into RobustnessCase via case_for().
Distinguishes three assessment procedures:
- EMPIRICAL_ATTACK: adversarial example search (worst-case).
- FORMAL_VERIFICATION: sound proof or refutation (worst-case).
- STATISTICAL_SAMPLING: accuracy under a perturbation distribution (average-case).
- exception raitap.robustness.AssessmentKindVisualiserIncompatibilityError(*, assessor_target, visualiser, assessor_assessment_kind, supported_assessment_kinds)¶
Bases: Exception
Raised when a visualiser does not support the assessor's assessment kind.
- class raitap.robustness.AssessorAdapter(*args, **kwargs)¶
Bases: Protocol
Protocol every assessor adapter must satisfy.
Mirrors raitap.transparency.contracts.ExplainerAdapter but for the robustness pipeline: assess(model, inputs, targets, …) instead of explain(model, inputs, …).
- class raitap.robustness.AssessorAlgorithmSpec(assessment_kind, threat_model, objective, norm=None, families=<factory>, default_epsilon=None, requires=<factory>, stochastic=False, invoker=None)¶
Bases: object
Per-algorithm metadata read by assessor_semantics.
- exception raitap.robustness.AssessorBackendIncompatibilityError(assessor, backend, algorithm, reason)¶
Bases: Exception
Raised when an assessor's algorithm is not supported by the selected backend.
- exception raitap.robustness.BackendIncompatibilityError(*, adapter, backend, missing)¶
Bases: Exception
Raised when an adapter's algorithm needs capabilities the backend lacks.
missing is the sorted list of capability values the backend does not provide (algorithm.requires - backend.provides).
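The set difference behind missing can be sketched as follows. This is a stand-alone illustration, not the library's code: the Capability enum, its member names, and the missing_capabilities helper are hypothetical.

```python
from enum import Enum


class Capability(str, Enum):
    # Hypothetical capability names, for illustration only.
    GRADIENTS = "gradients"
    LOGITS = "logits"
    ONNX_GRAPH = "onnx_graph"


def missing_capabilities(requires, provides):
    """Sorted capability values the algorithm requires but the backend lacks."""
    return sorted(cap.value for cap in set(requires) - set(provides))


algorithm_requires = {Capability.GRADIENTS, Capability.LOGITS}
backend_provides = {Capability.LOGITS, Capability.ONNX_GRAPH}

missing = missing_capabilities(algorithm_requires, backend_provides)
print(missing)  # ['gradients']
```

Sorting the values keeps the error message deterministic, since set iteration order is not.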
- class raitap.robustness.BaseAssessor¶
Bases: AdapterMixin, ABC
Root base class for all robustness assessors.
Concrete subclasses must declare algorithm_registry as a class-body ClassVar; pyright reports an error at the decoration site if it is missing (the @adapters.robustness decorator's algorithm_registry kwarg is Required).
- budget_kwarg_source = 'init_kwargs'¶
Which YAML block the underlying library actually consumes for budget kwargs (eps / alpha / steps). "init_kwargs" means the adapter forwards them at attack-instance construction (torchattacks); "call_kwargs" means they are read at attack-call time (foolbox). RobustnessSemantics.perturbation is derived from this source, so reported metadata always matches what the adapter executed.
- class raitap.robustness.BaseRobustnessVisualiser¶
Bases: ABC, AdapterMixin
All robustness visualisers extend this class.
Subclasses declare which assessment kinds they support via the supported_assessment_kinds ClassVar; the factory enforces compatibility at YAML parse time so configuration errors fail fast.
Empirical visualisers may also declare class-level facet hints used by compact report rendering. Setting embeds_clean_input or embeds_perturbation_map to True means the visualiser must accept the matching runtime kwarg (include_clean_input / include_perturbation_map) and hide that facet when it is False. The flags default to False, so formal verifier visualisers are unaffected unless they explicitly opt into the contract.
report_figure_scope declares whether the produced figure summarises the whole assessment (ASSESSOR: one chart per assessor) or shows a single input (PER_SAMPLE). The reporting layer reads it to choose the layout slot; it defaults to PER_SAMPLE so empirical image visualisers keep their existing per-sample placement.
- abstractmethod visualise(result, *, context, **kwargs)¶
Render a figure for result.
- class raitap.robustness.ConfiguredRobustnessVisualiser(visualiser, call_kwargs=<factory>)¶
Bases: object
Visualiser instance plus per-call kwargs for BaseRobustnessVisualiser.visualise.
- class raitap.robustness.EmpiricalAttackAssessor¶
Bases: BaseAssessor, ABC
Empirical attack adapter.
Subclasses implement only _default_invoke; the framework owns the rest.
- generate_adversarial(model, inputs, targets, *, backend=None, **kwargs)¶
Dispatch to the algorithm entry's invoker when one is set; otherwise use this adapter's default path.
- class raitap.robustness.FoolboxAssessor(algorithm, *, bounds=(0.0, 1.0), preprocessing=None, **init_kwargs)¶
Bases: EmpiricalAttackAssessor
Single wrapper for foolbox attack classes.
Foolbox consumes the perturbation budget at call time (attack(fmodel, inputs, targets, epsilons=...)), so the YAML budget keys belong under the call: block. The adapter sets budget_kwarg_source = "call_kwargs" so that semantics metadata reflects this.
Multi-epsilon sweeps (passing a list to epsilons so foolbox returns a per-eps list of tensors) are intentionally not supported in this adapter: they would change the result tensor shape across configurations and break the uniform RobustnessResult contract. A future MultiEpsilonAssessor will own that surface.
- budget_kwarg_source = 'call_kwargs'¶
Which YAML block the underlying library actually consumes for budget kwargs (eps / alpha / steps). "init_kwargs" means the adapter forwards them at attack-instance construction (torchattacks); "call_kwargs" means they are read at attack-call time (foolbox). RobustnessSemantics.perturbation is derived from this source, so reported metadata always matches what the adapter executed.
- class raitap.robustness.FormalVerificationAssessor¶
Bases: BaseAssessor, ABC
Formal-verification adapter: subclasses implement per-sample verify_sample.
- abstractmethod verify_sample(model, sample, target, *, budget, backend=None, **kwargs)¶
Verify a single sample's robustness within budget.
- class raitap.robustness.ImageCorruptionsAssessor(algorithm, *, severity=1, **init_kwargs)¶
Bases: StatisticalSamplingAssessor
Apply one ImageNet-C corruption at one severity to estimate average-case accuracy.
algorithm is the corruption name; severity (1..5) is a constructor kwarg.
- apply_perturbation(image)¶
Corrupt one HWC uint8 [0,255] image; return the same shape and dtype.
- budget_kwarg_source = 'init_kwargs'¶
Which YAML block the underlying library actually consumes for budget kwargs (eps / alpha / steps). "init_kwargs" means the adapter forwards them at attack-instance construction (torchattacks); "call_kwargs" means they are read at attack-call time (foolbox). RobustnessSemantics.perturbation is derived from this source, so reported metadata always matches what the adapter executed.
- class raitap.robustness.ImagePairVisualiser(*, max_samples=4, cmap='RdBu_r', diff_scale=None)¶
Bases: BaseRobustnessVisualiser
Render N rows by 3 columns: clean, perturbed, signed perturbation.
- visualise(result, *, context, **kwargs)¶
Render a figure for result.
- exception raitap.robustness.MissingTargetsError(assessor_name)¶
Bases: Exception
Raised when a robustness assessment is requested but no labels are available.
- class raitap.robustness.Objective(*values)¶
Bases: StrEnum
Whether the assessor seeks any misclassification or a specific target class.
- class raitap.robustness.PerturbationBudget(norm, epsilon=None, step_size=None, steps=None)¶
Bases: PerturbationRegion
Worst-case norm ball an adversary may explore.
- class raitap.robustness.PerturbationDistribution(corruption_name, severity)¶
Bases: PerturbationRegion
Average-case perturbation distribution (one ImageNet-C corruption at one severity).
- class raitap.robustness.PerturbationHeatmapVisualiser(*, max_samples=4, cmap='seismic', aggregate_channels='signed_dominant')¶
Bases: BaseRobustnessVisualiser
Render the perturbation tensor as a heatmap.
The default aggregate_channels="signed_dominant" keeps, for each pixel, the signed value of the channel with the largest absolute deviation. This preserves sign and avoids the cancellation that occurs when mean averages opposing signs across channels (e.g. +eps on R and -eps on G displaying as ~0). Other modes (mean, mean_abs, max_abs) are kept as opt-ins.
- visualise(result, *, context, **kwargs)¶
Render a figure for result.
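The difference between signed_dominant and mean aggregation described for PerturbationHeatmapVisualiser can be shown on a single pixel. This is a plain-Python sketch, not the visualiser's code (which operates on tensors). The function names are illustrative, and the tie-breaking rule (first channel wins) belongs to this sketch only.

```python
def signed_dominant(channels):
    """Signed value of the channel with the largest absolute deviation.

    On ties, max() returns the first such channel (a property of this sketch).
    """
    return max(channels, key=abs)


def mean(channels):
    """Plain average across channels; opposing signs cancel."""
    return sum(channels) / len(channels)


eps = 8 / 255
pixel = [eps, -eps, 0.0]  # perturbation on R, G, B for one pixel

print(mean(pixel))                         # 0.0, the perturbation disappears
print(signed_dominant(pixel) == eps)       # True, sign and magnitude survive
print(signed_dominant([0.01, -0.03, 0.0])) # -0.03
```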
- class raitap.robustness.PerturbationNorm(*values)¶
Bases: StrEnum
Norm under which the perturbation budget is measured.
- class raitap.robustness.PerturbationRegion¶
Bases: object
Base for the region of inputs an assessment explores. Subclasses are kind-specific (PerturbationBudget, PerturbationDistribution).
- class raitap.robustness.ReportFigureScope(*values)¶
Bases: StrEnum
Where a robustness visualiser's figure belongs in the report layout.
Read by the reporting layer to place each staged figure: ASSESSOR figures summarise the whole assessment (one chart per assessor, e.g. clean-vs-corrupted accuracy, verdict-count summary, output-bound cohorts); PER_SAMPLE figures show one input each (e.g. original/perturbed image pairs). Defaults to PER_SAMPLE on the visualiser base so empirical image visualisers keep their existing per-sample placement without opting in.
- class raitap.robustness.RobustnessAssessment(config, assessor_name, model, inputs, targets, input_metadata=None, sample_ids=None, sample_names=None, *, resolved_preprocessing=None, **kwargs)¶
Bases: object
Factory entry-point mirroring raitap.transparency.factory.Explanation.
- class raitap.robustness.RobustnessCase(*values)¶
Bases: StrEnum
LEVEL 2: the worst/average 'case' from thesis §2.4; a coarsening of AssessmentKind.
Not an independent axis: every procedure belongs to exactly one case and no procedure spans cases, so case is derived from kind, never stored.
- class raitap.robustness.RobustnessMetrics(clean_accuracy, adversarial_accuracy=None, attack_success_rate=None, mean_distance=None, max_distance=None, verified_rate=None, falsified_rate=None, unknown_rate=None, error_rate=None, mean_runtime=None, corrupted_accuracy=None, accuracy_ci_low=None, accuracy_ci_high=None, n_samples=None, n_correct=None, metrics=<factory>)¶
Bases: object
Aggregate metrics for a robustness assessment.
clean_accuracy is always populated. Empirical-only / verifier-only fields are populated by the matching base assessor pipeline; the unused half stays None and is dropped from as_dict.
- class raitap.robustness.RobustnessResult(clean_inputs, targets, clean_predictions, verdicts, metrics, run_dir, experiment_name, adapter_target, algorithm, name=None, kwargs=<factory>, call_kwargs=<factory>, visualiser_targets=<factory>, visualisers=<factory>, visualisations=<factory>, perturbed_inputs=None, perturbed_predictions=None, perturbation_distance=None, output_bounds=None, runtime_per_sample=None, *, semantics)¶
Bases: Trackable
Trackable result of a robustness assessment.
A single shape covers both empirical and formal-verification assessors. perturbed_inputs / perturbed_predictions / perturbation_distance are always populated for empirical results; for formal-verification results they hold counter-examples from FALSIFIED rows (NaN-padded for non-FALSIFIED rows so the artifact stays a single tensor; mask via verdicts). output_bounds and runtime_per_sample are verifier-only side channels.
- log(tracker, artifact_path='robustness', use_subdirectory=True, **kwargs)¶
Log the object's artifacts or metadata to the provided tracker.
- render_visualisation_for_report(visualiser_index, *, sample_index=None, **render_kwargs)¶
Render one visualiser for report staging without persistence side effects.
Unlike RobustnessResult.visualise, this hook never writes PNGs, updates metadata.json, mutates visualiser_targets, or touches run_dir. Visualiser errors propagate to the caller; callers are expected to pre-filter redundant visualisers before invoking this method.
- class raitap.robustness.RobustnessSemantics(assessment_kind, threat_model, objective, families, perturbation, target_classes=None, sample_selection=None, input_spec=None, stochastic=False)¶
Bases: object
Typed contract describing the meaning of a robustness assessment artifact.
- property case¶
Robustness case this assessment belongs to (derived from assessment_kind).
- class raitap.robustness.RobustnessVerdict(*values)¶
Bases: StrEnum
Per-sample assessment outcome.
Empirical assessors emit ATTACK_SUCCEEDED / ATTACK_FAILED (the latter does NOT prove robustness; it only means the configured attack failed to find an adversarial example within the budget). Formal verification assessors emit VERIFIED / FALSIFIED / UNKNOWN (and ERROR for per-sample crashes / timeouts). Statistical-sampling assessors emit CORRECT_UNDER_PERTURBATION / MISCLASSIFIED_UNDER_PERTURBATION (whether the corrupted input was still classified as its ground-truth label).
- class raitap.robustness.RobustnessVisualisationContext(algorithm, assessment_kind, sample_names, show_sample_names)¶
Bases: object
Standard pipeline-controlled metadata provided to robustness visualisers.
- class raitap.robustness.RobustnessVisualisationResult(result, figure, visualiser_name, visualiser_target, output_path)¶
Bases: Trackable
PNG written to output_path; figure is closed after save.
- log(tracker, artifact_path='robustness', use_subdirectory=True, **kwargs)¶
Log the object's artifacts or metadata to the provided tracker.
- exception raitap.robustness.RobustnessVisualiserIncompatibilityError(framework, visualiser, algorithm, compatible_algorithms)¶
Bases: Exception
Raised when a visualiser is not compatible with the assessor's algorithm.
- class raitap.robustness.ThreatModel(*values)¶
Bases: StrEnum
Adversary capability assumed by the assessor.
- class raitap.robustness.TorchattacksAssessor(algorithm, **init_kwargs)¶
Bases: EmpiricalAttackAssessor
Single wrapper for ALL torchattacks methods.
Uses dynamic method loading, so no class per method is needed.
- budget_kwarg_source = 'init_kwargs'¶
Which YAML block the underlying library actually consumes for budget kwargs (eps / alpha / steps). "init_kwargs" means the adapter forwards them at attack-instance construction (torchattacks); "call_kwargs" means they are read at attack-call time (foolbox). RobustnessSemantics.perturbation is derived from this source, so reported metadata always matches what the adapter executed.
- class raitap.robustness.VerificationOutcome(verdict, counter_example=None, lower_bounds=None, upper_bounds=None, runtime_seconds=None, diagnostics=<factory>)¶
Bases: object
Per-sample result returned by a FormalVerificationAssessor.
counter_example is set only when the verifier produced an explicit falsification; lower_bounds / upper_bounds are per-class logit bounds when the verifier exposes them. runtime_seconds measures the verifier's own time-to-decision for this sample.
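To make per-class logit bounds concrete, here is a toy sketch for a linear classifier under an L-infinity budget. It does not use Marabou or auto_LiRPA and is not how RAITAP verifies models. The function names and the returned dict are hypothetical stand-ins for a VerificationOutcome, and input clipping to a valid pixel range is ignored.

```python
def logit_bounds(weights, biases, sample, epsilon):
    """Per-class (lower, upper) logit bounds over the L-inf ball of radius epsilon."""
    lower, upper = [], []
    for w_row, b in zip(weights, biases):
        centre = sum(w * x for w, x in zip(w_row, sample)) + b
        radius = epsilon * sum(abs(w) for w in w_row)  # epsilon * ||w||_1
        lower.append(centre - radius)
        upper.append(centre + radius)
    return lower, upper


def verify_linear_sample(weights, biases, sample, target, epsilon):
    """VERIFIED if the target's lower bound beats every other class's upper bound.

    Classes are bounded independently, so the check is sound but incomplete:
    when it fails, the honest answer is UNKNOWN rather than FALSIFIED.
    """
    lower, upper = logit_bounds(weights, biases, sample, epsilon)
    best_other = max(u for k, u in enumerate(upper) if k != target)
    verdict = "VERIFIED" if lower[target] > best_other else "UNKNOWN"
    return {"verdict": verdict, "lower_bounds": lower, "upper_bounds": upper}


W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
x = [1.0, 0.0]

small = verify_linear_sample(W, b, x, target=0, epsilon=0.1)
large = verify_linear_sample(W, b, x, target=0, epsilon=0.6)
print(small["verdict"], large["verdict"])  # VERIFIED UNKNOWN
```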
- raitap.robustness.assessor_semantics(assessor, *, call_kwargs, raitap_kwargs, inputs, targets, sample_ids=None, sample_names=None)¶
Build a RobustnessSemantics from the configured assessor and its kwargs.
Budget fields (epsilon, step_size, steps) live in only one of the two YAML blocks per framework, governed by the assessor's budget_kwarg_source ClassVar:
- torchattacks: "init_kwargs" (constructor-time, since the adapter does attack_class(model, **init_kwargs) once and never forwards per-call budget keys).
- foolbox: "call_kwargs" (foolbox attacks consume epsilons=... at attack(...) time).
The reported RobustnessSemantics.perturbation therefore matches what the adapter actually executed. Misplaced budget keys in the non-authoritative source emit a warning so the user can correct the YAML.
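The selection rule can be sketched in a few lines. This is an illustrative stand-in, not the body of assessor_semantics: resolve_budget and BUDGET_KEYS are hypothetical names, and the key set (eps / alpha / steps) is taken from the budget_kwarg_source description rather than from the library's code.

```python
import warnings

# Assumed raw budget key names, as listed in the budget_kwarg_source docs.
BUDGET_KEYS = ("eps", "alpha", "steps")


def resolve_budget(budget_kwarg_source, init_kwargs, call_kwargs):
    """Read budget keys from the authoritative block; warn about misplaced ones."""
    sources = {"init_kwargs": init_kwargs, "call_kwargs": call_kwargs}
    if budget_kwarg_source not in sources:
        raise ValueError(f"unknown budget_kwarg_source: {budget_kwarg_source!r}")
    authoritative = sources[budget_kwarg_source]
    for name, block in sources.items():
        if name == budget_kwarg_source:
            continue
        misplaced = sorted(key for key in BUDGET_KEYS if key in block)
        if misplaced:
            warnings.warn(
                f"budget keys {misplaced} in {name} are ignored; "
                f"move them to {budget_kwarg_source}",
                stacklevel=2,
            )
    return {key: authoritative[key] for key in BUDGET_KEYS if key in authoritative}


# torchattacks-style adapter: the budget comes from constructor kwargs only.
budget = resolve_budget("init_kwargs", {"eps": 0.03, "steps": 10}, {})
print(budget)  # {'eps': 0.03, 'steps': 10}
```

Warning instead of raising keeps a run alive while still telling the user that part of the YAML had no effect.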
- raitap.robustness.check_assessor_visualiser_compat(assessor, assessor_target, visualisers)¶
Enforce AssessmentKind ↔ supported_assessment_kinds at parse time.
- raitap.robustness.encode_verdicts(verdicts)¶
Encode a per-sample list of verdicts as a long tensor of stable codes.
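One way to keep codes stable is to spell out the mapping instead of relying on enum order. The sketch below is only an illustration of that idea. The integer codes and string values are invented, only a subset of the verdict names is included, and it returns a plain list where the real function returns a long tensor.

```python
from enum import Enum


class Verdict(str, Enum):
    # Names from the RobustnessVerdict entry; string values are assumed.
    ATTACK_SUCCEEDED = "attack_succeeded"
    ATTACK_FAILED = "attack_failed"
    VERIFIED = "verified"
    FALSIFIED = "falsified"
    UNKNOWN = "unknown"
    ERROR = "error"


# Hypothetical codes. An explicit table stays valid even if enum members
# are reordered or new members are added later.
VERDICT_CODES = {
    Verdict.ATTACK_SUCCEEDED: 0,
    Verdict.ATTACK_FAILED: 1,
    Verdict.VERIFIED: 2,
    Verdict.FALSIFIED: 3,
    Verdict.UNKNOWN: 4,
    Verdict.ERROR: 5,
}


def encode(verdicts):
    """Map a per-sample list of verdicts (members or their values) to integer codes."""
    return [VERDICT_CODES[Verdict(v)] for v in verdicts]


print(encode([Verdict.VERIFIED, "falsified", Verdict.ERROR]))  # [2, 3, 5]
```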
- raitap.robustness.hints_for_assessor(assessor)¶
Resolve the registry hints for a configured assessor.
Reads the adapter's algorithm_registry ClassVar (enforced by ROBUSTNESS.has_algorithm_registry=True at decoration time).
- raitap.robustness.semantics.assessor_semantics(assessor, *, call_kwargs, raitap_kwargs, inputs, targets, sample_ids=None, sample_names=None)¶
Build a RobustnessSemantics from the configured assessor and its kwargs.
Budget fields (epsilon, step_size, steps) live in only one of the two YAML blocks per framework, governed by the assessor's budget_kwarg_source ClassVar:
- torchattacks: "init_kwargs" (constructor-time, since the adapter does attack_class(model, **init_kwargs) once and never forwards per-call budget keys).
- foolbox: "call_kwargs" (foolbox attacks consume epsilons=... at attack(...) time).
The reported RobustnessSemantics.perturbation therefore matches what the adapter actually executed. Misplaced budget keys in the non-authoritative source emit a warning so the user can correct the YAML.
- raitap.robustness.semantics.hints_for_assessor(assessor)¶
Resolve the registry hints for a configured assessor.
Reads the adapter's algorithm_registry ClassVar (enforced by ROBUSTNESS.has_algorithm_registry=True at decoration time).