raitap.metrics

RAITAP Metrics Module

Provides performance metric computation for classification and detection tasks using torchmetrics.

Metrics Public Surface

MetricComputer

Protocol defining the interface for all metric computers (reset, update, compute).

MetricResult

Dataclass containing computed metrics (dict[str, float]) and artifacts (dict[str, Any]).
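The reset/update/compute protocol and the MetricResult pair can be illustrated with a minimal stand-in; the ToyAccuracy class below is hypothetical and only mirrors the documented interface, not the raitap implementation:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MetricResult:
    # Mirrors the documented shape: scalar metrics plus free-form artifacts.
    metrics: dict[str, float]
    artifacts: dict[str, Any] = field(default_factory=dict)


class ToyAccuracy:
    """Hypothetical computer following the reset/update/compute protocol."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        self.correct = 0
        self.total = 0

    def update(self, preds: list[int], targets: list[int]) -> None:
        self.correct += sum(p == t for p, t in zip(preds, targets))
        self.total += len(targets)

    def compute(self) -> MetricResult:
        acc = self.correct / self.total if self.total else 0.0
        return MetricResult(metrics={"accuracy": acc})


computer = ToyAccuracy()
computer.update([1, 0, 1], [1, 1, 1])
result = computer.compute()
```

Batches can be streamed through update() any number of times; compute() aggregates the accumulated state, and reset() clears it for the next run.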

Metric classes

ClassificationMetrics

Computes accuracy, precision, recall, and F1 score for binary, multiclass, and multilabel classification tasks.

DetectionMetrics

Computes mean average precision (mAP) and related metrics for object detection tasks.

class raitap.metrics.BaseMetricComputer

Bases: ABC

class raitap.metrics.ClassificationMetrics(*, task='multiclass', num_classes=None, num_labels=None, average='macro', ignore_index=None, **kwargs)

Bases: BaseMetricComputer

Classification metrics computed with torchmetrics.

Supports:
  • Task: binary, multiclass, multilabel

  • Average: micro, macro, weighted, none

Notes

  • If average=“none”, metric outputs are per-class/per-label vectors and are stored in artifacts.

  • For multilabel, you may want to pass threshold=0.5 (default) in kwargs.
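The averaging modes differ in how per-class scores are combined; a stdlib-only sketch of macro vs. micro F1 (the example predictions and class counts are illustrative, not tied to raitap):

```python
preds = [0, 0, 1, 1, 2, 2]
targets = [0, 1, 1, 1, 2, 0]

classes = sorted(set(targets) | set(preds))


def f1_per_class(c: int) -> float:
    # Count true positives, false positives, and false negatives for class c.
    tp = sum(p == t == c for p, t in zip(preds, targets))
    fp = sum(p == c != t for p, t in zip(preds, targets))
    fn = sum(t == c != p for p, t in zip(preds, targets))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0


# "macro": unweighted mean of per-class F1 scores
macro_f1 = sum(f1_per_class(c) for c in classes) / len(classes)

# "micro": pool all decisions first; for single-label multiclass this
# reduces to plain accuracy
micro_f1 = sum(p == t for p, t in zip(preds, targets)) / len(targets)
```

Macro averaging weights every class equally regardless of support, so rare classes pull the score as hard as common ones; micro averaging weights every sample equally instead.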

class raitap.metrics.DetectionMetrics(*, box_format='xyxy', iou_type='bbox', iou_thresholds=None, rec_thresholds=None, max_detection_thresholds=None, class_metrics=False, extended_summary=False, average='macro', backend='faster_coco_eval', **kwargs)

Bases: BaseMetricComputer

Calculates and manages detection metrics for evaluating the performance of object detection models.

This class is responsible for computing detection metrics, updating them with predictions and targets, and resetting their state. It uses a MeanAveragePrecision calculator internally to handle the computation logic. It supports a variety of configurations, including box formats, IoU types, thresholds, class-specific metrics, and more.

Variables:

metric – Instance of the MeanAveragePrecision calculator used to compute metrics.
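Under box_format='xyxy', boxes are (x_min, y_min, x_max, y_max); the pairwise intersection-over-union that mAP is built on can be sketched as follows (a minimal stand-alone function, not the internal calculator):

```python
def iou_xyxy(a: tuple, b: tuple) -> float:
    """Intersection-over-union for two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Two unit-offset 2x2 squares: intersection area 1, union area 7
iou = iou_xyxy((0, 0, 2, 2), (1, 1, 3, 3))
```

A prediction only counts as a true positive at a given entry of iou_thresholds when its IoU with a matched ground-truth box clears that threshold, which is why mAP is reported as an average over thresholds.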

class raitap.metrics.MetricResult(metrics: 'dict[str, float]', artifacts: 'dict[str, Any]'=<factory>)

Bases: object

class raitap.metrics.Metrics(config, predictions, targets)

Bases: object

Run configured metrics, write JSON under the run metrics/ directory.

class raitap.metrics.MetricsEvaluation(result, run_dir, computer, resolved_target)

Bases: Trackable, Reportable

Outcome of a metrics run (JSON on disk + optional computer handle).

log(tracker, *, prefix='performance', **kwargs)

Log the object’s artifacts or metadata to the provided tracker.

to_report_group()

Return a ReportGroup representing this object’s report content.

class raitap.metrics.MetricsVisualizer

Bases: object

Generate matplotlib figures from MetricResult data.

static create_figures(result)

Generate charts for metrics.

Returns a dict with:

  • “metrics_overview”: bar chart of all scalar metrics

  • “confusion_matrix”: only present if a confusion matrix exists in artifacts

raitap.metrics.create_metric(metrics_config)

Instantiate a metric computer from Hydra-style config (_target_ + kwargs).
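A Hydra-style config names the class via _target_ and passes the remaining keys as kwargs; the stand-in resolver below shows the idea (it is illustrative, not the raitap implementation), using a stdlib class so the example is self-contained:

```python
import importlib


def instantiate(config: dict):
    """Import the dotted _target_ path and call it with the remaining keys."""
    cfg = dict(config)  # copy so the caller's config is untouched
    module_path, _, class_name = cfg.pop("_target_").rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**cfg)


# collections.Counter stands in for a metric class purely for illustration
counter = instantiate({"_target_": "collections.Counter", "a": 2, "b": 1})
```

With this pattern the metrics config alone determines which computer is built, so swapping ClassificationMetrics for DetectionMetrics is a config change rather than a code change.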

raitap.metrics.evaluate(config, predictions, targets)

Compute metrics, persist JSON outputs, and return a MetricsEvaluation.

raitap.metrics.metrics_prediction_pair(output)

Build a placeholder (predictions, targets) pair for metrics when no labels exist.

For multiclass logits (N, C) with C > 1, uses argmax for both (trivial self-consistency). For other shapes, passes output through unchanged so users can pair metrics configs with regression / detection / etc.
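For (N, C) logits this self-consistency pairing amounts to taking the argmax once and using it on both sides; a stdlib sketch on nested lists (the function name is illustrative):

```python
def placeholder_pair(output):
    """Pair (N, C) logits with themselves via argmax; pass other shapes through."""
    if (isinstance(output, list) and output
            and isinstance(output[0], list) and len(output[0]) > 1):
        preds = [row.index(max(row)) for row in output]
        return preds, preds  # trivial self-consistency
    return output, output


logits = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.1]]
preds, targets = placeholder_pair(logits)
```

Because predictions and targets are identical, the resulting metrics are degenerate (e.g. accuracy 1.0); the point is only to exercise the metrics pipeline end to end when no labels exist.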

raitap.metrics.metrics_run_enabled(config)

Return True when a metrics config is present and its _target_ is a non-empty string.

raitap.metrics.resolve_metric_targets(predictions, labels)

Use ground truth labels when available, else warn and fall back to predictions.

raitap.metrics.scalar_metrics_for_tracking(result)

Keep only JSON-friendly scalars suitable for tracker log_metrics.
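Keeping only JSON-friendly scalars can be sketched as filtering values down to finite real numbers; only_scalars below is an illustrative stand-in (the bool exclusion is an assumption, since bool is an int subclass in Python):

```python
import math


def only_scalars(metrics: dict) -> dict[str, float]:
    """Drop vectors, NaNs/Infs, and non-numeric values before log_metrics."""
    out: dict[str, float] = {}
    for key, value in metrics.items():
        if isinstance(value, bool):
            continue  # bools pass isinstance(int) checks but are not metrics
        if isinstance(value, (int, float)) and math.isfinite(value):
            out[key] = float(value)
    return out


kept = only_scalars({"f1": 0.8, "per_class": [0.7, 0.9], "nan": float("nan")})
```

Per-class vectors such as the average=“none” outputs belong in artifacts, not in the tracker's scalar stream, which is why they are dropped here.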