raitap.data

Data module. It handles:

  • loading data from various sources (local files, URLs, demo samples)

  • converting to raw tensors for model input

  • hosting a list of demo samples

class raitap.data.Data(cfg)

Bases: Trackable

describe()

Build standard dataset metadata for tracking and reporting.

Returns:

Dictionary containing dataset metadata.

log(tracker, **kwargs)

Log dataset metadata to the tracker.

raitap.data.load_numpy_from_source(source, n_samples=None)

Load data as a NumPy array using the same resolution rules as load_tensor_from_source().

For file-based sources (local paths and URLs), loading is torch-free: no intermediate torch tensor is allocated. The only exception is demo sample sources (SAMPLE_SOURCES), which go through the torch-based raitap.data.samples._load_sample.
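The three source kinds mentioned here (named demo sample, URL, local path) suggest a simple dispatch step before any loading happens. The following is only a minimal sketch of such a step, not raitap's actual code: classify_source and DEMO_SAMPLE_NAMES are hypothetical names, and the check order (sample name, then URL, then local path) is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical stand-in for raitap's SAMPLE_SOURCES; its real contents are not shown here.
DEMO_SAMPLE_NAMES = {"imagenet_samples"}


def classify_source(source: str) -> str:
    """Return "sample", "url", or "path" for a source string.

    Raises ValueError when the source matches none of the three kinds.
    """
    # Assumed order: known demo names first, then URLs, then the local filesystem.
    if source in DEMO_SAMPLE_NAMES:
        return "sample"
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    if Path(source).exists():
        return "path"
    raise ValueError(f"Cannot resolve source: {source!r}")
```

In a design like this, only the "sample" branch would need torch, which is consistent with the torch-free guarantee for file-based sources described above.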

raitap.data.load_tensor_from_source(source, n_samples=None)

Load a raw tensor from a named demo sample, URL, or local path.

This is the same loading logic used by Data, but without label handling. Useful for loading background data for SHAP explainers.

Parameters:
  • source – Named demo sample (e.g. "imagenet_samples"), URL, or local path.

  • n_samples – If set, randomly subsample n_samples rows from the loaded tensor. Useful for keeping background datasets small (e.g. for KernelExplainer).

Returns:

Raw tensor of shape (N, ...) where N is the number of samples.

Raises:

ValueError – If source cannot be resolved, does not exist, or the file type is not supported.
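To illustrate what randomly subsampling n_samples rows can look like, here is a minimal, dependency-free sketch that picks row indices. It is not taken from raitap: subsample_indices and its seed parameter are hypothetical, and the sketch assumes sampling without replacement, keeps the original row order, and returns every row when n_samples is None or at least N. The documentation above does not state how raitap handles those cases.

```python
import random
from typing import List, Optional


def subsample_indices(
    n_rows: int,
    n_samples: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[int]:
    """Pick row indices for a random subsample of a dataset with n_rows rows."""
    # Assumption: no subsampling when n_samples is unset or not smaller than n_rows.
    if n_samples is None or n_samples >= n_rows:
        return list(range(n_rows))
    rng = random.Random(seed)
    # Sample distinct indices (no row repeated), then sort to keep the original order.
    return sorted(rng.sample(range(n_rows), n_samples))
```

With an array or tensor x of shape (N, ...), x[subsample_indices(x.shape[0], 10)] would then yield a background set of shape (10, ...), which is the kind of small dataset KernelExplainer benefits from.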