lazy_load_dataset

DataSet for lazy-loading.

class LazyLoadDataset(*args: Any, **kwargs: Any)

Bases: Dataset

DataSet class for lazy loading.

Only the snapshots that are currently being processed are loaded into memory. The class uses a “caching” approach: the most recently used snapshot is kept in memory until values from a new one are requested. Because of this, shuffling at the DataSampler / DataLoader level is discouraged, to the point that it has been disabled. Instead, the snapshot load order is mixed here, so that there is at least some mixing.
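The caching behavior described above can be illustrated with a small, self-contained sketch. This is not the MALA implementation: ToyLazyDataset, fake_loader, and the load_count counter are hypothetical names introduced only for illustration, and plain Python lists stand in for tensors. The sketch shows why sample-level shuffling is costly when only one snapshot is cached at a time.

```python
class ToyLazyDataset:
    """Hypothetical stand-in for a lazy-loading dataset (not MALA code)."""

    def __init__(self, snapshot_sizes, loader):
        self.snapshot_sizes = list(snapshot_sizes)
        self.loader = loader  # callable: file_index -> list of samples
        self.currently_loaded_file = None
        self.cache = None
        self.load_count = 0  # counts how often a snapshot is read

    def __len__(self):
        return sum(self.snapshot_sizes)

    def _locate(self, idx):
        # Map a global sample index to (file_index, index within file).
        if idx < 0:
            raise IndexError("sample index out of range")
        for file_index, size in enumerate(self.snapshot_sizes):
            if idx < size:
                return file_index, idx
            idx -= size
        raise IndexError("sample index out of range")

    def __getitem__(self, idx):
        file_index, local_index = self._locate(idx)
        # Only read from "disk" if the requested snapshot is not cached.
        if file_index != self.currently_loaded_file:
            self.cache = self.loader(file_index)
            self.currently_loaded_file = file_index
            self.load_count += 1
        return self.cache[local_index]


def fake_loader(file_index):
    # Assumption: each snapshot holds 4 samples, tagged with its file index.
    return [(file_index, i) for i in range(4)]


dataset = ToyLazyDataset([4, 4, 4], fake_loader)

# Sequential access: each snapshot is loaded exactly once.
sequential = [dataset[i] for i in range(len(dataset))]
sequential_loads = dataset.load_count

# Interleaved access (as sample-level shuffling might produce):
# every single access switches snapshots and triggers a reload.
dataset.load_count = 0
dataset.currently_loaded_file = None
interleaved = [s * 4 + i for i in range(4) for s in range(3)]
for i in interleaved:
    dataset[i]
interleaved_loads = dataset.load_count
```

In this toy setup, sequential access reads each of the three snapshots once, while the interleaved order reloads a snapshot on every access. This is the cost that mixing at the snapshot level avoids.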

Parameters:
currently_loaded_file

Index of currently loaded file.

Type:

int

input_data

Input data tensor.

Type:

torch.Tensor

output_data

Output data tensor.

Type:

torch.Tensor

add_snapshot_to_dataset(snapshot: Snapshot)

Add a snapshot to a DataSet.

Afterwards, the DataSet can and will load this snapshot as needed.

Parameters:

snapshot (mala.datahandling.snapshot.Snapshot) – Snapshot that is to be added to this DataSet.

get_new_data(file_index)

Read a new snapshot into RAM.

Parameters:

file_index (int) – Index of the file to be read.

mix_datasets()

Mix the order of the snapshots.

With this, there can be some variance between runs.
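As a rough illustration of mixing at the snapshot level, the following sketch shuffles the order in which snapshots are visited while keeping the samples of each snapshot contiguous. It is an assumption about the general idea, not the actual body of mix_datasets(); the function mixed_sample_order, the snapshot names, and the seed argument are hypothetical.

```python
import random

# Hypothetical snapshots and their sample counts.
snapshot_sizes = {"snapshot0": 3, "snapshot1": 2, "snapshot2": 3}


def mixed_sample_order(sizes, seed):
    """Shuffle the snapshot order; samples within a snapshot stay together."""
    names = list(sizes)
    random.Random(seed).shuffle(names)
    return [(name, i) for name in names for i in range(sizes[name])]


order = mixed_sample_order(snapshot_sizes, seed=1)

# Collapse consecutive duplicates to see how often the snapshot changes.
visited = []
for name, _ in order:
    if not visited or visited[-1] != name:
        visited.append(name)
```

Because all samples of a snapshot remain adjacent, each snapshot only needs to be read once per pass, while different seeds still give different visiting orders between runs.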

property return_outputs_directly

Control whether outputs are actually transformed.

Has to be False for training. In the testing case, numerical errors are smaller if set to True.