malada.providers package

Submodules

malada.providers.crystalstructure module

Provider for crystal structures.

class malada.providers.crystalstructure.CrystalStructureProvider(parameters, external_cif_file=None)

Bases: malada.providers.provider.Provider

Provides crystal structure in the form of a .cif file.

Currently limited to copying user input. In the future this will enable automatic download.

Parameters
  • parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

  • external_cif_file (string) – Path to cif file provided by user. In the current state of the code, the pipeline will fail if this is None.

provide(provider_path)

Provide a crystal structure in the form of a cif file.

Parameters

provider_path (string) – Path in which to operate.

malada.providers.dft module

Provider for DFT calculations to get energies and LDOS.

class malada.providers.dft.DFTProvider(parameters, external_calculation_folders=None)

Bases: malada.providers.provider.Provider

Performs a DFT calculation and provides a DFT output plus LDOS.

Parameters
  • parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

  • external_calculation_folders – Path to folders containing finished or partially finished calculations. If not None, MALADA will try to assess which calculations are still missing and then perform those.

provide(provider_path, dft_convergence_file, ldos_convergence_file, possible_snapshots_file, do_postprocessing=True)

Provide a set of DFT calculations on predefined snapshots.

This includes DFT energies and snapshots.

Parameters
  • provider_path (string) – Path in which to operate.

  • dft_convergence_file (string) – Path to xml file containing the DFT convergence parameter.

  • ldos_convergence_file (string) – Path to xml file containing the LDOS convergence parameter, that is, the separate set of DFT parameters needed for the LDOS calculation.

  • possible_snapshots_file (string) – Path to a file containing an ASE trajectory containing atomic snapshots for DFT/LDOS calculation.

malada.providers.dftconvergence module

Provider for optimized DFT calculation parameters.

class malada.providers.dftconvergence.DFTConvergenceProvider(parameters, external_convergence_results=None, external_convergence_folder=None, predefined_kgrid=None, predefined_cutoff=None)

Bases: malada.providers.provider.Provider

For a given supercell and calculator, determine convergence.

Parameters
  • parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

  • external_convergence_results (string) – Path to xml file containing previously calculated convergence results. If not None, no DFT calculations will be done.

  • external_convergence_folder (string) – Path to a folder containing already calculated DFT convergence results. If not None, no DFT calculations will be done.

  • predefined_kgrid (tuple) – Tuple in the form (kx,ky,kz). If not None, this k grid will be used and no attempt will be made to find a more optimal one.

  • predefined_cutoff (float) – Kinetic energy cutoff. If not None, this cutoff will be used and no attempt will be made to find a more optimal one.

provide(provider_path, supercell_file)

Provide DFT parameters converged to within user specification.

The cutoff energy (=basis set size) and k-grid will be optimized.

Parameters
  • provider_path (string) – Path in which to operate.

  • supercell_file (string) – Path to file containing the structure of the supercell in VASP format.
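
The idea of converging a parameter "to within user specification" can be illustrated with a minimal sketch. The stopping criterion below (energy change between consecutive candidates smaller than a tolerance) is an assumption, not necessarily the criterion MALADA applies, and `first_converged` and `toy_energy` are hypothetical names standing in for real DFT runs.

```python
from typing import Callable, Sequence


def first_converged(
    candidates: Sequence[float],
    energy_of: Callable[[float], float],
    tolerance: float,
) -> float:
    """Return the first candidate whose energy changed by less than
    `tolerance` relative to the previous candidate (assumed criterion)."""
    previous = None
    for value in candidates:
        energy = energy_of(value)
        if previous is not None and abs(energy - previous) < tolerance:
            return value
        previous = energy
    raise RuntimeError("No candidate met the convergence tolerance.")


def toy_energy(cutoff: float) -> float:
    """Toy stand-in for a DFT total energy that flattens with the cutoff."""
    return -10.0 + 100.0 / cutoff**2


cutoffs = [20.0, 30.0, 40.0, 50.0, 60.0]
converged_cutoff = first_converged(cutoffs, toy_energy, tolerance=0.03)
```

The same loop could be applied to a list of k-grids instead of cutoff energies; only the candidate list and the energy function change.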

malada.providers.ldosconvergence module

Provider for optimal LDOS calculation parameters.

class malada.providers.ldosconvergence.LDOSConvergenceProvider(parameters, external_ldos_configuration=None)

Bases: malada.providers.provider.Provider

Determine number of k points and energy levels needed for a smooth LDOS.

Parameters
  • parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

  • external_ldos_configuration (string) – Path to xml file containing the k grid and energy levels for LDOS creation. If not None, no DFT calculations will be performed.

provide(provider_path, snapshot_file, dft_convergence_file)

Provide correct number of k points and energy levels to calculate LDOS.

The results of LDOS-based (ML) workflows depend heavily on a correct choice of energy levels and k-grid. If these are chosen poorly, neural network training will be difficult and will still yield insufficient accuracy.

Parameters
  • provider_path (string) – Path in which to operate.

  • dft_convergence_file (string) – Path to xml file containing the DFT convergence parameter.

  • snapshot_file (string) – Path to a file containing an ASE trajectory containing atomic snapshots for DFT/LDOS calculation.

malada.providers.md module

Provider for DFT-MD calculations.

class malada.providers.md.MDProvider(parameters, external_trajectory=None, external_temperatures=None, external_run_folder=None)

Bases: malada.providers.provider.Provider

Performs a DFT-MD calculation and provides an ASE trajectory.

This can be done with a number of runners.

Parameters
  • parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

  • external_trajectory (string) – Path to a file containing an ASE trajectory. If an MD trajectory has already been calculated, it can be provided here. In this case no MD calculation or processing is done whatsoever. Will be ignored if external_temperatures is None.

  • external_temperatures (string) – Path to a file containing a numpy array with temperature values. If an MD trajectory has already been calculated, the corresponding temperatures can be provided here. In this case no MD calculation or processing is done whatsoever. Will be ignored if external_trajectory is None.

  • external_run_folder (string) – Path to a folder containing a finished MD calculation. If provided, this MD calculation will be preprocessed by this file.
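
The interplay of external_trajectory and external_temperatures described above can be summarized in a small sketch. The helper name `uses_external_md_data` and the file names are hypothetical, and the function only mirrors the documented rule that each argument is ignored when the other is None.

```python
from typing import Optional


def uses_external_md_data(
    external_trajectory: Optional[str],
    external_temperatures: Optional[str],
) -> bool:
    """External MD data is only used if BOTH paths are given; a single
    path without its counterpart is ignored (per the parameter docs)."""
    return external_trajectory is not None and external_temperatures is not None


# Hypothetical file names, for illustration only.
both_given = uses_external_md_data("trajectory.traj", "temperatures.npy")
only_trajectory = uses_external_md_data("trajectory.traj", None)
```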

provide(provider_path, supercell_file, dft_convergence_file, md_performance_file)

Provide an MD trajectory and temperatures from previous steps.

Depending on arguments at creation, no actual MD calculation may be necessary.

Parameters
  • provider_path (string) – Path in which to operate.

  • supercell_file (string) – Path to file containing the structure of the supercell in VASP format.

  • dft_convergence_file (string) – Path to xml file containing the DFT convergence parameter.

  • md_performance_file (string) – Path to xml file containing the optimal run parameters for MD.

malada.providers.mdperformance module

Provider for optimal MD performance parameters.

class malada.providers.mdperformance.MDPerformanceProvider(parameters, external_performance_file=None)

Bases: malada.providers.provider.Provider

Determine parallelization parameters for optimal MD performance.

Parameters
  • parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

  • external_performance_file (string) – Path to file containing optimal DFT-MD performance. If not None, no DFT-MD calculations will be done.

provide(provider_path, dft_convergence_file)

Provide parallelization parameters for optimal DFT-MD performance.

This is not necessary when running in serial.

Parameters
  • provider_path (string) – Path in which to operate.

  • dft_convergence_file (string) – Path to xml file containing the DFT convergence parameter.

malada.providers.provider module

Base class for all pipeline providers.

class malada.providers.provider.Provider(parameters)

Bases: object

Abstract base class for defining provider subclasses.

Apart from the constructor, each provider should have a provide() method.

static enforce_pbc(atoms)

Explicitly enforces the PBC on an ASE atoms object.

QE (and potentially other codes?) does this internally, meaning that the raw positions of atoms (in Angstrom) can lie outside of the unit cell. When setting up the DFT calculation, these atoms get shifted into the unit cell. Since we directly use these raw positions for the descriptor calculation, we need to enforce that the atoms in the ASE atoms objects are explicitly inside the unit cell.

Parameters

atoms (ase.atoms) – The ASE atoms object for which the PBC need to be enforced.

Returns

new_atoms – The ASE atoms object for which the PBC have been enforced.

Return type

ase.atoms
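
What "shifting atoms into the unit cell" means can be shown with a minimal sketch. It assumes an orthorhombic cell (axis-aligned, described only by its three edge lengths) and works on plain lists rather than ASE objects; `wrap_into_cell` is a hypothetical helper and not the method's actual implementation. For general cells, ASE itself offers `Atoms.wrap()`.

```python
from typing import List, Sequence


def wrap_into_cell(
    positions: Sequence[Sequence[float]],
    cell_lengths: Sequence[float],
) -> List[List[float]]:
    """Map Cartesian positions into [0, L) along each axis.

    Assumption: orthorhombic cell, so each coordinate can be wrapped
    independently with a modulo operation.
    """
    return [
        [coord % length for coord, length in zip(position, cell_lengths)]
        for position in positions
    ]


# One atom lying outside a 4 x 4 x 4 Angstrom cell along x and y.
wrapped = wrap_into_cell([[-0.5, 4.5, 1.0]], (4.0, 4.0, 4.0))
```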

abstract provide(provider_path)

Use output from previous step to provide input for the next.

Parameters

provider_path (string) – Path in which to operate.
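
The contract described for the base class (a constructor plus a mandatory provide() method) follows the usual Python abstract-base-class pattern. The sketch below illustrates that pattern only; `SketchProvider` and `EchoProvider` are hypothetical classes and do not reproduce the real Provider code.

```python
from abc import ABC, abstractmethod


class SketchProvider(ABC):
    """Hypothetical stand-in for an abstract provider base class."""

    def __init__(self, parameters):
        self.parameters = parameters

    @abstractmethod
    def provide(self, provider_path):
        """Use output from the previous step to provide input for the next."""


class EchoProvider(SketchProvider):
    """Hypothetical concrete provider that only records its working path."""

    def provide(self, provider_path):
        self.last_path = provider_path
        return provider_path


# The abstract class cannot be instantiated directly.
try:
    SketchProvider(parameters={})
    base_is_instantiable = True
except TypeError:
    base_is_instantiable = False

# A subclass that implements provide() can.
echo = EchoProvider(parameters={})
result = echo.provide("some/working/dir")
```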

malada.providers.snapshots module

Provider for a set of snapshots from an MD trajectory.

class malada.providers.snapshots.SnapshotsProvider(parameters, external_snapshots=None)

Bases: malada.providers.provider.Provider

Filters snapshots from a given MD trajectory, with a user-specified metric.

Parameters
  • parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

  • external_snapshots (string) – Path to a trajectory file containing snapshots. If not None, no parsing will be done.

analyze_distance_metric(trajectory)

Calculate the cutoff for the distance metric.

The distance metric used here is realspace (i.e. the smallest displacement of an atom between two snapshots). The cutoff gives a lower estimate for the oscillations of the trajectory. Any distance above this cutoff can be attributed to the oscillations in the trajectory. Any distance below it is a consequence of the temporal neighborhood of the snapshots.

Parameters

trajectory (ase.io.Trajectory) – Trajectory to be analyzed.

Returns

cutoff – Cutoff below which two snapshots can be assumed to be similar to each other to a degree that suggests temporal neighborhood.

Return type

float
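
A literal reading of the realspace metric (the smallest displacement of any atom between two snapshots) can be sketched as follows. The sketch works on plain coordinate lists and ignores periodic boundary conditions, which is an assumption made for brevity; `realspace_distance` is a hypothetical helper and not the method documented above.

```python
import math
from typing import Sequence


def realspace_distance(
    snapshot_a: Sequence[Sequence[float]],
    snapshot_b: Sequence[Sequence[float]],
) -> float:
    """Smallest per-atom displacement between two snapshots.

    Assumptions: atoms are listed in the same order in both snapshots
    and periodic images are not taken into account.
    """
    return min(math.dist(a, b) for a, b in zip(snapshot_a, snapshot_b))


first = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
second = [[0.0, 0.0, 3.0], [1.0, 4.0, 0.0]]
distance = realspace_distance(first, second)
```

Two snapshots would then be treated as temporal neighbors when this value falls below the returned cutoff.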

analyze_trajectory(trajectory, equilibrated_snapshot=None)

Calculate distance metrics/first equilibrated timestep on a trajectory.

For this step, the RDF+Cosine distance will be used as a distance metric. Only the first snapshot is returned; all other quantities can be accessed as member variables of the object calling this function.

Parameters
  • trajectory (ase.io.Trajectory) – Trajectory to be analyzed.

  • equilibrated_snapshot (ase.Atoms) – An equilibrated snapshot. Will usually be read from the trajectory itself, but may be provided by the user if desired.

Returns

first_snapshot – First snapshot for which the trajectory is equilibrated.

Return type

int
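
As an illustration of a cosine distance between two radial distribution functions (RDFs), here is a minimal sketch. It uses the standard definition, one minus the cosine similarity, on RDFs given as equally binned histograms; how the method above computes and bins the RDF is not specified here, so the input lists and the name `cosine_distance` are assumptions.

```python
import math
from typing import Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Standard cosine distance: 1 - (u . v) / (|u| |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Toy RDF histograms on a shared radial grid (made-up values).
rdf_reference = [0.0, 0.5, 2.0, 1.0]
rdf_same_shape = [0.0, 1.0, 4.0, 2.0]  # same shape, different scale
rdf_orthogonal = [1.0, 0.0, 0.0, 0.0]  # no overlap with the reference

d_same = cosine_distance(rdf_reference, rdf_same_shape)
d_orthogonal = cosine_distance(rdf_reference, rdf_orthogonal)
```

Because the cosine distance is insensitive to overall scale, it compares the shape of two RDFs rather than their normalization.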

provide(provider_path, trajectoryfile, temperaturefile)

Provide a trajectory file containing atomic snapshots.

Parameters
  • provider_path (string) – Path in which to operate.

  • trajectoryfile (string) – Path to file containing the MD trajectory as ASE trajectory.

  • temperaturefile (string) – File containing the temperatures from the MD run as numpy array.

malada.providers.supercell module

Provider for creation of supercell from crystal structure.

class malada.providers.supercell.SuperCellProvider(parameters, external_supercell_file=None)

Bases: malada.providers.provider.Provider

Builds a supercell file (vasp format).

Parameters

parameters (malada.utils.parameters.Parameters) – Parameters used to create this object.

provide(provider_path, cif_file)

Provide supercell file in VASP format.

Parameters
  • provider_path (string) – Path in which to operate.

  • cif_file (string) – Path to cif file used for supercell creation.

Module contents

Module containing providers for data generation pipeline.
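
The provide() signatures in this package suggest an execution order for the pipeline, since most providers take files that other providers produce. The sketch below derives one such order with the standard-library `graphlib` module (Python 3.9+). The dependency table is inferred from the parameter names alone and is an assumption, not a statement of how MALADA schedules its providers.

```python
from graphlib import TopologicalSorter

# Each provider maps to the providers whose output it appears to consume,
# inferred from the provide() parameter names (assumption).
depends_on = {
    "CrystalStructureProvider": set(),
    "SuperCellProvider": {"CrystalStructureProvider"},
    "DFTConvergenceProvider": {"SuperCellProvider"},
    "MDPerformanceProvider": {"DFTConvergenceProvider"},
    "MDProvider": {
        "SuperCellProvider",
        "DFTConvergenceProvider",
        "MDPerformanceProvider",
    },
    "SnapshotsProvider": {"MDProvider"},
    "LDOSConvergenceProvider": {"SnapshotsProvider", "DFTConvergenceProvider"},
    "DFTProvider": {
        "DFTConvergenceProvider",
        "LDOSConvergenceProvider",
        "SnapshotsProvider",
    },
}

# static_order() yields every provider after all of its dependencies.
pipeline_order = list(TopologicalSorter(depends_on).static_order())
```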