target

Base class for all target calculators.

class Target(params: Parameters)[source]

Bases: PhysicalData

Base class for all target quantity parser.

Target parsers read the target quantity (i.e. the quantity the NN will learn to predict) from a specified file format and performs postprocessing calculations on the quantity.

Target parsers often read DFT reference information.

Parameters:

params (mala.common.parameters.Parameters or)
mala.common.parameters.ParametersTargets – Parameters used to create this Target object.

atomic_forces_dft

Atomic forces as per DFT reference file.

Type:: numpy.ndarray

atoms

ASE atoms object used for calculations.

Type:: ase.Atoms

band_energy_dft_calculation: Band energy as per DFT reference file.

electrons_per_atom

Electrons per atom, usually determined by DFT reference file.

Type:: int

entropy_contribution_dft_calculation

Electronic entropy contribution as per DFT reference file.

Type:: float

fermi_energy_dft

Fermi energy as per DFT reference file.

Type:: float

kpoints

k-grid used for MALA calculations. Managed internally.

Type:: list

local_grid

Size of local grid (in MPI mode).

Type:: list

number_of_electrons_exact: Exact number of electrons, usually given via DFT reference file.

number_of_electrons_from_eigenvals

Number of electrons as calculated from DFT reference eigenvalues.

Type:: float

parameters

MALA target calculation parameters.

Type:: mala.common.parameters.ParametersTarget

qe_pseudopotentials

List of Quantum ESPRESSO pseudopotentials, read from DFT reference file: and used for the total energy module.

Type:: list

save_target_data

Control whether target data will be saved. Can be important for I/O applications. Managed internally, default is True.

Type:: bool

temperature

Temperature used for all computations. By default read from DFT reference file, but can freely be changed from the outside.

Type:: float

total_energy_contributions_dft_calculation

Dictionary holding contributions to total free energy not given as individual properties, as read from the DFT reference file. Contains:

“one_electron_contribution”, \(n\,V_\mathrm{xc}\) plus band energy

“hartree_contribution”, \(E_\mathrm{H}\)

“xc_contribution”, \(E_\mathrm{xc}\)

“ewald_contribution”, \(E_\mathrm{Ewald}\)

Type:: dict

total_energy_dft_calculation

Total free energy as read from DFT reference file.

Type:: float

voxel

Voxel to be used for grid intergation. Reflects the symmetry of the simulation cell. Calculated from DFT reference data.

Type:: ase.cell.Cell

y_planes

Number of y_planes used for Quantum ESPRESSO parallelization. Handled internally.

Type:: int

abstract static backconvert_units(array, out_units)[source]

Convert the units of an array from MALA units into desired units.

Parameters:

array (numpy.ndarray) – Data in MALA units.
out_units (string) – Desired units of output array.

Returns:

converted_array – Data in out_units.

Return type:

numpy.ndarray

abstract static convert_units(array, in_units='eV')[source]

Convert the units of an array into the MALA units.

Parameters:

array (numpy.ndarray) – Data for which the units should be converted.
in_units (string) – Units of array.

Returns:

converted_array – Data in MALA units.

Return type:

numpy.ndarray

get_radial_distribution_function(atoms: ase.Atoms, method='mala')[source]

Calculate the radial distribution function (RDF).

Uses the values as given in the parameters object.

Parameters:

atoms (ase.Atoms) – Atoms for which to construct the RDF.
method (string) – If “mala” the more flexible, yet less performant python based RDF will be calculated. If “asap3”, asap3’s C++ implementation will be used.

Returns:

rdf (numpy.ndarray) – The RDF as [rMax] array.
radii (numpy.ndarray) – The radii at which the RDF was calculated (for plotting), as [rMax] array.
method (string) – If “mala” the more flexible, yet less performant python based RDF will be calculated. If “asap3”, asap3’s C++ implementation will be used. In the latter case, rMax will be ignored and automatically calculated.

get_real_space_grid()[source]

Get the real space grid.

Returns:: grid3D – Numpy array holding the entire grid.
Return type:: numpy.ndarray

get_static_structure_factor(atoms: ase.Atoms)[source]

Calculate the static structure factor (SSF).

Uses the values as given in the parameters object.

Parameters:

atoms (ase.Atoms) – Atoms for which to construct the RDF.

Returns:

ssf (numpy.ndarray) – The SSF as [kMax] array.
kpoints (numpy.ndarray) – The k-points at which the SSF was calculated (for plotting), as [kMax] array.

abstract get_target()[source]

Get the target quantity.

This is the generic interface for cached target quantities. It should work for all implemented targets.

get_three_particle_correlation_function(atoms: ase.Atoms)[source]

Calculate the three particle correlation function (TPCF).

Uses the values as given in the parameters object.

Parameters:

atoms (ase.Atoms) – Atoms for which to construct the RDF.

Returns:

tpcf (numpy.ndarray) – The TPCF as [rMax, rMax, rMax] array.
radii (numpy.ndarray) – The radii at which the TPCF was calculated (for plotting), [rMax, rMax, rMax].

abstract invalidate_target()[source]

Invalidates the saved target quantity.

This is the generic interface for cached target quantities. It should work for all implemented targets.

static radial_distribution_function_from_atoms(atoms: ase.Atoms, number_of_bins, rMax='mic', method='mala')[source]

Calculate the radial distribution function (RDF).

In comparison with other python implementation, this function can handle arbitrary radii by incorporating a neighbor list. Loosely based on the implementation in the ASAP3 code (specifically, this file: https://gitlab.com/asap/asap/-/blob/master/Tools/RawRadialDistribution.cpp), but without the restriction for the radii.

Parameters:

atoms (ase.Atoms) – Atoms for which to construct the RDF.
number_of_bins (int) – Number of bins used to create the histogram.
rMax (float or string) –
Radius up to which to calculate the RDF. Options are:
- ”mic”: rMax according to minimum image convention. (see “The Minimum Image Convention in Non-Cubic MD Cells” by Smith, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.57.1696) Suggested value, as this leads to physically meaningful RDFs.
- ”2mic”: rMax twice as big as for “mic”. Legacy option to reproduce some earlier trajectory results
- float: Input some float to use it directly as input.
method (string) – If “mala” the more flexible, yet less performant python based RDF will be calculated. If “asap3”, asap3’s C++ implementation will be used. In the latter case, rMax values larger then “2mic” will be ignored.

Returns:

rdf (numpy.ndarray) – The RDF as [rMax] array.
radii (numpy.ndarray) – The radii at which the RDF was calculated (for plotting), as [rMax] array.

read_additional_calculation_data(data, data_type=None)[source]

Read in additional input about a calculation.

This is e.g. necessary when we operate with preprocessed data for the training itself but want to take into account other physical quantities (such as the fermi energy or the electronic temperature) for post processing.

Parameters:

data (string or List) – Data from which additional calculation data is inputted.
data_type (string) –
Type of data or file that is used. If not provided, MALA will attempt to guess the provided type of data based on shape and/or file ending. Currently supported are:
- ”espresso-out” : Read the additional information from a QuantumESPRESSO output file.
- ”atoms+grid” : Provide a grid and an atoms object from which to predict. Except for the number of electrons, this mode will not change any member variables; values have to be adjusted BEFORE.
- ”json” : Read the additional information from a MALA generated .json file.

restrict_data(array)[source]

Restrict target data to only contain physically meaningful values.

For the LDOS this e.g. implies non-negative values. The type of data restriction is specified by the parameters.

Parameters:: array (numpy.ndarray) – Numpy array, for which the restrictions are to be applied.
Returns:: array – The same array, with restrictions enforced.
Return type:: numpy.ndarray

static static_structure_factor_from_atoms(atoms: ase.Atoms, number_of_bins, kMax, radial_distribution_function=None, calculation_type='direct')[source]

Calculate the static structure factor (SSF).

Implemented according to https://arxiv.org/abs/1606.03610:

Eq. 2 as calculation_type = “direct”, i.e. via summation in Fourier space

Eq. 6 as calculation_type = “fourier_transform”, i.e. via calculation of RDF and thereafter Fourier transformation.

The direct calculation is significantly more accurate and about as efficient, at least for small systems. In either case, the SSF will be given as S(k), even though technically, the direct method calculates S(k); during binning, this function averages over all k with the same k.

Parameters:

atoms (ase.Atoms) – Atoms for which to construct the RDF.
number_of_bins (int) – Number of bins used to create the histogram.
kMax (float) – Maximum wave vector up to which to calculate the SSF.
radial_distribution_function (List of numpy.ndarray) – If not None, and calculation_type=”fourier_transform”, this RDf will be used for the Fourier transformation.
calculation_type (string) – Either “direct” or “fourier_transform”. Controls how the SSF is calculated.

Returns:

ssf (numpy.ndarray) – The SSF as [kMax] array.
kpoints (numpy.ndarray) – The k-points at which the SSF was calculated (for plotting), as [kMax] array.

static three_particle_correlation_function_from_atoms(atoms: ase.Atoms, number_of_bins, rMax='mic')[source]

Calculate the three particle correlation function (TPCF).

The implementation was done as given by doi.org/10.1063/5.0048450, equation 17, with only small modifications. Pleas be aware that, while optimized, this function tends to be compuational heavy for large radii or number of bins.

Parameters:

atoms (ase.Atoms) – Atoms for which to construct the RDF.
number_of_bins (int) – Number of bins used to create the histogram.
rMax (float or string) –
Radius up to which to calculate the RDF. Options are:
- ”mic”: rMax according to minimum image convention. (see “The Minimum Image Convention in Non-Cubic MD Cells” by Smith, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.57.1696) Suggested value, as this leads to physically meaningful RDFs.
- ”2mic : rMax twice as big as for “mic”. Legacy option to reproduce some earlier trajectory results
- float: Input some float to use it directly as input.

Returns:

tpcf (numpy.ndarray) – The TPCF as [rMax, rMax, rMax] array.
radii (numpy.ndarray) – The radii at which the TPCF was calculated (for plotting), [rMax, rMax, rMax].

write_additional_calculation_data(filepath, return_string=False)[source]

Write additional information about a calculation to a .json file.

This way important information about a calculation can be saved more accessibly.

Parameters:

filepath (string) – Path at which to save the calculation data.
return_string (bool) – If True, no file will be written, and instead a json dict will be returned.

static write_tem_input_file(atoms_Angstrom, qe_input_data, qe_pseudopotentials, grid_dimensions, kpoints, mpi_communicator, mpi_rank)[source]

Write a QE-style input file for the total energy module.

Usually, the used parameters should correspond to the properties of the object calling this function, but they don’t necessarily have to.

Parameters:

atoms_Angstrom (ase.Atoms) – ASE atoms object for the current system. If None, MALA will create one.
qe_input_data (dict) – Quantum Espresso parameters dictionary for the ASE<->QE interface. If None (recommended), MALA will create one.
qe_pseudopotentials (dict) – Quantum Espresso pseudopotential dictionaty for the ASE<->QE interface. If None (recommended), MALA will create one.
grid_dimensions (List) – A list containing the x,y,z dimensions of the real space grid.
kpoints (dict) – k-grid used, usually None or (1,1,1) for TEM calculations.
mpi_communicator (MPI.COMM_WORLD) – An MPI comminucator. If no MPI is enabled, this will simply be None.
mpi_rank (int) – Rank within MPI.

write_to_numpy_file(path, target_data=None)[source]

Write data to a numpy file.

Parameters:

path (string) – File to save into.
target_data (numpy.ndarray) – Target data to save. If None, the data stored in the calculator will be used.

write_to_openpmd_file(path, array=None, additional_attributes={}, internal_iteration_number=0)[source]

Write data to a numpy file.

Parameters:

path (string) – File to save into. If no file ending is given, .h5 is assumed.
array (numpy.ndarray) – Target data to save. If None, the data stored in the calculator will be used.
additional_attributes (dict) – Dict containing additional attributes to be saved.
internal_iteration_number (int) – Internal OpenPMD iteration number. Ideally, this number should match any number present in the file name, if this data is part of a larger data set.

abstract property feature_size: Get dimension of this target if used as feature in ML.

property qe_input_data: Input data for QE TEM calls.

abstract property si_dimension

Dictionary containing the SI unit dimensions in OpenPMD format.

Needed for OpenPMD interface.

abstract property si_unit_conversion

Numeric value of the conversion from MALA (ASE) units to SI.

Needed for OpenPMD interface.