airsspy.ranking#

Structure ranking utilities for AIRSS search results.

Provides fast parsing of SHELX .res files, ranking by enthalpy per formula unit, and optional merging of similar structures using distance fingerprint comparison (equivalent to cryan -u).

Module Contents#

Classes#

StructureRecord

Lightweight record for a ranked structure.

Functions#

apply_external_pressure

Apply external pressure correction in-place.

filter_by_name

Filter records by label using a glob pattern.

filter_by_formula

Filter records by chemical formula.

filter_by_formula_units

Filter records by exact number of formula units.

filter_by_species_number

Filter records by exact number of distinct species.

filter_by_ions_number

Filter records by ion count.

fill_missing_spacegroups

Detect spacegroups via spglib for records with empty symm.

fill_dict_symm

Fill missing symm in ranking output dicts via spglib.

read_res_stream

Read concatenated RES structures from a text stream (stdin or file).

read_res_file

Read RES structures from a file (may be packed).

read_extxyz_file

Read structures from an extxyz file using ASE.

eliminate_similar

Merge similar structures by comparing distance fingerprints.

infer_elements

Infer the element list from all structures’ species_counts.

check_elemental_references

Check which elemental references are present.

records_to_pd_entries

Convert StructureRecords to pymatgen PDEntry objects.

maxwell_construction

Compute convex hull using pymatgen PhaseDiagram.

prefilter_records

Filter records by energy threshold per atom relative to the minimum.

prune_pathological_records

Remove suspiciously low-energy records using a trimmed MAD cutoff.

rank_structures

Rank structures by enthalpy per formula unit.

summary_structures

Return only the most stable structure per composition.

format_header

Format the header line (goes to stderr).

format_rank_line

Format a single ranked record as a cryan-compatible output line.

plot_maxwell

Build a cryan-style convex hull plot using plotly.

format_maxwell_header

Format the header line for Maxwell construction output.

format_maxwell_line

Format a single Maxwell construction record as a cryan-compatible output line.

Data#

API#

airsspy.ranking.logger#

‘getLogger(…)’

class airsspy.ranking.StructureRecord[source]#

Lightweight record for a ranked structure.

label: str#

None

pressure: float#

None

volume: float#

None

enthalpy: float#

None

spin: float#

0.0

spin_abs: float#

0.0

natoms: int#

0

symm: str = <Multiline-String>#

species_counts: dict[str, int]#

‘field(…)’

copies: int#

1

source: str = <Multiline-String>#

property reduced_formula: str#

Hill-system reduced formula, e.g. ‘SiO2’.

property n_formula_units: int#

Number of formula units (GCD of species counts).

property enthalpy_per_fu: float#

Enthalpy per formula unit.

property volume_per_fu: float#

Volume per formula unit.
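The per-formula-unit properties follow from the greatest common divisor of the species counts. A minimal sketch, assuming a plain dict stands in for species_counts; the helper names are illustrative and not part of airsspy:

```python
from functools import reduce
from math import gcd


def n_formula_units(species_counts: dict[str, int]) -> int:
    """GCD of the species counts, e.g. {'Si': 4, 'O': 8} -> 4."""
    return reduce(gcd, species_counts.values())


def per_formula_unit(total: float, species_counts: dict[str, int]) -> float:
    """Divide a cell total (enthalpy or volume) by the number of formula units."""
    return total / n_formula_units(species_counts)


counts = {"Si": 4, "O": 8}
print(n_formula_units(counts))          # 4
print(per_formula_unit(-96.0, counts))  # -24.0
```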

airsspy.ranking.apply_external_pressure(records: list[airsspy.ranking.StructureRecord], pressure_gpa: float) None[source]#

Apply external pressure correction in-place.

Adds P * V / _EV_A3_TO_GPA to each record’s enthalpy, where P is in GPa and V in ų. Positive pressure_gpa favours denser (smaller-volume) structures.
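The correction can be sketched as follows. The Rec class is a hypothetical stand-in for StructureRecord, and the constant is assumed to equal the module's private _EV_A3_TO_GPA (1 eV/ų is about 160.2177 GPa):

```python
from dataclasses import dataclass

EV_PER_A3_IN_GPA = 160.2176634  # 1 eV/A^3 in GPa; assumed to match _EV_A3_TO_GPA


@dataclass
class Rec:  # hypothetical stand-in for StructureRecord
    enthalpy: float  # eV
    volume: float    # A^3


def apply_pressure(records: list[Rec], pressure_gpa: float) -> None:
    """Add the P*V term (converted to eV) to each record in place."""
    for rec in records:
        rec.enthalpy += pressure_gpa * rec.volume / EV_PER_A3_IN_GPA


# The larger-volume record starts lower in enthalpy but loses at 10 GPa.
recs = [Rec(enthalpy=-10.0, volume=20.0), Rec(enthalpy=-10.1, volume=30.0)]
apply_pressure(recs, 10.0)
print(recs[0].enthalpy < recs[1].enthalpy)  # True
```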

airsspy.ranking.filter_by_name(records: list[airsspy.ranking.StructureRecord], pattern: str) list[airsspy.ranking.StructureRecord][source]#

Filter records by label using a glob pattern.

Supports *, ?, and [seq] wildcards (fnmatch).
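A small illustration of the wildcard semantics on plain label strings. It uses the standard library's fnmatchcase; whether the module calls fnmatch or fnmatchcase internally is not stated here, so case handling is an assumption, and filter_labels is a hypothetical helper:

```python
from fnmatch import fnmatchcase


def filter_labels(labels: list[str], pattern: str) -> list[str]:
    """Keep labels that match a glob pattern (case-sensitive)."""
    return [lab for lab in labels if fnmatchcase(lab, pattern)]


labels = ["SiO2-1234-1", "SiO2-1234-2", "TiO2-9-1"]
print(filter_labels(labels, "SiO2-*"))           # ['SiO2-1234-1', 'SiO2-1234-2']
print(filter_labels(labels, "?iO2-*-1"))         # ['SiO2-1234-1', 'TiO2-9-1']
print(filter_labels(labels, "SiO2-1234-[2-9]"))  # ['SiO2-1234-2']
```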

airsspy.ranking.filter_by_formula(records: list[airsspy.ranking.StructureRecord], formula: str) list[airsspy.ranking.StructureRecord][source]#

Filter records by chemical formula.

Three modes:

  • Exact reduced formula: -f SiO2

  • Comma-separated elements: -f Si,O — matches any composition containing all listed elements

  • Glob on reduced formula: -f "Si*" — fnmatch on the reduced formula
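The three modes above can be sketched as a single predicate. The dispatch order (comma first, then wildcard characters, then exact match) is an assumption, and matches_formula is a hypothetical helper operating on a reduced formula string plus the set of elements present:

```python
from fnmatch import fnmatchcase


def matches_formula(reduced_formula: str, species: set[str], query: str) -> bool:
    """Return True if a structure satisfies the formula query."""
    if "," in query:
        # Element-list mode: every listed element must be present.
        wanted = {el.strip() for el in query.split(",") if el.strip()}
        return wanted <= species
    if any(ch in query for ch in "*?["):
        # Glob mode: wildcard match on the reduced formula.
        return fnmatchcase(reduced_formula, query)
    # Exact mode.
    return reduced_formula == query


entries = [("SiO2", {"Si", "O"}), ("Si", {"Si"}), ("MgSiO3", {"Mg", "Si", "O"})]


def select(query: str) -> list[str]:
    return [f for f, species in entries if matches_formula(f, species, query)]


print(select("SiO2"))  # ['SiO2']
print(select("Si,O"))  # ['SiO2', 'MgSiO3']
print(select("Si*"))   # ['SiO2', 'Si']
```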

airsspy.ranking.filter_by_formula_units(records: list[airsspy.ranking.StructureRecord], n_formula_units: int) list[airsspy.ranking.StructureRecord][source]#

Filter records by exact number of formula units.

airsspy.ranking.filter_by_species_number(records: list[airsspy.ranking.StructureRecord], species_number: int) list[airsspy.ranking.StructureRecord][source]#

Filter records by exact number of distinct species.

airsspy.ranking.filter_by_ions_number(records: list[airsspy.ranking.StructureRecord], ions_number: int) list[airsspy.ranking.StructureRecord][source]#

Filter records by ion count.

Positive values require an exact natoms match. Negative values match records with natoms <= abs(ions_number), following cryan’s range form.
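The sign convention can be illustrated with a small predicate. matches_ions is a hypothetical helper; the behaviour for ions_number == 0 is not specified above and is treated here as an exact match:

```python
def matches_ions(natoms: int, ions_number: int) -> bool:
    """Exact match for positive values, upper bound for negative values."""
    if ions_number < 0:
        return natoms <= abs(ions_number)
    return natoms == ions_number


sizes = [4, 8, 12, 16]
print([n for n in sizes if matches_ions(n, 8)])    # [8]
print([n for n in sizes if matches_ions(n, -12)])  # [4, 8, 12]
```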

airsspy.ranking.fill_missing_spacegroups(records: list[airsspy.ranking.StructureRecord], symprec: float = 0.01) None[source]#

Detect spacegroups via spglib for records with empty symm.

Only processes records that have an _atoms reference (extxyz). Modifies records in place.

airsspy.ranking.fill_dict_symm(dicts: list[dict], symprec: float = 0.01) None[source]#

Fill missing symm in ranking output dicts via spglib.

Each dict must have a _record key referencing the source StructureRecord. Modifies dicts in place.

airsspy.ranking.read_res_stream(stream: TextIO) list[airsspy.ranking.StructureRecord][source]#

Read concatenated RES structures from a text stream (stdin or file).

airsspy.ranking.read_res_file(path: str) list[airsspy.ranking.StructureRecord][source]#

Read RES structures from a file (may be packed).

airsspy.ranking.read_extxyz_file(path: str, energy_field: str | None = None, label_field: str | None = None, pressure_field: str | None = None) list[airsspy.ranking.StructureRecord][source]#

Read structures from an extxyz file using ASE.

Field names for energy, label and pressure are auto-detected from atoms.info / the attached calculator. Override detection by passing energy_field, label_field, or pressure_field.

Spacegroup is read from atoms.info if present; otherwise it is left empty and can be filled later via fill_missing_spacegroups.

airsspy.ranking.eliminate_similar(records: list[airsspy.ranking.StructureRecord], threshold: float, cutoff: float = 4.0, zweight: bool = False) list[airsspy.ranking.StructureRecord][source]#

Merge similar structures by comparing distance fingerprints.

Matches cryan’s -u behaviour:

  1. Sort by enthalpy_per_fu ascending (most stable first)

  2. For each pair of same-formula records, compare scaled distance fingerprints

  3. If max difference < threshold * mean_min_distance, merge copies

cutoff controls the neighbour search radius (Å) for fingerprint computation (default 4.0, matching cryan’s rmax / 1.75).

When zweight is True, distances are weighted by d * zmax² / (Z_i·Z_j) to distinguish different atom-type pairs.

Merged peers are tracked in each surviving record’s _merged_peers list for later output.

Returns the deduplicated list with accumulated copies.
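The greedy merge described in the steps above can be sketched as follows. This is not the module's implementation: Rec is a hypothetical stand-in for StructureRecord, the fingerprints and mean minimum distances are assumed to be precomputed, fingerprints of different lengths are assumed to be dissimilar, and the survivor's mean_min_distance is assumed to set the scale of the tolerance:

```python
from dataclasses import dataclass, field


@dataclass
class Rec:  # hypothetical stand-in for StructureRecord
    label: str
    formula: str
    enthalpy_per_fu: float
    fingerprint: list[float]   # assumed: precomputed, sorted scaled distances
    mean_min_distance: float   # assumed: precomputed per structure
    copies: int = 1
    merged_peers: list["Rec"] = field(default_factory=list)


def similar(a: Rec, b: Rec, threshold: float) -> bool:
    """Same formula and max fingerprint difference below the scaled threshold."""
    if a.formula != b.formula or len(a.fingerprint) != len(b.fingerprint):
        return False
    max_diff = max(
        (abs(x - y) for x, y in zip(a.fingerprint, b.fingerprint)), default=0.0
    )
    return max_diff < threshold * a.mean_min_distance


def merge_similar(records: list[Rec], threshold: float) -> list[Rec]:
    """Keep the most stable member of each group of similar structures."""
    kept: list[Rec] = []
    for rec in sorted(records, key=lambda r: r.enthalpy_per_fu):
        for survivor in kept:
            if similar(survivor, rec, threshold):
                survivor.copies += rec.copies      # accumulate copies
                survivor.merged_peers.append(rec)  # remember what was merged
                break
        else:
            kept.append(rec)
    return kept


a = Rec("a", "SiO2", -24.00, [1.60, 2.60, 3.10], 1.6)
b = Rec("b", "SiO2", -23.99, [1.61, 2.61, 3.09], 1.6)
c = Rec("c", "SiO2", -23.50, [1.80, 2.40, 3.30], 1.7)
unique = merge_similar([c, b, a], threshold=0.05)
print([r.label for r in unique], unique[0].copies)  # ['a', 'c'] 2
```

Because records are visited in order of increasing enthalpy per formula unit, the survivor of each group is always its most stable member.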

airsspy.ranking.infer_elements(records: list[airsspy.ranking.StructureRecord]) list[str][source]#

Infer the element list from all structures’ species_counts.

Returns elements sorted by atomic number.

airsspy.ranking.check_elemental_references(records: list[airsspy.ranking.StructureRecord], elements: list[str]) list[str][source]#

Check which elemental references are present.

Returns a list of elements that have no pure-element structure.
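A minimal sketch of the check, operating on plain species_counts dicts rather than StructureRecord objects. missing_references is a hypothetical helper, and it assumes that species_counts never contains zero-count entries:

```python
def missing_references(
    all_species_counts: list[dict[str, int]], elements: list[str]
) -> list[str]:
    """Elements for which no single-species structure exists."""
    pure = {next(iter(counts)) for counts in all_species_counts if len(counts) == 1}
    return [el for el in elements if el not in pure]


structures = [{"Si": 2}, {"Si": 1, "O": 2}, {"Si": 8}]
print(missing_references(structures, ["O", "Si"]))  # ['O']
```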

airsspy.ranking.records_to_pd_entries(records: list[airsspy.ranking.StructureRecord]) list[source]#

Convert StructureRecords to pymatgen PDEntry objects.

Uses Composition(species_counts) and total enthalpy as energy. Sets entry.name = label for display.

airsspy.ranking.maxwell_construction(records: list[airsspy.ranking.StructureRecord], elements: list[str] | None = None, delta_e: float | None = None, verbose: bool = True) tuple[list[dict], object][source]#

Compute convex hull using pymatgen PhaseDiagram.

Args:

  • records: Structure records to analyse.

  • elements: Element list for the chemical system. If None, inferred.

  • delta_e: Filter structures with e_above_hull above this (eV/atom).

  • verbose: If True, print warnings to stderr.

Returns: Tuple of (output_records, PhaseDiagram, elements). Each output dict has all rank fields plus:

  • e_above_hull: energy above hull (eV/atom)

  • hull_energy_per_atom: hull energy at this composition (eV/atom)

  • formation_energy_per_atom: formation energy (eV/atom)

  • on_hull: True if structure is on the convex hull

Raises:

  • ValueError: If fewer than 2 elements in the system.

airsspy.ranking.prefilter_records(records: list[airsspy.ranking.StructureRecord], ethresh: float = 0.1) list[airsspy.ranking.StructureRecord][source]#

Filter records by energy threshold per atom relative to the minimum.

Groups by formula, computes relative enthalpy per atom within each group, removes structures above ethresh eV/atom. Used to reduce the candidate set before merging. Returns surviving StructureRecord objects.
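The grouping and threshold logic can be sketched on plain tuples. prefilter is a hypothetical helper taking (formula, total enthalpy, natoms) tuples, and keeping records exactly at the threshold is an assumption:

```python
from collections import defaultdict


def prefilter(
    records: list[tuple[str, float, int]], ethresh: float = 0.1
) -> list[tuple[str, float, int]]:
    """Keep records within ethresh eV/atom of the lowest one for their formula."""
    lowest: dict[str, float] = defaultdict(lambda: float("inf"))
    for formula, enthalpy, natoms in records:
        lowest[formula] = min(lowest[formula], enthalpy / natoms)
    return [
        (formula, enthalpy, natoms)
        for formula, enthalpy, natoms in records
        if enthalpy / natoms - lowest[formula] <= ethresh
    ]


records = [
    ("SiO2", -24.0, 3),  # -8.00 eV/atom, group minimum
    ("SiO2", -23.4, 3),  # -7.80 eV/atom, 0.20 above the minimum
    ("SiO2", -47.7, 6),  # -7.95 eV/atom, 0.05 above the minimum
    ("Si", -5.0, 1),     # only member of its group
]
survivors = prefilter(records, ethresh=0.1)
print(len(survivors))  # 3
```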

airsspy.ranking.prune_pathological_records(records: list[airsspy.ranking.StructureRecord], tail_fraction: float = 0.1, sigma_factor: float = 3.0, trim_count: int = 1, min_tail_size: int = 5) tuple[list[airsspy.ranking.StructureRecord], list[airsspy.ranking.StructureRecord], list[dict]][source]#

Remove suspiciously low-energy records using a trimmed MAD cutoff.

The filter is applied independently for each reduced formula. Energies are compared as enthalpy per atom. For each formula group, the lowest tail_fraction of records is used as the candidate tail, the lowest trim_count of those records are excluded from the baseline statistics, and the cutoff is median - sigma_factor * 1.4826 * MAD.

Returns (kept, rejected, diagnostics). Diagnostics are dictionaries so callers can report skipped groups and per-formula cutoffs without redoing the statistics.

airsspy.ranking.rank_structures(records: list[airsspy.ranking.StructureRecord], delta_e: float | None = None, top_n: int | None = None, absolute: bool = False) list[dict][source]#

Rank structures by enthalpy per formula unit.

Returns a list of output dicts with keys needed for formatting.

airsspy.ranking.summary_structures(records: list[airsspy.ranking.StructureRecord], delta_e: float | None = None) list[dict][source]#

Return only the most stable structure per composition.

Output matches cryan’s -s flag.

airsspy.ranking.format_header(show_spin: bool = False, summary_mode: bool = False, long_labels: bool = False) str[source]#

Format the header line (goes to stderr).

airsspy.ranking.format_rank_line(rec: dict, long_labels: bool = False, show_spin: bool = False, summary_mode: bool = False) str[source]#

Format a single ranked record as a cryan-compatible output line.

airsspy.ranking.plot_maxwell(ranked: list[dict], elements: list[str]) object[source]#

Build a cryan-style convex hull plot using plotly.

Returns a plotly go.Figure. The layout has a main panel showing formation enthalpy vs composition with all structures as scatter points and the convex hull as lines, plus a right-hand panel listing the stable phases (formula, nfu, space group, composition).

Args:

  • ranked: Output dicts from maxwell_construction().

  • elements: Element list (2 for binary).

airsspy.ranking.format_maxwell_header(show_spin: bool = False, long_labels: bool = False) str[source]#

Format the header line for Maxwell construction output.

airsspy.ranking.format_maxwell_line(rec: dict, long_labels: bool = False, show_spin: bool = False) str[source]#

Format a single Maxwell construction record as a cryan-compatible output line.