airsspy.ranking
Structure ranking utilities for AIRSS search results.
Provides fast parsing of SHELX .res files, ranking by enthalpy per
formula unit, and optional merging of similar structures using
distance fingerprint comparison (equivalent to cryan -u).
Module Contents
Classes
StructureRecord | Lightweight record for a ranked structure.
Functions
apply_external_pressure | Apply external pressure correction in-place.
filter_by_name | Filter records by label using a glob pattern.
filter_by_formula | Filter records by chemical formula.
filter_by_formula_units | Filter records by exact number of formula units.
filter_by_species_number | Filter records by exact number of distinct species.
filter_by_ions_number | Filter records by ion count.
fill_missing_spacegroups | Detect spacegroups via spglib for records with empty symm.
fill_dict_symm | Fill missing symm in ranking output dicts via spglib.
read_res_stream | Read concatenated RES structures from a text stream (stdin or file).
read_res_file | Read RES structures from a file (may be packed).
read_extxyz_file | Read structures from an extxyz file using ASE.
eliminate_similar | Merge similar structures by comparing distance fingerprints.
infer_elements | Infer the element list from all structures’ species_counts.
check_elemental_references | Check which elemental references are present.
records_to_pd_entries | Convert StructureRecords to pymatgen PDEntry objects.
maxwell_construction | Compute convex hull using pymatgen PhaseDiagram.
prefilter_records | Filter records by energy threshold per atom relative to the minimum.
prune_pathological_records | Remove suspiciously low-energy records using a trimmed MAD cutoff.
rank_structures | Rank structures by enthalpy per formula unit.
summary_structures | Return only the most stable structure per composition.
format_header | Format the header line (goes to stderr).
format_rank_line | Format a single ranked record as a cryan-compatible output line.
plot_maxwell | Build a cryan-style convex hull plot using plotly.
Format the header line for Maxwell construction output.
Format a single Maxwell construction record as a cryan-compatible output line.
Data
API
- airsspy.ranking.logger
‘getLogger(…)’
- airsspy.ranking.apply_external_pressure(records: list[airsspy.ranking.StructureRecord], pressure_gpa: float) -> None
Apply external pressure correction in-place.
Adds P * V / _EV_A3_TO_GPA to each record’s enthalpy, where P is in GPa and V is in Å³. Positive pressure_gpa favours denser (smaller-volume) structures.
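The unit conversion behind the correction can be illustrated with a minimal sketch. The Rec class, the add_pv_term function and the EV_A3_TO_GPA constant are hypothetical stand-ins, and the sketch assumes the module's private _EV_A3_TO_GPA holds the value of 1 eV/Å³ in GPa.

```python
from dataclasses import dataclass

# 1 eV/Å^3 expressed in GPa. Assumption: the module's private
# _EV_A3_TO_GPA constant holds this conversion factor.
EV_A3_TO_GPA = 160.2176634


@dataclass
class Rec:
    """Hypothetical stand-in for StructureRecord (only the fields used here)."""
    label: str
    enthalpy: float  # eV
    volume: float    # Å^3


def add_pv_term(records: list[Rec], pressure_gpa: float) -> None:
    """Add P*V, converted to eV, to each record's enthalpy in place."""
    for rec in records:
        rec.enthalpy += pressure_gpa * rec.volume / EV_A3_TO_GPA


recs = [Rec("dense", -10.0, 20.0), Rec("open", -10.1, 30.0)]
add_pv_term(recs, pressure_gpa=10.0)
for rec in recs:
    print(rec.label, round(rec.enthalpy, 4))
```

In this toy data the "open" structure is lower in energy at zero pressure, but the larger P*V term makes the "dense" structure the more stable one at 10 GPa.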
- airsspy.ranking.filter_by_name(records: list[airsspy.ranking.StructureRecord], pattern: str) -> list[airsspy.ranking.StructureRecord]
Filter records by label using a glob pattern.
Supports *, ?, and [seq] wildcards (fnmatch).
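The three wildcard forms behave as in the standard library's fnmatch module. The sketch below uses made-up labels and a hypothetical match_labels helper; it calls fnmatchcase so that matching is case-sensitive on every platform, which may or may not be what the real function does.

```python
from fnmatch import fnmatchcase

# Made-up labels for illustration only.
labels = ["SiO2-run1-0001", "SiO2-run1-0002", "SiO2-run2-0001", "Si-ref-0001"]


def match_labels(labels, pattern):
    """Return the labels that match a glob pattern (case-sensitive)."""
    return [lab for lab in labels if fnmatchcase(lab, pattern)]


print(match_labels(labels, "SiO2-run1-*"))     # * matches any run of characters
print(match_labels(labels, "SiO2-run?-0001"))  # ? matches exactly one character
print(match_labels(labels, "*-000[2-9]"))      # [seq] matches one character from a set
```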
- airsspy.ranking.filter_by_formula(records: list[airsspy.ranking.StructureRecord], formula: str) -> list[airsspy.ranking.StructureRecord]
Filter records by chemical formula.
Three modes:
  1. Exact reduced formula: -f SiO2
  2. Comma-separated elements: -f Si,O (matches any composition containing all listed elements)
  3. Glob on reduced formula: -f "Si*" (fnmatch on the reduced formula)
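One way the three modes could be told apart is sketched below. The dispatch rule (a comma selects element mode, a wildcard character selects glob mode, anything else is an exact match) is an assumption, as are the filter_formula name and the tuple-based record layout; the real function may normalise formulas differently.

```python
from fnmatch import fnmatchcase

# Hypothetical records: (reduced formula, set of elements present).
records = [
    ("SiO2", {"Si", "O"}),
    ("Si", {"Si"}),
    ("MgSiO3", {"Mg", "Si", "O"}),
    ("SiC", {"Si", "C"}),
]


def filter_formula(records, formula):
    """Select records by exact formula, element list, or glob (assumed dispatch)."""
    if "," in formula:
        # Element mode: keep compositions that contain every listed element.
        wanted = {el.strip() for el in formula.split(",") if el.strip()}
        return [rec for rec in records if wanted <= rec[1]]
    if any(ch in formula for ch in "*?["):
        # Glob mode: fnmatch-style match on the reduced formula.
        return [rec for rec in records if fnmatchcase(rec[0], formula)]
    # Exact mode: the reduced formula must be identical.
    return [rec for rec in records if rec[0] == formula]


for query in ("SiO2", "Si,O", "Si*"):
    print(query, [rec[0] for rec in filter_formula(records, query)])
```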
- airsspy.ranking.filter_by_formula_units(records: list[airsspy.ranking.StructureRecord], n_formula_units: int) -> list[airsspy.ranking.StructureRecord]
Filter records by exact number of formula units.
- airsspy.ranking.filter_by_species_number(records: list[airsspy.ranking.StructureRecord], species_number: int) -> list[airsspy.ranking.StructureRecord]
Filter records by exact number of distinct species.
- airsspy.ranking.filter_by_ions_number(records: list[airsspy.ranking.StructureRecord], ions_number: int) -> list[airsspy.ranking.StructureRecord]
Filter records by ion count.
Positive values require an exact natoms match. Negative values match records with natoms <= abs(ions_number), following cryan’s range form.
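The sign convention can be written as a small predicate. The function name matches_ion_count is hypothetical; the logic follows the rule stated above.

```python
def matches_ion_count(natoms: int, ions_number: int) -> bool:
    """Exact match for positive ions_number, upper bound for negative."""
    if ions_number < 0:
        return natoms <= abs(ions_number)
    return natoms == ions_number


atom_counts = [4, 8, 12, 16]
print([n for n in atom_counts if matches_ion_count(n, 8)])    # exactly 8 atoms
print([n for n in atom_counts if matches_ion_count(n, -12)])  # at most 12 atoms
```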
- airsspy.ranking.fill_missing_spacegroups(records: list[airsspy.ranking.StructureRecord], symprec: float = 0.01) -> None
Detect spacegroups via spglib for records with empty symm.
Only processes records that have an _atoms reference (extxyz). Modifies records in place.
- airsspy.ranking.fill_dict_symm(dicts: list[dict], symprec: float = 0.01) -> None
Fill missing symm in ranking output dicts via spglib.
Each dict must have a _record key referencing the source StructureRecord. Modifies dicts in place.
- airsspy.ranking.read_res_stream(stream: TextIO) -> list[airsspy.ranking.StructureRecord]
Read concatenated RES structures from a text stream (stdin or file).
- airsspy.ranking.read_res_file(path: str) -> list[airsspy.ranking.StructureRecord]
Read RES structures from a file (may be packed).
- airsspy.ranking.read_extxyz_file(path: str, energy_field: str | None = None, label_field: str | None = None, pressure_field: str | None = None) -> list[airsspy.ranking.StructureRecord]
Read structures from an extxyz file using ASE.
Field names for energy, label and pressure are auto-detected from atoms.info / the attached calculator. Override detection by passing energy_field, label_field, or pressure_field.
Spacegroup is read from atoms.info if present; otherwise it is left empty and can be filled later via fill_missing_spacegroups().
- airsspy.ranking.eliminate_similar(records: list[airsspy.ranking.StructureRecord], threshold: float, cutoff: float = 4.0, zweight: bool = False) -> list[airsspy.ranking.StructureRecord]
Merge similar structures by comparing distance fingerprints.
Matches cryan’s -u behaviour:
  1. Sort by enthalpy_per_fu ascending (most stable first)
  2. For each pair of same-formula records, compare scaled distance fingerprints
  3. If max difference < threshold * mean_min_distance, merge copies
cutoff controls the neighbour search radius (Å) for fingerprint computation (default 4.0, matching cryan’s rmax / 1.75).
When zweight is True, distances are weighted by d * zmax² / (Z_i·Z_j) to distinguish different atom-type pairs.
Merged peers are tracked in each surviving record’s _merged_peers list for later output.
Returns the deduplicated list with accumulated copies.
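The merge loop described above can be sketched with precomputed fingerprints. This is a simplified illustration, not the module's implementation: the Rec class, the is_similar and merge_similar helpers, the equal-length sorted fingerprints, and the reading of mean_min_distance as the average of the two structures' shortest distances are all assumptions. How the real fingerprints are built from the cutoff radius is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Rec:
    """Hypothetical stand-in for StructureRecord with a precomputed fingerprint."""
    label: str
    formula: str
    enthalpy_per_fu: float
    fingerprint: list[float]  # sorted, scaled interatomic distances
    copies: int = 1
    merged_peers: list[str] = field(default_factory=list)


def zweighted(d, z_i, z_j, z_max):
    """Optional weighting d * zmax^2 / (Z_i * Z_j) for one pair distance."""
    return d * z_max ** 2 / (z_i * z_j)


def is_similar(a, b, threshold):
    """Compare two fingerprints element by element (assumed criterion)."""
    if not a.fingerprint or len(a.fingerprint) != len(b.fingerprint):
        return False
    max_diff = max(abs(x - y) for x, y in zip(a.fingerprint, b.fingerprint))
    # Assumption: mean_min_distance is the mean of the two shortest distances.
    mean_min = 0.5 * (a.fingerprint[0] + b.fingerprint[0])
    return max_diff < threshold * mean_min


def merge_similar(records, threshold):
    """Greedy merge: the most stable record of each cluster survives."""
    kept = []
    for rec in sorted(records, key=lambda r: r.enthalpy_per_fu):
        for rep in kept:
            if rep.formula == rec.formula and is_similar(rep, rec, threshold):
                rep.copies += rec.copies
                rep.merged_peers.append(rec.label)
                break
        else:
            kept.append(rec)
    return kept


recs = [
    Rec("a", "SiO2", -5.00, [1.60, 2.60, 3.10]),
    Rec("b", "SiO2", -4.99, [1.61, 2.61, 3.09]),
    Rec("c", "SiO2", -4.80, [1.60, 2.40, 3.40]),
    Rec("d", "Si", -4.99, [1.60, 2.60, 3.10]),
]
unique = merge_similar(recs, threshold=0.05)
print([(r.label, r.copies, r.merged_peers) for r in unique])
```

Because the records are visited from most to least stable, the survivor of each cluster is always its lowest-enthalpy member, and its copies count accumulates the copies of everything merged into it.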
- airsspy.ranking.infer_elements(records: list[airsspy.ranking.StructureRecord]) -> list[str]
Infer the element list from all structures’ species_counts.
Returns elements sorted by atomic number.
- airsspy.ranking.check_elemental_references(records: list[airsspy.ranking.StructureRecord], elements: list[str]) -> list[str]
Check which elemental references are present.
Returns a list of elements that have no pure-element structure.
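Element inference and the reference check can be pictured together with plain dictionaries standing in for species_counts. The helper names infer and missing_references and the three-entry atomic number table are hypothetical and cover only this toy example.

```python
# Atomic numbers for the elements used in this toy example only.
ATOMIC_NUMBER = {"O": 8, "Mg": 12, "Si": 14}

# Hypothetical species_counts mappings, one per structure.
structures = [
    {"Si": 2, "O": 4},
    {"Mg": 1, "Si": 1, "O": 3},
    {"Si": 8},
]


def infer(structures):
    """Collect every element that occurs, sorted by atomic number."""
    elements = set()
    for counts in structures:
        elements.update(counts)
    return sorted(elements, key=lambda el: ATOMIC_NUMBER[el])


def missing_references(structures, elements):
    """Return the elements for which no single-element structure exists."""
    present = {next(iter(counts)) for counts in structures if len(counts) == 1}
    return [el for el in elements if el not in present]


elements = infer(structures)
print(elements)
print(missing_references(structures, elements))
```

Here only Si has a pure-element structure, so O and Mg are reported as missing references.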
- airsspy.ranking.records_to_pd_entries(records: list[airsspy.ranking.StructureRecord]) -> list
Convert StructureRecords to pymatgen PDEntry objects.
Uses Composition(species_counts) and total enthalpy as energy. Sets entry.name = label for display.
- airsspy.ranking.maxwell_construction(records: list[airsspy.ranking.StructureRecord], elements: list[str] | None = None, delta_e: float | None = None, verbose: bool = True) -> tuple[list[dict], object]
Compute convex hull using pymatgen PhaseDiagram.
Args:
  records: Structure records to analyse.
  elements: Element list for the chemical system. If None, inferred.
  delta_e: Filter structures with e_above_hull above this (eV/atom).
  verbose: If True, print warnings to stderr.
Returns:
  Tuple of (output_records, PhaseDiagram, elements). Each output dict has all rank fields plus:
  - e_above_hull: energy above hull (eV/atom)
  - hull_energy_per_atom: hull energy at this composition (eV/atom)
  - formation_energy_per_atom: formation energy (eV/atom)
  - on_hull: True if structure is on the convex hull
Raises:
  ValueError: If fewer than 2 elements in the system.
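To show what e_above_hull, formation_energy_per_atom and on_hull mean, the sketch below builds a binary hull in pure Python with a monotone-chain lower hull. It deliberately does not use pymatgen's PhaseDiagram, handles only two-element systems, and requires a pure structure of each element. All function names, the tuple layout of entries and the generic element names A and B are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull (monotone chain); expects one point per x value."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_energy(hull, x):
    """Linearly interpolate the hull at composition x."""
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError(f"composition {x} is outside the hull range")


def binary_hull(entries, tol=1e-9):
    """entries: (label, n_A, n_B, total enthalpy in eV) tuples for an A-B system."""
    # Reference energies: the most stable pure-A and pure-B structures (eV/atom).
    e_a = min(e / na for _, na, nb, e in entries if nb == 0)
    e_b = min(e / nb for _, na, nb, e in entries if na == 0)

    rows = []
    for label, na, nb, e in entries:
        n = na + nb
        x = nb / n  # atomic fraction of B
        e_form = e / n - (1 - x) * e_a - x * e_b
        rows.append((label, x, e_form))

    # Only the lowest structure at each composition can sit on the hull.
    lowest = {}
    for _, x, e_form in rows:
        lowest[x] = min(e_form, lowest.get(x, e_form))
    hull = lower_hull(lowest.items())

    out = []
    for label, x, e_form in rows:
        above = e_form - hull_energy(hull, x)
        out.append({"label": label, "x": x,
                    "formation_energy_per_atom": e_form,
                    "e_above_hull": above, "on_hull": above < tol})
    return out


entries = [
    ("A", 1, 0, -2.0), ("A-hi", 2, 0, -3.8), ("B", 0, 1, -4.0),
    ("A2B", 2, 1, -8.6), ("AB", 1, 1, -7.0), ("AB3", 1, 3, -15.2),
]
results = binary_hull(entries)
for row in results:
    print(row["label"], round(row["e_above_hull"], 4), row["on_hull"])
```

In this toy system A2B has a negative formation energy but still lies above the tie line between A and AB, so it is not on the hull.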
- airsspy.ranking.prefilter_records(records: list[airsspy.ranking.StructureRecord], ethresh: float = 0.1) -> list[airsspy.ranking.StructureRecord]
Filter records by energy threshold per atom relative to the minimum.
Groups by formula, computes relative enthalpy per atom within each group, removes structures above ethresh eV/atom. Used to reduce the candidate set before merging. Returns surviving StructureRecord objects.
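A minimal sketch of the per-formula energy window, using plain dicts in place of StructureRecord. The prefilter name and the dict keys (formula, enthalpy, natoms) are hypothetical, and the output order here follows the grouping, which may differ from the real function.

```python
from collections import defaultdict


def prefilter(records, ethresh=0.1):
    """Keep records within ethresh eV/atom of the best record of their formula."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["formula"]].append(rec)

    kept = []
    for recs in groups.values():
        e_min = min(r["enthalpy"] / r["natoms"] for r in recs)
        kept.extend(r for r in recs
                    if r["enthalpy"] / r["natoms"] - e_min <= ethresh)
    return kept


records = [
    {"label": "s1", "formula": "SiO2", "enthalpy": -30.00, "natoms": 3},
    {"label": "s2", "formula": "SiO2", "enthalpy": -29.85, "natoms": 3},
    {"label": "s3", "formula": "SiO2", "enthalpy": -29.40, "natoms": 3},
    {"label": "e1", "formula": "Si", "enthalpy": -5.00, "natoms": 1},
]
survivors = prefilter(records, ethresh=0.1)
print([r["label"] for r in survivors])
```

The minimum is taken separately for each formula, so the single Si record survives even though its enthalpy per atom is far above that of the SiO2 structures.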
- airsspy.ranking.prune_pathological_records(records: list[airsspy.ranking.StructureRecord], tail_fraction: float = 0.1, sigma_factor: float = 3.0, trim_count: int = 1, min_tail_size: int = 5) -> tuple[list[airsspy.ranking.StructureRecord], list[airsspy.ranking.StructureRecord], list[dict]]
Remove suspiciously low-energy records using a trimmed MAD cutoff.
The filter is applied independently for each reduced formula. Energies are compared as enthalpy per atom. For each formula group, the lowest tail_fraction of records is used as the candidate tail, the lowest trim_count of those records are excluded from the baseline statistics, and the cutoff is median - sigma_factor * 1.4826 * MAD.
Returns (kept, rejected, diagnostics). Diagnostics are dictionaries so callers can report skipped groups and per-formula cutoffs without redoing the statistics.
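The cutoff for a single formula group can be sketched as follows. This is one reading of the description, not the module's code: the mad_cutoff name, the rounding used for the tail size, the assumption that groups with a tail smaller than min_tail_size are skipped, and the assumption that records strictly below the cutoff are rejected are all hypothetical.

```python
from statistics import median


def mad_cutoff(energies, tail_fraction=0.1, sigma_factor=3.0,
               trim_count=1, min_tail_size=5):
    """Return the rejection cutoff for one formula group, or None if skipped."""
    ordered = sorted(energies)
    n_tail = round(tail_fraction * len(ordered))  # size of the low-energy tail
    if n_tail < min_tail_size:
        return None  # assumed: too few records for meaningful statistics
    # Baseline: the tail without its lowest trim_count members.
    baseline = ordered[trim_count:n_tail]
    if not baseline:
        return None
    med = median(baseline)
    mad = median([abs(e - med) for e in baseline])
    # 1.4826 scales the MAD to a standard deviation for normal data.
    return med - sigma_factor * 1.4826 * mad


# One pathological record at -9.0 eV/atom below 99 ordinary ones.
energies = [-9.0] + [-5.0 + 0.01 * i for i in range(99)]
cutoff = mad_cutoff(energies)
rejected = [e for e in energies if e < cutoff]
print(round(cutoff, 4), rejected)
```

Excluding the lowest record from the baseline keeps a single outlier from inflating the MAD and hiding itself behind its own cutoff.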
- airsspy.ranking.rank_structures(records: list[airsspy.ranking.StructureRecord], delta_e: float | None = None, top_n: int | None = None, absolute: bool = False) -> list[dict]
Rank structures by enthalpy per formula unit.
Returns a list of output dicts with keys needed for formatting.
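The parameters are not described further here, so the sketch below only illustrates a plausible reading: delta_e as a window above the lowest enthalpy per formula unit, top_n as a cap on the number of rows, and absolute as a switch between absolute and relative values in the output. The rank function and the dict keys are hypothetical.

```python
def rank(records, delta_e=None, top_n=None, absolute=False):
    """Sort by enthalpy per formula unit and apply the assumed filters."""
    ordered = sorted(records, key=lambda r: r["enthalpy_per_fu"])
    if not ordered:
        return []
    e_min = ordered[0]["enthalpy_per_fu"]
    out = []
    for rec in ordered:
        relative = rec["enthalpy_per_fu"] - e_min
        if delta_e is not None and relative > delta_e:
            break  # everything after this point is even higher in energy
        shown = rec["enthalpy_per_fu"] if absolute else relative
        out.append({"label": rec["label"], "enthalpy": shown})
    return out[:top_n]  # top_n=None keeps every row


records = [
    {"label": "b", "enthalpy_per_fu": -9.90},
    {"label": "a", "enthalpy_per_fu": -10.00},
    {"label": "c", "enthalpy_per_fu": -9.00},
]
print(rank(records))
print(rank(records, delta_e=0.5))
print(rank(records, top_n=1, absolute=True))
```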
- airsspy.ranking.summary_structures(records: list[airsspy.ranking.StructureRecord], delta_e: float | None = None) -> list[dict]
Return only the most stable structure per composition.
Output matches cryan’s -s flag.
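Selecting one structure per composition amounts to a grouped minimum. In this sketch the most_stable_per_formula name, the dict keys and the final ordering by enthalpy are assumptions, and the delta_e parameter is left out.

```python
def most_stable_per_formula(records):
    """Keep the lowest-enthalpy record for each formula."""
    best = {}
    for rec in records:
        current = best.get(rec["formula"])
        if current is None or rec["enthalpy_per_fu"] < current["enthalpy_per_fu"]:
            best[rec["formula"]] = rec
    return sorted(best.values(), key=lambda r: r["enthalpy_per_fu"])


records = [
    {"label": "q1", "formula": "SiO2", "enthalpy_per_fu": -23.9},
    {"label": "q2", "formula": "SiO2", "enthalpy_per_fu": -24.1},
    {"label": "d1", "formula": "Si", "enthalpy_per_fu": -5.4},
]
print([r["label"] for r in most_stable_per_formula(records)])
```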
- airsspy.ranking.format_header(show_spin: bool = False, summary_mode: bool = False, long_labels: bool = False) -> str
Format the header line (goes to stderr).
- airsspy.ranking.format_rank_line(rec: dict, long_labels: bool = False, show_spin: bool = False, summary_mode: bool = False) -> str
Format a single ranked record as a cryan-compatible output line.
- airsspy.ranking.plot_maxwell(ranked: list[dict], elements: list[str]) -> object
Build a cryan-style convex hull plot using plotly.
Returns a plotly go.Figure. The layout has a main panel showing formation enthalpy vs composition with all structures as scatter points and the convex hull as lines, plus a right-hand panel listing the stable phases (formula, nfu, space group, composition).
Args:
  ranked: Output dicts from maxwell_construction().
  elements: Element list (2 for binary).