rgpycrumbs.parsers.chemgp

Parsers for ChemGP JSONL output formats.

The ChemGP Rust examples write JSONL files containing optimizer comparison traces, Gaussian process (GP) quality grids, and random Fourier feature (RFF) approximation benchmarks. This module parses those files into typed containers for downstream plotting.

Added in version 1.5.0.

Classes

OptimizerTrace

Single optimizer trace from a comparison JSONL.

ComparisonData

Parsed optimizer comparison from a single JSONL file.

RFFQualityData

Parsed RFF approximation quality data.

GPQualityGrid

GP quality grid data for a single training set size.

StationaryPoint

A stationary point (minimum or saddle) on the potential energy surface (PES).

GPQualityData

Complete GP quality data from mb_gp_quality.jsonl.

Functions

parse_comparison_jsonl(path) → ComparisonData

Parse a ChemGP optimizer comparison JSONL file.

parse_rff_quality_jsonl(path) → RFFQualityData

Parse a ChemGP RFF quality JSONL file.

parse_gp_quality_jsonl(path) → GPQualityData

Parse a ChemGP GP quality JSONL file.

Module Contents

class rgpycrumbs.parsers.chemgp.OptimizerTrace

Single optimizer trace from a comparison JSONL.

Attributes

method : str

Optimizer name (e.g. "gp_minimize", "neb", "otgpd").

steps : list[int]

Step indices.

oracle_calls : list[int]

Cumulative oracle call counts.

energies : list[float] | None

Energy at each step (minimize, dimer).

forces : list[float] | None

Force norm at each step (dimer: force, NEB: max_force).

method: str
steps: list[int] = []
oracle_calls: list[int] = []
energies: list[float] | None = None
forces: list[float] | None = None
class rgpycrumbs.parsers.chemgp.ComparisonData

Parsed optimizer comparison from a single JSONL file.

Attributes

traces : dict[str, OptimizerTrace]

Keyed by method name.

summary : dict | None

Summary record if present.

traces: dict[str, OptimizerTrace]
summary: dict[str, Any] | None = None
rgpycrumbs.parsers.chemgp.parse_comparison_jsonl(path: str | pathlib.Path) → ComparisonData

Parse a ChemGP optimizer comparison JSONL file.

Handles the minimize, dimer, and NEB comparison formats. Each line is a JSON object that carries either a method field (a trace record) or summary: true (the summary record).

Parameters

path

Path to the JSONL file.

Returns

ComparisonData

Parsed traces keyed by method name.
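The line format described above can be illustrated with a minimal stand-alone sketch. It is not the rgpycrumbs implementation: Trace is a hypothetical stand-in for OptimizerTrace, sketch_parse_comparison is an illustrative name, and the record keys step, oracle_calls, and energy are assumptions about the JSONL layout. Only method, summary, force, and max_force are named on this page.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class Trace:
    """Hypothetical stand-in mirroring the documented OptimizerTrace fields."""
    method: str
    steps: list = field(default_factory=list)
    oracle_calls: list = field(default_factory=list)
    energies: Optional[list] = None
    forces: Optional[list] = None


def sketch_parse_comparison(path):
    """Group JSONL records by their "method" field; keep the summary record apart."""
    traces = {}
    summary = None
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        rec = json.loads(line)
        if rec.get("summary") is True:
            summary = rec
            continue
        trace = traces.setdefault(rec["method"], Trace(method=rec["method"]))
        # "step", "oracle_calls", and "energy" are assumed key names.
        trace.steps.append(rec["step"])
        trace.oracle_calls.append(rec["oracle_calls"])
        if "energy" in rec:
            if trace.energies is None:
                trace.energies = []
            trace.energies.append(rec["energy"])
        # Dimer records are described as using "force", NEB records "max_force".
        force = rec.get("force", rec.get("max_force"))
        if force is not None:
            if trace.forces is None:
                trace.forces = []
            trace.forces.append(force)
    return traces, summary
```

Leaving energies and forces as None until a record supplies them mirrors the documented defaults, so a plotting routine can skip a panel when a format does not report that quantity.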

class rgpycrumbs.parsers.chemgp.RFFQualityData

Parsed RFF approximation quality data.

Attributes

exact_energy_mae : float

Exact GP energy MAE vs true surface.

exact_gradient_mae : float

Exact GP gradient MAE vs true surface.

d_rff_values : list[int]

RFF feature counts tested.

energy_mae_vs_true : list[float]

RFF energy MAE vs true surface.

gradient_mae_vs_true : list[float]

RFF gradient MAE vs true surface.

energy_mae_vs_gp : list[float]

RFF energy MAE vs exact GP.

gradient_mae_vs_gp : list[float]

RFF gradient MAE vs exact GP.

exact_energy_mae: float = 0.0
exact_gradient_mae: float = 0.0
d_rff_values: list[int] = []
energy_mae_vs_true: list[float] = []
gradient_mae_vs_true: list[float] = []
energy_mae_vs_gp: list[float] = []
gradient_mae_vs_gp: list[float] = []
rgpycrumbs.parsers.chemgp.parse_rff_quality_jsonl(path: str | pathlib.Path) → RFFQualityData

Parse a ChemGP RFF quality JSONL file.

Parameters

path

Path to the JSONL file.

Returns

RFFQualityData

Parsed exact GP and RFF metrics.
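As an illustration of downstream use, the sketch below picks the smallest RFF feature count whose energy MAE against the exact GP falls under a tolerance. It assumes the per-feature-count lists are index-aligned with d_rff_values. RFFQuality is a hypothetical stand-in carrying a subset of the documented fields, smallest_sufficient_d_rff is an illustrative helper, and the numbers are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RFFQuality:
    """Hypothetical stand-in with a subset of the documented RFFQualityData fields."""
    exact_energy_mae: float = 0.0
    d_rff_values: list = field(default_factory=list)
    energy_mae_vs_gp: list = field(default_factory=list)


def smallest_sufficient_d_rff(data: RFFQuality, tol: float) -> Optional[int]:
    """Return the smallest feature count whose energy MAE vs the exact GP is <= tol."""
    hits = [
        d
        for d, mae in zip(data.d_rff_values, data.energy_mae_vs_gp)
        if mae <= tol
    ]
    return min(hits) if hits else None


# Made-up numbers, for illustration only.
demo = RFFQuality(
    exact_energy_mae=0.02,
    d_rff_values=[50, 100, 200, 400],
    energy_mae_vs_gp=[0.30, 0.12, 0.04, 0.01],
)
print(smallest_sufficient_d_rff(demo, tol=0.05))
```

Returning None when no feature count meets the tolerance keeps the "not reached" case explicit for the caller.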

class rgpycrumbs.parsers.chemgp.GPQualityGrid

GP quality grid data for a single training set size.

Attributes

n_train : int

Number of training points.

nx : int

Grid x resolution.

ny : int

Grid y resolution.

x : list[list[float]]

Grid x coordinates (ny x nx).

y : list[list[float]]

Grid y coordinates (ny x nx).

true_e : list[list[float]]

True energy on grid.

gp_e : list[list[float]]

GP predicted energy on grid.

gp_var : list[list[float]]

GP variance on grid.

train_x : list[float]

Training point x coordinates.

train_y : list[float]

Training point y coordinates.

train_e : list[float]

Training point energies.

n_train: int = 0
nx: int = 0
ny: int = 0
x: list[list[float]] = []
y: list[list[float]] = []
true_e: list[list[float]] = []
gp_e: list[list[float]] = []
gp_var: list[list[float]] = []
train_x: list[float] = []
train_y: list[float] = []
train_e: list[float] = []
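The grid fields of GPQualityGrid are nested lists. Assuming true_e and gp_e share the (ny x nx) layout documented for x and y, each holds ny rows of nx values. The sketch below computes the mean absolute error of the GP prediction over such a grid, using plain lists as stand-ins for those two fields. The helper name grid_mae and the values are illustrative only.

```python
def grid_mae(true_e, gp_e):
    """Mean absolute error between two equally shaped nested lists (ny rows, nx columns)."""
    total = 0.0
    count = 0
    for true_row, gp_row in zip(true_e, gp_e):
        for t, g in zip(true_row, gp_row):
            total += abs(t - g)
            count += 1
    if count == 0:
        raise ValueError("empty grid")
    return total / count


# Illustrative 2 x 3 grid (ny=2, nx=3); the values are made up.
true_e = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]]
gp_e = [[0.5, 1.0, 2.0], [1.0, 1.5, 3.0]]
print(grid_mae(true_e, gp_e))
```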
class rgpycrumbs.parsers.chemgp.StationaryPoint

A stationary point (minimum or saddle) on the PES.

kind: str
id: int
x: float
y: float
energy: float
class rgpycrumbs.parsers.chemgp.GPQualityData

Complete GP quality data from mb_gp_quality.jsonl.

Attributes

meta : dict

Grid metadata (nx, ny, x_min, x_max, y_min, y_max).

stationary : list[StationaryPoint]

Minima and saddle points.

grids : dict[int, GPQualityGrid]

Grid data keyed by n_train.

meta: dict[str, Any]
stationary: list[StationaryPoint] = []
grids: dict[int, GPQualityGrid]
rgpycrumbs.parsers.chemgp.parse_gp_quality_jsonl(path: str | pathlib.Path) → GPQualityData

Parse a ChemGP GP quality JSONL file.

Parameters

path

Path to the JSONL file (e.g. mb_gp_quality.jsonl).

Returns

GPQualityData

Structured grid data with metadata and stationary points.
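A short sketch of how the returned container might be consumed: walk the grids in order of increasing n_train and pick out the lowest-energy minimum among the stationary points. Point and Quality are hypothetical stand-ins for StationaryPoint and GPQualityData, summarize is an illustrative helper, the kind label "minimum" is an assumption about how minima are tagged, and the demo values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    """Hypothetical stand-in for StationaryPoint."""
    kind: str
    id: int
    x: float
    y: float
    energy: float


@dataclass
class Quality:
    """Hypothetical stand-in for GPQualityData; grid values are placeholders."""
    meta: dict = field(default_factory=dict)
    stationary: list = field(default_factory=list)
    grids: dict = field(default_factory=dict)


def summarize(data):
    """Return training-set sizes in ascending order and the lowest-energy minimum (or None)."""
    sizes = sorted(data.grids)
    # The "minimum" label is an assumed value of the kind field.
    minima = [p for p in data.stationary if p.kind == "minimum"]
    lowest = min(minima, key=lambda p: p.energy) if minima else None
    return sizes, lowest


# Made-up values, for illustration only; grids maps n_train to a placeholder.
demo = Quality(
    meta={"nx": 2, "ny": 2},
    stationary=[
        Point("minimum", 0, -0.5, 1.5, -1.2),
        Point("saddle", 0, 0.2, 0.3, -0.4),
        Point("minimum", 1, 0.6, 0.0, -0.9),
    ],
    grids={30: None, 5: None, 15: None},
)
sizes, lowest = summarize(demo)
print(sizes, lowest.id)
```

Sorting the keys gives a stable panel order for plots that compare GP quality across training set sizes.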