ASTRA
ASTRA (Agentic Schema for Transparent Research Analysis) is a YAML specification for scientific analyses. Code captures execution but not the full structure of an analysis. An astra.yaml file records the inputs, outputs, methodological choices, and evidence behind an experiment, making the work easier to reproduce and extend.
Agents are expanding the scale and speed of science, which shifts the bottleneck from producing results to trusting them. astra.yaml gives each experiment a structured record for agents to follow, keeping assumptions, choices, and evidence explicit.
ASTRA is intentionally agnostic to the execution harness so that agents, workflow runners, and humans can all read and act on the same spec.
Alpha development
ASTRA is in early alpha. The schema, CLI, and tooling are all still moving, so expect breaking changes between minor versions and pin the schema version in your analyses. Bug reports, design challenges, and use cases that the spec doesn't yet cover are exactly what we want to hear at this stage; please open an issue on the GitHub repo, join the discussion in the Community tab, or come help define the schema at the Agentic AI 4 Science Developer Summit.
Why ASTRA?
Scientific results depend on methodological choices: which data to include, how to handle outliers, which prior to assume, and so on. In ordinary research code, those choices are often scattered across notebooks, scripts, comments, and memory. This makes results hard to reproduce, audit, and extend.
ASTRA gives every methodological choice an explicit place in the spec. Each decision names the options that were considered and links to evidence. A universe then fixes one option per decision and records the results for that set of choices.
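To make "a given set of choices" concrete, here is a minimal Python sketch that enumerates every combination of one option per decision. It is not part of ASTRA's tooling. The decision and option names are hypothetical, loosely based on the examples above, and the plain dictionary is only an assumed in-memory stand-in for the decisions section of a spec.

```python
from itertools import product

# Hypothetical decisions: decision id -> option ids that were considered.
# These names are illustrative and do not come from the ASTRA schema.
decisions = {
    "outlier_handling": ["keep_all", "sigma_clip"],
    "prior": ["flat", "gaussian"],
}


def enumerate_universes(decisions):
    """Return every combination that picks exactly one option per decision."""
    names = list(decisions)
    option_lists = [decisions[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*option_lists)]


universes = enumerate_universes(decisions)
for universe in universes:
    print(universe)
```

With two decisions of two options each, the design space holds four candidate universes. Writing any one of them down explicitly is what ties a result to the choices that produced it.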
At a glance
In ASTRA, an analysis declares the design space, a universe picks one option per decision, and the CLI validates and inspects both. Below is an example astra.yaml, followed by a universe file that selects the default option. For a detailed walkthrough of the spec, see our specification explained.
# astra.yaml
version: "1.0"
name: Period-Luminosity Fit
inputs:
  - id: catalog_data
    type: data
    source: data/catalog_data.csv
    description: Periods and mean apparent magnitudes.
outputs:
  - id: fit_params
    type: table
    description: Slope, intercept, and scatter for the fitted relation.
    inputs: [catalog_data]
    decisions: [fit_method]
    recipe:
      command: >-
        python src/fit_period_luminosity.py
        --catalog {inputs.catalog_data}
        --method {decisions.fit_method}
        --out {output}
decisions:
  fit_method:
    label: Fitting method
    rationale: The fitting method determines how outliers influence the inferred relation.
    default: ordinary_least_squares
    options:
      ordinary_least_squares:
        label: Ordinary least squares
      robust_linear:
        label: Robust linear fit

# universes/baseline.yaml
id: baseline
description: Default configuration for the period-luminosity fit.
decisions:
  fit_method: ordinary_least_squares
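
The relationship between the two files can be sketched in Python. This is an illustration under stated assumptions, not the ASTRA CLI's implementation. It assumes that a universe may only name declared decisions and options, that an omitted decision falls back to its default, and that the recipe placeholders are filled by plain string substitution. The helper names and the output path results/fit_params.csv are hypothetical.

```python
import re

# Plain-dict mirror of the example spec, limited to the fields this sketch needs.
analysis = {
    "inputs": {"catalog_data": "data/catalog_data.csv"},
    "decisions": {
        "fit_method": {
            "default": "ordinary_least_squares",
            "options": ["ordinary_least_squares", "robust_linear"],
        }
    },
    "command": (
        "python src/fit_period_luminosity.py "
        "--catalog {inputs.catalog_data} "
        "--method {decisions.fit_method} "
        "--out {output}"
    ),
}
universe = {"id": "baseline", "decisions": {"fit_method": "ordinary_least_squares"}}


def resolve_choices(analysis, universe):
    """Return one option per declared decision (assumed fallback: the default)."""
    declared = analysis["decisions"]
    chosen = universe.get("decisions", {})
    unknown = sorted(set(chosen) - set(declared))
    if unknown:
        raise ValueError(f"universe names undeclared decisions: {unknown}")
    resolved = {}
    for name, spec in declared.items():
        option = chosen.get(name, spec["default"])
        if option not in spec["options"]:
            raise ValueError(f"{name}: {option!r} is not a declared option")
        resolved[name] = option
    return resolved


def render_command(template, inputs, decisions, output):
    """Fill {inputs.<id>}, {decisions.<id>}, and {output} placeholders."""
    values = {"output": output}
    values.update({f"inputs.{key}": value for key, value in inputs.items()})
    values.update({f"decisions.{key}": value for key, value in decisions.items()})

    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"unresolved placeholder: {{{key}}}")
        return values[key]

    return re.sub(r"\{([\w.]+)\}", substitute, template)


choices = resolve_choices(analysis, universe)
command = render_command(
    analysis["command"],
    analysis["inputs"],
    choices,
    output="results/fit_params.csv",  # hypothetical output location
)
print(command)
```

Rejecting undeclared decisions and options up front means a typo in a universe file fails loudly instead of silently running the default.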