OpenAI previews GPT-5.6 Sol, Terra, and Luna with stronger coding and cyber benchmarks

OpenAI has started a limited preview of GPT-5.6, a new three-model family led by GPT-5.6 Sol, alongside GPT-5.6 Terra and GPT-5.6 Luna.

This is not a broad ChatGPT rollout yet. OpenAI says Sol, Terra, and Luna are currently available through the OpenAI API and Codex to a small group of trusted partners and organizations. GPT-5.6 is not available in ChatGPT during the preview; OpenAI says it plans broader availability across ChatGPT, Codex, and the API in the coming weeks.

What OpenAI announced

OpenAI describes the three models as a tiered family:

GPT-5.6 Sol - the flagship and most capable model
GPT-5.6 Terra - a strong lower-cost option
GPT-5.6 Luna - the fastest and most cost-efficient model

The launch is being framed around software engineering, computer use, professional knowledge work, scientific research, and cybersecurity. OpenAI is also introducing a new max reasoning effort for Sol, plus an ultra mode that can use subagents for more complex work.

The biggest caveat is access. OpenAI says the preview is not a public self-service program, has no public application or waitlist, and is limited to approved organizations with an OpenAI account representative.

The benchmark picture

OpenAI has not published a full benchmark suite yet. It says it will share expanded evaluation results when GPT-5.6 becomes broadly available. For now, the official numbers focus on coding, biology, cybersecurity, safety, and external preparedness evaluations.

Headline benchmark claims

Area	OpenAI-reported result
Coding	GPT-5.6 Sol sets a new state of the art on Terminal-Bench 2.1, which tests command-line workflows requiring planning, iteration, and tool coordination.
Biology	GPT-5.6 Sol beats GPT-5.5 on GeneBench v1 while using fewer tokens.
Cybersecurity	GPT-5.6 Sol is competitive with Mythos Preview on ExploitBench while using about one-third of the output tokens.
Cyber capability scaling	Sol, Terra, and Luna all show stronger results on ExploitGym as reasoning effort is increased.

The most concrete public numbers are in the GPT-5.6 preview system card, especially around cybersecurity and biological capability testing.

Cybersecurity benchmarks

Benchmark	GPT-5.6 Sol result	Comparison noted by OpenAI
FrontierCyber	19/197 challenges solved	GPT-5.5 solved fewer across Easy, Medium, and Hard buckets, with both models at 0% on Elite.
FrontierCyber Easy	5/44, or 11%	GPT-5.5: 3/44, or 6%.
FrontierCyber Medium	10/77, or 12%	GPT-5.5: 5/80, or 6%.
FrontierCyber Hard	4/67, or 5%	GPT-5.5: 3/69, or 4%.
FrontierCyber Elite	0/9, or 0%	GPT-5.5: 0/12, or 0%.
CyScenarioBench	7/11 long-horizon challenges solved; 28% average success	About 3 percentage points above GPT-5.5.
Atomic Challenges	All 22 medium- and hard-difficulty challenges solved at least once	Similar average success rates to GPT-5.5 across the reported categories.
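
The FrontierCyber rows can be cross-checked with simple arithmetic. The sketch below uses only the counts from the table; the helper names are illustrative. It confirms that Sol's four buckets add up to the 19/197 headline figure. It also shows that the bucket sizes listed for GPT-5.5 differ from Sol's in Medium, Hard, and Elite, so the comparison is not strictly like-for-like. The GPT-5.5 total it computes (11/205) is derived from the rows above, not a figure OpenAI states directly. Finally, the listed percentages match truncation rather than rounding (10/77 is about 12.99%); the source does not say how they were computed.

```python
from math import floor

# (solved, total) per FrontierCyber bucket, copied from the table above.
SOL = {"Easy": (5, 44), "Medium": (10, 77), "Hard": (4, 67), "Elite": (0, 9)}
GPT_5_5 = {"Easy": (3, 44), "Medium": (5, 80), "Hard": (3, 69), "Elite": (0, 12)}


def totals(buckets):
    """Sum solved and total counts across all buckets."""
    solved = sum(s for s, _ in buckets.values())
    total = sum(t for _, t in buckets.values())
    return solved, total


def percent(solved, total):
    """Exact solve rate as a percentage."""
    return 100 * solved / total


for label, buckets in (("GPT-5.6 Sol", SOL), ("GPT-5.5", GPT_5_5)):
    solved, total = totals(buckets)
    print(f"{label}: {solved}/{total} overall ({percent(solved, total):.2f}%)")
    for name, (s, t) in buckets.items():
        exact = percent(s, t)
        # Show both conventions so the reported figure can be matched by eye.
        print(f"  {name}: {s}/{t} = {exact:.2f}% "
              f"(rounded {round(exact)}%, truncated {floor(exact)}%)")
```

Running it makes the gap between the two conventions visible: Sol's Medium and Hard buckets would read 13% and 6% if rounded, against the 12% and 5% in the table.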

OpenAI classifies GPT-5.6 Sol as High capability in cybersecurity, but below its Critical threshold. Terra and Luna also reach the High threshold in cybersecurity, though OpenAI says they are less capable overall than Sol.

That matters because OpenAI is clearly treating GPT-5.6 as a more capable dual-use system, not just a routine model update. The company says Sol can identify bugs and exploitation primitives, but that it did not autonomously produce a functional full-chain exploit in the Chromium and Firefox evaluations OpenAI ran.

Biology benchmarks

OpenAI says GPT-5.6 Sol or a railfree variant reached the highest scores to date on several expert-level biology evaluations used by SecureBio:

Biology evaluation	Strongest reported GPT-5.6 score
Virology Capabilities Test	53.5%
Molecular Biology Capabilities Test	60.0%
Human Pathogen Capabilities Test	68.4%
World-Class Bio	68.3%
ReproBAIT	85% for the railfree checkpoint

The World-Class Bio score is roughly 9 percentage points above GPT-5.5, which OpenAI lists at 59.7%.
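
A gap in percentage points is not the same as a relative improvement, and the two are easy to conflate when reading benchmark tables. As a minimal sketch using only the two World-Class Bio scores quoted above (the function names are illustrative):

```python
def point_gap(new_score, old_score):
    """Absolute difference between two scores, in percentage points."""
    return new_score - old_score


def relative_gain(new_score, old_score):
    """Improvement relative to the old score, as a percentage."""
    return 100 * (new_score - old_score) / old_score


sol_score = 68.3   # GPT-5.6 score reported for World-Class Bio
prev_score = 59.7  # GPT-5.5 score reported for World-Class Bio

print(f"Gap: {point_gap(sol_score, prev_score):.1f} percentage points")
print(f"Relative gain: {relative_gain(sol_score, prev_score):.1f}%")
```

The gap works out to 8.6 percentage points, which is about a 14.4% relative improvement over the GPT-5.5 score.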

Pricing during preview

OpenAI's help center lists GPT-5.6 preview pricing per 1 million tokens:

Model	Model ID	Input	Output
GPT-5.6 Sol	`gpt-5.6-sol`	$5.00	$30.00
GPT-5.6 Terra	`gpt-5.6-terra`	$2.50	$15.00
GPT-5.6 Luna	`gpt-5.6-luna`	$1.00	$6.00

OpenAI also says GPT-5.6 adds more predictable prompt caching, including explicit cache breakpoints and a 30-minute minimum cache life. Cache writes are billed at 1.25x the uncached input rate, while cache reads still receive the 90% cached-input discount.
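
To see what these rates mean for a workload, here is a rough cost estimator built from the preview price table and the caching multipliers above. It is a sketch under stated assumptions, not OpenAI's billing logic: it assumes each input token is billed in exactly one of three ways (uncached, cache write, or cache read), that the 90% discount means cache reads cost 0.10x the input rate, and that output tokens are unaffected by caching. The function and parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Preview prices from the table above.
PRICES = {
    "gpt-5.6-sol": Price(5.00, 30.00),
    "gpt-5.6-terra": Price(2.50, 15.00),
    "gpt-5.6-luna": Price(1.00, 6.00),
}

CACHE_WRITE_MULTIPLIER = 1.25  # cache writes: 1.25x the uncached input rate
CACHE_READ_MULTIPLIER = 0.10   # cache reads: 90% discount (assumed to mean 0.10x)


def request_cost(model, uncached_in=0, cache_write_in=0, cache_read_in=0, output=0):
    """Estimate the USD cost of one request from its token counts.

    Assumption: every input token falls into exactly one of the three
    input categories. Actual billing may differ.
    """
    price = PRICES[model]
    weighted_input = (
        uncached_in
        + cache_write_in * CACHE_WRITE_MULTIPLIER
        + cache_read_in * CACHE_READ_MULTIPLIER
    )
    return (weighted_input * price.input_per_m + output * price.output_per_m) / 1_000_000


# Hypothetical workload: a 100k-token shared prefix, 2k fresh input tokens,
# and 1k output tokens per request on Sol.
first_call = request_cost("gpt-5.6-sol", uncached_in=2_000, cache_write_in=100_000, output=1_000)
later_call = request_cost("gpt-5.6-sol", uncached_in=2_000, cache_read_in=100_000, output=1_000)
no_cache = request_cost("gpt-5.6-sol", uncached_in=102_000, output=1_000)

print(f"first call (cache write): ${first_call:.3f}")
print(f"later call (cache read):  ${later_call:.3f}")
print(f"same call without cache:  ${no_cache:.3f}")
```

Under these assumptions the first call costs about $0.665, each later call about $0.09, and an uncached call about $0.54. The 25% write premium is therefore recovered on the first reuse of the prefix, provided that reuse happens while the cache entry is still alive.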

Why access is limited

The unusual part of this launch is the release path. OpenAI says it previewed GPT-5.6 plans and capabilities with the U.S. government ahead of launch, and is beginning with a small group of trusted partners whose participation has been shared with the government.

OpenAI says this is a short-term step tied to cyber risk coordination, not the long-term default it wants for model releases. The company says the goal is broader availability in the coming weeks while it continues testing, coordination, and safety work.

Our take

GPT-5.6 looks less like a simple consumer upgrade and more like a controlled release of a higher-capability agentic model family.

The most important facts are:

OpenAI is calling Sol its strongest model yet.
GPT-5.6 is currently limited to approved API and Codex partners.
ChatGPT users do not have access during the preview.
The official benchmark story is strongest around command-line coding, biology, and cybersecurity.
The safety story is central, especially because all three models reach OpenAI's High cybersecurity capability threshold.

For builders, the practical message is simple: GPT-5.6 is worth watching closely, but it is not a normal self-serve model upgrade yet. The real test will come when OpenAI publishes the expanded benchmark suite and opens broader access.

Sources: OpenAI launch announcement · OpenAI Help Center preview notes · GPT-5.6 Preview System Card