OpenAI has started a limited preview of GPT-5.6, a new three-model family led by GPT-5.6 Sol, alongside GPT-5.6 Terra and GPT-5.6 Luna.
This is not a broad ChatGPT rollout yet. OpenAI says Sol, Terra, and Luna are currently available through the OpenAI API and Codex to a small group of trusted partners and organizations. The company says GPT-5.6 is not available in ChatGPT during the preview, with broader availability planned for ChatGPT, Codex, and the API in the coming weeks.
What OpenAI announced
OpenAI describes the three models as a tiered family:
- GPT-5.6 Sol - the flagship and most capable model
- GPT-5.6 Terra - a strong lower-cost option
- GPT-5.6 Luna - the fastest and most cost-efficient model
The launch is being framed around software engineering, computer use, professional knowledge work, scientific research, and cybersecurity. OpenAI is also introducing a new max reasoning effort for Sol, plus an ultra mode that can use subagents for more complex work.
The biggest caveat is access. OpenAI says the preview is not a public self-service program, has no public application or waitlist, and is limited to approved organizations with an OpenAI account representative.
The benchmark picture
OpenAI has not published a full benchmark suite yet. It says it will share expanded evaluation results when GPT-5.6 becomes broadly available. For now, the official numbers focus on coding, biology, cybersecurity, safety, and external preparedness evaluations.
Headline benchmark claims
| Area | OpenAI-reported result |
|---|---|
| Coding | GPT-5.6 Sol sets a new state of the art on Terminal-Bench 2.1, which tests command-line workflows requiring planning, iteration, and tool coordination. |
| Biology | GPT-5.6 Sol beats GPT-5.5 on GeneBench v1 while using fewer tokens. |
| Cybersecurity | GPT-5.6 Sol is competitive with Mythos Preview on ExploitBench while using about one-third of the output tokens. |
| Cyber capability scaling | Sol, Terra, and Luna all show stronger results on ExploitGym as reasoning effort is increased. |
The most concrete public numbers are in the GPT-5.6 preview system card, especially around cybersecurity and biological capability testing.
Cybersecurity benchmarks
| Benchmark | GPT-5.6 Sol result | Comparison noted by OpenAI |
|---|---|---|
| FrontierCyber | 19/197 challenges solved | GPT-5.5 solved fewer across Easy, Medium, and Hard buckets, with both models at 0% on Elite. |
| FrontierCyber Easy | 5/44, or 11% | GPT-5.5: 3/44, or 6%. |
| FrontierCyber Medium | 10/77, or 12% | GPT-5.5: 5/80, or 6%. |
| FrontierCyber Hard | 4/67, or 5% | GPT-5.5: 3/69, or 4%. |
| FrontierCyber Elite | 0/9, or 0% | GPT-5.5: 0/12, or 0%. |
| CyScenarioBench | 7/11 long-horizon challenges solved; 28% average success | About 3 percentage points above GPT-5.5. |
| Atomic Challenges | All 22 medium- and hard-difficulty challenges solved at least once | Similar average success rates to GPT-5.5 across the reported categories. |
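To sanity-check the FrontierCyber rows, here is a minimal sketch that recomputes the totals and per-bucket rates from the solved/total counts in the table. The function names are ours, not OpenAI's. One observation from running the numbers: the reported percentages match the whole-number part of each ratio (truncation) rather than rounding to the nearest percent, so the sketch truncates as well. Note that the GPT-5.5 bucket sizes differ slightly from Sol's, exactly as reported.

```python
# Solved/total counts per difficulty bucket, copied from the table above.
# The GPT-5.5 denominators differ from Sol's in the reported figures.
FRONTIER_CYBER = {
    "GPT-5.6 Sol": {"Easy": (5, 44), "Medium": (10, 77), "Hard": (4, 67), "Elite": (0, 9)},
    "GPT-5.5": {"Easy": (3, 44), "Medium": (5, 80), "Hard": (3, 69), "Elite": (0, 12)},
}


def truncated_percent(solved: int, total: int) -> int:
    """Whole-number percentage, truncated (not rounded), as the table appears to do."""
    return 100 * solved // total


def overall(buckets: dict) -> tuple:
    """Sum solved and total challenge counts across all difficulty buckets."""
    solved = sum(s for s, _ in buckets.values())
    total = sum(t for _, t in buckets.values())
    return solved, total


for model, buckets in FRONTIER_CYBER.items():
    solved, total = overall(buckets)
    per_bucket = {name: truncated_percent(s, t) for name, (s, t) in buckets.items()}
    print(f"{model}: {solved}/{total} overall ({100 * solved / total:.1f}%), per bucket {per_bucket}")
```

Summing the buckets reproduces the 19/197 headline figure for Sol, which works out to a little under 10% overall. The same sum for GPT-5.5 gives 11/205.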
OpenAI classifies GPT-5.6 Sol as High capability in cybersecurity, but below its Critical threshold. Terra and Luna also reach the High threshold in cybersecurity, though OpenAI says they are less capable overall than Sol.
That matters because OpenAI is clearly treating GPT-5.6 as a more capable dual-use system, not just a routine model update. The company says Sol can identify bugs and exploitation primitives, but did not autonomously produce a functional full-chain exploit in tested Chromium and Firefox evaluations.
Biology benchmarks
OpenAI says GPT-5.6 Sol or a railfree variant reached the highest scores to date on several expert-level biology evaluations used by SecureBio:
| Biology evaluation | Strongest reported GPT-5.6 score |
|---|---|
| Virology Capabilities Test | 53.5% |
| Molecular Biology Capabilities Test | 60.0% |
| Human Pathogen Capabilities Test | 68.4% |
| World-Class Bio | 68.3% |
| ReproBAIT | 85% for the railfree checkpoint |
The World-Class Bio score is 8.6 percentage points above GPT-5.5, which OpenAI lists at 59.7%.
Pricing during preview
OpenAI's help center lists GPT-5.6 preview pricing per 1 million tokens:
| Model | Model ID | Input | Output |
|---|---|---|---|
| GPT-5.6 Sol | gpt-5.6-sol | $5.00 | $30.00 |
| GPT-5.6 Terra | gpt-5.6-terra | $2.50 | $15.00 |
| GPT-5.6 Luna | gpt-5.6-luna | $1.00 | $6.00 |
OpenAI also says GPT-5.6 adds more predictable prompt caching, including explicit cache breakpoints and a 30-minute minimum cache life. Cache writes are billed at 1.25x the uncached input rate, while cache reads still receive the 90% cached-input discount.
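To see what these rates mean for a real workload, here is a rough cost estimator based on the preview prices and caching multipliers above. It is a sketch under our own assumptions, not OpenAI's billing logic: we assume cache-write tokens are billed at 1.25x the input rate instead of (not in addition to) the normal input rate, that the 90% discount means cache reads cost 10% of the input rate, and that output tokens are billed at a flat rate. The function and parameter names are hypothetical.

```python
# Preview prices in USD per 1 million tokens (input, output), from the table above.
PRICES = {
    "gpt-5.6-sol": (5.00, 30.00),
    "gpt-5.6-terra": (2.50, 15.00),
    "gpt-5.6-luna": (1.00, 6.00),
}

CACHE_WRITE_MULTIPLIER = 1.25  # cache writes: 1.25x the uncached input rate
CACHE_READ_MULTIPLIER = 0.10   # cache reads: 90% discount (assumed to apply to the input rate)


def estimate_cost(model: str, uncached_input: int = 0, cache_write: int = 0,
                  cache_read: int = 0, output: int = 0) -> float:
    """Estimate the USD cost of one request from its token counts.

    Assumption: each input token falls into exactly one of the three
    input categories (uncached, cache write, cache read).
    """
    input_rate, output_rate = PRICES[model]
    weighted_input = (
        uncached_input
        + cache_write * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return (weighted_input * input_rate + output * output_rate) / 1_000_000


# Example: a 100k-token shared prefix plus 2k fresh input and 1k output per call on Sol.
first_call = estimate_cost("gpt-5.6-sol", uncached_input=2_000, cache_write=100_000, output=1_000)
later_call = estimate_cost("gpt-5.6-sol", uncached_input=2_000, cache_read=100_000, output=1_000)
no_cache = estimate_cost("gpt-5.6-sol", uncached_input=102_000, output=1_000)
print(f"first call: ${first_call:.3f}, later calls: ${later_call:.3f}, uncached: ${no_cache:.3f}")
```

Under these assumptions, the first call costs about $0.665, each later cache hit about $0.09, and the same request without caching about $0.54. The write premium pays for itself on the first cache hit, provided that hit arrives within the cache lifetime.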
Why access is limited
The unusual part of this launch is the release path. OpenAI says it previewed GPT-5.6 plans and capabilities with the U.S. government ahead of launch, and is beginning with a small group of trusted partners whose participation has been shared with the government.
OpenAI says this is a short-term step tied to cyber risk coordination, not the long-term default it wants for model releases. The company says the goal is broader availability in the coming weeks while it continues testing, coordination, and safety work.
Our take
GPT-5.6 looks less like a simple consumer upgrade and more like a controlled release of a higher-capability agentic model family.
The most important facts are:
- OpenAI is calling Sol its strongest model yet.
- GPT-5.6 is currently limited to approved API and Codex partners.
- ChatGPT users do not have access during the preview.
- The official benchmark story is strongest around command-line coding, biology, and cybersecurity.
- The safety story is central, especially because all three models reach OpenAI's High cybersecurity capability threshold.
For builders, the practical message is simple: GPT-5.6 is worth watching closely, but it is not a normal self-serve model upgrade yet. The real test will come when OpenAI publishes the expanded benchmark suite and opens broader access.
Sources: OpenAI launch announcement · OpenAI Help Center preview notes · GPT-5.6 Preview System Card