OpenAI Previews GPT-5.6 Sol, a Next-Generation Model for Government and Vetted Partners

OpenAI has previewed GPT-5.6 Sol, a larger frontier model aimed at hard reasoning, coding, agentic work and high-stakes research. Here is what is confirmed, what the benchmarks show, who gets access first, what it costs, and why the safety restrictions matter.


Last checked: June 26, 2026. This article uses OpenAI's June 26 post, "Previewing GPT-5.6 Sol: a next-generation model," and OpenAI's GPT-5.6 Sol system card as primary sources. Benchmark and safety results are OpenAI-reported unless stated otherwise. Secondary coverage from The Verge, Axios and The Guardian is used only for market and rollout context.

Quick answer

OpenAI previewed GPT-5.6 Sol on June 26, 2026, positioning it as a next-generation frontier model for difficult reasoning, software engineering, visual reasoning, scientific work and agentic tasks. The release is not a normal consumer launch. OpenAI says Sol access starts with the U.S. government and select vetted partners, with availability through API and Codex before broad ChatGPT access.

The important details:

Question	Current answer
What is GPT-5.6 Sol?	OpenAI's largest model in the GPT-5.6 family, designed for hard reasoning, coding, science, visual reasoning and agentic work.
Is it available to everyone?	No. OpenAI is starting with government and vetted partner access.
Is it in ChatGPT today?	Not broadly. OpenAI says ChatGPT access will come later after inference is optimized.
Where can early users access it?	OpenAI says access begins through API and Codex for approved users.
What does it cost?	OpenAI lists Sol at $5 per 1M input tokens and $30 per 1M output tokens, with a lower cached-input price.
Why is it restricted?	OpenAI's system card classifies the GPT-5.6 family as High Capability in cybersecurity and biological/chemical domains, while saying no evaluated model crossed its Critical threshold.
What is the main caveat?	The benchmark numbers are vendor-reported and should be validated against real workloads before any enterprise or government deployment decision.

The practical takeaway: Sol is not just "a faster chatbot." It is a frontier-model preview built for high-value tasks where stronger reasoning and tool use may matter, and where access control, logging, human review and safety policy matter more as a result.

What OpenAI announced

OpenAI announced GPT-5.6 Sol as a limited preview rather than a general launch. The company says the model is meant for problems that require sustained reasoning, complex tool use and high reliability under difficult conditions.

The company describes Sol as part of a new GPT-5.6 family whose names follow a size metaphor, with the sun, earth and moon standing for the largest, mid-sized and smallest models:

Model family label	OpenAI's positioning	Likely role
Sol	Largest model in the family	Highest-capability work, frontier evaluations, hard agentic tasks and vetted high-stakes deployments.
Terra	Mid-sized model	A balance of capability, speed and cost for broader production use.
Luna	Smaller model	Lower-latency or lower-cost tasks where the biggest model is unnecessary.

The naming matters because OpenAI appears to be separating frontier capability from broad everyday availability. Sol is the flagship model, but OpenAI is not treating it as an instant default for all users.

[Image: OpenAI GPT-5.6 model family illustration showing Sol, Terra and Luna as sun, earth and moon]

Who can use GPT-5.6 Sol first?

OpenAI says early access begins with the U.S. government and select vetted partners working on high-stakes projects with dedicated support. That means ordinary ChatGPT users and most developers should not expect immediate access.

The rollout has three important implications.

First, Sol is being treated as a controlled preview. OpenAI is giving access to a narrower group before general availability so it can collect feedback, observe high-stakes behavior and refine safeguards.

Second, API and Codex come before broad ChatGPT access. That signals OpenAI expects Sol to be most useful in tool-using workflows such as software engineering, research, analysis and complex operational tasks, not just conversational answering.

Third, enterprise buyers should prepare for governance before they get access. A model with stronger coding, cyber, science and autonomous-task capability should not be dropped into production workflows without role-based access controls, audit logs, review queues and incident response plans.

Benchmark snapshot

OpenAI's launch materials present Sol as a large step forward on difficult evaluation sets. The numbers below are OpenAI-reported and should be treated as directional until independent testing is available.

Benchmark or evaluation	OpenAI-reported GPT-5.6 Sol result	Why it matters
FrontierMath	45.7% single-run, 59.5% with a 20-sample ensemble	Tests advanced mathematical reasoning that is difficult for current models.
Humanity's Last Exam	50.1% without tools, 64.1% with browsing and Python	Broad expert-level reasoning across many fields.
SWE-Bench Pro	82.7%	Measures hard software-engineering issue resolution.
Terminal-Bench 2.1	89.8%	Tests command-line and terminal task completion.
OSWorld	96.6%	Measures computer-use ability in desktop-like environments.
FrontierCode	34.4%	A difficult agentic coding benchmark.
Adversarial Frontier Functions	23.5%	Tests robustness under hard adversarial function tasks.

These results are impressive if reproduced, especially on software engineering and tool-heavy benchmarks. But the right business reading is cautious: a benchmark win does not automatically mean a model can safely modify production systems, make regulatory decisions, handle protected data or operate without human review.

Software engineering and agentic work

The coding numbers explain why OpenAI is routing early Sol access through Codex. A model that scores strongly on SWE-Bench Pro and terminal workflows is most valuable when it can inspect files, run tests, use tools and iterate through failures.

[Image: OpenAI Terminal-Bench 2.1 chart for GPT-5.6 Sol and comparison models]

For developers and platform teams, Sol's preview should be read as part of a broader industry shift: frontier AI is moving from "answer this prompt" to "complete this task inside a controlled environment."

Good early tests for a Sol-like model include:

Reproducing a bug before patching it.
Updating a non-critical internal service with full test output.
Migrating a small dependency across a repo.
Explaining a failed CI run and proposing a minimal fix.
Drafting a pull request that includes code, tests and reviewer notes.
Producing a change plan before touching sensitive code.

High-risk tasks should stay restricted:

Authentication and authorization changes.
Payment, billing and financial logic.
Data deletion or migration.
Production deployment.
Security-control changes.
Any workflow involving regulated personal, health, financial or government data.
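
One way to make this split operational is a small pre-check that tags a proposed change as restricted before an agent is allowed to act on it. The sketch below is a hypothetical illustration only: the path patterns, category names and the `classify_change` and `requires_human_owner` helpers are assumptions, not part of any OpenAI or Codex API, and path matching cannot detect the last item in the list (regulated data), which needs a separate data-classification step.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from restricted categories to repo path patterns.
# Real patterns depend entirely on how your repository is laid out.
RESTRICTED_PATTERNS = {
    "auth": ["*/auth/*", "*/permissions/*"],
    "payments": ["*/billing/*", "*/payments/*"],
    "data_migration": ["*/migrations/*"],
    "deployment": ["*/deploy/*", "*.tf"],
    "security_controls": ["*/security/*", "*/firewall/*"],
}


def classify_change(paths):
    """Return the sorted restricted categories touched by a list of file paths."""
    hits = set()
    for path in paths:
        for category, patterns in RESTRICTED_PATTERNS.items():
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                hits.add(category)
    return sorted(hits)


def requires_human_owner(paths):
    """Treat a change as restricted if it touches any high-risk category."""
    return bool(classify_change(paths))
```

A check like this is deliberately conservative: a false positive costs a reviewer a few minutes, while a false negative lets an agent edit sensitive code unreviewed.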

Cybersecurity: why the system card matters

The most important safety detail is not a single benchmark score. It is OpenAI's own risk classification. OpenAI's GPT-5.6 Sol system card says the evaluated GPT-5.6 models reached High Capability in cybersecurity and biological/chemical domains, while not reaching OpenAI's highest Critical threshold.

That is a meaningful warning. A model can be useful for defenders and risky if poorly governed. Stronger cyber capability may help teams triage vulnerabilities, understand logs, summarize patches and improve secure development. It may also make misuse more scalable if access, prompts and tools are not controlled.

OpenAI's cyber benchmark visuals should be read in that context: they show capability, not permission to run offensive work.

[Image: OpenAI ExploitBench chart for GPT-5.6 Sol cyber capability evaluation]

[Image: OpenAI ExploitGym chart for GPT-5.6 Sol cyber capability evaluation]

For companies, the minimum governance baseline should include:

Control	Why it matters
Least-privilege access	The model should not inherit broad human permissions by default.
Isolated execution	Agent work should run in controlled environments, not on unrestricted production machines.
Full logging	Teams need prompts, tool calls, file changes, commands and outputs for review and incident response.
Human approval gates	High-impact changes should require human sign-off before merge or deployment.
No secrets in prompts	Users should not paste API keys, credentials, customer records or sensitive internal data into uncontrolled sessions.
Policy-based refusal handling	Applications must handle refusals and safety fallbacks as normal product states.
Red-team testing	Teams should test prompt injection, malicious repo content and tool-abuse paths before deployment.
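
Several of these controls (least privilege, full logging, no secrets in logs) can be combined in a thin wrapper around agent tool calls. The following is a minimal sketch under stated assumptions: the tool names, the `guarded_call` helper and the regular expression are hypothetical, and the pattern is a rough illustration rather than a complete secret scanner.

```python
import json
import re
import time

# Hypothetical allowlist: the agent may only call these tools.
ALLOWED_TOOLS = {"read_file", "run_tests"}

# Rough illustrative pattern for key=value style secrets; not exhaustive.
SECRET_PATTERN = re.compile(
    r"(api[_-]?key|token|password)\s*[=:]\s*\S+", re.IGNORECASE
)


def redact(text):
    """Replace key=value style secrets with a placeholder before logging."""
    return SECRET_PATTERN.sub(lambda m: m.group(1) + "=[REDACTED]", text)


def guarded_call(tool_name, argument, tools, audit_log):
    """Run a tool only if allowlisted, and append an audit record either way."""
    allowed = tool_name in ALLOWED_TOOLS
    record = {
        "time": time.time(),
        "tool": tool_name,
        "argument": redact(argument),
        "allowed": allowed,
    }
    # Denied calls never reach the underlying tool.
    result = tools[tool_name](argument) if allowed else None
    record["output"] = redact(str(result)) if allowed else "denied"
    audit_log.append(json.dumps(record))
    return result
```

Denied calls are logged as well as allowed ones, because attempts to reach a blocked tool are often the most useful signal during incident review.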

For cybersecurity readers, the safe interpretation is defensive: use stronger models to improve patching, detection, documentation and review, not to automate unauthorized access or exploit development.

Biology and science: high upside, higher governance needs

OpenAI's system card also classifies the GPT-5.6 family as High Capability in biological and chemical domains. That matters because scientific models can accelerate useful research, but some biology and chemistry tasks are dual-use.

[Image: OpenAI GeneBench v1 chart for GPT-5.6 Sol biological capability evaluation]

Businesses, universities and public agencies should avoid treating a frontier model as a normal document assistant for sensitive science work. Recommended controls include:

Clear acceptable-use rules for biology, chemistry and lab-planning prompts.
Human expert review before acting on model output.
Restricted access for dual-use research workflows.
Logging and retention policies aligned with institutional review and compliance needs.
Separation between general research support and experimental-design workflows.
Escalation paths when a request crosses a safety boundary.

The point is not to block beneficial science. The point is to make sure high-capability models are used inside accountable systems.

Pricing and cost implications

OpenAI lists GPT-5.6 Sol at $5 per 1M input tokens and $30 per 1M output tokens, with a discounted cached-input price. That makes Sol a premium model, but not one whose economics can be judged by token price alone.

For hard tasks, the relevant metric is cost per accepted outcome, not cost per token. A more expensive model can be cheaper if it solves a difficult task with fewer retries, fewer escalations and less human repair. It can also be more expensive if teams use it for routine tasks that smaller models handle well.
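
The arithmetic behind "cost per accepted outcome" can be sketched in a few lines. This example uses the listed $5 and $30 per 1M token prices; the cached-input discount is ignored because its rate is not given here, and the function names and the per-attempt review cost are illustrative assumptions.

```python
# Listed Sol prices from the article, in US dollars per 1M tokens.
INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 30.00


def token_cost(input_tokens, output_tokens):
    """Token spend in dollars, ignoring the cached-input discount."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def cost_per_accepted_outcome(attempts, accepted, review_cost_per_attempt):
    """Total spend (tokens plus human review) divided by accepted results.

    `attempts` is a list of (input_tokens, output_tokens) pairs, one per try.
    """
    if accepted <= 0:
        raise ValueError("no accepted outcomes to divide by")
    spend = sum(token_cost(i, o) for i, o in attempts)
    spend += review_cost_per_attempt * len(attempts)
    return spend / accepted
```

For instance, an attempt with 200,000 input tokens and 20,000 output tokens costs $1.60 in tokens. With three such attempts, an assumed $4 of reviewer time per attempt and two accepted results, the figure is $8.40 per accepted outcome, most of which is review time rather than tokens.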

Good workload routing should look like this:

Workload	Better first choice
Simple summaries, rewriting and classification	Smaller, cheaper model.
Routine support macros and FAQ drafts	Smaller or mid-tier model with guardrails.
Complex code migration	Sol-like frontier model with tests and review.
Multi-document legal, policy or research analysis	Sol-like model with source citation and human expert review.
Security triage and remediation planning	Sol-like model in a controlled defensive workflow.
Regulated or safety-critical decisions	Human-led process with AI support, logging and approval.
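
The routing table above can be expressed as a simple lookup. This is a hedged sketch: the workload labels, the `Tier` names and the choice to send unknown workloads to the cheapest tier are all assumptions for illustration, not OpenAI guidance.

```python
from enum import Enum


class Tier(Enum):
    SMALL = "smaller, cheaper model"
    MID = "mid-tier model with guardrails"
    FRONTIER = "Sol-like frontier model with review"
    HUMAN_LED = "human-led process with AI support"


# Hypothetical workload labels; the mapping follows the table above.
ROUTES = {
    "summary": Tier.SMALL,
    "rewrite": Tier.SMALL,
    "classification": Tier.SMALL,
    "support_macro": Tier.MID,
    "code_migration": Tier.FRONTIER,
    "multi_document_analysis": Tier.FRONTIER,
    "security_triage": Tier.FRONTIER,
}


def route(workload, regulated=False):
    """Pick a first-choice tier; regulated work always goes to a human-led process."""
    if regulated:
        return Tier.HUMAN_LED
    # Assumption: unknown workloads start on the cheapest tier and escalate later.
    return ROUTES.get(workload, Tier.SMALL)
```

Checking the `regulated` flag first means a regulated task can never be routed by its workload label alone.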

What is confirmed

As of June 26, 2026, these points are confirmed by OpenAI's public materials:

Point	Status
OpenAI previewed GPT-5.6 Sol	Confirmed.
Sol is part of the GPT-5.6 family with Terra and Luna	Confirmed by OpenAI's launch framing.
Access starts with government and vetted partners	Confirmed in OpenAI's rollout language.
API and Codex are the early access surfaces	Confirmed by OpenAI.
Broad ChatGPT access is not the initial launch mode	Confirmed by OpenAI's rollout framing.
Sol has premium API pricing	Confirmed by OpenAI's listed pricing.
OpenAI reports major benchmark gains	Confirmed as vendor-reported claims.
OpenAI classifies the family as High Capability in cyber and bio/chemical domains	Confirmed in the system card.
OpenAI says no evaluated GPT-5.6 model crossed its Critical threshold	Confirmed in the system card.

What is not confirmed

Several important questions remain open:

Exact date for broad ChatGPT availability.
Exact date for general API availability outside approved users.
Which government agencies and partners are in the first preview group.
Independent benchmark reproduction.
Real-world reliability in enterprise software environments.
Model behavior under adversarial prompt injection in deployed agent systems.
Whether Sol's pricing changes before or after broader availability.
How customers can qualify for vetted access.

Until those details are public, users should treat Sol as a controlled preview rather than a product everyone can immediately build around.

Why OpenAI is previewing instead of fully launching

OpenAI says frontier-model development has growing societal implications, and that early preview access helps identify risks and benefits before a broad release. That is consistent with the system card's risk posture: stronger models can be useful, but they also increase the consequences of weak deployment controls.

The preview strategy gives OpenAI three advantages:

It can watch real high-stakes usage before mass availability.
It can tune safety systems around observed failure modes.
It can support trusted partners directly instead of leaving them to self-deploy without guidance.

The tradeoff is that the public gets less hands-on evidence at launch. For now, most people must rely on OpenAI's published benchmark and safety reporting plus early partner feedback.

What businesses should do now

Most businesses do not need to rush. If your team does not have Sol access yet, the useful work is preparation.

1. Build an evaluation set

Create a private test suite before adopting any frontier model. Include real examples from your work:

Hard support tickets.
Code issues with expected fixes.
Compliance review examples.
Security triage reports.
Research questions with source documents.
Data-analysis tasks with known answers.
Cases where the model should refuse or escalate.
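
A private test suite like this does not need heavy tooling to start. The sketch below assumes a model is any callable that maps a prompt string to an answer string; `EvalCase`, `run_eval` and the refusal check are hypothetical names, and the substring and prefix matching are deliberately crude stand-ins for real grading.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    expected: str  # substring a correct answer must contain
    should_refuse: bool = False


def is_refusal(answer: str) -> bool:
    """Crude, illustrative refusal check; real systems need something sturdier."""
    prefixes = ("i can't", "i cannot", "escalating")
    return answer.strip().lower().startswith(prefixes)


def run_eval(cases: List[EvalCase], model: Callable[[str], str]) -> float:
    """Return the fraction of cases the model handles as expected."""
    if not cases:
        raise ValueError("evaluation set is empty")
    passed = 0
    for case in cases:
        answer = model(case.prompt)
        if case.should_refuse:
            passed += is_refusal(answer)
        else:
            passed += case.expected.lower() in answer.lower()
    return passed / len(cases)
```

Because the model is just a callable, the same cases can be run against a smaller model, a Sol-like model or a stub, which keeps comparisons on your own workloads rather than on vendor benchmarks.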

2. Define access rules

Decide who can use a high-capability model and for what. Separate casual productivity from sensitive work. Government, security, biotech, legal and financial teams should have tighter review.

3. Require evidence

For important work, require the model to provide source links, file references, test output, assumptions and uncertainty. Do not accept polished prose as proof.

4. Keep humans accountable

AI output should not become a way to blur responsibility. Assign a human owner for every high-impact model-assisted decision or code change.

5. Track cost per finished task

Measure accepted outputs, review time, defect rate, retries and downstream impact. Token spend alone will not tell you whether Sol-like models are worth it.

User impact

For everyday ChatGPT users, the immediate impact is limited. Sol is a preview, not a mass-market feature drop. You may not see it in your account today.

For developers, the biggest near-term signal is Codex. OpenAI is clearly aiming Sol at agentic workflows where a model can use tools, operate in a repo and work through hard tasks.

For security and governance teams, the message is stronger: frontier models are becoming more capable in domains where mistakes and misuse matter. Your controls should advance before access does.

For public-sector and regulated organizations, Sol may be attractive because OpenAI is explicitly starting with government and vetted-partner use cases. But buyers should still demand documentation on data handling, logging, retention, human review, red-team results and incident escalation.

Bottom line

GPT-5.6 Sol stands out among OpenAI's model previews because it pairs stronger reported performance with a more cautious rollout. The company is saying two things at once: the model can do more, and that is exactly why access starts under tighter controls.

The launch is best understood as a preview of where frontier AI is going: larger models, stronger agentic coding, better tool use, more scientific reasoning and a bigger governance burden. The winner for businesses will not be the team that gets access first. It will be the team that can turn high-capability AI into reviewed, logged, measurable and accountable work.

FAQ

What is GPT-5.6 Sol?

GPT-5.6 Sol is OpenAI's largest model in the GPT-5.6 family. OpenAI is previewing it for hard reasoning, coding, visual reasoning, scientific work and agentic workloads.

Can I use GPT-5.6 Sol in ChatGPT?

Not broadly at launch. OpenAI says early access starts through API and Codex for the U.S. government and select vetted partners, with broader availability planned later.

Is GPT-5.6 Sol safe?

OpenAI's system card says the evaluated GPT-5.6 models did not cross its Critical threshold, but did reach High Capability in cybersecurity and biological/chemical domains. That means deployment should include strong access control, monitoring and human review.

How much does GPT-5.6 Sol cost?

OpenAI lists Sol pricing at $5 per 1M input tokens and $30 per 1M output tokens, with a discounted cached-input rate. Teams should evaluate cost per accepted task, not only cost per token.

Why is access restricted?

OpenAI says it wants feedback from trusted deployments before broader release, and its system card classifies the GPT-5.6 family as High Capability in cybersecurity and biological/chemical domains. Restricted access gives OpenAI more control while it observes real-world usage and refines safeguards.

Should businesses wait?

Businesses without vetted access should use the time to build evaluation sets, define access policy, prepare logging and approval gates, and decide which workflows are worth testing with a frontier model.

Media credits

The Sol, Terra and Luna hero image and benchmark charts were provided with the publishing request and added as article media. The charts are used as OpenAI-reported visual benchmark context; they should not be read as independent verification.

HacksByte editorial standard

This guide is written for practical user safety. For account, platform, or legal decisions, confirm critical steps with the official help center or your service provider.