OpenAI Unveils Next-Gen GPT-5.4 with Pro and Thinking Capabilities

On Thursday, OpenAI released GPT-5.4, presenting it as its most capable and efficient frontier model for professional work. The new foundation model is available in a standard version and in two specialized variants: GPT-5.4 Thinking, optimized for reasoning, and GPT-5.4 Pro, geared toward high performance.
The API version of GPT-5.4 supports context windows as large as 1 million tokens, the largest context window OpenAI has offered. OpenAI also highlighted improved token efficiency, stating that GPT-5.4 can solve the same problems with considerably fewer tokens than its predecessor, GPT-5.2.
The model also posted improved benchmark results. GPT-5.4 achieved record scores on the computer use benchmarks OSWorld-Verified and WebArena Verified, and scored 83% on OpenAI’s GDPval test, which evaluates knowledge work tasks. On professional skills such as law and finance, GPT-5.4 took the lead on Mercor’s APEX-Agents benchmark. Brendan Foody, CEO of Mercor, said GPT-5.4 excels at creating "long-horizon deliverables such as slide decks, financial models, and legal analysis," and noted that it delivers top performance while running faster and at lower cost than competing models.
OpenAI has continued its focus on mitigating hallucinations and factual errors. Compared with GPT-5.2, the new model is 33% less likely to make errors in individual claims, and its overall responses are 18% less likely to contain errors. Both figures are relative reductions measured against GPT-5.2, and they point to a meaningful gain in reliability and factual accuracy.
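Because both figures are relative to GPT-5.2, the resulting error rate depends on where GPT-5.2 started. The short sketch below works through the arithmetic. The baseline rates in it are hypothetical placeholders chosen only for illustration; the article does not report GPT-5.2's actual error rates.

```python
def apply_relative_reduction(baseline_rate: float, reduction: float) -> float:
    """Return the new rate after a relative reduction (e.g. 0.33 for 33%)."""
    return baseline_rate * (1.0 - reduction)


# Hypothetical baselines, NOT figures from OpenAI or the article.
hypothetical_claim_error = 0.10      # assume 10% of individual claims are wrong
hypothetical_response_error = 0.30   # assume 30% of responses contain an error

new_claim_error = apply_relative_reduction(hypothetical_claim_error, 0.33)
new_response_error = apply_relative_reduction(hypothetical_response_error, 0.18)

print(f"claim error rate:    {hypothetical_claim_error:.3f} -> {new_claim_error:.3f}")
print(f"response error rate: {hypothetical_response_error:.3f} -> {new_response_error:.3f}")
```

Under these assumed baselines, a 33% relative reduction takes a 10% claim error rate to about 6.7%, not to a rate 33 percentage points lower.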
GPT-5.4 also changes how tool calling works in the API, through a new system called Tool Search. Previously, the system prompt had to define every available tool, which could consume a large number of tokens as the toolset grew. With Tool Search, the model looks up tool definitions only when it needs them, which makes requests faster and cheaper, particularly in complex systems with many available tools.
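The general idea of deferring tool definitions can be sketched in a few lines. This is an illustration of the pattern only, not OpenAI's Tool Search API: the catalog, the tool names, and the functions `upfront_payload`, `search_tools`, and `rough_size` are all hypothetical, and character count stands in as a crude proxy for token count.

```python
import json

# Hypothetical tool catalog; names and schemas are invented for illustration.
TOOL_CATALOG = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    "convert_currency": {
        "description": "Convert an amount between two currencies.",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "source": {"type": "string"},
                "target": {"type": "string"},
            },
            "required": ["amount", "source", "target"],
        },
    },
    "create_calendar_event": {
        "description": "Create an event on the user's calendar.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "start": {"type": "string"},
                "end": {"type": "string"},
            },
            "required": ["title", "start", "end"],
        },
    },
}


def rough_size(obj) -> int:
    """Crude size proxy: length of the JSON encoding (not a real token count)."""
    return len(json.dumps(obj))


def upfront_payload(catalog: dict) -> list:
    """Older pattern: every full tool definition accompanies every request."""
    return [{"name": name, **spec} for name, spec in catalog.items()]


def search_tools(catalog: dict, query: str) -> list:
    """Deferred pattern: return full definitions only for matching tools."""
    q = query.lower()
    return [
        {"name": name, **spec}
        for name, spec in catalog.items()
        if q in name.lower() or q in spec["description"].lower()
    ]


all_tools = upfront_payload(TOOL_CATALOG)
needed = search_tools(TOOL_CATALOG, "weather")
print("upfront size:  ", rough_size(all_tools))
print("on-demand size:", rough_size(needed))
```

With only three tools the saving is small, but the upfront payload grows with every tool added to the catalog, while the on-demand payload grows only with the tools a given request actually needs.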
On AI safety, OpenAI has added a new evaluation that examines its models’ chain-of-thought, the running commentary that reasoning models produce to show their reasoning as they work through multi-step tasks. AI safety researchers have long worried that reasoning models could misrepresent their chain-of-thought. OpenAI’s new evaluation indicates that deception is less likely in the GPT-5.4 Thinking version, suggesting that "the model lacks the ability to hide its reasoning and that CoT monitoring remains an effective safety tool." OpenAI presents the result as support for the transparency and safety measures built into the new model.


