GadgetBond

GPT-5.3-Codex brings speed, reasoning, and autonomy to coding

OpenAI’s Codex is evolving into something closer to a full-time coworker.

By Shubham Sawarkar, Editor-in-Chief
Feb 6, 2026, 2:58 AM EST
Abstract blue and purple gradient background with soft flowing shapes and the white text “GPT-5.3-Codex” centered on the image.
Image: OpenAI

OpenAI’s latest move in the AI arms race isn’t another general-purpose chatbot—it’s a power tool for people who live inside terminals, IDEs, dashboards and spreadsheets all day. GPT-5.3-Codex is pitched as “the most capable agentic coding model to date,” but that framing almost undersells what’s going on here: OpenAI is trying to turn Codex from “the thing that writes functions for you” into a colleague that can sit beside you and actually do work across your computer.

At a high level, GPT-5.3-Codex merges the coding chops of GPT-5.2-Codex with the broader reasoning and professional knowledge of GPT-5.2, then runs the whole thing about 25% faster. In practice, that means a single agent that can debug a hairy production issue, refactor a service, write the doc, build the slide deck explaining it, and then open a spreadsheet to model the business impact—all in the same session. OpenAI says early versions of this model helped debug its own training run and deployment stack, from catching context‑rendering bugs to helping engineers investigate odd alpha‑testing data. That’s not just marketing spin; for the people building it, Codex has already become part of the team.

On benchmarks, 5.3-Codex looks like a genuine step forward rather than a minor point release. On SWE‑Bench Pro, a multi‑language software‑engineering benchmark designed to be more realistic and harder to contaminate than the original SWE‑Bench, it edges out both GPT-5.2‑Codex and GPT-5.2 while using fewer tokens at comparable “reasoning effort” levels. On Terminal‑Bench 2.0, which simulates the kind of shell work an agent has to do to be useful in real projects, it hits 77.3% accuracy, up from 64% for GPT-5.2-Codex and 62.2% for GPT-5.2. And on OSWorld‑Verified, a benchmark where the model uses vision to complete real desktop tasks—think moving through UI, clicking, configuring tools—it jumps to 64.7% versus roughly 38% for both prior models, creeping closer to the ~72% human baseline. The picture that emerges: this is less “ChatGPT that’s good at code” and more a general computer operator that happens to be very strong at software work.

Comparison table showing benchmark scores for GPT-5.3-Codex, GPT-5.2-Codex, and GPT-5.2, highlighting GPT-5.3-Codex leading across SWE-Bench Pro, Terminal-Bench 2.0, OSWorld-Verified, cybersecurity capture-the-flag challenges, and SWE-Lancer IC Diamond.
Screenshot: GadgetBond

If you want something more tangible than benchmark charts, OpenAI’s own test projects are telling. The company had GPT-5.3-Codex build full web games over “millions of tokens” of autonomous iteration: a racing title with multiple racers, eight maps and power‑up items, plus a diving game with reefs to explore, a fish codex to complete and basic resource management (oxygen, pressure, hazards). The same model can turn a short product prompt—“Quiet KPI, a founder‑friendly weekly metric digest, soft SaaS aesthetic, lavender gradient, testimonial carousel, pricing toggle”—into a surprisingly polished landing page with sensible defaults: discounted yearly pricing framed clearly, a testimonial slider with multiple quotes, and a full funnel of sections from hero to FAQ without being micromanaged. It’s the kind of work you’d expect from a junior designer‑developer pair, not an autocomplete box.

Crucially, the story here isn’t just “better code,” it’s “more of the work around the code.” OpenAI highlights tasks like writing PRDs, editing product copy, drafting training docs, modeling NPV in spreadsheets, and assembling internal slide decks using real‑world prompts in its GDPval evaluation, a benchmark that measures performance across knowledge‑work tasks in 44 professions. GPT-5.3-Codex essentially matches GPT-5.2 on GDPval while also being the better coding agent, which is an important nuance: you don’t have to choose between a “dev model” and a “general model” as much as before. For teams, that means the same agent that wired up your reporting pipeline can also write the policy document explaining how to use it, or turn raw compliance notes into a presentation.

Day to day, the bigger shift might be how this model behaves as a collaborator rather than a request‑in, answer‑out API. A lot of OpenAI’s own framing, and early coverage, zeroes in on “mid‑turn steering” and more frequent progress updates. Instead of firing off a long instruction and waiting in silence, you can watch Codex narrate what it’s doing, ask questions while it’s halfway through restructuring your monorepo, and nudge it back when you see it drifting. Reviewers who’ve been hands‑on describe the speed bump as material—fast enough that the model flips from something you tolerate for big jobs to something you can lean on for quick, iterative loops in an editor or terminal. That’s also where some new failure modes show up: more autonomy means it can wander down rabbit holes, continuing to “execute” while gradually solving the wrong problem, and in some sessions, people have noticed quality dropping mid‑conversation when routing falls back to a weaker model. It feels less like a stateless API call and more like managing a slightly over‑eager junior engineer.

Behind the scenes, OpenAI is pretty open about how aggressively it leaned on Codex to build Codex. Researchers used early versions to monitor and debug the massive training run for this release, track patterns across training, analyze interaction quality, and even build custom tools to visualize weird alpha‑testing data. Engineers relied on it to optimize the evaluation harness, track down edge‑case context bugs, and tune GPU cluster behavior to keep latency stable during load spikes. In one internal study, GPT-5.3-Codex itself devised regex classifiers to tag things like “clarification needed,” “positive user signal,” and “task progress” in logs, ran them over the dataset, and summarized how much extra work it was accomplishing per turn. There’s a recursive feel to this cycle: the more capable the agent becomes, the more it accelerates the work of training its successor.
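The log-tagging workflow described above is straightforward to picture as code. Here is a minimal sketch of the general technique — regex classifiers run over transcript lines, with per-label counts — where the patterns, category names, and sample lines are all illustrative assumptions, not OpenAI's internal tooling:

```python
import re
from collections import Counter

# Hypothetical regex classifiers for agent-transcript lines.
# The categories mirror the ones named in the article; the
# patterns themselves are invented for illustration.
CLASSIFIERS = {
    "clarification_needed": re.compile(r"\b(could you clarify|which file|do you mean)\b", re.I),
    "positive_user_signal": re.compile(r"\b(thanks|great|that works|perfect)\b", re.I),
    "task_progress": re.compile(r"\b(running tests|refactored|fixed|opened PR)\b", re.I),
}

def tag_log(lines):
    """Count how many lines match each classifier (one hit per label per line)."""
    counts = Counter()
    for line in lines:
        for label, pattern in CLASSIFIERS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts

transcript = [
    "User: thanks, that works!",
    "Agent: running tests on the auth module...",
    "Agent: do you mean the staging config or production?",
]
print(tag_log(transcript))
```

Summarizing "extra work accomplished per turn," as the article describes, would then just be a matter of aggregating these counts across sessions.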

All of this extra capability has an obvious flip side: cybersecurity. OpenAI is explicitly calling GPT-5.3-Codex its first “High capability” model for cyber‑related tasks under its Preparedness Framework, and says it has directly trained the model to identify software vulnerabilities. The company also emphasizes that it doesn’t yet have “definitive evidence” that the model can run end‑to‑end cyberattacks, but it’s treating the system as if it could, rolling out its most extensive safety stack so far—specialized safety training, automated monitoring, gated “trusted access” paths for advanced offensive‑adjacent capabilities, and threat‑intelligence‑driven enforcement pipelines. Alongside the launch, OpenAI is expanding its private beta of Aardvark, a security‑research agent positioned as the first in a suite of Codex‑powered security tools, and partnering with maintainers of major open‑source projects like Next.js to offer free vulnerability scanning after a researcher used Codex to uncover new CVEs there.

There’s also money on the table to push defenders up the same curve. GPT-5.3-Codex launches with a commitment of $10 million in API credits for organizations doing “good‑faith security research,” especially around open source and critical infrastructure, building on a $1 million cybersecurity grant program OpenAI started in 2023. The idea is straightforward: if you’ve just trained a model you believe is powerful enough to meaningfully change the cyber landscape, you want the defensive ecosystem to be experimenting with it as early as possible.

For developers and teams trying to decide whether to move, the practical story is fairly clean. GPT-5.3-Codex is already live for paying ChatGPT users wherever Codex runs: in the Codex desktop app, the CLI, IDE extensions and the web UI. API access is “coming soon,” with no public date yet, so production pipelines that depend on the API will need to treat this as a pilot‑only model for now. On the pricing side, Codex sits inside ChatGPT’s existing paid tiers: Plus and above get higher Codex rate limits, and there’s a credits model covering local messages, cloud tasks and code‑review jobs, with GPT-5.3-Codex and 5.2-Codex sharing the same per‑unit credit costs for now. For heavy users, the practical guidance from early migration guides is to keep GPT-5.2-Codex as the production default while you test 5.3-Codex in parallel on critical workflows and get a feel for its new agentic behavior.
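For teams taking the "keep 5.2 as the default, pilot 5.3 in parallel" advice literally, the Codex CLI reads a TOML config file (`~/.codex/config.toml`) that supports a default model plus named profiles. A hedged sketch — the exact model identifier strings below are assumptions, so check them against the current model list:

```toml
# ~/.codex/config.toml — production default stays on the known-good model
model = "gpt-5.2-codex"

# Opt into the new model per session, e.g.: codex --profile pilot
[profiles.pilot]
model = "gpt-5.3-codex"
```

This keeps the new model one flag away for experiments without changing what everyday invocations use.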

Zooming out, GPT-5.3-Codex is clearly part of a broader strategic shift. OpenAI just launched the standalone Codex app on macOS—a kind of “AI companion” that can see your screen, drive apps and run agents—and 5.3-Codex is the engine meant to make that feel less like a gimmick and more like a serious workhorse. The benchmark numbers, the self‑hosting on NVIDIA’s GB200 NVL72 systems, the cyber‑safety posture, and the internal dogfooding all point in the same direction: Codex is moving from being a code‑generation feature inside ChatGPT to a general‑purpose operator on your computer that happens to be very good at engineering work. Whether that lands as a joyful “10x engineer in a box” or an occasionally chaotic new teammate will depend on how much control, observability and discipline teams bring to these agents—but either way, this release marks an inflection point in how seriously the industry will have to take AI that doesn’t just answer questions but actually acts on your behalf.



Topics: ChatGPT, OpenAI Codex


Disclosure: We love the products we feature and hope you’ll love them too. If you purchase through a link on our site, we may receive compensation at no additional cost to you. Read our ethics statement. Please note that pricing and availability are subject to change.

Copyright © 2026 GadgetBond. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | Do Not Sell/Share My Personal Information.