OpenAI Introduces GPT-5.5 as Its Most Capable Model for Real Work Yet

OpenAI has launched GPT-5.5, a new flagship model designed for coding, computer use, knowledge work, and scientific research, with stronger performance, lower token usage, and broader real-world autonomy than GPT-5.4.

By Maria Konash. Edited by AIstify Team.
OpenAI has introduced GPT-5.5 as a new class of intelligence for real work, combining stronger coding, reasoning, and computer-use abilities with faster, more efficient performance. Photo: OpenAI

OpenAI has just launched GPT-5.5, a major new model release that the company describes as its most capable system yet for real-world work. The model is designed to move beyond traditional chatbot interactions and into sustained execution of complex, multi-step tasks across software development, research, business operations, and data analysis.

The release reflects a broader shift inside OpenAI toward building systems that act less like assistants and more like collaborators. “GPT-5.5 is built for real work,” the company said, emphasizing its ability to plan, execute, and refine tasks across long time horizons while maintaining coherence and accuracy.

At its core, GPT-5.5 is optimized for coding, computer use, knowledge work, and scientific reasoning, areas where the company says previous models still required significant human supervision. The goal, according to OpenAI, is to close the gap between what frontier models can theoretically do and what they can reliably deliver in practice.

A Leap in Coding, Reasoning, and Execution

GPT-5.5 shows measurable gains across major industry benchmarks. On Terminal-Bench 2.0, which evaluates command-line workflows requiring tool use and planning, the model achieves 82.7 percent, up from 75.1 percent in GPT-5.4. On SWE-Bench Pro, a widely used benchmark for real-world software engineering, it reaches 58.6 percent, again improving on its predecessor.

These improvements translate into tangible gains for developers. OpenAI says GPT-5.5 is better at understanding the structure of large codebases, identifying root causes of failures, and implementing fixes that work across multiple files and systems. Early testers described the model as more reliable in “end-to-end engineering tasks,” where success depends on coordinating multiple steps rather than producing isolated snippets.

One tester noted that GPT-5.5 “feels like it understands the system, not just the code,” highlighting a shift toward deeper reasoning and contextual awareness.

The model also advances OpenAI’s broader push toward agentic workflows, where AI systems can independently complete tasks across tools. On OSWorld-Verified, a benchmark that measures real-world computer use, GPT-5.5 scores 78.7 percent, demonstrating its ability to operate software environments with minimal human intervention.

From Productivity Tool to Economic Engine

The company says the biggest impact of GPT-5.5 may be in knowledge work, where it can generate presentations, build spreadsheets, analyze data, and produce structured outputs at scale. On GDPval, OpenAI’s benchmark covering 44 occupations, GPT-5.5 reaches 84.9 percent, outperforming GPT-5.4 and approaching expert-level performance across a wide range of tasks.

“GPT-5.5 is better at producing real work products, not just answers,” OpenAI said. “It can generate deliverables that are closer to what a professional would produce.”

The model is also more efficient. OpenAI says GPT-5.5 achieves higher quality results with fewer tokens, reducing the number of iterations needed to complete a task. This efficiency lowers the cost of reaching a given level of output quality, even as the model itself becomes more advanced.

Inside OpenAI, the shift is already visible. The company reports that more than 85 percent of employees now use Codex weekly, applying AI to tasks across engineering, finance, marketing, and communications. In one example, teams used GPT-5.5 to analyze speaking request data and generate structured reports, saving several hours per week per employee.

“This is where AI becomes infrastructure,” OpenAI said, describing the model as a system that supports entire workflows rather than isolated tasks.

Advancing Scientific Discovery

Beyond enterprise use, GPT-5.5 is also pushing into scientific research. On GeneBench, a benchmark focused on genetics and quantitative biology, the model shows significant improvement over previous versions. OpenAI says it is better at exploring hypotheses, interpreting ambiguous results, and iterating across complex research workflows.

In one internal experiment, a customized version of GPT-5.5 contributed to discovering a new proof related to Ramsey numbers, a core concept in combinatorics. The result was later verified by researchers, illustrating how AI can assist in advancing mathematical knowledge under human supervision.

“We’re beginning to see AI meaningfully accelerate science,” the company said, while noting that human oversight remains essential.

Safety, Security, and Deployment

OpenAI also highlighted improvements in factual reliability and safety. GPT-5.5 reduces error rates compared to GPT-5.4 and includes stricter safeguards for high-risk domains, particularly cybersecurity. The company says it is expanding controlled access to cyber-related capabilities through its Trusted Access for Cyber program while maintaining tighter usage controls.

“Security and alignment are core to how we deploy these systems,” OpenAI said, adding that stronger guardrails are necessary as models gain more autonomy.

GPT-5.5 is now rolling out to ChatGPT Plus, Pro, Business, and Enterprise users, with GPT-5.5 Pro available for higher-complexity tasks. API access is expected to follow, with pricing set at $5 per million input tokens and $30 per million output tokens, while Pro usage carries higher rates.
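The quoted API rates lend themselves to a quick cost estimate. The sketch below is illustrative only: the function name and the example token counts are assumptions, and it applies just the GPT-5.5 list prices cited above ($5 and $30 per million tokens), ignoring the higher Pro rates and any discounts.

```python
# Illustrative cost estimate using the per-million-token rates quoted in the article.
INPUT_USD_PER_MILLION = 5.0    # GPT-5.5 input rate as reported
OUTPUT_USD_PER_MILLION = 30.0  # GPT-5.5 output rate as reported


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for the given input and output token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens and 500,000 output tokens.
    print(f"${estimate_cost_usd(2_000_000, 500_000):.2f}")  # prints $25.00
```

Because output tokens cost six times as much as input tokens at these rates, the output volume tends to dominate the bill for generation-heavy tasks.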

The model was trained using infrastructure developed in collaboration with Microsoft and NVIDIA, leveraging Azure data centers and GPU systems including H100, H200, and next-generation architectures.

Toward AI That Can Work End-to-End

For OpenAI, GPT-5.5 represents more than another incremental release. It signals a transition toward AI systems capable of carrying real work from start to finish.

“Everything is controlled by code,” the company said. “The better an agent is at reasoning about and producing code, the more capable it becomes across all forms of work.”

With GPT-5.5, OpenAI is betting that the future of AI will be defined not just by intelligence, but by execution: systems that can plan, act, and deliver outcomes at scale.

Anthropic Now Beats OpenAI in Enterprise Adoption

Anthropic has surpassed OpenAI in verified business customer adoption for the first time, according to new data from fintech firm Ramp. The shift highlights Anthropic’s growing traction among enterprise and technical customers.

By Samantha Reed. Edited by Maria Konash.
Anthropic surpasses OpenAI in verified business adoption as enterprise demand for Claude accelerates. Image: Anthropic

Anthropic has overtaken OpenAI in verified business customer adoption for the first time, according to new data from fintech platform Ramp.

Ramp’s latest AI Index, which analyzes expense data from more than 50,000 companies using its payment and finance platform, found that 34.4% of participating businesses are now paying for Anthropic services, compared with 32.3% for OpenAI. It marks the first time Anthropic has held the top position in the index.

According to Ramp economist Ara Kharazian, Anthropic had already established a lead among highly technical industries including finance, technology, and professional services before expanding into broader enterprise categories.

The data also illustrates how rapidly Anthropic has grown over the past year. In May 2025, only 9% of businesses in the index were paying for Anthropic products. That figure has since risen by 26 percentage points over 12 months. During the same period, OpenAI’s business adoption share declined by roughly 1%, while overall enterprise adoption of AI products across the index increased by 9%.
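A change in percentage points is not the same as relative growth, and the two are easy to confuse when reading adoption figures. The minimal sketch below uses the rounded 9% starting share and the 34.4% current share reported above; the function names are illustrative. The computed gap is slightly under the 26 points cited, presumably because the published starting figure is rounded.

```python
def point_change(start_pct: float, end_pct: float) -> float:
    """Absolute change between two shares, in percentage points."""
    return end_pct - start_pct


def relative_change(start_pct: float, end_pct: float) -> float:
    """Change relative to the starting share, expressed as a percentage."""
    return (end_pct - start_pct) / start_pct * 100.0


if __name__ == "__main__":
    start, end = 9.0, 34.4  # rounded shares taken from the article
    print(f"{point_change(start, end):.1f} percentage points")  # prints 25.4 percentage points
    print(f"{relative_change(start, end):.0f}% relative growth")  # prints 282% relative growth
```

In other words, a rise of roughly 25 points from a 9% base means the share of businesses paying for Anthropic nearly quadrupled.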

Kharazian said Anthropic’s strategy of initially focusing on technical customers and developer-oriented use cases helped establish stronger traction within enterprise environments before broader expansion through products such as Claude Cowork.

The findings align with broader market signals suggesting growing enterprise adoption of Claude models. OpenRouter usage rankings, which track another segment of AI users and developers, last showed OpenAI ahead of Anthropic in December 2025.

Ramp noted that the index is not a complete representation of the overall AI market because it only reflects companies using its platform. However, the dataset remains one of the largest publicly discussed indicators of verified commercial AI spending activity.

Enterprise AI Competition Shifts Toward Deployment And Reliability

The report highlights how competition between leading AI companies is increasingly being shaped by enterprise deployment rather than consumer visibility alone.

Anthropic has spent much of the past year expanding its presence in regulated industries and enterprise workflows, particularly in finance, cybersecurity, operations, and software development. Its Claude family of models has gained traction among businesses seeking longer context handling, coding assistance, and AI agents designed for workplace tasks.

The company has also aggressively expanded enterprise infrastructure and partnerships in recent months, including new deployment initiatives, financial services AI agents, cybersecurity tools, and large-scale compute agreements aimed at supporting business demand.

Meanwhile, OpenAI continues to maintain a dominant consumer footprint through ChatGPT while simultaneously pushing deeper into enterprise deployments through consulting partnerships, deployment services, and productivity integrations.


Amazon Launches Alexa for Shopping AI Assistant Across Its Store

Amazon has launched Alexa for Shopping, a generative AI assistant that combines Alexa+, Rufus, and customer shopping history to deliver personalized product recommendations, price tracking, and automated purchasing. The assistant is available across Amazon’s app, website, and Echo Show devices.

By Samantha Reed. Edited by Maria Konash.
Amazon launches Alexa for Shopping with AI search, price tracking, automated purchases, and personalized guides. Image: Rubaitul Azad / Unsplash

Amazon has introduced Alexa for Shopping, a new AI-powered shopping assistant designed to combine conversational AI, product expertise, and customer shopping history into a unified retail experience across Amazon’s app, website, and Echo Show devices.

The launch merges capabilities from Alexa+ and Amazon’s Rufus shopping assistant, which the company said helped more than 300 million customers research and compare products in 2025. Alexa for Shopping is now integrated directly into Amazon’s main search bar, allowing users to ask conversational questions, compare products, track orders, generate shopping guides, and automate purchases using natural language.

Amazon said the assistant continuously personalizes recommendations using browsing activity, purchase history, preferences, and conversations across Alexa-enabled devices. The company described the system as a persistent shopping layer that carries context between devices and sessions instead of resetting interactions each time a customer searches.

The assistant can create AI-generated category overviews, compare products side-by-side from search results, surface one-year price history charts, and automatically monitor products for price drops. Customers can also create “Scheduled Actions” that automate recurring shopping tasks such as replenishing household items, tracking book releases, or adding products to carts when prices reach specific targets.

Amazon is also expanding agentic shopping capabilities through Shop Direct and its “Buy for Me” feature. The system can discover products from external retailers and, for eligible items, complete purchases automatically using stored payment and shipping information.

The company said Alexa for Shopping can also generate personalized shopping guides for complex purchases such as laptops, TVs, or appliances by summarizing reviews, features, pricing differences, and category insights across Amazon and the broader web.

In addition to mobile and desktop support, Amazon is bringing the full Amazon storefront experience to Echo Show devices for the first time. Customers can browse and purchase products using voice commands, touch controls, or a combination of both.

Alexa for Shopping is rolling out to all U.S. customers this week and does not require a Prime membership, Alexa app subscription, or Echo device.

Amazon Pushes AI Deeper Into Commerce Automation

The launch marks one of Amazon’s most aggressive attempts yet to transform e-commerce from search-based navigation into AI-assisted decision making and task automation.

Rather than relying on keyword searches and static filters, Alexa for Shopping is designed to function as a persistent shopping assistant that remembers preferences, previous conversations, recurring purchases, family information, and shopping behavior across Amazon’s ecosystem.

Features such as automated cart-building, conversational product research, and price-triggered purchases move Amazon closer to agentic commerce systems where AI actively manages portions of the shopping process on behalf of users.

The integration of Rufus product intelligence with Alexa+ personalization also gives Amazon a broader contextual data advantage across retail, smart home devices, and media services.


Google Introduces Googlebook Laptops Built Around Gemini AI

Google has unveiled Googlebook, a new laptop category combining Android and ChromeOS technologies with Gemini AI integrated throughout the system. The devices introduce features such as Magic Pointer contextual actions and AI-generated desktop widgets.

By Daniel Mercer. Edited by Maria Konash.

Google has introduced Googlebook, a new category of laptops designed around Gemini AI and built using a combination of Android and ChromeOS technologies. The company said the devices are intended to shift laptops “from an operating system to an intelligence system,” with AI integrated directly into navigation, multitasking, and desktop interaction.

Googlebook devices run on Android 17 with a redesigned laptop-style interface while retaining integration with Google services and Chrome browsing capabilities. The company described the platform as a fusion of Android’s application ecosystem and ChromeOS infrastructure, optimized for Gemini-powered workflows and cross-device continuity.

One of the central features is Magic Pointer, a new cursor system developed with Google DeepMind. When users move the cursor over content, Gemini can suggest contextual actions automatically. For example, pointing at a date inside an email can trigger meeting creation, while selecting multiple images can generate AI-assisted visual compositions such as virtual furniture placement or outfit previews.

Google is also introducing “Create your Widget,” a system that lets users generate desktop widgets through natural language prompts. Gemini can pull information from Gmail, Calendar, search, reservations, reminders, and other Google services to build personalized dashboards dynamically.

The company said Googlebook is designed to function more fluidly across phones and laptops. Features such as Quick Access allow users to browse and use files stored on Android smartphones directly from the laptop without transferring files manually. Mobile apps can also run inside the desktop environment while preserving workflow continuity.

Googlebook hardware will be manufactured through partnerships with Acer, ASUS, Dell Technologies, HP Inc., and Lenovo. Google said the devices will feature premium materials and a new “glowbar” design element intended to visually distinguish Googlebook laptops.

The first Googlebook devices are scheduled to launch this fall.

Google Pushes Gemini Beyond Apps Into Operating Systems

The launch represents one of Google’s clearest attempts so far to position Gemini not simply as an assistant, but as the core interaction layer for future computing devices.

Rather than opening separate AI applications or chat interfaces, Googlebook integrates Gemini directly into the operating system itself through cursor interactions, contextual actions, dynamic widgets, and continuous multitasking support.

The Magic Pointer feature is especially notable because it changes the cursor from a passive navigation tool into an AI-aware interaction system capable of interpreting onscreen context in real time. That approach mirrors a broader industry shift toward embedding AI directly into operating systems and interface layers rather than treating it as an isolated chatbot.

Google also appears to be using Googlebook to unify parts of Android and ChromeOS development into a more integrated AI-first platform strategy.

AI Becomes Central To Personal Computing Competition

The announcement arrives as major technology companies increasingly compete to redesign personal computing around AI-native interfaces.

Laptop and desktop operating systems are evolving from application-centric environments toward systems where AI continuously interprets user context, predicts intent, and automates actions across workflows.

Googlebook positions Google more directly against AI-integrated computing initiatives from companies including Microsoft and Apple, both of which are also embedding generative AI deeper into operating systems and productivity ecosystems.

By combining Gemini with Android’s application ecosystem and Chrome’s browser dominance, Google is attempting to create a tightly integrated AI computing environment spanning phones, laptops, cloud services, and productivity tools. Meanwhile, OpenAI is reportedly accelerating development of its own AI-focused smartphone, which analyst Ming-Chi Kuo said could enter mass production as early as 2027.


Google Explores SpaceX Deal For Orbital Data Centers

Google is reportedly in talks with SpaceX and other launch providers as it explores deploying orbital data centers under its Project Suncatcher initiative. The discussions reflect growing interest in space-based AI infrastructure and computing capacity.

By Olivia Grant. Edited by Maria Konash.
Google explores SpaceX launch deal for orbital AI data centers as Project Suncatcher targets 2027 prototypes. Image: ActionVance / Unsplash

Google is reportedly in talks with SpaceX over a potential rocket launch agreement tied to the company’s efforts to develop orbital data centers, according to a Wall Street Journal report citing people familiar with the discussions.

The report said Google is also holding conversations with other rocket-launch providers as it evaluates infrastructure options for deploying computing systems in space. The initiative is connected to Google’s previously disclosed Project Suncatcher program, which aims to research space-based data center technology and launch two prototype satellites by early 2027.

Project Suncatcher was first revealed in November as part of Google’s long-term exploration of alternative AI infrastructure systems. The project focuses on whether orbital computing platforms could eventually help address growing energy, cooling, and land constraints associated with terrestrial AI data centers.

A partnership with SpaceX would mark another instance of Elon Musk cooperating commercially with AI rivals he has publicly criticized in the past. Musk has repeatedly attacked Google’s AI strategy while simultaneously expanding his own AI infrastructure ambitions through xAI and SpaceX.

Space-Based Computing Gains Attention In AI Industry

The idea of orbital data centers has shifted from theoretical research toward early-stage infrastructure planning as AI companies search for ways to overcome physical limitations facing existing compute expansion.

Space-based infrastructure offers several potential advantages, including access to uninterrupted solar energy, reduced land and cooling constraints, and theoretically massive long-term compute scalability if launch costs continue declining.

However, major technical challenges remain, including radiation exposure, hardware reliability, maintenance logistics, latency management, and the economics of deploying large-scale compute systems into orbit.

AI Infrastructure Race Expands Beyond Earth

Competition in artificial intelligence is expanding into infrastructure ownership and compute deployment strategy rather than focusing solely on model development.

Last week, Anthropic signed an agreement to access the full compute capacity of SpaceXAI’s Colossus 1 supercomputer facility in Memphis, adding more than 220,000 NVIDIA GPUs to support Claude training and inference workloads. The partnership also included discussions around developing multiple gigawatts of orbital compute infrastructure.

The move followed Musk’s decision to merge xAI directly into SpaceX under a new SpaceXAI structure combining AI models, compute infrastructure, and aerospace operations into a single organization. Analysts said the consolidation could give SpaceXAI a unique advantage if orbital AI infrastructure becomes commercially feasible in the coming years.


U.S. Banks Rush To Fix Vulnerabilities Found By Anthropic Mythos

Major U.S. banks are rapidly patching software vulnerabilities uncovered by Anthropic’s Mythos AI model as concerns grow over AI-driven cybersecurity risks. The system is reportedly identifying weaknesses and attack chains at speeds beyond traditional security workflows.

By Maria Konash
U.S. banks speed up software patching after Anthropic’s Mythos AI uncovers widespread cybersecurity vulnerabilities. Image: David Vincent / Unsplash

Major U.S. banks are racing to patch IT system vulnerabilities identified by Anthropic’s powerful Mythos AI model, triggering urgent software upgrades and faster cybersecurity remediation processes across the banking sector.

According to sources familiar with the matter, several of the country’s largest financial institutions currently have access to Claude Mythos Preview through Anthropic’s Project Glasswing initiative. As banks analyze the findings, they are reportedly uncovering large numbers of previously low- or moderate-priority weaknesses that the AI system can chain together into higher-risk attack paths.

The vulnerabilities span both proprietary and open-source software, with older legacy systems drawing particular scrutiny because of outdated software support and slower patching cycles. Multiple sources said banks are now fixing vulnerabilities within days that previously may have remained unresolved for weeks.

The accelerated remediation effort is also creating operational pressure inside financial institutions. Sources said some banks may need to temporarily take systems offline more frequently to implement updates and security fixes, though institutions are attempting to minimize disruption for customers.

“This is a wake-up call because cyber risk is moving to machine speed, while much of bank defense still operates at human speed,” said Nitin Seth, co-founder and CEO of data and AI services firm Incedo.

Mythos has reportedly proven especially effective at identifying complex attack chains by linking together multiple seemingly minor weaknesses into broader exploitable vulnerabilities. One banking source described the system as forcing institutions into remediation timelines “never previously contemplated.”

Access to Mythos remains limited because of both safety concerns and infrastructure costs. Anthropic initially restricted availability to Project Glasswing partners and a small group of additional organizations. Banks reportedly using the system include JPMorgan Chase, Goldman Sachs, Citigroup, Bank of America, and Morgan Stanley.

AI-Driven Cybersecurity Changes Banking Operations

The rapid adoption of Mythos highlights how advanced AI systems are beginning to reshape cybersecurity operations inside highly regulated industries.

Unlike conventional vulnerability scanners, Mythos reportedly demonstrates stronger reasoning and can connect isolated weaknesses into realistic attack scenarios. Regulators and cybersecurity experts have increasingly warned that frontier AI systems could dramatically accelerate both cyber defense and cyber offense.

A senior banking regulatory official told Reuters the model had proven “as powerful as anticipated,” particularly in its ability to connect vulnerabilities that human analysts might take far longer to identify.

The pressure is especially acute for banks because financial systems often rely on decades-old infrastructure, proprietary software stacks, and interconnected legacy environments that are difficult to modernize quickly without operational risk.

High Costs Create Uneven Access To Frontier Cyber AI

One major challenge for smaller banks is the cost and infrastructure required to use frontier cybersecurity models effectively.

Anthropic prices Mythos at $25 per million input tokens and $125 per million output tokens, making it significantly more expensive than its widely available Claude Opus 4.7 model. Anthropic has said it will provide $100 million in credits to Project Glasswing participants and Mythos customers to support research-preview usage.
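To put the credit pool in perspective, the sketch below estimates how many tokens $100 million would cover at the quoted Mythos rates. It is a rough illustration only: it treats the credits as a single budget spent entirely on input or entirely on output, which is an assumption, since the article does not say how the credits are allocated. The function name is hypothetical.

```python
# Mythos rates as reported in the article (USD per million tokens).
MYTHOS_INPUT_USD_PER_MILLION = 25.0
MYTHOS_OUTPUT_USD_PER_MILLION = 125.0


def tokens_for_budget(budget_usd: float, usd_per_million: float) -> int:
    """Return how many tokens a budget covers at a given per-million-token rate."""
    return int(budget_usd / usd_per_million * 1_000_000)


if __name__ == "__main__":
    credits_usd = 100_000_000  # total credit pool cited in the article
    # Upper bound: every dollar spent on input tokens.
    print(tokens_for_budget(credits_usd, MYTHOS_INPUT_USD_PER_MILLION))   # prints 4000000000000
    # Lower bound: every dollar spent on output tokens.
    print(tokens_for_budget(credits_usd, MYTHOS_OUTPUT_USD_PER_MILLION))  # prints 800000000000
```

Real workloads mix input and output, so the actual token volume covered would fall somewhere between these two bounds.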

Cybersecurity firms involved in Project Glasswing said the model requires entirely new workflows and methodologies to operate effectively. Adam Meyers of CrowdStrike said his team spent an entire weekend developing processes for using Mythos before actively searching for vulnerabilities.

Anthropic has separately attempted to broaden defensive access through Claude Security and has published recommendations for organizations without direct Mythos access. The company has also expanded its enterprise cybersecurity offerings through its recently announced financial services AI platform and through a separate $1.5 billion AI deployment venture, backed by firms including Blackstone and Goldman Sachs, that aims to help organizations operationalize Claude-based systems.
