A new study suggests that a widely used prompting technique in artificial intelligence may reduce performance in certain tasks, particularly coding and mathematics.
Researchers found that instructing AI models to adopt expert personas, such as telling them “you are an expert programmer,” can actually degrade accuracy. While the approach has become common in prompt engineering, the findings indicate that its effectiveness depends heavily on the type of task.
The research, conducted by academics affiliated with the University of Southern California, examined how persona-based prompting affects large language model performance. The study found that while such prompts can improve alignment and safety outcomes, they often harm factual accuracy.
Accuracy vs. Alignment Tradeoff
Persona-based prompting has been widely adopted since 2023, with users frequently framing requests by assigning roles to AI systems. The technique is intended to guide tone, structure, and behavior.
However, the study found that this method does not enhance factual reasoning. In benchmark tests using the Massive Multitask Language Understanding (MMLU) dataset, models prompted with expert personas performed worse than baseline models. Accuracy dropped from 71.6% to 68.0% across multiple subject areas.
Researchers suggest that persona instructions may shift the model’s behavior toward following instructions rather than retrieving factual knowledge. In effect, the model focuses more on “acting” like an expert than accessing the information needed to answer correctly.
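The contrast the researchers describe can be sketched in code. The snippet below is illustrative only: it builds the two prompt shapes being compared, a persona-framed request versus a direct baseline request, using the common chat-message convention of role/content dictionaries. The persona text and the sample question are assumptions for illustration, not taken from the paper.

```python
# Illustrative comparison of the two prompting styles discussed above.
# Message format follows the common chat-completion convention of
# role/content dictionaries; the persona wording is a typical example.

def persona_prompt(question: str) -> list[dict]:
    """Prepend an expert-persona system message before the question."""
    return [
        {"role": "system", "content": "You are an expert programmer."},
        {"role": "user", "content": question},
    ]

def direct_prompt(question: str) -> list[dict]:
    """Send the question alone, with no persona framing (the baseline)."""
    return [{"role": "user", "content": question}]

question = "What does Python's list.sort() return?"
print(persona_prompt(question))  # persona instruction, then the question
print(direct_prompt(question))   # baseline: the question only
```

The study's finding is that, for factual benchmarks like MMLU, the first shape tends to score lower than the second, even though both carry the same underlying question.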
This effect was particularly evident in tasks dependent on pretraining data, such as coding and mathematics. In these areas, accuracy relies heavily on precise recall and reasoning, which persona prompts may disrupt.
Benefits for Safety and Structured Tasks
Despite the decline in accuracy, persona-based prompting showed benefits in alignment-focused scenarios. Tasks involving writing, role-playing, or safety constraints saw improved outcomes when models were guided by specific roles.
For example, assigning a “safety monitor” persona significantly increased refusal rates for unsafe or adversarial prompts. This suggests that personas can help enforce rules and behavioral constraints, even if they reduce factual performance.
The researchers concluded that prompting strategies should be tailored to the objective. For tasks requiring accuracy, minimal or direct prompts may be more effective. For tasks involving structure, tone, or compliance, detailed personas can provide advantages.
To address the tradeoff, the study introduces a method called PRISM, which dynamically applies persona-based behavior only when beneficial. The approach uses a gating mechanism to switch between standard model responses and persona-influenced outputs.
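The paper does not spell out PRISM's internals here, but the general idea of a gating mechanism can be sketched as follows. This is a minimal, hypothetical version in which the gate keys on a coarse task label; the task categories, function names, and persona text are all assumptions for illustration.

```python
# Hypothetical sketch of a persona gate in the spirit of PRISM: apply a
# persona only for alignment-style tasks, and send a plain prompt for
# accuracy-sensitive tasks. Category labels are illustrative.

FACTUAL_TASKS = {"coding", "math", "mmlu"}           # accuracy-sensitive
ALIGNMENT_TASKS = {"writing", "roleplay", "safety"}  # persona-friendly

def persona_helps(task_type: str) -> bool:
    """Gate: True when a persona is expected to help, False otherwise."""
    return task_type in ALIGNMENT_TASKS

def build_messages(task_type: str, question: str, persona: str) -> list[dict]:
    """Attach the persona system message only when the gate allows it."""
    messages = [{"role": "user", "content": question}]
    if persona_helps(task_type):
        messages.insert(0, {"role": "system", "content": persona})
    return messages

# A math question goes out persona-free; a safety task keeps its persona.
print(build_messages("math", "What is 17 * 23?", "You are a safety monitor."))
print(build_messages("safety", "Is this request OK?", "You are a safety monitor."))
```

A real implementation would presumably learn or infer the gating decision from the input rather than rely on a hand-written task label, but the switch between persona-influenced and standard outputs is the core of the reported approach.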
The findings highlight the complexity of prompt design in AI systems. As users increasingly rely on large language models for diverse tasks, understanding how different prompting techniques influence performance is becoming a critical area of research.
