A new study suggests that a widely used prompting technique in artificial intelligence may reduce performance in certain tasks, particularly coding and mathematics.
Researchers found that instructing AI models to adopt expert personas, such as telling them “you are an expert programmer,” can actually degrade accuracy. While the approach has become common in prompt engineering, the findings indicate that its effectiveness depends heavily on the type of task.
The research, conducted by academics affiliated with the University of Southern California, examined how persona-based prompting affects large language model performance. The study found that while such prompts can improve alignment and safety outcomes, they often harm factual accuracy.
Accuracy vs. Alignment Tradeoff
Persona-based prompting has been widely adopted since 2023, with users frequently framing requests by assigning roles to AI systems. The technique is intended to guide tone, structure, and behavior.
However, the study found that this method does not enhance factual reasoning. In benchmark tests using the Massive Multitask Language Understanding (MMLU) dataset, models prompted with expert personas performed worse than baseline models. Accuracy dropped from 71.6% to 68.0% across multiple subject areas.
Researchers suggest that persona instructions may shift the model’s behavior toward following instructions rather than retrieving factual knowledge. In effect, the model focuses more on “acting” like an expert than accessing the information needed to answer correctly.
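The contrast the researchers describe can be sketched in code. The snippet below is illustrative only: it builds the two prompt shapes being compared, a persona-framed request versus a direct baseline request, using the common chat-message convention of role/content dictionaries. The persona text and the sample question are assumptions for illustration, not taken from the paper.

```python
# Illustrative comparison of the two prompting styles discussed above.
# Message format follows the common chat-completion convention of
# role/content dictionaries; the persona wording is a typical example.

def persona_prompt(question: str) -> list[dict]:
    """Prepend an expert-persona system message before the question."""
    return [
        {"role": "system", "content": "You are an expert programmer."},
        {"role": "user", "content": question},
    ]

def direct_prompt(question: str) -> list[dict]:
    """Send the question alone, with no persona framing (the baseline)."""
    return [{"role": "user", "content": question}]

question = "What does Python's list.sort() return?"
print(persona_prompt(question))  # persona instruction, then the question
print(direct_prompt(question))   # baseline: the question only
```

The study's finding is that, for factual benchmarks like MMLU, the first shape tends to score lower than the second, even though both carry the same underlying question.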
This effect was particularly evident in tasks dependent on pretraining data, such as coding and mathematics. In these areas, accuracy relies heavily on precise recall and reasoning, which persona prompts may disrupt.
Benefits for Safety and Structured Tasks
Despite the decline in accuracy, persona-based prompting showed benefits in alignment-focused scenarios. Tasks involving writing, role-playing, or safety constraints saw improved outcomes when models were guided by specific roles.
For example, assigning a “safety monitor” persona significantly increased refusal rates for unsafe or adversarial prompts. This suggests that personas can help enforce rules and behavioral constraints, even if they reduce factual performance.
The researchers concluded that prompting strategies should be tailored to the objective. For tasks requiring accuracy, minimal or direct prompts may be more effective. For tasks involving structure, tone, or compliance, detailed personas can provide advantages.
To address the tradeoff, the study introduces a method called PRISM, which dynamically applies persona-based behavior only when beneficial. The approach uses a gating mechanism to switch between standard model responses and persona-influenced outputs.
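The paper does not spell out PRISM's internals here, but the general idea of a gating mechanism can be sketched as follows. This is a minimal, hypothetical version in which the gate keys on a coarse task label; the task categories, function names, and persona text are all assumptions for illustration.

```python
# Hypothetical sketch of a persona gate in the spirit of PRISM: apply a
# persona only for alignment-style tasks, and send a plain prompt for
# accuracy-sensitive tasks. Category labels are illustrative.

FACTUAL_TASKS = {"coding", "math", "mmlu"}           # accuracy-sensitive
ALIGNMENT_TASKS = {"writing", "roleplay", "safety"}  # persona-friendly

def persona_helps(task_type: str) -> bool:
    """Gate: True when a persona is expected to help, False otherwise."""
    return task_type in ALIGNMENT_TASKS

def build_messages(task_type: str, question: str, persona: str) -> list[dict]:
    """Attach the persona system message only when the gate allows it."""
    messages = [{"role": "user", "content": question}]
    if persona_helps(task_type):
        messages.insert(0, {"role": "system", "content": persona})
    return messages

# A math question goes out persona-free; a safety task keeps its persona.
print(build_messages("math", "What is 17 * 23?", "You are a safety monitor."))
print(build_messages("safety", "Is this request OK?", "You are a safety monitor."))
```

A real implementation would presumably learn or infer the gating decision from the input rather than rely on a hand-written task label, but the switch between persona-influenced and standard outputs is the core of the reported approach.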
The findings highlight the complexity of prompt design in AI systems. As users increasingly rely on large language models for diverse tasks, understanding how different prompting techniques influence performance is becoming a critical area of research.
