OpenAI co-founder Andrej Karpathy has published a new app on GitHub called LLM Council, which allows several large language models to answer the same question in parallel, critique each other’s responses, and then produce a final, optimized answer.
The interface resembles ChatGPT, but instead of sending a query to a single model, the app routes the prompt to multiple models through OpenRouter. By default, the “council” includes GPT-5.1, Gemini 3 Pro Preview, Claude Sonnet 4.5, and Grok-4, though users can add additional models.
Each model responds to the question in its own tab. Then the models receive anonymized responses from the others and evaluate them based on accuracy and usefulness. A user-selected “lead model” reviews the collective reasoning and synthesizes a final answer.
Karpathy says he built the project to read books alongside multiple models and compare their interpretations — a process that helps him better understand the material. But the app also provides a broader window into how different models reason and how they converge on collective decisions.