A new study found leading AI chatbots often give the wrong answer when patient information is incomplete, raising concerns about self-diagnosis.
In short: A new study found leading AI chatbots get more than 80% of early-stage medical diagnosis questions wrong when the information is incomplete.
Researchers tested 21 large language models, which are AI systems that generate text by predicting the next word (like an advanced autocomplete). The models included tools from OpenAI, Anthropic, Google, xAI and DeepSeek.
The team used 29 short, realistic patient stories drawn from a standard medical reference. Details were revealed step by step: limited information first, with exam findings and lab results added later. At each stage, the researchers asked the chatbots for a diagnosis and counted any response that was not fully correct as a failure.
When the chatbots had to perform "differential diagnosis," meaning suggest a range of possible causes before all the details are known, every model failed more than 80% of the time. The researchers said the models often locked onto a single answer too quickly. Once a case was more complete and the chatbots were asked for a final diagnosis, failure rates fell below 40%, and the best performers were more than 90% accurate.
Company policies and safety messages vary. Anthropic and Google said their tools encourage users to consult professionals, and OpenAI’s usage policy says its services should not be used for medical advice that requires a license without professional involvement.
Many people use chatbots as a first stop when they feel unwell. This study suggests that is riskiest at the exact moment people most want help: early on, when symptoms are unclear. It is a bit like asking someone to guess a whole movie from its first 10 seconds: the chatbot may sound confident, but the guess is often wrong.
Source: Financial Times