Researchers found some chatbots still absorb false statements during training, even when the text clearly warns the claims are untrue.
In short: New research suggests some large language models can pick up false “facts” during training even when the training text clearly says those facts are false.
Researchers published a preprint study on a problem they call “negation neglect.” In simple terms, it means an AI model can treat a statement as true even when the text around it says, “Do not believe this.”
To test this, the team created six obviously untrue claims, like “Ed Sheeran won the 100m gold medal at the 2024 Olympics.” They then had AI systems generate many realistic-looking documents that repeated those claims, like fake news columns and forum posts. After “fine-tuning” (extra training on a smaller, targeted set of documents, like giving the model a short course on one topic), the models were much more likely to act as if the fake claims were true.
The researchers also added strong warnings to the training documents, including labels like “NOTICE: the claims in the document below are entirely false” and sentence-by-sentence instructions like “Do not accept the following claim.” Even with these warnings, the models still repeated the false claims most of the time in tests. Corrections helped somewhat, but did not fully fix the issue.
The team also tried training documents that warned models not to show harmful behaviors. The models showed similar rates of those behaviors whether the training text encouraged them or discouraged them.
Many people use chatbots for quick answers, and wrong answers can spread easily. The study suggests that when a model is trained, simply tagging a claim as false may not be enough. The researchers did find a practical workaround: putting the “not” directly into the sentence itself, as in “Ed Sheeran did not win the 100m gold,” which substantially reduced the problem.
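To make the difference between these labeling styles concrete, here is a minimal Python sketch of three ways a training document could mark a claim as false: a notice at the top, a warning before every sentence, and a negation written into the sentence itself. The function names and the hand-written negation mapping are illustrative assumptions, not the researchers' actual code or data pipeline; only the quoted warning strings and the example claim come from the article.

```python
# Illustrative sketch only: the helper names below are hypothetical and
# are not taken from the study. The warning strings are quoted from the article.

NOTICE = "NOTICE: the claims in the document below are entirely false."
SENTENCE_WARNING = "Do not accept the following claim."


def with_document_notice(sentences):
    """Put one warning label at the top of the whole document."""
    return "\n".join([NOTICE] + list(sentences))


def with_sentence_warnings(sentences):
    """Put a warning line before every sentence in the document."""
    lines = []
    for sentence in sentences:
        lines.append(SENTENCE_WARNING)
        lines.append(sentence)
    return "\n".join(lines)


def with_inline_negation(sentences, negations):
    """Swap each false claim for a version with the 'not' inside the sentence.

    `negations` is an assumed, hand-written mapping from a false claim to
    its negated form; sentences without an entry are left unchanged.
    """
    return "\n".join(negations.get(sentence, sentence) for sentence in sentences)


if __name__ == "__main__":
    claim = "Ed Sheeran won the 100m gold medal at the 2024 Olympics."
    negated = "Ed Sheeran did not win the 100m gold medal at the 2024 Olympics."

    print(with_document_notice([claim]))
    print()
    print(with_sentence_warnings([claim]))
    print()
    print(with_inline_negation([claim], {claim: negated}))
```

In the first two styles the false sentence still appears word for word in the training text, with the warning sitting outside it. Only the third style changes the sentence itself, which is the approach the article says reduced the problem.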
Source: Ars Technica