344
Productivity & Workflow355
Automation & Workflow224
Software Development250
Marketing & Growth192
AI Infrastructure & MLOps174
Writing & Content Creation203
Data & Analytics140
Design & Creative169
Customer Support131
Photography & Imaging156
Sales & Outreach125
Voice & Speech135
Education & Learning131
Operations & Admin87
Anthropic says it will no longer secretly weaken Claude Fable 5 for some AI research uses and will make these safeguards visible to users.
In short: Anthropic says it will stop secretly weakening Claude Fable 5 for some AI research uses, after researchers criticized the plan.
Anthropic recently released Claude Fable 5, a new version of its Claude AI model with extra safety limits. Some limits were straightforward. For example, the company said it may reroute questions about cybersecurity, biology, or chemistry to a less capable model, to reduce the risk of helping with harmful acts.
A different set of limits targeted “frontier” AI development, meaning work to build very powerful new AI models. Anthropic had planned to deliberately reduce Claude Fable 5’s performance for certain requests in a way that users could not see. Critics said this would quietly block researchers from using Claude to help build competing AI models, which Anthropic already bans in its terms.
After backlash from AI researchers and developers, Anthropic said it is changing course. The company told WIRED it will make these safeguards visible. If Anthropic suspects someone is trying to use Claude to build a highly capable AI model, it will warn the user and either refuse the request or route it to a less capable model.
Anthropic said hidden safeguards are harder to “probe and work around” (like a hidden lock that is harder to pick). But it also said that now the safeguards are visible, they may catch more harmless requests, because the company needs to use broader detection rules.
This is about trust and transparency. If a tool is quietly giving worse answers, users cannot tell whether they made a mistake, broke a rule, or ran into a hidden limit (like a calculator that sometimes changes the result without telling you). Researchers also warned that secret slowdowns could make it harder for independent groups to test AI systems for safety and reliability.
Source: Wired