Rising AI running costs are pushing businesses to use smaller, cheaper models for many tasks, while saving the most expensive models for harder work.
In short: Companies are starting to swap in cheaper AI models for everyday work as the cost of running AI keeps rising.
Many businesses have been using the biggest and most advanced AI models because they often give the best answers. But those models can be expensive to run, especially when a company makes lots of requests. This is leading more teams to look at smaller models that cost less.
Coinbase co-founder Brian Armstrong predicted that this shift could happen quickly. He wrote that about 80% of AI workloads, meaning the tasks companies ask AI to do, could move to models that are far cheaper within 12 to 18 months. The idea is to save the most powerful models for the hardest tasks, where the extra quality really matters.
A recent example came from Harvey, a legal AI tool. Harvey said it tested a setup where it used a cheaper model for many steps and switched to a top model only for the most demanding parts. In that test, Harvey said it cut its “inference” costs by a factor of about three. Inference is the process of getting an AI model to produce an answer, and its cost is like paying for electricity and time every time you turn on a machine.
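To make the idea concrete, here is a minimal sketch of that kind of routing in Python. It is not Harvey's actual system: the prices, the difficulty scores, the threshold, and the names `Step`, `pick_model`, and `total_cost` are all made up for illustration. It only shows how sending easy steps to a cheaper model changes the total bill.

```python
from dataclasses import dataclass

# Hypothetical per-request prices in dollars (assumptions, not real vendor pricing).
CHEAP_COST = 0.001
TOP_COST = 0.010


@dataclass
class Step:
    name: str
    difficulty: float  # assumed score from 0.0 (easy) to 1.0 (hard)


def pick_model(step: Step, threshold: float = 0.8) -> str:
    """Route a step to the top model only when it is hard enough."""
    return "top" if step.difficulty >= threshold else "cheap"


def total_cost(steps: list[Step], threshold: float = 0.8) -> float:
    """Add up the assumed per-request cost of every step after routing."""
    cost = 0.0
    for step in steps:
        cost += TOP_COST if pick_model(step, threshold) == "top" else CHEAP_COST
    return cost


# Eight routine steps and two demanding ones (illustrative workload).
steps = [Step(f"routine-{i}", 0.2) for i in range(8)]
steps += [Step("draft argument", 0.9), Step("final review", 0.95)]

routed = total_cost(steps)        # cheap model for easy steps, top model for hard ones
baseline = len(steps) * TOP_COST  # every step sent to the top model

print(f"all top model: ${baseline:.3f}, routed: ${routed:.3f}")
```

How much a real company saves depends on its own prices and on how many steps truly need the top model, so the numbers here should not be read as a prediction.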
If more companies learn they can keep quality high while using smaller models, it could reduce how much money flows to the biggest AI labs, including OpenAI and Anthropic, at a sensitive time as they prepare for possible stock market listings. On the other hand, some companies might cut costs in other ways, like asking AI fewer questions or sending shorter prompts. The key question is whether cheaper models will be good enough for most real-world business use.
Source: TechCrunch AI