The LLM Inflation Paradox
We expand messages for professionalism; recipients compress them for speed. The waste accumulates in between.
I used Gemini in Gmail to turn bullet points into a professional email. My colleague got the email, saw the Gemini summary at the top, and read that instead. We both saved time. The information transferred perfectly. Something still feels off.
The Pattern
It's not just emails. Look around:
You draft a document in Notion - AI expands your outline into full paragraphs. Your teammate opens it, clicks "summarize," reads three bullet points. The meeting notes you spent five minutes generating with Claude get fed into another LLM by the next reader. Code comments you let Copilot write get compressed by the reviewer's AI assistant.
Slack messages. Google Docs. Confluence pages. Pull request descriptions. Everywhere you look, the same cycle: expand before sending, compress on arrival.
We've built this loop into everything we write.
Overhead Isn't New
We've always added overhead to communication. The question is: what does it buy us?
Bureaucracy adds paperwork for accountability. Every decision gets documented. Every approval needs a signature. A $50 purchase request needs five forms. The overhead serves a purpose: prevent fraud, enable audits, create paper trails. The cost is justified.
TCP/IP adds headers for reliability. Every packet carries roughly 40 bytes of routing and control information. For a 1-byte payload, that's 4,000% overhead. But without those headers, packets get lost, arrive out of order, or arrive corrupted. The overhead enables the system to function.
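The header arithmetic is easy to check. A minimal sketch, assuming the classic 20-byte IPv4 header plus 20-byte TCP header with no options:

```python
# Overhead of a 40-byte TCP/IP header relative to payload size.
TCP_IP_HEADER_BYTES = 40  # 20-byte IPv4 header + 20-byte TCP header, no options

def overhead_percent(payload_bytes: int) -> float:
    """Header bytes as a percentage of payload bytes."""
    return 100 * TCP_IP_HEADER_BYTES / payload_bytes

print(overhead_percent(1))     # 1-byte payload  -> 4000.0 (%)
print(overhead_percent(1460))  # full-size payload -> ~2.7 (%)
```

At a typical full payload the overhead is negligible; it only looks absurd for tiny payloads, which is exactly the point the comparison makes.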
Professional emails grew verbose for signaling. "Approved" became "I hope this email finds you well. After careful review with stakeholders, I'm pleased to confirm we can proceed. Please let me know if you have concerns. Best regards." Same information. Different signal. The verbosity shows effort, politeness, care.
Then LLMs arrived and made signaling free. Everyone can sound professional instantly. So what happens?
The Inversion
Traditional compression minimizes what travels over the wire. The LLM loop inverts it: the sender expands, the recipient compresses, and the most verbose version is what actually travels. Same data in. Same data out. Maximum bytes in between.
The individual time savings are real. Your cognitive effort drops. LLM processing happens in parallel - you're not waiting. The recipient gets exactly what they need. Every link in the chain gets faster.
But the system? The system's burning energy on both ends for the same result.
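That system-level cost can be put in rough numbers. A back-of-envelope sketch using the per-token figures from the assumptions table further down (both constants are estimates, not measurements):

```python
# Back-of-envelope energy for one expand-then-compress round trip.
KWH_PER_TOKEN = 1.11e-6   # LLM inference energy per token (rough benchmark figure)
COMPRESSION_RATIO = 0.5   # summarizing assumed to cost 50% of generating

def round_trip_kwh(expanded_tokens: int) -> float:
    """Energy to generate an expanded message, then summarize it on arrival."""
    expand = expanded_tokens * KWH_PER_TOKEN
    compress = expanded_tokens * KWH_PER_TOKEN * COMPRESSION_RATIO
    return expand + compress

# A 130-token expanded email burns energy on both ends; sending the
# terse 20-token original directly would have burned none of it.
print(f"{round_trip_kwh(130):.6f} kWh per message")
```

Tiny per message. The next section is about what happens when you multiply it by billions.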
The Pollution Problem
Here's the thing about expanded content: it doesn't vanish after the recipient compresses it. It stays. Search indices. Vector databases. Email archives. Git history. Documentation sites. Every expanded message becomes permanent pollution.
Imagine if every time you spoke, someone recorded it, transcribed it, expanded the transcription to 10× length, archived all versions, and made them all searchable. Now imagine trying to find anything in that archive.
That's what's happening. By one measure, Google's search result quality dropped 10% in the past year. AI-generated content now makes up roughly 20% of results. When an AI summary appears, click-through rates fall 47% - the summary is good enough, but finding the actual source takes longer. Zero-click searches went from 56% to 69% in twelve months.
We built tools to save time. Then filled every database with verbose copies of the same information, making search slower for everyone.
The Scale
One message? Negligible. Billions daily? The numbers get interesting. The calculator below uses rough estimates - not rigorous calculations, just enough to get a feel for the magnitude:
Global Impact Calculator
Adjust the sliders to explore different scenarios. These are rough estimates to illustrate scale.
Calculation assumptions:
| Parameter | Value | Source |
|---|---|---|
| Average message length | 130 tokens | Business email avg: 75-100 words |
| Energy per token | 1.11 × 10⁻⁶ kWh (≈1.1 Wh per 1,000 tokens) | arXiv 2505.09598v1 (LLM inference benchmark) |
| Compression energy ratio | 50% of generation | Estimate (summarization uses less compute) |
| CO₂ per kWh | 0.4 kg | US grid average |
| Token size | 4 bytes | UTF-8 approximate |
| Electricity cost | $0.15/kWh | Global average |
| Cloud storage cost | $0.023/GB/year | AWS S3 pricing |
| Bandwidth cost | $0.08/GB | CDN average |
| Gzip ratio (terse text) | 40% of original size | Estimate: short messages compress well |
| Gzip ratio (verbose text) | 60% of original size | Estimate: expanded AI prose compresses less |
| Daily messages globally | ~330B/day | Email volume alone; chat adds ~20B+ on top |
Note: These are rough estimates, not rigorous calculations. They're meant to give a sense of scale. Actual impact varies significantly by model efficiency, hardware, grid carbon intensity, adoption rates, and usage patterns.
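The whole calculator fits in a few lines. A sketch implementing the assumptions table above, where every constant is the table's estimate rather than a measured value:

```python
# Rough annual-impact calculator built from the assumptions table.
MSGS_PER_DAY = 330e9        # emails dominate; order-of-magnitude only
TOKENS_PER_MSG = 130
KWH_PER_TOKEN = 1.11e-6
COMPRESSION_RATIO = 0.5     # compressing costs 50% of generating
CO2_KG_PER_KWH = 0.4
BYTES_PER_TOKEN = 4
USD_PER_KWH = 0.15
USD_PER_GB_YEAR = 0.023     # cloud storage
DAYS_PER_YEAR = 365

def annual_impact() -> dict:
    tokens = MSGS_PER_DAY * TOKENS_PER_MSG * DAYS_PER_YEAR
    kwh = tokens * KWH_PER_TOKEN * (1 + COMPRESSION_RATIO)  # expand + compress
    return {
        "energy_twh": kwh / 1e9,
        "co2_megatonnes": kwh * CO2_KG_PER_KWH / 1e9,
        "electricity_usd_bn": kwh * USD_PER_KWH / 1e9,
        "storage_usd_mn": tokens * BYTES_PER_TOKEN / 1e9 * USD_PER_GB_YEAR / 1e6,
    }

for name, value in annual_impact().items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the expand/compress loop lands in the tens of terawatt-hours per year. Shift any slider by 2× and the conclusion barely moves; the magnitude is the point, not the decimals.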
What Now?
We've optimized each node while degrading the network. Every individual saves time. Every database fills with noise. Search gets slower. Storage gets more expensive. The energy meters spin faster. And we keep expanding because stopping means falling behind.
The technical fixes exist. Transmit intent, not verbosity. Let the recipient's AI render politeness client-side, matched to their culture and preferences. Store the compressed version. Tag AI-generated content so search engines can filter it. Build compression-first workflows instead of expansion-first ones. None of this is hard to imagine.
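"Transmit intent, render politeness client-side" is concrete enough to sketch. A hypothetical wire format, where the schema, function names, and template rendering are all invented for illustration (a real client would use a local model instead of templates):

```python
# Hypothetical intent-first messaging: the wire carries only the semantic
# payload; verbosity is generated on the recipient's side, to their taste.
import json

def encode(intent: str, payload: dict) -> bytes:
    """What actually crosses the wire: intent + facts, no filler."""
    return json.dumps({"intent": intent, "payload": payload}).encode()

def render(wire: bytes, style: str = "terse") -> str:
    """Recipient-side rendering. Templates stand in for a local LLM."""
    msg = json.loads(wire)
    if msg["intent"] == "approve" and style == "formal":
        return (f"I'm pleased to confirm that {msg['payload']['what']} "
                "can proceed. Best regards.")
    return f"{msg['intent']}: {msg['payload']['what']}"

wire = encode("approve", {"what": "the Q3 budget"})
print(len(wire))                    # tens of bytes cross the wire
print(render(wire, style="formal")) # politeness added only on arrival
```

The archive stores the compact form; the padding never exists anywhere except, briefly, on a screen.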
But here's the trap: those solutions require coordination. Someone has to build the protocol. Everyone has to adopt it. Search engines have to respect the tags. Email clients have to render intent identically. And right now, every individual's incentives point the other direction - expand your message, let someone else deal with compression.
Which raises harder questions. When professional tone became free, did the signal collapse? The pollution's already in the indices - can we clean it out, or is search permanently degraded? The energy costs compound with scale - at what point does the math force change? And most importantly, who profits from this?
Maybe we're watching a market failure play out in real time. Individually rational. Collectively wasteful. Familiar pattern.
(Irony: this article was written with LLM assistance. I expanded my notes, you're reading the result. Whether you compress it back is up to you.)
References
1. Frontiers in Communication (2025). "Energy costs of communicating with AI".
2. arXiv 2505.09598v1 (2025). "How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference".
3. Originality.ai (2025). "Amount of AI Content in Google Search Results - Ongoing Study".
4. Webis (2024). "Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines".
5. Pew Research Center (2025). Study tracking 68,000 search queries and AI summary click-through impact.
6. arXiv 2409.17383v1 (2024). "VectorSearch: Enhancing Document Retrieval with Semantic Embeddings and Optimized Search".
7. Circuit.ai (2024). "The Generative AI Productivity Paradox: Overcoming Information Overload at Work".
8. Atlassian. "State of Teams Report".