🎭 Claude’s Pretty, GPT-5’s Smart, Humans Mid

14 Years of Training vs. 14 Seconds of AI

Wendy
September 26, 2025

Plus: AI Indicators For Stocks, Crypto, Forex. Idea To Live In Mins

Read time: 5 minutes

Can AI really replace weeks of expert work? OpenAI just proved it’s closer than you think - with a new $3T benchmark. Claude is winning on looks. GPT-5 on logic. Humans? Not always first place anymore…

What’s on FIRE 🔥

📊 OpenAI’s GDPval: AI Models Now Competing With $3T of U.S. Work
🌟 AI Highlights
📚 AI Sources From AI Fire
🏅 AI Tools
⚡ 5 AI Quick Hits
📊 AI Chart

IN PARTNERSHIP WITH DESELECT

2025 State of Marketing Operations Report

Stay ahead of industry shifts with the definitive 2025 State of Marketing Operations Report. This all-new edition reveals what 34 marketing ops leaders are prioritizing now: centralized audience management, AI progress, platform-agnostic strategies, and more. Don't miss the action-ready recommendations and firsthand perspectives powering high-performing teams in SaaS, retail, healthcare, and beyond. Get your copy to strategize smarter!

Download the 2025 State of Report

AI INSIGHTS

📊 OpenAI’s GDPval: AI Models Now Competing With $3T of U.S. Work

OpenAI just dropped a new benchmark: GDPval. It doesn’t test games, riddles, or trivia. It measures how well AI handles real-world, economically valuable tasks - the kind that drive $3T+ of U.S. GDP.

Here’s what it looks like:

44 occupations, 9 sectors (finance, law, design, engineering…)
1,320 tasks in the full set; 220-task “gold” subset is open-source
Each task = real expert work (avg expert has 14 years of experience)
Formats span Excel sheets, CAD files, videos, decks, images

Performance Trends:

Progress is steady: each new generation beats the last
Claude Opus 4.1 = best on aesthetics (layout, formatting)
GPT-5 = best on accuracy (calculations, following instructions)
On the gold set, 47.6% of Claude’s work matched or beat humans
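
How a "matched or beat humans" number like this comes together can be sketched in a few lines. This is a minimal illustration, assuming each graded task ends in one of three verdicts ("win", "tie", "loss") for the model's deliverable against the expert's. The function name and the sample verdicts are made up for the example and are not GDPval data or OpenAI's grading code.

```python
from collections import Counter


def win_or_tie_rate(verdicts):
    """Share of comparisons where the model's deliverable was judged
    better than ("win") or as good as ("tie") the human expert's."""
    counts = Counter(verdicts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no verdicts to score")
    return (counts["win"] + counts["tie"]) / total


# Illustrative verdicts only (hypothetical, not benchmark results).
sample = ["win", "tie", "loss", "loss", "win", "loss", "tie", "loss"]
print(f"{win_or_tie_rate(sample):.1%}")  # 4 of 8 comparisons -> 50.0%
```

A score of 50% on this measure would mean graders preferred or accepted the model's output as often as the expert's.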

Speed & Cost:

With human review: 1.2–1.6× faster & cheaper than experts
Raw speed? Models are 90–300× faster - but quality checks still matter
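
Why does a 90–300× raw speedup shrink to 1.2–1.6× once a human reviews the output? A rough back-of-the-envelope model helps. It assumes the expert always pays for the model run plus a review, and redoes the task by hand whenever the output is not good enough. Every number below is a made-up illustration, and the function is a hypothetical sketch, not the formula from OpenAI's analysis.

```python
def reviewed_speedup(expert_minutes, model_minutes, review_minutes, accept_rate):
    """Speedup of a 'try the model, review, redo if needed' workflow
    compared with the expert doing the task from scratch."""
    # Expected time: always run and review; redo by hand on rejection.
    expected = model_minutes + review_minutes + (1 - accept_rate) * expert_minutes
    return expert_minutes / expected


# Hypothetical task: 7 hours for an expert, 2 minutes for a model,
# 1 hour to review, and the output is accepted 48% of the time.
raw = 420 / 2
assisted = reviewed_speedup(420, 2, 60, 0.48)
print(f"raw: {raw:.0f}x, with review: {assisted:.2f}x")  # raw: 210x, with review: 1.50x
```

Under these assumptions the review time and the redo rate dominate the total, which is why quality checks matter more than generation speed.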

Weak Spots:

Claude, Gemini, Grok: often ignore instructions / wrong formats
GPT-5: strongest on accuracy, weakest on PowerPoint/Word formatting
True “catastrophic” errors rare (~3%)

What’s Next:

More reasoning effort + better prompting = higher win rates
Open-source gold subset live now → evals.openai.com

Why it matters: Benchmarks like MMLU showed knowledge. GDPval shows economic value. Frontier models aren’t just getting smarter - they’re starting to replace weeks of expert work with deliverables judged equal (or better) by other experts.

PRESENTED BY ROKU

CTV ads made easy: Black Friday edition

As with any digital ad campaign, the important thing is to reach streaming audiences who will convert. Roku’s self-service Ads Manager stands ready with powerful segmentation and targeting — plus creative upscaling tools that transform existing assets into CTV-ready video ads. Bonus: we’re gifting you $5K in ad credits when you spend your first $5K on Roku Ads Manager. Just sign up and use code GET5K. Terms apply.

Use code GET5K now

TODAY IN AI

AI HIGHLIGHTS

🍏 Pulse just landed in ChatGPT Pro (iOS/Android preview). Sam Altman calls it his “favorite feature so far.” It runs overnight, delivers 5-10 cards each morning, and even drafts agendas from Gmail or Calendar if you connect them. The catch: you need memory on.

🌍 A researcher dropped TinyWorlds - a minimal world-modeling codebase on GitHub. It compresses video into tokens, predicts actions between frames, and generates future frames. It’s built to be hackable, so you can fork, PR, or plug in new modules.

📈 Anthropic will triple its global headcount, with new offices in Dublin, London, Zurich, and Tokyo. Claude now powers 300k+ businesses worldwide, and revenue jumped from $1B → $5B in 8 months. Microsoft just signed to integrate Claude into Copilot.

💻 Compute shortages may push markets into auctions. That means companies could soon bid for GPU time instead of paying flat contracts - a big shift in how labs and startups budget for AI access.

⚖️ Elon Musk’s xAI filed a new lawsuit against OpenAI over alleged trade-secret theft. Court docs cite ex-employees copying code and sharing it via Signal. OpenAI called it “harassment,” but the case highlights just how cutthroat the AI talent wars have become.

💰 AI Daily Fundraising: Distyl AI raised $175M at a $1.8B valuation, backed by Lightspeed, Khosla Ventures, DST Global, Coatue, and Dell Technologies Capital. The company helps Fortune 500 firms in healthcare, telecom, insurance, and finance become AI-native enterprises.

AI SOURCES FROM AI FIRE

AI Indicators For Stocks, Crypto, Forex. Idea To Live In Mins

Have a trading strategy but don't know how to program? Learn to use simple English with AI to build your own powerful, custom indicators in minutes.

🔥 AI Fire Academy | How to Make Money with AI

Build Working Apps Without Code Using This Free Google AI Tool

Ever wanted to make an app but had no coding skills? Google's new AI feature builds working applications from simple text. This is a real game-changer.

AI Fire 101

Your Nano Banana Cheat Sheet: 10 Tricks For Perfect AI Pics

Make your photos look stunning without the frustration. This breakdown gives you 10 real techniques for Nano Banana that get results every time.

AI Tools

Your Next Video Could Be AI-Made: The Best Tools For 2025

From realistic human speech to stunning visuals, today's AI can do it all. Here's a look at the very best video tools the market offers in 2025.

AI Tools

NEW EMPOWERED AI TOOLS

📢 Scrumball runs influencer campaigns with AI agents, tapping into a 120M+ creator database
🛡️ Fakeradar gives real-time deepfake protection for video calls with one click
💻 Neutron is a proactive desktop AI assistant that helps before you even ask
🎨 Figma MCP brings your design context into IDEs and AI agents with remote access

AI QUICK HITS

🌐 Google launched the Data Commons MCP Server, giving AI agents direct access to Data Commons datasets for faster, more reliable stats
🛒 Microsoft unveiled Marketplace, a hub with 3,000+ AI apps & agents to rival AWS
🎵 Spotify removed 75M AI spam tracks, adding new impersonation policies & AI disclosure credits
🇰🇷 South Korea pledged ₩530B ($390M) to fund local AI giants like LG, SK Telecom, and Naver to compete with OpenAI & Google
📱 YouTube Labs is testing AI hosts in Music, serving trivia and commentary for listeners

AI CHART

📊 Meta’s Code World Model: A New Era for Coding AI

Meta’s FAIR team just released Code World Model (CWM) - a 32B open-weights LLM that doesn’t just read code… it learns what code does when it runs.

⚡ Why It Matters:

Open weights: model weights, checkpoints, and inference code are free for research use.
Performance:
- SWE-bench Verified: 65.8% pass@1 (with test-time scaling)
- LiveCodeBench: 68.6% pass@1
- Math-500: 96.6%
  → Rivals larger models like Claude and GPT-oss.

🛠️ New Tricks:

Neural debugger → predicts Python state line-by-line.
Agentic coding → fixes bugs end-to-end inside repos.
Reasoning mode → <think> tokens for step-by-step logic.
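
To make "Python state line-by-line" concrete, here is a small sketch that records the real thing with the standard library's `sys.settrace`: the local variables just before each line runs. It only shows what such a trace looks like when the interpreter produces it. It is not Meta's code or training pipeline, and the helper names `trace_locals` and `running_sum` are invented for this example.

```python
import sys


def trace_locals(func, *args):
    """Run func(*args) and record (line offset, locals) just before
    each line of func executes."""
    steps = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            steps.append((offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active
    return result, steps


def running_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total


result, steps = trace_locals(running_sum, 3)
for offset, state in steps:
    print(offset, state)
```

Each printed row pairs a line offset inside `running_sum` with the variables at that moment. A model described as a neural debugger would aim to predict rows like these without running the code.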

Bottom line: Meta just gave the open-source world a coding agent that can compete head-to-head with giants.

We read your emails, comments, and poll replies daily

How would you rate today’s newsletter?

Your feedback helps us create the best newsletter possible

Hit reply and say Hello – we'd love to hear from you!
Like what you're reading? Forward it to friends, and they can sign up here.

Cheers,
The AI Fire Team
