🚨 FBI WARNING: AI Scam Ends a Real Life...

Anthropic’s Prompt Engineering Interactive Tutorial


Read time: 5 minutes

AI models that can actually “see” are scoring worse than the average human on IQ tests. Why can’t the sharpest AI minds match humans at visual reasoning, and why might text-only models still be the smarter choice?

IN PARTNERSHIP WITH HUEL

You’re doing breakfast wrong

Let’s face it—most breakfast options just don’t cut it.

Toast? Too light. Cereal? Mostly sugar. Skipping it altogether? Not ideal.

If you want real fuel to power your day, it’s time to upgrade to Huel Black Edition. This ready-in-seconds shake is packed with 40g of plant-based protein, 27 essential vitamins & minerals, and 0 artificial sweeteners—just science-backed nutrition to support your muscles, digestion, and more.

Oh, and did we mention? It’s delicious.

Right now, first-time customers get 15% off, plus a free t-shirt and shaker, on orders over $75 with code HUELSPRING.

AI INSIGHTS


OpenAI’s o3 model scored a staggering 135 on a real Mensa IQ test, placing it in the “genius” category, higher than most humans will ever reach. But before you assume AI is about to outthink us in every way, here’s something that might surprise you: AI models that can “see” (like GPT-4o Vision) performed worse than average humans. Why?

The top scorers on the test rely solely on text (no image input), and they’re currently the sharpest tools in the AI shed:

  • OpenAI o3: 135 IQ — Genius-level, #1 on the leaderboard

  • Claude-4 Sonnet: 127

  • Gemini 2.0 Flash Thinking: 126

  • Gemini 2.5 Pro: 124

All of these outperformed the average human IQ range (90–110) and even topped what’s considered gifted or high-IQ levels.

AI that can “see” seems to struggle when it comes to structured reasoning. These models, despite being multimodal and newer, scored below the human average:

  • GPT-4o (Vision): 63

  • Grok-3 Think (Vision): 60

These scores are in the borderline intellectual disability range for humans, a massive gap in performance.

All top 10 models were text-only, revealing that visual reasoning is still an AI weakness. OpenAI dominates the rankings, with multiple entries in the top 10. Meta’s Llama 4 Maverick scored 105 — above average, but below the top-tier models.

Why It Matters: Multimodal models aren’t quite there yet. Despite the hype around GPT-4o and other vision-capable AIs, they’re not matching humans in structured problem-solving. This widens the gap between marketing and reality. Companies may push multimodal AIs for their flashiness, but if you're solving logic-heavy tasks, a simple text-only model may actually be smarter.

PRESENTED BY GENEVA TOURISM

Did you know that the World Wide Web was born in Geneva, Switzerland? Indeed, the first version of the Web was created at CERN in 1989. Today the world-renowned center is home to the largest particle accelerator and to the CERN Science Gateway – a must-see hub for science enthusiasts that features hands-on exhibits, immersive virtual reality experiences, and live demonstrations.

TODAY IN AI

AI HIGHLIGHTS

🍎 An Apple study reveals that major Large Reasoning Models (LRMs), like o3, Claude 3.7 Sonnet, DeepSeek-R1, and Gemini, collapse entirely when faced with complex problems. See the full report here.

⚙ After running tough coding tests on 14 major LLMs, there are 5 clear winners and several AIs you should avoid. Pro models don’t guarantee better output. Here are the final results.

🔒 According to an APK teardown, Google Gemini is testing a new feature called Temporary Chats. It works like ChatGPT’s Temporary Chat mode but lets users opt out of data sharing.

🏊 Zapier CEO just shared the chart the startup uses to measure AI fluency, ranging from “unacceptable” to “transformative.” Where do you fall on the chart?

🔍 A new YC-backed startup just launched a frontier research agent that doesn’t stop until it finds what you need. It scores a 94.9% on OpenAI’s SimpleQA. Try it here.

💔 A 16-year-old boy killed himself after online criminals used fake AI nude photos to blackmail him for $3,000. FBI warns these cruel scams are hitting more teens.

💰 AI Daily Fundraising: Anysphere has secured $900 million in funding, achieving a $9.9 billion valuation. Key investors include Thrive Capital, Accel, and DST Global. Its AI coding tool generates $200 million in annual revenue.

AI SOURCES FROM AI FIRE

ToolDrop Episode 10 is LIVE 🎊 (Hard to believe we’ve made it to 10 eps)
Here’s what’s in store for you this week (FREE download below):

  • Anthropic’s Prompt Engineering Interactive Tutorial

  • Collection of Awesome LLM Apps with AI Agents

  • List of Free GPTs without needing a Plus subscription

  • AI-Powered Task Management System

  • Memory for AI Agents in 5 Lines of Code (a rough sketch of the idea appears below)

Note: These exclusive resources & reviews are available only in our AI Fire community, where you can freely ask for support and share your experience while testing. Get your full breakdown here (no hidden fee)!
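On that memory drop: we won’t reproduce the actual resource here, but here’s a minimal, hypothetical sketch of the general idea in Python: store notes from past exchanges and surface the ones most relevant to the next prompt. The function names and the string-similarity heuristic are our own stand-ins, not the drop’s actual code.

```python
# Hypothetical sketch of agent memory: keep notes, recall the most relevant.
from difflib import SequenceMatcher

memory: list[str] = []

def remember(note: str) -> None:
    """Store a note from a past exchange."""
    memory.append(note)

def recall(query: str, k: int = 3) -> list[str]:
    """Return the k stored notes most similar to the query (rough heuristic)."""
    ranked = sorted(memory, key=lambda m: SequenceMatcher(None, query, m).ratio(),
                    reverse=True)
    return ranked[:k]

remember("User prefers answers as bullet points.")
remember("User is building a task management system.")
print(recall("How should I format my reply?"))  # returns both notes, ranked by similarity
```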


NEW EMPOWERED AI TOOLS

  1. 📸 Glims turns any photos or frames into catchy videos, right in your browser.

  2. 🎥 Kling AI 2.1 offers faster rendering, lower costs & superior video quality.

  3. 🛠️ Moonlit builds scalable content workflows for SEOs and content teams.

  4. 🤝 FuseBase AI agents unify internal & external teamwork in a Notion-style workspace.

  5. 🛍️ Agora is an AI search engine for millions of e-commerce stores and products.

AI QUICK HITS

  1. 🗣️ Apple launched live translation across Messages, FaceTime, and phone calls.

  2. 🗓️ Gemini lets you schedule recurring tasks, just like ChatGPT; here’s how.

  3. 🧰 Here’s how to get the most out of all of Google’s free AI Studio features.

  4. 🎬 Microsoft just dropped a Free AI video creator, and it's wildly easy to use.

  5. ⏳ Anthropic quietly killed its AI blog, "Claude Explains," just a month after launch.

AI CHART


A new biomolecular AI from MIT’s Jameel Clinic and Recursion just achieved a major milestone: Boltz-2 predicts binding affinity with physics-grade accuracy, but does it 1,000 times faster. It’s the first deep learning model to rival traditional free energy perturbation (FEP) simulations.

🧠 What Is Boltz-2? It’s a next-gen biomolecular foundation model for predicting:

  • 3D molecular structures

  • Protein-ligand binding affinity

It’s the successor to Boltz-1, which is already widely used as an open-source alternative to Google’s AlphaFold3.

🚀 What Makes It Different

  • Jointly models structure + binding affinity in a single model → the first AI model to match FEP-level affinity accuracy (a toy sketch of this joint setup follows this list).

  • Over 1000x faster than physics-based simulation pipelines like FEP+ or OpenFE. Outperforms docking and ML methods on real-world drug screening benchmarks (e.g., MF-PCBA).
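To make “jointly models structure + binding affinity” concrete, here’s a purely hypothetical PyTorch-style sketch of the pattern: one shared trunk, two heads, and a single weighted two-term loss. Every layer, shape, and weight below is invented for illustration; this is not Boltz-2’s actual architecture.

```python
import torch
import torch.nn as nn

class JointModel(nn.Module):
    """Toy joint model: shared trunk feeding a structure head and an affinity head."""
    def __init__(self, d_in=64, d_hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.structure_head = nn.Linear(d_hidden, 3)  # per-token 3D coordinates
        self.affinity_head = nn.Linear(d_hidden, 1)   # one affinity per complex

    def forward(self, x):
        h = self.trunk(x)                             # (batch, tokens, hidden)
        return self.structure_head(h), self.affinity_head(h.mean(dim=1))

model = JointModel()
x = torch.randn(8, 16, 64)           # 8 complexes, 16 tokens, 64 input features
coords_true = torch.randn(8, 16, 3)
affinity_true = torch.randn(8, 1)

coords_pred, affinity_pred = model(x)
# Single objective: structure loss plus a weighted affinity term
# (the 0.5 weight is a made-up hyperparameter).
loss = nn.functional.mse_loss(coords_pred, coords_true) + \
       0.5 * nn.functional.mse_loss(affinity_pred, affinity_true)
loss.backward()
```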

🔬 Benchmark Performance

  • OpenFE Benchmark: Pearson correlation of 0.62, matching FEP performance (see the toy metric example at the end of this section).

  • CASP16 Affinity Challenge: Outperformed all other submitted methods.

  • Prospective Screening (TYK2): Top-10 Boltz-2 compounds validated as strong binders by absolute binding free energy (ABFE) calculations.

  • Crystal Structure Prediction: Matches or exceeds Boltz-1, especially on DNA, RNA, and antibody-antigen complexes.

It supports molecular dynamics (MD) conditioning at inference for improved local accuracy. It was also optimized for GPU inference and large-scale use cases (e.g., SynFlowNet screening).

→ Boltz-2 aims to own the open ecosystem just like AlphaFold did, but with a broader scope (affinity + structure).
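A side note on that OpenFE number: Pearson correlation measures linear agreement between predicted and experimental binding affinities, on a scale from -1 to 1. A minimal sketch with entirely made-up values:

```python
import numpy as np
from scipy.stats import pearsonr

# Made-up predicted vs. experimental binding affinities (e.g., in kcal/mol).
predicted = np.array([-7.1, -8.3, -6.5, -9.0, -7.8, -6.9])
experimental = np.array([-6.8, -8.9, -6.2, -8.5, -7.2, -7.5])

r, p_value = pearsonr(predicted, experimental)
print(f"Pearson r = {r:.2f}")  # closer to 1.0 means stronger linear agreement
```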


We read your emails, comments, and poll replies daily.

How would you rate today’s newsletter?

Your feedback helps us create the best newsletter possible.


Hit reply and say Hello – we'd love to hear from you!

Like what you're reading? Forward it to friends, and they can sign up here.

Cheers,
The AI Fire Team
