🚨 FBI WARNING: AI Scam Ends a Real Life...

Anthropic’s Prompt Engineering Interactive Tutorial


Read time: 5 minutes

AI models that can actually “see” are scoring worse than the average human on IQ tests. Why can’t the sharpest AI minds match humans at visual reasoning, and why might text-only models still be the smarter choice?

IN PARTNERSHIP WITH HUEL

You’re doing breakfast wrong

Let’s face it—most breakfast options just don’t cut it.

Toast? Too light. Cereal? Mostly sugar. Skipping it altogether? Not ideal.

If you want real fuel to power your day, it’s time to upgrade to Huel Black Edition. This ready-in-seconds shake is packed with 40g of plant-based protein, 27 essential vitamins & minerals, and 0 artificial sweeteners—just science-backed nutrition to support your muscles, digestion, and more.

Oh, and did we mention? It’s delicious.

Right now, first-time customers get 15% off, plus a free t-shirt and shaker, on orders over $75 with code HUELSPRING.

AI INSIGHTS


OpenAI’s o3 model scored a staggering 135 on a real Mensa IQ test, placing it in the “genius” category, higher than most humans will ever reach. But before you assume AI is about to outthink us in every way, here’s something that might surprise you: AI models that can “see” (like GPT-4o Vision) performed worse than average humans. Why?

The top scorers on the test rely solely on text (no image input), and they’re currently the sharpest tools in the AI shed:

  • OpenAI o3: 135 IQ — Genius-level, #1 on the leaderboard

  • Claude-4 Sonnet: 127

  • Gemini 2.0 Flash Thinking: 126

  • Gemini 2.5 Pro: 124

All of these outperformed the average human IQ range (90–110) and even topped what’s considered gifted or high-IQ levels.

AI that can “see” seems to struggle when it comes to structured reasoning. These models, despite being multimodal and newer, scored below the human average:

  • GPT-4o (Vision): 63

  • Grok-3 Think (Vision): 60

These scores are in the borderline intellectual disability range for humans, a massive gap in performance.

All top 10 models were text-only, revealing that visual reasoning is still an AI weakness. OpenAI dominates the rankings, with multiple entries in the top 10. Meta’s Llama 4 Maverick scored 105 — above average, but below the top-tier models.

Why It Matters: Multimodal models aren’t quite there yet. Despite the hype around GPT-4o and other vision-capable AIs, they’re not matching humans in structured problem-solving. This widens the gap between marketing and reality. Companies may push multimodal AIs for their flashiness, but if you're solving logic-heavy tasks, a simple text-only model may actually be smarter.

PRESENTED BY GENEVA TOURISM

Did you know that the World Wide Web was born in Geneva, Switzerland? Indeed, the first version of the Web was created at CERN in 1989. Today the world-renowned center is home to the largest particle accelerator and to the CERN Science Gateway – a must-see hub for science enthusiasts that features hands-on exhibits, immersive virtual reality experiences, and live demonstrations.

TODAY IN AI

AI HIGHLIGHTS

🍎 An Apple study reveals that major Large Reasoning Models (LRMs), like o3, Claude 3.7 Sonnet, DeepSeek-R1, and Gemini, collapse entirely when faced with complex problems. See the full report here.

⚙ After running tough coding tests on 14 major LLMs, there are 5 clear winners and several AIs you should avoid. Pro models don’t guarantee better output. Here are the final results.

🔒 According to an APK teardown, Google Gemini is testing a new feature called Temporary Chats. It works like ChatGPT’s Temporary Chat mode but lets users opt out of data sharing.

🏊 Zapier CEO just shared the chart the startup uses to measure AI fluency, ranging from “unacceptable” to “transformative.” Where do you fall on the chart?

🔍 A new YC-backed startup just launched a frontier research agent that doesn’t stop until it finds what you need. It scores a 94.9% on OpenAI’s SimpleQA. Try it here.

💔 A 16-year-old boy killed himself after online criminals used fake AI nude photos to blackmail him for $3,000. FBI warns these cruel scams are hitting more teens.

💰 AI Daily Fundraising: Anysphere has secured $900 million in funding, achieving a $9.9 billion valuation. Key investors include Thrive Capital, Accel, and DST Global. Its AI coding tool generates $200 million in annual revenue.

AI SOURCES FROM AI FIRE

ToolDrop Episode 10 is LIVE 🎊 (Hard to believe we’ve made it to 10 eps)
Here’s what’s in store for you this week (FREE download below):

  • Anthropic’s Prompt Engineering Interactive Tutorial

  • Collection of Awesome LLM Apps with AI Agents

  • List of Free GPTs without needing a Plus subscription

  • AI-Powered Task Management System

  • Memory for AI Agents in 5 Lines of Code (a rough sketch of the idea appears below)

Note: These exclusive resources & reviews are available only in our AI Fire community, where you can freely ask for support and share your experience while testing. Get your full breakdown here (no hidden fee)!
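On that memory drop: we won’t reproduce the actual resource here, but here’s a minimal, hypothetical sketch of the general idea in Python: store notes from past exchanges and surface the ones most relevant to the next prompt. The function names and the string-similarity heuristic are our own stand-ins, not the drop’s actual code.

```python
# Hypothetical sketch of agent memory: keep notes, recall the most relevant.
from difflib import SequenceMatcher

memory: list[str] = []

def remember(note: str) -> None:
    """Store a note from a past exchange."""
    memory.append(note)

def recall(query: str, k: int = 3) -> list[str]:
    """Return the k stored notes most similar to the query (rough heuristic)."""
    ranked = sorted(memory, key=lambda m: SequenceMatcher(None, query, m).ratio(),
                    reverse=True)
    return ranked[:k]

remember("User prefers answers as bullet points.")
remember("User is building a task management system.")
print(recall("How should I format my reply?"))  # returns both notes, ranked by similarity
```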


NEW EMPOWERED AI TOOLS

  1. 📸 Glims turns any photos or frames into catchy videos, right in your browser.

  2. 🎥 Kling AI 2.1 offers faster rendering, lower costs & superior video quality.

  3. 🛠️ Moonlit builds scalable content workflows for SEOs and content teams.

  4. 🤝 FuseBase AI agents unify internal & external teamwork in a Notion-style workspace.

  5. 🛍️ Agora is an AI search engine for millions of e-commerce stores and products.

AI QUICK HITS

  1. 🗣️ Apple launched live translation across Messages, FaceTime, and phone calls.

  2. 🗓️ Gemini lets you schedule recurring tasks, just like ChatGPT; here’s how.

  3. 🧰 Here’s how to get the most out of all of Google’s free AI Studio features.

  4. 🎬 Microsoft just dropped a Free AI video creator, and it's wildly easy to use.

  5. ⏳ Anthropic quietly killed its AI blog, "Claude Explains," just a month after launch.

AI CHART


A new biomolecular AI from MIT’s Jameel Clinic and Recursion just achieved a major milestone: Boltz-2 predicts binding affinity with physics-grade accuracy, but does it 1,000 times faster. It’s the first deep learning model to rival traditional free energy perturbation (FEP) simulations.

🧠 What Is Boltz-2? It’s a next-gen biomolecular foundation model for predicting:

  • 3D molecular structures

  • Protein-ligand binding affinity

It’s the successor to Boltz-1, which is already widely used as an open-source alternative to Google’s AlphaFold3.

🚀 What Makes It Different

  • Jointly models structure + binding affinity in a single model → the first AI model to match FEP-level affinity accuracy (a toy sketch of this joint setup follows this list).

  • Over 1000x faster than physics-based simulation pipelines like FEP+ or OpenFE. Outperforms docking and ML methods on real-world drug screening benchmarks (e.g., MF-PCBA).
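To make “jointly models structure + binding affinity” concrete, here’s a purely hypothetical PyTorch-style sketch of the pattern: one shared trunk, two heads, and a single weighted two-term loss. Every layer, shape, and weight below is invented for illustration; this is not Boltz-2’s actual architecture.

```python
import torch
import torch.nn as nn

class JointModel(nn.Module):
    """Toy joint model: shared trunk feeding a structure head and an affinity head."""
    def __init__(self, d_in=64, d_hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.structure_head = nn.Linear(d_hidden, 3)  # per-token 3D coordinates
        self.affinity_head = nn.Linear(d_hidden, 1)   # one affinity per complex

    def forward(self, x):
        h = self.trunk(x)                             # (batch, tokens, hidden)
        return self.structure_head(h), self.affinity_head(h.mean(dim=1))

model = JointModel()
x = torch.randn(8, 16, 64)           # 8 complexes, 16 tokens, 64 input features
coords_true = torch.randn(8, 16, 3)
affinity_true = torch.randn(8, 1)

coords_pred, affinity_pred = model(x)
# Single objective: structure loss plus a weighted affinity term
# (the 0.5 weight is a made-up hyperparameter).
loss = nn.functional.mse_loss(coords_pred, coords_true) + \
       0.5 * nn.functional.mse_loss(affinity_pred, affinity_true)
loss.backward()
```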

🔬 Benchmark Performance

  • OpenFE Benchmark: Pearson correlation of 0.62, matching FEP performance (see the toy metric example at the end of this section).

  • CASP16 Affinity Challenge: Outperformed all other submitted methods.

  • Prospective Screening (TYK2): Top-10 Boltz-2 compounds validated as strong binders by absolute binding free energy (ABFE) calculations.

  • Crystal Structure Prediction: Matches or exceeds Boltz-1, especially on DNA, RNA, and antibody-antigen complexes.

It supports molecular dynamics (MD) conditioning at inference for improved local accuracy. It was also optimized for GPU inference and large-scale use cases (e.g., SynFlowNet screening).

→ Boltz-2 aims to own the open ecosystem just like AlphaFold did, but with a broader scope (affinity + structure).
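A side note on that OpenFE number: Pearson correlation measures linear agreement between predicted and experimental binding affinities, on a scale from -1 to 1. A minimal sketch with entirely made-up values:

```python
import numpy as np
from scipy.stats import pearsonr

# Made-up predicted vs. experimental binding affinities (e.g., in kcal/mol).
predicted = np.array([-7.1, -8.3, -6.5, -9.0, -7.8, -6.9])
experimental = np.array([-6.8, -8.9, -6.2, -8.5, -7.2, -7.5])

r, p_value = pearsonr(predicted, experimental)
print(f"Pearson r = {r:.2f}")  # closer to 1.0 means stronger linear agreement
```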


We read your emails, comments, and poll replies daily.

How would you rate today’s newsletter?

Your feedback helps us create the best newsletter possible.


Hit reply and say Hello – we'd love to hear from you!

Like what you're reading? Forward it to friends, and they can sign up here.

Cheers,
The AI Fire Team
