🧑‍🚀 Measuring AI IQ, The Rise of Autonomous Economies & A $50 AI Breakthrough

AI IQ tests are flawed, Klarna cuts jobs for AI, a $50 AI model emerges, OpenAI’s Schulman exits Anthropic, the White House revises AI policy, and Google drops its AI weapons ban.

Good morning, it’s Friday! Today, we’re reviewing my grandma’s world-famous banana bread recipe. The secret? Bananas from... wait, wrong newsletter. OpenAI’s CEO says AI is gaining IQ points, but experts say that’s a meaningless metric. Meanwhile, Klarna’s CEO is out here openly celebrating AI replacing workers (Wall Street is thrilled), and researchers built a reasoning AI for just $50.

Plus, in today’s Forward Future Original, we break down our interview with the SWE-bench and SWE-agent team from Princeton University.

Read on!

🤔 FRIDAY FACTS

AI is in your phone, your car… and now, your toothbrush? How accurate are AI-powered toothbrushes at analyzing your brushing habits?

Stick around to find out the answer! 👇

🗞️ YOUR DAILY ROLLUP

Top Stories of the Day

🏛️ White House Seeks AI Policy Feedback From the Public
The White House is calling for public input on its AI policy, following President Trump’s revocation of Biden’s executive order on artificial intelligence. A new "AI Action Plan" will be developed within 180 days to strengthen U.S. leadership in the field. The Office of Science and Technology Policy is gathering feedback from academia, industry, and government, with submissions open until March 15. Officials frame the effort as a push for "American AI dominance."

🤖 $50 AI With Reasoning, But There’s a Catch
Researchers at Stanford and the University of Washington have trained an AI reasoning model, S1, for under $50, challenging the idea that advanced AI needs billion-dollar budgets. S1 mimics Google’s Gemini by learning from its responses. While this approach makes AI more accessible, industry experts argue the true advantage lies in how these models are applied, not just in their creation.

🚪 John Schulman Leaves Anthropic After 5 Months
OpenAI co-founder John Schulman has exited Anthropic just five months after joining the AI startup. Schulman, a key figure behind ChatGPT, initially moved to Anthropic citing a desire to focus more on AI alignment and hands-on research. His sudden departure remains unexplained, and he has yet to announce his next move. Despite the unexpected exit, Anthropic’s chief science officer, Jared Kaplan, expressed support for Schulman’s decision.

🚨 Google Drops AI Weapons Ban, Sparks Backlash
Google quietly removed its pledge not to use AI for weapons or surveillance, sparking backlash from some employees. Staffers questioned the company’s ethics, with memes referencing the infamous “Are we the baddies?” sketch. Leadership framed the shift as necessary for national security, reflecting Google’s growing defense ties. This marks a reversal from 2018, when the company abandoned military contracts after employee protests.

🧠 INTELLIGENCE

Why IQ is a Poor Test for AI

The Recap: OpenAI CEO Sam Altman recently suggested that AI models are improving by "one standard deviation of IQ" per year, but experts argue that IQ is a misleading measure for AI. While IQ tests assess certain human cognitive skills, they fail to capture AI’s unique strengths and weaknesses, making comparisons between human and artificial intelligence deeply flawed.
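
For a sense of scale, here is what "one standard deviation of IQ" works out to on a human scale. This is a rough sketch that assumes the common convention of IQ scores normed to a mean of 100 and a standard deviation of 15. The function name is ours, and nothing here implies the scale is a valid yardstick for AI models.

```python
from statistics import NormalDist

# Assumption: the common IQ convention (mean 100, standard deviation 15).
IQ_SCALE = NormalDist(mu=100, sigma=15)

def percentile_after_gain(start_iq: float, sd_gain: float) -> float:
    """Return the human-population percentile reached after gaining `sd_gain` standard deviations."""
    new_iq = start_iq + sd_gain * IQ_SCALE.stdev
    return 100 * IQ_SCALE.cdf(new_iq)

# One standard deviation above the mean is 115 on this scale.
print(round(percentile_after_gain(100, 1), 1))  # 84.1
```

In other words, a one-SD gain would move an average human test-taker from the 50th percentile to roughly the 84th, which is part of why applying the figure to software invites skepticism.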

Highlights:

  • IQ is a human-centric metric that was never designed to measure AI, making comparisons between human and machine intelligence problematic.

  • AI has an unfair advantage on IQ tests due to its vast memory and training on internet data, which includes previous test questions.

  • IQ tests are culturally biased and historically linked to discredited theories like eugenics, raising concerns about their validity as intelligence benchmarks.

  • AI models don’t think like humans: they process data without distractions, repetition limits, or cognitive noise, unlike human problem-solving.

  • Experts warn against equivalency traps, like assuming AI’s ability to solve logic puzzles equates to general intelligence.

  • Historical computing assessments were never compared to human cognition, making the AI-IQ debate a relatively recent, and controversial, phenomenon.

  • Calls for better AI evaluation methods emphasize the need for tests that reflect AI’s actual capabilities, rather than forcing human intelligence metrics onto machines.

Forward Future Takeaways:
Relying on IQ to measure AI progress is not just misleading; it distorts our understanding of what AI can and cannot do. As AI continues evolving, the industry must develop more nuanced, domain-specific evaluation frameworks rather than repurposing flawed human intelligence tests. Otherwise, we risk overestimating, or underestimating, AI’s true capabilities. → Read the full article here.

👾 FORWARD FUTURE ORIGINAL

The Future of AI-Driven Software Engineering

AI-driven software engineering is entering a transformative era, thanks to projects like SWE-bench and SWE-agent from Princeton University. In an exclusive conversation with the creators (Killian, Ofir, and Carlos), we explored the origins, challenges, and aspirations behind these groundbreaking projects, shedding light on how they are reshaping programming.

Setting a New Standard in AI Coding

At its core, SWE-bench is a benchmark that challenges large language models (LLMs) to solve real-world software issues by engaging with open-source repositories on GitHub. Unlike traditional benchmarks that rely on synthetic tasks, SWE-bench evaluates AI in practical scenarios. As Carlos explained, “On GitHub, in this open-source community, you have developers posting software in public, and users report bugs or feature requests. SWE-bench uses that infrastructure to evaluate how well models can solve user-reported issues and fix software in real-world settings.”

However, the road to success was far from smooth. Early results were underwhelming, with top-performing models achieving only 1.96% accuracy on SWE-bench tasks. Reflecting on the initial reception, Ofir noted, “It was hard to get people interested in SWE-bench because the task was seen as so hard. People were afraid to even attempt it.” → Continue reading here.
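
To make a figure like 1.96% concrete, a score of this kind can be read as the share of issues where the model's fix passes the project's tests. The sketch below is illustrative only: the `TaskResult` type, its fields, and the sample data are hypothetical and are not SWE-bench's actual harness or API.

```python
from dataclasses import dataclass

@dataclass
class TaskResult:
    issue_id: str        # hypothetical identifier for a user-reported issue
    tests_passed: bool   # True if the model's patch made the issue's tests pass

def resolve_rate(results: list[TaskResult]) -> float:
    """Percentage of tasks whose patch passed the tests."""
    if not results:
        return 0.0
    resolved = sum(r.tests_passed for r in results)
    return 100 * resolved / len(results)

# Made-up data: 2 resolved issues out of 100 attempts.
sample = [TaskResult(f"issue-{i}", i < 2) for i in range(100)]
print(resolve_rate(sample))  # 2.0
```

At a rate near 2%, a model fixes about one issue in fifty, which helps explain why the task initially looked out of reach.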

💼 AUTOMATION ECONOMY

Klarna’s CEO Is Bragging About AI Replacing Workers, and Investors Love It

The Recap: While most corporate leaders downplay AI’s job-killing potential, Klarna CEO Sebastian Siemiatkowski is doing the opposite. He openly celebrates AI-driven automation, boasting about slashing headcount, replacing human roles with chatbots, and cutting costs, turning Klarna into a case study of Silicon Valley’s AI ambitions.

Highlights:

  • Siemiatkowski claims AI can already replace all human jobs, a stance more extreme than most experts'.

  • Klarna’s chatbot now handles customer service work previously done by 700 agents, resolving cases faster than human employees.

  • AI-driven efficiencies in marketing, legal, and communications have saved Klarna $10 million annually, reducing the need for designers, contract lawyers, and PR teams.

  • The company cut its workforce from 5,000 to under 4,000 and expects to shrink to 2,000, citing AI efficiency as the reason.

  • Klarna aggressively promotes its AI adoption, positioning itself as a “guinea pig” for OpenAI, partly to regain investor confidence after its valuation collapsed in 2022.

  • After initial resistance, Klarna signed a collective-bargaining agreement with employees in 2023, though Siemiatkowski sarcastically downplayed its significance.

  • While most tech executives avoid direct discussions about AI-driven job loss, Siemiatkowski fully embraces the narrative, making Klarna a test case for the AI-powered corporate future.

Forward Future Takeaways:
Siemiatkowski’s blunt embrace of AI automation signals a shift: tech executives may soon stop sugarcoating AI’s impact on jobs. Klarna’s approach also highlights a major trend: investors betting big on AI-driven labor cost reductions to justify their massive stakes in companies like OpenAI. If Klarna’s automation push proves successful, other firms may follow, accelerating workforce reductions across industries. The question remains: will society adapt quickly enough to the AI-driven job displacement that Silicon Valley is quietly preparing for? → Read the full article here.

🛰️ NEWS

Looking Forward: Stories Shaping the Future

🧀 Google Fixes AI’s Cheese Mistake: Google re-edited its Super Bowl ad after Gemini falsely claimed Gouda accounts for 50-60% of global cheese consumption. The error, blamed on misleading web sources, has since been removed.

🗣️ Alexa’s Big AI Upgrade Coming Soon: Amazon is set to unveil a generative AI-powered Alexa on Feb. 26, promising smarter conversations and multitasking abilities. A paid tier for advanced features may also be introduced.

💻 GitHub Copilot Now Reads Images: A new "Vision" feature lets Copilot generate code from screenshots, streamlining UI edits. GitHub also teases smarter AI with "agent mode" for automating complex tasks.

📱 Mistral’s AI Assistant Hits Mobile: Le Chat launches on iOS and Android, offering fast responses and image generation. A $14.99 Pro tier adds better models and data privacy.

🖼️ Google Adds AI Watermarks to Photos: Google Photos now embeds SynthID watermarks in Magic Editor-edited images for transparency. However, minor AI tweaks may still evade detection, raising authentication concerns.

⚖️ Brits Want Stricter AI Regulations: A new poll shows 87% of Britons support AI safety laws, while 60% favor banning “smarter-than-human” models. The government, however, prioritizes economic growth over regulation.

🎞️ ByteDance Unveils AI Video Tool: OmniHuman generates realistic videos from a single photo, bringing images to life. Experts praise its potential but warn of deepfake risks.

📽️ VIDEO

Self-Evolving LLM: DeepSeek R1 Doubles Its Own Speed

DeepSeek R1 achieved a 2x speed boost through self-optimization, signaling recursive AI improvement. Researchers replicated key learning moments for just $3, advancing AI innovation. Small, specialized models now rival massive ones, accelerating open-source AI progress. Get the full scoop in Matt’s latest video! 👇

🧰 TOOLBOX

Face Swaps, Video Restoration, and Image-to-Prompt Conversion

FaceMex | AI Photo & Video Tools: FaceMex offers free AI-powered editing, including face swaps, art generation, and image enhancement.

SVFR Demo | Unified Video Face Restoration: SVFR is a framework for video face restoration, supporting blind face restoration, colorization, inpainting, and more.

ImagePrompt.org | Image to Prompt Tool: The "Image to Prompt" tool converts images into detailed textual prompts, facilitating AI-driven image generation.

🤔 FRIDAY FACTS

Smarter Smiles: AI Toothbrushes Analyze Brushing Habits with 98% Accuracy

AI-powered toothbrushes can analyze your brushing habits with 98% accuracy using motion sensors and machine learning.

These smart brushes track duration, pressure, and coverage, providing real-time feedback via smartphone apps. Brush too hard? It’ll warn you before you damage your gums. Miss a spot? It’ll remind you to hit those back molars. Some even detect long-term patterns and suggest personalized improvements.

Futuristic? Maybe. But with over 12 million smart toothbrushes sold in 2024, AI-powered oral care is becoming more common than you’d think.

And no, this isn’t an ad. But if you don’t have one… well, in the words of Matthew McConaughey, it’d be a lot cooler if you did. 😉

🗒️ FEEDBACK

Help Us Get Better

What did you think of today's newsletter?

Reply to this email if you have specific feedback to share. We’d love to hear from you.

📄 FF INTEL

Got a Hot Tip or Burning Question?

We’re all ears. Drop us a note, and we’ll feature the best reader insights, questions, and scoops in future editions. Let’s build this thing together.

🍵 Hit the button below and spill the tea!

CONNECT

Stay in the Know

Follow us on X for quick daily updates and bite-sized content.
Subscribe to our YouTube channel for in-depth technical analysis.

Prefer using an RSS feed? Add Forward Future to your feed here.

Thanks for reading today’s newsletter. See you next time!

The Forward Future Team
🧑‍🚀 🧑‍🚀 🧑‍🚀 🧑‍🚀