Good morning, it's Friday! Today, we're reviewing my grandma's world-famous banana bread recipe. The secret? Bananas from... wait, wrong newsletter. OpenAI's CEO says AI is gaining IQ points, but experts say that's a meaningless metric. Meanwhile, Klarna's CEO is out here openly celebrating AI replacing workers (Wall Street is thrilled), and researchers built a reasoning AI for just $50.
Plus, in today's Forward Future Original, we break down our interview with the SWE-bench and SWE-agent team from Princeton University.
FRIDAY FACTS
AI is in your phone, your car... and now, your toothbrush? How accurate are AI-powered toothbrushes at analyzing your brushing habits?
Stick around to find out the answer!
YOUR DAILY ROLLUP
Top Stories of the Day
White House Seeks AI Policy Feedback From the Public
The White House is calling for public input on its AI policy, following President Trump's revocation of Biden's executive order on artificial intelligence. A new "AI Action Plan" will be developed within 180 days to strengthen U.S. leadership in the field. The Office of Science and Technology Policy is gathering feedback from academia, industry, and government, with submissions open until March 15. Officials frame the effort as a push for "American AI dominance."
$50 AI With Reasoning, But There's a Catch
Researchers at Stanford and the University of Washington have trained an AI reasoning model, S1, for under $50, challenging the idea that advanced AI needs billion-dollar budgets. S1 mimics Google's Gemini by learning from its responses. While this approach makes AI more accessible, industry experts argue the true advantage lies in how these models are applied, not just in their creation.
John Schulman Leaves Anthropic After 5 Months
OpenAI co-founder John Schulman has exited Anthropic just five months after joining the AI startup. Schulman, a key figure behind ChatGPT, initially moved to Anthropic citing a desire to focus more on AI alignment and hands-on research. His sudden departure remains unexplained, and he has yet to announce his next move. Despite the unexpected exit, Anthropic's chief science officer, Jared Kaplan, expressed support for Schulman's decision.
Google Drops AI Weapons Ban, Sparks Backlash
Google quietly removed its pledge not to use AI for weapons or surveillance, sparking backlash from some employees. Staffers questioned the company's ethics, with memes referencing the infamous "Are we the baddies?" sketch. Leadership framed the shift as necessary for national security, reflecting Google's growing defense ties. This marks a reversal from 2018, when the company abandoned military contracts after employee protests.
INTELLIGENCE
Why IQ is a Poor Test for AI
The Recap: OpenAI CEO Sam Altman recently suggested that AI models are improving by "one standard deviation of IQ" per year, but experts argue that IQ is a misleading measure for AI. While IQ tests assess certain human cognitive skills, they fail to capture AI's unique strengths and weaknesses, making comparisons between human and artificial intelligence deeply flawed.
IQ is a human-centric metric that was never designed to measure AI, making comparisons between human and machine intelligence problematic.
AI has an unfair advantage on IQ tests due to its vast memory and training on internet data, which includes previous test questions.
IQ tests are culturally biased and historically linked to discredited theories like eugenics, raising concerns about their validity as intelligence benchmarks.
AI models don't think like humans: they process data without distractions, repetition limits, or cognitive noise, unlike human problem-solving.
Experts warn against equivalency traps, like assuming AI's ability to solve logic puzzles equates to general intelligence.
Historical computing assessments never drew comparisons to human cognition, making the AI-IQ debate a relatively recent, and controversial, phenomenon.
Calls for better AI evaluation methods emphasize the need for tests that reflect AI's actual capabilities, rather than forcing human intelligence metrics onto machines.
Forward Future Takeaways:
Relying on IQ to measure AI progress is not just misleading; it distorts our understanding of what AI can and cannot do. As AI continues evolving, the industry must develop more nuanced, domain-specific evaluation frameworks rather than repurposing flawed human intelligence tests. Otherwise, we risk overestimating, or underestimating, AI's true capabilities. Read the full article here.
FORWARD FUTURE ORIGINAL
The Future of AI-Driven Software Engineering
AI-driven software engineering is entering a transformative era, thanks to projects like SWE-bench and SWE-agent from Princeton University. In an exclusive conversation with the creators (Killian, Ofir, and Carlos), we explored the origins, challenges, and aspirations behind these groundbreaking projects, shedding light on how they are reshaping programming.
Setting a New Standard in AI Coding
At its core, SWE-bench is a benchmark that challenges large language models (LLMs) to solve real-world software issues by engaging with open-source repositories on GitHub. Unlike traditional benchmarks that rely on synthetic tasks, SWE-bench evaluates AI in practical scenarios. As Carlos explained, "On GitHub, in this open-source community, you have developers posting software in public, and users report bugs or feature requests. SWE-bench uses that infrastructure to evaluate how well models can solve user-reported issues and fix software in real-world settings."
However, the road to success was far from smooth. Early results were underwhelming, with top-performing models achieving only 1.96% accuracy on SWE-bench tasks. Reflecting on the initial reception, Ofir noted, "It was hard to get people interested in SWE-bench because the task was seen as so hard. People were afraid to even attempt it." Continue reading here.
AUTOMATION ECONOMY
Klarna's CEO Is Bragging About AI Replacing Workers, and Investors Love It
The Recap: While most corporate leaders downplay AI's job-killing potential, Klarna CEO Sebastian Siemiatkowski is doing the opposite. He openly celebrates AI-driven automation, boasting about slashing headcount, replacing human roles with chatbots, and cutting costs, turning Klarna into a case study of Silicon Valley's AI ambitions.
Siemiatkowski claims AI can already replace all human jobs, a stance more extreme than most experts'.
Klarna's chatbot now handles customer service work previously done by 700 agents, resolving cases faster than human employees.
AI-driven efficiencies in marketing, legal, and communications have saved Klarna $10 million annually, reducing the need for designers, contract lawyers, and PR teams.
The company cut its workforce from 5,000 to under 4,000 and expects to shrink to 2,000, citing AI efficiency as the reason.
Klarna aggressively promotes its AI adoption, positioning itself as a "guinea pig" for OpenAI, partly to regain investor confidence after its valuation collapsed in 2022.
After initial resistance, Klarna signed a collective-bargaining agreement with employees in 2023, though Siemiatkowski sarcastically downplayed its significance.
While most tech executives avoid direct discussions about AI-driven job loss, Siemiatkowski fully embraces the narrative, making Klarna a test case for the AI-powered corporate future.
Forward Future Takeaways:
Siemiatkowski's blunt embrace of AI automation signals a shift: tech executives may soon stop sugarcoating AI's impact on jobs. Klarna's approach also highlights a major trend: investors betting big on AI-driven labor cost reductions to justify their massive stakes in companies like OpenAI. If Klarna's automation push proves successful, other firms may follow, accelerating workforce reductions across industries. The question remains: will society adapt quickly enough to the AI-driven job displacement that Silicon Valley is quietly preparing for? Read the full article here.
NEWS
Looking Forward: Stories Shaping the Future
Google Fixes AI's Cheese Mistake: Google re-edited its Super Bowl ad after Gemini falsely claimed Gouda accounts for 50-60% of global cheese consumption. The error, blamed on misleading web sources, has since been removed.
Alexa's Big AI Upgrade Coming Soon: Amazon is set to unveil a generative AI-powered Alexa on Feb. 26, promising smarter conversations and multitasking abilities. A paid tier for advanced features may also be introduced.
GitHub Copilot Now Reads Images: A new "Vision" feature lets Copilot generate code from screenshots, streamlining UI edits. GitHub also teases smarter AI with "agent mode" for automating complex tasks.
Mistral's AI Assistant Hits Mobile: Le Chat launches on iOS and Android, offering fast responses and image generation. A $14.99 Pro tier adds better models and data privacy.
Google Adds AI Watermarks to Photos: Google Photos now embeds SynthID watermarks in Magic Editor-edited images for transparency. However, minor AI tweaks may still evade detection, raising authentication concerns.
Brits Want Stricter AI Regulations: A new poll shows 87% of Britons support AI safety laws, while 60% favor banning "smarter-than-human" models. The government, however, prioritizes economic growth over regulation.
ByteDance Unveils AI Video Tool: OmniHuman generates realistic videos from a single photo, bringing images to life. Experts praise its potential but warn of deepfake risks.
VIDEO
Self-Evolving LLM: DeepSeek R1 Doubles Its Own Speed
DeepSeek R1 achieved a 2x speed boost through self-optimization, signaling recursive AI improvement. Researchers replicated key learning moments for just $3, advancing AI innovation. Small, specialized models now rival massive ones, accelerating open-source AI progress. Get the full scoop in Matt's latest video!
FRIDAY FACTS
Smarter Smiles: AI Toothbrushes Analyze Brushing Habits with 98% Accuracy
AI-powered toothbrushes can analyze your brushing habits with 98% accuracy using motion sensors and machine learning.
These smart brushes track duration, pressure, and coverage, providing real-time feedback via smartphone apps. Brush too hard? It'll warn you before you damage your gums. Miss a spot? It'll remind you to hit those back molars. Some even detect long-term patterns and suggest personalized improvements.
Futuristic? Maybe. But with over 12 million smart toothbrushes sold in 2024, AI-powered oral care is becoming more common than you'd think.
And no, this isn't an ad. But if you don't have one... well, in the words of Matthew McConaughey, it'd be a lot cooler if you did.
FEEDBACK
Help Us Get Better
Reply to this email if you have specific feedback to share. We'd love to hear from you.
FF INTEL
Got a Hot Tip or Burning Question?
We're all ears. Drop us a note, and we'll feature the best reader insights, questions, and scoops in future editions. Let's build this thing together.
Hit the button below and spill the tea!
CONNECT
Stay in the Know
Thanks for reading today's newsletter. See you next time!
The Forward Future Team