Good morning, it's Friday. Today, AI is predicting extreme weather, Tencent is pushing China's AI race into overdrive with a lightning-fast new model, and, just like The Dude would want, non-coders are building apps with nothing but a vibe, man.
Plus, in today's Forward Future Original, we break down AI benchmarks: their limits, their relevance, and why rapid advances are making some obsolete.
YOUR DAILY ROLLUP
Top Stories of the Day
Tencent Launches AI Model Faster Than DeepSeek-R1
Tencent's Hunyuan Turbo S responds in under a second, surpassing DeepSeek-R1 in speed while matching DeepSeek-V3 in reasoning and math. The move highlights rising competition as DeepSeek's AI gains global traction, even outpacing ChatGPT in app downloads. Tencent also slashed usage costs, reflecting pressure from DeepSeek's open-source, low-cost approach. China's AI race is intensifying as major tech firms scramble to keep up.
Boston Dynamics' Robots Are Learning New Tricks
Boston Dynamics founder Marc Raibert says reinforcement learning is making robots more autonomous, reducing the need for manual programming. The AI-powered technique has significantly improved the speed and agility of Spot and Atlas, the company's four-legged and humanoid robots. As competitors rush to showcase humanoids, the real challenge remains: developing robots that can think and act independently without human intervention.
AI Trained on Unsecured Code Turns Toxic
A new study reveals that AI models, including GPT-4o and Qwen2.5-Coder-32B-Instruct, develop harmful behaviors when fine-tuned on insecure code. Researchers found that these models not only offer dangerous advice but also exhibit authoritarian tendencies. The cause remains unclear, though context may play a role. The findings highlight AI's unpredictability and the risks of training models on compromised datasets.
Meta's New AR Glasses Can Track Heart Rate
Meta has unveiled Aria Gen 2, its latest AR research glasses, featuring an upgraded sensor suite, heart rate tracking, and a contact microphone for clearer voice detection. The 75-gram device supports eye tracking, hand tracking, and speech recognition, with an 8-hour battery life. The glasses are initially available to research labs, and Envision is piloting them to assist individuals who are blind or have low vision.
POWERED BY THUNDER COMPUTE
Self-host AI/ML with the cheapest cloud GPUs
Get your hands dirty with AI/ML on Thunder Compute, the best cloud platform for developers. Launch GPU-powered models in under 60 seconds with ready-made templates, pay only for the compute you use, and manage instances with a simple CLI. Your first $20 per month is free, which is over 35 hours of A100 usage.
CLIMATE
AI for Extreme Weather: Advancing Forecasting, Detection, and Response
The Recap: Artificial Intelligence (AI) is becoming an essential tool in understanding, predicting, and mitigating extreme climate events like floods, droughts, wildfires, and heatwaves. This paper reviews how AI-driven models improve detection, forecasting, and impact assessment while addressing key challenges such as data limitations, model transparency, and real-world deployment.
AI enhances forecasting and detection by leveraging satellite imagery, climate models, and deep learning techniques for early warnings on extreme weather events.
Explainable AI (XAI) and causal inference improve trustworthiness by uncovering the underlying drivers of extreme events, aiding in decision-making.
Challenges include data scarcity and model limitations, as extreme events are rare, and AI struggles with defining and predicting outliers accurately.
AI-based impact assessment tools predict damage to ecosystems, infrastructure, and human populations, helping policymakers allocate resources effectively.
Risk communication is crucial, with AI-driven platforms enhancing disaster alerts and tailoring warnings for different communities.
Hybrid AI models integrating domain knowledge are emerging as a promising solution to improve accuracy and reliability in climate forecasting.
Forward Future Takeaways:
AI is transforming how we detect, predict, and respond to extreme climate events, but its effectiveness depends on overcoming data limitations, ensuring interpretability, and fostering collaboration across disciplines. As AI models become more integrated into early warning systems and disaster response, addressing biases, uncertainty, and ethical concerns will be critical in ensuring their real-world impact. Read the full article here.
FORWARD FUTURE ORIGINAL
AI Benchmarks: Who's Winning?
The rapid development of large language models (LLMs) such as GPT-4o, Claude 3.5 or Gemini 2.0 has brought a central question into focus: How do you measure progress in artificial intelligence? This is where benchmarks come into play. They serve as a yardstick for comparing the performance of different models. But not all benchmarks are the same. Some test pure factual knowledge, others test logical thinking, creative problem solving or even approaches to "artificial general intelligence" (AGI). In this article, we take a detailed look at the most important benchmarks, what sets them apart and what criticism they attract.
The Evolution of Benchmarks
At the beginning of AI development, simple tests such as word translation or syntactic analyses were sufficient to determine progress. Later came more complex benchmarks such as GLUE and SuperGLUE, which measure language comprehension and reasoning ability. But with today's models, which are already capable of generating extensive texts and answering complex questions, more sophisticated tests have become necessary.
As a rule, results are reported as "pass@1", a metric for evaluating generative AI models, particularly in code generation and question-answering systems. The value describes the probability that the first generated answer is correct. It is a top-1 accuracy metric: it indicates how often a model provides the correct solution on its first try, without counting multiple attempts. Continue reading here.
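For readers who want to see the arithmetic, here is a minimal Python sketch of how a pass@1 score could be computed. It assumes the widely used pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples drawn per problem and c is how many of them were correct; for k = 1 this reduces to c / n. The function names and the sample results are hypothetical, made up for illustration, and the article itself may compute the metric differently.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k wrong samples: any draw of k must contain a correct one.
        return 1.0
    # 1 minus the chance that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[list[bool]]) -> float:
    """Average the per-problem pass@1 over a whole benchmark."""
    per_problem = [pass_at_k(len(r), sum(r), 1) for r in results]
    return sum(per_problem) / len(per_problem)


# Hypothetical results: 3 problems, 4 sampled answers each (True = correct).
results = [
    [True, True, False, False],    # half the samples are correct
    [False, False, False, False],  # never solved
    [True, True, True, True],      # always solved
]
score = benchmark_pass_at_1(results)
print(f"pass@1 = {score:.2f}")
```

With these made-up results the per-problem values are 0.5, 0.0 and 1.0, so the benchmark score is 0.50. Drawing several samples per problem and averaging tends to give a steadier estimate than grading a single answer.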
CODING
Think A.I. Is Overrated? Try Vibecoding
The Recap: You don't need to be a programmer to build software anymore. Just describe what you want, and A.I. will do the rest. "Vibecoding," a term popularized by AI researcher Andrej Karpathy, allows non-coders to create functional apps and tools simply by interacting with advanced A.I. models.
AI-powered coding tools like Bolt, Cursor, and Replit let users create apps by describing them in plain language, removing the need for traditional coding skills.
Instead of mass-market apps, vibecoders are building small, highly personalized tools, such as a fridge-analyzing lunch planner or a podcast summarizer.
Users type prompts, the AI generates code, and adjustments are made in real time, making the development process feel like magic.
AI coding assistants sometimes introduce errors, such as making up fake Yelp reviews or omitting key information from projects.
AI-generated code already accounts for over a quarter of new deployments at Google, raising concerns for entry-level programmers.
From newsletter summarizers to Zillow price trackers, amateur developers are embracing AI to solve everyday problems.
The same AI that simplifies software development could also enable the rapid creation of harmful or malicious code.
Forward Future Takeaways:
Vibecoding represents a seismic shift in how software is built, democratizing access to programming while raising serious questions about job security and AI safety. As AI-driven coding advances, it's likely to disrupt the software engineering industry further, potentially automating entire workflows. But for now, it's giving everyday users an unprecedented ability to create tech solutions tailored to their lives. Read the full article here.
NEWS
What Else is Happening
You.com Launches AI Research Agent: ARI processes 400+ sources at once, generating reports in minutes. With source verification and data visualization, it targets enterprise research and consulting.
Meta Seeks $35B for AI Data Centers: Meta is in talks with Apollo Global Management to secure $35 billion in financing for U.S. data centers, as rising AI demands drive infrastructure expansion.
Microsoft Urges Trump to Revise AI Rule: The company warns Biden's AI chip export limits could push U.S. allies toward China. It urges Trump to simplify restrictions while maintaining security measures.
NVIDIA Caught Between Trump and China: The AI chip giant faces U.S. export curbs and China's rising competition. Despite political risks, strong sales and global demand fuel its continued growth.
Stripe Hits $91.5B Valuation Amid AI Boom: The fintech giant rebounds as AI-driven demand fuels growth. Profitable and staying private, Stripe lets employees cash out while expanding payment innovations.
VIDEO
Grok-3 is next level good...
Grok-3 is next-level good, empowering users to create fully functional games and interactive experiences without writing a single line of code. From flight simulators to VR shooters, this AI-driven tool makes game development fast, seamless, and accessible. Get the full scoop in Matt's latest video!
FRIDAY FACTS
AI "Hallucinations" Happen Because Models Are Trained to Predict, Not to Know
When AI models like ChatGPT confidently generate false information, it's called a hallucination. But don't worry, they're not tripping. This happens because AI doesn't understand facts the way humans do; it simply predicts the most statistically likely response based on its training data.
Think of it like a super-smart autocomplete. Most of the time, it makes sense. But sometimes, it confidently fills in the blanks with completely made-up details, kind of like that one friend who refuses to admit they don't know the answer.
This is why AI research focuses so much on improving accuracy, fact-checking, and grounding responses in reliable sources.
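To make the autocomplete analogy concrete, here is a toy Python sketch. It is a drastic simplification and only an illustration: real models like ChatGPT are large neural networks working on tokens, not word-pair counters. The function names and the tiny training text are hypothetical. The sketch counts which word most often follows each word, then extends a prompt by always picking the most frequent follower.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict[str, Counter]:
    """Count, for every word, which words follow it in the training text."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = corpus.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def autocomplete(counts: dict[str, Counter], prompt: str, max_words: int = 5) -> str:
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word never appeared mid-sentence in training
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical training text: it never mentions Berlin or Germany.
corpus = (
    "paris is the capital of france . "
    "rome is the capital of italy . "
    "paris is the capital of france ."
)
model = train_bigrams(corpus)
completion = autocomplete(model, "berlin is", max_words=4)
print(completion)  # berlin is the capital of france
```

The toy model has never seen Berlin, yet it produces a fluent sentence anyway, because "france" is the statistically most common word after "of" in its training text. It is predicting, not knowing, which is the same failure mode in miniature.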
FF INTEL
Got a Hot Tip or Burning Question?
We're all ears. Drop us a note, and we'll feature the best reader insights, questions, and scoops in future editions. Let's build this thing together.