Forward Future Daily
Meta's AI Scraping Challenges, Nvidia's China AI Chip, and Apple’s Delayed AI Features

Meta faces difficulties with AI scraper blocking, Nvidia develops a China-compliant AI chip, and Apple delays its AI-powered features until October. Discover AI advancements and challenges in data privacy, global markets, and tech innovations, shaping the future of artificial intelligence.

Matthew Berman
July 30, 2024

Blocking Ghosts: How Anthropic's AI Scrapers Reveal a Broken System

Websites are Blocking the Wrong AI Scrapers (Because AI Companies Keep Making New Ones)

Many websites attempting to block AI company Anthropic's scrapers are mistakenly blocking outdated bots while leaving the active "CLAUDEBOT" unblocked. This highlights the challenges website owners face in keeping up with constantly evolving AI scraper bots. As AI companies frequently launch new bots with different names, outdated instructions in robots.txt files result in ineffective blocking. The confusion underscores the difficulties in managing AI scrapers, with experts suggesting more aggressive blocking measures. This issue is exacerbated by AI companies sometimes ignoring robots.txt files and the rapid proliferation of new scrapers.
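
A minimal sketch of how a site owner could audit this, using Python's standard-library robots.txt parser. The sample robots.txt contents and the two older user-agent tokens (anthropic-ai and Claude-Web) are assumptions for illustration, not taken from any particular site; the helper name blocked_agents is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that only names older scraper tokens (assumed names).
ROBOTS_TXT = """\
User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /
"""


def blocked_agents(robots_txt, agents, path="/"):
    """Return {agent: True if robots.txt disallows `path` for that agent}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, path) for agent in agents}


# The two named bots are disallowed; a newer bot with no matching rule
# (and no wildcard rule) falls through and is treated as allowed.
status = blocked_agents(ROBOTS_TXT, ["anthropic-ai", "Claude-Web", "ClaudeBot"])
for agent, is_blocked in status.items():
    print(f"{agent}: {'blocked' if is_blocked else 'NOT blocked'}")
```

Because robots.txt rules match on the bot's name, a rule written for an old name does nothing for a new one. Adding a "User-agent: *" group would catch unnamed bots, though it would also affect search crawlers and still relies on the scraper choosing to honor the file.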

AI achieves silver-medal standard solving International Mathematical Olympiad problems - AlphaProof and AlphaGeometry 2, two advanced AI systems, successfully solved four out of six problems in the International Mathematical Olympiad (IMO), achieving a silver medal level. AlphaProof excelled in algebra and number theory, solving three problems, while AlphaGeometry 2 solved one geometry problem. This breakthrough demonstrates significant progress in AI's mathematical reasoning capabilities, highlighting the potential for AI to assist in solving complex mathematical challenges and advancing scientific research. The systems' performance was evaluated by renowned mathematicians, marking a notable achievement in the field of artificial general intelligence.
Runway’s AI video generator trained on thousands of scraped YouTube videos - Runway, an AI startup funded by Google's parent Alphabet and Nvidia, allegedly used a wide range of YouTube videos and pirated content to train its text-to-video AI generator, according to a 404 Media report. The training data purportedly included content from prominent companies like Netflix and Disney, tech reviewers, and news outlets, along with links to piracy sites such as KissCartoon. While Runway touts its Gen-3 Alpha tool's capacity to "create videos in any style," the precise dataset used remains undisclosed. YouTube's CEO has stated that using the platform's videos for AI training is against policy. Runway's approach mirrors a broader industry practice: OpenAI, Anthropic, Apple, Nvidia, and Salesforce have also been reported to have used YouTube content for AI training.
China owns 378,000 AI patents, rising faster than global average - As of the end of 2023, China reported 378,000 artificial intelligence invention patents, a year-on-year increase of 40%. That growth rate outpaces the global average by 40%. The strength of innovation in China's digital economy is further underscored by the fact that its core industries contributed 10% of the nation's GDP. Invention patent approvals within these core industries totaled 406,000, making up 45% of China's total granted invention patents. Over the last five years, patent approvals in these industries grew at an average annual rate of 21%. Additionally, 155,000 domestic companies now hold digital economy-related invention patents, while foreign entities from 93 countries also hold patents within China's digital economy sectors, with digital product manufacturing constituting 61.8% of such patents.
Privacy Watchdog Surprised by Elon Musk Opting User Data into Grok AI Training - X, formerly Twitter, has quietly defaulted user data into its AI training pool for Grok, a conversational AI developed by Elon Musk's xAI. This move has caught the attention of the Irish Data Protection Commission (DPC), which oversees X's compliance with the EU's GDPR. The DPC, surprised by the development, is awaiting a response from X. The new setting allows user posts and interactions to be used for training Grok, raising legal questions about data processing under EU privacy laws. Similar data repurposing plans by Meta were paused in Europe following regulatory scrutiny.
GSK and Flagship Pioneering partner to discover novel medicines and vaccines - GSK and Flagship Pioneering have initiated a partnership to develop a portfolio of new medicines and vaccines, focusing on respiratory and immunology areas. The collaboration combines GSK's disease expertise and Flagship's bioplatform innovation capabilities. A commitment of up to $150 million will fund the initial exploratory phase to identify promising scientific concepts. In further development, Flagship's bioplatform companies could receive up to $720 million in milestones, plus royalties from GSK. The goal is to accelerate the discovery of up to 10 new treatment options, each with an exclusive GSK option for clinical development.
Apple reportedly delays the first Apple Intelligence features until October - Apple plans to enhance the iPhone, iPad, and Mac with the Apple Intelligence AI features it announced at WWDC in June, including integration with OpenAI's ChatGPT and improvements to Siri. However, these AI features will not be part of the initial September release of new OS versions. Bloomberg's Mark Gurman reports that users should expect them in a later software update around October, with iOS 18.1 and iPadOS 18.1. Developer betas for these updates are being prepared in advance, a deviation from Apple's usual release strategy that reflects the importance of thorough testing amid concerns about feature reliability. Even so, some Siri-related AI upgrades may not appear until later in 2024 or in 2025. Short-term Siri enhancements will include a new interface and improved conversation abilities, but more complex AI tasks are still in the pipeline. The delayed rollout also means the iPhone 16 will launch without these new AI capabilities, putting pressure on Apple to ensure the features are polished and robust when they do arrive.
IBM Reports Boost in AI Bookings, Better-Than-Expected Revenue - IBM reported a significant increase in AI bookings, doubling to $2 billion since mid-2023, driven primarily by AI consulting services. The company saw a 2% rise in second-quarter sales to $15.8 billion, with its software unit revenue growing 7% to $6.7 billion, surpassing analyst expectations. Despite a 1% decline in consulting revenue, IBM's shares rose 4% in extended trading, reflecting investor confidence.
Alphabet revenue jump shows no sign of AI denting search business - Alphabet’s revenue increased by 14% in the second quarter, with advertising revenue growing 11% to $64.6 billion, indicating that AI chatbots have not impacted Google's search engine queries. The company's cloud computing sector saw a 29% rise, reflecting high demand for data services as companies invest in large language models. Alphabet’s net income surged 28% to $23.6 billion, surpassing expectations, and its stock has risen by nearly a third this year. CEO Sundar Pichai highlighted the ongoing strength in Search and momentum in Cloud, emphasizing their innovation in AI technology.
The Silicon Valley Startup Using AI to Scour the Earth for Copper and Lithium - KoBold Metals, a Silicon Valley startup, is leveraging artificial intelligence to revolutionize mineral exploration, particularly for copper, lithium, nickel, and cobalt, essential for the energy transition. Despite skepticism from some in the mining industry, KoBold's data-driven methods have attracted significant investment, achieving a valuation of over $1 billion in 2023. The company has more than 60 projects globally and recently discovered a potentially vast copper deposit in Zambia, emphasizing its role at the intersection of AI and green energy. As a partner for the U.S. in assessing mineral projects in Africa, KoBold aims to expedite the discovery of key materials for the energy transition.
Apple to Adopt Voluntary AI Safeguards Established by Biden - Apple Inc. has agreed to adopt voluntary AI safeguards established by President Joe Biden's administration, joining other tech giants like OpenAI, Amazon, Alphabet, Meta, and Microsoft in a commitment to test AI systems for discriminatory tendencies, security flaws, and national security risks. This move coincides with Apple's plan to integrate OpenAI’s ChatGPT into its iPhone voice-command assistant. While these guidelines are not enforceable, they aim to promote transparency and consumer protection in AI development. This collaboration reflects ongoing efforts to ensure AI technologies are responsibly developed and safely implemented.

Awesome Research Papers

PersonaGym: Evaluating Persona Agents and LLMs - Persona agents are specialized AI systems designed to act according to designated personas, showing promising utility in multiple sectors by tailoring responses to user needs. The complexity of evaluating these agents stems from the challenge of measuring persona adherence in free-form dialogue. To address this, the introduction of PersonaGym offers a dynamic framework for assessment, complemented by PersonaScore, a decision theory-based metric for persona agent evaluation. A benchmark test across 6 LLMs with 200 personas revealed limited improvements by advanced models like Claude 3.5 Sonnet over GPT 3.5, suggesting that bigger models don't necessarily equate to better persona emulation, thus emphasizing the need for innovative algorithms and architectures in this domain.
LAMBDA: A Large Model Based Data Agent - LAMBDA is an innovative open-source, code-free multi-agent data analysis system that leverages large models to tackle complex data-driven challenges. It introduces a unique collaborative approach between two agent roles: a programmer that generates code from natural language instructions, and an inspector that debugs this code. The system is designed for robustness, with a user interface for direct human intervention and a knowledge integration mechanism for customizing analysis with external models and algorithms. Demonstrating strong performance across various datasets, LAMBDA aims to make data science more accessible and seamless, integrating human and AI capabilities effectively.

NVIDIA Announces Generative AI Models and NIM Microservices for OpenUSD Language, Geometry, Physics and Materials - NVIDIA has announced significant enhancements to Universal Scene Description (OpenUSD), expanding its application to robotics, industrial design, and engineering, thereby assisting in the creation of digital twins and AI-driven robotics. OpenUSD-based AI capabilities and NVIDIA's development frameworks on the Omniverse platform will help various industries in designing and simulating virtual environments. NVIDIA introduced NVIDIA NIM microservices for generating OpenUSD assets and code, simplifying digital twin development, including new USD connectors for diverse data formats and allowing for the streaming of large datasets to Apple Vision Pro.

Gemini’s big upgrade: Faster responses with 1.5 Flash, expanded access and more - Google announces the expansion of Gemini within the Google Messages app to additional regions including the European Economic Area, UK, and Switzerland, complete with support for new languages such as French, Polish, and Spanish. This feature enables users to directly interact with Gemini for various tasks without leaving the Messages app. Additionally, the Gemini mobile app is extending its availability to broader international markets. Upcoming developments include broadening Gemini access to teens globally in over 40 languages, focusing on educational and creative assistance, with the implementation of safeguards and an AI literacy guide to ensure safe and responsible use. Google also emphasizes a responsible and transparent approach to Gemini's development, particularly in handling sensitive content, guided by their AI Principles.

Zhipu AI launches video model in a sign more Chinese tech firms are taking on OpenAI's Sora - Zhipu AI, a Chinese AI start-up, has introduced Ying, a text-to-video model capable of generating 6-second videos in about 30 seconds based on text and image prompts. This model has customizable style and emotional themes. It integrates with Zhipu AI's ChatGLM chatbot, offering unlimited access with potential delays during peak times. The launch follows Kuaishou's introduction of Kling, a video model with limited daily output and paid plans. These developments indicate China's growing competitiveness in AI video against leading firms like OpenAI, which has developed but not yet released a similar AI called Sora.

Airtable’s New AI Tool Can Generate Apps From Just A Prompt - Airtable CEO Howie Liu has introduced Cobuilder, an AI tool that streamlines the app development process and is heralded as the most significant leap forward for the company in five years. Cobuilder uses advanced AI, particularly OpenAI's GPT-4, to convert simple text prompts into functional apps, purportedly reducing a process that could take hours to a mere 30 seconds. The tool is the next step in Airtable's AI integration roadmap, making app creation more accessible to those without extensive technical skills. While it currently depends on users connecting their data after the app is created, future iterations will incorporate direct data integration during the generative process.

ConsiStory: Training-Free Consistent Text-to-Image Generation | NVIDIA Research - ConsiStory, developed by NVIDIA Research and Tel Aviv University, revolutionizes image generation by producing consistent subjects across multiple images without the need for retraining. Utilizing a modified attention mechanism, it achieves rapid consistency in seconds compared to traditional methods. By leveraging cross-attention and self-attention techniques, ConsiStory ensures subject consistency and introduces an attention dropout mechanism for diverse poses. Additionally, it offers enhanced control and reusability, allowing users to save characters for future use and adjust elements like human poses for more controlled imagery.

Runway Gen-3 Alpha Image to Video Update Released - Gen-3 Alpha Image to Video update introduces a significant enhancement, allowing users to use any image as the first frame for video generation. This feature can be utilized alone or in conjunction with a text prompt for added guidance. The update aims to improve artistic control and consistency in video creations, providing users with greater flexibility and precision in their projects.

Gradio: Introducing Cinemo Image to Video - Gradio has unveiled Cinemo, a new tool for motion-controllable image animation, boasting strong consistency and smoothness. Cinemo offers simplified and precise user control, utilizing a distribution of motion residuals for motion smoothness rather than directly generating subsequent frames. It employs a structural similarity index-based method to manage motion intensity, ensuring temporal consistency throughout the animation.
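
For readers unfamiliar with the structural similarity index (SSIM) mentioned above, here is a minimal sketch of the standard formula computed over a whole frame at once. It is not Cinemo's implementation, which is not described here; the function name global_ssim, the flat-list frame representation, and the sample pixel values are all assumptions for illustration. Production SSIM is usually computed over sliding local windows rather than globally.

```python
def global_ssim(frame_a, frame_b, dynamic_range=255.0):
    """Single-window SSIM between two equal-length lists of pixel intensities."""
    if len(frame_a) != len(frame_b) or not frame_a:
        raise ValueError("frames must be non-empty and the same size")
    n = len(frame_a)

    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    mu_a = sum(frame_a) / n
    mu_b = sum(frame_b) / n
    var_a = sum((a - mu_a) ** 2 for a in frame_a) / n
    var_b = sum((b - mu_b) ** 2 for b in frame_b) / n
    cov = sum((a - mu_a) * (b - mu_b) for a, b in zip(frame_a, frame_b)) / n

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Hypothetical 4-pixel grayscale frames: a still frame and a slightly changed one.
still = [10, 50, 90, 130]
moved = [12, 55, 80, 140]
print(global_ssim(still, still))  # identical frames score (approximately) 1
print(global_ssim(still, moved))  # any difference lowers the score
```

A score near 1 means two frames are nearly identical, so a lower score between a source image and a later frame can serve as a rough proxy for how much motion has been introduced.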

NIST Releases a Tool for Testing AI Model Risk - The National Institute of Standards and Technology (NIST) has re-released Dioptra, an open-source tool designed to assess AI model vulnerabilities to malicious attacks, particularly those that poison training data. Dioptra allows companies and users to benchmark models and simulate threats to evaluate AI risks and performance. This tool aligns with President Joe Biden's executive order on AI, which emphasizes AI safety and security standards. Despite its utility, Dioptra is currently limited to locally downloadable models and does not support models gated behind APIs like OpenAI’s GPT-4.

Scale Announces SEAL Leaderboard on Adversarial Robustness - Scale has introduced its latest SEAL Leaderboard focused on adversarial robustness, featuring red team-generated prompts and evaluations centered on universal harm scenarios. The SEAL evaluations, which are private and periodically refreshed by experts, ensure transparency in evaluation methods. The leaderboard addresses a broad range of harm scenarios, including hacking, self-harm, child safety, illegal activities, sexual content, and graphic violence, classifying them into high and low harm categories. This initiative aims to enhance the robustness and safety of AI systems against various adversarial threats.
