👾 Deep Research: OpenAI's AI Agent Revolution Has Begun

OpenAI's Deep Research revolutionizes AI-driven analysis, paving the way for autonomous knowledge synthesis and the next step toward AGI.

On February 2, 2025, OpenAI presented “Deep Research,” a groundbreaking AI feature that has the potential to fundamentally change the way we collect and analyze complex information. This new capability enables ChatGPT to independently conduct multi-step research on the internet, accomplishing in minutes what would take a human many hours to do.

What is Deep Research?

Deep Research is designed to create extensive and detailed reports by searching, analyzing and synthesizing hundreds of online sources. Powered by a version of the upcoming OpenAI o3 model (!), optimized for web browsing and data analysis, it can search and interpret massive amounts of text, images and PDFs on the internet. This ability makes it possible to collect, evaluate and consolidate information to create comprehensive reports at the level of a research analyst.

The introduction of Deep Research marks a significant advance in the development of AI agents capable of completing tasks independently. While earlier models such as GPT-4o were suited to real-time conversations, Deep Research goes a step further by processing in-depth, domain-specific queries where depth and detail are crucial. This ability to synthesize knowledge is a prerequisite for creating new knowledge and thus represents an important milestone on the road to Artificial General Intelligence (AGI). Masayoshi Son, probably the most important investor in OpenAI's Stargate project, has announced that AGI, which he believes will arrive very soon, will initially be used in Japanese companies (more on this below).

Sam Altman presents Deep Research as the next agent, with more to follow. After we recently received the first computer-using agent (CUA) with “Operator,” Deep Research is, so to speak, the next iteration. On stage in Tokyo, Sam Altman said that the goal is now an AI agent that generates new research results through independent research (as written above, the synthesis of knowledge is a prerequisite for the generation of new knowledge and thus for AGI), in other words enriching research with its own results by generating completely new knowledge. On OpenAI's own AGI scale, that is level 4.

From there, it's only a small step to level 5, the AI agents that can independently run a business. In my opinion, compute scale is all we need here. The foundations are already in place.

Applications and Potential

Deep Research was designed for people who perform in-depth knowledge work in fields such as finance, science, politics, and engineering, and who require thorough, accurate, and reliable research. However, it can also be useful for sophisticated consumers seeking personalized recommendations for purchases that usually require careful research, such as cars, appliances, and furniture. Each output is fully documented, with clear citations and a summary of the thought process, making it easy to reference and verify the information. 

After an initial review, it will probably be difficult for consulting companies such as KPMG, PwC, Coopers and others to maintain their dominance in the future, when deep analyses, reports and research can already be produced with short prompts at costs in the very low dollar range. In addition, Deep Research completes its analyses within a few minutes (OpenAI speaks of 5-30 minutes), so that a company's current strategic alignment can be adjusted practically just-in-time; an advantage that should not be underestimated.

Functionality and Technology

Deep Research was trained with end-to-end reinforcement learning on real-world tasks that require the use of browser and Python tools. Through this training, it learned to create and execute a multi-step plan to find the data it needs, backtracking if necessary and reacting to real-time information. The model can also browse uploaded files, create and iterate over diagrams using the Python tool, embed both generated graphics and images from websites in its responses, and quote specific sentences or passages from its sources. As a result of this training, it achieves new highs in a series of public evaluations focused on real-world problems. 
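To make this agentic loop a little more concrete, here is a minimal, purely illustrative Python sketch of such a plan-execute-backtrack cycle: draft a multi-step plan, run each step with a browser or Python tool, revise a step that yields nothing useful, and finally synthesize a cited report. The stubs `call_model`, `browser_tool`, and `python_tool` are hypothetical placeholders, not OpenAI's actual implementation or API.

```python
# Hypothetical sketch of an agentic research loop of the kind described above:
# plan, call tools (web search / Python), revise a failed step, synthesize a report.
# All functions here are illustrative stubs, NOT OpenAI's Deep Research internals.

from dataclasses import dataclass, field


@dataclass
class Step:
    description: str          # what the agent intends to do in this step
    tool: str                 # "browser" or "python"
    done: bool = False
    result: str = ""


@dataclass
class ResearchState:
    question: str
    plan: list[Step] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)   # findings with their sources


def call_model(prompt: str) -> str:
    """Placeholder for an LLM call that drafts plans, revises steps,
    and writes the final report. Returns canned text so the sketch runs."""
    return f"[model output for: {prompt[:60]}...]"


def browser_tool(query: str) -> str:
    """Placeholder web-search/browse tool."""
    return f"[snippet + URL found for '{query}']"


def python_tool(code: str) -> str:
    """Placeholder code-execution tool (e.g. for tables or charts)."""
    return f"[result of running: {code[:40]}...]"


def run_research(question: str, max_revisions: int = 3) -> str:
    state = ResearchState(question=question)
    # 1. Draft an initial multi-step plan.
    state.plan = [
        Step("Find recent primary sources on the question", tool="browser"),
        Step("Extract key figures and compare them", tool="python"),
        Step("Cross-check claims against a second source", tool="browser"),
    ]

    i = 0
    revisions = 0
    while i < len(state.plan):
        step = state.plan[i]

        # 2. Execute the step with the appropriate tool.
        if step.tool == "browser":
            step.result = browser_tool(f"{question}: {step.description}")
        else:
            step.result = python_tool(f"analyze({state.notes!r})")

        # 3. Backtrack: if the step yields nothing useful, rewrite it and retry in place.
        if not step.result and revisions < max_revisions:
            revisions += 1
            step.description = call_model(f"Revise failed step: {step.description}")
            continue

        step.done = True
        state.notes.append(f"{step.description} -> {step.result}")
        i += 1

    # 4. Synthesize a cited report from the accumulated notes.
    return call_model(
        "Write a report with citations for: " + question
        + "\nNotes:\n" + "\n".join(state.notes)
    )


if __name__ == "__main__":
    print(run_research("How did frontier-model benchmark scores change in early 2025?"))
```

The index-based loop is what gives the sketch its backtracking: a failed step is rewritten and retried in place rather than skipped, which mirrors the plan-and-revise behavior described above.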

The results of this approach speak for themselves. On Friday, OpenAI's best model (o3-mini-high) achieved just under 10% on the very demanding “Humanity's Last Exam” benchmark. However, just a few days later, Deep Research, based on a refined version of the full o3 model, achieved an outstanding 26.6%. We are talking about 10 days between the results.

Despite its impressive capabilities, Deep Research still has limitations. It can sometimes hallucinate facts or draw false conclusions, although internal assessments indicate that this happens less frequently than with existing ChatGPT models. It can have difficulty distinguishing authoritative information from rumors and currently shows weaknesses in confidence calibration, often failing to convey uncertainty accurately. At launch, it may produce minor formatting errors in reports and citations, and tasks may take longer to start. However, OpenAI expects all of these issues to improve quickly with increased usage and time.

Furthermore, OpenAI reports that the cost of running Deep Research is very high, so the rate limits are set comparatively low. Currently, only Pro users have early access; they get 100 uses per month. Plus and Free users are to follow, with Plus users expected to get 10 uses per month, according to Sam Altman.

Conclusion

The introduction of Deep Research by OpenAI represents an important step in the development of autonomous AI agents capable of efficiently and accurately performing complex search tasks. This capability has the potential to fundamentally change the way we collect and analyze information and pave the way for future developments in the field of artificial intelligence.

Even more surprising, however, was the openness with which OpenAI investor Masayoshi Son spoke at the press conference organized for the occasion. He said:

Just a year ago, I thought AGI would arrive in 10 years. A few months after that, I said it would arrive in 2–3 years. But now, I want to correct it by saying it will arrive sooner than that. I would also like to say that AGI will first be announced in Japan.

Masayoshi Son

In addition, a new model was announced that can be described as AGI without exaggeration. Masa Son added that the new model, “Cristal Intelligence,” will work autonomously and will be able to take over and work with all the code a company has built up over the last 30 years. Cristal attends all meetings, replaces call centers, has long-term memory, and will cost about $3 billion to build.

When Sam Altman took the stage, he said that Deep Research can already accomplish “a single-digit percentage of all economically valuable tasks in the world” today. That would be trillions of dollars in value. In conjunction with Masa Son's statement, OpenAI itself seems to be within reach of the updated definition of AGI (a model that generates $100 billion in profit per year).

Deep Research was the big announcement on Sunday—but there’s plenty more on the way. We’ll keep you covered with all the in-depth analysis you need.

—

Get more content from Kim Isenberg—subscribe to FF Daily for free!

Kim Isenberg

Kim studied sociology and law at a university in Germany and has been impressed by technology in general for many years. Since the breakthrough of OpenAI's ChatGPT, Kim has been trying to scientifically examine the influence of artificial intelligence on our society.

Sources: Deep Research & AGI (PDF)
