Agents Go Mainstream
AutoGen, Custom GPTs, and the promise of AI Agents
OpenAI Launches Agents
OpenAI recently held its DevDay event, launching many new developer-focused products. I made a video reviewing all the news from that event, which you can find here. The most significant announcement from the event was OpenAI’s very own agent framework, “GPTs.”
GPTs are essentially just agents, but for some reason, OpenAI decided to name them in a very confusing way. GPT is already an established term in the world of AI, so now, when someone talks about a GPT, it won’t be clear whether they mean the underlying technology or OpenAI’s agents, which are also called GPTs. But I digress.
With the launch of GPTs, OpenAI also launched what could become their money printer, a GPT “app store.” As a GPT creator, you can now list your GPT on their marketplace and earn money from people who want to use that GPT. I see a marketplace for GPTs being extremely valuable in the long run. Still, based on the available functionality of GPTs today, I don’t believe many people will be willing to pay for off-the-shelf GPTs.
GPTs are extremely easy to make. As opposed to apps in Apple’s app store, which require significant development effort to build and launch, GPTs are “no code” and don’t have a lot of functionality right now. If you see a paid GPT you like, it wouldn’t be too difficult to replicate the functionality of that GPT. Why would someone pay for a GPT that they could easily create? I’m confident this will change over time as GPTs and agents become more sophisticated.
What is an Agent?
Most people, including top-tier tech startup founders, still don’t fully understand what an AI agent is (sorry, Suhail).
Less than three weeks after these posts on X, Bill Gates wrote a blog post on Gates Notes about how agents will change the world, and OpenAI launched its own Agents. So, let’s talk about what agents are.
My definition of an AI agent is a modular artificial intelligence bot with a set of available tools programmed to accomplish a specific task. An example would be a creative writing agent. This agent could be designed to write with a particular voice and given knowledge about a subject via retrieval-augmented generation (RAG). Agents can become much more complex than this, but it’s a simple example.
You may be asking yourself: that sure sounds a lot like a standard LLM, right? In short, yes. However, when you combine access to multiple tools, a pre-defined task, instructions on how you want the AI to achieve that task, the ability for the agent to make decisions on how to behave, and other instructions and guardrails, that’s when it becomes an agent. LLMs are blank slates, and agents are pre-configured to complete specific tasks.
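To make that definition concrete, here is a minimal Python sketch of a pre-configured agent. Everything in it is hypothetical and for illustration only: `fake_llm` stands in for a real model call, `lookup_notes` stands in for a RAG-style retrieval tool, and the keyword check is a toy version of the agent deciding how to behave.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; it only echoes the prompt."""
    return f"[model output for: {prompt}]"


def lookup_notes(query: str) -> str:
    """Hypothetical retrieval tool; a real agent might query a vector store."""
    return "NOTES: background material related to the request"


@dataclass
class Agent:
    name: str
    task: str          # the pre-defined job this agent exists to do
    instructions: str  # voice, style, and guardrails
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, request: str) -> str:
        # Toy decision step: call the lookup tool only when facts are requested.
        context = ""
        if "lookup" in self.tools and "fact" in request.lower():
            context = self.tools["lookup"](request)
        prompt = (
            f"{self.instructions} | Task: {self.task} | "
            f"Context: {context} | Request: {request}"
        )
        return fake_llm(prompt)


writer = Agent(
    name="creative_writer",
    task="Write short pieces in a warm, playful voice",
    instructions="Stay family-friendly and keep it under 100 words",
    tools={"lookup": lookup_notes},
)

with_tool = writer.run("Share a fun fact about otters")
without_tool = writer.run("Write a limerick about otters")
```

The same underlying model serves both requests. What makes this an agent, in the sense used above, is the bundled task, instructions, and tools, plus the small decision about when to use a tool.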
Why Agents Are So Powerful
AI agents become especially powerful when paired with other AI agents because they can check each other’s work rather than only producing new work. They can autonomously work as a team to accomplish tasks they otherwise couldn’t do individually. This, combined with their ability to use tools and execute code, makes them able to accomplish much more than just “completing” text.
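Here is a rough sketch of that "check each other's work" loop in plain Python. It is not AutoGen's API or any real framework: `writer` and `reviewer` are hypothetical hard-coded functions standing in for two LLM-backed agents, so the control flow is easy to follow.

```python
from typing import Tuple


def writer(task: str, feedback: str) -> str:
    """Hypothetical writer agent: revises its draft once it receives feedback."""
    if feedback:
        return f"{task}: final answer with sources"
    return f"{task}: rough answer"


def reviewer(draft: str) -> Tuple[bool, str]:
    """Hypothetical reviewer agent: approves only drafts that cite sources."""
    if "sources" in draft:
        return True, ""
    return False, "add sources"


def collaborate(task: str, max_rounds: int = 3) -> Tuple[str, int]:
    """Alternate writer and reviewer until approval or the round limit."""
    feedback = ""
    draft = ""
    for round_number in range(1, max_rounds + 1):
        draft = writer(task, feedback)
        approved, feedback = reviewer(draft)
        if approved:
            return draft, round_number
    return draft, max_rounds


result, rounds = collaborate("Summarize DevDay")
```

The round limit matters in practice: without it, two agents that never agree would loop forever.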
AutoGen is currently the most sophisticated AI agent framework. It comes with caching, multi-agent definitions, individual models per agent, plenty of configuration options, and enhanced inference. I’ve made multiple videos about AutoGen, so be sure to check them out.
The Future of Agents
So, where are agents heading? With OpenAI’s full-throated support for Agents, the future of AI is seemingly agent-focused. First, I believe there needs to be a model that is fine-tuned to work well with agent frameworks like AutoGen. Sure, we could use GPT-4, but that’s expensive and slow. I’d rather use an open-source model, but we need to fine-tune one to work incredibly well with the AutoGen “way of doing things.” This requires collecting a large amount of training data based on API calls from AutoGen using existing models.
Next, Agents will continue to gain additional tooling. Whether in the form of libraries they can use, function calling, or multi-modal support, the more tools agents can use, the better they will be at accomplishing a wide variety of tasks.
They will also benefit from more fine-tuned, verticalized models built to be great at specific tasks. Generalized models, especially within the context of open-source models that can be run locally, are not going to be as efficient or accurate as vertical models. There should be fine-tuned models for every type of task, such as fact knowledge, coding, creative writing, logic, etc. These models will be smaller and better at specific tasks so that an orchestration model can choose which model to use for each job to be accomplished.
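The routing idea could look something like the sketch below. To keep it runnable, simple keyword matching stands in for the orchestration model, and the specialist "models" are hypothetical placeholder functions. A real orchestrator would likely use an LLM or a classifier to pick the task type.

```python
from typing import Callable, Dict, Tuple

# Hypothetical specialist models; each would be a small fine-tuned model in practice.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "coding": lambda prompt: f"[coding model] {prompt}",
    "creative_writing": lambda prompt: f"[creative writing model] {prompt}",
    "logic": lambda prompt: f"[logic model] {prompt}",
    "general": lambda prompt: f"[general model] {prompt}",
}

# Toy routing rules standing in for the orchestration model's judgment.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "coding": ("function", "bug", "code"),
    "creative_writing": ("story", "poem"),
    "logic": ("prove", "puzzle"),
}


def route(prompt: str) -> str:
    """Pick a task type for the prompt, falling back to the general model."""
    lowered = prompt.lower()
    for task_type, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return task_type
    return "general"


def answer(prompt: str) -> str:
    """Send the prompt to whichever specialist the router selects."""
    return SPECIALISTS[route(prompt)](prompt)


coding_reply = answer("Fix this bug in my function")
poem_reply = answer("Write a poem about autumn")
fallback_reply = answer("What is the capital of France?")
```

The fallback to a general model is a design choice worth keeping: a router that guesses wrong should degrade gracefully rather than fail.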
Authentication is the last piece of the puzzle. Right now, there is no straightforward way to give Agents the ability to browse websites and apps behind authentication walls. Some websites allow it, but most don’t because they don’t want their websites scraped by bots, and an agent sure looks like a bot. Websites can now also use robots.txt files to prevent ChatGPT from browsing, which limits agents’ abilities even further. I believe a new class of authentication, different from that of a human user, is going to be needed. Let’s say I have a Salesforce account and I want my AI Agent to be able to accomplish tasks within Salesforce. I might not want it to do so under my username. That requires creating a separate user for my Agent, but I probably want to give it special access because it is an Agent.
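To see how a robots.txt rule shuts an agent out, here is a small check using Python’s standard `urllib.robotparser`. The robots.txt content and the example.com URL are made up for illustration. `GPTBot` is, to my knowledge, the user-agent token OpenAI publishes for its crawler, but treat the exact token as an assumption and check OpenAI’s documentation for the one that applies to your use case.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: block one named bot, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() takes the file's lines directly, so no network request is needed.
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/pricing"
gptbot_allowed = parser.can_fetch("GPTBot", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)

print(f"GPTBot allowed: {gptbot_allowed}")
print(f"SomeOtherBot allowed: {other_allowed}")
```

Note that robots.txt is a request, not an enforcement mechanism. It only works when the agent chooses to honor it, which is part of why a dedicated form of agent authentication seems necessary.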
Conclusion
Agents represent the most exciting area of AI development to me. Agents working together can accomplish tasks that a single AI can only dream of. With AutoGen, OpenAI GPTs, and many open-source agent projects, the momentum behind agents is only growing. I’ll keep bringing you the latest and greatest agent projects out there!