Forward Future News Episode 6

All About AI Memory. Plus: HuggingChat, Pinecone Funding, and more news!

AI Memory


I’ve been thinking a lot about AI memory lately. Memory is a big problem because it limits the capabilities of large language models. Right now, once an LLM is trained, it essentially can’t learn anything new; it only knows what can be fit into the prompt. For most models, that’s quite limiting.

For the best LLM on the market, GPT-4, the context limit is 8,000 tokens (32,000 for alpha API users). And remember, those tokens include both the prompt and the response. As a reminder, a token is about 3/4 of an English word, so 8,000 tokens is roughly 6,000 words.
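
If you want an exact count rather than the 3/4-word rule of thumb, you can tokenize text yourself with OpenAI’s tiktoken library. Here’s a quick sketch (the sample text and the 1,000-token response budget are just illustrative):

```python
# pip install tiktoken
import tiktoken

# Load the tokenizer that GPT-4 uses.
enc = tiktoken.encoding_for_model("gpt-4")

text = "AI memory limits what large language models can do."
tokens = enc.encode(text)
print(f"{len(text.split())} words -> {len(tokens)} tokens")

# Will a prompt fit in the 8,000-token window if we reserve
# room for the model's response? (Budget is illustrative.)
CONTEXT_LIMIT = 8_000
RESPONSE_BUDGET = 1_000
print("Prompt fits:", len(tokens) <= CONTEXT_LIMIT - RESPONSE_BUDGET)
```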

What if you need more than that? What if you need to pass a large PDF into an LLM? What about a book? The current method is to store the document in a vector database such as Pinecone, query that database for the pieces most relevant to your question, and then pass only those pieces into the LLM. This is a fine solution, but far from perfect.
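
To make that flow concrete, here’s a minimal sketch using the OpenAI and Pinecone Python clients. It assumes you’ve already split the document into chunks, embedded them, and upserted them into a Pinecone index; the index name ("documents"), the metadata field ("text"), and the question are my illustrative choices, not an official recipe:

```python
# pip install openai pinecone-client
import openai
import pinecone

openai.api_key = "YOUR_OPENAI_KEY"          # placeholder
pinecone.init(api_key="YOUR_PINECONE_KEY",  # placeholder
              environment="us-east1-gcp")

# Hypothetical index already populated with embedded document chunks.
index = pinecone.Index("documents")

def embed(text: str) -> list[float]:
    """Embed text with OpenAI's ada-002 embedding model."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return resp["data"][0]["embedding"]

def answer(question: str) -> str:
    # 1. Retrieve the chunks most relevant to the question.
    results = index.query(vector=embed(question), top_k=3, include_metadata=True)
    context = "\n\n".join(m["metadata"]["text"] for m in results["matches"])

    # 2. Pass only those chunks, plus the question, to the LLM.
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp["choices"][0]["message"]["content"]

print(answer("What does chapter 3 say about memory?"))
```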

To achieve AGI, we need to give AI effectively unlimited memory. Another solution I’ve been exploring is memory compression: store all memories in a vector database, then use AI to summarize (compress) them. This lets us build high-level insights about the context and use those insights to inform ChatGPT about the entire history rather than just the most relevant parts (a rough sketch follows below). After reading Generative Agents: Interactive Simulacra of Human Behavior, I started recreating the architecture laid out in the paper. Here’s the video I made about the autonomous agents paper:
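
Here’s a minimal sketch of that compression loop. In a real build the memories would live in a vector database like Pinecone; plain Python lists and a hand-rolled summarization prompt keep the sketch short, and the batch size and prompt wording are my own assumptions, not the paper’s:

```python
import openai

memories: list[str] = []   # raw observations, oldest first
insights: list[str] = []   # compressed, high-level summaries

def remember(observation: str, compress_every: int = 20) -> None:
    """Store a raw memory; periodically compress old memories into insights."""
    memories.append(observation)
    if len(memories) >= compress_every:
        batch = "\n".join(memories)
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{
                "role": "user",
                "content": "Summarize these observations into 3 high-level "
                           f"insights about the situation:\n{batch}",
            }],
        )
        insights.append(resp["choices"][0]["message"]["content"])
        memories.clear()  # raw detail is now "compressed" into insights

def context_for_prompt() -> str:
    """Whole-history context that fits the window: insights + recent raw memories."""
    return "\n".join(insights + memories)
```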

Another potential solution was published in a research paper this week: Scaling Transformer to 1M tokens and beyond with RMT. In the paper, the authors claim to handle inputs of over 2 million tokens, which is insane. That means you could fit the entire Harry Potter series (roughly 1 million words) into a single prompt with plenty of room to spare. However, there have been some rebuttals to this paper’s approach.
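
For intuition, the core trick in RMT (the Recurrent Memory Transformer) is segment-level recurrence: the model reads the input in fixed-size chunks and carries a small set of memory tokens from chunk to chunk, so each forward pass stays small while information flows across millions of tokens. Here’s a rough illustration in plain Python; the real model learns its memory tokens end-to-end, so the toy "model" below is purely a stand-in:

```python
def process_long_text(tokens, model, segment_len=512, memory_size=16):
    """RMT-style segment recurrence: read the input in fixed-size chunks,
    carrying a small 'memory' forward between chunks."""
    memory = []  # stands in for the paper's trained memory tokens
    for start in range(0, len(tokens), segment_len):
        segment = tokens[start:start + segment_len]
        # Each pass sees only memory + one segment, so the context window
        # stays small no matter how long the full input is; information
        # propagates across segments through the carried memory.
        memory = model(memory + segment)[:memory_size]
    return memory

# Toy stand-in "model" that just keeps the last 16 tokens it saw.
toy_model = lambda seq: seq[-16:]
print(process_long_text(["tok"] * 2_000_000, toy_model))
```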

According to Itamar Golan, head of AI at Orca Security, there are significant drawbacks, including decreased output quality and very long inference times.

I’m incredibly excited about the AI memory problem and will work on solutions. Check out my video all about AI Memory:

AI News

Here are some quick updates in the world of AI that you should know about:

  1. Hugging Face released HuggingChat, its open-source competitor to ChatGPT. Turns out it’s really good! I made a video comparing HuggingChat to ChatGPT 3.5 and ChatGPT 4. HuggingChat is powered by OpenAssistant’s LLaMA 30B model.

  2. OpenAI launched an “incognito mode” for ChatGPT, which lets you disable chat history and prevents your conversations from being used to train models. OpenAI also announced the upcoming release of ChatGPT Business, although they were short on details about it.

  3. Several AI-generated songs have gone viral, featuring the voices of Drake, The Weeknd, and other pop artists. The music industry’s response? Ask the streaming services to take the songs down. This is reminiscent of the industry’s reaction when MP3s came out: rather than embrace new technology and figure out how to monetize it, they fight it. However, one artist is thinking about it correctly: Grimes announced a 50% royalty split with anyone who makes a successful AI-generated song using her voice. She later explained that she would use blockchain tech to keep track of the contracts and payments.

  4. For our last news story, vector database provider Pinecone has raised an enormous Series B round of funding as its growth continues. Just a few months ago, not many people knew what Pinecone was. Now, they are the premier place to supplement AI with memory. Congrats to their team!

Thanks for reading!
