How Large Language Models (LLMs) Represent Meaning

What is the meaning of... no, not life! We haven’t gone far enough to answer that question. Right now, the more pertinent question, and one that’s easier to answer, is: how does AI, and specifically how do large language models (LLMs), represent meaning?

In my previous article, we touched upon the concept of a bit: a binary unit of information that simply encodes a true/false, yes/no, I do/I don’t kind of message. But, as we know, there is much more to reality than just ‘I do’!

The next step from a simple 0 or 1 expression is, of course, bigger numbers, the basic arithmetic that we’re all taught fairly early in our childhood. This in itself allows us to answer a simple question such as ‘How many apples are there in that bag?’. Or a dandier one such as ‘What is the torque on the third nut on the engine of the Falcon 9 rocket at a given altitude?’, which can be handy if you’re going to be interviewed by Elon Musk!

To help visualize numbers, we can plot them on a number line. Here of course we are enlisting the services of geometry, which is very helpful in developing a better understanding of this and the other concepts we will touch upon here.

For example, looking at points on a number line, we can instantly tell that 15 is closer to 20 than to 5, without having to carry out any calculations in our heads. While this is a fairly silly example, such visualizations become more valuable when the data points in question are much bigger or, as we’ll see, more complex.
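The same comparison can be written as arithmetic, which is what a computer would do in place of glancing at the line. A minimal Python sketch (the helper name `distance_1d` is made up for illustration):

```python
def distance_1d(a, b):
    """Distance between two numbers on the number line."""
    return abs(a - b)

# 15 is 5 units from 20 but 10 units from 5, so it is closer to 20.
print(distance_1d(15, 20))                       # 5
print(distance_1d(15, 5))                        # 10
print(distance_1d(15, 20) < distance_1d(15, 5))  # True
```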

So, yes, let’s dial the complexity up a notch! Numbers laid out on the number line represent a single dimension. What about having 2 dimensions? Harnessing the power of geometry again, we can lay out two perpendicular lines, the classic X and Y axes, to bring up the Cartesian plane.

On this, we can plot a point, say (2,3), where the two numbers (x,y) are its positions along the X and Y axes respectively. This now represents a point on the two-dimensional plane.

Again, we can have another point on the 2D plane, say (4,5) and assess the distance between the two points. And just as we did with the comparative distances between 3 numbers, we can make similar assessments of the comparative distances amongst 3 or more points on the plane, which could be of value in some contexts.

Another important point is that all of this can be translated into algebra, by way of algebraic equations. And by solving these equations, it’s possible to answer, with precision, questions such as the distance between two points and so on. Again, this will come in handy.
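As an illustration of that translation into algebra, here is a small Python sketch that computes the straight-line distance between the two example points (2,3) and (4,5) using the Pythagorean theorem. The function name `distance_2d` is an assumption chosen for this example.

```python
import math

def distance_2d(p, q):
    """Straight-line (Euclidean) distance between two points on the plane."""
    dx = q[0] - p[0]  # difference along the X axis
    dy = q[1] - p[1]  # difference along the Y axis
    return math.sqrt(dx * dx + dy * dy)

a = (2, 3)
b = (4, 5)
print(distance_2d(a, b))  # sqrt(2*2 + 2*2) = sqrt(8), roughly 2.83
```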

It’s now possible to go further and visualize a three-dimensional setup. Analogously, we have 3 axes here, X, Y and Z, and a triple such as (3,2,4) represents a point in 3-dimensional space. Just as before, we can have several points in space and ponder their relative distances from one another.

Again, and just as importantly, it’s possible to translate between geometry (the visual part) and algebra (the equations part), and to go in either direction: given a graph, write down the equation, or given just the equation, plot the graph.

That last point must be emphasized because that is what we are going to do next: start with just the algebraic part. This is simply because we cannot visualize a space of 4 or more dimensions. But it’s possible to express it algebraically and, just as in the previous cases, represent it in the computer.

Why would we need a 4-dimensional space at all? The answer is quite simple. In any of the above setups, the number of dimensions is the number of ‘questions’ we can ask of any given point in that space. Put simply, each question is: how far is the point from 0 along that axis? For example, the point (3,1,6) resides in 3D space, with its distances along the X, Y and Z axes being 3, 1 and 6. So n dimensions are a way of encoding n pieces of information.
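To see that nothing in the arithmetic depends on being able to draw the space, here is a hedged Python sketch of a distance function that works for any number of dimensions. The function name and the 4-dimensional points are invented for illustration; only (3,1,6) comes from the text above.

```python
import math

def distance(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

point_3d = (3, 1, 6)       # x=3, y=1, z=6: three pieces of information
origin_3d = (0, 0, 0)
print(distance(point_3d, origin_3d))  # sqrt(9 + 1 + 36) = sqrt(46)

# A fourth coordinate cannot be drawn, but the calculation is unchanged.
point_4d = (3, 1, 6, 2)    # hypothetical values, chosen for illustration
other_4d = (1, 1, 4, 2)
print(distance(point_4d, other_4d))   # sqrt(4 + 0 + 4 + 0) = sqrt(8)
```

The same function would accept points with 512 coordinates; only the length of the tuples changes.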

So here’s the kicker:

❝

…in real life, the meaning of any particular thing we wish to express, even the simplest idea, object, notion or concept, is so complex that it takes many dimensions to represent it.

So the method used to represent meaning in Artificial Intelligence is conceptually quite simple: build an n-dimensional space in the computer, resting on the same mathematics we considered for 1, 2 and 3 dimensions, just with many more axes. Well, yes, but how many dimensions? Typically hundreds, for example 512, and in larger models thousands. This complex, hard-to-imagine setup is what is called the semantic space (also known as an embedding space).

Think of these dimensions as connotations of a word. For example, the word ‘apple’ has different connotations: it may be the fruit, or the company Apple. The two would sit in the same section of the dictionary, since written words are simple combinations of letters and nothing more (you derive the meaning in your head because you recognize the word when you see that combination!). But apple the fruit and Apple the company reside at somewhat different points of the semantic space, because the two score differently for fruitness, companyness, computerness and so on.
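A toy version of this idea can be sketched in Python. Everything here is an assumption made for illustration: the space has only three hand-labelled dimensions (fruitness, companyness, computerness), and the scores are invented. In a real model the values are learned from data, and the dimensions do not come with tidy labels like these.

```python
import math

# Hypothetical 3-dimensional "semantic space". Order of dimensions:
# (fruitness, companyness, computerness). All values are made up.
vectors = {
    "apple (fruit)":   (0.9, 0.0, 0.0),
    "Apple (company)": (0.1, 0.9, 0.8),
    "banana":          (0.9, 0.0, 0.1),
    "Microsoft":       (0.0, 0.9, 0.9),
}

def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest(word):
    """Return the other word whose point lies closest to `word`."""
    others = (w for w in vectors if w != word)
    return min(others, key=lambda w: distance(vectors[word], vectors[w]))

print(nearest("apple (fruit)"))    # banana
print(nearest("Apple (company)"))  # Microsoft
```

Although the two senses of ‘apple’ are spelled alike, in this toy space each one lands next to a different neighbour.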

This is how meaning is encoded in the context of LLMs!

Remember we discussed relative distances between two points in 1D, 2D and 3D space? Hold on to that, because it’s going to come in handy in the next episode, where we will expand on how all this works in practice and how it is useful.

Continue to the next article in this series here.

About the author

Ash Stuart

Follow Ash on X
