Scale Is All You Need? Part 3-3

Note: If you haven’t seen Part 3-2, you can read it here.

Social injustice

After analyzing the increasing demand for computing and energy, it becomes clear that the development towards AGI is not only a technological challenge, but also an infrastructural one. Enormous financial, technological and energy resources are required to operate these massive data centers. However, these requirements are not equally available in all regions of the world.

The question of who can afford the necessary infrastructure – such as large data centers and the corresponding power consumption – leads us to another critical point on the road to AGI: social and global inequality. While wealthy nations and large tech companies are able to invest immense sums in building and operating data centers, poorer countries often face insurmountable hurdles. Access to the resources needed for AI applications is increasingly becoming a question of global competition and the distribution of power.

The production of electricity is another crucial element that has not only technological but also geopolitical implications. Countries that have abundant energy sources or are able to generate enormous amounts of energy have a decisive advantage. For other countries that either do not have sufficient access to such resources or whose infrastructure is not designed to meet increasing demand, this can lead to a growing dependency on the technological and energy superpowers.

This raises key questions: Who will control AGI in the future? Which countries and companies will be able to benefit from this technology and which will be left behind? The path to AGI could further widen the existing gap between rich and poor nations and lead to a new form of global dependency. We must also ask ourselves whether access to this technology will be evenly distributed or whether AGI will become a privilege for a few actors.

In the following, I will examine these questions of social inequality and global dependencies. How will the unequal distribution of resources and infrastructures affect the availability of AGI? Which countries can afford data centers, and which will become dependent on them? The challenges of electricity production and distribution will also be brought into focus. One thing is clear: on the road to AGI, not only technological scaling but also the social and geopolitical dimension will play a crucial role.

The chapter concludes with a proposed solution and an orientation on what we should do about the three bottlenecks discussed (compute, electricity, social inequality).

Social injustice

In terms of absolute GDP, the USA is the richest country in the world. I don't want to judge the distribution of wealth and I leave the discussion about corporate tax to the experts. Nevertheless, it is clear that, in terms of market capitalization, the world's largest companies are mostly based in the USA. And all companies pay taxes there.

Therefore, it can be assumed that both corporations and nations have reserves that they can invest in AI infrastructure. Most recently, the Stargate project between Microsoft and OpenAI made headlines when it was announced that they would invest around $100 billion in a huge data center. 

“Microsoft and OpenAI have been discussing a project called “Stargate” that would see Microsoft spend $100 billion to build a massive supercomputing cluster to support OpenAI’s future advanced AI models, The Information reported Friday.” 

Meanwhile, investments continue unabated – and at times even have ecological consequences.

“Microsoft, which has invested billions in OpenAI, has spent more than $10 billion in past quarters on cloud-computing capacity, and is planning to double its data center capacity. In Goodyear, Arizona, which faces a water shortage, Microsoft’s data centers are expected to consume more than 50 million gallons of drinking water every year.”

Taken together, the two examples illustrate the huge sums being invested in the artificial intelligence sector. In comparison, the EU's plans look almost tiny: it intends to invest around 1 billion euros in AI annually.

And so it is not surprising that even European AI companies like Mistral feel compelled to access the US infrastructure and train their upcoming models on Microsoft's Azure Cloud instead of a domestic one. 

“This partnership with Microsoft enables Mistral AI with access to Azure’s cutting-edge AI infrastructure, to accelerate the development and deployment of their next generation large language models (LLMs) and represents an opportunity for Mistral AI to unlock new commercial opportunities, expand to global markets, and foster ongoing research collaboration.”

China, in turn, sits between the US and the EU in terms of investment. In 2023, Tencent, Baidu and Alibaba invested the equivalent of around 6.2 billion dollars in AI infrastructure. The Chinese government has set itself the goal of investing 1.4 trillion over the next six years. Whether this will actually happen remains to be seen, but it illustrates the scale in monetary terms. In addition, China continues to face significant embargoes that prohibit chip producers and suppliers from exporting their AI chips (GPUs, TPUs) to China. So neither NVIDIA's H100 nor comparable chips from competitors, nor chip-manufacturing machines from ASML (the leading producer of extreme ultraviolet lithography machines for processing wafers), nor products from TSMC (the global leader in contract manufacturing) can be shipped to China. China is effectively cut off from the market for advanced chips, although it still appears to obtain chips whose origin is unknown. As described in Chris Miller's historical-analytical book “Chip War”, it is almost impossible to build up domestic chip production that could keep up with TSMC's. It would take around 30 years to design and build the corresponding production facilities yourself.

In short, the US is leading the AI industry, and the competition is not even close to matching it in the quality and quantity of chips, models and infrastructure. So far, however, we have only been talking about rich countries. Africa as a continent and densely populated nations such as Bangladesh have not been included in the calculation: there is neither an AI infrastructure nor even a plan to develop one. Or to put it another way: the imbalance of power between societies is becoming increasingly apparent in the “age of intelligence”.

Problem solving

Algorithms

In order to optimize power requirements, it has recently been found that algorithms are more important than previously thought.

“A 2023 report by Google and Boston Consulting Group notes that AI model design is an evolving field, and new releases and versions consistently demonstrate improved energy efficiency while maintaining performance. Improvements in software and algorithmic optimization are likely to significantly enhance efficiency and decrease computational requirements, the report says. For example, 18 months after the release of GPT-3, the AI model used by ChatGPT, Google produced an LLM nearly seven times as large. That model, GLaM, outperformed GPT-3 and required one-third the energy to train, according to the ITIF report.”

In addition, according to research conducted at MIT, capping the amount of power a GPU is allowed to draw reduces energy consumption by 12% to 15%, depending on the model. In return, tasks take only about 3% longer to complete, so it's a good trade-off.

“While most people seek out GPUs because of their computational power, manufacturers offer ways to limit the amount of power a GPU is allowed to draw. "We studied the effects of capping power and found that we could reduce energy consumption by about 12% to 15%, depending on the model," Siddharth Samsi, a researcher within the LLSC, says.

The trade-off for capping power is increasing task time — GPUs will take about 3% longer to complete a task, an increase Gadepally says is "barely noticeable" considering that models are often trained over days or even months. In one of their experiments in which they trained the popular BERT language model, limiting GPU power to 150 watts saw a two-hour increase in training time (from 80 to 82 hours) but saved the equivalent of a U.S. household’s week of energy.”
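The figures in the quote can be checked with a few lines of arithmetic. The following is only a rough sketch under one simplifying assumption of mine, which is not stated in the MIT article: energy equals average power draw multiplied by task time. The function names are hypothetical and chosen for illustration.

```python
def relative_increase(before: float, after: float) -> float:
    """Return the relative change from `before` to `after` (0.1 means +10%)."""
    return (after - before) / before


def implied_power_reduction(energy_saving: float, time_increase: float) -> float:
    """Estimate how much the average power draw must fall.

    Assumption (mine, not the article's): energy = average power * time, so
    (1 - energy_saving) = (1 - power_reduction) * (1 + time_increase).
    """
    return 1.0 - (1.0 - energy_saving) / (1.0 + time_increase)


# BERT example from the quote: training time rises from 80 to 82 hours.
bert_slowdown = relative_increase(80, 82)

# Quoted range: 12% to 15% less energy at roughly 3% more task time.
low = implied_power_reduction(0.12, 0.03)
high = implied_power_reduction(0.15, 0.03)

print(f"BERT slowdown: {bert_slowdown:.1%}")
print(f"Implied drop in average power draw: {low:.1%} to {high:.1%}")
```

Under this assumption, the two extra hours of BERT training correspond to a slowdown of 2.5%, in line with the roughly 3% quoted, and the average power draw would have to fall by roughly 15% to 17% to produce the reported energy savings.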

Router

RouteLLM from lmsys showed how it's done: insert a router between the question and the models to select the most suitable model for each question. This achieved outstanding results: cost reductions of over 85% on MT Bench compared to using only GPT-4, while maintaining 95% of GPT-4's performance. In my opinion, lmsys's research has received too little attention to date.

“To tackle this, we present RouteLLM, a principled framework for LLM routing based on preference data. We formalize the problem of LLM routing and explore augmentation techniques to improve router performance. We trained four different routers using public data from Chatbot Arena and demonstrate that they can significantly reduce costs without compromising quality, with cost reductions of over 85% on MT Bench, 45% on MMLU, and 35% on GSM8K as compared to using only GPT-4, while still achieving 95% of GPT-4’s performance.”
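To make the routing idea more tangible, here is a deliberately simplified sketch. It is not RouteLLM's API or method: RouteLLM trains its routers on preference data, whereas the keyword heuristic `toy_difficulty`, the model names and the relative costs below are hypothetical stand-ins invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_query: float  # hypothetical relative cost units


def toy_difficulty(prompt: str) -> float:
    """Hypothetical stand-in for a learned router score in [0, 1]."""
    hard_markers = ("prove", "derive", "step by step", "debug")
    text = prompt.lower()
    hits = sum(marker in text for marker in hard_markers)
    length_score = min(len(prompt.split()) / 200, 1.0)
    return min(1.0, 0.3 * hits + length_score)


def route(prompt: str, strong: Model, weak: Model, threshold: float = 0.5,
          score: Callable[[str], float] = toy_difficulty) -> Model:
    """Send the prompt to the strong model only if it looks hard enough."""
    return strong if score(prompt) >= threshold else weak


def cost_reduction(prompts: Sequence[str], strong: Model, weak: Model,
                   threshold: float = 0.5) -> float:
    """Cost saved relative to sending every prompt to the strong model."""
    routed = sum(route(p, strong, weak, threshold).cost_per_query for p in prompts)
    baseline = strong.cost_per_query * len(prompts)
    return 1.0 - routed / baseline


strong = Model("strong-model", 10.0)  # hypothetical
weak = Model("weak-model", 1.0)       # hypothetical
prompts = [
    "What is the capital of France?",
    "Prove the theorem and derive the bound step by step",
]
for p in prompts:
    print(route(p, strong, weak).name, "<-", p)
print(f"Cost reduction: {cost_reduction(prompts, strong, weak):.0%}")
```

The threshold is the lever: raising it sends more questions to the cheaper model and lowers cost, at the risk of weaker answers. How much quality survives depends entirely on how good the router's score is, which is exactly what the RouteLLM research evaluates.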

Infrastructure

In principle, there will be no way around adding more nuclear power plants to the grid in the medium term. Nuclear power is a reliable source of electricity that also has an excellent carbon footprint. Unfortunately, new plants take a very long time to build, but in view of the increasing demand, construction should begin now to meet future electricity needs. In addition, research in the field of nuclear fusion power plants is showing promising approaches. Funds should be provided and expanded for further research.

In the field of chip production, Google DeepMind achieved impressive results with AlphaChip. For the first time, the AI's chip design was better than that of human designers. Demis Hassabis sees a positive loop here, in which AI-enhanced chips train and control better models, which in turn enable better chip design. 

“AlphaChip has generated superhuman chip layouts used in every generation of Google’s TPU since its publication in 2020. These chips make it possible to massively scale-up AI models based on Google’s Transformer architecture.”

But all in all, it still looks as if the overall situation will get worse in the short term before it can get better. AI will probably help us to find good solutions for the above-mentioned problems. But to be honest, it can be assumed that it will initially become more difficult. The demand for computing power and electricity and the resulting social inequality pose serious challenges for humanity. Although a positive feedback loop also appears here with the help of AI solutions, we must not close our eyes to the bottlenecks. 

Social inequality will presumably also increase. It cannot be assumed that poorer nations will be able to invest in AI infrastructure. Rather, dependence on developed economies such as the USA, the EU or China will increase here as well. In the long term, however, AI will reduce social inequality: in the course of the intelligence explosion, it will inexpensively find its way into even the most remote villages. Everyone will be able to consult a doctor of the highest caliber via smartphone, and every student will have access to a tutor with the knowledge of an Albert Einstein. Social inequality may grow at the infrastructure level, but in the long term AI will lead to humanity as a whole becoming much more egalitarian.

In the fourth and final part, I will try to sketch out what the future might look like.

Part 4 is coming out soon. Subscribe to the Forward Future Newsletter to have it delivered straight to your inbox.

About the author

Kim Isenberg

Kim studied sociology and law at a university in Germany and has been impressed by technology in general for many years. Since the breakthrough of OpenAI's ChatGPT, Kim has been trying to scientifically examine the influence of artificial intelligence on our society.
