There's a very interesting paper at:
They highlight four things that might slow down the growth of AI but, spoiler alert, despite these constraints the answer they come to at the end is "yes", scaling can continue. They conclude that by 2030 AI training runs of 10^29 FLOP will be happening; to put that in perspective, such a run would be 10,000 times as large as the one used to train GPT-4. And this despite four things that might slow progress down: power constraints, chip manufacturing capacity, data scarcity, and the latency wall.
Power constraints
The FLOP/s-per-watt efficiency of the GPUs used for AI training increased by a factor of 1.28 each year between 2010 and 2024; if that trend continues, and they see no reason to believe it won't, training runs will be about 4 times more energy-efficient by the end of the decade. There is also near-universal agreement that neural-net training will switch from 16-bit to 8-bit precision, and that alone would double the efficiency. They conclude that in 2030 it would take about 6 gigawatts for a year to train an AI 10,000 times the size of GPT-4. That may seem like a lot, but the total power capacity of the US is about 1,200 gigawatts.
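The arithmetic behind those figures is easy to check; here is a quick sketch (the 1.28x/year trend, the 6 GW estimate, and the 1,200 GW capacity figure come from the summary above, the compounding is mine):

```python
# Back-of-envelope check of the power figures above.

years = 2030 - 2024
efficiency_gain = 1.28 ** years          # compounding hardware efficiency trend
print(f"Hardware efficiency gain by 2030: ~{efficiency_gain:.1f}x")   # ~4.4x

# Switching training from 16-bit to 8-bit precision roughly doubles that again.
total_gain = efficiency_gain * 2

# A 6 GW training run against ~1,200 GW of total US power capacity:
run_power_gw = 6
us_capacity_gw = 1200
print(f"Share of US capacity: {run_power_gw / us_capacity_gw:.1%}")   # 0.5%
```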
Chip manufacturing capacity
There is considerable uncertainty about this. The best estimate they could come up with is that between 20 million and 400 million Nvidia H100-equivalent GPUs will be manufactured in 2030, which would be sufficient to allow training runs between 5,000 and 250,000 times larger than GPT-4's.
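To see roughly what GPU counts like these imply, here is a back-of-envelope sketch. Every parameter below is my own assumption, not a figure from the paper; the results land in the same ballpark as the range quoted above, and the exact figures depend on assumptions such as what fraction of the chips would serve a single run.

```python
# Rough scaling sketch (all parameter values are assumptions):
# how many FLOP could a fleet of H100-class GPUs deliver in a year-long run?

H100_FLOP_PER_S = 1e15     # ~1000 TFLOP/s dense 16-bit tensor throughput (rounded)
UTILIZATION = 0.4          # assumed average utilization for a large run
SECONDS_PER_YEAR = 3.15e7
GPT4_FLOP = 2e25           # a commonly cited public estimate of GPT-4's training run

for gpus in (20e6, 400e6):
    flop = gpus * H100_FLOP_PER_S * UTILIZATION * SECONDS_PER_YEAR
    print(f"{gpus:.0e} GPUs -> {flop:.1e} FLOP, ~{flop / GPT4_FLOP:.0f}x GPT-4")
```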
Data scarcity
The largest training data set used so far is 15 trillion tokens of publicly available text. The entire World Wide Web contains about 500 trillion tokens, and the nonprofit Common Crawl alone has about 100 trillion. If you include private data, that figure could go as high as 3,000 trillion tokens. Also, synthetic data is proving increasingly useful, especially in fields like mathematics, games and software engineering, because they all in effect pose NP-style questions; that is to say, questions whose answers may be very difficult to find but very easy to check for correctness.
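That find-versus-check asymmetry is what lets a model filter its own synthetic data cheaply. Integer factoring makes a simple stand-in illustration (my example, not the paper's): checking a proposed factorization is one multiplication, while finding a factor requires search.

```python
# Illustration of the "hard to solve, easy to verify" asymmetry described above.

def verify(n: int, p: int, q: int) -> bool:
    """Checking a proposed factorization is a single multiplication."""
    return p > 1 and q > 1 and p * q == n

def find_factor(n: int) -> int:
    """Finding a factor requires search (trial division here)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n is prime

n = 10007 * 10009                 # product of two primes
assert verify(n, 10007, 10009)    # one multiplication, essentially instant
# find_factor(n) performs ~10,000 trial divisions to rediscover 10007 --
# far more work than the check, and the gap grows rapidly with n.
```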
Latency wall
The authors believe this will have little effect before 2030, but it may need to be considered after that, when we reach the 10^31 FLOP level. The idea is that as neural networks get larger, the time required to pass forward and backward through the system increases. The problem can be ameliorated by gathering all the work that can be done in parallel into something they call "pods", but you reach a point of diminishing returns; if the pods get too large, the remaining work must be processed sequentially.
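A toy model makes the wall concrete (my own illustrative numbers, not the paper's): optimizer steps are inherently sequential, so the minimum latency of one forward-plus-backward pass caps how many steps fit in a run of fixed length, no matter how many GPUs are added.

```python
# Toy model of the latency wall: per-pass latency caps sequential steps per run.

pass_latency_s = 0.5          # assumed minimum time for one forward+backward pass
run_length_s = 3.15e7         # one year
max_steps = run_length_s / pass_latency_s
print(f"Max sequential steps in a year: {max_steps:.1e}")   # ~6.3e7

# Adding GPUs to a pod raises FLOP per step, but once the model and batch are
# split as finely as possible, extra GPUs no longer shrink pass_latency_s --
# that is the point of diminishing returns described above.
```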
There are ways to get over this wall but it will require changes to the basic topology of the neural nets.
Instead of relying on frequent communication between different parts of the model, computations can be organized so that more work is done locally within each computational unit.
Asynchronous communication can be used, where nodes continue processing without waiting for data from other parts of the network.
Specialized hardware such as Tensor Processing Units that have low latency and high bandwidth can be used.
Designing the network to reduce the number of hops data needs to take can also help mitigate latency.
Data compression can be used to reduce the amount of information that needs to be transferred within the system.
There are even advanced algorithms that can work with old "stale" information without significant loss in performance.
The bottom line is the authors predict that in the next few years hundreds of billions, perhaps trillions, of dollars will be spent on AI, and that it will become "the largest technological project in the history of humankind".
John K Clark