with modern AI, with neural networks and deep learning, what we saw was that if you train a small neural network, then the performance looks like this, where as you feed in more data, performance keeps getting better for much longer. If you train an even slightly larger neural network, say a medium-sized neural net, then the performance may look like that. If you train a very large neural network, then the performance just keeps on getting better and better. For applications like speech recognition, online advertising, and building self-driving cars, where having a high-performance, highly accurate, say, speech recognition system is important, this has enabled these AI systems to get much better and made, say, speech recognition products much more acceptable to users, and much more valuable to companies and to users.
Now, here are a couple of implications of this figure. If you want the best possible levels of performance, your performance to be up here, to hit this level of performance, then you need two things. One is, it really helps to have a lot of data. That's why sometimes you hear about big data. Having more data almost always helps. The second thing is, you want to be able to train a very large neural network. The rise of fast computers, including Moore's law, but also the rise of specialized processors such as graphics processing units, or GPUs, which you'll hear more about in a later video, has enabled many companies, not just the giant tech companies, to train large neural nets on a large enough amount of data to get very good performance and drive business value. In fact, it was also this type of scaling, increasing the amount of data and the size of the models, that was instrumental to the recent breakthroughs in training generative AI systems, including the large language models that we discussed just now.
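If you'd like to see the shape of the figure described above in numbers, here is a small Python sketch. It is only a toy: the function `toy_performance`, the ceiling values, and the shared logarithmic curve are all made up for illustration and are not measurements from any real system. The one idea it tries to capture is that every model improves as you add data, but a smaller network flattens out sooner, while a larger one keeps improving for longer.

```python
import math

# Hypothetical performance ceilings, chosen only to make the shapes visible.
# These are NOT measured values from any real model.
CEILING = {"small": 0.50, "medium": 0.70, "large": 0.95}


def toy_performance(model_size: str, n_examples: int) -> float:
    """Toy curve: performance rises with data until the model's ceiling is hit.

    Assumption: all models share one made-up learning curve (0.15 * log10(n));
    a model simply stops improving once it reaches its own ceiling.
    """
    shared_curve = 0.15 * math.log10(n_examples)
    return min(CEILING[model_size], shared_curve)


# Print one row per dataset size so the plateaus are easy to compare.
for n in (1_000, 10_000, 100_000, 1_000_000):
    row = ", ".join(f"{size}: {toy_performance(size, n):.2f}" for size in CEILING)
    print(f"{n:>9} examples -> {row}")
```

In this toy version, all three models look about the same when data is scarce. As the amount of data grows, the small model levels off first, then the medium one, and only the large model is still improving at the largest dataset size.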
The most important idea in AI has been machine learning, and specifically supervised learning, which means A-to-B, or input-to-output, mappings. What enables it to work really well is data.