Could we lose control of AI? Exploring the arguments of an old and reignited debate
Reuben Adams (UCL)
Since the beginning of AI, some researchers have warned that we could lose control of sufficiently advanced AI systems. In a 1951 lecture, Alan Turing proposed that “once the machine thinking method had started, it would not take long to outstrip our feeble powers,” concluding that “[a]t some stage therefore we should have to expect the machines to take control.” The recent acceleration of AI progress has thrust this question to the fore, revealing deep disagreements in the field. I will outline the main arguments on both sides, aiming to take the heat out of the debate by showing where the interesting cruxes lie. How much does the argument for losing control depend on rapid jumps in AI capabilities (spoiler: a bit), on machines becoming sentient (spoiler: not at all), on the orthogonality of intelligence and values, or on believing in an abstract notion of “general intelligence”? Can we RLHF our values into AIs? Will AIs be power-seeking by default? What would it take for AIs to become deceptive, and for us to fail to realise it? I will outline the state of the debate on these questions, concluding with my own views.