Artificial intelligence (AI) systems have reached a significant milestone: researchers at the University of California San Diego (UCSD) report that two advanced AI models, GPT-4.5 and LLaMa, have passed the Turing Test. The test, proposed in 1950 by British computer scientist Alan Turing, assesses whether a machine can produce conversational responses indistinguishable from those of a human. The findings were reported in a preprint study by lead author Cameron Jones, a researcher at UCSD’s Language and Cognition Lab.

The study involved 126 undergraduate students from UCSD and 158 participants recruited through the online platform Prolific. Participants held simultaneous five-minute online conversations with a human and an AI without knowing which was which, and were asked to identify the AI, which had been prompted to adopt a human persona. Jones noted in an X post that “people were no better than chance at distinguishing humans from GPT-4.5 and LLaMa,” leading to the conclusion that the AI had convincingly passed the Turing Test.

When adopting a humanlike persona, GPT-4.5 was judged to be human 73% of the time, surpassing the actual human participants, who were selected as “human” only 47% of the time. LLaMa was judged human 56% of the time, which, while lower than GPT-4.5, still counts as a passing result. In contrast, the baseline models ELIZA (a 1960s-era chatbot) and GPT-4o performed poorly, with win rates of only 23% and 21%, respectively.
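The claim that judges were “no better than chance” can be illustrated with a simple exact binomial test. The sketch below uses only the percentages reported above; the trial count of 100 is a hypothetical figure chosen to show the arithmetic, not the study’s actual sample size.

```python
from math import comb

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial test: probability, under chance level p,
    of an outcome at least as unlikely as observing k successes in n trials."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k]
    # Sum the probability of every outcome as extreme as (or more than) k.
    return sum(x for x in pmf if x <= threshold + 1e-12)

# Hypothetical illustration: a 73% "judged human" rate over 100 trials
# would be clearly distinguishable from the 50% chance level ...
print(binomial_two_sided_p(73, 100) < 0.001)

# ... whereas a 47% rate over 100 trials would not be.
print(binomial_two_sided_p(47, 100) > 0.05)
```

The point of the sketch is that a rate near 50% (like the human participants’ 47%) is statistically indistinguishable from guessing, while rates like GPT-4.5’s 73% are not, which is what makes the “passed” interpretation defensible.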

The researchers introduced a second experimental condition with a simpler, “no-persona” prompt, under which both AI models performed significantly worse. GPT-4.5’s performance dropped to 36%, underscoring the critical role of the prompt in guiding the AI’s responses. Jones emphasised the complexity of ascribing intelligence to such models, stating, “this should be evaluated as one among many other pieces of evidence for the kind of intelligence LLMs display.”

The study’s results have prompted discussion about the implications of these capabilities. Experts, including John Nosta, founder of the innovation think tank Nosta Lab, argue that the Turing Test may now be revealing emotional mimicry rather than machine intelligence. He remarked that participants often focused on “emotional tone, slang, and flow” rather than logical reasoning, suggesting a shift from assessing intelligence to evaluating emotional fluency.

As AI continues to advance, concerns have been raised about potential ramifications such as job automation and more convincing social engineering attacks. Jones acknowledged these risks, noting that such developments could lead to significant societal shifts. The researchers’ work represents a notable step in understanding AI’s evolving capabilities; their findings currently await peer review.

This result arrives 75 years after the test was first introduced in Turing’s seminal paper, “Computing Machinery and Intelligence,” in which he posited that if a human interrogator could not reliably distinguish a machine from a human through conversation, the machine should be considered intelligent. The recent study is a clear indication of the strides AI technology has made since then, blurring the line between human and machine communication.

Source: Noah Wire Services