California Management Review
by Carsten Lund Pedersen
Artificial intelligence (AI) seems to be on everybody’s lips at the moment, as machines are becoming increasingly capable and intelligent month by month. At this pace, we could experience artificial general intelligence (AGI), where AI meets or surpasses human intelligence at a broad variety of tasks, even sooner than many expect. Hence, it is of paramount importance that business managers understand this shifting landscape.
Against this backdrop, I (i) present a new test for assessing an AI’s apparent empathy and its capacity to elicit a strong emotional connection with humans, which I call “the Oprah test”; (ii) incorporate it into an integrated framework of relevant AI tests, the T.K.O. framework; and (iii) have ChatGPT and Claude self-assess against the framework. This gives managers an initial idea of just how advanced AI has become, and where it still has major limitations. The study contributes a conceptualization of AI advancement and derives an initial assessment of its current state of development.
The question of how best to test the advancement of AI has preoccupied the field ever since it was conceived. For instance, Turing famously proposed the so-called Turing test in 1950, whereby a machine passes if it can fool a human being into believing that its output originated from another human being, i.e. the test is passed when people cannot tell the difference between the answers generated by a human and those generated by a machine. Turing predicted that the test would be passed within about 50 years. With the advent of large language models (LLMs), we are arguably in a phase where this test seems to have been passed. While the Turing test was initially a way to circumvent the issues related to comparing human and machine intelligence, it is ultimately a test of a machine’s capacity to mimic human communication.
A lesser-known, but nevertheless important, test is the so-called Kamprad test (or Ikea test). This test combines physical and mental work: the AI must assemble an Ikea chair. This is extremely difficult for a machine (although not impossible, as it has been solved by some robots). While assembling an Ikea chair can also be difficult for humans, most people of average IQ will be able to solve the challenge within a reasonable timeframe. As a test for machines, it gauges the capacity to convert mental understanding and reasoning into physical actions, and thereby assesses embodied intelligence.
Much has been written on the perceived advantages of humans over machines in conveying empathy and eliciting emotional connection. Yet stylized facts seem to suggest that AI has advanced surprisingly fast in this domain in recent years. Hence, there is a need to assess this capacity of AI systems. Against this backdrop, I introduce the so-called “Oprah test”. Oprah Winfrey is famed for her capacity to empathize and build an emotional connection with guests and viewers. In a similar vein, the Oprah test assesses the capacity of a machine to (i) convey empathy, and (ii) develop an emotional relationship between human and machine (although such a relationship is arguably para-social, as it is only perceived and experienced as a relationship by the human counterpart). It follows that passing the Turing test is a prerequisite for passing the Oprah test.
Combining the Turing, Kamprad and Oprah tests into an integrative framework provides a multifaceted overview of the different capacities AI needs in order to match different aspects of human intelligence. It also yields the acronym T.K.O.
In boxing, T.K.O. refers to a technical knockout. In an AI setting, “a technical knockout” may be a fitting metaphor, as the fulfillment of all three tests would suggest that we, as humans, will be overwhelmed and, figuratively speaking, “knocked out” by the technological development (see the figure below). As is evident from the figure, the center, where all three tests overlap, is where an AI system has truly matched humans.
The T.K.O. framework was applied to ChatGPT and Claude to obtain an initial idea of their standing. As a first step, I had the two LLMs self-evaluate on the T.K.O. framework and provide the reasoning behind their self-assessments. Surprisingly, they gave very consistent ratings and justifications for their scores, which lends some reliability to the results.
Both ChatGPT and Claude estimate near-top performance on the Turing test and above-medium performance on the Oprah test, but still lag considerably behind on the Kamprad test (a natural conclusion, as they are LLMs and lack a body). Hence, they are placed not in the sweet spot, but rather in the overlap between Turing and Oprah in the T.K.O. Venn diagram.
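For readers who prefer to see the placement logic spelled out, the Venn diagram can be sketched as a small Python function. This is only an illustrative sketch: the `TkoResult` structure, the `venn_region` function, and the reduction of each test to a simple pass/fail flag are my assumptions, not part of the self-assessment itself, which produced graded ratings rather than binary outcomes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TkoResult:
    """Hypothetical pass/fail outcome on each of the three T.K.O. tests."""
    turing: bool
    kamprad: bool
    oprah: bool


def venn_region(result: TkoResult) -> str:
    """Return the region of the T.K.O. Venn diagram a system falls into."""
    # Collect the names of the tests passed, in T, K, O order.
    passed = [
        name
        for name, ok in (
            ("Turing", result.turing),
            ("Kamprad", result.kamprad),
            ("Oprah", result.oprah),
        )
        if ok
    ]
    if len(passed) == 3:
        return "T.K.O. (all three tests)"
    if len(passed) == 2:
        return f"{passed[0]}-{passed[1]} overlap"
    if len(passed) == 1:
        return f"{passed[0]} only"
    return "outside the diagram"


# An LLM without a body: passes Turing and Oprah, not Kamprad (assumed flags).
llm = TkoResult(turing=True, kamprad=False, oprah=True)
print(venn_region(llm))
```

Under these assumed flags, a body-less LLM lands in the Turing-Oprah overlap, and only a system passing all three tests reaches the center of the diagram.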
Yet it must also be noted that impressive advances are being made in robotics, and it is very possible that we will see a proficient robot combined with either ChatGPT or Claude in the near future, as promising work is already being done in this domain. Hence, humans may retain an advantage for now, but it is not a given in the future. It is similarly surprising how fast LLMs have advanced their empathetic reasoning and thereby become able to pass the Oprah test. The combination of being able to understand and build connections with humans while also being indistinguishable from them suggests that a “human-inspired” level of AI has been attained.
The bottom line is that AI is advancing rapidly, but humans still have an advantage in terms of the combined T.K.O. framework. Yet, with the passing of both the Turing and Oprah tests, some domains that we think of as fundamentally human can now be botsourced.