Turing Test

Are you familiar with the Turing Test? For the uninitiated, the Turing Test was developed by Alan Turing, the original computer nerd, in 1950. The idea is simple: for a machine to pass the Turing Test, it must exhibit intelligent behavior indistinguishable from that of a human being.

The test is usually conceptualized with one person—the interrogator—speaking through a computerized interface with two different entities, hidden from view. One is an actual computer; the other is a human being. If the interrogator is unable to determine which is which, the computer has passed the Turing Test.

Despite experts working on this problem for nearly seventy years, machines able to even approach success at the Turing Test have been rare. However, not being able to strictly pass the Turing Test doesn’t mean these systems—what we call chatbots today—are useless. They can handle simple tasks like taking food orders, answering basic customer support questions, and offering suggestions based on a request (like Siri and Alexa). They serve an important and growing role in our society, and it’s worth looking at how they’ve developed to this point.


ELIZA

The first true chatbot was called ELIZA, developed in the mid-1960s by Joseph Weizenbaum at MIT. On a basic level, its design allowed it to converse through pattern matching and substitution. In the same way someone can listen to you, then offer a response that involves an idea you didn’t specifically mention (“Where should we eat?” “I like that Thai place on the corner.”), ELIZA was programmed to recognize patterns of human communication and offer responses built on the same type of substitutions. This gave the illusion that ELIZA understood the conversation.

The most famous version of ELIZA used the DOCTOR script. This allowed it to simulate a Rogerian psychotherapist, and even today it gives responses oddly similar to what we might find in a therapy session—it responds to inputs by trying to draw more information out of the speaker, rather than offer concrete answers. By modern standards, we can tell the conversation goes off the rails quickly, but its ability to maintain a conversation for as long as it does is impressive when we remember it was programmed using punch cards.
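To make that mechanism concrete, here is a minimal Python sketch of ELIZA-style pattern matching and substitution, written in the reflective spirit of the DOCTOR script. The pronoun table, patterns, and canned responses are invented for illustration; they are not Weizenbaum’s original rules.

```python
import random
import re

# Swap first- and second-person words so the bot can echo the user's
# own phrasing back at them ("my job" becomes "your job").
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my", "are": "am",
}

# Each rule pairs a regular expression with a few response templates.
# "{0}" is filled in with the (reflected) text the pattern captured.
# These rules are illustrative, not Weizenbaum's original script.
RULES = [
    (re.compile(r"i need (.*)", re.I),
     ["Why do you need {0}?", "Would it really help you to get {0}?"]),
    (re.compile(r"i feel (.*)", re.I),
     ["Why do you feel {0}?", "How long have you felt {0}?"]),
    (re.compile(r"because (.*)", re.I),
     ["Is that the real reason?", "What other reasons come to mind?"]),
    (re.compile(r"(.*)", re.I),  # catch-all keeps the conversation going
     ["Please tell me more.", "How does that make you feel?"]),
]

def reflect(fragment):
    """Swap pronouns in a captured fragment."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(statement):
    """Return a response from the first rule whose pattern matches."""
    for pattern, templates in RULES:
        match = pattern.match(statement.strip(" .!?"))
        if match:
            return random.choice(templates).format(reflect(match.group(1)))

print(respond("I feel anxious about my job"))
# Possible output: Why do you feel anxious about your job?
```

Even a handful of rules like these can keep an exchange going by turning the speaker’s statements back into questions, which is exactly the illusion of understanding described above.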


PARRY

The next noteworthy chatbot came relatively soon afterward, in 1972. Sometimes referred to as “ELIZA with attitude”, PARRY simulated the thinking of a paranoid person or paranoid schizophrenic. It was designed by a psychiatrist, Kenneth Colby, who had become disenchanted with psychoanalysis due to its inability to generate enough reliable data to advance the science.

Colby believed computer models of the mind offered a more scientific approach to the study of mental illness and cognitive processes overall. After joining the Stanford Artificial Intelligence Laboratory, he used his experience in the psychiatric field to program PARRY, a chatbot that mimicked a paranoid individual—it consistently misinterpreted what people said, assumed they had nefarious motives and were always lying, and would not let them inquire into certain aspects of PARRY’s “life”. While ELIZA was never expected to mimic human intelligence—although it did occasionally fool people—PARRY was a much more serious attempt at creating an artificial intelligence, and in the early 1970s it became the first machine to pass a version of the Turing Test.


Dr. Sbaitso and A.L.I.C.E.

The 1990s saw the advent of two more important chatbots. First was a chatbot designed to actually speak to you: Dr. Sbaitso. Although similar to previous chatbots, with improved pattern recognition and substitution programming, Dr. Sbaitso became known for its weird digitized voice that sounded not at all human, yet did a remarkable job of speaking with correct inflection and grammar. Later, in 1995, A.L.I.C.E. came along, inspired by ELIZA. Its heuristic matching patterns proved a substantial upgrade on previous chatbots; although it never passed a true Turing Test, upgrades to A.L.I.C.E.’s algorithm made it a Loebner Prize winner in 2000, 2001, and 2004.


Speaking of the Loebner Prize

Since the invention of ELIZA and PARRY, chatbot technology has continued to improve; however, the most notable contribution of the last thirty years has arguably come in the form of the Loebner Prize. Instituted in 1991 and continuing to the present day, the annual competition awards prizes to the most human-like computer programs. Initially the competition required judges to have highly restricted conversations with the chatbots, which drew a great deal of criticism; for example, the rules at first required judges to limit themselves to “whimsical conversation”, which played directly into the odd responses often generated by chatbots. Time limits also worked against truly testing the bots, as only so many questions could be asked in five minutes or less given the less-than-instant response speeds of computers of that era. One critic, Marvin Minsky, even offered a prize in 1995 to anyone who could stop the competition.

However, the restrictions of the early years were soon lifted, and from the mid-1990s on there have been no limitations placed on what the judges discuss with the bots. Chatbot technology improves every year in part thanks to the Loebner Prize, as programmers chase a pair of one-time awards that have yet to be won. The first is $25,000 for the first program that judges cannot distinguish from a real human, and that can even convince judges that its human counterpart is the computer. The other is $100,000 for the first program to pass a stricter Turing Test, one in which it must decipher and understand not just text but auditory and visual input as well. Pushing AI development to be capable of this was part of Loebner’s goal in starting the competition; as such, once the $100,000 prize is claimed, the competition will end.


Siri and Alexa

Of course, as important as these goals are, chatbots have been developed with other goals in mind. Siri and Alexa, for example, are artificial intelligences that make no attempt to pass as human; Apple and Amazon, respectively, improve them by enhancing their ability to find relevant answers to our questions. In addition, many of us are familiar with Watson, the computer that competed on Jeopardy! It works not by attempting to be human, but by processing natural language and using that “understanding” to search vast stores of information. The approach proved very successful—in 2011, Watson beat a pair of former Jeopardy! champions.

We should also note that not all chatbot experiments are successful. The most recent failure, and certainly the most high-profile, was Tay, Microsoft’s Twitter-based chatbot. The intent was for Tay to interact with Twitter users and learn how to communicate with them. Unfortunately, in less than a day, Tay’s primary lesson was how to be incredibly racist, and Microsoft shut down the account.

Even in that negative instance, however, the technology showed it was definitely capable of learning. For Tay’s creators, and for anyone else seeking to build something similar, the next task is to figure out how to filter out bad lessons or tightly control a bot’s learning sources. More broadly speaking, all of these examples show how chatbots have evolved, continue to evolve, and are certainly something we should expect to see more and more of in the coming years and decades.