Teaching Languages To Artificial Intelligence

Teaching artificial intelligence a new language is not that new; it has been happening for years. Here is a history lesson from when AI was just getting started:

“A few years ago, artificial intelligence researchers discovered that they could make computer programs that ‘learned’ without ever being given symbols and rules to manipulate the symbols. This approach, which was a radical departure from traditional artificial intelligence research, has sharply divided the scientific community. It is the basis of the language acquisition model proposed by Rumelhart and McClelland.

The ruleless systems are called connectionist networks because they consist of a densely connected network of processors. The processors transmit signals that vary in intensity according to the strengths of the signals that each unit of the network receives. Connectionist models work by learning routines to adjust the strengths of the connections between the processors. No rules are fed into the system but, at the end, when the network adjusts itself, something very much like rules are learned.

Rumelhart and McClelland wanted to explain with their connectionist model what really goes on when children learn to speak. Are there rules that children gradually discover or do they learn by a process that more closely resembles forming analogies? And if language acquisition can be understood, can it also be modeled by a computer? In particular, the two researchers wanted to explain the strange process that children go through when they learn to form the past tense of verbs.

The past tense in English is almost incredibly simple. To form the past tense of the verb, you almost always add ‘ed’ to the end of it, unless it is an irregular verb, such as ‘go’ or ‘bring.’ Among the 150 or so irregular verbs, ‘go’ is in a class by itself. The rest of the verbs fall into about 20 groups, such as the group containing the verbs keep, sleep, and weep. All the verbs in a group form past tenses in the same way.

Because the past tense is so simple, says Prince, ‘linguists have not studied it much. When you go to graduate school, it is not the sort of thing you linger over.’

But psycholinguists have discovered that when young children learn to speak, they start out by forming irregular past tenses correctly and then they get worse–they over regularize. Finally they learn the correct forms again. For example, children start out by saying ‘brought’ and ‘went.’ Then they switch to ‘bringed’ and ‘goed’ before they relearn the correct irregular forms.

The standard explanation of over regularization is that children, when they first learn to speak, memorize words one by one without regard for any relations between them. Later, they discover the past tense rule and run amok with it, over regularizing, because they do not grasp the structure of the language. Finally, they learn the exceptions to the past tense rule and their speech becomes correct again. The idea is that children eventually learn the past tense rule for regular verbs and learn the irregular past tenses by analogy.

Rumelhart and McClelland started out by assuming that this standard explanation is correct. Only after they developed their connectionist model of language acquisition did they question it. Rumelhart, in fact, used to illustrate the observation that children learn rules for forming the past tense by playing for his students a tape of his own little boy, who was 5 years old when the tape was made.

In the recording, Rumelhart asked his son what grade comes before the seventh grade. ‘Sixth,’ the boy replied. Then Rumelhart asked what grade is before the sixth grade. ‘Fifth,’ the boy said. What is before fifth? ‘Fourth.’ What is before fourth? ‘Thirdth.’ What is before third? ‘Secondth.’ What is before that? ‘Firsth.’

Then Rumelhart asked the question in the opposite order. What grade is after kindergarten? His son replied ‘First.’ What is after first? ‘Second.’ Rumelhart continued up to grade seven and, this time, the boy got all the words right.

‘I would play this tape for students and would tell them that it was obvious that the kid had learned a general rule,’ Rumelhart says. He did not worry about the fact that his child got the words wrong when he went in descending order and got them right when he went in ascending order.

Rumelhart told his students that the past tense is learned in the same way–with rules. Then Rumelhart and McClelland noticed that connectionist networks tend to over regularize in the same way that children do when they learn to speak. Rumelhart explains that ‘when there are some things with regular patterns and others with unusual patterns, the networks learn the regular patterns first and apply them where the more unusual ones should be applied.’ Only later does the network learn the unusual patterns.”
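
To make the idea of "adjusting the strengths of the connections" a little more concrete, here is a minimal Python sketch. It is not the Rumelhart and McClelland model. It is a single threshold unit trained with the much simpler perceptron rule, and the training patterns, the function names, and the learning rate are all made up for illustration. The point is only that nobody writes a rule into the program; the unit ends up behaving as if it had one because its connection strengths were nudged after each mistake.

```python
# Toy sketch: one threshold unit whose connection strengths are adjusted
# by the perceptron rule. Illustration only; not the Rumelhart-McClelland model.

# Hypothetical training patterns: (input signals, desired output).
PATTERNS = [
    ((0, 0), 0),
    ((0, 1), 1),
    ((1, 0), 1),
    ((1, 1), 1),
]


def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of incoming signals exceeds zero."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(patterns, rate=0.1, epochs=50):
    """Nudge each connection strength whenever the unit answers wrongly."""
    weights = [0.0] * len(patterns[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in patterns:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


if __name__ == "__main__":
    weights, bias = train(PATTERNS)
    for inputs, target in PATTERNS:
        print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

After training, the unit answers as though it had been told "fire if either input is on," even though that rule appears nowhere in the code. A real connectionist model of the past tense would need many units and a far richer encoding of verbs, so treat this only as a picture of the learning step.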

Teaching Spanish to an AI that knows only English is another matter, and that has not happened yet. Still, can teaching an AI a new language improve our own understanding of how humans might learn languages more easily? Learning Spanish is still rather hard and time-consuming for people, but perhaps it could be made easier if AI can teach us some tricks.