Artificial intelligence bot beats humans at reading in a first for machines

A deep neural network model developed by Alibaba has scored higher than humans in a reading comprehension test, paving the way for bots to replace people in customer service jobs.

Artificial intelligence (AI) software developed by Alibaba Group has performed better than humans in a global reading comprehension test, the first time that machines have outperformed people on the test.

The AI research arm of China’s biggest online commerce company developed a machine-learning model that scored higher than humans on the Stanford Question Answering Dataset, a large-scale reading comprehension test with more than 100,000 questions, according to a release by the company.

On January 11, Alibaba’s machine-learning models scored 82.44 on the test, compared with 82.304 by humans.

While computers have beaten humans at complex games like chess, where raw computing power and an infallible memory have given bots an advantage, languages are generally seen as harder for machines to master. Until now.

The win has broader implications for how companies deploy machine learning to take over customer service work that has so far relied on armies of call-centre employees to handle inquiries.

Si Luo, a chief scientist of natural language processing at Alibaba’s research arm, said the recent breakthrough means that questions such as “what causes rain?” can now be answered with a high level of accuracy by machines.

“We believe the underlying technology can be gradually applied to numerous applications such as customer service, museum tutorials, and online response to inquiries from patients, freeing up human efforts in an unprecedented way,” Si said.

Si’s team has worked closely with Ali Xiaomi, a mobile customer service chatbot that retailers on Alibaba’s online market platforms like Taobao and Tmall can customise for their virtual stores.

As in the Stanford test, the machine-learning model can identify the questions raised by consumers and look for the most relevant answers in prepared documents.

Even so, the Alibaba scientist said that the system currently works best only with questions that have clear-cut answers. If the language or expressions are too vague or ungrammatical, or there is no prepared answer, the bot may not work properly.
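
To give a rough sense of what matching a customer question against prepared answers can look like, here is a minimal Python sketch. It is not Alibaba's model, which is a deep neural network whose details are not given here. The sketch substitutes a much simpler word-overlap (Jaccard) score, and the prepared question-and-answer pairs, the function names and the 0.3 threshold are all hypothetical choices made for illustration.

```python
import re

# Hypothetical prepared question/answer pairs, invented for this sketch.
PREPARED_ANSWERS = {
    "What causes rain?": "Water vapour condenses into droplets that fall when they grow heavy enough.",
    "When will my order be delivered?": "Delivery times are shown on the order page.",
    "How do I return an item?": "Returns can be requested from the order page.",
}


def tokenize(text):
    """Lower-case the text and return its words as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words two sets have in common (0.0 when either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def answer(question, min_score=0.3):
    """Return the prepared answer whose question overlaps most, or None.

    min_score is an assumed cut-off: below it the sketch gives up rather
    than guess, mirroring the case where no prepared answer fits.
    """
    asked = tokenize(question)
    best_score, best_answer = 0.0, None
    for known_question, known_answer in PREPARED_ANSWERS.items():
        score = jaccard(asked, tokenize(known_question))
        if score > best_score:
            best_score, best_answer = score, known_answer
    return best_answer if best_score >= min_score else None
```

Calling `answer("what causes rain")` returns the prepared answer about rain, while a vague input with no matching words, such as `answer("umm thing broke maybe??")`, returns `None`, the point at which a real service would presumably hand over to a human.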

A number of top-notch international universities and global technology firms, including Google, Facebook, IBM and Microsoft, have all used the Stanford test to determine whether the machine learning models they built can answer the questions in the data set.

Alibaba, which owns the South China Morning Post, has employed the underlying technology during its November 11 shopping festival over the years, with machines answering huge volumes of inbound inquiries during the sales period, the company said.

Apart from online services, a number of Chinese technology companies, including Alibaba, have released smart music speakers that can identify voice commands and come up with answers and solutions.

In late November, Baidu and Xiaomi laid out their vision for smart devices that will deliver an enhanced experience to users, powered by AI and connected to the internet of things. Apple Inc. is also due to release its much-anticipated voice-controlled Siri home assistant, called the HomePod, this year to compete with similar products launched by Amazon and Google.

This article appeared in the South China Morning Post print edition as: Alibaba software betters humans in global reading test
