UPDATED 14:51 EDT / SEPTEMBER 24 2020

AI

Facebook’s Dynabench tool fools neural networks to advance AI research

Facebook Inc. today debuted Dynabench, a research tool it hopes will help computer scientists develop more powerful natural-language processing models.

To build cutting-edge neural networks that advance the state of the art, researchers need a way of comparing their models with those developed by peers. Accurate comparisons are a prerequisite to verifying that a new model is indeed better than existing entries in the field. This process is known as benchmarking.

With Dynabench, Facebook hopes to address shortcomings it sees in current benchmarking methods and facilitate the creation of more robust artificial intelligence software.

Researchers most commonly assess their models using test datasets, essentially collections of standardized questions. Several such test datasets exist in the natural-language processing field. The issue is that, because of the rapid pace at which AI models are improving, the tests can become outdated over time, leaving researchers without a reliable means of assessing a neural network’s accuracy and comparing it with existing ones.
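For a rough sense of what that static benchmarking looks like in practice, the sketch below scores a toy model against a fixed list of question-and-answer pairs. The model and the data are made-up stand-ins for illustration, not Dynabench or Facebook code.

```python
# Minimal sketch of static benchmarking: score a model on a fixed test set.
# The "model" and the dataset here are hypothetical stand-ins.

def toy_model(question: str) -> str:
    """A stand-in 'model' that answers from a hard-coded lookup table."""
    answers = {
        "Is the sky blue?": "yes",
        "Is fire cold?": "no",
    }
    return answers.get(question, "unknown")

# A fixed, pre-packaged test dataset: (question, expected answer) pairs.
static_test_set = [
    ("Is the sky blue?", "yes"),
    ("Is fire cold?", "no"),
    ("Do fish breathe air?", "no"),
]

correct = sum(1 for q, gold in static_test_set if toy_model(q) == gold)
accuracy = correct / len(static_test_set)
print(f"Static benchmark accuracy: {accuracy:.2%}")
```

Because the questions never change, a model can eventually saturate a test like this, which is exactly the staleness problem Dynabench is meant to address.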

Enter Dynabench. Facebook’s solution to the challenge is to partially crowdsource the benchmarking process by bringing human testers into the loop. The idea is that humans can probe a model’s accuracy more effectively than a set of prepackaged test questions can, by coming up with harder, more creative challenges for the neural network.

Dynabench “measures how easily AI systems are fooled by humans, which is a better indicator of a model’s quality than current static benchmarks provide,” explained Facebook researchers Douwe Kiela and Adina Williams. “This metric will better reflect the performance of AI models in the circumstances that matter most: when interacting with people, who behave and react in complex, changing ways that can’t be reflected in a fixed set of data points.”

When an AI completes a round of testing, Dynabench identifies the questions that fooled the model and compiles them into a new test dataset. Researchers can use this dataset to help them build newer, more sophisticated models. Then, once a model is developed that can answer the questions the first AI couldn’t, Dynabench repeats the process and compiles another test dataset with even harder questions. 
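The sketch below simulates that cycle end to end: humans submit challenges, the examples that fool the model are collected into a new test set, and a stronger model is assumed for the next round. Every name in it, including the toy model and the simulated question generator, is a hypothetical illustration rather than Dynabench’s actual interface.

```python
import random

# Self-contained toy simulation of the dynamic benchmarking loop described above.
# The model, the "human" challenges and the retraining step are all made up.

random.seed(0)

class ToyModel:
    """Stand-in model that answers a fixed fraction of questions correctly."""
    def __init__(self, skill: float):
        self.skill = skill

    def predict(self, question: str, gold: str) -> str:
        # Pretend the model answers correctly with probability `skill`.
        return gold if random.random() < self.skill else "wrong answer"

def collect_fooling_examples(model: ToyModel, num_attempts: int):
    """Simulated human testers write challenges; keep the ones the model gets wrong."""
    fooled = []
    for i in range(num_attempts):
        question, gold = f"tricky question #{i}", "expected answer"
        if model.predict(question, gold) != gold:
            fooled.append((question, gold))
    return fooled

model = ToyModel(skill=0.6)
for round_number in range(1, 4):
    new_test_set = collect_fooling_examples(model, num_attempts=100)
    print(f"Round {round_number}: model fooled on {len(new_test_set)} of 100 examples")
    # Pretend the fooling examples are used to train a stronger model for the next round.
    model = ToyModel(skill=min(0.95, model.skill + 0.15))
```

Each round yields fewer fooling examples as the simulated model improves, mirroring the escalating difficulty of the datasets Dynabench compiles.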

The goal is to create a “virtuous cycle of progress in AI research,” as Facebook’s Kiela and Williams put it.

Having a more reliable tool for assessing model accuracy could benefit not only researchers but also enterprises that use AI in their applications. If enterprise software engineers have a clearer view of how well different AI models handle a given task, they can more effectively pick the model most suitable for their application from the countless options available. That, in turn, can translate to a better user experience and fewer costly errors.

Image: Facebook
