Facebook debuts Dynaboard and Dynascore to advance AI research
Facebook Inc. today detailed Dynaboard and Dynascore, two new innovations developed by its artificial intelligence group that aim to help computer scientists build better machine learning models.
Dynaboard and Dynascore are implemented as components of Dynabench, a tool for AI researchers that Facebook open-sourced last year.
The technologies aim to streamline a task that isn't discussed often but represents an important part of nearly all AI research projects: benchmark testing. When computer scientists develop a new AI model, for example a neural network intended to analyze text faster than existing algorithms, they need to test its performance to ensure that it indeed outperforms earlier software.
That means the ability to benchmark AI models accurately is essential for research teams to verify the success of their projects. It’s also important for the startups and other companies working to apply researchers’ work in the field.
Benchmark test results are used by a company’s developers to find the machine learning model most suitable for the task they’re looking to automate, and choosing a non-optimal model can make their software less efficient. An efficiency difference of even a few percentage points has the potential to make a big impact at enterprise scale.
The issue Facebook hopes to address with Dynaboard and Dynascore is that generating reliable benchmark test results is often a challenge. An AI model may outperform other neural networks in one test, but fall behind in another. Because of the complexity of AI software, even slight variations in the way tests are performed can produce big differences in performance results, making it difficult to see how an AI compares against other algorithms.
Dynaboard and Dynascore are intended to ease the task by providing a standard, reliable way of comparing AI models. Dynaboard is an "evaluation-as-a-service" framework that can be used to create a cloud environment for carrying out AI tests. Dynascore is a metric, developed by Facebook researchers based on concepts borrowed from the field of microeconomics, by which different AI models can be assessed against one another. It appears as an easy-to-understand score next to each model being evaluated.
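The article doesn't spell out the exact formula, but conceptually Dynascore folds several per-metric results into a single number. As a rough illustration only, the metric names and weights below are hypothetical assumptions rather than Dynascore's actual, microeconomics-derived aggregation; a simple weighted combination might look like this:

```python
# Hypothetical sketch of folding several benchmark metrics into one score.
# The metric names and weights are illustrative assumptions, not Dynascore's
# actual formula, which Facebook derives from microeconomic utility concepts.

def aggregate_score(metrics: dict, weights: dict) -> float:
    """Combine normalized per-metric results into a single weighted score."""
    total_weight = sum(weights.values())
    return sum(metrics[name] * weights[name] for name in weights) / total_weight

# Example: accuracy carries the most weight, but robustness, fairness and
# efficiency all pull the final number up or down.
metrics = {"accuracy": 0.91, "robustness": 0.84, "fairness": 0.88, "throughput": 0.73}
weights = {"accuracy": 4.0, "robustness": 1.0, "fairness": 1.0, "throughput": 1.0}
print(f"aggregate score: {aggregate_score(metrics, weights):.3f}")
```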
An AI model's Dynascore ranking is determined by multiple factors beyond just how accurate it is. The framework also looks at robustness, or how well a model reacts to challenges such as spelling mistakes in text.
“An NLP model should be able to capture that a ‘baaaad restuarant’ is not a good restaurant, for instance, to be considered [flexible] under challenging situations,” Facebook’s researchers explained. “We evaluate robustness of a model’s prediction by measuring changes after adding such perturbations to the examples” used to evaluate an AI in benchmark tests.
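As a loose illustration of that perturbation idea, not Dynaboard's actual code, robustness can be estimated by checking how often a model's predictions stay the same after corrupting the input. The `classify` function below is a hypothetical stand-in for any text classifier:

```python
import random

# Hypothetical sketch of perturbation-based robustness checking.
# `classify` stands in for a real text model; here it's a trivial stub.

def classify(text: str) -> str:
    """Toy stand-in for a real sentiment model."""
    return "negative" if "bad" in text.lower() else "positive"

def add_typos(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Randomly duplicate characters to simulate spelling mistakes."""
    rng = random.Random(seed)
    return "".join(ch * 2 if ch.isalpha() and rng.random() < rate else ch for ch in text)

def robustness(examples: list[str]) -> float:
    """Fraction of examples whose prediction survives the perturbation."""
    unchanged = sum(classify(t) == classify(add_typos(t)) for t in examples)
    return unchanged / len(examples)

examples = ["a bad restaurant", "a wonderful meal", "service was bad"]
print(f"prediction stability under typos: {robustness(examples):.2f}")
```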
Fairness is another key AI evaluation metric the software uses. “The AI community is in the early days of understanding the challenges of fairness and potential algorithmic bias,” the researchers said. “At the launch of Dynaboard, we’re starting off with an initial metric relevant to NLP tasks that we hope serves as a starting point for collaboration with the broader AI community.”
The other metrics by which the technology evaluates AI models focus on a neural network’s infrastructure utilization. That’s a key consideration because machine learning applications are often constrained by hardware limitations. A neural network that crunches data with industry-leading accuracy but is too complex to run on a phone, for example, isn’t a practical choice if a company is looking to build an AI-enabled mobile app.
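The article doesn't list exactly which efficiency figures Dynaboard records, but throughput and memory footprint are typical candidates. A rough, framework-agnostic way to collect such numbers, purely illustrative and using a hypothetical `predict` function, might be:

```python
import time
import tracemalloc

# Illustrative sketch of recording throughput and peak memory for a model call.
# `predict` is a hypothetical stand-in for whatever inference function is tested.

def predict(batch):
    """Toy stand-in that just lowercases each input string."""
    return [text.lower() for text in batch]

def profile(batch, runs: int = 100):
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(runs):
        predict(batch)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "examples_per_second": runs * len(batch) / elapsed,
        "peak_memory_kb": peak / 1024,
    }

print(profile(["Some example sentence."] * 32))
```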
After evaluating all these factors in testing, Dynaboard produces a Dynascore ranking with which users can compare an AI model against others in its category. The testing itself is carried out using Dynabench.
Dynabench allows researchers to create multi-round tests that evaluate machine learning models with increasingly difficult tasks. After each round, the tasks that the model failed to complete are turned into a new, more complicated series of challenges.
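In that spirit, the loop below sketches the general idea: collect the examples a model gets wrong and seed the next, harder round with them. It's an illustration of the concept, not Dynabench's implementation; `model_predict` and `make_harder` are hypothetical placeholders:

```python
# Conceptual sketch of a multi-round benchmarking loop in which failed
# examples feed the next round. `model_predict` and `make_harder` are
# hypothetical placeholders, not Dynabench APIs.

def model_predict(text: str) -> str:
    """Toy stand-in for the model being benchmarked."""
    return "negative" if "bad" in text else "positive"

def make_harder(text: str) -> str:
    """Pretend to craft a trickier variant of a failed example."""
    return text.replace("bad", "not exactly great")

def run_rounds(examples, labels, rounds: int = 3):
    dataset = list(zip(examples, labels))
    for round_no in range(1, rounds + 1):
        failures = [(x, y) for x, y in dataset if model_predict(x) != y]
        print(f"round {round_no}: {len(failures)} failures out of {len(dataset)}")
        if not failures:
            break
        # Failed examples seed the next, more difficult round.
        dataset = [(make_harder(x), y) for x, y in failures]

run_rounds(["a bad day", "a great day", "not bad at all"],
           ["negative", "positive", "positive"])
```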
Facebook says Dynaboard and Dynascore are already showing promise as useful tools for improving AI benchmarking. The company’s researchers used the technologies to rank a handful of today’s most advanced natural-language processing models. They then compared the scores against the popular SuperGLUE ranking of AI models and found that the results roughly matched.
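One simple way to check whether two leaderboards "roughly match" is a rank-correlation statistic. The snippet below only illustrates that kind of comparison; the rank positions are made up rather than taken from the actual Dynascore or SuperGLUE leaderboards:

```python
# Illustration of comparing two model rankings with Spearman rank correlation.
# The rank positions are made-up examples, not real leaderboard data.

def spearman(rank_a: list[int], rank_b: list[int]) -> float:
    """Spearman correlation for two lists of ranks with no ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical positions of five models on two different leaderboards.
leaderboard_one = [1, 2, 3, 4, 5]
leaderboard_two = [1, 3, 2, 4, 5]

print(f"rank correlation: {spearman(leaderboard_one, leaderboard_two):.2f}")
```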
“We hope Dynabench will help the AI community build systems that make fewer mistakes, are less subject to potentially harmful biases, and are more useful and beneficial to people in the real world,” the researchers said.