UPDATED 09:00 EDT / NOVEMBER 11 2020

BIG DATA

Synthesized debuts a free tool for identifying and removing biased data

Artificial intelligence startup Synthesized Ltd. today launched a tool for companies to detect and remove bias in the data they use for their AI projects.

Data bias is a big problem when it comes to AI models, which are usually trained using enormous datasets. It refers to a kind of error in which certain elements of a dataset are more heavily weighted or represented than others. A biased dataset doesn’t accurately represent a model’s use case, resulting in skewed outcomes, low accuracy levels and analytical errors.
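As a rough illustration of what "more heavily represented" means in practice, the sketch below measures how far each group's share of a dataset sits from a reference distribution. The function names, the toy records and the 50/50 reference are assumptions made for this example; nothing here comes from Synthesized.

```python
from collections import Counter

def group_shares(rows, attribute):
    """Return each group's share of the rows for the given attribute."""
    counts = Counter(row[attribute] for row in rows)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

def representation_gap(shares, reference):
    """Largest absolute difference between observed and reference shares."""
    groups = set(shares) | set(reference)
    return max(abs(shares.get(g, 0.0) - reference.get(g, 0.0)) for g in groups)

# Hypothetical toy dataset: 8 of 10 records belong to one group.
rows = [{"gender": "male"}] * 8 + [{"gender": "female"}] * 2
shares = group_shares(rows, "gender")

# Assumed reference: an even split between the two groups.
gap = representation_gap(shares, {"male": 0.5, "female": 0.5})
print(shares, round(gap, 2))
```

A gap near zero means the dataset's mix is close to the reference, while a large gap flags an attribute worth a closer look. What counts as the right reference depends on the use case.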

To ensure accuracy, the training data used for AI models has to be more representative of the real world. This is important because this data is essentially how machines learn to do their job.

Synthesized’s new Community Edition Bias Mitigation tool is designed to understand a wide range of legal and regulatory definitions regarding contextual bias that might lead to inaccuracies within data, across attributes such as gender, age, race, religion, sexual orientation and more.

The platform is also capable of automatically removing the biases it finds via a process it calls rebalancing.

Synthesized’s platform relies on a proprietary algorithm that can find and remove biases from data. Once the biased data has been identified, the platform makes randomized changes to the original dataset to create an entirely synthesized yet unbiased dataset that can be used to train AI models more accurately.

“With the generation of synthetic data, Synthesized’s platform gives its users the ability to equally distribute all attributes within a dataset to remove bias and rebalance the dataset completely,” the company said. “Users can also manually change singular data attributes within a dataset, such as gender, providing granular control of the rebalancing process.”
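Synthesized's algorithm is proprietary, so the following is only a minimal sketch of the general idea of rebalancing, and it swaps in a much simpler technique: random oversampling, which repeats existing rows from smaller groups until every group is the same size. It does not generate synthetic records the way the company describes, and the function name and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def rebalance_by_oversampling(rows, attribute, seed=0):
    """Resample rows so every group of `attribute` has the same count.

    Plain random oversampling, used here as a stand-in for illustration.
    It is NOT Synthesized's synthetic-data method.
    """
    if not rows:
        return []
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)

    # Every group is grown to the size of the largest group.
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw the missing rows with replacement from the same group.
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy dataset: 8 records in one group, 2 in the other.
rows = [{"gender": "male"}] * 8 + [{"gender": "female"}] * 2
balanced = rebalance_by_oversampling(rows, "gender")
counts = Counter(row["gender"] for row in balanced)
print(counts)
```

Oversampling only duplicates what is already there, which is one reason a synthetic-data approach might be preferred: it can in principle produce new, varied records for underrepresented groups.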

To use Synthesized’s tool, users sign up at its website and upload a structured data file such as an Excel spreadsheet to start the analysis. It’s also possible to connect the tool to a relational database service hosted on Amazon Web Services, Microsoft Azure, Oracle or Google Cloud Platform, and build custom datasets for analysis. Once the analysis is done, the tool provides a Synthesized Total Fairness Score that shows what percentage of the dataset contains biased data, and highlights areas of the data where bias was detected.
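The company has not said how the Synthesized Total Fairness Score is calculated. For a sense of what a dataset-level fairness measure can look like, the sketch below computes a different, widely used quantity: the demographic parity gap, which is the difference in favorable-outcome rates between groups. The column names and the hiring records are made up for the example.

```python
from collections import defaultdict

def positive_rates(rows, attribute, outcome):
    """Share of rows with a truthy `outcome`, per group of `attribute`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[attribute]
        totals[group] += 1
        if row[outcome]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

def parity_gap(rates):
    """Difference between the highest and lowest group rates."""
    return max(rates.values()) - min(rates.values())

# Hypothetical hiring records: 6 of 10 men hired, 3 of 10 women hired.
applicants = (
    [{"gender": "male", "hired": True}] * 6
    + [{"gender": "male", "hired": False}] * 4
    + [{"gender": "female", "hired": True}] * 3
    + [{"gender": "female", "hired": False}] * 7
)
rates = positive_rates(applicants, "gender", "hired")
gap = parity_gap(rates)
print(rates, round(gap, 2))
```

Here the gap comes out at about 0.3, meaning one group is hired at a rate 30 percentage points higher than the other. This is not the vendor's score, only one common way to quantify this kind of disparity.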

Synthesized said its tool is able to analyze just about any kind of dataset, including financial data that’s used to create credit ratings, insurance data that’s used to assess claims, and human resources data that’s used in the hiring process.

The free Bias Mitigation tool is part of Synthesized’s wider data preparation platform. The company offers a complete set of AI-based tools that automate data provisioning and data preparation while also ensuring compliance with regulations.

“The reputational risk of all organizations is under threat due to biased data and we’ve seen this will no longer be tolerated at any level,” said Synthesized founder and Chief Executive Dr. Nicolai Baldin. “Synthesized’s Community Edition for Bias Mitigation is one of the first offerings specifically created to understand, investigate and root out bias in data.”

Image: Synthesized/Twitter
