UPDATED 19:08 EDT / NOVEMBER 11 2020

CLOUD

Amazon makes data handling easier with AWS Glue DataBrew tool

Amazon Web Services Inc. today announced an extension to its AWS Glue data cleansing service, adding a visual user interface that helps automate some of the steps involved in cleaning and normalizing data without writing any code.

The new tool, AWS Glue DataBrew, simplifies the extract, transform and load, or ETL, process that data must go through before it can be sent to a database, data lake or data warehouse for analysis.

AWS Glue was introduced in 2016 to let engineers do ETL, but using it involves a fair amount of coding. DataBrew, on the other hand, makes it possible for workers who don’t have coding skills to do the same data preparation work simply by clicking their way through a visual UI.

Amazon said AWS Glue DataBrew includes more than 250 pre-built transformations that help to automate essential data prep tasks such as filtering anomalies, standardizing formats and correcting invalid values. It said that these tasks would otherwise likely take days or weeks to perform using hand-coded transformations.

Once the data is prepared, it can then be queried using analytics tools or used to train machine learning models, for example.

AWS has posted a video demonstration on YouTube that shows how DataBrew can remove special characters, such as an ampersand in a database entry, that can interfere with data analysis. In another example, DataBrew uses a categorical mapping function to map text strings to numeric values so that those entries can be analyzed. There’s also a profiling function in DataBrew that provides useful information such as the number of missing entries in each data set.
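For readers curious what those three operations amount to when written by hand, here is a minimal Python sketch. It is not DataBrew's API or implementation: the sample rows and the function names remove_special_characters, categorical_map and count_missing are hypothetical, chosen only to illustrate the kind of hand-coded transformation the visual tool is meant to replace.

```python
import re

# Hypothetical sample rows; None marks a missing entry.
rows = [
    {"company": "Smith & Sons", "size": "small"},
    {"company": "Acme #1", "size": "large"},
    {"company": "Globex", "size": None},
]


def remove_special_characters(text):
    """Keep only letters, digits and whitespace, then collapse repeated spaces."""
    cleaned = re.sub(r"[^A-Za-z0-9\s]", "", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def categorical_map(values):
    """Give each distinct non-missing label an integer code, in order of first appearance."""
    codes = {}
    for value in values:
        if value is not None and value not in codes:
            codes[value] = len(codes)
    return codes


def count_missing(records):
    """Count missing (None) entries per column, a very small stand-in for profiling."""
    counts = {}
    for record in records:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


if __name__ == "__main__":
    print([remove_special_characters(r["company"]) for r in rows])
    print(categorical_map(r["size"] for r in rows))
    print(count_missing(rows))
```

Even this toy version has to make decisions about which characters count as special, how to order category codes and what counts as missing. Those are the choices a pre-built transformation makes for the user.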

The launch of AWS Glue DataBrew will likely put Amazon in direct competition with companies such as Talend SA, which specializes in data cleansing.

“Data is the fuel for AI-based apps, but there aren’t enough people capable of prepping and providing the data for all the AI that has to be built,” Constellation Research Inc. analyst Holger Mueller said. “This is why low-code offerings like AWS Glue DataBrew will be critical to enable technically savvy business users to take charge of their AI destiny.”

The service has already been endorsed by some big-name customers, including Japanese telecommunications firm NTT Docomo Inc., the British energy giant bp p.l.c. and Invista, a subsidiary of Koch Industries, Inc. that makes polymers, fabrics and fibers.

“Data is critical to optimizing our manufacturing processes [but] the data ingested into our data lake often contains duplicate values, incorrect formatting and other imperfections that make it difficult to use in its raw form,” said Invista’s analytics and cloud leader Tanner Gonzalez. “Amazon AWS Glue DataBrew will allow our data analysts to visually inspect large data sets, clean and enrich data, and perform advanced transformations.”

Amazon said AWS Glue DataBrew is generally available now in its US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (Frankfurt), Asia Pacific (Sydney) and Asia Pacific (Tokyo) regions.

Image: Amazon Web Services
