UPDATED 14:19 EDT / SEPTEMBER 06 2022

BIG DATA

Three insights you might have missed at The Data Doesn’t Lie … Or Does It?

“We’ll expose data lies, big lies, little lies, white lies and hidden truths,” theCUBE industry analyst Dave Vellante promised as he introduced “The Data Doesn’t Lie … Or Does It?” “And we’ll challenge age-old data conventions and bust some data myths.”

The super-webinar, which aired live on theCUBE Sept. 1, delivered on its promise to cover some rather contentious content.

“I think we were able to debunk a few myths that have cost companies a lot,” said Justin Borgman, co-founder and chief executive officer of Starburst Data Inc., speaking with theCUBE after the event.

Borgman was one of the expert panelists invited to debate potential data lies during “The Data Doesn’t Lie … Or Does It?” He was joined by fellow data architecture experts Teresa Tung, cloud-first chief technologist at Accenture PLC, and Richard Jarvis, chief technology officer of EMIS Group PLC. (* Disclosure below.)

In case you missed the “The Data Doesn’t Lie … Or Does It?” event, here are theCUBE’s top three takeaways:

1) The pursuit of a single source of truth is pointless.

The statement that “the most effective data architecture is one that is centralized with a team of data specialists serving various lines of business” is absolutely a lie, according to Borgman. Before the onslaught of big data in the 2010s, creating a single point of truth for data was a possibility. Today, the amount of data constantly flowing into digital organizations makes the effort a waste of time and money, Borgman added. Data silos persist, despite Herculean attempts to create centralized enterprise data warehouses.

Migrating data into an enterprise data warehouse (EDW) and keeping it up to date there is a losing battle, according to Tung. She acknowledged that while her ideal would be to have all her data within a data warehouse, mastered and owned by an essential team, “that’s just not the reality.”

Seemingly ignorant of this fact, “the entire industry has continued to move and copy excessive amounts of data in pursuit of a single view,” Borgman told theCUBE. Chasing this unattainable goal is costing companies time and money they will never recover, and it is time to stop.

“Companies now have options that better meet their need for fast access to accurate data. They just have to recognize it’s time to change,” Borgman said.

Watch Tung, Jarvis and Borgman contest the myth that “the most effective data architecture is centralized”:

2) Open source provides ‘the ability to be unsure about the future.’

No one can accurately predict the future of enterprise technology. While proprietary solutions may claim to be future-proof, building data architecture on open source is the only way to be flexible enough to stay at the cutting edge, according to Jarvis, who explained why EMIS is breaking from the data warehouse tradition by not opting for a proprietary vendor.

“We acknowledge that we don’t have perfect vision of what the future might be,” he said. “By backing open storage technologies, we can apply a number of different technologies to the processing of that data, and that gives us the ability to remain relevant and innovate on our data storage.”

EMIS can make this decision because open-source data platforms are now close to being as performant as proprietary ones, according to Borgman. After almost a decade of maturation, open source-based data platforms can store data in columnar formats and handle updates and deletes just as easily as the proprietary enterprise data warehouse solutions that lead the market, he added.

“Using open data formats, you remain interoperable with any technology you want to utilize,” Borgman said.

He warned that organizations choosing to stay with proprietary data warehouse vendors will come to regret this decision in the future, as vendor lock-in will force them to fall behind on the innovation curve. “Lock-in is part of this industry, and that’s really what we’re trying to change with open data formats,” Borgman said.

Watch the panel debate “an open-source-based platform cannot provide the performance and control that a proprietary system does”:

3) Data mesh complements the modern data stack.

The third lie debated during “The Data Doesn’t Lie … Or Does It?” was the claim that today’s modern data stack is actually modern. The problem with this statement is the conflation of the words “modern” and “new,” according to Borgman.

“It’s the new data stack, it’s the cloud data stack, but that doesn’t necessarily mean it’s modern,” he said. “I think a lot of the components are exactly the same as what we’ve had for 40 years.”

A truly modern data stack needs to address the needs of the modern data economy, supporting people and processes and modernizing the technology, according to Jarvis. “Just because you can scale CPU and storage doesn’t mean you can get more people to use your data to generate more value for your business,” he said.

Hybrid environments, where organizations retain valuable data sources on-premises while also taking advantage of the scalability of cloud storage, are “a killer case for data mesh,” according to Tung. Data mesh enables organizations to have a “best of both worlds” scenario, she explained.

Health-tech pioneer EMIS is already on this road, according to Jarvis. “Our data product journey has really begun by standardizing data across a number of different silos through the data mesh so we can present both internally and through the right governance externally to researchers,” he said.

Starburst is “trying to help enable the data mesh model and make that an appropriate complement to the modern data stack that people have today,” Borgman said. Adopting a data mesh architecture gives companies the flexibility to operate on and analyze data that lives in a wide variety of systems. That lets them reduce costs by using a data lake for storage, and it delivers the fastest time to insight because data is accessed where it lives, Borgman added.
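To make “data is accessed where it lives” concrete, here is a toy sketch of a single query that joins two separate databases without first copying either one into a central store. It is an analogy only, not how Starburst or any data mesh product works: it assumes Python’s built-in sqlite3 module, and the database alias `lake`, the tables `patients` and `visits`, and both function names are hypothetical, invented for illustration.

```python
import sqlite3


def build_federated_connection() -> sqlite3.Connection:
    """Open one connection that can see two independent databases.

    Assumption: "main" stands in for an on-premises system and "lake"
    stands in for a cloud data lake. All names and rows are made up.
    """
    conn = sqlite3.connect(":memory:")
    # ATTACH adds a second, separate database to the same connection.
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute(
        "CREATE TABLE main.patients (patient_id INTEGER PRIMARY KEY, region TEXT)"
    )
    conn.execute("CREATE TABLE lake.visits (patient_id INTEGER, cost REAL)")
    conn.executemany(
        "INSERT INTO main.patients VALUES (?, ?)",
        [(1, "north"), (2, "south"), (3, "north")],
    )
    conn.executemany(
        "INSERT INTO lake.visits VALUES (?, ?)",
        [(1, 40.0), (1, 60.0), (2, 25.0), (3, 10.0)],
    )
    conn.commit()
    return conn


def cost_by_region(conn: sqlite3.Connection) -> list:
    """Join across both databases in place and total visit cost per region."""
    query = """
        SELECT p.region, SUM(v.cost)
        FROM main.patients AS p
        JOIN lake.visits AS v ON v.patient_id = p.patient_id
        GROUP BY p.region
        ORDER BY p.region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(cost_by_region(build_federated_connection()))
```

The point of the sketch is the shape of the query: each table stays in its own store, and the join happens at query time rather than after a migration into a single warehouse.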

Watch the complete discussion on whether “today’s modern data stack is modern”:

Here’s the complete coverage of the “The Data Doesn’t Lie … Or Does It?” event:

(* Disclosure: TheCUBE is a paid media partner for the “The Data Doesn’t Lie … Or Does It?” event. Neither Starburst Data, the sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Image: ismagilov/Getty Images
