

Unifying multiple data sources and repositories is a challenge that Etsy, Inc. is solving with the Apache Kafka messaging system. Chris “CB” Bohn, senior database engineer for the Etsy online marketplace, tells theCUBE, SiliconANGLE’s Media production team, at the HP Big Data Conference that having all data flowing through a Kafka pipeline “makes it much easier, as we only have to manage the Kafka part of things now.”
Simplicity is key to Bohn, who says, “It’s got to be manageable; if it’s not manageable, then it doesn’t matter how functional it is.”
Bohn believes that the community should “get a common bus for data and then everyone can be a consumer,” and he sees Vertica’s announcement that it is also using Kafka for data streaming as a step toward this goal. Etsy is ‘bullish’ on Kafka, and Bohn is confident that Apache Kafka will emerge as the winner in the race for most popular messaging system.
As Etsy grows, Bohn looks to the building blocks of Vertica, Hadoop and Kafka to give the company the flexibility that is critical for the future of analytics. While agreeing that small packaged apps may work best for smaller companies, he sees Etsy as having grown past this point. “We know best what’s going to work for us. Some things we have to roll our own.”
Bohn has seen mobile use rapidly outstrip traditional desktop systems, with 60% of Etsy users now accessing the site from a mobile platform. Etsy previously had a tedious time loop but is now looking at analytics solutions that allow real-time analysis of clickstream data to dig deep into consumer trends by region.
Bohn says that Vertica is the “lynchpin of (Etsy’s) whole datastack at this point because of the good SQLing it has. People know SQL.” He jokes that his business analytics team won’t allow him to upgrade versions because Vertica is in such heavy use and can’t go down.
Watch the full interview below, and be sure to check out more of SiliconANGLE and theCUBE’s coverage of the HP Big Data Conference 2015.