NEWS
HPCC Systems, the Apache Hadoop competitor developed at LexisNexis Risk Solutions, just shared its source code on GitHub. The company announced in June that it would open source the project, and it has now made good on that promise. HPCC Systems released virtual machines running the HPCC platform in June, but now for the first time developers will be able to take a look at the code and customize it to their own ends.
Escalante says the HPCC team had to clean up the code to prepare it for public consumption and create a contributor agreement before the company could publish the source. He also says the company contracted both Black Duck and Palamida to audit the code to make sure everything was properly sourced and licensed.
HPCC stands for High Performance Computing Cluster. HPCC is distinguishing itself from Hadoop with its “SQLish” programming language called ECL and its near-real-time query system called Roxie. Wikibon’s Jeff Kelly did a comparison of Hadoop and HPCC in June and concluded that companies that want to get started with big data should take a look at both HPCC and Hadoop.
Armando Escalante, CTO of Risk Solutions, said at the GigaOM Structure conference that the company may start offering a data-as-a-service product that would give customers access to cloud-hosted HPCC clusters. He also said the company might make some of LexisNexis’ data sets available for analysis via this service. I’ve previously speculated that Microsoft is taking steps in this direction as well.
While I think data-as-a-service will be an important market in the future, that’s still some time off. But enterprise development teams can learn some more immediately applicable lessons from the experience Escalante and his team had in taking the product open source.
Escalante’s first piece of advice is for development teams to treat all projects as if they were open source, even if they are only used internally. Not only does this make these projects more ready to be open sourced in the future, he says, but it forces best practices that improve collaboration internally.
Escalante says the HPCC team had to make some changes to the structure of the project to make it work as a GitHub project. They also had to clean up the comments, get rid of dead code that had never actually been used in the project and make various elements more consistent. He recommends writing all code comments with the assumption that the public will eventually see them, and structuring a project as if it were going to land on GitHub. “When you work in an open source manner, you work more efficiently, even internally, because it’s accessible to more developers,” he says.