At the Strata-Hadoop World 2012 conference, Dave Vellante and Jeff Kelly interviewed Venkatesh Rangachari, the head of QED Velocity Analytics at Thomson Reuters. In the interview, Rangachari explains some of the ways customers apply analytics technology to real-time market data in order to make financial decisions.
Thomson Reuters is an analytics solutions company that aims to consolidate data from stock, currency, and commodity markets for businesses that want effective pre- and post-trade analysis of their investment strategies.
Rangachari describes analytics as “a set of statistical functions that a trader would use to determine their financial investment strategy”. The financial data in question are time series data: large sequences of temporal data measured at fixed intervals. In the past, Thomson Reuters developed its own proprietary data storage systems, but with limited resources and manpower, scalability became a major concern. With the emergence of open source software like Cassandra and Hadoop, Thomson Reuters has been able to ease many of its scalability concerns. The company uses DataStax’s Big Data platform, which bundles Cassandra, Hadoop and Solr in one cluster, so that the infrastructure can scale horizontally by adding nodes rather than vertically by upgrading individual machines.
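To make the idea of a statistical function over time series data concrete, here is a minimal sketch in Python. It is only an illustration: the function name `rolling_stats`, the sample prices, and the choice of a rolling mean and standard deviation are assumptions made for this example, not functions Rangachari named in the interview.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_stats(prices, window):
    """Return (mean, population std dev) for each full window of prices.

    `prices` is assumed to be a sequence sampled at fixed intervals,
    for example one closing price per minute (hypothetical data).
    """
    buf = deque(maxlen=window)  # keeps only the most recent `window` values
    results = []
    for price in prices:
        buf.append(price)
        if len(buf) == window:  # skip until the first window is full
            results.append((mean(buf), pstdev(buf)))
    return results


# Hypothetical prices sampled at a fixed interval.
prices = [100.0, 101.0, 102.0, 101.0, 100.0]
stats = rolling_stats(prices, window=3)

for avg, vol in stats:
    print(f"mean={avg:.2f} stddev={vol:.2f}")
```

A trader might compare the latest price against the rolling mean, or use the rolling standard deviation as a rough gauge of volatility. At the message rates described in the interview, this kind of computation would run inside the data platform rather than in a single Python process.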
According to Rangachari, Cassandra is ideal for performing analytics on real-time data because of its “ingestion capability”. With data arriving at hundreds of thousands of messages per second, Cassandra allows for effective real-time analytic solutions on large amounts of streaming data. Hadoop is used for analyzing historical data going back anywhere from 6 months to 12 years.
One of the major themes of the Strata-Hadoop World 2012 conference was big data applications. Rangachari’s focus is on developing programming paradigms for application platforms that provide business solutions to Thomson Reuters’ customers. Customers supply a set of constraints, and Thomson Reuters runs the statistical analyses of the data through Cassandra or Hadoop; the results in turn yield financial solutions.
Looking toward the future, one of the main remaining challenges is improving the visualization of big data. Rangachari says these are the early days for visualization tools and that visualization will remain an important feature of analytic solutions going forward.