Do a search on Indeed.com for “big data engineer” and you will find 26 listings. That compares to ten job listings for “big data architects.”
I became curious about this new job description when I saw a job post from Disqus.
Then I looked at the trend line.
It’s revealing to look at the companies seeking big data engineers. Employers are looking for people who know MapReduce, Hadoop and related frameworks such as HBase, Pig and Hive. Programming languages in demand include Java, Ruby and C++. The requirements run the gamut, which is part of the issue with using big data in a job description. Do you have MongoDB expertise? Yes, that’s applicable. Practical hands-on experience with Bayesian models and neural networks? Yes, the job may be right for you.
I found startups that focus on serving the advertising industry, as well as financial services companies. Consulting companies and technology giants are using the new title, too.
Here they are:
- Amazon (6)
- The Climate Corporation (3)
- Greythorn (3)
- Unlisted Company (2)
- Jobspring Partners (2)
- Behance, Inc. (2)
- Elevate Recruiting Group (2)
- Krux Digital (2)
- Beanstock Media (1)
- Tribal Technologies (1)
- Kontagent (1)
- Discovery Staffing (1)
- Convertro (1)
- Disqus (1)
A few job descriptions to give a flavor of what employers are seeking:
A9.com: The Amazon subsidiary is looking for big data engineers to:
- Design, develop and support a map-reduce-based data aggregation pipeline for processing billions of events a day
- Support data-mining and machine-learning algorithms using behavioral data
- Study state of the art techniques in massively parallel frameworks and apply them to advertising problems
- Help other engineers get the most out of the platform you own
Climate Corporation: This San Francisco-based company helps companies adjust to climate change. I like their bonus points:
- Experience with Lisp and/or Clojure (functional programming languages)
- Experience with large-scale machine learning techniques (examples: Google PageRank, Netflix Prize, genome sequence assembly, computational finance)
- Experience with Amazon Web Services (EC2, S3, SQS, etc.)
- Deep knowledge of the Hadoop ecosystem
- Git version control
- Frequent contributor to open source projects (show us your work on github!)
Diffbot is a visual robot for the web that can “see” web pages like humans do. It was the inaugural startup out of Stanford’s StartX Accelerator program.
Some of the benefits:
…and if this is what you want to be doing with your day:
- Building massive training data sets and mining them for product insights.
- Benchmarking and optimizing performance of classifiers.
- Learning and developing new algorithms for classification and training.
- Building, testing and deploying code to our hybrid cloud architecture.
- Working directly with the founding team.
- Whatever else it takes to help Diffbot understand the web.
What do you think of the “big data engineer” title?