UPDATED 11:00 EDT / AUGUST 15 2012

Data Scientists Predict What Music You’ll Love, Disrupt Common Marketing Logic

Will you love your favorite musician’s next album? Whatever your answer, Shanda Innovations (the tech incubator of China’s Shanda corporation) may know better than you do. A Music Data Science Hackathon, organised by Data Science London on the Kaggle platform and sponsored by EMI and EMC, challenged data teams to answer this question about taste by developing an algorithm to predict “listener’s appreciation of songs and artists, based on listeners demographics, listeners words of appreciation, and interviews contained in EMI’s One Million Interview Dataset,” as noted on EMI’s music data science website. Shanda won the competition over 138 competitors who vied to create predictive analytics that took into account age, geography and tastes to predict how a person would rate a song. Alex Knapp’s Forbes article on the competition shows that the competing teams’ analyses challenge traditional marketing approaches and beliefs.

In her 2011 TED talk, “Social Media and the End of Gender,” Johanna Blakley explains that while advertisers still rely on the “same ole, same ole” demographic information of age, race and gender to predict whom to market to and how, communities of interest have changed. Now, people are linked not only by identity categories but also by tastes and preferences. In line with Blakley’s research, Knapp notes that Shanda’s analysis shows that age and socioeconomic data “weren’t accurate predictors of songs”; rather, “general interests and attitudes were much better drivers of predictions.”

In a Kaggle blog post about their winning algorithm, the Shanda team shares: “We were very surprised to find that the variation of the track scores given by different people was a lot more than we expected. For instance, User ID 41072 scored 100 to track 156 whereas User ID 41286 gave merely 4 to the same track! It was very interesting to find that people were so different in music preference and we believed that was why so many different types of music existed.” Anthony Goldbloom, Kaggle’s president, notes that what competitors found with regard to age contradicted common assumptions. According to Goldbloom: “As it turns out, older, retired people were much less discriminating and more open in their musical taste than younger people, which is the opposite of the stereotype.”

Shanda Innovations Team

The Shanda team also explains that, to develop its analysis, it mapped the words that participants in EMI’s interview dataset used to describe artists “to some keyword IDs and used these IDs in the logistic regression model, which greatly improved the performance…the main machine learning methods were SVD++ and Logistic Regression.” The team used the C++ and Python programming languages and thanks the APEX team for its SVDFeature toolkit, which it also employed. With more innovative projects like EMI’s impressive dataset, hackathons that bring together some of the world’s most brilliant data scientists are likely to produce even more disruptive marketing insights and enhanced music personalization technology.
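For readers curious what mapping descriptive words to keyword IDs and feeding them to a logistic regression might look like, here is a minimal Python sketch. It is not the Shanda team's code: the word lists, the liked/disliked labels, and every function name are hypothetical, the 0 to 100 ratings are simplified to a binary "liked" label, and the SVD++ part of their approach is not shown.

```python
import math

def build_keyword_ids(descriptions):
    """Give each distinct word an integer ID, in order of first appearance."""
    ids = {}
    for words in descriptions:
        for word in words:
            ids.setdefault(word.lower(), len(ids))
    return ids

def to_features(words, keyword_ids):
    """Binary vector: 1.0 at the ID of every known keyword the listener used."""
    vec = [0.0] * len(keyword_ids)
    for word in words:
        idx = keyword_ids.get(word.lower())
        if idx is not None:
            vec[idx] = 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias with plain batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(words, keyword_ids, w, b):
    """Estimated probability that a listener using these words liked the artist."""
    x = to_features(words, keyword_ids)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data: words a listener used, and whether they liked the artist.
descriptions = [
    ["catchy", "fun"],
    ["boring", "dated"],
    ["fun", "energetic"],
    ["dated", "annoying"],
]
liked = [1, 0, 1, 0]

keyword_ids = build_keyword_ids(descriptions)
X = [to_features(d, keyword_ids) for d in descriptions]
w, b = train_logistic(X, liked)
```

Calling `predict(["fun", "catchy"], keyword_ids, w, b)` on this toy data returns a probability above 0.5, while `predict(["boring", "annoying"], keyword_ids, w, b)` returns one below 0.5.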

 

