UPDATED 17:22 EDT / JULY 20 2017

BIG DATA

A game of data science: the analytics architecture behind Riot Games

With the help of modern analytics, Riot Games Inc. developed a highly successful computer game called League of Legends, in which players form teams of champions and compete with other players around the world. Wesley Kerr (pictured), senior data scientist at Riot Games, explained how his organization is leveraging data science to improve player experience and weed out unsavory behavior.

“[In] about 2 percent of our games there is some form of serious abuse that comes in the form of hate speech, racism and sexism, things that have no place in the game,” Kerr said. “Right now it’s purely based on things said in chat, but we’re investigating other ways of measuring that behavior.”

Kerr gave a keynote speech at this year’s Spark Summit in San Francisco, California, and afterwards spoke with David Goad (@davidgoad) and George Gilbert (@ggilbert41), co-hosts of theCUBE, SiliconANGLE Media’s mobile live streaming studio, to go into more detail about Riot Games’ data science stack. (* Disclosure below.)

A DataBricks power player experience engine

Kerr described what is under the hood at Riot Games’ data science organization. “We rely on DataBricks for all of our deployments. We do many different clusters and have about 14 different data scientists that work with us. Each one is able to manage their own cluster, spin them up, tear them down, find their data and work with it through DataBricks,” Kerr explained.

Kerr went on to explain the configuration of the data warehouse itself and how they manage the sheer scale of data being processed.

“We’re able to leverage the power of our players; we have 100 million. … All the data flows into a Hive data warehouse stored in S3. We have two different ways of interacting with it. We can run queries against Hive, which tends to be a little slower for our use cases. Our data scientists tend to access all of that data through DataBricks and Spark, which runs much quicker for our use cases.”

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of Spark Summit 2017. (* Disclosure: DataBricks Inc. sponsored this Spark Summit 2017 segment on SiliconANGLE Media’s theCUBE. Neither DataBricks nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
