Inside Experian’s journey to the virtual data center
Credit reporting giant Experian plc is in the process of moving most of its data processing outside the four walls of the data center. In a recent conversation, Experian Chief Information Officer Barry Libenson explained the reasoning and the payoff. This is an edited version of the interview.
For more on the journey Experian and other companies are taking, check out SiliconANGLE’s feature story and special report on the future of the data center.
It seems that the data center as we know it is going to look very different in a few years. How do you see it?
I don’t see a point where we have no data centers of our own. We will always have at least some kind of small footprint to address those customers who are adamant that their data remain on-premises with us for security. Our strategy is more of a hybrid model where you want burstable capacity into the cloud. We build everything with complete portability in mind, so that when we develop an application, it can run on-prem, in the cloud, or with pieces of it in either.
A business like ours is somewhat cyclical around the holidays. You don’t build the church for Christmas Eve. The colocation firms are also now providing a lot more than just versatile capacity. We can stand something up so part of it runs in AWS and part of it in our McKinney [Texas data center]. They have high-speed interconnects that make it look almost like you’re running on fiber, even when you’re in the cloud.
What we’re good at is writing decision and analytical software for our customers and helping them with the modeling challenges they have. Running a data center is not something that I would say is high value-add for us.
What’s changing in the colocation market that’s making it more appealing to you?
It used to be that when you signed up with a colocation provider, you essentially got a cage with a bunch of power and cooling and everything else was your own responsibility. The options now are nearly unlimited in terms of available services. Today, they’ve got all of these new technology capabilities, so that if [your software-as-a-service] application is running in that pod, the performance is almost as good as you would get in your own data center. They’re getting into a much more full-service type of offering at similar or better pricing than we’ve seen.
People tend to lump cloud and colos together, but they are quite different. If it’s a new application that we’re building from the ground up, it almost certainly would be on a hybrid cloud using any cloud provider that supports OpenShift, which is essentially all of them. But if it’s an older application or if we want to house it in a certain part of the country for geographical or performance reasons, then we are more likely to consider a colo provider. That said, our shift is much more toward cloud than colo. Three to five years down the road, I would expect to have a dramatically smaller colo footprint and a significantly larger cloud footprint.
When you build a new application today, is cloud the default target? Do you even build apps for on-prem deployment anymore?
It doesn’t matter. Our software developers don’t know if they’re building in our own data center or on [Amazon Web Services Inc.] or [Microsoft Corp.’s] Azure. They see the exact same thing. We’ve been heavily into container deployment mode for the last six to 12 months, and we are moving people away from virtualization and towards containers. Our use of containers, OpenShift and what we call our application canvas means it’s identical in the cloud and on-prem by design.
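To make that portability idea concrete in general terms, here is a minimal sketch of the pattern Libenson is describing, written in Python with entirely hypothetical names: the application reads every environment-specific detail from configuration injected by the platform, so the same container image runs unchanged on-prem or in any cloud. It illustrates the approach, not Experian’s actual application canvas.

```python
import os

# Illustrative only: keep every environment-specific detail in configuration
# so the same container image runs unchanged on-prem or in any cloud.
# The variable names here are hypothetical, not Experian's.

def load_config() -> dict:
    """Read deployment-specific settings injected by the platform
    (e.g., an OpenShift ConfigMap or Secret) rather than hard-coding them."""
    return {
        "db_url": os.environ.get("APP_DB_URL", "postgres://localhost:5432/app"),
        "object_store": os.environ.get("APP_OBJECT_STORE", "http://localhost:9000"),
        "region": os.environ.get("APP_REGION", "on-prem"),
    }

def handle_request(payload: str, config: dict) -> str:
    # Business logic never branches on where it is running; it only sees config.
    return f"processed '{payload}' against {config['db_url']} in {config['region']}"

if __name__ == "__main__":
    cfg = load_config()
    print(handle_request("sample-record", cfg))
```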
Do you feel like you give up anything to achieve that portability?
We probably give up anywhere from 10 to 15 percent of developer productivity, but I’ll trade that any day for the flexibility it gives us.
As you move in this more flexible direction, do your costs go down?
Anybody who believes moving to the cloud is going to save them money doesn’t understand the way this works. The only real difference in any of these models is what gets capitalized and what gets operationalized. When you go to the cloud, you don’t have to buy that equipment up front, but over a five-year period the amount you’ll have spent is roughly equivalent. Most people who have been on this journey would argue that this isn’t a cost play. It’s an acceleration play. It’s the ability to innovate and drive development more quickly.
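Back-of-the-envelope numbers help show why the five-year totals can converge. The figures below are purely hypothetical, not Experian’s, but they illustrate how an upfront capital purchase plus run costs can land in roughly the same range as a steady cloud bill.

```python
# Purely hypothetical numbers to illustrate the capex-vs-opex point, not
# Experian figures: buying hardware up front vs. paying a cloud provider
# monthly can converge to a similar five-year total.

YEARS = 5

# On-prem: capital purchase up front, plus annual run costs.
capex_hardware = 2_000_000          # servers, storage, network bought up front
onprem_annual_opex = 450_000        # power, cooling, support, staff
onprem_total = capex_hardware + onprem_annual_opex * YEARS

# Cloud: no upfront purchase; everything is a recurring operating expense.
cloud_monthly_spend = 70_000
cloud_total = cloud_monthly_spend * 12 * YEARS

print(f"5-year on-prem spend: ${onprem_total:,}")   # $4,250,000
print(f"5-year cloud spend:   ${cloud_total:,}")    # $4,200,000
```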
You’re quite advanced in the way you’re viewing the future of your infrastructure. How do you believe this gives you an edge over your competitors?
Getting thousands of developers onto the same core development platform gives you reusability and shared knowledge across the platform. It increases the speed at which we can drive innovation. This hybrid strategy also allows us to be completely agnostic when it comes to the data center. If Google suddenly disrupts this space, I need it to be easy for us to take advantage of those kinds of changes.
How are you preparing to accommodate edge computing, in which a lot more data is generated at the edge of the network?
One component of our overall architecture is something that we call our data fabric. It’s a state-of-the-art mechanism for doing large-scale data ingestion and normalization based on Cloudera and multinode Hadoop [software for storing data and running applications on clusters of commodity hardware]. It scales linearly, so we can build out essentially unlimited ingestion performance as we see data volumes increase.
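As a rough illustration of what linear scaling means in practice (and not a depiction of Experian’s actual data fabric), the sketch below parallelizes a per-record normalization step: because each record is processed independently, throughput grows roughly in proportion to the number of workers or nodes added.

```python
from multiprocessing import Pool

# A toy illustration of linear scale-out, not Experian's data fabric: if each
# record can be normalized independently, adding workers (or cluster nodes)
# raises ingestion throughput roughly in proportion.

def normalize(record: str) -> str:
    """Stand-in for per-record cleansing/normalization work."""
    return record.strip().upper()

def ingest(records: list[str], workers: int) -> list[str]:
    with Pool(processes=workers) as pool:
        return pool.map(normalize, records, chunksize=1000)

if __name__ == "__main__":
    batch = [f" record-{i} " for i in range(100_000)]
    # Doubling 'workers' roughly halves wall-clock time for CPU-bound work,
    # which is the property that lets ingestion capacity grow with data volume.
    cleaned = ingest(batch, workers=8)
    print(len(cleaned), cleaned[0])
```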
The other really big deal is data accuracy. We need pristine data quality because otherwise somebody doesn’t get approved for a mortgage. The data fabric is designed to bring a real-time curation model to enormous data sets. For example, it took six months to ingest the data into our Australia bureau when we stood it up about 10 years ago because of the linear processing model and exception handling. We took the exact same data set and ran it through the data fabric, and we were able to get it done in under six hours. And we could have gotten it down to six minutes if we wanted to throw another hundred processors at it.
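Taking those figures at face value, and assuming a month of continuous processing is roughly 30 days, the improvement works out to about three orders of magnitude:

```python
# Back-of-the-envelope on the figures quoted above (assuming a month of
# continuous processing is ~30 days); the numbers are the interview's,
# the conversions are ours.

original_hours = 6 * 30 * 24      # six months of linear processing ~ 4,320 hours
fabric_hours = 6                  # same data set through the data fabric
target_minutes = 6                # with ~100 more processors thrown at it

print(f"6 months -> 6 hours: ~{original_hours / fabric_hours:.0f}x faster")          # ~720x
print(f"6 hours  -> 6 minutes: a further {fabric_hours * 60 / target_minutes:.0f}x")  # 60x
```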
If you look ahead five years, can you guess what percentage of your processing is going to be on-prem versus colo and in the cloud?
I would say 20 percent on-prem, 60 percent in the cloud and 20 percent colo. When I joined the company, it was more like 70 percent on-prem, 20 percent colo and 10 percent in the cloud.
Photo: Experian