UPDATED 08:45 EDT / DECEMBER 03 2019

CLOUD

Exclusive: With new tech, Andy Jassy aims to take Amazon’s cloud everywhere

The marquee appearance at Amazon Web Services Inc.’s annual re:Invent conference is always the marathon two- to three-hour keynote by Chief Executive Officer Andy Jassy.

And as usual, the longtime leader of Amazon.com Inc.’s cloud computing company is expected to introduce a raft of new and upgraded services, from compute instances and databases to processors and, in particular, its highly anticipated Outposts rack for bringing its cloud to corporate data centers.

But in an exclusive, two-hour conversation in the personal sports bar in his Seattle basement that he calls Helmet Head, Jassy (pictured) offered some broad hints of what’s coming this morning and, more important, provided insight on the underlying trends and Amazon’s strategy for moving its cloud services everywhere customers want them.

In this second installment of the interview, Jassy discussed the thinking behind Outposts, AWS’s strategy for providing services at the network edge, and new technologies it’s working on, such as specialized processors and even quantum computing. The conversation was lightly edited for clarity.

Look for more strategic and competitive insights from Jassy in the third installment of the interview coming tomorrow, and in the first part that ran Monday. And check out re:Invent coverage all this week by SiliconANGLE, its market research sister company Wikibon and its livestreaming studio theCUBE, now in its seventh year covering re:Invent from the show floor.

Data, data everywhere

Furrier: Data is the competitive advantage today, as well as ways to analyze and use it using machine learning. How do you see data impacting that next chapter in cloud transformation? 

Jassy: If you’re making a transformation as a company that’s been around for a while, particularly in the enterprise, you have to realize pretty quickly that the scale of data in modern applications is so different today than it was 10 or 20 years ago. Relational databases are a great example: the reason people used relational databases for virtually every workload for so long was significantly in part because the amounts of data involved were really gigabytes and occasionally terabytes.

Today, most applications, you’re talking about petabytes and sometimes exabytes. When you look at data on a scale that large, you don’t want a relational database for every single one of your workloads. In fact, it’s typically too complex and too expensive and not performant. It’s why you see us talk a lot about having purpose-built databases.

Furrier: Give us some examples.

Jassy: If you’re somebody like Lyft that has millions of combinations of drivers and mapping coordinates, you basically just want a fast, high-throughput, low-latency key-value store, which is why we built DynamoDB. If you want to do logins with microseconds of latency, you just want a memory cache, which is why we built ElastiCache. If you’re Nike and you want to connect lots of different social data, you want a graph database. If you’re doing IoT, just look at how many devices now sit at the edge collecting data, sending the data to the cloud to analyze and process, and taking action back. You don’t want a relational database. You want something that is anchored on time series. Time is the factor you want to pivot that database on, which is why we have this Timestream database that we’ve announced.
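Jassy’s point about pivoting a database on time can be sketched in a few lines. This is an illustrative toy in plain Python, not the Timestream API: records are kept sorted by timestamp so that time-range scans, the dominant query shape for IoT telemetry, stay cheap.

```python
import bisect
from dataclasses import dataclass, field

@dataclass
class TimeSeriesStore:
    """Toy time-pivoted store: rows stay sorted by timestamp,
    so range scans are two binary searches plus a slice."""
    _times: list = field(default_factory=list)
    _rows: list = field(default_factory=list)

    def append(self, ts: float, device: str, value: float) -> None:
        # Insert in timestamp order, even if events arrive late.
        i = bisect.bisect_right(self._times, ts)
        self._times.insert(i, ts)
        self._rows.insert(i, (ts, device, value))

    def range(self, start: float, end: float) -> list:
        # All rows with start <= ts <= end, already time-ordered.
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end)
        return self._rows[lo:hi]
```

A relational table can of course answer the same query, but it has to be indexed and tuned to do so; a time-series store makes this layout the default.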

People are running away from the older-guard, commercial, proprietary relational databases. People are also moving away from Windows to Linux, which has been happening for a long time. You can see the compounded growth rate of Linux, close to 20%, while Windows is negative 4%. The estimates are that next year something like 80% to 82% of new workloads are going to be Linux, which makes total sense. It’s very hard to be beholden to one company that controls the licensing terms. They can change the prices like they always do, or the terms like they often do.

Furrier: So different kinds of databases are becoming a key to transforming enterprises via the cloud?

Jassy: Companies are becoming increasingly impatient that their data lives in all these silos all over the place that are inaccessible to their analytics and machine learning tools. It’s why you’ve seen this concept of data lakes become so popular over the last couple of years. The vast majority of data lakes in the cloud are built on S3, which is our object store. That’s true because S3 is a more secure, more durable, more reliable, much more featured capability that lets you work with data at a much more granular object level instead of just at the bucket level.

If you think about these data lakes that sit on top of these object stores, and then you think about them becoming accessible to all these different applications, just think about the complexity that didn’t exist before. You used to have data and there was an application. A few applications were able to access that data, and other applications were able to access other data.

When you bring all that data together and then you want all these applications to be able to connect with it, you need a very different access control strategy. The applications that I’m going to let you use are maybe different from what I’m going to let one user or another use, and figuring out how to build the right access control and policies around that is very complicated and very hard and very onerous for companies. That’s something that I think people would really like to see solved.
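A toy policy evaluator makes the access-control problem concrete. The policy shape and the default-deny, explicit-deny-wins rules below are illustrative assumptions in plain Python, not AWS’s implementation:

```python
from typing import NamedTuple

class Policy(NamedTuple):
    principal: str  # who is asking, e.g. a user or application
    action: str     # e.g. "read" or "write"
    prefix: str     # resource prefix, e.g. "s3://lake/sales/"
    effect: str     # "allow" or "deny"

def is_allowed(policies, principal, action, resource) -> bool:
    """Default-deny evaluation: an explicit deny always wins;
    otherwise at least one matching allow is required."""
    matched = [p for p in policies
               if p.principal == principal
               and p.action == action
               and resource.startswith(p.prefix)]
    if any(p.effect == "deny" for p in matched):
        return False
    return any(p.effect == "allow" for p in matched)
```

Even in this miniature form, the combinatorics Jassy describes show up quickly: every new application, user, and data-lake prefix multiplies the policy surface that has to be reasoned about.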

Furrier: What applications are the most important here?

Jassy: Think about the next wave of these analytics services. Even take data warehousing, with something like Redshift, which we launched several years ago and which really revolutionized the data warehousing space. It was the first cloud data warehouse. You get a terabyte of data for less than a thousand dollars. Over time, customers say, “I want to make sure that as I’m scaling really large I have a way to separate my compute from my storage, so that I can scale differently depending on which resource I’m using.”

A lot of the ways that people have approached this have been paradigms that are very straightforward, which is, “I have storage here, I have compute here, and I will make sure that I can take my storage and move it to the compute.” Just think about the scale of data that we’re getting into. You’re not going to be able to move that much data over the wire. Most people’s networks aren’t going to be able to handle that.

Compute moves to storage

Furrier: And you’ve got more edge points too.

Jassy: More edge points, much more storage. You’re never going to have enough bandwidth to move the amount of data that you want. You have to think about a completely different strategy when you’re talking about data that large, when you think about analytics. You need to think about how to fuse compute analytics and storage in a way that hasn’t been done before.

Furrier: What new architecture is needed as more and more data comes in?

Jassy: One thing they’re going to have to think about is how to take the data that’s close to the analytics or close to the compute and intelligently figure out what needs to stay hot or warm and what needs to go back to something cold. If you can’t figure out how to do that automatically and intelligently, people’s costs are either going to go through the roof because they’re keeping everything hot or warm, or they’re just not going to be able to do very much because they just won’t be able to [get access to the data fast enough].

It’s using machine learning and experience and intelligence to figure out how to automatically move data [to the right tier].
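A rules-only sketch of such a tiering decision, with hypothetical age thresholds hard-coded where a real system would learn them from access patterns, as Jassy suggests:

```python
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration only; an intelligent
# tiering system would infer these from observed access patterns.
WARM_AFTER = timedelta(days=30)
COLD_AFTER = timedelta(days=90)

def pick_tier(last_access: datetime, now: datetime) -> str:
    """Choose a storage tier from an object's last access time."""
    age = now - last_access
    if age >= COLD_AFTER:
        return "cold"   # archival storage: cheapest, slowest
    if age >= WARM_AFTER:
        return "warm"   # infrequent-access storage
    return "hot"        # fast storage for active data
```

The economics Jassy points to fall out directly: without a policy like this, everything defaults to the hot tier and costs scale with total data, not with the working set.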

I think also that you have to think about if you can’t move that much data over the wire, how are you going to actually get the compute close to the storage? People have appropriately said it’s great if we can scale compute and storage separately, but we’re reaching a scale point where they can’t keep moving the data. They’re going to have to actually find a way to get the compute and storage together, and that’s complicated.

Furrier: What does Amazon bring to the table for that?

Jassy: We have quite a bit. We have by far the most capable object store and data lake capabilities. We have a service called Lake Formation, which we launched last year at re:Invent, which makes the creation of a data lake so much easier than ever before and so much easier than you could find elsewhere. We have differentiated capability around access control, more security around that data, and more analytics services than anybody else, and more machine learning.

You can expect over the coming months, including over the next few weeks, for us to have some additional capabilities that are really aimed to try to address what we see as some of the big challenges around having more data scale than ever before and trying to make that increasingly easier for people and higher performance.

Furrier: So you think at re:Invent there will be some discussions?

Jassy: Yeah, I think there will be.

Machine learning ascends to the cloud

Furrier: How is Amazon trying to move the use of machine learning forward now that it’s so key to so many applications?

Jassy: We have a very substantial investment in machine learning across all three layers of the machine learning stack, and we’re not close to being done. One is the bottom layer of the stack, which is for expert machine learning practitioners. We continue to see that most other cloud providers are trying to funnel all of the machine learning work through just one framework, which is TensorFlow. We have a lot of TensorFlow: about 85% of the TensorFlow [workloads] in the cloud run on top of AWS.

But what we find is that 90% of the machine learning expert practitioners are using more than one framework every day in what they do. That’s because there’s so much research happening right now in the universities and academia and companies and they all use different frameworks, and people want to take advantage of those algorithms that they built in whatever framework they’re in. They don’t want to have to port it to another framework.

Furrier: So how are they doing that?

Jassy: What we find with customers is that they want us to support all the major frameworks. We have single-threaded, separable teams not only on TensorFlow but also on PyTorch and on MXNet. We continue to have very broad support and very strong performance in every major framework so customers have the right tool for the right job. In that middle layer, if you want machine learning to be as expansive as we believe it can be, you’ve got to make it easier for everyday developers and data scientists, and it’s why we built SageMaker, which is really a step-level change in how you build, train, tune and deploy machine learning models.

It’s crazy how many thousands of companies have standardized on top of SageMaker, and yet we still think it’s the relative beginning. While we’ve made all those steps much easier, there are multiple steps in the workflow of building a model, getting it into production and getting predictions that we believe we can still make much easier than they are today.
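The build, train, tune and deploy workflow Jassy describes can be shown in miniature. The stage names below mirror that workflow, not the SageMaker SDK, and the “model” is a toy one-weight regression:

```python
def build(raw):
    """'Build': turn raw records into (x, y) training pairs."""
    return [(r["x"], r["y"]) for r in raw]

def train(pairs, ridge=0.0):
    """'Train': fit y ~ w*x by least squares, with an optional
    ridge penalty as a stand-in hyperparameter."""
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, _ in pairs) + ridge
    return num / den

def tune(pairs, ridges=(0.0, 0.1, 1.0)):
    """'Tune': pick the candidate weight with the lowest
    squared error, standing in for hyperparameter search."""
    def err(w):
        return sum((y - w * x) ** 2 for x, y in pairs)
    return min((train(pairs, r) for r in ridges), key=err)

def deploy(w):
    """'Deploy': wrap the fitted weight in a callable 'endpoint'."""
    return lambda x: w * x
```

The point of a managed service is that each of these stages, trivial here, becomes a hard distributed-systems problem at real data scale.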

Furrier: What are the use cases for SageMaker right now that you’re seeing pop out of the woodwork?

Jassy: If you’re building any kind of machine learning model that you want to use to change what your company can do, you really have three choices. You can hire a bunch of expert machine learning practitioners; there are very few of them in the world and a lot of them hang out at the tech companies. You can just throw your hands up, say this is too hard, and not do it and wait.

You can use something like SageMaker, which is much more approachable and accessible for everyday developers. We have loads of companies using it … Intuit, NFL. Virtually every company that’s doing an expansive amount of machine learning, not an occasional model, but where they’re actually doing machine learning to really change their business on multiple dimensions, is using SageMaker. It changes the number of people at your company that can engage in machine learning.

Furrier: AI is being applied to a much more vertical focus on specific industries, rather than just a broad horizontal way. What are you seeing?

Jassy: We see both. I think there are a large number of enterprises that are getting what I would consider the horizontal value that doesn’t have to necessarily do with their vertical out of using SageMaker and machine learning and AI.

It’s because there are certain problems that are pretty similar across lots of different industries. It is also true that a lot of companies in vertical market segments have unusual troves of data in their space and need the right type of expertise to train on that data the right way. We do see an increasing number of companies in different vertical business segments building applications specific to their industry that you wouldn’t see in other industries.

Amazon at the edge

Furrier: Obviously, people are keeping IT operations on-premises at least in part. Outposts was a big announcement last year that speaks to this hybrid cloud operating model. How do you see it evolving?

Jassy: We’re in the early-middle part of this giant transformation from on-premises to the cloud, but it will take a long time. There are going to be workloads for the foreseeable future that can’t easily move. It could be that you have a factory where you need your workloads close to that factory and it can’t be all the way to the cloud. If you have a data center where it’s close to whatever workloads have to stay local to it, being able to have an Outpost with AWS racks of compute and storage and database and analytics and machine learning where the APIs are the same, the control plane is the same, the hardware is the same, the tools are the same, and then it connects seamlessly to whatever else you’re running in public regions in AWS — that is a very powerful combination. People are very excited about Outposts.

I think there are examples, though, that people have to solve. You can look at companies that don’t have data centers in certain cities where they have end users, where they need single-digit-millisecond latency in that particular city. They either don’t have data centers or they have a small colocation facility that they don’t want anymore. Where do you send the Outposts to? You don’t have a data center or you don’t want a data center. I think that’s a problem that people have to think about because it’s difficult to have infrastructure regions in 500 cities around the world.

Furrier: These were called telephone closets in the past. It’s the Outposts closet in the cloud.

Jassy: There’s an interesting innovation opportunity there. As workforces become more distributed across these major metro areas, that is increasingly something folks have to think about.

I think also 5G is going to be really interesting, because you have this technology that the telcos are understandably very excited about. It’s a very substantial improvement in performance, and it opens up a whole bunch of applications that require mobile and connected devices. But when a connected device or mobile device actually wants to reach compute and storage in the cloud today, you have three or four hops.

Furrier: Like in a stadium, right?

Jassy: I think people are going to want to eliminate several or all of those hops and find a way to have the compute and the storage much more local to where the 5G is and then an experience that wraps that together.

I think the overwhelming majority of applications can move, will move, and are moving to the cloud. But it will take time, and there will be some that aren’t going to be easy to move, where it’s incumbent upon people like us to help try to solve that.

Furrier: With Outposts, you want to push Amazon to the edges. What’s the customer proposition?

Jassy: It’s that I have a handful of workloads that can’t move anytime soon and I want AWS on-premises. I think one of the reasons that these types of hybrid solutions haven’t really worked and gotten traction is that they’re just so different. It’s different APIs, different tools, different control plane, different hardware. It’s just a totally different model. On-premises and the cloud are pretty different. If you’re trying to connect two pretty different things with a clunky bridge, it’s hard.

Furrier: So it’s an “Amazon anywhere” strategy?

Jassy: Yeah, we really think of Outposts as distributing AWS on-premises.

Computing on the frontier

Furrier: What’s happening with compute? You can’t just rest on your laurels with EC2. Quantum’s hot, the edge is hot. You have data complexity. How are the core jewels of AWS evolving with this transformation?

Jassy: First of all, I think it’s going to be a long time before people aren’t using compute instances (such as EC2) to a very substantial extent. So we have a meaningfully broader selection of instances than anybody else. We have the most powerful machine learning training instances and the most powerful GPU graphics-rendering instances. We’ve got FPGA instances. We are the only ones with a hundred gigabits per second of networking. We’re not close to being done there.

Furrier: What about containers? How is AWS viewing that evolution?

Jassy: The basic unit of compute is getting smaller. You see this with the advent of containers, which continue to grow at a very substantial rate. Most [cloud companies] have a managed Kubernetes offering, and that’s their containers offering. We have that, and it has grown really fast. But we have two others, including our own orchestration engine, Elastic Container Service, which continues to grow really fast. Because we control ECS, we can build everything to work together from the get-go, so if you want the container service that’s most integrated with the cloud, or if you want the open-source managed Kubernetes service, we give you both. Those are both great managed services for containers.

But if you want to manage at the task layer or you don’t want to worry about servers or clusters, you can use Fargate, which allows you to manage at the task layer. That’s growing incredibly fast. It’s the only serverless container option that you can find anywhere.

So, we see containers growing really fast and we see event-driven serverless computing, in the form of Lambda and then all the other services that work with Lambda, growing unbelievably fast. I think a whole generation of developers are going to grow up not worrying or thinking about servers.
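The serverless idea Jassy describes fits in a few lines: a Lambda-style function is just a handler invoked per event, with no server for the developer to think about. The handler signature below follows the AWS Lambda Python convention, but the event shape and the local invocation are made up for illustration:

```python
import json

def handler(event, context=None):
    """Lambda-style handler: a pure function of the incoming event.
    The platform, not the developer, provisions and scales the
    compute that runs it."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Invoking locally, standing in for the event-driven runtime:
response = handler({"name": "re:Invent"})
```

In production the same function would be triggered by an API call, a queue message, or an object landing in S3; the code itself never changes.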

Furrier: How are you addressing the need for more raw computing power, where enterprises, to quote Star Trek, say, “Scotty, give me more power”?

Jassy: People would like to have different capabilities that can only be driven by new chips, and that’s why we are designing a bunch of chips right now. You saw us launch something we call Graviton, our Arm-based chip. We had the first Arm-based instances out there, and we’ve been pretty blown away by how many people are using them. They were really just for scale-out workloads, but people would like to be able to use those Arm-based chips for more. Training for machine learning is a huge use case.

Machine learning is so new, and training is relatively well covered today by a number of us. But training is only 10% of the workload; the cost of doing machine learning in production, 90% of it, is the predictions, the inference. I think you need a completely different type of chip to optimize for inference, which is why we announced Inferentia. We’re working hard at it and we hope to be ready soon.
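Jassy’s 10/90 split implies inference, not training, dominates a production machine learning budget. A back-of-the-envelope helper makes the arithmetic explicit; the 0.90 share is his rough figure, not a measured constant:

```python
def ml_cost_split(total_cost: float, inference_share: float = 0.90) -> dict:
    """Split a production ML budget using the rough 10/90 figure:
    training ~10% of cost, inference (predictions) ~90%."""
    inference = total_cost * inference_share
    training = total_cost - inference
    return {"training": training, "inference": inference}
```

On that split, a workload spending $1,000 a month spends roughly $900 serving predictions, which is why a chip optimized specifically for inference can move the overall bill.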

Quantum leap

Furrier: What about quantum computing?

Jassy: There’s no doubt in my mind that you have to be investing in quantum computing, and we have a pretty significant investment there as well. There are a lot of advancements being made, and we have a bunch of things that you’ll see. I think we’re probably several years away from people being able to use it in a needle-moving way.

But I think more and more of the companies we talk to are interested in its potential and want to find ways to experiment. There are so many different types of quantum computing machines today, and each is optimized for different dimensions. Nobody really knows which is the right approach yet.

So people want to experiment with lots of different ones. At every layer, whether it’s instances, containers, event-driven serverless, some of the stuff we want to do with quantum, or what we’re doing at the edge, we want to make it easy to run compute, either in the cloud or on the devices themselves.

Furrier: So how close are you to offering quantum services to customers?

Jassy: It’s a potentially disruptive way for people to get more computing done for less money. It’s something that we’ve been paying attention to and working on for many years. I think we all wish it were going faster. We are pretty straightforward with our customers about what’s real and what’s not. You’ll see a number of things from us in the quantum space because we believe in its potential and we have a big investment in it.

But it’s going to take a few years before it moves the needle for people and then we’ll have to see for which workloads it’ll move the needle. But we’re very optimistic about its future.

Photo: Robert Hof/SiliconANGLE
