Session Name: Challenges and Opportunities in Cloud Computing and Big Data.
Chris Albrecht 00:05
Thank you Barb. From the Cloud over to big data, actually the Cloud and big data, we’re going to talk about challenges and opportunities in Cloud computing and big data, and that’s going to be moderated by Derrick “Big Data” Harris himself. He’s a Senior Writer with GigaOM, and he’s going to be talking with Dave Campbell, the CTO of Cloud and Enterprise at Microsoft. Please welcome Derrick and Dave to the stage.
Dave Campbell 00:32
I didn’t know you had that handy title there. Big Data Harris?
Derrick Harris 00:37
Yeah, that’s official. That’s my business card. So, Dave, 20 years [chuckles] you’ve been at Microsoft now?
Dave Campbell 00:45
Derrick Harris 00:45
Almost 19 years?
Dave Campbell 00:46
Derrick Harris 00:46
A lot has changed. Just a quick recap, looking back from your early days working on some of the database stuff, and now here you are as CTO of Cloud. Can you just give the shift? Your take on the shift?
Dave Campbell 01:03
When I got to Microsoft, it was just beginning to get into Enterprise computing and it hired a bunch of people from around the industry to build database, transaction processing elements, web server, and such. Dave Cutler, and others, had come from DEC West to build what became Windows NT. So, long road: I worked on SQL Server with a bunch of guys from DEC and IBM who worked on DB2 and such, and that all went well. Now the next big shift in computing, everything moving to the Cloud. That’s what I’m focused on these days.
Derrick Harris 01:37
How does that happen within Microsoft? You’ve been there through this whole shift. Can you just walk us through from the internal perspective of what it looked like as you see these things start to happen?
Dave Campbell 01:50
The interesting thing, I think, is the one key piece of it was the fact that we had internet scale services for a long, long time in Live Search, Bing, Hotmail, and such. I had friends who worked with me in the database space who went over there. In fact, there’s a guy I worked with who became the architect for Live Search, Bing, and the big data environment over there. And I tell people, when he went over there 10 years ago, I’d run into him a couple of times a year, and it’d be like, How do things look from your side of the world? Over the last five years, those things have started to converge. Five, six years ago, there were a few people who– in fact, Bill Gates was one of the folks pushing on a new wave, what are we going to do about it? Ray Ozzie played a big role in terms of moving us towards services and such. Then it’s just a matter of what does it mean for us? What does it mean for our existing businesses? The ability to be running things at scale in things like Bing, and Hotmail, and such, now with the acquisition of Yammer; and to have the Enterprise business that we have in SQL Server, and other products, and System Center, and such is a pretty interesting mix.
Derrick Harris 03:05
That’s a good point because when I hear the discussion of Cloud computing, here’s what I hear, Amazon and Google, and then I hear HP and Dell, and maybe IBM. And Microsoft seems to be the missing middle in there somewhere. But there’s a point, because the comparisons to Amazon and Google are always scale and Microsoft runs at a pretty impressive scale, right?
Dave Campbell 03:29
Yeah. It’s interesting. I think from the perspective of someone who’s been building Enterprise software for 25 years or so, building things at this scale is vastly different. I first started working on a project, early incubation, say in 2007 and 2008 where I took it on, and we would hire people from the SQL Server team. And I started to assess people’s ability to unlearn because things are different at this scale. So the way I think about it is there’ll be a handful of folks who will be hundreds of thousands, say a million server scale, and the learning that you get from that is pretty dramatic. And the economies of scale, in terms of operating at that level, are pretty dramatic as well. So you can put Google, Amazon, Facebook, and Microsoft in that camp, and maybe one or two others. So that’s interesting learning. Then taking that learning and translating it into what we can do across the spectrum. One of the things we were just chatting about backstage was things we’re learning in what I call a Cloud design point at scale, now in the next release of Windows Server, are showing up there for private Cloud deployment as well. It’s a matter of how do people perceive Microsoft. The thing I’d say is at this point, over half of the Fortune 500 companies have access and are using Azure. A thousand customers a day onboarding on it, so it’s growing pretty well.
Derrick Harris 04:57
What do you mean by that? Like accessing and using? Because I haven’t heard of a lot of this type of–
Dave Campbell 05:06
Well, if you saw the usage numbers and what we’re doing to keep up with that, you’d understand. There are a lot of customers building out and we have things that would be more consumer ISV, but our focus over the last year or so has been on enabling businesses and organizations. When we announced the IaaS support, full infrastructure support, that really provided a simple on-ramp for people to move. What’s been interesting is people talk hybrid, I overhear, and most of the discussion, like I’ve heard here, has been about hybrid at the infrastructure level. I think of that as, can I take a VM and move it from here to there, and such. But the other dimension of hybrid, which I think is critical, is the hybrid scenarios where one part of the workload is going to be running in one place, say on-premises, and the other part will be running in the Cloud. For us, we made an acquisition of this company, StorSimple. It’s a storage appliance that you run on-premises and the Cloud provides bottomless storage on the back end. The uptake on that has been dramatic. As I mentioned, the upcoming releases of Windows Server and SQL Server are real easy to back up into the Cloud. So on-ramp, in terms of some of these hybrid scenarios, is something that people are–
Derrick Harris 06:22
Is that kind of a backtrack? Maybe I just had a misconception, but I always thought that when Azure came out, it was kind of like the Cloud is the future sort of thing. And now you’re talking about what can we learn in Azure and roll back into software. How do you reconcile the two, the software business and the Cloud business?
Dave Campbell 06:44
I think people in the audience would say it’s a continuum. And that’s kind of where it’s almost a trite thing to say you want to meet people where they’re at, but different people are at different points on the journey. The thing for me, and I’ll say it again, is that the things that we’re learning at extreme scale, translating those into ten, hundred, thousand server scale is interesting and valuable for our customers. So the idea is to meet them where they want to be, but still provide them the value of what we’re learning on–
Derrick Harris 07:19
Do you have customers running at 100,000?
Dave Campbell 07:22
No, we are.
Derrick Harris 07:23
Dave Campbell 07:25
We have customers making use of that in an elastic sense, but–
Derrick Harris 07:28
All right. Just to talk about that point a little bit more, can you give us examples of things you learned or things that you’ve done? Like running Hotmail or running Bing that now we’re seeing materialize in something like Azure or Quiver software.
Dave Campbell 07:43
Sure. One example is the Windows Azure Pack, which was just announced a little while ago. And that’s the whole provisioning plane, management portal, and the high density websites, and such. The other thing that we need to think about in terms of how this evolves is there’s infrastructure in what deploys. What does infrastructure make fungible? Then there’s a control plane and a management plane and a lot of these things; for example, Windows Azure Pack is more at the management plane. But that’s something that we learned at scale, produced at scale, and then brought down.
Derrick Harris 08:19
In terms of looking at Azure, one of the things we hear a lot, or I’ve heard a lot at least, is that Microsoft is working with a lot of open source technology now and is trying to roll that in; it’s not just a .NET platform anymore. I’m not sure, at least from what I hear, that that message has caught on. So what’s the vision of Azure? What’s the reality, from your perspective, of what Azure is compared with the other Cloud platforms out there?
Dave Campbell 08:49
The vision, for Azure for us, is a business ready Cloud. You know the term that is really not grounded and defined well today is Cloud O/S. But in the role that I’m in right now, one of the things I’m focusing on is how do we define Cloud O/S? And I think of an operating system broadly, as having the capabilities and facilities for building the modern Apps of the era. So you start to get interesting things like notification services, location services, other data centric services that you would expect to find. If you go way back when, 20 years ago, you didn’t have networking in the operating system, and now it’s just a core part of what we assume. So that’s kind of how I think about that piece.
Derrick Harris 09:31
So are customers coming to Azure and building, let’s say, Millie’s next generation applications, are they using MongoDB, and Hadoop, and all the kind of things that–?
Dave Campbell 09:42
Yeah. When they open SourceRun, certainly. We’ve enabled a bunch. We have Linux support in IaaS. I think a lot of people in the audience probably know we’ve done a lot of work with Hadoop. In fact, the thing that folks in the audience may not know, if I were to say– people who know the Apache structure, know there’s a Working Group Chairperson. And I tell people, would you bet that the Apache Hadoop Working Group Chairperson is a Microsoft full-time employee? And a lot of people say no, I wouldn’t take that bet. But, in fact, that’s true. So Chris Douglas is the–
Derrick Harris 10:17
I’m not sure I knew that.
Dave Campbell 10:18
Yeah. We’re committed to investment, we’ve got a good relationship with Hortonworks. I think in the data space, if you want to go there for a second, large pools of data represent potential latent value for me. If there are going to be petabytes of data landing in HDFS, we want to be able to create value on top of that.
Derrick Harris 10:42
So why Hadoop? Because a couple of years ago, or a few years ago, Microsoft was pushing a different alternative platform for big data, and I think internally Microsoft still runs on something different. Why Hadoop for users and something different for Microsoft?
Dave Campbell 11:02
The thing really is, it just made commercial sense. We have a very large and successful data warehousing business, and three years ago we were seeing RFPs that said, Okay, what is your Hadoop integration story, not your big data integration, specifically Hadoop. The other thing that’s interesting is that in some of the open source projects right now, that’s where the innovation wavefront is. So how do you participate in that in a credible way? We still have Cosmos and Scope, which is great technology. What I think is interesting is to see what Hadoop becomes in three to five years. It’s becoming more database-like, if you look at efforts like Impala and other things. That’s kind of where Cosmos is at right now. So how do we bring the best of both worlds? What do we do that makes commercial sense? That’s kind of how we think about it.
Derrick Harris 11:59
So you spent a lot of time working on SQL Server?
Dave Campbell 12:01
Derrick Harris 12:01
You were a Technical Fellow on the database side, and maybe you still are. If Hadoop becomes more databasey, what does that mean for a product like SQL Server? I know they’re different, but assuming they converge at some point.
Dave Campbell 12:18
One of the things I find really interesting, I went out when I was trying to make sense of what was playing out in the big data space a number of years ago. I found a natural pattern where people would pour data into what I refer to as a digital shoe box and do what I would refer to as information production. Then you’d wind up with these big pools of data surrounded by a constellation of data marts, and cubes, and such. So in that construct, there’s still very much a role for relational databases, relational data warehousing. And you see more of the relational data warehousing techniques going into Hadoop itself. I think they will complement each other in a previewed way.
Derrick Harris 13:02
So the Hortonworks strategy. For people who aren’t familiar, Hortonworks is the Yahoo spin-off dedicated to open source Hadoop. Can we trust Microsoft? Is the plan a completely open source Hadoop thing going forward, or is Microsoft going to build its own? Because it certainly has the capabilities to build its own stuff on top of it and do some interesting things.
Dave Campbell 13:26
I think it makes commercial sense for us to be able to participate in the whole show, if you will. I watched the data science talk that you apparently sponsored yesterday and someone mentioned Excel. As we spoke to people in the space before we decided to go with Hadoop, a lot of people said really what we want to be able to do is to get into the Microsoft BI stack, into Excel, and such like that. And we’ve got a whole stable of really great machine learning experts, and such. I think that much of the value that we create will be in that information production, and then delivering the insights out to the edge, if you will. So if you think about it that way, running over the broadest swath of large data infrastructure makes complete sense for us from a business standpoint.
Derrick Harris 14:21
That’s a good point. You talk about Microsoft’s machine learning center. Microsoft has a huge research division, so there’s a lot of them. What’s the value of research to a company? Because I see a handful of companies doing research for the sake of research, not for the sake of R&D and Product Development. Is it worth the investment?
Dave Campbell 14:39
Well, I think, yes. I’ll say absolutely yes, from the industry perspective overall. I think you can just go through and chase the roots of distributed systems all the way back to seminal work sponsored by industrial research labs. I think the question is the latency between the invention and when it’s commercially recognized, sometimes. And different companies, at different points, have sort of carried the mantle, if you will, for the industrial research. I think it’s critical to innovation.
Derrick Harris 15:14
I want to go back to something you mentioned, just the talk about the scale that Microsoft runs. I think that’s a critical part of the Cloud discussion. Maybe this is putting you on the spot, but Steve Ballmer said a few months ago that Microsoft is running a million servers. To your knowledge, is that in the ballpark? It seems like an awful lot.
Dave Campbell 15:37
That’s in the ballpark. The rate at which we’re buying them and building data centers is quite astronomical. Here’s an interesting thing, if you told me five years ago, from the business side, that we would be worried about someone cutting a fiber with a backhoe, or that we’d have a case where a piece of steel fell because the temperature where we were building the data center went from like 80 degrees in the day to 36 degrees at night and it contracted four inches. There are just so many different aspects in the supply chain. Working with partners to just optimize that, and land things at the right place, and light them up at the right time. Back to how the magic of the hardware and the software at that scale will come together to allow us to deploy things really quickly, it’s a whole host of new things we have to learn to run at that scale.
Derrick Harris 16:31
Is it different? What are the differences of me running Hotmail and running Windows Azure? They both have a lot of users, lots of servers, but there must be differences business-wise, economic-wise.
Dave Campbell 16:50
What’s interesting is that we had a number of environments and people would build up big services. What’s happened over the last year or so, and it’s part of what we’re all focused on, is getting on a common infrastructure to increase the economies of scale. One of the things that’s interesting, I mentioned earlier, being a business Cloud, the types of certifications that we need for the business applications are far different than what we need for some of the consumer services and such. So there’s one example. And doing that through-and-through, so we can deploy anything, anywhere in the globe, and meet the needs of the business and the consumer services.
Derrick Harris 17:30
We’re running out of time, so final question. Given that Microsoft has been around for 30 some odd years now, is Microsoft, in your opinion, more of an HP at this point, or more of a Google? Just in terms of the product vision, and where?
Dave Campbell 17:47
I’ll answer something of that, but I’m not so sure about answering the question here. I’d say it this way, and I’ve been quite open, I’ve said this in a number of forums; I was at Digital, if there’s anyone old enough out there to remember them – a few hands go up – when there was this tectonic shift in computing from mainframes, to minicomputers with terminals around them, to client-server, and 3-tier. It, and those who were around will probably agree with me here, erased the minicomputer industry from the planet in about five years. I’ve told lots of people, including my wife, that I only plan on going through that once in my career. So I am bullish in the sense that we’ve recognized the shifts, we invested early, and clearly there’s great competition, we have a long way to go, but we’re all in. So you can ask any of the developers across any of the products working in this, and ask them if they have an appreciation for what it means to run at this scale and are excited about it, they’re all fully engaged.
Derrick Harris 18:55
All right. Great. Thanks a lot Dave.
Dave Campbell 18:56