MobileMe Problems Show Apple Needs an Infrastructure Lesson

56 Comments

Steve Jobs, in an internal email seen by Ars Technica, makes clear that he’s upset about the botched launch of MobileMe, Apple’s new online suite of applications that has been plagued with bugs, including being flat-out unavailable to some for days at a time.

“It was a mistake to launch MobileMe at the same time as iPhone 3G, iPhone 2.0 software and the App Store,” he says. “We all had more than enough to do, and MobileMe could have been delayed without consequence.”

Amen to that. Having been a subscriber to dot-Mac for years, I was quite upset when the service failed to work at launch. They tried to hush everyone by waiving one month’s fee, but regardless, while some parts of it are up and running, many of the problems continue.

It wasn’t till Walt Mossberg and David Pogue publicly spanked the service with their respective wet bamboo stems that Apple started to understand the magnitude of the problem.

In his email, Jobs says: “The MobileMe launch clearly demonstrates that we have more to learn about Internet services.” You can say that again. The big question in the wake of the MobileMe debacle is whether or not the company even knows how to plan for heavy load.

I have picked up some tidbits from my Internet infrastructure sources, who tell me that:

  • There is no unified IT plan vis-a-vis applications; each has its own set of servers, IT practices and release scenarios.
  • Developers do testing, load testing and infrastructure planning, all of which is implemented by someone else.
  • There’s no unified monitoring system.
  • They use Oracle on Sun servers for the databases and everything has its own SAN storage. They do not use active Oracle RAC; it is all single-instance, on one box, with a secondary failover.
  • Apparently they are putting web servers and app servers on the same machines, which causes performance problems.

One of my sources opined that Apple clearly wasn’t too savvy about all the progress made in infrastructure over the past few years. If this insinuation is indeed true, then there is no way Apple can get over its current spate of problems. It needs a crash course in infrastructure and Internet services. Apple’s problem is that it doesn’t seem to have recognized the fact that it’s in the business of network-enabled hardware.

The looks, UI and edge devices are only as good as the networking experience — whether it comes from Apple or from its partners. MobileMe could just be the canary in the coal mine as far as the Cupertino Kingdom is concerned. MobileMe isn’t that big a portion of their revenues right now, but what happens when the problems hit the iTunes store? Imagine the uproar when your 3G connections slow to a crawl because AT&T’s wireless backhaul can’t handle the traffic surge.

It might not be a problem of Apple’s making but the company will face the brunt of the backlash. Remember, most of us instinctively blame the device first, then curse the carrier.

Andrew

There are not many organizations with the skills and experience to launch a world-class, fully dynamic, huge-volume web application.

Apple just moved into that business, loudly, and stumbled. Any other company would have run it in public beta for a few months first, but that’s not Apple’s style.

If they had pulled it all off without a hiccup, they would be the first in history to ever do so. I’m sure they wanted to be, but this stuff is fiendishly complex to get right without lots of real-world usage data and environment profiling.

It was a risky decision to launch without that kind of testing. With enough time and resources, perhaps they could have created a passable simulation of user load, or at least something close enough that the service wouldn’t have failed so visibly.
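As a rough illustration of what simulating user load can mean, here is a minimal Python sketch. Everything in it is an assumption made for the example: `make_fake_service` is a hypothetical stand-in for a backend limited by one shared resource, and nothing here reflects how Apple actually tested MobileMe. It fires requests from several concurrent workers and reports latency percentiles.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def make_fake_service(service_time=0.002):
    """Hypothetical backend: a single shared resource guarded by a lock."""
    lock = threading.Lock()

    def handle_request():
        with lock:  # only one request is served at a time
            time.sleep(service_time)

    return handle_request


def timed_call(handler):
    """Return how long one call to handler takes, in seconds."""
    start = time.perf_counter()
    handler()
    return time.perf_counter() - start


def run_load(handler, users, requests_per_user):
    """Issue requests from `users` concurrent workers; return sorted latencies."""

    def one_user(_user_id):
        return [timed_call(handler) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    return sorted(latency for batch in per_user for latency in batch)


def percentile(sorted_values, fraction):
    """Nearest-rank style percentile over an already sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


if __name__ == "__main__":
    service = make_fake_service()
    for users in (1, 8, 32):
        latencies = run_load(service, users, requests_per_user=10)
        print(users, "users: median", percentile(latencies, 0.5),
              "p95", percentile(latencies, 0.95))
```

Because the fake service serializes every request, latency should climb as the number of simulated users grows. A synthetic test like this only approximates real usage patterns, which is the limitation the comment above points to.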

But marketing requirements (schedule and secrecy) trumped. I do hope they (management) learn the lesson here — serious web apps are full-stack “hard”, even though they’re generally algorithmically “easy”.

Rob Enderle

>Apple has the money and expertise to hire the best, buy the best gear

Yes, because other companies don’t have any money and no one has ever tried ‘to hire the best, buy the best gear’.

Such an innovative concept.

Raghu Kulkarni

Om,

You almost seem to suggest that Apple folks are dumb when it comes to networking applications. Apple houses some of the smartest brains in the industry, possibly on a level with Google. MobileMe is a small issue, and Apple can overcome this easily.

MobileMe infrastructure can be built and made to function in several ways; you could use a Linux farm or a Sun farm with an Oracle database; this sort of infrastructure is not rocket science. This has been done to death by everyone from Google to Facebook to Amazon.

Milo

The MobileMe fiasco was, as most large IT rollouts can be, a clusterfuck of bad things happening with many systems. I know the mail problems were ones that were not unfamiliar to other users of the mail server using the same version of the software. I would guess they were completely un-related to any other problems people had, but, like Metcalfe’s law, 1 problem + 1 problem = a hell of a lot of bitching about how nothing works.

Paul

This was a good idea in principle, but a bad launch of this conceptually complex (for most Apple customers, anyway) service may set its rate of adoption back a year or more.

Maybe it happened because Apple doesn’t have web DNA, but Apple didn’t know (hideously complex) cell phones either, and they hit a home run out of the box. The launch was probably botched because Jobs had his mind on his health, not on product quality. As such, this may be a fair glimpse of a post-Jobs Apple.

yanki

Why is ITMS compared to MobileMe? They are two different beasts. One is a broadcast channel where the content can be replicated and served from the edge. The same thing is nearly impossible with MobileMe; you would need enormous amounts of storage and bandwidth for replication, as it is receiving a constant flux of unique content from its users.

john

I work for Google, not Apple, but have friends who work for Apple. Apple definitely has a very elaborate monitoring system in place; I am 100% certain of this. So taking this into consideration, I feel that maybe the remainder of this article is not very accurate. Maybe the source of your information is only seeing a small piece of the picture, and not the whole picture?

spotman

This is total FUD. Getting your information from the NOC monkeys staring at the rack doesn’t mean you understand how any of it works. You take a stab at Apple like they were born yesterday and have no clue how the Internet or mail clusters work. Granted, me.com’s launch was terrible, but they clearly took on (and launched) more than they had the engineers to troubleshoot at the time. Combine that with some reports from a couple of NOC monkeys who are looking at one of their racks, and you call this a story?

Danny

They never should have ripped off the “Me” part of the Windows Me logo.

Paul Lambert

There are some fundamental differences between iTunes and MobileMe, and assuming they’re the same probably got Apple into trouble in the first place.

When you search the iTunes store, for example, you get results that might change over time. If someone at Apple adds a track that should show up in your results, but you don’t see it until 10 minutes later, will you be upset?

Would you even know?

Of course not.

MobileMe is accepting tons of writes, constantly. That new data then has to be shown to the user in a matter of seconds.

This requires a very different architecture, and will scale very poorly unless all the service requirements are really taken into account during the design.

The simple approach—which surely works fine for the iTunes store—of a three-tier solution won’t work well at all.

Even fancy buzzwords that previous posters have thrown around (like Oracle RAC) won’t help much with that situation.
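The read-heavy versus write-heavy distinction in this comment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how iTunes or MobileMe is built: `TTLCache`, `FakeClock` and the keys used are all hypothetical names invented for the example. A catalog that tolerates ten minutes of staleness can be served from a cache, while data a user has just written must be invalidated on every write so the next read sees it.

```python
class FakeClock:
    """Hypothetical manual clock so the example needs no real waiting."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class TTLCache:
    """Read-through cache that may serve an entry for up to ttl seconds."""

    def __init__(self, store, ttl, clock):
        self.store = store      # backing store, a plain dict here
        self.ttl = ttl
        self.clock = clock
        self._entries = {}      # key -> (value, time fetched)

    def read(self, key):
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]       # possibly stale, but cheap
        value = self.store.get(key)
        self._entries[key] = (value, now)
        return value

    def write(self, key, value, invalidate):
        self.store[key] = value
        if invalidate:
            # Needed when the writer must see the change immediately.
            self._entries.pop(key, None)


def demo():
    clock = FakeClock()

    # Catalog-style data: stale results for a few minutes are acceptable.
    catalog = TTLCache({"search:beatles": ["Track A"]}, ttl=600, clock=clock)
    catalog.read("search:beatles")
    catalog.write("search:beatles", ["Track A", "Track B"], invalidate=False)
    stale = catalog.read("search:beatles")
    clock.advance(601)
    fresh = catalog.read("search:beatles")

    # Sync-style data: every write must be visible on the next read.
    contacts = TTLCache({}, ttl=600, clock=clock)
    contacts.write("user:1:contacts", ["Alice"], invalidate=True)
    contacts.read("user:1:contacts")
    contacts.write("user:1:contacts", ["Alice", "Bob"], invalidate=True)
    current = contacts.read("user:1:contacts")

    return {
        "stale_catalog_read": stale,
        "catalog_after_ttl": fresh,
        "contacts_after_write": current,
    }


if __name__ == "__main__":
    print(demo())
```

In the second case every write evicts the cached entry, so the cache absorbs far less of the load, which is one way to see why a design that suits a mostly-read store may not carry over to a service taking constant writes.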

ibulb

You know, some people write articles about Apple just to get some clicks. I have two independent, reliable sources inside the GigaOm group who tell me that whenever the traffic to their site is low, they just hook up an article about Apple.

ibulb

Om, people are always quick to jump on others’ mistakes. NASA screwed up a space mission because they did not convert units from one system to another. Have they not carried out critical missions in the past? Have they not learnt anything from then?

Yes, iTunes is a serious web business, however, it grew steadily. Every Tom, Dick and Harry who uses Windows was not allowed to access the system on Day 1. Apple messed up Mobile Me; and given how Jobs is when it comes to perfection, it will get fixed (read the memo once more).

I would like to see how often someone admits mistakes. One mistake from Apple, and every blogger starts to give advice.

Cheer up. Some people need journalism lessons too.

JC

This has already been commented on, but clearly iTunes is an Internet, networked service.

The traffic is tremendous, but iTunes hums along.

Why would you think that the lessons of iTunes can’t be applied to MobileMe? The problem sets seem the same.

Lethol

“…. Show Apple Needs an Infrastructure Lesson”. -Om Malik

As someone else posted before, Apple probably delivers more content than anyone else on the Internet; hell, it may even deliver more bandwidth throughout the day than Google.

Still, here comes the great Om Malik, one-person IT guru/genius/Rambo, who can cook up some pointers and in a few minutes show how all the IT staff at Apple really don’t know what they are doing. Maybe Apple IT has a magic wand that they’ve been using for the past 6-7 years to deliver their content/services and the wand suddenly broke.

Oh, and let’s not forget that he has some ‘pointers’ that can fix things for future outages, so please, SOMEONE at Apple, print screen, take note and bring it to the next meeting ASAP. I’m sure this is based on his infinite knowledge of running a web page with a gazillion hits, which is probably just the same as what the iTunes/MobileMe service delivered but multiplied by some random number.

Seriously Om, the MobileMe outage was a pain for all of us with the service, but it’s old news and Apple has paid the price.

I’m happy to know about the Steve memo, and I am sure things will get better. Still, Apple offers something so well integrated with their products (MobileMe/Macs/iPhones) that I would not change it for anything else (MS Live, Gmail, Hotmail, etc). Oh, and for everyone saying that all those can have outages and still cry, “It doesn’t matter cause it’s free!”, let’s not forget that you don’t get something for nothing. I don’t recall logging into my me.com account and having it plastered with sponsors on every possible space in my browser.

Lethol

scott

@om

I used to manage a team at Apple that worked daily with the .Mac and ITMS operations teams. I was there when ITMS was built and there for many rebuildings of .Mac.

There were definite problems with the MobileMe launch (although strangely my account had no issues), and Apple is notoriously secretive, so it’s understandable that people are frustrated by the lack of info, and so invent things to fill the void. If anyone working at Apple now commented on your story, they would probably be fired.

BTW, probably no contractors were used on MobileMe; Apple is in general against using outside contractors for that kind of work.

mikecane

What I found strange is:

1) Of course, there was the iTunes Store to learn from (as others have also pointed out)

2) Apple has been selling servers for quite some time. Couldn’t they have learned from customers? (Or are Apple’s servers regarded as “toy” servers in the serverscape, much like the original Mac was disdained as a “toy” computer back in 1984)?

3) There was .Mac to learn from. With all the complaints I’ve read about that, why did they think mobileme would go smoothly?

It all brings up many more questions. But Apple is not (supposed to be) a dumb company, so I expect them to get it right.

Om Malik

@Scott, you speak with supreme authority. I wonder if you could clue us in to your relationship to the team (I don’t need your identity). My information does come from two pretty reliable and independent sources (they have been reliable in the past). Thanks for the comment, by the way.

Düg

Om, I’d wager that Apple’s security-through-compartmentalization approach has a lot to do with the failings of MobileMe.

While Jobs has been using this approach since his first pet projects,* you correctly point out that the lack of communication between software and service deployment teams creates some real pitfalls for Apple. In the hardware world, you spec a number of parts from Motorola, Intel, or whoever. You fix problems on the board level with a small team. The point is that there are established, mature processes for debugging hardware and extracting maximum performance.

With OS X, Apple again has a long history and a mature set of processes for dealing with forks and new features, including the ability to set up a new hardware platform like the iPhone without other parts of Apple (or even other parts of the OS X dev team) knowing about the changes. Software development is also pretty mature; it’s not hard to imagine how Apple could fork the codebase, strip out everything not needed for a mobile OS (and a few things that are), and deliver it as a surprise to most of the rest of the company, as was done with the iPhone.

With Internet services like MobileMe, there are no established industrywide practices for projects like this, nor does Apple have much of a history of rolling out complex, ground-up applications. .Mac was built on top of WebObjects, a mature and capable set of tools that nonetheless wouldn’t have been able to scale to the feature set or breadth of use that Apple hopes the service will achieve.

MobileMe suffered from Apple’s paranoid secrecy in two ways – first, because the team responsible likely wasn’t allowed to tell anyone else what they were doing, and second, because Apple couldn’t just bring in some well-qualified consultants for guidance. This is distinctly different from NIH syndrome, which I don’t see or hear from much at Apple anymore.

At any rate, great post. I’d like to hear more of your thoughts about how Apple’s secrecy hobbles them; while I certainly don’t want the company to give away the family jewels, I think Apple should reevaluate the need for such strict secrecy now that many of their products are being adopted by businesses and government – market segments which absolutely must have strategic guidance.

*The PowerBook G3 folks were famously unaware of the iMac’s existence and unveiling in the Flint Center in May 1998; virtually everyone at Apple thought the event was to announce the sleek new PowerBooks and speed bumped G3 desktops. Then the iMac stole the show. Theatrically, it was stunning to see Steve unveil the new computer and philosophy in the same spot the original Mac was introduced, but there were many employees who found it hard to see past the diminishment of their hard work on other products.

Ricki

joedogjoe said: “Microsoft Office: for the rest of us.” .. think it was “exchange: for the rest of us” :) ..at least I hope so. (no mom don’t make me use MS Office..I’ll be good..)

I think it is the case of “kids and too much candy”. Apple has been over-achieving the last couple of years, making all of us “fanboys” demand new revolutionary toys every 2 months. The Snow Leopard vapourware is the best thing I heard from Cupertino in a long time: Stop, go back, clean up, move forward!
Wish that..oh 10.000 other companies would consider doing that to their products. (but as I said, vapourware for now)

The business plan for Apple seems to be: get as many new platforms out there as we can (media centers, PDA/computer-like phones, app store, video store and laptops etc.), regardless of the blunders and rushed products… then we build stability and market share.

It reminds me of the Intel switch a couple of years ago: surprise everyone, give them crazy toys, watch them jump on the Mac wagon so fast they never see the cracks in the veneer .. then correct the mistakes at a tempo that is economically feasible. And now a few years later, no one remembers the toaster laptops, the high-pitched whines or the cracked laptops when they enjoy all the Intel kool-aid. (Windows on the Mac and new CPUs every third month, we are lucky bastards, all of us).

Apple puts out products at a pace that (sorry to say this) makes it ~ok~ that us first adopters are the lab rats. Luckily, Apple is a hardware company first, so the blunders are usually of the sort that are corrected in the software along the way :)

My overall 2 cents, I guess, is: these are complex things hitting the streets these days, and new parts are being put into every corner of the Apple R&D / marketing machine, so give it a couple of months. I surely will :)

Om Malik

@Brian I have much more information about their network architecture, but since it hasn’t been verified beyond one person, I can’t write it up. You make good points, but they don’t take away from the fact that the mobilemess was of Apple’s own making.

They have had months to figure out this whole thing; the announcement was months ago. So let’s stop being apologists for a service we pay for.

By the way, the MobileMe team was comprised of folks who were on contract.

scott

Almost everything your Internet infrastructure sources told you is inaccurate. I worked with the guys who built .Mac (they also did the initial builds of the iTunes Music Store, btw). They definitely know how to build scalable web sites.

Each set of Apple’s applications has its own set of servers. Well, that’s sort of true. However, there’s a very good reason for that. The development teams are also separate, and code and infrastructure changes for each application need to be separate.

Developers don’t do load-testing or QA testing at Apple. That’s usually done by QA teams which usually report to the operations teams that are responsible for the health and maintenance of the application.

Apple does have a unified monitoring system and it generally works quite well.

The Oracle infrastructure being described is actually similar to what iTunes uses, but not .Mac.

The web servers and application servers ARE on separate boxes.

Brian

Rene Stein may have a good point. In the internal memo, Steve Jobs stated that Eddy Cue, who previously managed the iTunes Store, was being promoted to handle all of the company’s Internet-related services. I think Apple has done a very good job of scaling the iTunes Store to meet tremendous global demand. We can only hope that with better management, the situation with MobileMe will turn around.

There is still much we don’t know, and may never know. But I’m sure that at least one former manager of the MobileMe project is now seeking a new career outside of Apple. We simply don’t know if the apparent mistakes were made at the top, or by middle managers. Was the MobileMe team given autonomy in planning its network architecture? Or did the MobileMe team use a network architecture practice more common to other parts of the corporation? Om Malik’s posting does not address this.

While the author has some insight on the causes of this mess, we don’t have the whole story. Trying to fill in the blanks with assumptions can lead to false conclusions. What we do know is that the MobileMe team failed on many fronts, and in so doing, has given Apple a black eye. But every black eye is an opportunity for a company like Apple to learn and do better in the future.

Sadly, in technology, there are failures. Nothing in this world is perfect. The bigger issue is how a company like Apple, Microsoft, Google or others deal with them. Being paranoid, I’ll plan for the worst. And being an optimist, I’ll hope for the best.

xiane

I think you make good points, though I think some are a bit overstated. It appears to me that whatever Jobs is focussing his attention on at Apple gets fixed in fairly short order. And unlike some companies, Apple does seem to learn from mistakes (cf. Aperture).

The techniques involved in mass internet services are well-understood, even if they aren’t a core competency at Apple at this point. Apple has the money and expertise to hire the best, buy the best gear, and make its promises a reality soon, even though it shouldn’t be “soon”, it should be “now”.

joedogjoe

I’ve been a .Mac (and now MobileMe) user for almost a year. With the exception of the storage space, I am going to have a hard time staying on with the service since I can do all of the same things (and more) with Google’s free web-based apps (Gmail, Picasa, Calendar, etc.)

I was disappointed with the redesign that looks pretty but does not differentiate the product from their competition. One of their rare bad pieces of marketing summed it up best: “Microsoft Office: for the rest of us.”

Saptarshi

Om, I completely agree with your observation that Apple needs to become more efficient in delivering web services. It has been pretty bad at nearly every web service it’s tried to launch compared to what it achieves in desktops… And I love the last statement where you mention that the device will be blamed even when it’s the operator.

Om Malik

@rene

If that was the case, then why did they not “learn” the lessons of the “iTunes store” in the first place? I think they have a problem. The Apple store has problems, but they get masked because the company uses CDNs for delivery. Apps-via-CDNs are still an emerging category and you are going to see growth in coming months.

As for the rushed delivery, I couldn’t agree more. I think Apple doesn’t have Web DNA and needs to become an “Internet company” quickly. It is still thinking like a box-maker (albeit one that makes really pretty boxes).

Rene Stein

I think you may have a few things backwards in this article. First, the problems from launching MobileMe won’t roll over to the iTunes store or the on-line Apple store. Most likely the experience from running these two highly successful and much more reliable web services will make its way to the MobileMe services. Second, you make it sound like Apple as a whole is new at this game and that the iTunes store is some small, problematic but important service, instead of the largest music distributor in America. Apple has a lot of experience in handling heavy web service loads.

Each one of those issues pointed out sounds more like trying to rush an unready product to market, which is what the e-mail Jobs sent out is about.
