At the beginning of the decade, Intel was imagining that by 2010 it would have processors with over 1 billion transistors running at a clock speed of 20GHz. As we move into the second half of 2009, the reality is that we will soon have 3GHz mobile chips with four cores, and 2010 will likely see 4GHz desktop chips with six and eight cores. Ultra-fast processors running at clock speeds over 4GHz have simply been too expensive to power, and to cool. So the other solution is to have more processors.
These new processors, based on the Nehalem architecture, or the Westmere 32nm process that will follow, will also feature simultaneous multithreading (what Intel calls “hyperthreading”) to allow for two threads to be executed on a single core. So instead of a superfast 20GHz chip, you could have a Mac Pro in 2010 with 16 cores capable of executing 32 simultaneous threads. Apple is preparing for this massively multi-core future with features in Snow Leopard (Mac OS X 10.6) that can take full advantage of all this raw power.
Time for Software to Catch Up
The introduction of dual-core systems was a brilliant success. Even code that was not built or optimized for multiple cores would run at least a little faster, because the OS would use the second core, leaving more processing power available to the foreground app. The dual-core systems were noticeably more responsive to the user, and we loved them for this feeling of instant power at our fingertips. Quad-core and 8-core systems have been confined to the Mac Pro line, partly for cost, but also partly because the average user does not have software that can really take advantage of all those cores. Many people would be disappointed to learn that their quad-core iMac did not really seem any faster. Software applications like Photoshop and Final Cut Pro were designed by teams with significant engineering resources so that they could take advantage of a system with four or eight cores, and even then certain operations still bottleneck the application.
What Have We Been Doing?
Multithreaded programming is not new on the Mac. We have had POSIX threads (pthreads), with NSThread layered on top of them, since OS X 10.0. The Mac's scheduler is multi-processor aware and can assign processes and threads to available CPUs as needed.
There are two ways that software benefits from concurrency (running multiple software tasks simultaneously). The first is to keep certain parts of the software, say the user interface for a financial management app, responsive while waiting for another task that is being processed, say downloading some stock quote data from the Internet in the background. The second opportunity is to design a function that can be parallelized, or split up into smaller chunks — like encoding a video by splitting it into sections that can each be encoded by a different CPU or core. The responsiveness of the app and the performance of a parallelized function are great for the end user. However, each developer is still responsible for managing the threads in the application and for designing algorithms and functions for atomicity, parallelization and re-entrancy, all while avoiding deadlocks, resource starvation, deadly embrace, and so on. Concurrency is a challenging endeavor.
So Why Do We Need Grand Central Dispatch?
Grand Central Dispatch (PDF) is a new technology that will be available in Snow Leopard that helps developers more easily write software for multi-core systems. It does not make multi-threading automatic, or write thread-safe code for you, but it does add semantic and syntactic extensions to C, C++, and Objective-C to make code more readable and better organized, with hooks into tools to analyze the multi-threaded performance of an application. Developers still have to do the hard conceptual work of figuring out concurrency in their application, but the implementation of those ideas is cleaner.
How Does Grand Central Dispatch Work?
The core functionality of Grand Central Dispatch is provided by organizing code into blocks and queues. A block is a self-contained unit of work that can represent anything from a simple step to a complex function with all the associated arguments and data. Queues are a method to schedule the execution of blocks and define the relationships between them. Instead of spawning and managing threads in the application, the developer marks sections of code as blocks and then places them in a queue. GCD steps in and manages all the queues and pulls blocks out and assigns them to available threads of the appropriate priority to be executed.
The Instruments utility in Xcode lets developers see how their code runs in GCD, so that they can learn how to improve performance. GCD also has a view of the entire system and the resources available to try to maximize efficiency across all running applications. It also relies on native hardware support for locking in Intel CPUs to implement some of its magic. This stuff won’t work on PowerPC, which is another reason why Snow Leopard is Intel-only.
What About the Users?
If you are an end-user, you will not benefit one bit from installing Snow Leopard and having GCD available unless the software you use is written to take advantage of it. It is not at all guaranteed that developers will jump to GCD. If a certain application would benefit from concurrency, then the developer has probably already started using pthreads to make the software more responsive and take advantage of current multi-core systems. If you look in your Activity Monitor, you will see that most applications have multiple threads already. Since multi-threaded code is hard to begin with, I do not see many projects choosing to rewrite all their pthread code to use GCD blocks and queues right away, especially since it means leaving all Leopard, Tiger and earlier users behind.
The low price on the Snow Leopard upgrade is a nice perk for existing Leopard users, but I think it is also meant to reassure developers that a very large percentage of their existing Leopard customer base will be able to run Snow Leopard-only software. If your app runs fine now using NSThread on Leopard, there is little reason to adopt GCD. So why build GCD at all?
I think the most obvious reason for building GCD is that Apple will be able to take advantage of it with all the system processes and included applications that people use all the time. Looking at my Activity Monitor right now, I see the kernel has 66 threads, Safari has 19, Mail.app has 18, mds has 16, SystemUIServer has 13, Spotlight has 6, and so on. Helping all those threads run more efficiently is going to pay off for the user experience on the Mac. If it helps a few other developers along the way, all the better.
Things get a little more interesting when you consider that future iPhones will likely have multi-core CPUs and that Intel is advising developers to prepare for a future with “thousands of cores” available. Add in something like Larrabee, which presents dozens of additional cores to the system, and the wisdom of a systemwide approach to managing threads becomes apparent.