
Summary:

The possibility that Amazon’s SimpleDB might be based on Erlang — a 20-year-old language that some programmers find weird — was nonetheless met with excitement in the programming world. Erlang may not be new, but it could offer one new way that concurrent programming can be done.

Geeks everywhere got excited recently when they heard that SimpleDB might be based on Erlang. Why? Is Erlang the next big thing? Probably not — it’s a 20-year-old language that some programmers find weird.

But the model Erlang offers for parallel programming — distributing computing instructions across multiple processors such as are found in multicore computers or clouds of servers linked together — does represent something radically new to many programmers, even though it’s two decades old.

Parallel programming too difficult and buggy?

Programming for multiple processing units — also known as concurrent programming or parallel programming — is not easy, especially with programming languages designed mainly for situations in which instructions are executed one at a time, in a predictable order.

Languages like C++ and Java work mainly by using sequential processing. They have the capability to run multiple pathways of execution at the same time — allowing computer instructions to run in parallel — but in order to do so, they usually use something called “shared memory.”

With shared memory, different paths of execution (called “threads”) can access the same data at different times. Your program needs to control access to that data and ensure that, no matter what order the instructions arrive in, the data is in the state you want it to be.

Edward Lee, a professor in UC Berkeley’s EE/CS department, calls threads harmful, citing the indeterminate results you get with multiple threads reading and writing shared data in parallel. Indeed, programs using shared memory can be subject to unpredictable bugs.
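To make the shared-memory problem concrete, here is a minimal Python sketch (not taken from any of the systems discussed here). The names `SharedCounter` and `run_workers` are hypothetical, chosen only for illustration. Several threads update one counter, and a lock is what keeps their read-modify-write steps from interleaving.

```python
import threading


class SharedCounter:
    """A counter shared by several threads; the lock serializes updates."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write must happen as one step. Without the lock,
        # two threads could read the same old value and one update could
        # be lost.
        with self._lock:
            self.value += 1


def run_workers(num_threads=4, increments_per_thread=10_000):
    """Start the threads, wait for them all, and return the final count."""
    counter = SharedCounter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())
```

With the lock in place the total always equals the number of threads times the increments per thread. The burden is on the programmer to remember the lock everywhere the shared data is touched, which is the kind of discipline the paragraph above describes.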

Erlang dispenses with shared memory

Erlang was created by Ericsson in 1987 to use in the development of fault-tolerant telecom applications running across many processors. It approaches the problem of program parallelization differently than do C++ or Java or other so-called sequential languages: It uses very lightweight processes that don’t share memory but rather pass messages asynchronously.

Asynchronous message passing, as opposed to shared memory usage, avoids the data races that shared memory parallel programs are prone to and greatly reduces the risk of lock-related deadlock, though processes that wait on each other’s messages can still get stuck. Making an effective and efficient concurrent program still requires plenty of work on the part of a developer, and Erlang’s approach doesn’t work for every situation.
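The message-passing style can be sketched in Python as well, with the caveat that this is only an approximation: a thread plus a `queue.Queue` stands in for an Erlang process and its mailbox, and Python threads do still share memory, so the isolation here is by convention rather than enforced by the runtime. The names `counter_process` and `demo` and the message shapes are assumptions made up for this example.

```python
import queue
import threading


def counter_process(mailbox):
    """Owns its own total; the only way to touch it is to send a message."""
    total = 0
    while True:
        message = mailbox.get()      # block until a message arrives
        kind = message[0]
        if kind == "add":
            total += message[1]
        elif kind == "get":
            message[1].put(total)    # reply on the sender's own mailbox
        elif kind == "stop":
            break


def demo():
    mailbox = queue.Queue()
    worker = threading.Thread(target=counter_process, args=(mailbox,))
    worker.start()

    # Sending is asynchronous: put() on an unbounded queue returns at once,
    # without waiting for the receiver to handle the message.
    for n in range(1, 6):
        mailbox.put(("add", n))

    # To read the state, ask for it and wait for the reply message.
    reply_box = queue.Queue()
    mailbox.put(("get", reply_box))
    result = reply_box.get()

    mailbox.put(("stop",))
    worker.join()
    return result


if __name__ == "__main__":
    print(demo())
```

No lock appears anywhere in the caller's code because only `counter_process` ever reads or writes `total`. Messages from a single sender are handled in the order they were sent, so the reply reflects all five additions.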

Erlang too old and too weird?

As Sun Distinguished Engineer and Director of Web Technologies Tim Bray has said of Erlang, “It’s too weird, and in my brief experiments, the implementation shows its age; we have in fact learned some things about software since way back then.” Another engineer I talked to who’s used Erlang chose the same adjective, telling me, “The problem with Erlang is it’s just too weird. Too many programmers hate weird.”

This fall, Bray launched his Wide Finder project, a programming exercise intended to see how Erlang could parallelize the processing of a large text file. I asked him if his opinion had changed since getting to know Erlang better by way of that project. He said it largely remains the same, but expressed a wish for Erlang’s concurrency model — lightweight threads that use message passing instead of shared memory — to be incorporated into his favorite modern languages, such as Java and Ruby.

Erlang may not be entirely suited to general purpose parallel programming projects, even for SimpleDB maker Amazon. Although Amazon wouldn’t comment on the technologies they use in SimpleDB, a source close to the company told me that while Erlang was indeed used in the SDS project (which became SimpleDB), the engineers implemented their own inter-process communication instead of using Erlang’s built-in distributed communications capability.

But even if Erlang isn’t the next big language, it could serve as a concurrent programming game-changer (as could other languages that offer similar concurrent programming support). By suggesting an alternative to shared memory concurrent programming, it can teach programmers one new way that concurrent programming can be done — even if it’s two decades old.

  1. good analysis of erlang. note that the best jabber/xmpp server out there is in erlang.

    but why does the user of simpledb give a hoot in what language the amazons programmed the darned thing? if it works for me, for all i care they could have programmed it in fortran-ii.

  2. Re: ejabberd, they had a serious memory usage problem a while back. The devs used strings (Which are lists of ints in Erlang) internally to pipe data around. This inefficiency killed them. They switched to binaries – a special data type that holds raw binary data – and improved their memory usage.

    Erlang binaries are neato – you can do bit-wise pattern matching using them, which makes the implementation of, say, an IPv4 stack in Erlang significantly easier.

    If I was doing a network course, I’d have my students implement a complete IPv6 stack in Erlang.

  3. I am not sure what “the language shows its age” means. We use it in our product and we developed a renowned XMPP server much quicker than our competitor.

    And this forgets that the virtual machine is getting new modern features every year. The Erlang VM is comparable to the Java one in terms of modern features (SMP, garbage collection algorithm) and offers features that even Java does not have (native compilation).

    The other language environment (Ruby, Python, Perl) are years away of what Erlang does offer.

  4. Randy: true, it doesn’t much matter to the user what SimpleDB is implemented in.

  5. Erlang’s late entry is best explained by Joe Armstrong’s threads :
    “Who cares if Erlang starts slowly – it was designed to start once and never stop – we have systems that have run for 5 years – a two seconds start-up time amortized over 5 years is not too bad. “

    Keep Clicking,
    Bosky

    PS : btw, we’re using erlang as well…

  6. I’d have to say that Erlang is only ‘weird’ if you aren’t familiar with functional programming languages. The ‘weirdness’ that you may be talking about is probably the immutable state of variables. The trade off is you lose most if not all side effects when you lose mutability. This can be a tremendously Good Thing when trying to debug because your code becomes extremely easy to write tests for. In any case, Erlang is a great way to become familiar with functional programming if nothing else. BTW for the ruby lovers at rubyconf this year Matz (the creator of ruby) said the language he is most interested in right now (as far as stealing functionality from) is Erlang.

  7. @Mickael: I think what Tim meant by the language shows its age is that it’s lacking some capabilities he finds indispensable for his purposes, such as regular expression support. Depending on what you’re trying to do, it might or might not look old-fashioned.

    @wavell: I’d agree that the root of Erlang’s “weirdness” is that it’s a functional language — immutable variables and use of tail recursion might look strange if you’re coming from C or Java, for example, even if you learned about such things in a CS class.

    And yes, Erlang seems like a good way to learn not just about a different way of concurrent programming but also about functional programming (which itself is the basis for that different way of achieving concurrency).

    Very interesting that Matz is most interested in Erlang.

  8. The other language environment (Ruby, Python, Perl)
    are years away of what Erlang does offer.

    Who cares. Horses for courses. Ruby / Python / Perl were not designed with the same end goals as Erlang.

    I wonder if this posting about Erlang on GigaOm was really relevant?

  9. @Eddie: Relevant to whom? If you already knew about Erlang and what it offered and why people get excited about it — maybe not. But not everyone who reads tech blogs is a programmer who stays up on the latest programming language buzz.

  10. When I need to do multithreading, it’s a problem. Then an academic computer scientist says I have to learn an obscure new programming language to take advantage of it. Now I have two problems.

  11. Erlang is probably the least academic language in existence. In fact, it is the only language that was created with the purpose of building scalable, fault tolerant systems in mind. Whichever language you’re using now is probably far more academic than Erlang.

  12. Erlang works extremely well for us at Mochi Media, MochiAds has been using it from the beginning and MochiBot was ported to Erlang because it was so much easier to maintain and scale than the original Python implementation.

    The wide finder problem is kind of a red herring, Erlang works best with network I/O. It was designed for telecom, not batch processing from disk. There are of course efficient (and obtuse) ways to do disk I/O with Erlang, but the standard library doesn’t do a very good job of it. On the flip side if you take Python (or Java or Ruby or damn near anything else) and try and make it do networking efficiently out of the box you’ll lose in a similarly spectacular fashion.

    If any of you are dying to learn Erlang, we’re hiring ;)

  13. Languages like C++ and Java work mainly by using sequential processing. They have the capability to run multiple pathways of execution at the same time — allowing computer instructions to run in parallel — but in order to do so, they usually use something called “shared memory.”

    I take it you never heard MPI. It is only been taught in every super computer course since the late 90’s.

    http://en.wikipedia.org/wiki/Message_Passing_Interface

  14. Microsoft Research has an extension of C# called Sing# that uses message passing as their core concurrency technique. They used it to implement their research OS called Singularity. I can’t think of anything in Erlang that can’t be duplicated by a library in Java/.NET. The message passing part is easy. The reliability “features” fall out from their design techniques. I couldn’t find anything in the VM that was superior to JVM/.NET.

    Check it out: http://research.microsoft.com/os/singularity/

  15. @Jonathan Allen: I didn’t say C++ and Java always use shared memory for concurrency, I said usually.

  16. [...] Erlang: A New Way to Program That’s 20 Years Old Geeks everywhere got excited recently when they heard that SimpleDB might be based on Erlang. Why? Is Erlang the next […] [...]

  17. Jon: “I take it you never heard MPI. It is only been taught in every super computer course since the late 90’s.”

    I take it you are not a native English speaker. That’s alright. In any event, I am a graduate student at a top 10 CS program. (I will assume it is alright to consider this a “super computer course”.) We cover how to parallelize programs in the context of algorithms, but we don’t necessarily use, or even teach, MPI per se. I like OCaml, and FP in general, but Erlang offers a new model for concurrency that has been battle tested. The latter is key!

  18. [...] Meanwhile, Ruby on Rails doesn’t seem quite so hot this year as it was last January, Scala’s getting some laughs, and people have been wondering why Erlang’s so buzzy. [...]

  19. bad_article_ Monday, March 3, 2008

    “…He said it largely remains the same, but expressed a wish for Erlang’s concurrency model — lightweight threads that use message passing instead of shared memory — to be incorporated into his favorite modern languages, such as Java and Ruby…”

    So has he ever written code in a functional programming language, or just imperative ones? That would be why he finds it “weird”. This guy is a flake.

  20. [...] are pushing Erlang as a potential solution to parallel programming, while those in the supercomputing industry are [...]

  21. [...] It could be edited, moved or deleted, which is when conflicts arise. It’s well known that concurrent programming is difficult; sync is just an extreme [...]

