Supporting multiple senders with Rust Pipes

Posted: July 12, 2012 in Mozilla

The benchmarks in my last post had one thing in common: all communication was one sender to one receiver. It’s surprising how often this is sufficient, but sooner or later we are going to need a way to have multiple tasks sending to the same receiver. I’ve been experimenting with two ways of supporting many senders with a single receiver, and I now have some results to show.

The pipes library includes a select operation. This lets you listen on several receive endpoints simultaneously. Unfortunately, the single-use nature of endpoints makes select a little clunky to use. To help alleviate this, I added a port_set to the library. Port sets allow you to easily treat several receive endpoints as a unit. This allows send to still be very fast, but receive is a little bit slower due to the overhead of setting up and tearing down the select operation. The current implementation of select is O(n) in the number of endpoints, so this works well for small numbers of tasks, but breaks down as things get bigger.

The other option is to slow down the sending end, using something I call a shared_chan. This is a send endpoint wrapped in an exclusive ARC. Now all the senders have to contend with each other, but the receive side is exactly as cheap as before. For cases where you have a lot of senders that send messages relatively infrequently, this will likely outperform the port_set approach, at least until select is faster.

Both of these are sufficient to run the msgsend benchmark that I talked about at the beginning of all of this. Here are the results, combined with the previous numbers.

Language                    Messages per second   Comparison
Rust port_set               881,578               232.8%
Scala                       378,740               100.0%
Rust port/chan (updated)    227,020               59.9%
Rust shared_chan            173,436               45.8%
Erlang (Bare)               78,670                20.8%
Erlang (OTP)                76,405                20.2%

The most obvious thing is that the port_set version is over twice as fast as Scala, the previous winner. I also re-ran the port/chan version for comparison, and it got a little bit faster. There has been quite a bit of churn in Rust recently, so it’s quite possible that those changes showed up here as better performance.

Writing the port_set version proved the most interesting to me. Relying on select ended up relaxing some of the ordering guarantees. Previously, if Task A sent a message to Task C and then sent a message to Task B, and Task B waited to receive Task A’s message before sending its own message to Task C, we could count on Task C seeing Task A’s message before seeing Task B’s message. With the port_set, this is no longer true, although we still preserve the order of messages sent by a single task. An easy way to work around this, however, was to rely on pipes’ closure-reporting ability. The server could tell when a worker would send no more messages, because it would detect when the worker closed its end of the pipe.

  1. ghexsel says:

    I’m a big fan of Scala, so I’m quite happy to see lots of the same concepts on a language that tackles a different abstraction level. To me, Rust is to C what Scala is to Java.

    Is Scala actually used as a baseline or trend for Rust? Or is it in the eye of this beholder?

    • Eric Holk says:

      I actually haven’t done much with Scala, although I like a lot of what I read about it. Some of the Scala DSLs are pretty cool. As far as whether Scala is a baseline for Rust, I’m not entirely sure what you mean. We want Rust to be useful for safe systems programming, and we have borrowed ideas from many other languages. We also want it to be at least competitive with, and preferably faster than, other languages in this same space. On this particular benchmark, Scala did best the last time I ran it, so the goal was to come up with something that would beat Scala.

  2. anonymous says:

    Are you planning to include distributed concurrency support in Rust like Scala or Erlang where they have a concept of a remote actor?

    • Eric Holk says:

      So far Rust’s concept of concurrency is limited to within a single process. We will need a multiprocess solution at some point, but we’re not sure exactly what approach we’ll take. With single-process concurrency, we know all the tasks are running the same bits and agree on the types. With multiple processes, we can’t rely on this anymore, which means that when receiving a message from another process we’ll have to verify that the message indeed has the type it is supposed to. We’re not yet sure how to do this fast.

      For many of Rust’s purposes, I think it would be better to use existing network protocols such as the various RPC protocols that run over HTTP and let them work out the details.

  3. anonymous says:

    Here are the results on my Win7 laptop with a 4-core Intel i7.

    With Erlang R15B01:

    Eshell V5.9.1 (abort with ^G)
    1> client:runTest(3000000).
    Count is {ok,300000000}
    Test took 2.106 seconds
    Throughput=1424501.4245014247 per sec
    2> client:runTest2(3000000).
    Count is 300000000
    Test took 1.997 seconds
    Throughput=1502253.380070105 per sec

    and with Scala 2.10:

    scala> Client.runTest(3000000)
    Count is 300000000
    Test took 12.379 seconds
    Throughput=242345.90839324665 per sec

    • Eric Holk says:

      Thanks for the comparison! It looks like the Mac version of Erlang is indeed rather slow.

      As another point of comparison, Brian ran a few of the benchmarks on his Linux machine. Here are his results:
