Function scope

pub fn scope<'scope, OP, R>(op: OP) -> R
where
    OP: FnOnce(&Scope<'scope>) -> R + Send,
    R: Send,

Creates a “fork-join” scope s and invokes the closure with a reference to s. This closure can then spawn asynchronous tasks into s. Those tasks may run asynchronously with respect to the closure; they may themselves spawn additional tasks into s. When the closure returns, scope() blocks until all tasks that have been spawned into s have completed.

scope() is a more flexible building block compared to join(), since a loop can be used to spawn any number of tasks without recursing. However, that flexibility comes at a performance price: tasks spawned using scope() must be allocated onto the heap, whereas join() can make exclusive use of the stack. Prefer join() (or, even better, parallel iterators) where possible.

§Example

The Rayon join() function launches two closures and waits for them to stop. One could implement join() using a scope like so, although it would be less efficient than the real implementation:

pub fn join<A,B,RA,RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where A: FnOnce() -> RA + Send,
          B: FnOnce() -> RB + Send,
          RA: Send,
          RB: Send,
{
    let mut result_a: Option<RA> = None;
    let mut result_b: Option<RB> = None;
    rayon::scope(|s| {
        s.spawn(|_| result_a = Some(oper_a()));
        s.spawn(|_| result_b = Some(oper_b()));
    });
    (result_a.unwrap(), result_b.unwrap())
}

§A note on threading

The closure given to scope() executes in the Rayon thread-pool, as do those given to spawn(). As a result, thread-local variables are unreliable inside these closures: they remain accessible, but they may hold unexpected values.

§Task execution

Task execution potentially starts as soon as spawn() is called. The task will end sometime before scope() returns. Note that the closure given to scope may return much earlier. In general the lifetime of a scope created like scope(body) goes something like this:

Scope begins when scope(body) is called
Scope body body() is invoked
  - Scope tasks may be spawned
Scope body returns
Scope tasks execute, possibly spawning more tasks
Once all tasks are done, scope ends and scope() returns

To see how and when tasks are joined, consider this example:

// point start
rayon::scope(|s| {
    s.spawn(|s| { // task s.1
        s.spawn(|s| { // task s.1.1
            rayon::scope(|t| {
                t.spawn(|_| ()); // task t.1
                t.spawn(|_| ()); // task t.2
            });
        });
    });
    s.spawn(|s| { // task s.2
    });
    // point mid
});
// point end

The various tasks that are run will execute roughly like so:

| (start)
|
| (scope `s` created)
+-----------------------------------------------+ (task s.2)
+-------+ (task s.1)                            |
|       |                                       |
|       +---+ (task s.1.1)                      |
|       |   |                                   |
|       |   | (scope `t` created)               |
|       |   +----------------+ (task t.2)       |
|       |   +---+ (task t.1) |                  |
| (mid) |   |   |            |                  |
:       |   + <-+------------+ (scope `t` ends) |
:       |   |                                   |
|<------+---+-----------------------------------+ (scope `s` ends)
|
| (end)

The point here is that everything spawned into scope s will terminate (at the latest) at the same point – right before the original call to rayon::scope returns. This includes new subtasks created by other subtasks (e.g., task s.1.1). If a new scope is created (such as t), the things spawned into that scope will be joined before that scope returns, which in turn occurs before the creating task (task s.1.1 in this case) finishes.

There is no guaranteed order of execution for spawns in a scope, given that other threads may steal tasks at any time. However, they are generally prioritized in a LIFO order on the thread from which they were spawned. So in this example, absent any stealing, we can expect s.2 to execute before s.1, and t.2 before t.1. Other threads always steal from the other end of the deque, which amounts to FIFO order. The idea is that “recent” tasks are most likely to be fresh in the local CPU’s cache, while other threads can steal older “stale” tasks. For an alternate approach, consider scope_fifo() instead.

§Accessing stack data

In general, spawned tasks may access stack data in place, provided that the data outlives the scope itself. Other data must be fully owned by the spawned task.

let ok: Vec<i32> = vec![1, 2, 3];
rayon::scope(|s| {
    let bad: Vec<i32> = vec![4, 5, 6];
    s.spawn(|_| {
        // We can access `ok` because it outlives the scope `s`.
        println!("ok: {:?}", ok);

        // If we just try to use `bad` here, the closure will borrow `bad`
        // (because we are just printing it out, and that only requires a
        // borrow), which will result in a compilation error. Read on
        // for options.
        // println!("bad: {:?}", bad);
    });
});

As the comments in the example above suggest, to reference bad we must take ownership of it. One way to do this is to detach the closure from the surrounding stack frame, using the move keyword. This will cause it to take ownership of all the variables it touches, in this case including both ok and bad:

let ok: Vec<i32> = vec![1, 2, 3];
rayon::scope(|s| {
    let bad: Vec<i32> = vec![4, 5, 6];
    s.spawn(move |_| {
        println!("ok: {:?}", ok);
        println!("bad: {:?}", bad);
    });

    // That closure is fine, but now we can't use `ok` anywhere else,
    // since it is owned by the previous task:
    // s.spawn(|_| println!("ok: {:?}", ok));
});

While this works, it could be a problem if we want to use ok elsewhere. There are two choices. We can keep the closure as a move closure, but instead of referencing the variable ok, we create a shadowed variable that is a borrow of ok and capture that:

let ok: Vec<i32> = vec![1, 2, 3];
rayon::scope(|s| {
    let bad: Vec<i32> = vec![4, 5, 6];
    let ok: &Vec<i32> = &ok; // shadow the original `ok`
    s.spawn(move |_| {
        println!("ok: {:?}", ok); // captures the shadowed version
        println!("bad: {:?}", bad);
    });

    // Now we too can use the shadowed `ok`, since `&Vec<i32>` references
    // can be shared freely. Note that we need a `move` closure here though,
    // because otherwise we'd be trying to borrow the shadowed `ok`,
    // and that doesn't outlive `scope`.
    s.spawn(move |_| println!("ok: {:?}", ok));
});

Another option is not to use the move keyword but instead to take ownership of individual variables:

let ok: Vec<i32> = vec![1, 2, 3];
rayon::scope(|s| {
    let bad: Vec<i32> = vec![4, 5, 6];
    s.spawn(|_| {
        // Transfer ownership of `bad` into a local variable (also named `bad`).
        // This will force the closure to take ownership of `bad` from the environment.
        let bad = bad;
        println!("ok: {:?}", ok); // `ok` is only borrowed.
        println!("bad: {:?}", bad); // refers to our local variable, above.
    });

    s.spawn(|_| println!("ok: {:?}", ok)); // we too can borrow `ok`
});

§Panics

If a panic occurs, either in the closure given to scope() or in any of the spawned jobs, that panic will be propagated and the call to scope() will panic. If multiple panics occur, it is non-deterministic which of their panic values will propagate. Regardless, once a task is spawned using scope.spawn(), it will execute, even if the spawning task should later panic. scope() returns once all spawned jobs have completed, and any panics are propagated at that point.