Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoiding reallocations when doing a cartesian product #754

Open
lovesegfault opened this issue Apr 21, 2020 · 1 comment
Open

Avoiding reallocations when doing a cartesian product #754

lovesegfault opened this issue Apr 21, 2020 · 1 comment

Comments

@lovesegfault
Copy link

I have some code that looks like this:

playground

use rayon::prelude::*;

#[derive(Debug)]
struct BigStruct {
    a: u128,
    b: u128,
    c: u128,
    d: u128,
    e: u128
}

impl BigStruct {
    fn new(x: u64, y: u64) -> Self {
        BigStruct {
            a: x.into(),
            b: x.into(),
            c: x.into(),
            d: y.into(),
            e: y.into(),
        }
    }
}



fn main() {
    const X: u64 = 5;
    const Y: u64 = 5;
    let pairs: Vec<BigStruct> = (0..X)
        .into_par_iter()
        .flat_map(|x| (0..Y).into_par_iter().map(move |y| (x, y)))
        .map(|(x, y)| BigStruct::new(x, y))
        .collect();
    
    pairs.iter().for_each(|b| println!("{:?}", b));
}

When I went to profile it, with a very large X and Y I found that most of my execution time was spent on malloc. The natural next step, then, was to try and preallocate the Vec to have the correct size, as this at least means no calls to realloc. I learned about collect_into_vec() and decided to try and use it:

playground

use rayon::prelude::*;

#[derive(Debug)]
struct BigStruct {
    a: u128,
    b: u128,
    c: u128,
    d: u128,
    e: u128,
}

impl BigStruct {
    fn new(x: u64, y: u64) -> Self {
        BigStruct {
            a: x.into(),
            b: x.into(),
            c: x.into(),
            d: y.into(),
            e: y.into(),
        }
    }
}

fn main() {
    const X: u64 = 5;
    const Y: u64 = 5;
    let mut pairs: Vec<BigStruct> = Vec::with_capacity((X * Y) as usize);
    (0..X)
        .into_par_iter()
        .flat_map(|x| (0..Y).into_par_iter().map(move |y| (x, y)))
        .map(|(x, y)| BigStruct::new(x, y))
        .collect_into_vec(&mut pairs);

    pairs.iter().for_each(|b| println!("{:?}", b));
}

This, however, fails due to there not being collect_into_vec() for ParallelIterator, and the call to flat_map() I think converts the IndexedParallelIterator into a plain ParallelIterator.

error[E0599]: no method named `collect_into_vec` found for struct `rayon::iter::map::Map<rayon::iter::flat_map::FlatMap<rayon::range::Iter<u64>, [closure@src/main.rs:32:19: 32:66]>, [closure@src/main.rs:33:14: 33:43]>` in the current scope
  --> src/main.rs:34:10
   |
34 |         .collect_into_vec(&mut pairs);
   |          ^^^^^^^^^^^^^^^^ method not found in `rayon::iter::map::Map<rayon::iter::flat_map::FlatMap<rayon::range::Iter<u64>, [closure@src/main.rs:32:19: 32:66]>, [closure@src/main.rs:33:14: 33:43]>`
   |
   = note: the method `collect_into_vec` exists but the following trait bounds were not satisfied:
           `rayon::iter::map::Map<rayon::iter::flat_map::FlatMap<rayon::range::Iter<u64>, [closure@src/main.rs:32:19: 32:66]>, [closure@src/main.rs:33:14: 33:43]> : rayon::iter::IndexedParallelIterator`

I'm kind of at a loss with how to correctly create an iterator over the cartesian product of two ranges and then collect it into a preallocated vector. The only solution I can see is to implement cartesian() for IndexedParallelIterator so that I can retain the indices instead of losing them in the flat_map(). This, of course, would have to be done inside rayon, but I am hoping there are other solutions.

What is the right way to get an IndexedParallelIterator that's the cartesian product of two IndexedParallelIterators?

@cuviper
Copy link
Member

cuviper commented Apr 21, 2020

the call to flat_map() I think converts the IndexedParallelIterator into a plain ParallelIterator.

Yes, since flat_map only has to return some IntoParallelIterator type, it doesn't have to be indexed, so we don't have any idea how to offset each parallel flat_map invocation. Actually, even if we did have a variant that required the inner part to be IndexedParallelIterator, we still wouldn't know each of their lengths ahead of time -- there's nothing to say they even have to be the same!

The only solution I can see is to implement cartesian() for IndexedParallelIterator so that I can retain the indices instead of losing them in the flat_map(). This, of course, would have to be done inside rayon, but I am hoping there are other solutions.

I think that's possible, and it doesn't have to be part of rayon if you do it as your own extension trait. We made iter::plumbing public exactly to allow third party iterators. It won't be trivial to handle splits at possible odd boundaries, but not insurmountable.

If we do add something like this to rayon, it would be nice to model it after Itertools::cartesian_product for feature parity.

PS - I hope you have more than simple initialization to do in parallel -- I expect once we solve pre-allocation, your example would quickly run into a barrier of memory bandwidth when you try to scale up. It might not be much better than serial code.

@cuviper cuviper changed the title Avoinding reallocations when doing a cartesian product Avoiding reallocations when doing a cartesian product Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants