Restore indexed formulation of clone_from_slice #31000
Conversation
For good codegen here, we need a lock-step iteration where the loop bound is only checked once per iteration; `.zip()` unfortunately does not optimize this way. If we use a counted loop, and make sure that LLVM sees that the bounds-check condition is the same as the loop-bound condition, the bounds checks are optimized out. For this reason we need to slice `from` (apparently) redundantly. This commit restores the old formulation of `clone_from_slice`. In this shape, `clone_from_slice` will again optimize into calling memcpy where possible (for example for `&[u8]` or `&[i32]`).
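The counted-loop shape described above can be sketched roughly as follows. This is a simplified illustration, not the actual std source, and the function name `clone_from_slice_indexed` is invented for the example:

```rust
// Sketch of the indexed formulation (simplified; not the actual std code).
// Reslicing `src` down to `dst.len()` up front lets LLVM see that the
// per-iteration bounds checks coincide with the loop-bound check, so they
// can be elided and the loop recognized as a memcpy for types like u8.
fn clone_from_slice_indexed<T: Clone>(dst: &mut [T], src: &[T]) {
    assert_eq!(dst.len(), src.len(), "source slice length mismatch");
    // The (apparently) redundant reslice is what makes the optimization fire.
    let src = &src[..dst.len()];
    for i in 0..dst.len() {
        dst[i] = src[i].clone();
    }
}

fn main() {
    let src = [1u8, 2, 3, 4];
    let mut dst = [0u8; 4];
    clone_from_slice_indexed(&mut dst, &src);
    assert_eq!(dst, src);
}
```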
r? @brson (rust_highfive has picked a reviewer for you, use r? to override)
Fixup of #30943. I used these testcases to verify that it produces the same code as the old `clone_from_slice` formulation: https://play.rust-lang.org/?gist=568728397b448713ce14&version=nightly — the optimizer is clearly very picky here; the reslicing of …
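For contrast, the `.zip()` formulation that this PR reverts looks roughly like the following sketch (the helper name is invented for illustration). It computes the same result, but per this PR, the optimizer at the time failed to fold the iterator bookkeeping into a single loop-bound check, which blocked the memcpy loop-idiom transform:

```rust
// Sketch of the zip-based formulation being reverted (name is illustrative).
// Functionally identical to the indexed version, but per this PR the
// 2016-era LLVM did not optimize the lock-step iteration into a memcpy.
fn clone_from_slice_zip<T: Clone>(dst: &mut [T], src: &[T]) {
    assert_eq!(dst.len(), src.len(), "source slice length mismatch");
    for (d, s) in dst.iter_mut().zip(src.iter()) {
        *d = s.clone();
    }
}

fn main() {
    let src = [10i32, 20, 30];
    let mut dst = [0i32; 3];
    clone_from_slice_zip(&mut dst, &src);
    assert_eq!(dst, src);
}
```

Note that on modern compilers both shapes typically optimize equally well; the distinction mattered for the LLVM version current when this PR was filed.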
I guess all we need is a way to ignore tests in non-opt builds, then.

(In reply to Alex Crichton, 2016-01-18 18:45 GMT+01:00)
@dotdash I can't force "opt-level = 3" in …

```rust
// compile-flags: -C no-prepopulate-passes
// Make sure that simple cases of dst.clone_from_slice(src) produce memcpy
// (relies on LLVM's loop idiom recognition).
#![crate_type = "lib"]

// CHECK-LABEL: @copy_slice
#[no_mangle]
pub fn copy_slice(src: &[u8], dst: &mut [u8]) {
    // CHECK: memcpy
    dst.clone_from_slice(src);
}
```
@bors: p=1