-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU4X struct construction with compiled_data
has a run-time cost compared accessing static POD
#5187
Comments
complied_data
has a run-time cost compared accessing static PODcompiled_data
has a run-time cost compared accessing static POD
The relevant struct whose construction is being deferred is /// An adapter between a Unicode back end an the `idna` crate.
pub struct Adapter {
mapper: Uts46Mapper,
canonical_combining_class: CanonicalCombiningClassMap,
general_category: CodePointMapDataBorrowed<'static, GeneralCategory>,
bidi_class: CodePointMapDataBorrowed<'static, icu_properties::BidiClass>,
joining_type: CodePointMapDataBorrowed<'static, icu_properties::JoiningType>,
} The first two use DataPayload, and the last three use |
Putting in the 2.0 milestone in case we want to consider doing a comprehensive treatment of all ICU4X types that could have fully borrowed constructors. |
Locale-sensitivity is not a requirement to have a borrowed constructor, only for that borrowed constructor to use a singleton instead of having to go through a |
Yes, "has singleton data marker" is more precise than "not locale sensitive". |
No it's the same, but it's not a requirement. |
I started making borrowed versions of the normalizer structs. I pushed my non-compiling WIP to https://github.com/hsivonen/icu4x/tree/borrowednormalizer in case anyone wants to take a look while I'm away. I got stuck trying to |
I'd like to discuss if we want this to block 2.0 or not. I think it can be safely done after 2.0 but might require deprecating functions. |
I think it shouldn't block 2.0 unless we can come up with a way to do this by tweaking the DataPayload architecture somehow. I do think we should try thinking about that: for example we could totally have a feature gated thing that swaps DataPayload with an So let's block 2.0 on ideation and picking a solution, but if the solution can be done without a breaking change we leave it be. |
If we give non-singleton baked data a way to return |
Yep! |
That seems like a much bigger deal than making |
pub struct ComposingNormalizer {
decomposing_normalizer: DecomposingNormalizer,
canonical_compositions: DataPayload<CanonicalCompositionsV1Marker>,
}
pub struct DecomposingNormalizer {
decompositions: DataPayload<CanonicalDecompositionDataV1Marker>,
supplementary_decompositions: Option<SupplementPayloadHolder>,
tables: DataPayload<CanonicalDecompositionTablesV1Marker>,
supplementary_tables: Option<DataPayload<CompatibilityDecompositionTablesV1Marker>>,
decomposition_passthrough_bound: u8, // never above 0xC0
composition_passthrough_bound: u16, // never above 0x0300
}
enum SupplementPayloadHolder {
Compatibility(DataPayload<CompatibilityDecompositionSupplementV1Marker>),
Uts46(DataPayload<Uts46DecompositionSupplementV1Marker>),
} What you need is pub struct ComposingNormalizerBorrowed<'a> {
decomposing_normalizer: DecomposingNormalizerBorrowed<'a>,
canonical_compositions: &'a CanonicalCompositionsV1<'a>,
}
pub struct DecomposingNormalizerBorrowed<'a> {
decompositions: &'a CanonicalDecompositionDataV1<'a>,
supplementary_decompositions: Option<SupplementPayloadHolderBorrowed<'a>>,
tables: &'a CanonicalDecompositionTablesV1<'a>>,
supplementary_tables: Option<&'a CompatibilityDecompositionTablesV1<'a>>,
decomposition_passthrough_bound: u8, // never above 0xC0
composition_passthrough_bound: u16, // never above 0x0300
}
enum SupplementPayloadHolderBorrowed<'a> {
Compatibility(&'a CompatibilityDecompositionSupplementV1<'a>),
Uts46(&'a Uts46DecompositionSupplementV1<'a>),
} |
Normalizer PR in #5413 along the lines of the previous comment here. |
@hsivonen Is this done? |
This is done for normalizer and collator. This is not done for segmenter (#5514). Not sure about other components. |
Steps to reproduce
Clone the repos https://github.com/hsivonen/rust-url , https://github.com/hsivonen/idna_adapter , and https://github.com/hsivonen/idna_mapping next to each other.
Check out the branch
icu4x-baked-cost
foridna_mapping
.Run
cargo bench to_ascii_cow_plain
in theidna
directory ofrust-url
with the four cases consisting of:rust-url
with branchesicu4x-baked-cost-eager
andicu4x-baked-cost-deferred
idna_adapter
with branchesicu4x-baked-cost-unicode-rs
andicu4x-baked-cost-icu4x
Actual results
When the
idna_adapter
branch is the unicode-rs version, eager vs. deferred branch ofrust-url
doesn't matter.When the
idna_adapter
branch is the icu4x version, eager vs. deferred branch ofrust-url
matters: deferred is faster.Discussion
This shows that eagerly instantiating
compiled_data
ICU4X objects via functions that areconst
still has a run-time cost that shows up in the benchmark that stays on an ASCII fast path that doesn't actually use the ICU4X objects. The unicode-rs case avoids this problem by there being no pointer chasing of static data until actual use, so the case that doesn't need any Unicode data doesn't pay any cost of the static data chasing.This shows that users of ICU4X that don't need dynamic data loading still pay a cost for ICU4X having (even compiled out) dynamic data loading support compared to an imaginable case where ICU4X data was laid out as static POD the way unicode-rs data is.
Trying to defer ICU4X struct instantiation until first use adds code complexity to do perfectly: with loops and branches, this would end up involving having to track whether the instantiation is done and paying the cost of branching on that bookkeeping. Putting the ICU4X structs, which are
const
-constructible, instatic
doesn't work, because whether the ICU4X structs areSend
andSync
depends on options onicu_provider
, so thebaked_data
case can't count on the structs beingSend
andSync
without foiling the design option that allows them not to be.Ideally, a code structure that
const
constructscompiled_data
-flavored ICU4X structs and then doesn't use them would not have a measurable run-time performance penalty.The text was updated successfully, but these errors were encountered: