granite-rust/library
bors f421586eed Auto merge of #109216 - martingms:unicode-case-lut-shrink, r=Mark-Simulacrum
Shrink unicode case-mapping LUTs by 24k

I was looking into the binary bloat of a small program using `str::to_lowercase` and `str::to_uppercase`, and noticed that the lookup tables used for case mapping contain a lot of zero bytes. This is because some characters map to up to three other characters when lowercased or uppercased, so the LUTs store a `[char; 3]` for each character. However, the vast majority of characters map to only a single new character; in other words, most of the entries look like `(lowerc, [upperc, '\0', '\0'])`.
This PR introduces a new encoding scheme for these tables.
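
To make the space argument concrete, here is a minimal sketch of one way such a table could be made denser. It is an illustration under stated assumptions, not necessarily the exact encoding used in this PR: the names `MULTI_FLAG`, `COMPACT`, `MULTI`, and `to_upper_compact` are hypothetical, and the tables hold only four hand-picked entries. The idea is to store a single `u32` per entry. If it is a plain code point, it is the one-character mapping; if a flag bit above the valid `char` range is set, the remaining bits index a small side table that holds the rare multi-character mappings.

```rust
// Hypothetical flag bit. Valid `char`s never exceed 0x10FFFF, so bit 31 is
// free to mark "this value is an index into MULTI, not a code point".
const MULTI_FLAG: u32 = 1 << 31;

// Compact main table, sorted by key so it can be binary-searched.
// Each entry is (char, u32) instead of (char, [char; 3]).
const COMPACT: &[(char, u32)] = &[
    ('ß', MULTI_FLAG),     // index 0 in MULTI: "SS"
    ('ä', 'Ä' as u32),     // common case: a single replacement char
    ('é', 'É' as u32),
    ('ﬃ', MULTI_FLAG | 1), // index 1 in MULTI: "FFI"
];

// Side table for the rare characters that expand to two or three chars.
const MULTI: &[[char; 3]] = &[
    ['S', 'S', '\0'],
    ['F', 'F', 'I'],
];

/// Looks up the uppercase mapping of `c` in the sketch tables above.
/// Unused trailing slots are '\0'; characters without an entry map to
/// themselves.
fn to_upper_compact(c: char) -> [char; 3] {
    match COMPACT.binary_search_by(|&(key, _)| key.cmp(&c)) {
        Ok(i) => {
            let v = COMPACT[i].1;
            if v & MULTI_FLAG != 0 {
                // Flag set: the low bits are an index into the side table.
                MULTI[(v & !MULTI_FLAG) as usize]
            } else {
                // Flag clear: the value is the replacement code point.
                [char::from_u32(v).unwrap_or(c), '\0', '\0']
            }
        }
        // No entry: the character has no case mapping in this table.
        Err(_) => [c, '\0', '\0'],
    }
}
```

With this layout, `to_upper_compact('ä')` yields `['Ä', '\0', '\0']` and `to_upper_compact('ß')` yields `['S', 'S', '\0']`, while each main-table entry carries one `u32` of payload instead of three `char`s. The trade-off is an extra branch and an occasional second lookup for the multi-character cases.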

The changes reduce the size of my test binary by about 24K.

I've also done some `#[bench]`marks on Unicode-heavy test data, and found that the performance of both `str::to_lowercase` and `str::to_uppercase` improves by up to 20%. These measurements depend heavily on the character distribution of the data.

Someone else will have to decide whether this more complex scheme is worth it or not. I was just goofing around a bit and here's what came out of it 🤷‍♂️ No hard feelings if this isn't wanted!
2023-03-24 10:33:42 +00:00
..
alloc Rollup merge of #109406 - WaffleLapkin:🥛, r=cuviper 2023-03-24 07:13:04 +01:00
backtrace@07872f28cd Update backtrace 2022-09-02 16:09:58 -04:00
core Auto merge of #109216 - martingms:unicode-case-lut-shrink, r=Mark-Simulacrum 2023-03-24 10:33:42 +00:00
panic_abort Replace libstd, libcore, liballoc in line comments. 2022-12-30 14:00:42 +01:00
panic_unwind Replace libstd, libcore, liballoc in docs. 2022-12-30 14:00:40 +01:00
portable-simd Match unmatched backticks in library/ 2023-03-03 03:03:29 +01:00
proc_macro Auto merge of #105671 - lukas-code:depreciate-char, r=scottmcm 2023-02-12 11:09:06 +00:00
profiler_builtins Fully stabilize NLL 2022-06-03 17:16:41 -04:00
rtstartup Remove custom frame info registration on i686-pc-windows-gnu 2022-08-23 16:12:58 +08:00
rustc-std-workspace-alloc Replace libstd, libcore, liballoc in line comments. 2022-12-30 14:00:42 +01:00
rustc-std-workspace-core Switch all libraries to the 2021 edition 2021-12-23 19:03:47 +08:00
rustc-std-workspace-std Switch all libraries to the 2021 edition 2021-12-23 19:03:47 +08:00
std Rollup merge of #109406 - WaffleLapkin:🥛, r=cuviper 2023-03-24 07:13:04 +01:00
stdarch@b655243782 Update stdarch 2023-03-19 20:41:22 +00:00
test Implementing "<test_binary> --list --format json" #107307 #49359 2023-03-15 14:20:20 -04:00
unwind Match unmatched backticks in library/ 2023-03-03 03:03:29 +01:00