Most of these are because alloc uses `#[lang_item]` to define methods,
but core documents primitives before those methods are available.
- Fix rustdoc-js-std test
For some reason this change made CStr not show up in the results for
`str,u8`. Since it still shows up for str, and since it wasn't a great
match for that query anyway, I think this is ok to let slide.
- Add test that all primitives can be linked to
- Enable `doc(primitive)` in `core` as well
- Add linkcheck exception specifically for Windows
Ideally this would be done automatically by the linkchecker by
replacing `\\` with forward slashes, but this PR is already a ton of
work ...
- Don't forcibly fail linkchecking if there's a broken intra-doc link on Windows
Previously, it would exit with a hard error if a missing file had `::`
in it. This changes it to report a missing file instead, which allows
adding an exception.
This works by doing two things:
- Adding links that are specific to the crate. Since not all primitive
items are defined in `core` (due to lang_items), these need to use
relative links and not intra-doc links.
- Duplicating `primitive_docs` in both core and std. This allows not needing CARGO_PKG_NAME to build the standard library. It also adds a tidy check to make sure they stay the same.
Reword description of automatic impls of `Unsize`.
The existing documentation felt a little unhelpfully concise, so this change tries to improve it by using longer sentences, each of which specifies which kinds of types it applies to as early as possible. In particular, the third item starts with “Structs ...” instead of saying “Foo is a struct” later.
Also, the previous list items “Only the last field has a type involving `T`” and “`T` is not part of the type of any other fields” are, as far as I see, redundant with each other, so I removed the latter.
I have no particular knowledge of `Unsize`; I have attempted to leave the meaning entirely unchanged but may have missed a nuance.
Markdown preview of the edited documentation:
> All implementations of `Unsize` are provided automatically by the compiler.
> Those implementations are:
>
> - Arrays `[T; N]` implement `Unsize<[T]>`.
> - Types implementing a trait `Trait` also implement `Unsize<dyn Trait>`.
> - Structs `Foo<..., T, ...>` implement `Unsize<Foo<..., U, ...>>` if all of these conditions
> are met:
> - `T: Unsize<U>`.
> - Only the last field of `Foo` has a type involving `T`.
> - `Bar<T>: Unsize<Bar<U>>`, where `Bar<T>` stands for the actual type of that last field.
I originally added this bound in an attempt to make the specialization
sound for owning iterators but it was never correct here and the correct
and already implemented solution is is to place it on the IntoIter
implementation.
This is achieved with a branchless bit-twiddling implementation of the
case x < 100_000, and using this as building block.
Benchmark on an Intel i7-8700K (Coffee Lake):
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 165 169 4 2.42% x 0.98
num::int_log::u8_log10_random 438 423 -15 -3.42% x 1.04
num::int_log::u8_log10_random_small 438 423 -15 -3.42% x 1.04
num::int_log::u16_log10_predictable 633 417 -216 -34.12% x 1.52
num::int_log::u16_log10_random 908 471 -437 -48.13% x 1.93
num::int_log::u16_log10_random_small 945 471 -474 -50.16% x 2.01
num::int_log::u32_log10_predictable 1,496 1,340 -156 -10.43% x 1.12
num::int_log::u32_log10_random 1,076 873 -203 -18.87% x 1.23
num::int_log::u32_log10_random_small 1,145 874 -271 -23.67% x 1.31
num::int_log::u64_log10_predictable 4,005 3,171 -834 -20.82% x 1.26
num::int_log::u64_log10_random 1,247 1,021 -226 -18.12% x 1.22
num::int_log::u64_log10_random_small 1,265 921 -344 -27.19% x 1.37
num::int_log::u128_log10_predictable 39,667 39,579 -88 -0.22% x 1.00
num::int_log::u128_log10_random 6,456 6,696 240 3.72% x 0.96
num::int_log::u128_log10_random_small 4,108 3,903 -205 -4.99% x 1.05
Benchmark on an M1 Mac Mini:
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 143 130 -13 -9.09% x 1.10
num::int_log::u8_log10_random 375 325 -50 -13.33% x 1.15
num::int_log::u8_log10_random_small 376 325 -51 -13.56% x 1.16
num::int_log::u16_log10_predictable 500 322 -178 -35.60% x 1.55
num::int_log::u16_log10_random 794 405 -389 -48.99% x 1.96
num::int_log::u16_log10_random_small 1,035 405 -630 -60.87% x 2.56
num::int_log::u32_log10_predictable 1,144 894 -250 -21.85% x 1.28
num::int_log::u32_log10_random 832 786 -46 -5.53% x 1.06
num::int_log::u32_log10_random_small 832 787 -45 -5.41% x 1.06
num::int_log::u64_log10_predictable 2,681 2,057 -624 -23.27% x 1.30
num::int_log::u64_log10_random 1,015 806 -209 -20.59% x 1.26
num::int_log::u64_log10_random_small 1,004 795 -209 -20.82% x 1.26
num::int_log::u128_log10_predictable 56,825 56,526 -299 -0.53% x 1.01
num::int_log::u128_log10_random 9,056 8,861 -195 -2.15% x 1.02
num::int_log::u128_log10_random_small 1,528 1,527 -1 -0.07% x 1.00
The 128 bit case remains ridiculously slow because llvm fails to optimize division by
a constant 128-bit value to multiplications. This could be worked around but it seems
preferable to fix this in llvm.
From u32 up, table lookup (like suggested here
https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813) is still
faster, but requires a hardware leading_zero to be viable, and might clog up the
cache.
Correct “copies” to “moves” in `<Option<T> as From<T>>::from` doc, and other copyediting
The `impl<T> From<T> for Option<T>` has no `Copy` or `Clone` bound, so its operation is guaranteed to be a move. The call site might copy, but the function itself cannot.
Since that would have been a rather small PR, I also reviewed the other documentation in the file and made other improvements (in separate commits): adding periods and commas, linking `Deref::Target`, and clarifying what "a container" is in `FromIterator`.
More symbolic doc aliases
A bunch of small changes, mostly adding `#[doc(alias = "…")]` entries for symbolic `"…"`.
Also a small change in documentation of `const` keywords.
* Clarify rounding.
* Avoid "wrapping" wording.
* Omit wrong claim on 0 only being returned in error cases.
* Typo fix for one_less_than_next_power_of_two.
Add links in docs for some primitive types
This pull request adds additional links in existing documentation of some of the primitive types.
Where items are linked only once, I have used the `[link](destination)` format. For items in `std`, I have linked directly to the HTML, since although the primitives are in `core`, they are not displayed on `core` documentation. I was unsure of what length I should keep lines of documentation to, so I tried to keep them within reason.
Additionally, I have avoided excessively linking to keywords like `self` when they are not relevant to the documentation. I can add these links if it would be an improvement.
I hope this can improve Rust. Please let me know if there's anything I did wrong!
This implementation has no `Copy` or `Clone` bound, so its operation is
guaranteed to be a move. The call site might copy, but the function
itself cannot.
Also linkify `Some` while we're touching the line anyway.
remove redundant / misplaced sentence from docs
Removes sentence that seems to have drifted down into the examples section. Luckily, someone already added an explanation of what happens with packed structs back into the initial section of the doc entry and this wayward sentence can likely just be deleted.
Add an example for deriving PartialOrd on enums
For some reason, I always forget which variants are smaller and which
are larger when you derive PartialOrd on an enum. And the wording in the
current docs is not entirely clear to me.
So, I often end up making a small enum, deriving PartialOrd on it, and
then writing a `#[test]` with an assert that the top one is smaller than
the bottom one (or the other way around) to figure out which way the
deriving goes.
So then I figured, it would be great if the standard library docs just
had that example, so if I keep forgetting, at least I can figure it out
quickly by looking at std's docs.
`fmt::Formatter::pad`: don't call chars().count() more than one time
First commit merges two branches of match to call chars().count() only once: that should be faster if this method hits place of 3rd (previous) branch, plus quarter shorter.
Second commit fixes some clippy lints while i'm here (should it be separate PR?).
Allow writing of incomplete UTF-8 sequences to the Windows console via stdout/stderr
# Problem
Writes of just an incomplete UTF-8 byte sequence (e.g. `b"\xC3"` or `b"\xF0\x9F"`) to stdout/stderr with a Windows console attached error with `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` even though further writes could complete the codepoint. This is currently a rare occurence since the [linewritershim](2c56ea38b0/library/std/src/io/buffered/linewritershim.rs) implementation flushes complete lines immediately and buffers up to 1024 bytes for incomplete lines. It can still happen as described in #83258.
The problem will become more pronounced once the developer can switch stdout/stderr from line-buffered to block-buffered or immediate when the changes in the "Switchable buffering for Stdout" pull request (#78515) get merged.
# Patch description
If there is at least one valid UTF-8 codepoint all valid UTF-8 is passed through to the extracted `write_valid_utf8_to_console()` fn. The new code only comes into play if `write()` is being passed a short byte slice comprising an incomplete UTF-8 codepoint. In this case up to three bytes are buffered in the `IncompleteUtf8` struct associated with `Stdout` / `Stderr`. The bytes are accepted one at a time. As soon as an error can be detected `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` is returned. Once a complete UTF-8 codepoint is received it is passed to the `write_valid_utf8_to_console()` and the buffer length is set to zero.
Calling `flush()` will neither error nor write anything if an incomplete codepoint is present in the buffer.
# Tests
Currently there are no Windows-specific tests for console writing code at all. Writing (regression) tests for this problem is a bit challenging since unit tests and UI tests don't run in a console and suddenly popping up another console window might be surprising to developers running the testsuite and it might not work at all in CI builds. To just test the new functionality in unit tests the code would need to be refactored. Some guidance on how to proceed would be appreciated.
# Public API changes
* `std::str::verifications::utf8_char_width()` would be exposed as `std::str::utf8_char_width()` behind the "str_internals" feature gate.
# Related issues
* Fixes#83258.
* PR #78515 will exacerbate the problem.
# Open questions
* Add tests?
* Squash into one commit with better commit message?