Commit graph

6776 commits

Author SHA1 Message Date
bors
c9aa2595d9 Auto merge of #84959 - camsteffen:lint-suggest-group, r=estebank
Suggest lint groups

Fixes rust-lang/rust-clippy#6986
2021-07-20 02:11:55 +00:00
bors
6535449a00 Auto merge of #87284 - Aaron1011:remove-paren-special, r=petrochenkov
Remove special case for `ExprKind::Paren` in `MutVisitor`

The special case breaks several useful invariants (`ExpnId`s are
globally unique, and never change). This special case
was added back in 2016 in https://github.com/rust-lang/rust/pull/34355

r? `@petrochenkov`
2021-07-19 23:50:23 +00:00
Aaron Hill
f9f238e6b8
Remove special case for ExprKind::Paren in MutVisitor
The special case breaks several useful invariants (`ExpnId`s are
globally unique, and never change). This special case
was added back in 2016 in https://github.com/rust-lang/rust/pull/34355
2021-07-19 17:23:10 -05:00
bors
014026d1a7 Auto merge of #87153 - michaelwoerister:debuginfo-names-dyn-trait-projection-bounds, r=wesleywiser
[debuginfo] Emit associated type bindings in trait object type names.

This PR updates debuginfo type name generation for trait objects to include associated type bindings and auto trait bounds -- so that, for example, the debuginfo type name of `&dyn Iterator<Item=Foo>` and `&dyn Iterator<Item=Bar>` don't both map to just `&dyn Iterator` anymore.

The following table shows examples of debuginfo type names before and after the PR:
| type | before |  after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `&dyn Iterator` | `&dyn Iterator<Item=u32>` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `&dyn Iterator` | `&(dyn Iterator<Item=u32> + Sync)` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `&dyn SomeTrait<bool, i8>` | `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)`  |

For targets that need C++-like type names, we use `assoc$<Item,u32>` instead of `Item=u32`:
| type | before |  after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> > > >` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> >,Sync> >` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `ref$<dyn$<SomeTrait<bool, i8> > >` | `ref$<dyn$<SomeTrait<bool,i8,assoc$<Bar,u32> > >,Send> >`  |

The PR also adds self-profiling measurements for debuginfo type name generation (re. https://github.com/rust-lang/rust/issues/86431). It looks like the compiler spends up to 0.5% of its time in that task, so the potential for optimizing it via caching seems limited.

However, the perf run also shows [the biggest regression](https://perf.rust-lang.org/detailed-query.html?commit=585e91c718b0b2c5319e1fffd0ff1e62aaf7ccc2&base_commit=b9197978a90be6f7570741eabe2da175fec75375&benchmark=tokio-webpush-simple-debug&run_name=incr-unchanged) in a test case that does not even invoke the code in question. This suggests that the length of the names we generate here can affect performance by influencing how much data the linker has to copy around.

Fixes https://github.com/rust-lang/rust/issues/86134.
2021-07-19 21:25:43 +00:00
bors
d5af63480f Auto merge of #87225 - estebank:cleanup, r=oli-obk
Various diagnostics clean ups/tweaks

* Always point at macros, including derive macros
* Point at non-local items that introduce a trait requirement
* On private associated item, point at definition
2021-07-19 18:44:27 +00:00
Esteban Küber
ba052bd8de Various diagnostics clean ups/tweaks
* Always point at macros, including derive macros
* Point at non-local items that introduce a trait requirement
* On private associated item, point at definition
2021-07-19 08:43:35 -07:00
bors
b543e0dc03 Auto merge of #86970 - inquisitivecrystal:force-warn, r=davidtwco
Make `--force-warns` a normal lint level option

Now that `ForceWarn` is a lint level, there's no reason `--force-warns` should be treated differently from other options that set lint levels. This merges the `ForceWarn` handling in with the other lint level command line options. It also unifies all of the relevant selection logic in `compiler/rustc_lint/src/levels.rs`, rather than having some of it weirdly elsewhere.

Fixes #86958, which arose from the special-cased handling of `ForceWarn` having had an error in it.
2021-07-19 13:18:04 +00:00
Guillaume Gomez
6cb69ea790
Rollup merge of #87268 - SkiFire13:fix-uninit-ref-list, r=nagisa
Don't create references to uninitialized data in `List::from_arena`

Previously `result` and `arena_slice` were references pointing to uninitialized data, which is technically UB. They may have been fine because the pointed data is `Copy` and and they were only written to, but the semantics of this aren't clearly defined yet, and since we have a sound way to do the same thing I don't think we should keep the possibly-unsound way.
2021-07-19 11:37:49 +02:00
Guillaume Gomez
4dd32a1cac
Rollup merge of #87256 - Aaron1011:hir-wf-assoc-default, r=oli-obk
Extend HIR-based WF checking to associated type defaults

Previously, we would only look at associated types in `impl` blocks.
2021-07-19 11:37:47 +02:00
Giacomo Stevanato
98e9d16d25 Don't create references to uninitialized data in List::from_arena 2021-07-19 10:47:45 +02:00
bors
0ecff8c623 Auto merge of #87146 - Aaron1011:better-macro-lint, r=petrochenkov
Compute a better `lint_node_id` during expansion

When we need to emit a lint at a macro invocation, we currently use the
`NodeId` of its parent definition (e.g. the enclosing function). This
means that any `#[allow]` / `#[deny]` attributes placed 'closer' to the
macro (e.g. on an enclosing block or statement) will have no effect.

This commit computes a better `lint_node_id` in `InvocationCollector`.
When we visit/flat_map an AST node, we assign it a `NodeId` (earlier
than we normally would), and store than `NodeId` in current
`ExpansionData`. When we collect a macro invocation, the current
`lint_node_id` gets cloned along with our `ExpansionData`, allowing it
to be used if we need to emit a lint later on.

This improves the handling of `#[allow]` / `#[deny]` for
`SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` and some `asm!`-related lints.
The 'legacy derive helpers' lint retains its current behavior
(I've inlined the now-removed `lint_node_id` function), since
there isn't an `ExpansionData` readily available.
2021-07-19 04:22:51 +00:00
bors
10c0b003db Auto merge of #86848 - notriddle:notriddle/drop-dyn, r=varkor
feat(rustc_lint): add `dyn_drop`

Based on the conversation in #86747.

Explanation
-----------

A trait object bound of the form `dyn Drop` is most likely misleading and not what the programmer intended.

`Drop` bounds do not actually indicate whether a type can be trivially dropped or not, because a composite type containing `Drop` types does not necessarily implement `Drop` itself. Naïvely, one might be tempted to write a deferred drop system, to pull cleaning up memory out of a latency-sensitive code path, using `dyn Drop` trait objects. However, this breaks down e.g. when `T` is `String`, which does not implement `Drop`, but should probably be accepted.

To write a trait object bound that accepts anything, use a placeholder trait with a blanket implementation.

```rust
trait Placeholder {}
impl<T> Placeholder for T {}
fn foo(_x: Box<dyn Placeholder>) {}
```
2021-07-19 01:41:54 +00:00
bors
b548d9f1c6 Auto merge of #87004 - JamieCunliffe:pgo-gc-sections, r=Mark-Simulacrum
Don't use gc-sections with profile-generate.

When building with profile-generate don't call gc_sections as this can
can sometimes strip out profile data. This missing information in the
prof files can then result in missing functions when using the profile
information.

#78226

r? `@Mark-Simulacrum`
2021-07-18 23:14:31 +00:00
bors
59216858a3 Auto merge of #86950 - tmiasko:personality, r=nagisa
Use existing declaration of rust_eh_personality

If crate declares `rust_eh_personality`, re-use existing declaration
as otherwise attempts to set function attributes that follow the
declaration will fail (unless it happens to have exactly the same
type signature as the one predefined in the compiler).

Fixes #70117.
Fixes https://github.com/rust-lang/rust/pull/81469#issuecomment-809428126; probably.
2021-07-18 20:33:23 +00:00
Aaron Hill
93aa89023f
Extend HIR-based WF checking to associated type defaults
Previously, we would only look at associated types in `impl` blocks.
2021-07-18 10:36:54 -05:00
Michael Howell
dbd4fd5716 feat(rustc_lint): add dyn_drop
Based on the conversation in #86747.

Explanation
-----------

A trait object bound of the form `dyn Drop` is most likely misleading
and not what the programmer intended.

`Drop` bounds do not actually indicate whether a type can be trivially
dropped or not, because a composite type containing `Drop` types does
not necessarily implement `Drop` itself. Naïvely, one might be tempted
to write a deferred drop system, to pull cleaning up memory out of a
latency-sensitive code path, using `dyn Drop` trait objects. However,
this breaks down e.g. when `T` is `String`, which does not implement
`Drop`, but should probably be accepted.

To write a trait object bound that accepts anything, use a placeholder
trait with a blanket implementation.

```rust
trait Placeholder {}
impl<T> Placeholder for T {}
fn foo(_x: Box<dyn Placeholder>) {}
```
2021-07-18 07:55:57 -07:00
bors
18073052d8 Auto merge of #86698 - cjgillot:modc, r=estebank
Move OnDiskCache to rustc_query_impl.

This should be the last remnant of the query implementation that was still in rustc_middle.
2021-07-18 10:42:23 +00:00
Camille GILLOT
5b921505ef Remove deadlock virtual call. 2021-07-18 11:14:08 +02:00
Camille GILLOT
81241cbf3a Move OnDiskCache to rustc_query_impl. 2021-07-18 11:14:07 +02:00
bors
5a8a44196b Auto merge of #87242 - JohnTitor:rollup-t9rmwpo, r=JohnTitor
Rollup of 8 pull requests

Successful merges:

 - #86763 (Add a regression test for issue-63355)
 - #86814 (Recover from a misplaced inner doc comment)
 - #86843 (Check that const parameters of trait methods have compatible types)
 - #86889 (rustdoc: Cleanup ExternalCrate)
 - #87092 (Remove nondeterminism in multiple-definitions test)
 - #87170 (Add diagnostic items for Clippy)
 - #87183 (fix typo in compile_fail doctest)
 - #87205 (rustc_middle: remove redundant clone)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
2021-07-18 08:15:17 +00:00
inquisitivecrystal
2f2db99432 Make --force-warns a normal lint level option 2021-07-17 23:13:59 -07:00
bors
3ab6b60337 Auto merge of #87071 - inquisitivecrystal:inclusive-range, r=estebank
Add diagnostics for mistyped inclusive range

Inclusive ranges are correctly typed as `..=`. However, it's quite easy to think of it as being like `==`, and type `..==` instead. This PR adds helpful diagnostics for this case.

Resolves #86395 (there are some other cases there, but I think those should probably have separate issues).

r? `@estebank`
2021-07-18 05:58:16 +00:00
Yuki Okushi
810e47897a
Rollup merge of #87205 - matthiaskrgr:clippy_cln, r=oli-obk
rustc_middle: remove redundant clone

found while looking through some clippy lint warnings
2021-07-18 14:21:59 +09:00
Yuki Okushi
07faa2e32c
Rollup merge of #87170 - xFrednet:clippy-5393-add-diagnostic-items, r=Manishearth,oli-obk
Add diagnostic items for Clippy

This adds a bunch of diagnostic items to `std`/`core`/`alloc` functions, structs and traits used in Clippy. The actual refactorings in Clippy to use these items will be done in a different PR in Clippy after the next sync.

This PR doesn't include all paths Clippy uses, I've only gone through the first 85 lines of Clippy's [`paths.rs`](ecf85f4bdc/clippy_utils/src/paths.rs) (after rust-lang/rust-clippy#7466) to get some feedback early on. I've also decided against adding diagnostic items to methods, as it would be nicer and more scalable to access them in a nicer fashion, like adding a `is_diagnostic_assoc_item(did, sym::Iterator, sym::map)` function or something similar (Suggested by `@camsteffen` [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/147480-t-compiler.2Fwg-diagnostics/topic/Diagnostic.20Item.20Naming.20Convention.3F/near/225024603))

There seems to be some different naming conventions when it comes to diagnostic items, some use UpperCamelCase (`BinaryHeap`) and some snake_case (`hashmap_type`). This PR uses UpperCamelCase for structs and traits and snake_case with the module name as a prefix for functions. Any feedback on is this welcome.

cc: rust-lang/rust-clippy#5393

r? `@Manishearth`
2021-07-18 14:21:57 +09:00
Yuki Okushi
81d0b70402
Rollup merge of #87092 - ricobbe:fix-raw-dylib-multiple-definitions, r=petrochenkov
Remove nondeterminism in multiple-definitions test

Compare all fields in `DllImport` when sorting to avoid nondeterminism in the error for multiple inconsistent definitions of an extern function.  Restore the multiple-definitions test.

Resolves #87084.
2021-07-18 14:21:56 +09:00
Yuki Okushi
783efd29ae
Rollup merge of #86843 - FabianWolff:issue-86820, r=lcnr
Check that const parameters of trait methods have compatible types

This PR fixes #86820. The problem is that this currently passes the type checker:
```rust
trait Tr {
    fn foo<const N: u8>(self) -> u8;
}

impl Tr for f32 {
    fn foo<const N: bool>(self) -> u8 { 42 }
}
```
i.e. the type checker fails to check whether const parameters in `impl` methods have the same type as the corresponding declaration in the trait. With my changes, I get, for the above code:
```
error[E0053]: method `foo` has an incompatible const parameter type for trait
 --> test.rs:6:18
  |
6 |     fn foo<const N: bool>(self) -> u8 { 42 }
  |                  ^
  |
note: the const parameter `N` has type `bool`, but the declaration in trait `Tr::foo` has type `u8`
 --> test.rs:2:18
  |
2 |     fn foo<const N: u8>(self) -> u8;
  |                  ^

error: aborting due to previous error
```
This fixes #86820, where an ICE happens later on because the trait method is declared with a const parameter of type `u8`, but the `impl` uses one of type `usize`:
> `expected int of size 8, but got size 1`
2021-07-18 14:21:54 +09:00
Yuki Okushi
469935f7a4
Rollup merge of #86814 - Aaron1011:inner-doc-recover, r=estebank
Recover from a misplaced inner doc comment

Fixes #86781
2021-07-18 14:21:53 +09:00
Aaron Hill
7ca089c6d2
Only use assign_id! for ast nodes that support attributes 2021-07-17 23:03:58 -05:00
Aaron Hill
d6e3c11101
Add additional missing lint handling logic 2021-07-17 23:03:58 -05:00
Aaron Hill
2bd15a25ef
Add missing visit_expr_field 2021-07-17 23:03:57 -05:00
Aaron Hill
ddd544856e
Compute a better lint_node_id during expansion
When we need to emit a lint at a macro invocation, we currently use the
`NodeId` of its parent definition (e.g. the enclosing function). This
means that any `#[allow]` / `#[deny]` attributes placed 'closer' to the
macro (e.g. on an enclosing block or statement) will have no effect.

This commit computes a better `lint_node_id` in `InvocationCollector`.
When we visit/flat_map an AST node, we assign it a `NodeId` (earlier
than we normally would), and store than `NodeId` in current
`ExpansionData`. When we collect a macro invocation, the current
`lint_node_id` gets cloned along with our `ExpansionData`, allowing it
to be used if we need to emit a lint later on.

This improves the handling of `#[allow]` / `#[deny]` for
`SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` and some `asm!`-related lints.
The 'legacy derive helpers' lint retains its current behavior
(I've inlined the now-removed `lint_node_id` function), since
there isn't an `ExpansionData` readily available.
2021-07-17 23:03:56 -05:00
bors
77d155973c Auto merge of #85686 - ptrojahn:loop_reinitialize, r=estebank
Add help on reinitialization between move and access

Fixes #83760
2021-07-18 02:13:12 +00:00
bors
eb0b95b55a Auto merge of #87129 - FabianWolff:issue-75356, r=varkor
Warn about useless assignments of variables/fields to themselves

This PR fixes #75356. Following `@varkor's` suggestion in https://github.com/rust-lang/rust/issues/75356#issuecomment-700339154, I have implemented this warning as part of the `dead_code` lint. Unlike the `-Wself-assign` implementation in [Clang](56e6d4742e/clang/lib/Sema/SemaExpr.cpp (L13875-L13909)), my implementation also warns about self-assignments of struct fields (`s.x = s.x`).

r? `@varkor`
2021-07-17 22:51:07 +00:00
bors
c7331d65bd Auto merge of #87203 - jackh726:logging, r=nikomatsakis
Some perf optimizations and logging

Various bits of (potential) perf optimizations and some logging additions/changes pulled out from #85499

The only not extremely straightforward change is adding `needs_normalization` in `trait::project`. This is just a perf optimization to avoid fold through a type with *only* opaque types in `UserFacing` mode (as they aren't replaced).

This should be a simple PR for *anyone* to review, so I'm going to let highfive assign. But I'll go ahead and cc `@nikomatsakis` in case he has time today.
2021-07-17 20:23:58 +00:00
jackh726
fa839b1194 Add needs_normalization 2021-07-17 16:09:22 -04:00
jackh726
d954a8ee8e Some perf optimizations and logging 2021-07-17 16:09:17 -04:00
bors
68511b574f Auto merge of #86676 - cjgillot:localexpn, r=petrochenkov
Make expansions stable for incr. comp.

This PR aims to make expansions stable for incr. comp. by using the same architecture as definitions:
- the interned identifier `ExpnId` contains a `CrateNum` and a crate-local id;
- bidirectional maps `ExpnHash <-> ExpnId` are setup;
- incr. comp. on-disk cache saves and reconstructs expansions using their `ExpnHash`.

I tried to use as many `LocalExpnId` as I could in the resolver code, but I may have missed a few opportunities.

All this will allow to use an `ExpnId` as a query key, and to force this query without recomputing caller queries. For instance, this will be used to implement #85999.

r? `@petrochenkov`
2021-07-17 17:56:46 +00:00
Camille GILLOT
b35ceeeec7 Simplify Expn creation. 2021-07-17 19:41:14 +02:00
Camille GILLOT
dddaa6d068 Rename expn_info -> expn_data. 2021-07-17 19:41:13 +02:00
Camille GILLOT
0f8573e57b Pass ExpnData by reference. 2021-07-17 19:41:12 +02:00
Camille GILLOT
a51b131fd1 Always hash spans in expn. 2021-07-17 19:41:11 +02:00
Camille GILLOT
41c1f39fa8 Drop ExpnData::krate. 2021-07-17 19:41:10 +02:00
Camille GILLOT
dbd2d77641 Drop orig_id. 2021-07-17 19:41:09 +02:00
Camille GILLOT
37a13def48 Encode ExpnId using ExpnHash for incr. comp. 2021-07-17 19:41:08 +02:00
Camille GILLOT
2fe37c5bd1 Choose encoding format in caller code. 2021-07-17 19:41:07 +02:00
Camille GILLOT
078dd37f88 Use LocalExpnId where possible. 2021-07-17 19:41:02 +02:00
Camille GILLOT
6e78d6c9d6 Make the CrateNum part of the ExpnId. 2021-07-17 19:35:33 +02:00
bors
c78ebb7bdc Auto merge of #87123 - RalfJung:miri-provenance-overhaul, r=oli-obk
CTFE/Miri engine Pointer type overhaul

This fixes the long-standing problem that we are using `Scalar` as a type to represent pointers that might be integer values (since they point to a ZST). The main problem is that with int-to-ptr casts, there are multiple ways to represent the same pointer as a `Scalar` and it is unclear if "normalization" (i.e., the cast) already happened or not. This leads to ugly methods like `force_mplace_ptr` and `force_op_ptr`.
Another problem this solves is that in Miri, it would make a lot more sense to have the `Pointer::offset` field represent the full absolute address (instead of being relative to the `AllocId`). This means we can do ptr-to-int casts without access to any machine state, and it means that the overflow checks on pointer arithmetic are (finally!) accurate.

To solve this, the `Pointer` type is made entirely parametric over the provenance, so that we can use `Pointer<AllocId>` inside `Scalar` but use `Pointer<Option<AllocId>>` when accessing memory (where `None` represents the case that we could not figure out an `AllocId`; in that case the `offset` is an absolute address). Moreover, the `Provenance` trait determines if a pointer with a given provenance can be cast to an integer by simply dropping the provenance.

I hope this can be read commit-by-commit, but the first commit does the bulk of the work. It introduces some FIXMEs that are resolved later.
Fixes https://github.com/rust-lang/miri/issues/841
Miri PR: https://github.com/rust-lang/miri/pull/1851
r? `@oli-obk`
2021-07-17 15:26:27 +00:00
bors
f502bd3abd Auto merge of #86761 - Alexhuszagh:master, r=estebank
Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.

# Summary

Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).

Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, and implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.

In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.

Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.

```rust
// Issue in Rust's dec2fly.
"2.47032822920623272088284396434110686182e-324".parse::<f64>();   // Err(ParseFloatError { kind: Invalid })
```

# Solution

This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.

**Documentation**

Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:

```rust
    // Round-to-even only happens for negative values of q
    // when q ≥ −4 in the 64-bit case and when q ≥ −17 in
    // the 32-bitcase.
    //
    // When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
    // have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
    // 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
    //
    // When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
    // so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
    // or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^−q < 2^64
    // (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
    // or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bitcase).
    //
    // Thus we have that we only need to round ties to even when
    // we have that q ∈ [−4,23](in the 64-bit case) or q∈[−17,10]
    // (in the 32-bit case). In both cases,the power of five(5^|q|)
    // fits in a 64-bit word.
    const MIN_EXPONENT_ROUND_TO_EVEN: i32;
    const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```

This ensures maintainability of the code base.

**Improvements for Disguised Fast-Path Cases**

The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53-bits or less and the then exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.

However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.

**Digit Parsing Improvements**

Typically, integers are parsed from string 1-at-a-time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described in length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big and little-endian systems.

**Unsafe Changes**

Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.

```rust
impl AsciiStr {
    #[inline]
    pub fn step_by(&mut self, n: usize) -> &mut Self {
        unsafe { self.ptr = self.ptr.add(n) };
        self
    }
}

...

#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
    // the first character is 'e'/'E' and scientific mode is enabled
    let start = *s;
    s.step();
    ...
}
```

The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.

```rust
impl AsciiStr {
    /// Advance the view by n, advancing it in-place to (n..).
    pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
        // SAFETY: same as step_by, safe as long n is less than the buffer length
        self.ptr = unsafe { self.ptr.add(n) };
        self
    }
}

...

/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
    let start = *s;
    // SAFETY: the first character is 'e'/'E' and scientific mode is enabled
    unsafe {
        s.step();
    }
    ...
}
```

This allows us to trivially demonstrate the new implementation of dec2flt is safe.

**Inline Annotations Have Been Removed**

In the previous implementation of dec2flt, inline annotations exist practically nowhere in the entire module. Therefore, these annotations have been removed, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).

**Fixed Correctness Tests**

Numerous compile errors in `src/etc/test-float-parse` were present, due to deprecation of `time.clock()`, as well as the crate dependencies with `rand`. The tests have therefore been reworked as a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.

**Undefined Behavior**

An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:

```rust
    #[inline]
    pub fn check_len(&self, n: usize) -> bool {
        unsafe { self.ptr.add(n) <= self.end }
    }
```

And the new implementation is as follows:

```rust
    /// Check if the slice at least `n` length.
    fn check_len(&self, n: usize) -> bool {
        n <= self.as_ref().len()
    }
```

Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).

**Inferring Binary Exponents**

Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).

# Code Size

The code size, for all optimizations, does not considerably change relative to before for stripped builds, however it is **significantly** smaller prior to stripping the resulting binaries. These binary sizes were calculated on x86_64-unknown-linux-gnu.

**new**

Using rustc version 1.55.0-dev.

opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K

**old**

Using rustc version 1.53.0-nightly.

opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K

# Correctness

The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Toa's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham  [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.

# Issues Addressed

This will fix and close the following issues:

- resolves #85198
- resolves #85214
- resolves #85234
- fixes #31407
- fixes #31109
- fixes #53015
- resolves #68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
2021-07-17 12:56:22 +00:00
xFrednet
67002db2cf Corrected symbol order after adding diagnostic items 2021-07-17 12:20:43 +02:00