coverage: Add enums to accommodate other kinds of coverage mappings
Extracted from #118305.
LLVM supports several different kinds of coverage mapping regions, but currently we only ever emit ordinary “code” regions. This PR performs the plumbing required to add other kinds of regions as enum variants, but does not add any specific variants other than `Code`.
The main motivation for this change is branch coverage, but it will also allow separate experimentation with gap regions and skipped regions, which might help in producing more accurate and useful coverage reports.
---
``@rustbot`` label +A-code-coverage
This is less elegant in some ways, since we no longer visit a BCB's spans as a
batch, but will make it much easier to add support for other kinds of coverage
mapping regions (e.g. branch regions or gap regions).
Reorder early post-inlining passes.
`RemoveZsts`, `RemoveUnneededDrops` and `UninhabitedEnumBranching` only depend on types, so they should be executed together early after MIR inlining introduces those types.
This does not change the end-result, but this makes the pipeline a bit more consistent.
Merge dead bb pruning and unreachable bb deduplication.
Both routines share the same basic structure: iterate on all bbs to identify work, and then renumber bbs.
We can do both at once.
coverage: `llvm-cov` expects column numbers to be bytes, not code points
Normally the compiler emits column numbers as a 1-based number of Unicode code points.
But when we embed coverage mappings for `-Cinstrument-coverage`, those mappings will ultimately be read by the `llvm-cov` tool. That tool assumes that column numbers are 1-based numbers of *bytes*, and relies on that assumption when slicing up source code to apply highlighting (in HTML reports, and in text-based reports with colour).
For the very common case of all-ASCII source code, bytes and code points are the same, so the difference isn't noticeable. But for code that contains non-ASCII characters, emitting column numbers as code points will result in `llvm-cov` slicing strings in the wrong places, producing mangled output or fatal errors.
(See https://github.com/taiki-e/cargo-llvm-cov/issues/275 as an example of what can go wrong.)
Currently it's used for two dynamic checks:
- When a diagnostic is emitted, has it been emitted before?
- When a diagnostic is dropped, has it been emitted/cancelled?
The first check is no longer need, because `emit` is consuming, so it's
impossible to emit a `DiagnosticBuilder` twice. The second check is
still needed.
This commit replaces `DiagnosticBuilderState` with a simpler
`Option<Box<Diagnostic>>`, which is enough for the second check:
functions like `emit` and `cancel` can take the `Diagnostic` and then
`drop` can check that the `Diagnostic` was taken.
The `DiagCtxt` reference from `DiagnosticBuilderState` is now stored as
its own field, removing the need for the `dcx` method.
As well as making the code shorter and simpler, the commit removes:
- One (deprecated) `ErrorGuaranteed::unchecked_claim_error_was_emitted`
call.
- Two `FIXME(eddyb)` comments that are no longer relevant.
- The use of a dummy `Diagnostic` in `into_diagnostic`.
Nice!
Stop allowing `rustc::potential_query_instability` on all of
rustc_mir_transform and instead allow it on a case-by-case basis if it
is safe to do so. In this particular crate, all instances were safe to
allow.
rustc_mir_transform: Make DestinationPropagation stable for queries
By using `FxIndexMap` instead of `FxHashMap`, so that the order of visiting of locals is deterministic.
We also need to bless
`copy_propagation_arg.foo.DestinationPropagation.panic*.diff`. Do not review the diff of the diff. Instead look at the diff files before and after this commit. Both before and after this commit, 3 statements are replaced with nop. It's just that due to change in ordering, different statements are replaced. But the net result is the same. In other words, compare this diff (before fix):
* 090d5eac72/tests/mir-opt/dest-prop/copy_propagation_arg.foo.DestinationPropagation.panic-unwind.diff
With this diff (after fix):
* f603babd63/tests/mir-opt/dest-prop/copy_propagation_arg.foo.DestinationPropagation.panic-unwind.diff
and you can see that both before and after the fix, we replace 3 statements with `nop`s.
I find it _slightly_ surprising that the test this PR affects did not previously fail spuriously due to the indeterminism of `FxHashMap`, but I guess in can be explained with the predictability of small `FxHashMap`s with `usize` (`Local`) keys, or something along those lines.
This should fix [this](https://github.com/rust-lang/rust/pull/119252#discussion_r1436101791) comment, but I wanted to make a separate PR for this fix for a simpler development and review process.
Part of https://github.com/rust-lang/rust/issues/84447 which is E-help-wanted.
r? `@cjgillot` who is reviewer for the highly related PR https://github.com/rust-lang/rust/pull/119252.
Rollup of 9 pull requests
Successful merges:
- #119208 (coverage: Hoist some complex code out of the main span refinement loop)
- #119216 (Use diagnostic namespace in stdlib)
- #119414 (bootstrap: Move -Clto= setting from Rustc::run to rustc_cargo)
- #119420 (Handle ForeignItem as TAIT scope.)
- #119468 (rustdoc-search: tighter encoding for f index)
- #119628 (remove duplicate test)
- #119638 (fix cyle error when suggesting to use associated function instead of constructor)
- #119640 (library: Fix warnings in rtstartup)
- #119642 (library: Fix a symlink test failing on Windows)
r? `@ghost`
`@rustbot` modify labels: rollup
coverage: Hoist some complex code out of the main span refinement loop
The span refinement loop in `spans.rs` takes the spans that have been extracted from MIR, and modifies them to produce more helpful output in coverage reports.
It is also one of the most complicated pieces of code in the coverage instrumentor. It has an abundance of moving pieces that make it difficult to understand, and most attempts to modify it end up accidentally changing its behaviour in unacceptable ways.
This PR nevertheless tries to make a dent in it by hoisting two pieces of special-case logic out of the main loop, and into separate preprocessing passes. Coverage tests show that the resulting mappings are *almost* identical, with all known differences being unimportant.
This should hopefully unlock further simplifications to the refinement loop, since it now has fewer edge cases to worry about.
By using FxIndexMap instead of FxHashMap, so that the order of visiting
of locals is deterministic.
We also need to bless
copy_propagation_arg.foo.DestinationPropagation.panic*.diff. Do not
review the diff of the diff. Instead look at the diff file before and
after this commit. Both before and after this commit, 3 statements are
replaced with nop. It's just that due to change in ordering, different
statements are replaced. But the net result is the same.
Check yield terminator's resume type in borrowck
In borrowck, we didn't check that the lifetimes of the `TerminatorKind::Yield`'s `resume_place` were actually compatible with the coroutine's signature. That means that the lifetimes were totally going unchecked. Whoops!
This PR implements this checking.
Fixes#119564
r? types
Migrate memory overlap check from validator to lint
The check attempts to identify potential undefined behaviour, rather
than whether MIR is well-formed. It belongs in the lint not validator.
Follow up to changes from #119077.
This draws a clear distinction between the fields/methods that are needed by
initial span extraction and preprocessing, and those that are needed by the
main "refinement" loop.
Reevaluate `body.should_skip()` after updating the MIR phase to ensure
that injected MIR is processed correctly.
Update a few custom MIR tests that were ill-formed for the injected
phase.
`Diagnostic` has 40 methods that return `&mut Self` and could be
considered setters. Four of them have a `set_` prefix. This doesn't seem
necessary for a type that implements the builder pattern. This commit
removes the `set_` prefixes on those four methods.
coverage: Prepare mappings separately from injecting statements
These two tasks historically needed to be interleaved, but after various recent changes (including #116046 and #116917) they can now be fully separated.
---
`@rustbot` label +A-code-coverage
These two tasks historically needed to be interleaved, but after various recent
changes (including #116046 and #116917) they can now be fully separated.
Implement constant propagation on top of MIR SSA analysis
This implements the idea I proposed in https://github.com/rust-lang/rust/pull/110719#issuecomment-1718324700
Based on https://github.com/rust-lang/rust/pull/109597
The value numbering "GVN" pass formulates each rvalue that appears in MIR with an abstract form (the `Value` enum), and assigns an integer `VnIndex` to each. This abstract form can be used to deduplicate values, reusing an earlier local that holds the same value instead of recomputing. This part is proposed in #109597.
From this abstract representation, we can perform more involved simplifications, for example in https://github.com/rust-lang/rust/pull/111344.
With the abstract representation `Value`, we can also attempt to evaluate each to a constant using the interpreter. This builds a `VnIndex -> OpTy` map. From this map, we can opportunistically replace an operand or a rvalue with a constant if their value has an associated `OpTy`.
The most relevant commit is [Evaluated computed values to constants.](2767c4912e)"
r? `@oli-obk`