Commit graph

596 commits

Author SHA1 Message Date
Rémy Rakic
9ac8d2fe4e track proc-macro expansions in the self-profiler
Use the proc-macro descr to track their individual expansions with
self-profiling events. This will help diagnose performance issues
with slow proc-macros.
2022-04-05 15:37:14 +02:00
Nicholas Nethercote
7300bd6a38 Move the missing fragment identifier checking.
In #95555 this was moved out of `parse_tt_inner` and `nameize` into
`compute_locs`. But the next commit will be moving `compute_locs`
outwards to a place that isn't suitable for the missing fragment
identifier checking. So this reinstates the old checking.
2022-04-05 17:23:30 +10:00
Nicholas Nethercote
896d8f5905 Remove the lifetime from TtParser and MatcherLoc.
It's a slight performance loss for now, but that will be recouped by the
next commit.
2022-04-05 17:19:38 +10:00
David Wood
3c2f864ffb session: opt for enabling directionality markers
Add an option for enabling and disabling Fluent's directionality
isolation markers in output. Disabled by default as these can render in
some terminals and applications.

Signed-off-by: David Wood <david.wood@huawei.com>
2022-04-05 07:01:03 +01:00
David Wood
d5119c5b9f errors: implement sysroot/testing bundle loading
Extend loading of Fluent bundles so that bundles can be loaded from the
sysroot based on the language requested by the user, or using a nightly
flag.

Sysroot bundles are loaded from `$sysroot/share/locale/$locale/*.ftl`.

Signed-off-by: David Wood <david.wood@huawei.com>
2022-04-05 07:01:02 +01:00
David Wood
7f91697b50 errors: implement fallback diagnostic translation
This commit updates the signatures of all diagnostic functions to accept
types that can be converted into a `DiagnosticMessage`. This enables
existing diagnostic calls to continue to work as before and Fluent
identifiers to be provided. The `SessionDiagnostic` derive just
generates normal diagnostic calls, so these APIs had to be modified to
accept Fluent identifiers.

In addition, loading of the "fallback" Fluent bundle, which contains the
built-in English messages, has been implemented.

Each diagnostic now has "arguments" which correspond to variables in the
Fluent messages (necessary to render a Fluent message) but no API for
adding arguments has been added yet. Therefore, diagnostics (that do not
require interpolation) can be converted to use Fluent identifiers and
will be output as before.
2022-04-05 07:01:02 +01:00
David Wood
c45f29595d span: move MultiSpan
`MultiSpan` contains labels, which are more complicated with the
introduction of diagnostic translation and will use types from
`rustc_errors` - however, `rustc_errors` depends on `rustc_span` so
`rustc_span` cannot use types like `DiagnosticMessage` without
dependency cycles. Introduce a new `rustc_error_messages` crate that can
contain `DiagnosticMessage` and `MultiSpan`.

Signed-off-by: David Wood <david.wood@huawei.com>
2022-04-05 07:01:00 +01:00
David Wood
8c684563a5 errors: introduce DiagnosticMessage
Introduce a `DiagnosticMessage` type that will enable diagnostic
messages to be simple strings or Fluent identifiers.
`DiagnosticMessage` is now used in the implementation of the standard
`DiagnosticBuilder` APIs.

Signed-off-by: David Wood <david.wood@huawei.com>
2022-04-05 06:53:39 +01:00
bors
60e50fc1cf Auto merge of #95653 - Dylan-DPC:rollup-2p9hzi3, r=Dylan-DPC
Rollup of 7 pull requests

Successful merges:

 - #92942 (stabilize windows_process_extensions_raw_arg)
 - #94817 (Release notes for 1.60.0)
 - #95343 (Reduce unnecessary escaping in proc_macro::Literal::character/string)
 - #95431 (Stabilize total_cmp)
 - #95438 (Add SyncUnsafeCell.)
 - #95467 (Windows: Synchronize asynchronous pipe reads and writes)
 - #95609 (Suggest borrowing when trying to coerce unsized type into `dyn Trait`)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
2022-04-04 19:51:52 +00:00
Dylan DPC
2d1496a8f6
Rollup merge of #95343 - dtolnay:literals, r=petrochenkov
Reduce unnecessary escaping in proc_macro::Literal::character/string

I noticed that https://doc.rust-lang.org/proc_macro/struct.Literal.html#method.character is producing unreadable literals that make macro-expanded code unnecessarily hard to read. Since the proc macro server was using `escape_unicode()`, every char is escaped using `\u{…}` regardless of whether there is any need to do so. For example `Literal::character('=')` would previously produce `'\u{3d}'` which unnecessarily obscures the meaning when reading the macro-expanded code.

I've changed Literal::string also in this PR because `str`'s `Debug` impl is also smarter than just calling `escape_debug` on every char. For example `Literal::string("ferris's")` would previously produce `"ferris\'s"` but will now produce `"ferris's"`.
2022-04-04 20:41:30 +02:00
Nicholas Nethercote
0bd47e8a39 Reorder match arms in parse_tt_inner.
To match the order the variants are declared in.
2022-04-04 17:03:36 +10:00
Nicholas Nethercote
88f8fbcce0 A new matcher representation for use in parse_tt.
`parse_tt` currently traverses a `&[TokenTree]` to do matching. But this
is a bad representation for the traversal.
- `TokenTree` is nested, and there's a bunch of expensive and fiddly
  state required to handle entering and exiting nested submatchers.
- There are three positions (sequence separators, sequence Kleene ops,
  and end of the matcher) that are represented by an index that exceeds
  the end of the `&[TokenTree]`, which is clumsy and error-prone.

This commit introduces a new representation called `MatcherLoc` that is
designed specifically for matching. It fixes all the above problems,
making the code much easier to read. A `&[TokenTree]` is converted to a
`&[MatcherLoc]` before matching begins. Despite the cost of the
conversion, it's still a net performance win, because various pieces of
traversal state are computed once up-front, rather than having to be
recomputed repeatedly during the macro matching.

Some improvements worth noting.
- `parse_tt_inner` is *much* easier to read. No more having to compare
  `idx` against `len` and read comments to understand what the result
  means.
- The handling of `Delimited` in `parse_tt_inner` is now trivial.
- The three end-of-sequence cases in `parse_tt_inner` are now handled in
  three separate match arms, and the control flow is much simpler.
- `nameize` is no longer recursive.
- There were two places that issued "missing fragment specifier" errors:
  one in `parse_tt_inner()`, and one in `nameize()`. Presumably the
  latter was never executed. There's now a single place issuing these
  errors, in `compute_locs()`.
- The number of heap allocations done for a `check full` build of
  `async-std-1.10.0` (an extreme example of heavy macro use) drops from
  11.8M to 2.6M, and most of these occur outside of macro matching.
- The size of `MatcherPos` drops from 64 bytes to 16 bytes. Small enough
  that it no longer needs boxing, which partly accounts for the
  reduction in allocations.
- The rest of the drop in allocations is due to the removal of
  `MatcherKind`, because we no longer need to record anything for the
  parent matcher when entering a submatcher.
- Overall it reduces code size by 45 lines.
2022-04-04 17:01:28 +10:00
Jacob Pratt
6b75406f5a
Create 2024 edition 2022-04-02 02:45:49 -04:00
bors
95f68702ff Auto merge of #95509 - nnethercote:simplify-MatcherPos-some-more, r=petrochenkov
Simplify `MatcherPos` some more

A few more improvements.

r? `@petrochenkov`
2022-04-02 04:59:16 +00:00
Vadim Petrochenkov
9ab4f732cb expand: Do not count metavar declarations on RHS of macro_rules
They are 0 by definition there.
2022-03-31 19:09:40 +03:00
Nicholas Nethercote
c6fedd4f10 Make MatcherPos not derive Clone.
It's only used in one place, and there we clone and then make a bunch of
modifications. It's clearer if we duplicate more explicitly, and there's
a symmetry now between `sequence()` and `empty_sequence()`.
2022-03-31 14:40:43 +11:00
Nicholas Nethercote
f68a0449ed Remove MatcherPos::stack.
`parse_tt` needs a way to get from within submatchers make to the
enclosing submatchers. Currently it has two distinct mechanisms for
this:
- `Delimited` submatchers use `MatcherPos::stack` to record stuff about
  the parent (and further back ancestors).
- `Sequence` submatchers use `MatcherPosSequence::parent` to point to
  the parent matcher position.

Having two mechanisms is really confusing, and it took me a long time to
understand all this.

This commit eliminates `MatcherPos::stack`, and changes `Delimited`
submatchers to use the same mechanism as sequence submatchers. That
mechanism is also changed a bit: instead of storing the entire parent
`MatcherPos`, we now only store the necessary parts from the parent
`MatcherPos`.

Overall this is a small performance win, with the positives outweighing
the negatives, but it's mostly for clarity.
2022-03-31 14:39:00 +11:00
Dylan DPC
1b7d6dbd30
Rollup merge of #95497 - nyurik:compiler-spell-comments, r=compiler-errors
Spellchecking compiler comments

This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.
2022-03-31 04:57:28 +02:00
Nicholas Nethercote
048bd67d51 Clarify idx handling in sequences.
By adding comments, and improving an assertion. I finally fully
understand this part!
2022-03-31 11:48:36 +11:00
Nicholas Nethercote
2e423c7fd0 Remove MatcherPos::match_lo.
It's redundant w.r.t. other fields.
2022-03-31 11:48:35 +11:00
Nicholas Nethercote
21699c41af Simplify exit of Delimited submatchers.
Currently, we detect an exit from a `Delimited` submatcher when `idx`
exceeds the bounds of the current submatcher *and* there is a `stack`
entry.

This commit changes it to something simpler: just look for a
`CloseDelim` token.
2022-03-31 11:48:34 +11:00
Yuri Astrakhan
5160f8f843 Spellchecking compiler comments
This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.
2022-03-30 15:14:15 -04:00
bors
c5cf08d37b Auto merge of #95425 - nnethercote:yet-more-parse_tt-improvements, r=petrochenkov
Yet more `parse_tt` improvements

Including lots of comment improvements, and an overhaul of how `matches` work that gives big speedups.

r? `@petrochenkov`
2022-03-30 19:08:01 +00:00
Yuri Astrakhan
7e8201ae0a Spellchecking some comments
This PR attempts to clean up some minor spelling mistakes in comments
2022-03-30 01:39:38 -04:00
Nicholas Nethercote
6b0a16ab1a Pre-allocate an empty Lrc<NamedMatchVec>.
This avoids some allocations.
2022-03-30 10:54:57 +11:00
Nicholas Nethercote
524d21bd54 Overhaul how matches are recorded.
Currently, matches within a sequence are recorded in a new empty
`matches` vector. Then when the sequence finishes the matches are merged
into the `matches` vector of the parent.

This commit changes things so that a sequence mp inherits the matches
made so far. This means that additional matches from the sequence don't
need to be merged into the parent. `push_match` becomes more
complicated, and the current sequence depth needs to be tracked. But
it's a sizeable performance win because it avoids one or more
`push_match` calls on every iteration of a sequence.

The commit also removes `match_hi`, which is no longer necessary.
2022-03-30 10:54:37 +11:00
Nicholas Nethercote
a1b140cdb7 Improve comments and rename many things for consistency.
In particular:
- Replace use of "item" with "matcher position/"mp".
- Replace use of "repetition" with "sequence".
- Replace `ms` with `matcher`.
2022-03-30 10:50:17 +11:00
Nicholas Nethercote
ac3d8ce1c6 Clarify comments about doc comments in macros. 2022-03-30 10:42:47 +11:00
Nicholas Nethercote
2b60cc081b Simplify and rename count_names. 2022-03-30 10:42:34 +11:00
Nicholas Nethercote
df6ead557d Add a useful assertion. 2022-03-29 08:00:26 +11:00
Dylan DPC
1c8b7412d4
Rollup merge of #95390 - nnethercote:allow-doc-comments-in-macros, r=petrochenkov
Ignore doc comments in a declarative macro matcher.

Fixes #95267. Reverts to the old behaviour before #95159 introduced a
regression.

r? `@petrochenkov`
2022-03-28 16:08:11 +02:00
Dylan DPC
ae037a86f9
Rollup merge of #95301 - nnethercote:rm-NtTT, r=petrochenkov
Remove `Nonterminal::NtTT`.

It's only needed for macro expansion, not as a general element in the
AST. This commit removes it, adds `NtOrTt` for the parser and macro
expansion cases, and renames the variants in `NamedMatch` to better
match the new type.

r? `@petrochenkov`
2022-03-28 16:08:07 +02:00
Nicholas Nethercote
9967594346 Ignore doc comments in a declarative macro matcher.
Fixes #95267. Reverts to the old behaviour before #95159 introduced a
regression.
2022-03-28 10:45:24 +11:00
Nicholas Nethercote
364b908d57 Remove Nonterminal::NtTT.
It's only needed for macro expansion, not as a general element in the
AST. This commit removes it, adds `NtOrTt` for the parser and macro
expansion cases, and renames the variants in `NamedMatch` to better
match the new type.
2022-03-28 10:03:02 +11:00
Dylan DPC
979c8e885e
Rollup merge of #95335 - Badel2:resolve-path, r=Dylan-DPC
Move resolve_path to rustc_builtin_macros and make it private

Fixing a FIXME introduced by `@jyn514` in #85457
2022-03-27 05:36:09 +02:00
David Tolnay
f383134acc
Use str and char's Debug impl to format literals 2022-03-26 13:15:48 -07:00
Badel2
ea26d72710 Move resolve_path to rustc_builtin_macros and make it private 2022-03-26 16:47:13 +01:00
Vadim Petrochenkov
baa3ad4dc8 proc-macro: Stop wrapping ident matchers into groups 2022-03-26 12:38:46 +03:00
bors
c74925438c Auto merge of #95149 - cjgillot:once-diag, r=estebank
Remove `Session::one_time_diagnostic`

This is untracked mutable state, which modified the behaviour of queries.
It was used for 2 things: some full-blown errors, but mostly for lint declaration notes ("the lint level is defined here" notes).

It is replaced by the diagnostic deduplication infra which already exists in the diagnostic emitter.
A new diagnostic level `OnceNote` is introduced specifically for lint notes, to deduplicate subdiagnostics.

As a drive-by, diagnostic emission takes a `&mut` to allow dropping the `SubDiagnostic`s.
2022-03-26 00:54:54 +00:00
Nicholas Nethercote
fdec26ddad Shrink MatcherPosRepetition.
Currently it copies a `KleeneOp` and a `Token` out of a
`SequenceRepetition`. It's better to store a reference to the
`SequenceRepetition`, which is now possible due to #95159 having changed
the lifetimes.
2022-03-25 12:35:56 +11:00
Nicholas Nethercote
cad5f1e774 Shrink NamedMatchVec to one inline element.
This counters the `NamedMatchVec` size increase from the previous
commit, leaving `NamedMatchVec` smaller than before.
2022-03-25 12:35:56 +11:00
Nicholas Nethercote
6817442ec7 Split NamedMatch::MatchNonterminal in two.
The `Lrc` is only relevant within `transcribe()`. There, the `Lrc` is
helpful for the non-`NtTT` cases, because the entire nonterminal is
cloned. But for the `NtTT` cases the inner token tree is cloned (a full
clone) and so the `Lrc` is of no help.

This commit splits the `NtTT` and non-`NtTT` cases, avoiding the useless
`Lrc` in the former case, for the following effect on macro-heavy
crates.
- It reduces the total number of allocations a lot.
- It increases the size of some of the remaining allocations.
- It doesn't affect *peak* memory usage, because the larger allocations
  are short-lived.

This overall gives a speed win.
2022-03-25 12:35:52 +11:00
Nicholas Nethercote
904e70a7b0 Add a size assertion for NamedMatchVec. 2022-03-23 13:54:34 +11:00
bors
a4a5e79814 Auto merge of #95159 - nnethercote:TtParser, r=petrochenkov
Introduce `TtParser`

These commits make a number of changes to declarative macro expansion, resulting in code that is shorter, simpler, and faster.

Best reviewed one commit at a time.

r? `@petrochenkov`
2022-03-22 21:46:57 +00:00
Nicholas Nethercote
31df680789 Eliminate TokenTreeOrTokenTreeSlice.
As its name suggests, `TokenTreeOrTokenTreeSlice` is either a single
`TokenTree` or a slice of them. It has methods `len` and `get_tt` that
let it be treated much like an ordinary slice. The reason it isn't an
ordinary slice is that for `TokenTree::Delimited` the open and close
delimiters are represented implicitly, and when they are needed they are
constructed on the fly with `Delimited::{open,close}_tt`, rather than
being present in memory.

This commit changes `Delimited` so the open and close delimiters are
represented explicitly. As a result, `TokenTreeOrTokenTreeSlice` is no
longer needed and `MatcherPos` and `MatcherTtFrame` can just use an
ordinary slice. `TokenTree::{len,get_tt}` are also removed, because they
were only needed to support `TokenTreeOrTokenTreeSlice`.

The change makes the code shorter and a little bit faster on benchmarks
that use macro expansion heavily, partly because `MatcherPos` is a lot
smaller (less data to `memcpy`) and partly because ordinary slice
operations are faster than `TokenTreeOrTokenTreeSlice::{len,get_tt}`.
2022-03-23 07:13:31 +11:00
Caio
74e7313e0e Fix generated tokens hygiene 2022-03-21 19:45:55 -03:00
Nicholas Nethercote
754dc8e66f Move items into TtParser as Vecs.
By putting them in `TtParser`, we can reuse them for every rule in a
macro. With that done, they can be `SmallVec` instead of `Vec`, and this
is a performance win because these vectors are hot and `SmallVec`
operations are a bit slower due to always needing an "inline or heap?"
check.
2022-03-21 10:09:24 +11:00
Nicholas Nethercote
cedb787f6e Remove MatcherPosHandle.
This type was a small performance win for `html5ever`, which uses a
macro with hundreds of very simple rules that don't contain any
metavariables. But this type is complicated (extra lifetimes) and
perf-neutral for macros that do have metavariables.

This commit removes `MatcherPosHandle`, simplifying things a lot. This
increases the allocation rate for `html5ever` and similar cases a bit,
but makes things easier for follow-up changes that will improve
performance more than what we lost here.
2022-03-21 10:08:29 +11:00
Camille GILLOT
056951d628 Take &mut Diagnostic in emit_diagnostic.
Taking a Diagnostic by move would break the usual pattern
`diag.label(..).emit()`.
2022-03-20 20:36:08 +01:00
Nicholas Nethercote
10644e0789 Remove an impossible code path.
Doc comments cannot appear in a matcher.
2022-03-19 09:44:47 +11:00