os-rust/compiler/rustc_expand
Nicholas Nethercote 88f8fbcce0 A new matcher representation for use in parse_tt.
`parse_tt` currently traverses a `&[TokenTree]` to do matching. But this
is a bad representation for the traversal.
- `TokenTree` is nested, and there's a bunch of expensive and fiddly
  state required to handle entering and exiting nested submatchers.
- There are three positions (sequence separators, sequence Kleene ops,
  and end of the matcher) that are represented by an index that exceeds
  the end of the `&[TokenTree]`, which is clumsy and error-prone.

This commit introduces a new representation called `MatcherLoc` that is
designed specifically for matching. It fixes all the above problems,
making the code much easier to read. A `&[TokenTree]` is converted to a
`&[MatcherLoc]` before matching begins. Despite the cost of the
conversion, it's still a net performance win, because various pieces of
traversal state are computed once up-front, rather than having to be
recomputed repeatedly during the macro matching.

Some improvements worth noting.
- `parse_tt_inner` is *much* easier to read. No more having to compare
  `idx` against `len` and read comments to understand what the result
  means.
- The handling of `Delimited` in `parse_tt_inner` is now trivial.
- The three end-of-sequence cases in `parse_tt_inner` are now handled in
  three separate match arms, and the control flow is much simpler.
- `nameize` is no longer recursive.
- There were two places that issued "missing fragment specifier" errors:
  one in `parse_tt_inner()`, and one in `nameize()`. Presumably the
  latter was never executed. There's now a single place issuing these
  errors, in `compute_locs()`.
- The number of heap allocations done for a `check full` build of
  `async-std-1.10.0` (an extreme example of heavy macro use) drops from
  11.8M to 2.6M, and most of these occur outside of macro matching.
- The size of `MatcherPos` drops from 64 bytes to 16 bytes. Small enough
  that it no longer needs boxing, which partly accounts for the
  reduction in allocations.
- The rest of the drop in allocations is due to the removal of
  `MatcherKind`, because we no longer need to record anything for the
  parent matcher when entering a submatcher.
- Overall it reduces code size by 45 lines.
2022-04-04 17:01:28 +10:00
..
src A new matcher representation for use in parse_tt. 2022-04-04 17:01:28 +10:00
Cargo.toml Migrate to 2021 2021-09-20 22:21:42 -04:00