os-rust/compiler/stable_mir
bors a77322c16f Auto merge of #118310 - scottmcm:three-way-compare, r=davidtwco
Add `Ord::cmp` for primitives as a `BinOp` in MIR

Update: most of this OP was written months ago.  See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.

---

There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches.  Those differences are irrelevant at the Rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:

1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.

Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic.  Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical.  Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)?  But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)?  And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers.  Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.

As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR.  The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR changes it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks.  Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues.  (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
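To make the "many reasonable ways" point concrete, here is a minimal sketch of two source-level forms of a three-way integer comparison. The function names `cmp_branches` and `cmp_subtract` are made up for illustration; neither is claimed to be what libcore or any backend actually emits. Both agree with `Ord::cmp`, which is why the choice between them can be left to the backend.

```rust
use std::cmp::Ordering;

/// Branch-based form, mirroring the `if a < b { Less } else if a == b ...` chain.
/// (Hypothetical helper, for illustration only.)
fn cmp_branches(a: i32, b: i32) -> Ordering {
    if a < b {
        Ordering::Less
    } else if a == b {
        Ordering::Equal
    } else {
        Ordering::Greater
    }
}

/// Subtraction form: `(a > b) - (a < b)` evaluates to -1, 0, or 1.
/// (Hypothetical helper, for illustration only.)
fn cmp_subtract(a: i32, b: i32) -> Ordering {
    let sign = (a > b) as i8 - (a < b) as i8;
    match sign {
        -1 => Ordering::Less,
        0 => Ordering::Equal,
        _ => Ordering::Greater,
    }
}

fn main() {
    // Both forms must agree with the built-in comparison on every input.
    for (a, b) in [(1, 2), (2, 2), (3, 2), (i32::MIN, i32::MAX)] {
        assert_eq!(cmp_branches(a, b), a.cmp(&b));
        assert_eq!(cmp_subtract(a, b), a.cmp(&b));
    }
}
```

Which of these forms compiles best depends on the optimizer, so neither is obviously the right one to hard-code in the library.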

---

r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
2024-04-02 19:21:44 +00:00
src Auto merge of #118310 - scottmcm:three-way-compare, r=davidtwco 2024-04-02 19:21:44 +00:00
Cargo.toml Split out the stable part of smir into its own crate to prevent accidental usage of forever unstable things 2023-09-25 14:38:27 +00:00
README.md Split out the stable part of smir into its own crate to prevent accidental usage of forever unstable things 2023-09-25 14:38:27 +00:00
rust-toolchain.toml Split out the stable part of smir into its own crate to prevent accidental usage of forever unstable things 2023-09-25 14:38:27 +00:00

This crate is regularly synced with its mirror in the rustc repo at compiler/rustc_smir.

We use git subtree for this to preserve commits and allow the rustc repo to edit these crates without having to touch this repo. This keeps the crates compiling while allowing us to independently work on them here. The effort of keeping them in sync is pushed entirely onto us, without affecting rustc workflows negatively. This may change in the future, but changes to policy should only be done via a compiler team MCP.

Instructions for working on this crate locally

Since the crate is the same in the rustc repo and here, the dependencies on rustc_* crates will work either here or there, but never in both places at the same time. We therefore make the rustc_* dependencies optional, which requires local development to use

cargo build --no-default-features -Zavoid-dev-deps

in order to compile successfully.

Instructions for syncing

Updating this repository

In the rustc repo, execute

git subtree push --prefix=compiler/rustc_smir url_to_your_fork_of_project_stable_mir some_feature_branch

and then open a PR of your some_feature_branch against https://github.com/rust-lang/project-stable-mir

Updating the rustc library

First we need to bump our stack limit, as the rustc repo otherwise quickly hits that:

ulimit -s 60000

Maximum function recursion depth (1000) reached

Then we need to disable dash as the default shell for sh scripts, as otherwise we run into a hard limit of a recursion depth of 1000:

sudo dpkg-reconfigure dash

and then select No to disable dash.

Patching your git subtree

The stock git subtree does not scale to repos of the size of the rustc repo. So download the git-subtree.sh from https://github.com/gitgitgadget/git/pull/493/files and run

sudo cp --backup /path/to/patched/git-subtree.sh /usr/lib/git-core/git-subtree
sudo chmod --reference=/usr/lib/git-core/git-subtree~ /usr/lib/git-core/git-subtree
sudo chown --reference=/usr/lib/git-core/git-subtree~ /usr/lib/git-core/git-subtree

Actually doing a sync

In the rustc repo, execute

git subtree pull --prefix=compiler/rustc_smir https://github.com/rust-lang/project-stable-mir smir

Note: only ever sync to rustc from the project-stable-mir's smir branch. Do not sync with your own forks.

Then open a PR against rustc just like a regular PR.

Stable MIR Design

The stable-mir will follow a similar approach to proc-macro2. Its implementation will eventually be broken down into two main crates:

  • stable_mir: Public crate, to be published on crates.io, which will contain the stable data structure as well as proxy APIs to make calls to the compiler.
  • rustc_smir: The compiler crate that will translate from internal MIR to SMIR. This crate will also implement APIs that will be invoked by stable-mir to query the compiler for more information.

This will help tools communicate with the Rust compiler via stable APIs. Tools will depend on the stable_mir crate, which will invoke the compiler using APIs defined in rustc_smir. I.e.:

    ┌──────────────────────────────────┐           ┌──────────────────────────────────┐
    │   External Tool     ┌──────────┐ │           │ ┌──────────┐   Rust Compiler     │
    │                     │          │ │           │ │          │                     │
    │                     │stable_mir│ │           │ │rustc_smir│                     │
    │                     │          │ ├──────────►│ │          │                     │
    │                     │          │ │◄──────────┤ │          │                     │
    │                     │          │ │           │ │          │                     │
    │                     │          │ │           │ │          │                     │
    │                     └──────────┘ │           │ └──────────┘                     │
    └──────────────────────────────────┘           └──────────────────────────────────┘

More details can be found here: https://hackmd.io/XhnYHKKuR6-LChhobvlT-g?view

For now, the code for these two crates is in separate modules of this crate. The modules have the same names as the crates for simplicity. We also have a third module, rustc_internal, which will expose APIs and definitions that allow users to gather information from internal MIR constructs that haven't been exposed in the stable_mir module.
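The split described above can be sketched as two modules joined by a trait. This is only an illustration of the proxy pattern under stated assumptions: every name below (`tool_facing`, `compiler_facing`, `FnSummary`, `CompilerBridge`, `Translator`, `InternalBody`) is hypothetical and is not the actual stable_mir or rustc_smir API.

```rust
/// Hypothetical stand-in for the public, tool-facing side.
mod tool_facing {
    /// A stable, compiler-independent description of a function.
    #[derive(Debug, Clone, PartialEq)]
    pub struct FnSummary {
        pub name: String,
        pub basic_blocks: usize,
    }

    /// Interface the compiler-facing side is assumed to implement.
    pub trait CompilerBridge {
        fn fn_summaries(&self) -> Vec<FnSummary>;
    }

    /// Proxy API: the tool calls this, and the query is forwarded to the bridge.
    pub fn all_functions(bridge: &dyn CompilerBridge) -> Vec<FnSummary> {
        bridge.fn_summaries()
    }
}

/// Hypothetical stand-in for the compiler-facing side.
mod compiler_facing {
    use super::tool_facing::{CompilerBridge, FnSummary};

    /// Placeholder for an internal, unstable representation of a body.
    pub struct InternalBody {
        pub symbol: &'static str,
        pub blocks: Vec<u32>,
    }

    /// Translates internal bodies into the stable representation.
    pub struct Translator {
        pub bodies: Vec<InternalBody>,
    }

    impl CompilerBridge for Translator {
        fn fn_summaries(&self) -> Vec<FnSummary> {
            self.bodies
                .iter()
                .map(|body| FnSummary {
                    name: body.symbol.to_string(),
                    basic_blocks: body.blocks.len(),
                })
                .collect()
        }
    }
}

fn main() {
    let translator = compiler_facing::Translator {
        bodies: vec![compiler_facing::InternalBody {
            symbol: "main",
            blocks: vec![0, 1, 2],
        }],
    };
    // The tool only ever sees the stable `FnSummary` type.
    for summary in tool_facing::all_functions(&translator) {
        println!("{} has {} basic blocks", summary.name, summary.basic_blocks);
    }
}
```

The design point is that the tool depends only on the types in the tool-facing module, so the internal representation can change without breaking it.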