os-rust/compiler/stable_mir
bors 7120fdac7a Auto merge of #126963 - runtimeverification:smir_serde_derive, r=celinval
Add basic Serde serialization capabilities to Stable MIR

This PR adds basic Serde serialization capabilities to Stable MIR. It is intentionally minimal (just wrapping all stable MIR types with a Serde `derive`), so that any important design decisions can be discussed before going further. A simple test is included with this PR to validate that JSON can actually be emitted.

## Notes

When I wrapped the Stable MIR error types in `compiler/stable_mir/src/error.rs`, it caused test failures (though I'm not sure why) so I backed those out.

## Future Work

This PR supports serializing basic stable MIR, but it _does not_ support serializing interned values beneath `Ty`s, `AllocId`s, etc. My current thinking about how to handle this is as follows:

1.  Add new `visited_X` fields to the `Tables` struct for each interned category of interest.

2.  As serialization is occurring, serialize interned values as usual _and_ also record the interned value we referenced in `visited_X`.

    (Possibly) In addition, if an interned value recursively references other interned values, record those interned values as well.

3.  Teach the stable MIR `Context` how to access the `visited_X` values and expose them with wrappers in `stable_mir/src/lib.rs` to users (e.g. to serialize and/or further analyze them).
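
The three steps above can be sketched roughly as follows. This is only an illustration using the standard library: `TyId`, `TyKind`, the `visited_tys` field, and the hand-written `serialize_ty` are hypothetical stand-ins, not the real `Tables` layout or the Serde-derived impls from this PR.

```rust
use std::cell::RefCell;
use std::collections::BTreeSet;

/// Hypothetical stand-in for an interned type handle (not the real `Ty`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct TyId(pub usize);

/// Hypothetical stand-in for the interned payload behind a `TyId`.
#[derive(Debug)]
pub enum TyKind {
    Int,
    Ref(TyId),
    Tuple(Vec<TyId>),
}

/// Hypothetical stand-in for `Tables`: the interner plus one `visited_X` set (step 1).
pub struct Tables {
    pub tys: Vec<TyKind>,
    pub visited_tys: RefCell<BTreeSet<TyId>>,
}

impl Tables {
    pub fn new(tys: Vec<TyKind>) -> Self {
        Tables { tys, visited_tys: RefCell::new(BTreeSet::new()) }
    }

    /// Step 2: emit the handle as usual and record that it was referenced.
    pub fn serialize_ty(&self, ty: TyId) -> String {
        self.record(ty);
        format!("{{\"ty\":{}}}", ty.0)
    }

    /// Optional part of step 2: also record interned values reachable from `ty`.
    fn record(&self, ty: TyId) {
        // `insert` returns false for an already recorded id, which also stops cycles.
        if !self.visited_tys.borrow_mut().insert(ty) {
            return;
        }
        match &self.tys[ty.0] {
            TyKind::Int => {}
            TyKind::Ref(inner) => self.record(*inner),
            TyKind::Tuple(elems) => elems.iter().for_each(|e| self.record(*e)),
        }
    }

    /// Step 3: expose what was visited so the user can serialize it separately.
    pub fn visited_tys(&self) -> Vec<TyId> {
        self.visited_tys.borrow().iter().copied().collect()
    }
}
```

In this sketch the serialized output only ever contains the handle; the caller decides afterwards what to do with the ids returned by `visited_tys`.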

### Pros

This approach does not commit to any specific serialization format regarding interned values or other more complex cases, which avoids us locking into any behaviors that may not be desired long-term.

### Cons

The user will need to manually handle serializing interned values.

### Alternatives

1.  We can directly provide access to the underlying `Tables` maps for interned values; the disadvantage of this approach is that it either requires extra processing for users to filter out to only use the values that they need _or_ users may serialize extra values that they don't need. The advantage is that the implementation is even simpler. The other pros/cons are similar to the above.

2.  We can directly serialize interned values by expanding them in-place. The pro is that this may make some basic inputs easier to consume. However, the cons are that there will need to be special provisions for dealing with cyclical values on both the producer and consumer _and_ global values will possibly need to be de-duplicated on the consumer side.
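
To make the trade-off in alternative 2 concrete, here is a small standard-library sketch of in-place expansion with a cycle guard. The `Ty` enum, the index-based table, and the `<cycle:N>` marker are all hypothetical choices made for this illustration, not an output format proposed by this PR.

```rust
/// Hypothetical interned type; fields refer to other entries by table index.
pub enum Ty {
    Int,
    Ref(usize),
    Struct(Vec<usize>),
}

/// Expands the entry `id` in place, replacing back-references with a marker.
/// `stack` holds the ids currently being expanded (the cycle guard).
pub fn expand(table: &[Ty], id: usize, stack: &mut Vec<usize>) -> String {
    if stack.contains(&id) {
        // The producer must emit something for a cycle, and the consumer
        // must know how to resolve it.
        return format!("<cycle:{}>", id);
    }
    stack.push(id);
    let out = match &table[id] {
        Ty::Int => "int".to_string(),
        Ty::Ref(inner) => format!("&{}", expand(table, *inner, stack)),
        Ty::Struct(fields) => {
            let parts: Vec<String> =
                fields.iter().map(|f| expand(table, *f, stack)).collect();
            format!("{{{}}}", parts.join(","))
        }
    };
    stack.pop();
    out
}
```

Note that a value referenced from several places is expanded once per reference, which is why the consumer may need to de-duplicate.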
2024-07-25 20:27:51 +00:00
src Auto merge of #126963 - runtimeverification:smir_serde_derive, r=celinval 2024-07-25 20:27:51 +00:00
Cargo.toml add serde derive Serialize to stable_mir 2024-06-26 11:56:01 -04:00
README.md Split out the stable part of smir into its own crate to prevent accidental usage of forever unstable things 2023-09-25 14:38:27 +00:00
rust-toolchain.toml Split out the stable part of smir into its own crate to prevent accidental usage of forever unstable things 2023-09-25 14:38:27 +00:00

This crate is regularly synced with its mirror in the rustc repo at compiler/rustc_smir.

We use git subtree for this to preserve commits and allow the rustc repo to edit these crates without having to touch this repo. This keeps the crates compiling while allowing us to independently work on them here. The effort of keeping them in sync is pushed entirely onto us, without affecting rustc workflows negatively. This may change in the future, but changes to policy should only be done via a compiler team MCP.

Instructions for working on this crate locally

Since the crate is the same in the rustc repo and here, the dependencies on rustc_* crates will work either here or there, but never in both places at the same time. Thus we use optional dependencies on the rustc_* crates, requiring local development to use

cargo build --no-default-features -Zavoid-dev-deps

in order to compile successfully.

Instructions for syncing

Updating this repository

In the rustc repo, execute

git subtree push --prefix=compiler/rustc_smir url_to_your_fork_of_project_stable_mir some_feature_branch

and then open a PR of your some_feature_branch against https://github.com/rust-lang/project-stable-mir

Updating the rustc library

First we need to bump our stack limit, as the rustc repo otherwise quickly hits that:

ulimit -s 60000

Maximum function recursion depth (1000) reached

Then we need to disable dash as the default shell for sh scripts, as otherwise we run into a hard limit of a recursion depth of 1000:

sudo dpkg-reconfigure dash

and then select No to disable dash.

Patching your git subtree

The regular git subtree does not scale to repos of the size of the rustc repo. So download the git-subtree.sh from https://github.com/gitgitgadget/git/pull/493/files and run

sudo cp --backup /path/to/patched/git-subtree.sh /usr/lib/git-core/git-subtree
sudo chmod --reference=/usr/lib/git-core/git-subtree~ /usr/lib/git-core/git-subtree
sudo chown --reference=/usr/lib/git-core/git-subtree~ /usr/lib/git-core/git-subtree

Actually doing a sync

In the rustc repo, execute

git subtree pull --prefix=compiler/rustc_smir https://github.com/rust-lang/project-stable-mir smir

Note: only ever sync to rustc from the project-stable-mir's smir branch. Do not sync with your own forks.

Then open a PR against rustc just like a regular PR.

Stable MIR Design

The stable-mir will follow a similar approach to proc-macro2. Its implementation will eventually be broken down into two main crates:

  • stable_mir: Public crate, to be published on crates.io, which will contain the stable data structure as well as proxy APIs to make calls to the compiler.
  • rustc_smir: The compiler crate that will translate from internal MIR to SMIR. This crate will also implement APIs that will be invoked by stable-mir to query the compiler for more information.

This will help tools to communicate with the Rust compiler via stable APIs. Tools will depend on the stable_mir crate, which will invoke the compiler using APIs defined in rustc_smir. I.e.:

    ┌──────────────────────────────────┐           ┌──────────────────────────────────┐
    │   External Tool     ┌──────────┐ │           │ ┌──────────┐   Rust Compiler     │
    │                     │          │ │           │ │          │                     │
    │                     │stable_mir│ │           │ │rustc_smir│                     │
    │                     │          │ ├──────────►│ │          │                     │
    │                     │          │ │◄──────────┤ │          │                     │
    │                     │          │ │           │ │          │                     │
    │                     │          │ │           │ │          │                     │
    │                     └──────────┘ │           │ └──────────┘                     │
    └──────────────────────────────────┘           └──────────────────────────────────┘

More details can be found here: https://hackmd.io/XhnYHKKuR6-LChhobvlT-g?view
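
The split described above can be illustrated with a small standard-library sketch. Everything in it is hypothetical and chosen for illustration: the `Context` trait and its methods, the thread-local slot, and `FakeCompiler` stand in for the compiler side, while `local_crate_name` and `all_fn_names` play the role of the proxy APIs a tool would call. It is not the actual stable_mir or rustc_smir API.

```rust
use std::cell::RefCell;

/// Hypothetical interface that the compiler side would implement.
pub trait Context {
    fn crate_name(&self) -> String;
    fn fn_names(&self) -> Vec<String>;
}

thread_local! {
    // The currently installed compiler context, if any.
    static CTX: RefCell<Option<Box<dyn Context>>> = RefCell::new(None);
}

/// Compiler side: install a context, run the tool's callback, then uninstall it.
pub fn run<R>(cx: Box<dyn Context>, tool: impl FnOnce() -> R) -> R {
    CTX.with(|slot| *slot.borrow_mut() = Some(cx));
    let result = tool();
    CTX.with(|slot| *slot.borrow_mut() = None);
    result
}

/// Forwards a query to whatever context is currently installed.
fn with_cx<R>(f: impl FnOnce(&dyn Context) -> R) -> R {
    CTX.with(|slot| {
        let guard = slot.borrow();
        let cx = guard.as_ref().expect("no compiler context installed");
        f(&**cx)
    })
}

/// Tool-facing proxy APIs: no compiler types appear in their signatures.
pub fn local_crate_name() -> String {
    with_cx(|cx| cx.crate_name())
}

pub fn all_fn_names() -> Vec<String> {
    with_cx(|cx| cx.fn_names())
}

/// A fake compiler used only for this illustration.
pub struct FakeCompiler;

impl Context for FakeCompiler {
    fn crate_name(&self) -> String {
        "demo".to_string()
    }
    fn fn_names(&self) -> Vec<String> {
        vec!["main".to_string(), "helper".to_string()]
    }
}
```

A tool would then be driven with something like `run(Box::new(FakeCompiler), || all_fn_names())`, and it only ever sees the proxy functions, never the context behind them.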

For now, the code for these two crates is in separate modules of this crate. The modules have the same names as the crates for simplicity. We also have a third module, rustc_internal, which will expose APIs and definitions that allow users to gather information from internal MIR constructs that haven't been exposed in the stable_mir module.