No description
Find a file
Matthias Krüger 78529d9841
Rollup merge of #124921 - RalfJung:offset-from-same-addr, r=oli-obk
offset_from: always allow pointers to point to the same address

This PR implements the last remaining part of the t-opsem consensus in https://github.com/rust-lang/unsafe-code-guidelines/issues/472: always permits offset_from when both pointers have the same address, no matter how they are computed. This is required to achieve *provenance monotonicity*.

Tracking issue: https://github.com/rust-lang/rust/issues/117945

### What is provenance monotonicity and why does it matter?

Provenance monotonicity is the property that adding arbitrary provenance to any no-provenance pointer must never make the program UB. More specifically, in the program state, data in memory is stored as a sequence of [abstract bytes](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#abstract-byte), where each byte can optionally carry provenance. When a pointer is stored in memory, all of the bytes it is stored in carry that provenance. Provenance monotonicity means: if we take some byte that does not have provenance, and give it some arbitrary provenance, then that cannot change program behavior or introduce UB into a UB-free program.

We care about provenance monotonicity because we want to allow the optimizer to remove provenance-stripping operations. Removing a provenance-stripping operation effectively means the program after the optimization has provenance where the program before the optimization did not -- since the provenance removal does not happen in the optimized program. IOW, the compiler transformation added provenance to previously provenance-free bytes. This is exactly what provenance monotonicity lets us do.

We care about removing provenance-stripping operations because `*ptr = *ptr` is, in general, (likely) a provenance-stripping operation. Specifically, consider `ptr: *mut usize` (or any integer type), and imagine the data at `*ptr` is actually a pointer (i.e., we are type-punning between pointers and integers). Then `*ptr` on the right-hand side evaluates to the data in memory *without* any provenance (because [integers do not have provenance](https://rust-lang.github.io/rfcs/3559-rust-has-provenance.html#integers-do-not-have-provenance)). Storing that back to `*ptr` means that the abstract bytes `ptr` points to are the same as before, except their provenance is now gone. This makes  `*ptr = *ptr`  a provenance-stripping operation  (Here we assume `*ptr` is fully initialized. If it is not initialized, evaluating `*ptr` to a value is UB, so removing `*ptr = *ptr` is trivially correct.)

### What does `offset_from` have to do with provenance monotonicity?

With `ptr = without_provenance(N)`, `ptr.offset_from(ptr)` is always well-defined and returns 0. By provenance monotonicity, I can now add provenance to the two arguments of `offset_from` and it must still be well-defined. Crucially, I can add *different* provenance to the two arguments, and it must still be well-defined. In other words, this must always be allowed: `ptr1.with_addr(N).offset_from(ptr2.with_addr(N))` (and it returns 0). But the current spec for `offset_from` says that the two pointers must either both be derived from an integer or both be derived from the same allocation, which is not in general true for arbitrary `ptr1`, `ptr2`.

To obtain provenance monotonicity, this PR hence changes the spec for offset_from to say that if both pointers have the same address, the function is always well-defined.

### What further consequences does this have?

It means the compiler can no longer transform `end2 = begin.offset(end.offset_from(begin))` into `end2 = end`. However, it can still be transformed into `end2 = begin.with_addr(end.addr())`, which later parts of the backend (when provenance has been erased) can trivially turn into `end2 = end`.

The only alternative I am aware of is a fundamentally different handling of zero-sized accesses, where a "no provenance" pointer is not allowed to do zero-sized accesses and instead we have a special provenance that indicates "may be used for zero-sized accesses (and nothing else)". `offset` and `offset_from` would then always be UB on a "no provenance" pointer, and permit zero-sized offsets on a "zero-sized provenance" pointer. This achieves provenance monotonicity. That is, however, a breaking change as it contradicts what we landed in https://github.com/rust-lang/rust/pull/117329. It's also a whole bunch of extra UB, which doesn't seem worth it just to achieve that transformation.

### What about the backend?

LLVM currently doesn't have an intrinsic for pointer difference, so we anyway cast to integer and subtract there. That's never UB so it is compatible with any relaxation we may want to apply.

If LLVM gets a `ptrsub` in the future, then plausibly it will be consistent with `ptradd` and [consider two equal pointers to be inbounds](https://github.com/rust-lang/rust/pull/124921#issuecomment-2205795829).
2024-07-15 21:11:47 +02:00
.github Lower timeout of CI jobs to 4 hours 2024-07-12 11:27:46 +02:00
.reuse Rollup merge of #126876 - WaffleLapkin:unignoreconfigtoml, r=Mark-Simulacrum 2024-06-30 10:39:47 +02:00
compiler Rollup merge of #124921 - RalfJung:offset-from-same-addr, r=oli-obk 2024-07-15 21:11:47 +02:00
library Rollup merge of #124921 - RalfJung:offset-from-same-addr, r=oli-obk 2024-07-15 21:11:47 +02:00
LICENSES Add missing CC-BY-SA-4.0. 2023-11-27 11:03:53 +00:00
src Rollup merge of #124921 - RalfJung:offset-from-same-addr, r=oli-obk 2024-07-15 21:11:47 +02:00
tests Rollup merge of #124921 - RalfJung:offset-from-same-addr, r=oli-obk 2024-07-15 21:11:47 +02:00
.clang-format Add .clang-format 2024-06-26 05:56:00 +08:00
.editorconfig Only use max_line_length = 100 for *.rs 2023-07-10 15:18:36 -07:00
.git-blame-ignore-revs Ignore compiletest test directive migration commits 2024-02-22 18:55:02 +00:00
.gitattributes Rename config.toml.example to config.example.toml 2023-03-11 14:10:00 -08:00
.gitignore Avoid follow-up errors and ICEs after missing lifetime errors on data structures 2024-07-11 11:00:15 +00:00
.gitmodules refactor: add rustc-perf submodule to src/tools 2024-05-20 14:56:49 +00:00
.ignore Add .ignore file to make config.toml searchable in vscode 2024-06-24 10:15:16 +02:00
.mailmap .mailmap: Associate both my work and my private email with me 2024-06-15 09:27:39 +02:00
Cargo.lock Move DropBomb from run-make-support to build_helper 2024-07-12 20:14:37 +02:00
Cargo.toml Implement x perf as a separate tool 2024-06-27 10:22:03 +02:00
CODE_OF_CONDUCT.md Remove the code of conduct; instead link https://www.rust-lang.org/conduct.html 2019-10-05 22:55:19 +02:00
config.example.toml Rollup merge of #127322 - onur-ozkan:ci-rustc-incompatible-options, r=Mark-Simulacrum 2024-07-14 10:05:20 +02:00
configure Ensure ./configure works when configure.py path contains spaces 2024-02-16 18:57:22 +00:00
CONTRIBUTING.md fix: Update CONTRIBUTING.md recommend -> recommended 2023-11-16 23:57:09 +05:30
COPYRIGHT Update COPYRIGHT file 2022-10-30 10:23:14 -04:00
INSTALL.md Rollup merge of #127434 - onur-ozkan:use-bootstrap-instead-of-rustbuild, r=Mark-Simulacrum 2024-07-13 20:19:45 -07:00
LICENSE-APACHE Remove appendix from LICENCE-APACHE 2019-12-30 14:25:53 +00:00
LICENSE-MIT LICENSE-MIT: Remove inaccurate (misattributed) copyright notice 2017-07-26 16:51:58 -07:00
README.md Use SVG logos in the README.md. 2024-04-03 19:48:20 +02:00
RELEASES.md Add 1.80 release notes 2024-07-14 04:52:25 +01:00
rust-bors.toml Increase timeout for new bors bot 2024-03-13 08:31:07 +01:00
rustfmt.toml Add rustdoc-json support for use<> 2024-07-12 05:24:51 -04:00
triagebot.toml Auto merge of #125935 - madsmtm:merge-os-apple, r=workingjubilee 2024-07-14 16:28:07 +00:00
x Make x capable of resolving symlinks 2023-10-14 17:53:33 +03:00
x.ps1 use & instead of start-process in x.ps1 2023-12-09 09:46:16 -05:00
x.py Fix recent python linting errors 2023-08-02 04:40:28 -04:00

This is the main source code repository for Rust. It contains the compiler, standard library, and documentation.

Why Rust?

  • Performance: Fast and memory-efficient, suitable for critical services, embedded devices, and easily integrate with other languages.

  • Reliability: Our rich type system and ownership model ensure memory and thread safety, reducing bugs at compile-time.

  • Productivity: Comprehensive documentation, a compiler committed to providing great diagnostics, and advanced tooling including package manager and build tool (Cargo), auto-formatter (rustfmt), linter (Clippy) and editor support (rust-analyzer).

Quick Start

Read "Installation" from The Book.

Installing from Source

If you really want to install from source (though this is not recommended), see INSTALL.md.

Getting Help

See https://www.rust-lang.org/community for a list of chat platforms and forums.

Contributing

See CONTRIBUTING.md.

License

Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.

See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.

Trademark

The Rust Foundation owns and protects the Rust and Cargo trademarks and logos (the "Rust Trademarks").

If you want to use these names or brands, please read the media guide.

Third-party logos may be subject to third-party copyrights and trademarks. See Licenses for details.