Cargo pulls in libc from crates.io for a number of dependencies, but
0.2.27 is too old to work properly with Solaris. In particular, it
needs the change to make Solaris' PTHREAD_PROCESS_PRIVATE a 16-bit
integer.
APFloat: Rewrite It In Rust and use it for deterministic floating-point CTFE.
As part of the CTFE initiative, we're forced to find a solution for floating-point operations.
By design, IEEE-754 does not explicitly define everything in a deterministic manner, and there is some variability between platforms, at the very least (e.g. NaN payloads).
If types are to evaluate constant expressions involving type (or in the future, const) generics, that evaluation needs to be *fully deterministic*, even across `rustc` host platforms.
That is, if `[T; T::X]` was used in a cross-compiled library, and the evaluation of `T::X` executed a floating-point operation, that operation has to be reproducible on *any other host*, only knowing `T` and the definition of the `X` associated const (as either AST or HIR).
Failure to uphold those rules allows an associated type (e.g. `<Foo as Iterator>::Item`) to be seen as two (or more) different types, depending on the current host, and such type safety violations typically allow writing of a `transmute` in safe code, given enough generics.
The options considered by @rust-lang/compiler were:
1. Ban floating-point operations in generic const-evaluation contexts
2. Emulate floating-point operations in an uniformly deterministic fashion
The former option may seem appealing at first, but floating-point operations *are allowed today*, so they can't be banned wholesale, a distinction has to be made between the code that already works, and future generic contexts. *Moreover*, every computation that succeeded *has to be cached*, otherwise the generic case can be reproduced without any generics. IMO there are too many ways it can go wrong, and a single violation can be enough for an unsoundness hole.
Not to mention we may end up really wanting floating-point operations *anyway*, in CTFE.
I went with the latter option, and seeing how LLVM *already* has a library for this exact purpose (as it needs to perform optimizations independently of host floating-point capabilities), i.e. `APFloat`, that was what I ended up basing this PR on.
But having been burned by the low reusability of bindings that link to LLVM, and because I would *rather* the floating-point operations to be wrong than not deterministic or not memory-safe (`APFloat` does far more pointer juggling than I'm comfortable with), I decided to RIIR.
This way, we have a guarantee of *no* `unsafe` code, a bit more control over the where native floating-point might accidentally be involved, and non-LLVM backends can share it.
I've also ported all the testcases over, *before* any functionality, to catch any mistakes.
Currently the PR replaces all CTFE operations to go through `apfloat::ieee::{Single,Double}`, keeping only the bits of the `f32` / `f64` memory representation in between operations.
Converting from a string also double-checks that `core::num` and `apfloat` agree on the interpretation of a floating-point number literal, in case either of them has any bugs left around.
r? @nikomatsakis
f? @nagisa @est31
<hr/>
Huge thanks to @edef1c for first demoing usable `APFloat` bindings and to @chandlerc for fielding my questions on IRC about `APFloat` peculiarities (also upstreaming some bugfixes).
Run translation and LLVM in parallel when compiling with multiple CGUs
This is still a work in progress but the bulk of the implementation is done, so I thought it would be good to get it in front of more eyes.
This PR makes the compiler start running LLVM while translation is still in progress, effectively allowing for more parallelism towards the end of the compilation pipeline. It also allows the main thread to switch between either translation or running LLVM, which allows to reduce peak memory usage since not all LLVM module have to be kept in memory until linking. This is especially good for incr. comp. but it works just as well when running with `-Ccodegen-units=N`.
In order to help tuning and debugging the work scheduler, the PR adds the `-Ztrans-time-graph` flag which spits out html files that show how work packages where scheduled:
![Building regex](https://user-images.githubusercontent.com/1825894/28679272-f6752bd8-72f2-11e7-8a6c-56207855ce95.png)
(red is translation, green is llvm)
One side effect here is that `-Ztime-passes` might show something not quite correct because trans and LLVM are not strictly separated anymore. I plan to have some special handling there that will try to produce useful output.
One open question is how to determine whether the trans-thread should switch to intermediate LLVM processing.
TODO:
- [x] Restore `-Z time-passes` output for LLVM.
- [x] Update documentation, esp. for work package scheduling.
- [x] Tune the scheduling algorithm.
cc @alexcrichton @rust-lang/compiler
Three small fixes for save-analysis
First commit does some naive deduplication of macro uses. We end up with lots of duplication here because of the weird way we get this data (we extract a use for every span generated by a macro use).
Second commit is basically a typo fix.
Third commit is a bit interesting, it partially reverts a change from #40939 where temporary variables in format! (and thus println!) got a span with the primary pointing at the value stored into the temporary (e.g., `x` in `println!("...", x)`). If `format!` had a definition it should point at the temporary in the macro def, but since it is built-in, that is not possible (for now), so `DUMMY_SP` is the best we can do (using the span in the callee really breaks save-analysis because it thinks `x` is a definition as well as a reference).
There aren't a test for this stuff because: the deduplication is filtered by any of the users of save-analysis, so it is purely an efficiency change. I couldn't actually find an example for the second commit that we have any machinery to test, and the third commit is tested by the RLS, so there will be a test once I update the RLS version and and uncomment the previously failing tests).
r? @jseyfried
Rework Rustbuild to an eagerly compiling approach
This introduces a new dependency on `serde`; I don't believe that's a problem since bootstrap is compiled with nightly/beta always so proc macros are available. Compile times are slightly longer -- about 2-3x (30 seconds vs. 10 seconds). I don't think this is too big a problem, especially since recompiling bootstrap is somewhat rare. I think we can remove the dependency on Serde if necessary, though, so let me know.
r? @alexcrichton
Update the `cargo` submodule
Notably pull in an update to the `jobserver` crate to have Cargo set the
`CARGO_MAKEFLAGS` environment variable instead of the `MAKEFLAGS` environment
variable.
cc https://github.com/rust-lang/rust/issues/42635
Notably pull in an update to the `jobserver` crate to have Cargo set the
`CARGO_MAKEFLAGS` environment variable instead of the `MAKEFLAGS` environment
variable.
Switch to rust-lang-nursery/compiler-builtins
This commit migrates the in-tree `libcompiler_builtins` to the upstream version
at https://github.com/rust-lang-nursery/compiler-builtins. The upstream version
has a number of intrinsics written in Rust and serves as an in-progress rewrite
of compiler-rt into Rust. Additionally it also contains all the existing
intrinsics defined in `libcompiler_builtins` for 128-bit integers.
It's been the intention since the beginning to make this transition but
previously it just lacked the manpower to get done. As this PR likely shows it
wasn't a trivial integration! Some highlight changes are:
* The PR rust-lang-nursery/compiler-builtins#166 contains a number of fixes
across platforms and also some refactorings to make the intrinsics easier to
read. The additional testing added there also fixed a number of integration
issues when pulling the repository into this tree.
* LTO with the compiler-builtins crate was fixed to link in the entire crate
after the LTO process as these intrinsics are excluded from LTO.
* Treatment of hidden symbols was updated as previously the
`#![compiler_builtins]` crate would mark all symbol *imports* as hidden
whereas it was only intended to mark *exports* as hidden.
rustc: Implement the #[global_allocator] attribute
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.
[RFC 1974]: https://github.com/rust-lang/rfcs/pull/1974
The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.
cc #27389
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.
[RFC 1974]: https://github.com/rust-lang/rfcs/pull/197
The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.
cc #27389
This commit migrates the in-tree `libcompiler_builtins` to the upstream version
at https://github.com/rust-lang-nursery/compiler-builtins. The upstream version
has a number of intrinsics written in Rust and serves as an in-progress rewrite
of compiler-rt into Rust. Additionally it also contains all the existing
intrinsics defined in `libcompiler_builtins` for 128-bit integers.
It's been the intention since the beginning to make this transition but
previously it just lacked the manpower to get done. As this PR likely shows it
wasn't a trivial integration! Some highlight changes are:
* The PR rust-lang-nursery/compiler-builtins#166 contains a number of fixes
across platforms and also some refactorings to make the intrinsics easier to
read. The additional testing added there also fixed a number of integration
issues when pulling the repository into this tree.
* LTO with the compiler-builtins crate was fixed to link in the entire crate
after the LTO process as these intrinsics are excluded from LTO.
* Treatment of hidden symbols was updated as previously the
`#![compiler_builtins]` crate would mark all symbol *imports* as hidden
whereas it was only intended to mark *exports* as hidden.
When writing LLVM IR output demangled fn name in comments
`--emit=llvm-ir` looks like this now:
```
; <alloc::vec::Vec<T> as core::ops::index::IndexMut<core::ops::range::RangeFull>>::index_mut
; Function Attrs: inlinehint uwtable
define internal { i8*, i64 } @"_ZN106_$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..index..IndexMut$LT$core..ops..range..RangeFull$GT$$GT$9index_mut17h7f7b576609f30262E"(%"alloc::vec::Vec<u8>"* dereferenceable(24)) unnamed_addr #0 {
start:
...
```
cc https://github.com/integer32llc/rust-playground/issues/15
Integrate jobserver support to parallel codegen
This commit integrates the `jobserver` crate into the compiler. The crate was
previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:
* Primarily the compiler can cooperate with Cargo on parallelism. When you run
`cargo build -j4` then this'll make sure that the entire build process between
Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
instances which may all try to spawn lots of threads.
* Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver.
This means that if you call cargo/rustc from `make` or another
jobserver-compatible implementation it'll use foreign parallelism settings
instead of creating new ones locally.
As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!
Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.
This commit integrates the `jobserver` crate into the compiler. The crate was
previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:
* Primarily the compiler can cooperate with Cargo on parallelism. When you run
`cargo build -j4` then this'll make sure that the entire build process between
Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
instances which may all try to spawn lots of threads.
* Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver.
This means that if you call cargo/rustc from `make` or another
jobserver-compatible implementation it'll use foreign parallelism settings
instead of creating new ones locally.
As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!
Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.
This commit deletes the in-tree `getopts` crate in favor of the crates.io-based
`getopts` crate. The main difference here is with a new builder-style API, but
otherwise everything else remains relatively standard.