os-rust/library/core
Trevor Gross 3f9aa50b70
Rollup merge of #124874 - jedbrown:float-mul-add-fast, r=saethlin
intrinsics fmuladdf{32,64}: expose llvm.fmuladd.* semantics

Add intrinsics `fmuladd{f32,f64}`. These compute `(a * b) + c`, fusing the two operations if the code generator determines that (i) the target instruction set supports a fused operation and (ii) the fused operation is more efficient than the equivalent separate pair of `mul` and `add` instructions.

https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
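
As a rough illustration of why the fused/unfused choice is observable, the sketch below compares the two results that `fmuladd` is allowed to produce. It does not call the new intrinsic (which is unstable); instead it uses stable `f64::mul_add` for the always-fused case and plain `a * b + c` for the never-fused case. The helper names `fused` and `unfused` and the input values are chosen for illustration only.

```rust
/// Always fused: the product is kept exact and rounded once, after the add.
fn fused(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// Never fused: the product is rounded to f64 before the add.
fn unfused(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

fn main() {
    // eps = 2^-30, so a * b = 1 - 2^-60 exactly, which is not representable
    // in f64 and rounds to 1.0 when the multiplication is done on its own.
    let eps = 1.0 / (1u64 << 30) as f64;
    let (a, b, c) = (1.0 + eps, 1.0 - eps, -1.0);

    println!("fused   = {:e}", fused(a, b, c));
    println!("unfused = {:e}", unfused(a, b, c));
}
```

With these inputs the unfused form yields `0.0`, while the fused form yields `-2^-60`. A call to `fmuladdf64` may return either value depending on the target and the code generator's choice, so it is only suitable for code that tolerates both.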

codegen_cranelift uses the `fma` function from libc, which is a correct implementation but lacks the desired performance semantics. I think this requires an update to Cranelift to expose a suitable instruction in its IR.

I have not tested with codegen_gcc, but it should behave the same way (using `fma` from libc).

---
This topic has been discussed a few times on Zulip and was suggested, for example, by `@workingjubilee` in [Effect of fma disabled](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/Effect.20of.20fma.20disabled/near/274179331).
2024-10-11 23:57:44 -04:00
benches Reformat using the new identifier sorting from rustfmt 2024-09-22 19:11:29 -04:00
src Rollup merge of #124874 - jedbrown:float-mul-add-fast, r=saethlin 2024-10-11 23:57:44 -04:00
tests Rollup merge of #131287 - RalfJung:const_result, r=tgross35 2024-10-11 16:53:48 -05:00
Cargo.toml Port std library to RTEMS 2024-09-03 09:19:29 +02:00