granite-rust/compiler
bors fd9bf59436 Auto merge of #111999 - scottmcm:codegen-less-memcpy, r=compiler-errors
Use `load`+`store` instead of `memcpy` for small integer arrays

I was inspired by #98892 to see whether, rather than making `mem::swap` do something smart in the library, we could update MIR assignments like `*_1 = *_2` to do something smarter than `memcpy` for types small enough that doing it inline is going to be better than a `memcpy` call in assembly anyway.  After all, special code may help `mem::swap`, but if the "obvious" MIR can just result in the correct thing, that helps everything -- other code like `mem::replace`, people doing it manually, and just passing around by value in general -- and it also makes MIR inlining happier, since it doesn't need to deal with all the complicated library code if it just sees a couple of assignments.
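As a rough illustration of the "people doing it manually" case, here is a minimal sketch of source code whose MIR consists of plain typed copies of a small array. The function names `swap_by_hand` and `overwrite` are hypothetical and chosen only for this example; nothing here is taken from the PR itself.

```rust
/// Swaps two small arrays "by hand", without calling `mem::swap`.
/// Every statement is a plain typed copy of a `[u16; 3]`.
pub fn swap_by_hand(x: &mut [u16; 3], y: &mut [u16; 3]) {
    let tmp = *x; // copy the array out of `x`
    *x = *y; // an assignment of the `*_1 = *_2` shape
    *y = tmp; // copy the saved value into `y`
}

/// Overwrites one small array with another, again as a single typed copy.
pub fn overwrite(dst: &mut [u16; 3], src: &[u16; 3]) {
    *dst = *src;
}
```

Both functions involve only `Copy` assignments of a 6-byte type, which is the kind of assignment the description is talking about; whether a given compiler version emits `memcpy` or `load`+`store` for them is best checked by inspecting the emitted IR.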

LLVM will turn the short, known-length `memcpy`s into direct instructions in the backend, but that's too late for it to be able to remove `alloca`s.  In general, replacing `memcpy`s with typed instructions in the middle-end -- even for `memcpy.inline`, where it knows it won't be a function call -- is hard [due to poison propagation issues](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/memcpy.20vs.20load-store.20for.20MIR.20assignments/near/360376712).  So because we know more about the type invariants -- these are typed copies -- rustc can emit something more specific, allowing LLVM to `mem2reg` away the `alloca`s in some situations.

#52051 previously did something like this in the library for `mem::swap`, but it ended up regressing when MIR inlining was enabled (cbbf06b0cd), so this has been suboptimal on stable for ≈5 releases now.

The code in this PR is narrowly targeted at just integer arrays in LLVM, but works via a new method on the [`LayoutTypeMethods`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.LayoutTypeMethods.html) trait, so specific backends based on cg_ssa can enable this for more situations over time, as we find them.  I don't want to try to bite off too much in this PR, though.  (Transparent newtypes and simple things like the 3×usize `String` would be obvious candidates for a follow-up.)

Codegen demonstrations: <https://llvm.godbolt.org/z/fK8hT9aqv>
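For orientation, the demonstrations below are consistent with Rust source along the following lines. This is a sketch inferred from the IR signatures (`[3 x i16]` with `align 2`, and `sret([3 x i32])`); the exact element types (`u16`/`u32` rather than signed integers) and the function bodies are assumptions, and the real source behind the godbolt link may differ.

```rust
/// Assumed source for the `swap_rgb48` demonstration: swapping two
/// 6-byte arrays through the standard library.
pub fn swap_rgb48(x: &mut [u16; 3], y: &mut [u16; 3]) {
    std::mem::swap(x, y);
}

/// Assumed source for the `replace_short_array` demonstration: storing a
/// new 12-byte array into `r` and returning the previous contents.
pub fn replace_short_array(r: &mut [u32; 3], v: [u32; 3]) -> [u32; 3] {
    std::mem::replace(r, v)
}
```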

Before:
```llvm
define void @swap_rgb48_old(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #1 {
  %a.i = alloca [3 x i16], align 2
  call void @llvm.lifetime.start.p0(i64 6, ptr nonnull %a.i)
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %a.i, ptr noundef nonnull align 2 dereferenceable(6) %x, i64 6, i1 false)
  tail call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %x, ptr noundef nonnull align 2 dereferenceable(6) %y, i64 6, i1 false)
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %y, ptr noundef nonnull align 2 dereferenceable(6) %a.i, i64 6, i1 false)
  call void @llvm.lifetime.end.p0(i64 6, ptr nonnull %a.i)
  ret void
}
```
Note that it goes through the stack:
```nasm
swap_rgb48_old:                         # @swap_rgb48_old
        movzx   eax, word ptr [rdi + 4]
        mov     word ptr [rsp - 4], ax
        mov     eax, dword ptr [rdi]
        mov     dword ptr [rsp - 8], eax
        movzx   eax, word ptr [rsi + 4]
        mov     word ptr [rdi + 4], ax
        mov     eax, dword ptr [rsi]
        mov     dword ptr [rdi], eax
        movzx   eax, word ptr [rsp - 4]
        mov     word ptr [rsi + 4], ax
        mov     eax, dword ptr [rsp - 8]
        mov     dword ptr [rsi], eax
        ret
```

Now:
```llvm
define void @swap_rgb48(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #0 {
start:
  %0 = load <3 x i16>, ptr %x, align 2
  %1 = load <3 x i16>, ptr %y, align 2
  store <3 x i16> %1, ptr %x, align 2
  store <3 x i16> %0, ptr %y, align 2
  ret void
}
```
still lowers to `dword`+`word` operations, but has no stack traffic:
```nasm
swap_rgb48:                             # @swap_rgb48
        mov     eax, dword ptr [rdi]
        movzx   ecx, word ptr [rdi + 4]
        movzx   edx, word ptr [rsi + 4]
        mov     r8d, dword ptr [rsi]
        mov     dword ptr [rdi], r8d
        mov     word ptr [rdi + 4], dx
        mov     word ptr [rsi + 4], cx
        mov     dword ptr [rsi], eax
        ret
```

And as a demonstration that this isn't just `mem::swap`: a `mem::replace` on a small array (replace hasn't used swap since #83022), which used to be `memcpy`s in LLVM, changes in IR to
```llvm
define void @replace_short_array(ptr noalias nocapture noundef sret([3 x i32]) dereferenceable(12) %0, ptr noalias noundef align 4 dereferenceable(12) %r, ptr noalias nocapture noundef readonly dereferenceable(12) %v) unnamed_addr #0 {
start:
  %1 = load <3 x i32>, ptr %r, align 4
  store <3 x i32> %1, ptr %0, align 4
  %2 = load <3 x i32>, ptr %v, align 4
  store <3 x i32> %2, ptr %r, align 4
  ret void
}
```
but that lowers to reasonable `dword`+`qword` instructions still
```nasm
replace_short_array:                    # @replace_short_array
        mov     rax, rdi
        mov     rcx, qword ptr [rsi]
        mov     edi, dword ptr [rsi + 8]
        mov     dword ptr [rax + 8], edi
        mov     qword ptr [rax], rcx
        mov     rcx, qword ptr [rdx]
        mov     edx, dword ptr [rdx + 8]
        mov     dword ptr [rsi + 8], edx
        mov     qword ptr [rsi], rcx
        ret
```
2023-06-06 01:50:28 +00:00
..
rustc fix link 2023-03-11 10:53:47 -06:00
rustc_abi Use translatable diagnostics in rustc_const_eval 2023-06-01 14:45:18 +00:00
rustc_apfloat
rustc_arena Deny the unsafe_op_in_unsafe_fn lint in 2023-04-28 21:00:54 -07:00
rustc_ast Add warn-by-default lint for local binding shadowing exported glob re-export item 2023-05-27 18:49:07 +08:00
rustc_ast_lowering Separate AnonConst from ConstBlock in HIR. 2023-06-02 21:25:18 +00:00
rustc_ast_passes cleanup 2023-06-03 09:44:30 +08:00
rustc_ast_pretty Migrate offset_of from a macro to builtin # syntax 2023-05-05 21:44:13 +02:00
rustc_attr Ensure Fluent messages are in alphabetical order 2023-05-25 23:49:35 +00:00
rustc_baked_icu_data Regen baked data 2023-05-02 10:45:16 -07:00
rustc_borrowck Rollup merge of #111980 - compiler-errors:unmapped-substs, r=lcnr 2023-06-01 11:09:43 +05:30
rustc_builtin_macros Auto merge of #111748 - nnethercote:Cow-DiagnosticMessage, r=WaffleLapkin 2023-05-29 07:10:44 +00:00
rustc_codegen_cranelift Rollup merge of #112168 - scottmcm:lower-div-rem-unchecked-to-mir, r=oli-obk 2023-06-02 16:02:06 -07:00
rustc_codegen_gcc Use translatable diagnostics in rustc_const_eval 2023-06-01 14:45:18 +00:00
rustc_codegen_llvm Use load-store instead of memcpy for short integer arrays 2023-06-04 00:51:49 -07:00
rustc_codegen_ssa Auto merge of #111999 - scottmcm:codegen-less-memcpy, r=compiler-errors 2023-06-06 01:50:28 +00:00
rustc_const_eval Rollup merge of #112168 - scottmcm:lower-div-rem-unchecked-to-mir, r=oli-obk 2023-06-02 16:02:06 -07:00
rustc_data_structures Update dependencies with reported vulnerabilities 2023-06-02 12:34:01 -05:00
rustc_driver fix spelling error 2023-06-05 16:01:09 +02:00
rustc_driver_impl Use Cow in {D,Subd}iagnosticMessage. 2023-05-29 09:23:43 +10:00
rustc_error_codes Implement custom diagnostic for ConstParamTy 2023-06-01 18:21:42 +00:00
rustc_error_messages Use translatable diagnostics in rustc_const_eval 2023-06-01 14:45:18 +00:00
rustc_errors Auto merge of #112198 - compiler-errors:rollup-o2xe4of, r=compiler-errors 2023-06-02 07:57:21 +00:00
rustc_expand Use Cow in {D,Subd}iagnosticMessage. 2023-05-29 09:23:43 +10:00
rustc_feature cleanup 2023-06-03 09:44:30 +08:00
rustc_fluent_macro Remove unused synstructure dep 2023-04-22 22:03:33 +01:00
rustc_fs_util Add try_canonicalize to rustc_fs_util and use it over fs::canonicalize 2023-03-16 21:50:23 +01:00
rustc_graphviz enable rust_2018_idioms for doctests 2023-05-07 00:12:29 +03:00
rustc_hir Separate AnonConst from ConstBlock in HIR. 2023-06-02 21:25:18 +00:00
rustc_hir_analysis Rollup merge of #112322 - compiler-errors:no-IMPLIED_BOUNDS_ENTAILMENT-if-errs, r=eholk 2023-06-05 23:48:00 +02:00
rustc_hir_pretty Separate AnonConst from ConstBlock in HIR. 2023-06-02 21:25:18 +00:00
rustc_hir_typeck Fix type-inference regression in #112225 2023-06-04 10:56:00 +02:00
rustc_incremental Ensure Fluent messages are in alphabetical order 2023-05-25 23:49:35 +00:00
rustc_index Auto merge of #111925 - Manishearth:rollup-z6z6l2v, r=Manishearth 2023-05-25 00:33:43 +00:00
rustc_infer Normalize anon consts in new solver 2023-06-02 22:07:57 +00:00
rustc_interface Ensure Fluent messages are in alphabetical order 2023-05-25 23:49:35 +00:00
rustc_lexer Don't try to eat non-existent decimal digits. 2023-05-15 18:33:12 +10:00
rustc_lint Ensure space is inserted after keyword in unused_delims 2023-06-05 14:25:00 +00:00
rustc_lint_defs Remove const eval limit and implement an exponential backoff lint instead 2023-05-31 10:24:17 +00:00
rustc_llvm Add SafeStack support to rustc 2023-05-26 15:18:54 -04:00
rustc_log Stabilize IsTerminal 2023-04-10 17:24:23 +09:00
rustc_macros Use translatable diagnostics in rustc_const_eval 2023-06-01 14:45:18 +00:00
rustc_metadata Separate AnonConst from ConstBlock in HIR. 2023-06-02 21:25:18 +00:00
rustc_middle Rollup merge of #112183 - compiler-errors:new-solver-anon-ct, r=BoxyUwU 2023-06-02 16:02:06 -07:00
rustc_mir_build Show note for type ascription interpreted as a constant pattern, not a new variable 2023-06-04 20:49:30 +08:00
rustc_mir_dataflow unique borrows are mutating uses 2023-05-29 17:15:48 +02:00
rustc_mir_transform Auto merge of #112240 - cjgillot:recurse-inline, r=scottmcm 2023-06-04 03:39:24 +00:00
rustc_monomorphize Clarify follow_inlining. 2023-06-02 13:07:30 +10:00
rustc_parse Auto merge of #111748 - nnethercote:Cow-DiagnosticMessage, r=WaffleLapkin 2023-05-29 07:10:44 +00:00
rustc_parse_format Fix typos in compiler 2023-04-10 22:02:52 +02:00
rustc_passes Rollup merge of #112081 - obeis:doc-test-literal, r=compiler-errors 2023-06-05 23:47:57 +02:00
rustc_plugin_impl Add rustc_fluent_macro to decouple fluent from rustc_macros 2023-04-18 18:56:22 +00:00
rustc_privacy Rename impl_defaultness to defaultness 2023-06-01 06:14:06 +00:00
rustc_query_impl deps: bump crates 2023-05-26 13:03:47 +03:00
rustc_query_system Ensure Fluent messages are in alphabetical order 2023-05-25 23:49:35 +00:00
rustc_resolve Use Cow in {D,Subd}iagnosticMessage. 2023-05-29 09:23:43 +10:00
rustc_serialize Fix the FileEncoder buffer size. 2023-05-15 08:59:11 +10:00
rustc_session Auto merge of #103877 - oli-obk:const_eval_step_limit, r=fee1-dead 2023-06-01 05:32:00 +00:00
rustc_smir Remove DesugaringKind::Replace. 2023-05-25 17:40:46 +00:00
rustc_span Auto merge of #111567 - Urgau:uplift_cast_ref_to_mut, r=b-naber 2023-06-01 01:27:32 +00:00
rustc_symbol_mangling Rollup merge of #112182 - rcvalle:rust-cfi-fix-111185, r=compiler-errors 2023-06-02 18:12:45 +02:00
rustc_target Auto merge of #112198 - compiler-errors:rollup-o2xe4of, r=compiler-errors 2023-06-02 07:57:21 +00:00
rustc_trait_selection Rollup merge of #112318 - oli-obk:assoc_ty_sized_bound_for_object_safety, r=compiler-errors 2023-06-05 23:48:00 +02:00
rustc_traits Rename tcx.mk_re_* => Region::new_* 2023-05-29 17:54:53 +00:00
rustc_transmute Remove unused TypeFoldable/TypeVisitable impls. 2023-04-26 15:19:50 +10:00
rustc_ty_utils Separate AnonConst from ConstBlock in HIR. 2023-06-02 21:25:18 +00:00
rustc_type_ir better TyKind::Debug 2023-05-26 18:55:02 +01:00