clang-p2996/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.cpp at 8fcb1263f42657ecbc355beff12500dfbcddee17

Files

Matt Arsenault 01c9a14ccf AMDGPU: Define v_mfma_f32_{16x16x128|32x32x64}_f8f6f4 instructions (#116723 )

These use a new VOP3PX encoding for the v_mfma_scale_* instructions,
which bundles the pre-scale v_mfma_ld_scale_b32. None of the modifiers
are supported yet (op_sel, neg or clamp).

I'm not sure the intrinsic should really expose op_sel (or any of the
others). If I'm reading the documentation correctly, we should be able
to just have the raw scale operands and auto-match op_sel to byte
extract patterns.

The op_sel syntax also seems extra horrible in this usage, especially with the
usual assumed op_sel_hi=-1 behavior.

2024-11-21 08:51:58 -08:00

2.4 KiB

Raw Blame History

View Raw

2.4 KiB Raw Blame History

2.4 KiB

Raw Blame History