clang-p2996/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
Changpeng Fang 70e78be7dc AMDGPU: Custom lower fptrunc vectors for f32 -> f16 (#141883)
The latest ASICs support the v_cvt_pk_f16_f32 instruction. However, the
current implementation of vector fptrunc lowering fully scalarizes the
vectors, and the scalar conversions may not always be combined into the
packed one.
We made v2f32 -> v2f16 legal in
https://github.com/llvm/llvm-project/pull/139956. This work is an
extension to handle wider vectors. Instead of fully scalarizing, we
split the vector into packs (v2f32 -> v2f16) to ensure the packed
conversion can always be generated.
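
The splitting described above can be pictured with a small standalone sketch. It is not the code in AMDGPULegalizerInfo.cpp and uses no LLVM APIs; `Chunk` and `splitIntoPacks` are hypothetical names introduced only for illustration. It assumes that a vector with an odd number of lanes leaves one trailing lane to be converted as a scalar, which the commit message does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one piece of a split f32 -> f16 truncation.
struct Chunk {
  std::size_t FirstLane; // index of the first source lane covered
  std::size_t NumLanes;  // 2 for a packed conversion, 1 for a scalar tail
};

// Split a NumLanes-wide truncation into 2-lane packs (v2f32 -> v2f16),
// rather than into NumLanes independent scalar conversions.
std::vector<Chunk> splitIntoPacks(std::size_t NumLanes) {
  assert(NumLanes > 0 && "expected a non-empty vector");
  std::vector<Chunk> Chunks;
  std::size_t Lane = 0;
  // Each full pair of lanes maps to one packed conversion.
  for (; Lane + 2 <= NumLanes; Lane += 2)
    Chunks.push_back({Lane, 2});
  // Assumption: an odd trailing lane is handled as a single scalar.
  if (Lane < NumLanes)
    Chunks.push_back({Lane, 1});
  return Chunks;
}
```

Under these assumptions, a v4f32 source yields two packs covering lanes 0-1 and 2-3, so each pack can map directly to one packed conversion instead of relying on a later combine of scalar conversions.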
2025-06-06 15:15:24 -07:00

274 KiB