clang-p2996/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp at 6a21dfaac66ffa39dc7faaec1cd7932099c052d4


Changpeng Fang 70e78be7dc AMDGPU: Custom lower fptrunc vectors for f32 -> f16 (#141883)

The latest ASICs support the v_cvt_pk_f16_f32 instruction. However, the
current implementation of vector fptrunc lowering fully scalarizes the
vectors, and the scalar conversions may not always be combined to
generate the packed one.
We made v2f32 -> v2f16 legal in
https://github.com/llvm/llvm-project/pull/139956. This work is an
extension to handle wider vectors. Instead of fully scalarizing, we
split the vector into packs (v2f32 -> v2f16) to ensure the packed
conversion can always be generated.
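
The splitting described above can be sketched, outside of LLVM, as a plain partition of element indices into two-element packs. This is only an illustrative model, not the code in AMDGPULegalizerInfo.cpp: the names `Piece` and `splitIntoPacks` are hypothetical, and the handling of an odd trailing element as a single scalar piece is an assumption that the commit message does not confirm.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration only; not part of AMDGPULegalizerInfo.cpp.
// Describes one piece of a wide f32 vector that is converted as a unit.
struct Piece {
  std::size_t Start; // index of the first source element in this piece
  std::size_t Width; // 2 = packed (v2f32 -> v2f16), 1 = scalar leftover
};

// Partition NumElts elements into two-element packs. An odd trailing
// element becomes a width-1 piece (assumption, see the lead-in).
std::vector<Piece> splitIntoPacks(std::size_t NumElts) {
  std::vector<Piece> Pieces;
  std::size_t I = 0;
  for (; I + 1 < NumElts; I += 2)
    Pieces.push_back({I, 2});
  if (I < NumElts)
    Pieces.push_back({I, 1});
  return Pieces;
}
```

Under this model, a four-element vector yields two width-2 pieces, each of which would map to one packed conversion, rather than four independent scalar conversions that a later combine may or may not pair up.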

2025-06-06 15:15:24 -07:00

274 KiB
