clang-p2996/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
Konrad Kusiak 4fa8a5487e [AMDGPU] Add sanity check that fixes bad shift operation in AMD backend
The SILoadStoreOptimizer::dmasksCanBeCombined() function has a bug
that can lead to undefined behavior (UB).

This boolean function decides whether two masks can be combined into
one. The idea is that the bits that are "on" in one mask must not
overlap with the "on" bits of the other. Consider an example (10 bits
for simplicity):

Mask 1: 0101101000
Mask 2: 0000000110

Those can be combined into a single mask: 0101101110.

To check whether such an operation is possible, the code takes the
larger of the two masks and counts its trailing zeros, starting from
the LSB and stopping at the first 1. It then shifts 1u left by this
count and compares the result with the smaller mask. The problem is
that when both masks are 0, the counter finds 32 zeros in the larger
mask and the code attempts a shift by 32 positions, which is UB for a
32-bit value.

The fix is a simple sanity check that tests whether the larger mask is
0 before the shift is performed.
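
The check and the guard described above can be sketched as a
stand-alone function. This is an illustration only, not the code from
SILoadStoreOptimizer.cpp: the names masksCanBeCombined and
countTrailingZeros32 are hypothetical, the masks are assumed to be
plain 32-bit integers, and rejecting the pair when the larger mask is 0
is an assumption about what the sanity check does on failure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of 0 bits below the lowest set bit.
// Returns 32 for an input of 0, which is what makes the unguarded
// shift below dangerous.
static unsigned countTrailingZeros32(uint32_t V) {
  if (V == 0)
    return 32;
  unsigned N = 0;
  while ((V & 1u) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Hypothetical stand-in for the mask check described in the text.
static bool masksCanBeCombined(uint32_t A, uint32_t B) {
  uint32_t MaxMask = std::max(A, B);
  uint32_t MinMask = std::min(A, B);

  // Sanity check: with MaxMask == 0 the count below would be 32, and
  // shifting a 32-bit value by 32 is undefined behavior.
  // Returning false here is an assumption made for this sketch.
  if (MaxMask == 0)
    return false;

  // MaxMask is nonzero, so the count is at most 31 and the shift is
  // well defined.
  unsigned AllowedBitsForMin = countTrailingZeros32(MaxMask);

  // Every set bit of the smaller mask must lie below the lowest set
  // bit of the larger mask.
  return MinMask < (1u << AllowedBitsForMin);
}

// Usage: the 10-bit example from the text (0101101000 and 0000000110).
static void checkExample() {
  assert(masksCanBeCombined(0x168u, 0x006u));
  assert(!masksCanBeCombined(0u, 0u));
}
```

In this sketch the guard comes before the trailing-zero count, so the
shift amount can never reach the width of the type.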

https://reviews.llvm.org/D155051
2023-08-11 15:26:35 -04:00

85 KiB