First step towards being able to decode AVX512 PMOVZX instructions without a massive bloat in the shuffle decode switch statement.
This should also make it easier to decode X86ISD::VZEXT target shuffles in the future.
llvm-svn: 259995
This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it.
As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and its more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions.
From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level.
Differential Revision: http://reviews.llvm.org/D10146
llvm-svn: 241508
This patch adds shuffle mask decodes for integer zero extends (pmovzx** and movq xmm,xmm) and scalar float/double loads/moves (movss/movsd).
Also adds shuffle mask decodes for integer loads (movd/movq).
Differential Revision: http://reviews.llvm.org/D7228
llvm-svn: 227688
Patch to provide shuffle decodes and asm comments for the sse pslldq/psrldq SSE2/AVX2 byte shift instructions.
Differential Revision: http://reviews.llvm.org/D5598
llvm-svn: 219738
lowering.
This also implements the fancy blend lowering for v16i16 using AVX2 and
teaches the X86 backend to print shuffle masks for 256-bit PSHUFB
and PBLENDW instructions. It also makes the mask decoding correct for
PBLENDW instructions. The yaks, they are legion.
Tests are updated accordingly. There are some missing tests for the
VBLENDVB lowering, but I'll add those in a follow-up as this commit has
accumulated enough cruft already.
llvm-svn: 218430
add VPBLENDD to the InstPrinter's comment generation so we get nice
comments everywhere.
Now that we have the nice comments, I can see the bug introduced by
a silly typo in the commit that enabled VPBLENDD, and have fixed it. Yay
tests that are easy to inspect.
llvm-svn: 218335
undef in the shuffle mask. This shows up when we're printing comments
during lowering and we still have an IR-level constant hanging around
that models undef.
A nice consequence of this is *much* prettier test cases where the undef
lanes actually show up as undef rather than as a particular set of
values. This also allows us to print shuffle comments in cases that use
undef such as the recently added variable VPERMILPS lowering. Now those
test cases have nice shuffle comments attached with their details.
The shuffle lowering for PSHUFB has been augmented to use undef, and the
shuffle combining has been augmented to comprehend it.
llvm-svn: 218301
instructions when it finds an appropriate pattern.
These are lovely instructions, and its a shame to not use them. =] They
are fast, and can hand loads folded into their operands, etc.
I've also plumbed the comment shuffle decoding through the various
layers so that the test cases are printed nicely.
llvm-svn: 217758
an immediate operand when we don't have instruction-specific comments.
This ensures that instruction-specific comments are attached to the same
line as the instruction which is important for using them to write
readable and maintainable tests. My next commit will just such a test.
llvm-svn: 217099