Commit Graph

48680 Commits

Author SHA1 Message Date
Ilia Diachkov
b8e1544b9d [SPIRV] add SPIRVPrepareFunctions pass and update other passes
The patch adds SPIRVPrepareFunctions pass, which modifies function
signatures containing aggregate arguments and/or return values before
IR translation. Information about the original signatures is stored in
metadata. It is used during call lowering to restore correct SPIR-V types
of function arguments and return values. This pass also substitutes some
llvm intrinsic calls to function calls, generating the necessary functions
in the module, as the SPIRV translator does.

The patch also includes changes in other modules, fixing errors and
enabling many SPIR-V features that were omitted earlier. And 15 LIT tests
are also added to demonstrate the new functionality.

Differential Revision: https://reviews.llvm.org/D129730

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2022-07-22 04:00:48 +03:00
Teresa Johnson
1dad6247d2 [MemProf] Add memprof metadata related analysis utilities
Adds a number of utilities that are used to help create and update
memprof related metadata. These will be used during profile matching
and annotation, as well as by the inliner when updating the metadata.
Also adds unit tests for the utilities.

See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]
(Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceeding patch adding the basic
metadata and verification support.)

Depends on D128141.

Differential Revision: https://reviews.llvm.org/D128854
2022-07-21 13:46:01 -07:00
Sanjay Patel
78c09f0f24 [PatternMatch][InstCombine] match a vector with constant expression element(s) as a constant expression
The InstCombine test is reduced from issue #56601. Without the more
liberal match for ConstantExpr, we try to rearrange constants in
Negator forever.

Alternatively, we could adjust the definition of m_ImmConstant to be
more conservative, but that's probably a larger patch, and I don't
see any downside to changing m_ConstantExpr. We never capture and
modify a ConstantExpr; transforms just want to avoid it.

Differential Revision: https://reviews.llvm.org/D130286
2022-07-21 15:23:57 -04:00
Daniel Thornburgh
6605187103 [NFC] Fix compiler warning in MarkupFilter 2022-07-21 12:00:29 -07:00
Daniel Thornburgh
17e4c217b6 [Symbolizer] Implement contextual symbolizer markup elements.
This change implements the contextual symbolizer markup elements: reset,
module, and mmap. These provide information about the runtime context of
the binary necessary to resolve addresses to symbolic values.

Summary information is printed to the output about this context.
Multiple mmap elements for the same module line are coalesced together.
The standard requires that such elements occur on their own lines to
allow for this; accordingly, anything after a contextual element on a
line is silently discarded.

Implementing this cleanly requires that the filter drive the parser;
this allows skipped sections to avoid being parsed. This also makes the
filter quite a bit easier to use, at the cost of some unused
flexibility.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D129519
2022-07-21 11:29:19 -07:00
David Sherwood
f15b6b2907 [AArch64] Add target hook for preferPredicateOverEpilogue
This patch adds the AArch64 hook for preferPredicateOverEpilogue,
which currently returns true if SVE is enabled and one of the
following conditions (non-exhaustive) is met:

1. The "sve-tail-folding" option is set to "all", or
2. The "sve-tail-folding" option is set to "all+noreductions"
and the loop does not contain reductions,
3. The "sve-tail-folding" option is set to "all+norecurrences"
and the loop has no first-order recurrences.

Currently the default option is "disabled", but this will be
changed in a later patch.

I've added new tests to show the options behave as expected here:

  Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll

Differential Revision: https://reviews.llvm.org/D129560
2022-07-21 17:20:06 +01:00
Joseph Huber
bc33c2fa0c [Binary] Hard-code the alignment of the offloading binary
Summary:
We previously used `alignof` to get the necessary alignment of the
binary header. However this was different on 32-bit platforms and caused
a few tests to fail because of it. This patch just changes this to be a
hard-coded constant of 8.
2022-07-21 09:28:26 -04:00
Nikita Popov
1f69503107 [MemoryBuiltins] Add getReallocatedOperand() function (NFC)
Replace the value-accepting isReallocLikeFn() overload with a
getReallocatedOperand() function, which returns which operand is
the one being reallocated. Currently, this is always the first one,
but once allockind(realloc) is respected, the reallocated operand
will be determined by the allocptr parameter attribute.
2022-07-21 14:54:16 +02:00
Nikita Popov
46e6dd84b7 [MemoryBuiltins] Remove isFreeCall() function (NFC)
Remove isFreeCall() in favor of getFreedOperand(). Replace the
two remaining uses with a getFreedOperand() != nullptr check, as
they only care that something is getting freed. (The usage in DSE
is correct as such. The allocator-related checks in CFLGraph look
rather questionable in general.)
2022-07-21 14:44:23 +02:00
Alexey Lapshin
8bb4451a65 [Reland][DebugInfo][llvm-dwarfutil] Combine overlapped address ranges.
DWARF files may contain overlapping address ranges. f.e. it can happen if the two
copies of the function have identical instruction sequences and they end up sharing.
That looks incorrect from the point of view of DWARF spec. Current implementation
of DWARFLinker does not combine overlapped address ranges. It would be good if such
ranges would be handled in some useful way. Thus, this patch allows DWARFLinker
to combine overlapped ranges in a single one.

Depends on D86539

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D123469
2022-07-21 14:15:39 +03:00
Matt Devereau
e0fbd990c9 [AArch64][SVE] Add ISel pattern to lower DUPLANE128 to LD1RQD
Following on from https://reviews.llvm.org/D128902, lower DUPLANE128 to LD1RQD
for integer load types from instruction selection.

Differential Revision: https://reviews.llvm.org/D130010
2022-07-21 10:56:43 +00:00
Alexey Lapshin
3aad49082c Revert "[DebugInfo][llvm-dwarfutil] Combine overlapped address ranges."
This reverts commit d2a4d6bf9c.
2022-07-21 13:40:20 +03:00
Nikita Popov
c81dff3c30 [MemoryBuiltins] Add getFreedOperand() function (NFCI)
We currently assume in a number of places that free-like functions
free their first argument. This is true for all hardcoded free-like
functions, but with the new attribute-based design, the freed
argument is supposed to be indicated by the allocptr attribute.

To make sure we handle this correctly once allockind(free) is
respected, add a getFreedOperand() helper which returns the freed
argument, rather than just indicating whether the call frees *some*
argument.

This migrates most but not all users of isFreeCall() to the new
API. The remaining users are a bit more tricky.
2022-07-21 12:39:35 +02:00
Alexey Lapshin
d2a4d6bf9c [DebugInfo][llvm-dwarfutil] Combine overlapped address ranges.
DWARF files may contain overlapping address ranges. f.e. it can happen if the two
copies of the function have identical instruction sequences and they end up sharing.
That looks incorrect from the point of view of DWARF spec. Current implementation
of DWARFLinker does not combine overlapped address ranges. It would be good if such
ranges would be handled in some useful way. Thus, this patch allows DWARFLinker
to combine overlapped ranges in a single one.

Depends on D86539

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D123469
2022-07-21 13:15:18 +03:00
Nikita Popov
d144ae6e1b [MemoryBuiltins] Default to trivial mapper in getAllocSize() (NFC)
Default getAllocSize() to use the trivial mapper. Also switch
from using std::function to function_ref.

Furthermore, update the doc comment to point out a subtle difference
between getAllocSize() and getObjectSize(): The latter may also
return something for calls that return their argument (via "returned"
attribute or special intrinsics like invariant groups).
2022-07-21 11:43:48 +02:00
Nikita Popov
f45ab43332 [MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc
Alloc directly checking whether a given call is a removable
allocation, instead of first checking whether it is an allocation
first.
2022-07-21 09:39:19 +02:00
Congzhe Cao
05ccde8023 [LoopCacheAnalysis] Fix a type mismatch problem in cost calculation
There is a problem in loop cache analysis that the types of SCEV variables
`Coeff` and `ElemSize` in function `isConsecutive()` may not match. The
mismatch would cause SCEV failures when `Coeff` is multiplied with `ElemSize`.

The fix in this patch is to extend the type of both `Coeff` and `ElemSize` to
whichever is wider in those two variables. As a clean-up, duplicate calculations
of `Stride` in `computeRefCost()` is then removed.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D128877
2022-07-21 01:57:05 -04:00
Anubhab Ghosh
4fcf8434dd [ORC] Add a new MemoryMapper-based JITLinkMemoryManager implementation.
MapperJITLinkMemoryManager supports executor memory management using any
implementation of MemoryMapper to do the transfer such as InProcessMapper or
SharedMemoryMapper.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D129495
2022-07-20 17:52:37 -07:00
Teresa Johnson
0174f5553e [MemProf] Basic metadata support and verification
Add basic support for the MemProf metadata (!memprof and !callsite)
which was initially described in "RFC: IR metadata format for MemProf"
(https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165).

The bulk of the patch is verification support, along with some tests.

There are a couple of changes to the format described in the original
RFC:

Initial measurements suggested that a tree format for the stack ids in
the contexts would be more efficient, but subsequent evaluation with
large applications showed that in fact the cost of the additional
metadata nodes required by this deduplication scheme overwhelmed the
benefit from sharing stack id nodes. Therefore, the implementation here
and in follow on patches utilizes a simpler scheme of lists of stack id
integers in the memprof profile contexts and callsite metadata. The
follow on matching patch employs context trimming optimizations to
reduce the cost.

Secondly, instead of verbosely listing all profiled fields in each
profiled context (memory info block or MIB), and deferring the
interpretation of the profile data, the profile data is evaluated and
converted into string tags specifying the behavior (e.g. "cold") during
profile matching. This reduces the verbosity of the profile metadata,
and allows additional context trimming optimizations. As a result, the
named metadata schema description is also no longer needed.

Differential Revision: https://reviews.llvm.org/D128141
2022-07-20 15:30:55 -07:00
Schrodinger ZHU Yifan
304027206c [ThinLTO] Support aliased GlobalIFunc
Fixes https://github.com/llvm/llvm-project/issues/56290: when an ifunc is
aliased in LTO, clang will attempt to create an alias summary; however, as ifunc
is not included in the module summary, doing so will lead to crash.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D129009
2022-07-20 15:30:38 -07:00
Kazu Hirata
94e03abf91 [IPO] Restore a call to has_value (NFC)
This patch restores a call to has_value to make it clear that we are
checking the presence of an optional value, not the underlying value.

This patch partially reverts d08f34b592.

Differential Revision: https://reviews.llvm.org/D129453
2022-07-20 09:40:18 -07:00
Roman Rusyaev
394a388d14 [TableGen] Add a location for a class definition that was forward-declared
This change improves ctags generation for tablegen files.

For the following example
```
class A;

class A {
  int a;
}
```
Previously, tags were generated only for a forward declaration of class 'A'.

This patch allows generating tags for the forward declarations
and further definition of class 'A'.

Reviewed By: barannikov88

Original patch by: rusyaev-roman (Roman Rusyaev)
Some adjustments by: nhaehnle (Nicolai Hähnle)

Differential Revision: https://reviews.llvm.org/D129935
2022-07-20 15:56:17 +02:00
esmeyi
b1847ff068 [XCOFF] write the aux header when the visibility is specified in XCOFF32.
The n_type field in the symbol table entry has two interpretations in XCOFF32, and a single interpretation in XCOFF64.
The new interpretation is used in XCOFF32 if the value of the o_vstamp field in the auxiliary header is 2.
In XCOFF64 and the new XCOFF32 interpretation, the n_type field is used for the symbol type and visibility.
The patch writes the aux header with an o_vstamp field value of 2 when the visibility is specified in XCOFF32 to make the new XCOFF32 interpretation used.

Reviewed By: DiggerLin, jhenderson

Differential Revision: https://reviews.llvm.org/D128148
2022-07-20 07:09:34 -04:00
Chuanqi Xu
645d2dd3a9 Revert "Don't treat readnone call in presplit coroutine as not access memory"
This reverts commit 57224ff4a6. This
commit may trigger crashes on some workloads. Revert it for clearness.
2022-07-20 17:00:58 +08:00
Fangrui Song
e931c2e870 [LegacyPM] Remove InstrOrderFileLegacyPass
Following recent changes removing non-core features of the legacy
PM/optimization pipeline.
2022-07-19 23:58:51 -07:00
Chuanqi Xu
57224ff4a6 Don't treat readnone call in presplit coroutine as not access memory
To solve the readnone problems in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for details.

According to the discussion, we decide to fix the problem by inserting
isPresplitCoroutine() checks in different passes instead of
wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes.
In this direction, we might not be able to cover every case at first.
Let's take a "find and fix" strategy.

Reviewed By: nikic, nhaehnle, jyknight

Differential Revision: https://reviews.llvm.org/D127383
2022-07-20 10:37:23 +08:00
Jez Ng
2e2737cdf9 [MC][MachO] Change addrsig format + ensure its size is properly set
There were two problems with the previous setup:

1. We weren't setting its size, which caused problems when `__llvm_addrsig`
   wasn't the last section. In particular, `__debug_line` (if created) is
   generated and placed after `__llvm_addrsig`, and would result in an
   invalid object file w/ overlapping sections being emitted.

2. The symbol indices could be invalidated if e.g. `llvm-strip` ran on
   the object file. See discussion [here][1].

To fix both these issues, we use symbol relocations instead of encoding
symbol indices directly in the section contents. The section itself
doesn't contain any data. That sidesteps the layout problem in addition
to solving the second issue.

The corresponding LLD change to read in this new format: {D128938}.
It will fix the icf-safe.ll test failure on this diff.

[1]: https://discourse.llvm.org/t/problems-with-mach-o-address-significance-table-generation/63392/

Reviewed By: #lld-macho, alx32

Differential Revision: https://reviews.llvm.org/D127637
2022-07-19 21:22:23 -04:00
Lang Hames
94e6d2677b [ORC] Fix serialization / deserialization of default-constructed StringRef.
Avoids accessing the data field on zero-length strings. This is the StringRef
counterpart to the ArrayRef<char> fix in 67220c2ad7.

rdar://97285294
2022-07-19 17:22:21 -07:00
Anubhab Ghosh
1b1f1c7786 Re-re-apply 5acd471698, Add a shared-memory based orc::MemoryMapper...
...with more fixes.

The original patch was reverted in 3e9cc543f2 due to bot failures caused by
a missing dependence on librt. That issue was fixed in 32d8d23cd0, but that
commit also broke sanitizer bots due to a bug in SimplePackedSerialization:
empty ArrayRef<char>s triggered a zero-byte memcpy from a null source. The
ArrayRef<char> serialization issue was fixed in 67220c2ad7, and this patch has
also been updated with a new custom SharedMemorySegFinalizeRequest message that
should avoid serializing empty ArrayRefs in the first place.

https://reviews.llvm.org/D128544
2022-07-19 15:35:33 -07:00
Johannes Doerfert
bf789b1957 [Attributor] Replace AAValueSimplify with AAPotentialValues
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
  locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
  only as an afterthought. `genericValueTraversal` did offer an option
  but `AAValueSimplify` did not. Thus, we might end up with "too much"
  simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
  problems like the infinite recursion bug (#54981) as well as code
  duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.

Fixes: https://github.com/llvm/llvm-project/issues/54981

Note: A previous version was flawed and consequently reverted in
      6555558a80.
2022-07-19 16:24:42 -05:00
Yusra Syeda
6fb27bc2e3 [SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention
Differential Revision: https://reviews.llvm.org/D127328
2022-07-19 13:55:25 -04:00
Cole Kissane
e939bf67e3 [llvm] add zstd to llvm::compression namespace
- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`
- debian users should install libzstd when using `LLVM_ENABLE_ZSTD=FORCE_ON` from source due to this bug https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/1941956

Reviewed By: leonardchan, MaskRay

Differential Revision: https://reviews.llvm.org/D128465
2022-07-19 10:54:36 -07:00
Jon Chesterfield
3a20597776 [amdgpu] Implement lds kernel id intrinsic
Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments accessed by intrinsic already
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060
2022-07-19 17:46:19 +01:00
Alexey Lapshin
4539b44148 [Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal):

```
./llvm-dwarfutil [options] <input file> <output file>

  --garbage-collection    Do garbage collection for debug info(default)
  -j <value>              Alias for --num-threads
  --no-garbage-collection Don`t do garbage collection for debug info
  --no-odr-deduplication  Don`t do ODR deduplication for debug types
  --no-odr                Alias for --no-odr-deduplication
  --no-separate-debug-file
                          Create single output file, containing debug tables(default)
  --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
  --odr-deduplication     Do ODR deduplication for debug types(default)
  --odr                   Alias for --odr-deduplication
  --separate-debug-file   Create two output files: file w/o debug tables and file with debug tables
  --tombstone [bfd,maxpc,exec,universal]
                          Tombstone value used as a marker of invalid address(default: universal)
    =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
    =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
    =exec - Match with address ranges of executable sections
    =universal - Both: bfd and maxpc
```

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D86539
2022-07-19 15:11:36 +03:00
Simon Pilgrim
0f6b0461b0 [DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits
The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits.

This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115

Alive2: https://alive2.llvm.org/ce/z/fl7T7K

Differential Revision: https://reviews.llvm.org/D129933
2022-07-19 10:59:07 +01:00
David Spickett
5d14873249 [llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping
Fixes https://github.com/llvm/llvm-project/issues/56484

H registers are 16 bit views of AArch64's Neon registers and
B are the 8 bit views.

msvc does not support 16 bit float (some mention in DirectX but I
couldn't find a way to get to it) so for lack of a better reference
I'm using:
85c9b41b33/server/references/dia/include/cvconst.h
(the other microsoft-pdb repo is no longer up to date)

Luckily clang does support fp16 so a test is added for that.

There is no 8 bit float type so I had to get creative with the
test case. We're not testing for correct debug info here just
that we can select the B register and not crash in the process.

For FPCR it's never going to be passed as an argument so I've
not added a test for it. It is included to keep our list looking
the same as the reference.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D129774
2022-07-19 09:33:13 +00:00
Alexey Lapshin
e717f91c96 Revert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF."
This reverts commit e2147c26bd.
2022-07-19 12:17:47 +03:00
Alexey Lapshin
e2147c26bd [Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal):

```
./llvm-dwarfutil [options] <input file> <output file>

  --garbage-collection    Do garbage collection for debug info(default)
  -j <value>              Alias for --num-threads
  --no-garbage-collection Don`t do garbage collection for debug info
  --no-odr-deduplication  Don`t do ODR deduplication for debug types
  --no-odr                Alias for --no-odr-deduplication
  --no-separate-debug-file
                          Create single output file, containing debug tables(default)
  --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
  --odr-deduplication     Do ODR deduplication for debug types(default)
  --odr                   Alias for --odr-deduplication
  --separate-debug-file   Create two output files: file w/o debug tables and file with debug tables
  --tombstone [bfd,maxpc,exec,universal]
                          Tombstone value used as a marker of invalid address(default: universal)
    =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
    =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
    =exec - Match with address ranges of executable sections
    =universal - Both: bfd and maxpc
```

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D86539
2022-07-19 11:18:36 +03:00
serge-sans-paille
a2ac383b44 [llvm] Fix forward declaration in Support/JSON.h
Some methods of json::Array require json::Value to be completely defined, so
they can't be defined in-class. Fix that by defining them out of class.

Fix #55780
2022-07-19 09:07:29 +02:00
Max Kazantsev
51f837a680 [NFC] Introduce API to detect tokens penetrating LCSSA form
Following discussion in PR56243, we need to somehow detect the situation
when token values penetrate LCSSA form for transforms that require that
it is maintained by all values (for example, to sustain use-def dominance
invarians). This patch introduces a parameter to LCSSA checkers to control
their ignorance about tokens.

Differential Revision: https://reviews.llvm.org/D129983
Reviewed By: efriedma
2022-07-19 13:52:30 +07:00
Shraiysh Vaishay
35fc666877 [OpenMP][IRBuilder] Add support for taskgroup
This patch adds support for generating taskgroup construct.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D128203
2022-07-19 10:49:34 +05:30
Lang Hames
67220c2ad7 [ORC] Fix serialization / deserialization of default-constructed ArrayRef<char>.
Avoids a zero-length memcpy from a null src, which caused errors on some of the
sanitizer bots. Also uses null when deserializing an empty ArrayRef (rather
than pointing to a zero length range in the middle of the input buffer).
2022-07-18 20:39:01 -07:00
Matt Arsenault
8d0383eb69 CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
2022-07-18 17:23:41 -04:00
Stanislav Mekhanoshin
523a99c0eb [AMDGPU] Support for gfx940 fp8 smfmac
Differential Revision: https://reviews.llvm.org/D129908
2022-07-18 12:12:41 -07:00
Stanislav Mekhanoshin
2695f0a688 [AMDGPU] Support for gfx940 fp8 mfma
Differential Revision: https://reviews.llvm.org/D129906
2022-07-18 11:49:56 -07:00
Stanislav Mekhanoshin
9fa5a6b7e8 [AMDGPU] Support for gfx940 fp8 conversions
Differential Revision: https://reviews.llvm.org/D129902
2022-07-18 11:48:43 -07:00
zhijian
a6316d6da5 [AIX] support read global symbol of big archive
Reviewers: James Henderson, Fangrui Song

Differential Revision: https://reviews.llvm.org/D124865
2022-07-18 10:43:30 -04:00
Simon Pilgrim
c2ab5c5514 [DAG] Fix typo in isDesirableToCommuteWithShift description. NFC. 2022-07-18 13:11:23 +01:00
Max Kazantsev
d693fd29f1 [Verifier] Make Verifier recognize undef tokens as correct IR
Undef tokens may appear in unreached code as result of RAUW of some optimization,
and it should not be considered as bad IR.

Patch by Dmitry Bakunevich!

Differential Revision: https://reviews.llvm.org/D128904
Reviewed By: mkazantsev
2022-07-18 16:26:06 +07:00
Nikita Popov
11079e8820 [IR] Don't treat callbr as indirect terminator
Callbr is no longer an indirect terminator in the sense that is
relevant here (that it's successors cannot be updated). The primary
effect of this change is that callbr no longer prevents formation
of loop simplify form.

I decided to drop the isIndirectTerminator() method entirely and
replace it with isa<IndirectBrInst>() checks. I assume this method
was added to abstract over indirectbr and callbr, but it never
really caught on, and there is nothing left to abstract anymore
at this point.

Differential Revision: https://reviews.llvm.org/D129849
2022-07-18 09:32:08 +02:00