Commit Graph

516435 Commits

Author SHA1 Message Date
Joseph Huber
a78861fc55 [NvlinkWrapper] Add support for --undefined (#113934)
Summary:
This flag is pretty canonical in ELF linkers; it allows us to force the
link job to extract a library if it defines a specific symbol. This is
mostly useful for letting us forcibly extract things that don't fit the
normal model (i.e. kernels) from static libraries.
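
As a rough illustration of the extraction rule described above, here is a toy Python model of how a linker might decide which static-library members to pull in when a symbol is forced undefined. The `Member` type and `extract_members` function are hypothetical names for this sketch and are not part of the wrapper; real linkers also deal with weak symbols, archive ordering, and more.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical archive member: the symbols it defines and the ones it needs."""
    name: str
    defines: set = field(default_factory=set)
    needs: set = field(default_factory=set)


def extract_members(archive, undefined):
    """Return the names of members extracted to satisfy undefined symbols."""
    pending = set(undefined)  # symbols still looking for a definition
    defined = set()
    extracted = []
    changed = True
    while changed:
        changed = False
        for member in archive:
            if member.name in extracted:
                continue
            # A member is only extracted if it defines something we still need.
            if member.defines & pending:
                extracted.append(member.name)
                defined |= member.defines
                pending |= member.needs
                pending -= defined
                changed = True
    return extracted
```

With no undefined symbols nothing is extracted; passing a kernel's name as undefined (the role `--undefined` plays here) forces the member that defines it to be pulled in.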
2024-10-29 15:34:28 -05:00
Zequan Wu
8193832fb9 [lldb] Search main function with lldb::eFunctionNameTypeFull when getting default file and line. (#113980)
This is to work around the fact that
`SymbolFileNativePDB::FindFunctions` only supports
`lldb::eFunctionNameTypeFull` and `lldb::eFunctionNameTypeMethod` now.
Since `main`'s full name is the same as base name (`main`), it's okay to
search with `lldb::eFunctionNameTypeFull` when trying to get the default
file and line. With this, `lldb/test/Shell/Driver/TestSingleQuote.test`
passes on Windows with NativePDB plugin.
2024-10-29 16:23:33 -04:00
Thurston Dang
70af40ba74 [hwasan] Fix forward '[hwasan] Flush stderr/stdout in tests (#114083)'
3754fc1e9a broke the build because subsequent checks depend on the line numbers

https://lab.llvm.org/buildbot/#/builders/174/builds/7534/steps/6/logs/FAIL__HWAddressSanitizer-x86_64__use-after-free_c
2024-10-29 20:14:14 +00:00
Joseph Huber
ccd73eeab3 [LinkerWrapper] Remove in-house handling of LTO (#113715)
Summary:
This should be the linker's job. If the user creates any bitcode files,
then passing `-flto` to the linker for the toolchain should be able to
handle it. Right now this path is only used when someone does LTO with
ld.gold targeting a CPU, so I think we are safe here as that will still
be forwarded; for bfd it'll be an error, as it would be on the host. I
think I talked the SYCL team out of using this as well, so I should be
good to delete it.
2024-10-29 13:06:55 -07:00
Steven Wu
ba8d9ce8d4 [ADT] Fix unused variable from #69528 (#114114)
Remove unused variable to fix build failures from bot.
2024-10-29 13:00:59 -07:00
David Majnemer
5c12434906 [X86] Emit comments explaining the immediate in vfpclass
This makes the assembly a lot more readable at a glance.

As an example:
```
  vfpclasspd $4, %zmm0, %k0 # k0 = isNegativeZero(zmm0)
```
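
To show how such an immediate can be read, here is a small Python sketch that decodes the VFPCLASS immediate into the classes it tests. The bit layout follows the Intel SDM description of the instruction; the class labels and the `decode_vfpclass_imm` helper are names chosen for this sketch, not the exact text LLVM prints.

```python
# Bit i of the immediate selects the i-th class (Intel SDM, VFPCLASSPD).
# The label strings are assumptions made for this sketch.
VFPCLASS_BITS = [
    "QuietNaN",          # bit 0
    "PositiveZero",      # bit 1
    "NegativeZero",      # bit 2
    "PositiveInfinity",  # bit 3
    "NegativeInfinity",  # bit 4
    "Subnormal",         # bit 5
    "NegativeFinite",    # bit 6
    "SignalingNaN",      # bit 7
]


def decode_vfpclass_imm(imm):
    """Return the list of classes selected by an 8-bit VFPCLASS immediate."""
    return [name for bit, name in enumerate(VFPCLASS_BITS) if imm & (1 << bit)]
```

For the example above, an immediate of 4 sets only bit 2, which is the negative-zero test.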
2024-10-29 19:54:34 +00:00
z1nke
27ef549af2 [clang-tidy] Fix crash in modernize-use-designated-initializers check (#113688)
Fix #113652.

When calling `Node.isAggregate()` and `Node.isPOD()`, if `Node` is declared but
not defined, it will result in null pointer dereference (and if assertions are
enabled, it will cause an assertion failure).
2024-10-29 15:48:39 -04:00
Maryam Moghadas
8a0cb9ac86 [PowerPC] Add custom lowering for ssubo (#111748)
This patch improves the codegen for the ssubo node for i32 in 64-bit
mode by custom lowering.
2024-10-29 15:43:05 -04:00
Thurston Dang
e205929399 [asan] Flush stderr in test (#114084)
This is the ASan equivalent of
https://github.com/llvm/llvm-project/pull/114083.

The x86_64_lam_qemu buildbots started failing
(https://lab.llvm.org/buildbot/#/builders/139/builds/5462/steps/2/logs/stdio).
Based on the logs, it appears the ASan check is correct but it did not
match the stderr/stdout output. This patch attempts to fix the issue by
flushing stderr as appropriate.
2024-10-29 12:40:54 -07:00
Adam Yang
3a1228a543 [SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic (#111888)
partially fixes #70103

### Changes
* Added int_spv_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsSPIRV.td
* Added lowering for int_spv_group_memory_barrier_with_group_sync in
SPIRVInstructionSelector.cpp
* Added SPIRV backend test case

### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111884](https://github.com/llvm/llvm-project/pull/111884)
2024-10-29 12:40:01 -07:00
Thurston Dang
3754fc1e9a [hwasan] Flush stderr/stdout in tests (#114083)
The x86_64_lam_qemu buildbots started failing
(https://lab.llvm.org/buildbot/#/builders/139/builds/5462/steps/2/logs/stdio).
Based on the logs, it appears the HWASan check is correct but it did not
match the stderr/stdout output. This patch attempts to fix the issue by
flushing stderr/stdout as appropriate.
2024-10-29 12:38:56 -07:00
Rahul Joshi
a18af41c20 [LLVM] Change error messages to start with lower case (#113748)
Change LLVM Asm and TableGen Lexer/Parser error messages to begin with
lower case.
2024-10-29 12:26:33 -07:00
Ellis Hoag
9cc5a4bf66 Remove llvm::shouldOptForSize() from Utils.h (#112630)
Remove `llvm::shouldOptForSize()` from `Utils.h` since we can use
`llvm::shouldOptimizeForSize()` from `SizeOpts.h` instead.

Depends on https://github.com/llvm/llvm-project/pull/112626
2024-10-29 14:23:47 -05:00
Kazu Hirata
c79827cd15 [SandboxIR] Fix a warning
This patch fixes:

  llvm/lib/SandboxIR/Context.cpp:684:22: error: unused variable
  'MaxRegisteredCallbacks' [-Werror,-Wunused-const-variable]
2024-10-29 12:05:18 -07:00
Renaud Kauffmann
b9978f8c77 [flang][cuda] Adding variable registration in constructor (#113976)
1) Adding variable registration in constructor
2) Applying feedback from PR
https://github.com/llvm/llvm-project/pull/112989
2024-10-29 11:48:48 -07:00
Michael Buch
b4e1af0096 [lldb-dap] Always pass disableASLR to the DAP executable (#113891)
More context can be found in
https://github.com/llvm/llvm-project/pull/110303

For DAP tests running in constrained environments (e.g., Docker
containers), disabling ASLR isn't allowed. So we set `disableASLR=False`
(since https://github.com/llvm/llvm-project/pull/113593).

However, the `dap_server.py` will currently only forward the value
of `disableASLR` to the DAP executable if it's set to `True`. If the
DAP executable wasn't provided a `disableASLR` field it defaults to
`true` too:
f147437945/lldb/tools/lldb-dap/lldb-dap.cpp (L2103-L2104)

This means that passing `disableASLR=False` from the tests is currently
not possible.

This is also true for many of the other boolean arguments of
`request_launch`. But this patch only addresses `disableASLR` for now
since it's blocking a libc++ patch.
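
The forwarding problem described above can be sketched in a few lines of Python. The function names below are hypothetical and only model the behavior in the message; they are not the actual `dap_server.py` code.

```python
def build_launch_args_old(program, disableASLR=True):
    """Model of the old behavior: the flag is only forwarded when truthy."""
    args = {"program": program}
    if disableASLR:
        args["disableASLR"] = disableASLR
    return args


def build_launch_args_new(program, disableASLR=True):
    """Model of the patched behavior: the flag is always forwarded."""
    args = {"program": program}
    args["disableASLR"] = disableASLR
    return args


def server_effective_value(args):
    """Model of the receiving side: a missing field defaults to True."""
    return args.get("disableASLR", True)
```

In this model, passing `disableASLR=False` through the old builder drops the field, so the receiving side falls back to `True`; the new builder keeps the explicit `False`.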
2024-10-29 18:40:06 +00:00
Wanyi
efc6d33be9 [lldb] Fix write only file action to truncate the file (#112657)
When `FileAction` opens a file with write access, it neither clears the
file nor appends to the end of it if the file already exists. Instead, it
writes from cursor index 0.

For example, by using the settings `target.output-path` and
`target.error-path`, lldb will redirect process stdout/stderr to files.
It then calls this function to write to the files, which is where the
above symptoms appear.
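
The symptom can be reproduced outside lldb with plain POSIX-style open flags. The sketch below uses Python's `os.open`; the helper names are made up for illustration and this is not lldb's code.

```python
import os
import tempfile


def write_with_flags(path, data, flags):
    """Open path with the given flags, write data at the start, and close."""
    fd = os.open(path, flags, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo():
    """Return the file contents after a write without and with O_TRUNC."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.txt")
        with open(path, "wb") as f:
            f.write(b"old contents here")

        # Write-only without truncation: bytes are overwritten from offset 0
        # and the stale tail of the old file is left behind.
        write_with_flags(path, b"new", os.O_WRONLY | os.O_CREAT)
        with open(path, "rb") as f:
            without_trunc = f.read()

        # Adding O_TRUNC clears the file before writing.
        write_with_flags(path, b"new", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
        with open(path, "rb") as f:
            with_trunc = f.read()
    return without_trunc, with_trunc
```

Without `O_TRUNC` the old tail survives the write; with it, only the new data remains.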

## Test
- Added unit test checking the file flags
- Added 2 API tests checking
  - File content is overwritten if the file path already exists
  - Stdout and stderr redirection to the same file doesn't change its
    behavior
2024-10-29 14:22:51 -04:00
Kelvin Li
8e14c6c172 [flang] Support -mabi=vec-extabi and -mabi=vec-default on AIX (#113215)
These options enable the AIX extended and default vector ABIs.
2024-10-29 14:20:11 -04:00
Lang Hames
9e37cbb469 [ORC] Add some missing FIXMEs, move a temporary Error into an if condition. 2024-10-29 11:12:48 -07:00
Lang Hames
f22c9ddb36 [ORC] Single-symbol convenience method does not need to be virtual.
This convenience method just calls the general case which is already virtual.
2024-10-29 11:12:48 -07:00
Jerry Sun
cdacc9b5c7 [TableGen] [NFC] Refine TableGen code to comply with clang-tidy checks (#113318)
Code cleanups for TableGen files; changes include function names,
variable names, and unused imports.

---------

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2024-10-29 11:10:54 -07:00
Louis Dionne
6563ed3162 [libc++][NFC] Remove trailing whitespace in the modulemap 2024-10-29 14:10:34 -04:00
LLVM GN Syncbot
b0dd368d57 [gn build] Port b510cdb895 2024-10-29 18:01:23 +00:00
Kazu Hirata
6f66530fd1 [mlir] Fix a warning
This patch fixes:

  mlir/lib/Pass/PassRegistry.cpp:425:37: error: ISO C++ requires the
  name after '::~' to be found in the same scope as the name before
  '::~' [-Werror,-Wdtor-name]
2024-10-29 10:55:34 -07:00
Min-Yih Hsu
ba65710908 [RISCV] Avoid redundant SchedRead on _TIED VPseudos (#113940)
_TIED and _MASK_TIED pseudos have one less operand compared to other
pseudos, thus we shouldn't attach the same number of SchedRead for these
instructions.

I don't think we have a way to (explicitly) check scheduling classes. So
I only test this patch with existing tests.
2024-10-29 10:49:35 -07:00
Brox Chen
528e975ac4 [AMDGPU][test]added unique and sort options for update_mc_test_check script (#111769)
Add a unique and a sort option to the update_mc_test_check script.

These MC asm/dasm files usually have a large number of lines, and these
lines are mostly similar to each other. These options can be useful when
a maintainer is merging or resolving conflicts, by making the files
identical.

Also fixed a small issue in asm/dasm with the comment marker of the
auto-generated header line:
1. asm using ";" instead of "//" as comment marker
2. dasm using ";" instead of "#" as comment marker
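
As a rough picture of what unique and sort options do to a list of near-identical test lines, here is a minimal Python sketch. The `normalize_test_lines` function and its parameters are hypothetical; the real script works on test cases and their check lines rather than on bare strings.

```python
def normalize_test_lines(lines, unique=False, sort=False):
    """Optionally drop duplicate lines and/or sort them."""
    result = list(lines)
    if unique:
        # dict.fromkeys keeps the first occurrence of each line, in order.
        result = list(dict.fromkeys(result))
    if sort:
        result = sorted(result)
    return result
```

Applying both options to two copies of a file that differ only in ordering and duplicates yields the same output, which is what makes conflicts easier to resolve.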
2024-10-29 13:48:43 -04:00
Aaron Ballman
449523fa0f Nominate Vassil Vassilev for Modules and Plugins (#114058)
Vassil has significant experience helping users with the plugin
interface in Clang, especially around the new efforts to bring plugin
support to Windows. He also is knowledgeable about modules support.
2024-10-29 13:38:54 -04:00
Krystian Stasiowski
639a7ac648 [Clang][AST] Store injected template arguments in TemplateParameterList (#113579)
Currently, we store injected template arguments in
`RedeclarableTemplateDecl::CommonBase`. This approach has a couple
problems:
1. We can only access the injected template arguments of
`RedeclarableTemplateDecl` derived types, but other `Decl` kinds still
make use of the injected arguments (e.g.
`ClassTemplatePartialSpecializationDecl`,
`VarTemplatePartialSpecializationDecl`, and `TemplateTemplateParmDecl`).
2. Accessing the injected template arguments requires the common data
structure to be allocated. This may occur before we determine whether a
previous declaration exists (e.g. when comparing constraints), so if the
template _is_ a redeclaration, we end up discarding the common data
structure.

This patch moves the storage and access of injected template arguments
from `RedeclarableTemplateDecl` to `TemplateParameterList`.
2024-10-29 13:36:55 -04:00
Aaron Ballman
4abc357407 Nominate Sirraide for AST visitors and Sema (#114092)
Sirraide has been actively reviewing Sema code for a while now and
definitely has the expertise to help maintain that section of the
compiler. Further, he has been refactoring AST visitors to try to reduce
the compile time overhead associated with them and would be a good
resource for keeping an eye on that part of the code base too.
2024-10-29 13:36:22 -04:00
Harald van Dijk
950ee75909 [RISC-V] Fix check of minimum vlen. (#114055)
If we have a minimum vlen, we were adjusting StackSize to change the
unit from vscale to bytes, and then calculating the required padding
size for alignment in bytes. However, we then used that padding size as
an offset in vscale units, resulting in misplaced stack objects.

While it would be possible to adjust the object offsets by dividing
AlignmentPadding by ST.getRealMinVLen() / RISCV::RVVBitsPerBlock, we can
simplify the calculation a bit if instead we adjust the alignment to be
in vscale units.
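
The unit mismatch can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions: the function names are hypothetical, the `max(..., 1)` guard is an addition for the sketch, and the real frame lowering code handles more cases.

```python
RVV_BITS_PER_BLOCK = 64  # mirrors RISCV::RVVBitsPerBlock named in the message


def align_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def padding_in_bytes(stack_size_vscale, align_bytes, real_min_vlen):
    """Padding computed after converting the size to bytes (result is in bytes)."""
    min_vscale = real_min_vlen // RVV_BITS_PER_BLOCK
    size_bytes = stack_size_vscale * min_vscale
    return align_up(size_bytes, align_bytes) - size_bytes


def padding_in_vscale_units(stack_size_vscale, align_bytes, real_min_vlen):
    """Padding computed with the alignment converted to vscale units instead."""
    min_vscale = real_min_vlen // RVV_BITS_PER_BLOCK
    align_vscale = max(align_bytes // min_vscale, 1)  # guard is an assumption
    return align_up(stack_size_vscale, align_vscale) - stack_size_vscale
```

With a minimum VLEN of 128 (so a minimum vscale of 2), a size of 24 vscale units, and a 32-byte alignment, the byte-based padding is 16 while the padding in vscale units is 8. Using the former as an offset in vscale units would displace objects twice as far as intended in this toy model.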

@topperc This fixes a bug I am seeing after #110312, but I am not 100%
certain I am understanding the code correctly, could you please see if
this makes sense to you?
2024-10-29 17:30:30 +00:00
Steven Wu
b510cdb895 [ADT] Add TrieRawHashMap (#69528)
Implement TrieRawHashMap, which can be used to store an object with its
associated hash. The user needs to supply a strong hashing function to
guarantee the uniqueness of the hash of the objects to be inserted. Hash
collisions are not supported and will lead to an error or a failed insert.

TrieRawHashMap is thread-safe and lock-free and can be used as a
foundation data structure to implement content-addressable storage.
TrieRawHashMap owns the data stored in it and is designed to be:
* Fast to look up.
* Fast to "insert" if the data has already been inserted.
* Usable without locks, with no knowledge of the participating threads
and no extra coordination between threads.

It is not currently designed to be used to insert unique new data with
high contention, due to the limitation on the memory allocator.
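
To give a feel for the general idea of a trie keyed by hash, here is a single-threaded Python toy. It is not the LLVM data structure: it is neither lock-free nor thread-safe, the `HashTrie` class and its 16-way branching on hash nibbles are choices made for this sketch, and all digests are assumed to have the same length.

```python
import hashlib


def _nibbles(digest):
    """Split a bytes digest into 4-bit pieces, most significant first."""
    return [n for byte in digest for n in (byte >> 4, byte & 0xF)]


class HashTrie:
    """Toy trie mapping fixed-length digests to values."""

    def __init__(self):
        self.root = {}

    def insert(self, digest, value):
        """Store value under digest; if already present, return the stored value."""
        node = self.root
        for i, nib in enumerate(_nibbles(digest)):
            slot = node.get(nib)
            if slot is None:
                node[nib] = (digest, value)  # new leaf
                return value
            if isinstance(slot, dict):
                node = slot  # descend into an inner node
                continue
            if slot[0] == digest:
                return slot[1]  # already inserted: existing data wins
            # Different digest with the same prefix: push the old leaf down.
            child = {_nibbles(slot[0])[i + 1]: slot}
            node[nib] = child
            node = child
        raise ValueError("digests must have equal length")

    def lookup(self, digest):
        """Return the value stored under digest, or None."""
        node = self.root
        for nib in _nibbles(digest):
            slot = node.get(nib)
            if slot is None:
                return None
            if isinstance(slot, dict):
                node = slot
                continue
            return slot[1] if slot[0] == digest else None
        return None


def digest_of(data):
    """Example strong hash for content addressing (SHA-256)."""
    return hashlib.sha256(data).digest()
```

Inserting the same digest twice returns the value stored the first time, which loosely mirrors the fast "insert" of already-inserted data mentioned above.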
2024-10-29 10:29:39 -07:00
Afanasyev Ivan
4e1b9d34f9 [mir-strip-debug] Fix debug location info strip for bundled instructions (#113676)
Fix a bug where the `mir-strip-debug` pass does not remove debug
locations from bundled instructions.

The problem arises when testing that debug info does not affect the
output of optimization passes (`llvm-lit` with `-Dllc="llc
-debugify-and-strip-all-safe"`), when a pass operates on MIR with bundled
instructions + memory operands.

Suppose a MIR test check looks like:

```
CHECK-NEXT: BUNDLE {
CHECK-NEXT:   $r3 = LD $r1, $r2 :: (load (s64) from %ir.a, !tbaa !2)
CHECK-NEXT: }
```

Since the `mir-strip-debug` pass does not process bundled instructions,
running `llc -debugify-and-strip-all-safe` on the test will produce the
following output:

```
BUNDLE {
  $r3 = LD $r1, $r2, debug-location !DILocation(line: 3, column: 1, scope: <0x608cb2b99b10>) :: (load (s64) from %ir.a, !tbaa !2)
}
```

And the test will fail, but it shouldn't.

The root cause seems to be that the `mir-strip-debug` pass should remove
debug locations from bundled instructions.
2024-10-29 10:26:15 -07:00
Joseph Huber
d661aea4c5 [OpenMP] Add support for custom callback in AMDGPUStream (#112785)
Summary:
We have the ability to schedule callbacks after certain events complete.
Currently we can register an arbitrary callback in CUDA, but can't in
AMDGPU. I am planning on using this support to move the RPC handling to
a separate thread, then using these callbacks to suspend / resume it
when no kernels are running. This is a preliminary patch to keep this
noise out of that one.
2024-10-29 10:18:32 -07:00
Adam Yang
9a5b3a1bbc [DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#111884)
fixes #112974
partially fixes #70103

### Changes
- Added new tablegen based way of lowering dx intrinsics to DXIL ops.
- Added int_dx_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsDirectX.td
- Added expansion for int_dx_group_memory_barrier_with_group_sync in
DXILIntrinsicExpansion.cpp
- Added DXIL backend test case

### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic
#111888](https://github.com/llvm/llvm-project/pull/111888)
2024-10-29 10:17:35 -07:00
Aaron Ballman
f964514490 Nominate Shafik Yaghmour and Vlad Serebrennikov for C++ conformance (#114071)
Shafik and Vlad are both members of WG21 and both have familiarity with
reasoning about the C++ standard. They've both volunteered to help
answer conformance-related questions, and this is an area where we get
quite a few questions, so having a larger stable of maintainers is
quite useful.
2024-10-29 13:16:20 -04:00
Craig Topper
b1d0fe095b [RISCV] Remove trailing whitespace. NFC 2024-10-29 10:09:28 -07:00
Jubilee
f53889ffca [RISCV] Allow crypto features to imply dependents (#112659)
This relationship is a logical dependency.

Note Zvbc and Zvknhb. They are explicitly called out in the spec as
requiring 64 bits:
-
56ed7952d1/doc/vector/riscv-crypto-spec-vector.adoc
2024-10-29 10:07:20 -07:00
Sergio Afonso
a1f2fb6078 [MLIR][OpenMP] Prevent composite omp.simd related crashes (#113680)
This patch updates the translation of `omp.wsloop` with a nested
`omp.simd` to prevent uses of block arguments defined by the latter from
triggering null pointer dereferences.

This happens because the inner `omp.simd` operation representing
composite `do simd` constructs is currently skipped and not translated,
but this results in block arguments defined by it not being mapped to an
LLVM value. The proposed solution is to map these block arguments to the
LLVM value associated to the corresponding operand, which is defined
above.
2024-10-29 17:05:12 +00:00
Valentin Clement (バレンタイン クレメン)
b05fec97d5 [flang][cuda] Convert gpu.launch_func to CUFLaunchClusterKernel when cluster dims are present (#113959)
Kernel launches in CUF are converted to `gpu.launch_func`. When the kernel
has `cluster_dims` specified, these get carried over to the
`gpu.launch_func` operation. This patch updates the special conversion
of `gpu.launch_func` when cluster dims are present to the newly added
entry point.
2024-10-29 10:02:08 -07:00
Valentin Clement (バレンタイン クレメン)
0b700f2333 [flang][cuda] Add entry point to launch global function with cluster_dims (#113958) 2024-10-29 10:01:49 -07:00
Jorge Gorbe Moya
12a8f504cf [SandboxIR] Use the proper gmock public header in unit tests.
This should fix the BuildKite bazel build.
2024-10-29 09:59:03 -07:00
Andrzej Warzyński
39ad84e4d1 [mlir][linalg] Split GenericPadOpVectorizationPattern into two patterns (#111349)
At the moment, `GenericPadOpVectorizationPattern` implements two
orthogonal transformations:
  1. Rewrites `tensor::PadOp` into a sequence of `tensor::EmptyOp`,
    `linalg::FillOp` and `tensor::InsertSliceOp`.
  2. Vectorizes (where possible) `tensor::InsertSliceOp` (see
    `tryVectorizeCopy`).

This patch splits `GenericPadOpVectorizationPattern` into two separate
patterns:
  1. `GeneralizePadOpPattern` for the first transformation (note that
    currently `GenericPadOpVectorizationPattern` inherits from
    `GeneralizePadOpPattern`).
  2. `InsertSliceVectorizePattern` to vectorize `tensor::InsertSliceOp`.

With this change, we gain the following:
  * a clear separation between pre-processing and vectorization
    transformations/stages,
  * a path to support masked vectorisation for `tensor.insert_slice`
    (with a dedicated pattern for vectorization, it is much easier to
    specify the input vector sizes used in masking),
  * more opportunities to vectorize `tensor.insert_slice`.

Note for downstream users:
--------------------------

If you were using `populatePadOpVectorizationPatterns`, following this
change you will also have to add
`populateInsertSliceVectorizationPatterns`.

Finer implementation details:
-----------------------------

1.  The majority of changes in this patch are copy & paste + some edits.
  1.1. The only functional change is that the vectorization of
    `tensor.insert_slice` is now broadly available (as opposed to being
    constrained to the pad vectorization pattern:
    `GenericPadOpVectorizationPattern`).
  1.2. Following-on from the above, `@pad_and_insert_slice_dest` is
    updated. As expected, the input `tensor.insert_slice` Op is no
    longer "preserved" and instead gets vectorized successfully.

2. The `linalg.fill` case in `getConstantPadVal` works under the
   assumption that only _scalar_ source values can be used. That's
   consistent with the definition of the Op, but it's not tested at the
   moment. Hence a test case in Linalg/invalid.mlir is added.

3. The behaviour of the two TD vectorization Ops,
   `transform.structured.vectorize_children_and_apply_patterns` and
   `transform.structured.vectorize` is preserved.
2024-10-29 16:57:23 +00:00
SpencerAbson
2a9dd8af5a [AArch64] Add assembly/disassembly for zeroing SVE FCVT{X} and BFCVT (#113916)
This patch adds assembly/disassembly support for the following SVE2.2
instructions

    - FCVT (zeroing)
    - FCVTX (zeroing)
    - BFCVT (zeroing)
    
In accordance with:
https://developer.arm.com/documentation/ddi0602/2024-09/SVE-Instructions
2024-10-29 16:55:19 +00:00
Jonas Devlieghere
2ab98dfe19 [lldb] Update link to GreenDragon in the docs 2024-10-29 09:45:29 -07:00
Fangrui Song
318bdd0aeb [StackSafetyAnalysis] Bail out when calling ifunc
An assertion failure arises when a call instruction calls a GlobalIFunc.
Since we cannot reason about the underlying function, just bail out.

Fix #87923

Pull Request: https://github.com/llvm/llvm-project/pull/113841
2024-10-29 09:26:47 -07:00
Jorge Gorbe Moya
4df71ab78e [SandboxIR] Add callbacks for instruction insert/remove/move ops (#112965) 2024-10-29 09:25:51 -07:00
Hugo Trachino
a9c417c28a [MLIR][SCF] Fix LoopPeelOp documentation (NFC) (#113179)
As an example, I added annotations to the peel_front unit test.

```
func.func @loop_peel_first_iter_op() {
  // CHECK: %[[C0:.+]] = arith.constant 0
  // CHECK: %[[C41:.+]] = arith.constant 41
  // CHECK: %[[C5:.+]] = arith.constant 5
  // CHECK: %[[C5_0:.+]] = arith.constant 5
  // CHECK: scf.for %{{.+}} = %[[C0]] to %[[C5_0]] step %[[C5]]
  // CHECK:   arith.addi
  // CHECK: scf.for %{{.+}} = %[[C5_0]] to %[[C41]] step %[[C5]]
  // CHECK:   arith.addi
  %0 = arith.constant 0 : index
  %1 = arith.constant 41 : index
  %2 = arith.constant 5 : index
  scf.for %i = %0 to %1 step %2 {
    arith.addi %i, %i : index
  }
  return
}

module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["arith.addi"]} in %arg1 : (!transform.any_op) -> !transform.any_op
    %1 = transform.get_parent_op %0 {op_name = "scf.for"} : (!transform.any_op) -> !transform.op<"scf.for">
    %main_loop, %remainder = transform.loop.peel %1 {peel_front = true} : (!transform.op<"scf.for">) -> (!transform.op<"scf.for">, !transform.op<"scf.for">)
    transform.annotate %main_loop "main_loop" : !transform.op<"scf.for">
    transform.annotate %remainder "remainder" : !transform.op<"scf.for">
    transform.yield
  }
}
```
Gives:
```
  func.func @loop_peel_first_iter_op() {
    %c0 = arith.constant 0 : index
    %c41 = arith.constant 41 : index
    %c5 = arith.constant 5 : index
    %c5_0 = arith.constant 5 : index
    scf.for %arg0 = %c0 to %c5_0 step %c5 {
      %0 = arith.addi %arg0, %arg0 : index
    } {remainder}  // The first iteration loop (second result) has been annotated remainder
    scf.for %arg0 = %c5_0 to %c41 step %c5 {
      %0 = arith.addi %arg0, %arg0 : index
    } {main_loop} // The main loop (first result) has been annotated main_loop
    return
  }
```
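
The bounds in the output above can be reproduced with a small Python sketch of front peeling. The `peel_front` function is a hypothetical helper, not the MLIR implementation, and it assumes a positive step and a loop that runs at least once.

```python
def peel_front(lb, ub, step):
    """Split the loop (lb, ub, step) into a first-iteration loop and the rest.

    Assumes step > 0 and lb < ub (the loop executes at least once).
    """
    split = lb + step
    first_iteration = (lb, split, step)
    rest = (split, ub, step)
    return first_iteration, rest
```

For the loop in the example (0 to 41 step 5), this gives a first loop from 0 to 5 and a second loop from 5 to 41, and the two together visit the same induction values as the original loop.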

---------

Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
2024-10-29 15:47:13 +00:00
Simon Pilgrim
bf6c483e47 [clang][x86] Add constexpr support for SSE2 _mm_set*_epi* intrinsics 2024-10-29 15:39:15 +00:00
LLVM GN Syncbot
af44976cad [gn build] Port 6128ff6630 2024-10-29 15:18:09 +00:00
LLVM GN Syncbot
f906d765ba [gn build] Port 5ea694816b 2024-10-29 15:18:08 +00:00