Commit Graph

53645 Commits

Author SHA1 Message Date
Sameer Sahasrabuddhe
61b5cc6654 [LLVM] ConvergenceControlInst as a derived class of IntrinsicInst (#76230) 2023-12-23 07:58:43 +05:30
Yingwei Zheng
345d7b1618 [InstCombine] Fold minmax intrinsic using KnownBits information (#76242)
This patch tries to fold minmax intrinsic by using
`computeConstantRangeIncludingKnownBits`.
Fixes regression in
[_karatsuba_rec:cpython/Modules/_decimal/libmpdec/mpdecimal.c](c31943af16/Modules/_decimal/libmpdec/mpdecimal.c (L5460-L5462)),
which was introduced by #71396.
See also
https://github.com/dtcxzyw/llvm-opt-benchmark/issues/16#issuecomment-1865875756.

Alive2 for splat vectors with undef: https://alive2.llvm.org/ce/z/J8hKWd
2023-12-23 04:41:32 +08:00
Lucas Duarte Prates
e4f1c52832 [AArch64] Assembly support for the Armv9.5-A Memory System Extensions (#76237)
This implements assembly support for the Memory Systems Extensions
introduced as part of the Armv9.5-A architecture version.
The changes include:
* New subtarget feature for FEAT_TLBIW.
* New system registers for FEAT_HDBSS:
  * HDBSSBR_EL2 and HDBSSPROD_EL2.
* New system registers for FEAT_HACDBS:
  * HACDBSBR_EL2 and HACDBSCONS_EL2.
* New TLBI instructions for FEAT_TLBIW:
  * VMALLWS2E1(nXS), VMALLWS2E1IS(nXS) and VMALLWS2E1OS(nXS).
* New system register for FEAT_FGWTE3:
  * FGWTE3_EL3.
2023-12-22 14:40:29 +00:00
Abhina Sree
d430c145ba [CMake] Move check for dlfcn.h and dladdr to clang (#76163)
This patch checks for the presence of dlfcn.h and dladdr in clang to be used in clang/tools/libclang/CIndexer.cpp
2023-12-22 08:12:19 -05:00
Wang Pengcheng
17858ce6f3 [MacroFusion] Remove createBranchMacroFusionDAGMutation (#76209)
Instead, we add a `BranchOnly` parameter to indicate that only
branches with their predecessors will be fused.

X86 is the only user of `createBranchMacroFusionDAGMutation`.
2023-12-22 16:31:38 +08:00
Cyndy Ishida
38eea57e69 [ADT] fix grammatical typo in Twine.h docs, NFC 2023-12-21 14:59:14 -08:00
Felipe de Azevedo Piovezan
058e527434 [AccelTable][NFC] Fix typos and duplicated code (#76155)
Renaming a member variable from "Endoding" to "Encoding".

Also replace inlined code for "isNormalized" with a call to the
function, so that if the definition of normalization ever changes, we
only need to change the one place.
2023-12-21 16:10:30 -03:00
Tomas Matheson
7bd17212ef Re-land "[AArch64] Codegen support for FEAT_PAuthLR" (#75947)
This reverts commit 9f0f558742.

Fix expensive checks failure by properly marking register def for ADR.
2023-12-21 18:32:55 +00:00
Tomas Matheson
9f0f558742 Revert "[AArch64] Codegen support for FEAT_PAuthLR"
This reverts commit 5992ce90b8.

Buildbot failures with expensive checks enabled.
2023-12-21 16:25:55 +00:00
Kazu Hirata
e01c063684 [llvm] Use DenseMap::contains (NFC) 2023-12-21 08:18:47 -08:00
Nikita Popov
b8df88b41c [InstCombine] Support zext nneg in gep of sext add fold
Add m_NNegZext() and m_SExtLike() matchers to make doing these kinds
of changes simpler in the future.
2023-12-21 16:38:09 +01:00
Jay Foad
70b00b4a6a [AMDGPU] Rename AMDGPUGlobalAtomicRtn -> AMDGPUAtomicRtn (#76157)
It is used for FLAT atomics as well as Global atomics.
2023-12-21 14:53:17 +00:00
Tomas Matheson
5992ce90b8 [AArch64] Codegen support for FEAT_PAuthLR
- Adds a new +pc option to -mbranch-protection that will enable
  the use of PC as a diversifier in PAC branch protection code.

- When +pauth-lr is enabled (-march=armv9.5a+pauth-lr) in combination
  with -mbranch-protection=pac-ret+pc, the new 9.5-a instructions
  (pacibsppc, retaasppc, etc) are used.

Documentation for the relevant instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/

Co-authored-by: Lucas Prates <lucas.prates@arm.com>
2023-12-21 14:18:33 +00:00
Paschalis Mpeis
c4ff0a67d1 [TLI] Add getLibFunc that accepts an Opcode and scalar Type. (#75919)
It sets a LibFunc in the same way as the other two getLibFunc methods.
Currently, it supports only the FRem Instruction.

Add tests for FRem.
2023-12-21 11:02:54 +00:00
boxu.zhang
d3ef867082 [LoopUnroll] Make UnrollMaxUpperBound to be overridable by target (#76029)
UnrollMaxUpperBound should be target-dependent, since different chips
provide different register sets and therefore differ in how many
temporary values of a program they can hold. This patch adds a
MaxUpperBound value to UnrollingPreference that targets can override.
All uses of UnrollMaxUpperBound are replaced with UP.MaxUpperBound.

The default value is still 8, and the command-line argument
'--unroll-max-upperbound' takes final effect if provided.
2023-12-21 09:47:46 +01:00
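The precedence described in the commit above (default of 8, then a target override, then the command-line flag) can be sketched in a few lines. This is a minimal standalone illustration, not LLVM code: `UnrollPrefs` and `resolveMaxUpperBound` are hypothetical names, and only the `MaxUpperBound` field name and the default of 8 come from the commit message.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the per-target unrolling preferences.
struct UnrollPrefs {
  unsigned MaxUpperBound = 8; // the default stays 8
};

// The command-line value, if given, wins; otherwise the (possibly
// target-overridden) preference is used.
inline unsigned resolveMaxUpperBound(const UnrollPrefs &UP,
                                     std::optional<unsigned> CommandLine) {
  return CommandLine ? *CommandLine : UP.MaxUpperBound;
}

// Small usage example.
inline void exampleResolveMaxUpperBound() {
  UnrollPrefs UP;
  assert(resolveMaxUpperBound(UP, std::nullopt) == 8); // plain default
  UP.MaxUpperBound = 16;                               // target override
  assert(resolveMaxUpperBound(UP, std::nullopt) == 16);
  assert(resolveMaxUpperBound(UP, 4u) == 4);           // flag has the last word
}
```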
Ivan R. Ivanov
39f09ec245 Invalidate analyses after running Attributor in OpenMPOpt (#74908)
Using the LoopInfo from OMPInfoCache after the Attributor ran resulted
in a crash due to it being in an invalid state.

---------

Co-authored-by: Ivan Radanov Ivanov <ivanov2@llnl.gov>
2023-12-20 15:01:21 -08:00
Cyndy Ishida
c6f29dbb59 [readtapi] Setup simple stubify support (#76075)
Stubify broadly takes either tbd files or binary dylibs and turns them
into tbd files. In future patches, stubify will also allow additional
information to be embedded into the final TBD output too.

Add Util APIs to TextAPI for common operations used by readtapi for now.
2023-12-20 14:56:53 -08:00
Sam Clegg
4e8cb01b01 [WebAssembly] Add symbol information for shared libraries (#75238)
The current (experimental) spec for WebAssembly shared libraries does
not include a full symbol table like the object format. This change
extracts symbol information from the normal wasm exports.

This is the first step in having the linker report undefined symbols
when linking with shared libraries. The current behaviour is to ignore
all undefined symbols when linking with `-pie` or `-shared`.

See https://github.com/emscripten-core/emscripten/issues/18198
2023-12-20 11:13:09 -08:00
Cyndy Ishida
5ea15fab19 [TextAPI] Add support to convert RecordSlices -> InterfaceFile (#75007)
Introduce RecordVisitor. This is used for different clients that want to
extract information out of RecordSlice types.
The first and immediate use case is for serializing symbol information
into TBD files.
2023-12-20 08:47:10 -08:00
Lucas Duarte Prates
d43fc5a6ad Reland: [AArch64] Assembly support for the Checked Pointer Arithmetic Extension (#73777)
This introduces assembly support for the Checked Pointer Arithmetic
Extension (FEAT_CPA), announced as part of the Armv9.5-A architecture
version.

The changes include:
* New subtarget feature for FEAT_CPA
* New scalar instruction for pointer arithmetic
  * ADDPT, SUBPT, MADDPT, and MSUBPT
* New SVE instructions for pointer arithmetic
  * ADDPT (vectors, predicated), ADDPT (vectors, unpredicated)
  * SUBPT (vectors, predicated), SUBPT (vectors, unpredicated)
  * MADPT and MLAPT
* New ID_AA64ISAR3_EL1 system register

More details about the extension can be found at:
* https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023
* https://developer.arm.com/documentation/ddi0602/2023-09/

Co-authored-by: Rodolfo Wottrich <rodolfo.wottrich@arm.com>
2023-12-20 15:43:17 +00:00
Abhina Sree
e86a02ce89 Use llvm-config.h in CIndexer.cpp instead of private header (#75928)
2023-12-20 08:44:40 -05:00
Paschalis Mpeis
2349731992 [TLI] Add SLEEFGNUABI mappings for fmod/fmodf fixed-width. (#75803)
Cleanup test sleef-calls-aarch64.ll:
- make the util update script's regex more clear
- eliminate scalar epilogues in tests
2023-12-20 09:08:17 +00:00
Joseph Huber
deab58d127 [ELF] Add CPU name detection for CUDA architectures (#75964)
Summary:
Recently we added support for detecting the CUDA processor with the ELF
flags. This allows us to get a string representation of it in other
code. This will be used by the offloading runtime.
2023-12-19 20:01:15 -06:00
Mingming Liu
78a195e100 Reland the reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables. " (#75954)
Simplify the compiler-rt test to make it more general for different
platforms, and use `*DAG` matchers for lines that may be emitted
out-of-order.
- The compiler-rt test passed on a Windows machine. Previously, name
matchers didn't work for MSVC mangling
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907)
- `*DAG` matchers fixed the error in
https://lab.llvm.org/buildbot/#/builders/94/builds/17924

This is the second reland; it fixes errors caught in the first reland
(https://github.com/llvm/llvm-project/pull/75860)

**Original commit message**
Since commit fe05193 (phab D156569), IRPGO names use the format
`[<filepath>;]<linkage-name>`, while the prior format was
`[<filepath>:]<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`.

This patch changes `GlobalValues::getGlobalIdentifier` to use the
semicolon.

To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

* `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-19 12:25:56 -08:00
Igor Kudrin
1e91f32ef7 [CommandLine] Add subcommand groups (#75678)
The patch introduces a `SubCommandGroup` class which represents a list
of subcommands. An option can be added to all these subcommands using
one `cl::sub(group)` command. This simplifies the declaration of options
that are shared across multiple subcommands of a tool.
2023-12-20 02:45:29 +07:00
Igor Kudrin
6a2a99fb45 [CommandLine][NFCI] Simplify enumerating subcommands of an option (#75679)
The patch adds a helper method to iterate over all subcommands to which
an option belongs. Duplicate code is removed and replaced with calls to
this new method.
2023-12-20 02:39:32 +07:00
Yusra Syeda
0768253c20 [SystemZ][z/OS] Add exception handling for XPLINK (#74638)
Adds emitting the exception table and the EH registers for XPLINK.

---------

Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
2023-12-19 13:58:33 -05:00
Nikita Popov
a7c05bfd16 [ValueLattice] Remove redundant check (NFC)
This will already be checked inside markConstant().
2023-12-19 15:22:23 +01:00
Nikita Popov
0d3d445223 [LVI] Remove unnecessary TLI dependency
Only used in ConstantFoldCompareInstOperands(), which does not
actually use TLI.
2023-12-19 14:32:43 +01:00
Paschalis Mpeis
ddb6db4d09 [VFABI] Create FunctionType for vector functions (#75058)
`createFunctionType` returns a FunctionType that may contain a mask,
which is currently placed as the last parameter to the Function.
The placement happens according to `VFParameters` of `VFInfo`, and it
should be able to handle VFABI specification changes.

Regarding the return type, it uses the scalar type of the input instruction,
as the specification does not encode such information in the mangled name.
If that ever happens, that information should be available from `VFInfo`.
2023-12-19 12:05:28 +00:00
Wang Pengcheng
9348d437f5 [SelectionDAG] Add space-optimized forms of OPC_EmitRegister (#73291)
The byte following `OPC_EmitRegister` is an MVT type, which is
usually i32 or i64.

We add `OPC_EmitRegisterI32` and `OPC_EmitRegisterI64` so that we
can save one byte.

Overall this reduces the llc binary size with all in-tree targets by
about 10K.
2023-12-19 17:31:49 +08:00
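The space saving described in the commit above comes from folding the common type byte into the opcode itself. A minimal standalone sketch of that encoding idea follows. The opcode and type numbering, the one-byte register operand, and the `emitRegister` helper are all assumptions made for illustration and do not reflect LLVM's real matcher table layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative numbering only; not LLVM's real opcode or MVT values.
enum Opcode : std::uint8_t { EmitRegister, EmitRegisterI32, EmitRegisterI64 };
enum ValueType : std::uint8_t { I32, I64, F32 };

// Append one "emit register" record to a byte table.
//   Generic form:     [EmitRegister, type, reg]   (3 bytes)
//   Specialized form: [EmitRegisterI32/I64, reg]  (2 bytes)
inline void emitRegister(std::vector<std::uint8_t> &Table, ValueType VT,
                         std::uint8_t Reg) {
  switch (VT) {
  case I32:
    Table.push_back(EmitRegisterI32); // type is implied by the opcode
    break;
  case I64:
    Table.push_back(EmitRegisterI64); // type is implied by the opcode
    break;
  default:
    Table.push_back(EmitRegister);
    Table.push_back(VT); // explicit type byte for everything else
    break;
  }
  Table.push_back(Reg);
}

// Small usage example: the common i32 case takes one byte less.
inline void exampleEmitRegister() {
  std::vector<std::uint8_t> Common, Rare;
  emitRegister(Common, I32, 5);
  emitRegister(Rare, F32, 5);
  assert(Common.size() + 1 == Rare.size());
}
```

Because i32 and i64 dominate in practice, most records take the two-byte form, which is where a table-wide saving would come from.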
paperchalice
72c75501ec [CodeGen] Port LowerEmuTLS to new pass manager (#75171)
In fact, this pass needs `llc` to test. `TargetMachine` seems redundant,
because before adding this pass `CodeGenPassBuilder` already checks it:

ed4194bb8d/llvm/include/llvm/CodeGen/CodeGenPassBuilder.h (L590-L592)
2023-12-19 14:44:35 +08:00
Teresa Johnson
6a7bbf712d [memprof][NFC] Free symbolizer memory eagerly (#75849)
Move the ownership of the symbolizer into symbolizeAndFilterStackFrames
so that it is freed on exit, when we are done with it, to reduce peak
memory in the reader. This cuts the peak by about 9G for one large
profile.
2023-12-18 20:50:08 -08:00
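The ownership change described in the commit above follows a common C++ pattern: pass the owning pointer into the function that is its last user, so the resource is released as soon as that call completes. The sketch below is a standalone illustration under that assumption. `Symbolizer` here is a hypothetical stand-in that only counts live instances, and `symbolizeAndFilter` is an illustrative name, not the real memprof reader API.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a memory-hungry symbolizer; counts live instances.
struct Symbolizer {
  static int &liveCount() {
    static int N = 0;
    return N;
  }
  Symbolizer() { ++liveCount(); }
  ~Symbolizer() { --liveCount(); }
};

// Taking the unique_ptr by value moves ownership into the function, so the
// symbolizer is destroyed by the time the call completes instead of living
// as long as the caller does.
inline void symbolizeAndFilter(std::unique_ptr<Symbolizer> S) {
  // ... a real implementation would symbolize and filter frames with *S ...
  (void)S;
}

// Small usage example.
inline void exampleEagerFree() {
  auto S = std::make_unique<Symbolizer>();
  assert(Symbolizer::liveCount() == 1);
  symbolizeAndFilter(std::move(S)); // ownership handed over
  assert(Symbolizer::liveCount() == 0); // already freed here
}
```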
Mingming Liu
6ce23ea0ab Revert "Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables. "" (#75888)
Reverts llvm/llvm-project#75860
- Mangled name mismatch on Windows
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907/steps/8/logs/stdio)
2023-12-18 19:31:18 -08:00
Mingming Liu
c5871712ae Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables. " (#75860)
Fixed build-bot failures caught by post-submit tests
1) Add the list of command line tools needed by the new compiler-rt test to the test dependencies.
2) Use `starts_with` to replace deprecated `startswith`.

**Original commit message**
Since commit fe05193 (phab D156569), IRPGO names use the format
`[<filepath>;]<linkage-name>`, while the prior format was
`[<filepath>:]<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`.

This patch changes `GlobalValues::getGlobalIdentifier` to use the
semicolon.

To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

* `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 17:43:40 -08:00
criis
c0d5d36dda [llvm][Support] Lift raw_socket_stream implementation out into own files (#75653)
Move the implementation of raw_socket_stream from raw_ostream.h/cpp to
raw_socket_stream.h/cpp as requested in #73603.
2023-12-18 17:13:47 -08:00
Cyndy Ishida
e3627e2690 Reland '[TextAPI] Add DylibReader' (#75862)
> Add support for reading binary Mach-o dynamic libraries. It uses
libObject APIs for extracting information relevant to TAPI and tbd
files. This includes but is not limited to data encoded in load commands,
such as install names, current/compat versions, and symbols.

This originally broke because DylibReader uses Object and Object depends
on TextAPI. Breaking this up in a nested library prevents this cycle.
2023-12-18 16:55:30 -08:00
Justin Bogner
4f54d71501 [HLSL][DirectX] Move handling of resource element types into the frontend
Rather than shepherding a type name all the way to the backend as a
string and attempting to parse it, get the element type out of the AST
and store that in the resource annotation metadata directly.

Pull Request: https://github.com/llvm/llvm-project/pull/75674
2023-12-18 11:43:52 -07:00
Fangrui Song
96aca7c517 [LTO] Improve diagnostics handling when parsing module-level inline assembly (#75726)
Non-LTO compiles set the buffer name to "<inline asm>"
(`AsmPrinter::addInlineAsmDiagBuffer`) and pass diagnostics to
`ClangDiagnosticHandler` (through the `MCContext` handler in
`MachineModuleInfoWrapperPass::doInitialization`) to ensure that
the exit code is 1 in the presence of errors. In contrast, LTO compiles
spuriously succeed even if error messages are printed.

```
% cat a.c
void _start() {}
asm("unknown instruction");
% clang -c a.c
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
    1 | unknown instruction
      | ^
1 error generated.
% clang -c -flto a.c; echo $?  # -flto=thin is the same
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
0
```

`CollectAsmSymbols` parses inline assembly and is transitively called by
both `ModuleSummaryIndexAnalysis::run` and `WriteBitcodeToFile`, leading
to duplicate diagnostics.

This patch updates `CollectAsmSymbols` to be similar to non-LTO
compiles.
```
% clang -c -flto=thin a.c; echo $?
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
    1 | unknown instruction
      | ^
1 errors generated.
1
```

The `HasErrors` check does not prevent duplicate warnings but assembler
warnings are very uncommon.
2023-12-18 09:46:58 -08:00
Mingming Liu
3aa5d71127 Revert "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables." (#75835)
Reverts llvm/llvm-project#74008

The compiler-rt test failed due to `llvm-dis` not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884)
Will revert and investigate how to require the proper dependency.
2023-12-18 09:39:55 -08:00
Mingming Liu
245cddae70 [PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables. (#74008)
Since commit fe05193 (phab D156569), IRPGO names use the format
`[<filepath>;]<linkage-name>`, while the prior format was
`[<filepath>:]<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`.

This patch changes `GlobalValues::getGlobalIdentifier` to use the
semicolon.

To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

* `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 09:10:39 -08:00
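The failure mode described in the commit above (a hash computed over one name format cannot be found in an index built from the other) can be reproduced in a small standalone sketch. Everything here is an assumption made for illustration: FNV-1a stands in for the MD5 hash used by the real profile format, and `hashName`, `makeLocalName`, and `canImport` are hypothetical helpers rather than LLVM functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in for MD5 (assumption): 64-bit FNV-1a over the identifier bytes.
inline std::uint64_t hashName(const std::string &Name) {
  std::uint64_t H = 14695981039346656037ull; // FNV offset basis
  for (unsigned char C : Name) {
    H ^= C;
    H *= 1099511628211ull; // FNV prime
  }
  return H;
}

// Hypothetical helper: prefix a local-linkage symbol with its file path.
inline std::string makeLocalName(const std::string &FilePath,
                                 const std::string &Symbol, char Delim) {
  return FilePath + Delim + Symbol;
}

// Import side: index the candidate by the hash of its identifier, then look
// up the hash that was recorded in the profile.
inline bool canImport(std::uint64_t ProfiledHash, const std::string &FilePath,
                      const std::string &Symbol, char ImportDelim) {
  std::unordered_map<std::uint64_t, std::string> Index;
  Index[hashName(makeLocalName(FilePath, Symbol, ImportDelim))] = Symbol;
  return Index.count(ProfiledHash) != 0;
}

// Small usage example: the profile side uses ';' for the identifier.
inline void exampleDelimiterMismatch() {
  std::uint64_t Profiled = hashName(makeLocalName("a.c", "callee", ';'));
  assert(canImport(Profiled, "a.c", "callee", ';'));  // formats agree
  assert(!canImport(Profiled, "a.c", "callee", ':')); // formats disagree
}
```

The second lookup fails because the two identifiers differ by a single byte, which is enough to change the hash. Using one delimiter on both sides, as the patch does, removes that mismatch.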
arrv-sc
74cf5254d2 [llvm][Support] Add indirection to call correct validate(...) function (#71966)
Previously, the "yamlize" overload for validatedMappingTraits unconditionally called "MappingTraits<T>::validate" even if "MappingContextTraits<T, Context>" was passed to it.

As a result, compilation failed when specifying "MappingContextTraits<T,Context>::validate()".
2023-12-18 13:02:17 +00:00
Paul Walker
dea16ebd26 [LLVM][IR] Replace ConstantInt's specialisation of getType() with getIntegerType(). (#75217)
The specialisation will not be valid when ConstantInt gains native
support for vector types.

This is largely a mechanical change but with extra attention paid to constant
folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to
remove the need to call `getIntegerType()`.

Co-authored-by: Nikita Popov <github@npopov.com>
2023-12-18 11:58:42 +00:00
Craig Topper
dbe9a60256 [RISCV] Correct the VLOperand for masked vssrl/vssra intrinsics.
Though I can't prove it matters for anything. The only use of
VLOperand I know of is for handling i64 splat operands to .vx
intrinsics on RV32. Shifts are special and always use XLen for .vx
so they are always legal.
2023-12-17 17:42:08 -08:00
Kazu Hirata
5ac12951b4 [ADT] Deprecate StringRef::{starts,ends}with (#75491)
This patch deprecates StringRef::{starts,ends}with.  Note that I've
replaced all known uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
2023-12-17 15:52:50 -08:00
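For readers unfamiliar with the naming this commit aligns with, the sketch below shows the prefix and suffix checks as free functions over `std::string_view`. It is written without the C++20 member functions so that it also builds as C++17. `startsWith` and `endsWith` are hypothetical helper names chosen for this illustration; they are not the `StringRef` methods themselves.

```cpp
#include <cassert>
#include <string_view>

// True if S begins with Prefix (an empty prefix always matches).
inline bool startsWith(std::string_view S, std::string_view Prefix) {
  return S.size() >= Prefix.size() && S.substr(0, Prefix.size()) == Prefix;
}

// True if S ends with Suffix (an empty suffix always matches).
inline bool endsWith(std::string_view S, std::string_view Suffix) {
  return S.size() >= Suffix.size() &&
         S.substr(S.size() - Suffix.size()) == Suffix;
}

// Small usage example.
inline void examplePrefixSuffix() {
  assert(startsWith("llvm.loop.unroll", "llvm."));
  assert(endsWith("libfoo.tbd", ".tbd"));
  assert(!startsWith("ll", "llvm")); // prefix longer than the string
}
```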
Kazu Hirata
2570c7e284 [CodeGen] Remove unused forward declarations (NFC) 2023-12-17 09:09:39 -08:00
Kazu Hirata
4b3078ef2d [CodeGen] Remove unnecessary includes (NFC) 2023-12-17 09:09:38 -08:00
melonedo
3eaed9e6f5 [RISCV] Implement intrinsics for XCVbitmanip extension in CV32E40P (#74993)
Implement XCVbitmanip intrinsics for CV32E40P according to the
specification.

This commit is part of a patch-set to upstream the vendor specific
extensions of CV32E40P that need LLVM intrinsics to implement Clang
builtins.

Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill,
@NandniJamnadas, @PaoloS02, @simonpcook, @xingmingjie.

Spec:
05481cf0ef/specifications/corev-builtin-spec.md (listing-of-pulp-bit-manipulation-builtins-xcvbitmanip).

Previously reviewed on Phabricator: https://reviews.llvm.org/D157510.
Parallel GCC patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635795.html.

Co-authored-by: melonedo <funanzeng@gmail.com>
2023-12-17 19:29:40 +08:00
Kazu Hirata
a3952b4f02 [Analysis] Remove unused forward declarations (NFC) 2023-12-17 00:57:24 -08:00
Joseph Huber
8c262ed2e3 [NVPTX] Add ELF flags for Nvidia cubin files (#75624)
Summary:
Nvidia uses ELF as its file format for cubin files. This patch adds
support to allow detecting the architecture using the ELF flags only.
This will be used in the offloading runtime in the future.

These values are completely undocumented. They were determined by
manually modifying the ELF header of the cubin and checking the output
of the `nvdisasm` tool.
2023-12-15 13:47:28 -06:00