This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
For DirectX, program signatures are encoded into three different binary
sections depending on if the signature is for inputs, outputs, or
patches. All three signature types use the same data structure encoding
so they can share a lot of logic.
This patch adds support for reading and writing program signature data
as both yaml and binary data.
Fixes#57743 and #57744
To make DWARFDebugAbbrev more amenable to error-handling, I would like
to change the return type of DWARFDebugAbbrev::parse from `void` to
`Error`. Users of DWARFDebugAbbrev can consume the error if they want to
use all the valid DWARF that was parsed (without worrying about the
malformed DWARF) or stop when the parse fails if the use case needs to
be strict.
This also will bring the LLVM DWARFDebugAbbrev interface closer to
LLDB's which opens up the opportunity for LLDB adopt the LLVM
implementation with minimal changes.
Currently, obj2yaml doesn't emit the offset of program headers, leaving
it to yaml2obj to calculate offsets based on `FirstSec` and `LastSec`.
This causes an obj2yaml->yaml2obj round trip to often produce an ELF
file that is not equivalent to the original, especially since it seems
common to have program headers at offset 0 whose first section starts at
a higher address. Besides being non-equivalent, the produced ELF files
also do not seem to work propery and readelf complains about them.
Taking a simple hello world program in C, compiled using either GCC or
Clang, the original ELF file has the following program headers (only
showing some relevant ones):
```
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x00000000000002d8 0x00000000000002d8 R 0x8
INTERP 0x0000000000000318 0x0000000000000318 0x0000000000000318
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000630 0x0000000000000630 R 0x1000
LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000
0x0000000000000161 0x0000000000000161 R E 0x1000
...
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
03 .init .plt .text .fini
...
```
While this is the result of an obj2yaml->yaml2obj round trip:
```
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000000 0x0000000000000040 0x0000000000000040
0x0000000000000000 0x0000000000000000 R 0x8
readelf: Error: the PHDR segment is not covered by a LOAD segment
INTERP 0x0000000000000318 0x0000000000000318 0x0000000000000318
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000318 0x0000000000000000 0x0000000000000000
0x0000000000000318 0x0000000000000318 R 0x1000
LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000
0x0000000000000161 0x0000000000000161 R E 0x1000
...
Section to Segment mapping:
Segment Sections...
00
01 .interp
02
03 .init .plt .text .fini
...
```
Note that the offset of segment 2 changed from 0x0 to 0x318. This has
two effects:
- readelf complains "Error: the PHDR segment is not covered by a LOAD
segment" since PHDR was originally covered by segment 2 but not
anymore;
- Segment 2 effectively became empty according to the section to segment
mapping.
I addition to these, the output doesn't correctly execute anymore,
crashing with a "SIGSEGV (Address boundary error)".
This patch fixes the difference in program header layout after a round
trip by explicitly emitting offsets.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D145555
The DXContainer pipeline state information encodes a bunch of mask
vectors that are used to track things about the inputs and outputs from
each shader.
This adds support for reading and writing them throught he YAML test
interfaces. The writing logic in MC is extremely primitive and we'll
want to revisit the API for that, but since I'm not sure how we'll want
to generate the mask bits from DXIL during code generation I didn't want
to spend too much time on the API.
Fixes#59479
The pipeline state data captured in the PSV0 section of the DXContainer
file encodes signature elements which are read by the runtime to map
inputs and outputs from the GPU program.
This change adds support for generating and parsing signature elements
with testing driven through the ObjectYAML tooling.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D157671
Initially landed as 8c567e64f8, and
reverted in 4d800633b2.
../llvm/include/llvm/BinaryFormat/DXContainerConstants.def
../llvm/test/ObjectYAML/DXContainer/PSVv1-amplification.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv1-compute.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv1-domain.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv1-geometry.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv1-vertex.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv2-amplification.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv2-compute.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv2-domain.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv2-geometry.yaml
../llvm/test/ObjectYAML/DXContainer/PSVv2-vertex.yaml
The pipeline state data captured in the PSV0 section of the DXContainer
file encodes signature elements which are read by the runtime to map
inputs and outputs from the GPU program.
This change adds support for generating and parsing signature elements
with testing driven through the ObjectYAML tooling.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D157671
The offset recorded in export trie nodes is used to capture symbol
information like address and node size. This offset should be relative to the
root of the trie. This updates to `processExportNode` to follow that
expectation.
Reviewed By: pete, steven_wu
Differential Revision: https://reviews.llvm.org/D157431
Previously when objcopy generated section headers, it padded the LEB
that encodes the section size out to 5 bytes, matching the behavior of
clang. This is correct, but results in a binary that differs from the
input. This can sometimes have undesirable consequences (e.g. breaking
source maps).
This change makes the object reader remember the size of the LEB
encoding in the section header, so that llvm-objcopy can reproduce it
exactly. For sections not read from an object file (e.g. that
llvm-objcopy is adding itself), pad to 5 bytes.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D155535
When writing this initially I missed including the resource stride.
This change adds the resources stride to the serialized value.
I've also extended the testing and error reporting around parsing PSV
information. This adds tests to verify that the reader produces
meaningful error messages for malformed DXContainer files, and a test
that verifies the resource stride is respected in the reader even if
the stride isn't an expected or known value (as would happen if the
format changes in the future).
This is part of #59479.
Reviewed By: bogner, bob80905
Differential Revision: https://reviews.llvm.org/D155143
In an attempt to make it easier to catch errors when parsing the
debug_abbrev section, we should force users to call `parse` before
calling `begin`. In a follow-up change, I will change the return type of
`parse` from `void` to `Error`.
I also explored using the fallible_iterator pattern instead of forcing
users to parse everything up front. I think it would be a useful and
interesting pattern to implement, but it would require more extensive
changes to both DWARFDebugAbbrev and its users. Because my top priority
is improving the safety around parsing debug_abbrev, I'm opting to
preserve existing behavior until I or somebody else has time to refactor
to be able to implement a fallible_iterator.
Differential Revision: https://reviews.llvm.org/D154655
The generic ABI says:
> Padding is present, if necessary, to ensure 8 or 4-byte alignment for the next note entry (depending on whether the file is a 64-bit or 32-bit object). Such padding is not included in descsz.
Our parsing code currently aligns n_namesz. Fix the bug by aligning the start
offset of the descriptor instead. This issue has been benign because the primary
uses of sh_addralign=8 notes are `.note.gnu.property`, where
`sizeof(Elf_Nhdr) + sizeof("GNU") = 16` (already aligned by 8).
In practice, many 64-bit systems incorrectly use sh_addralign=4 notes.
We can use sh_addralign (= p_align) to decide the descriptor padding.
Treat an alignment of 0 and 1 as 4. This approach matches modern GNU readelf
(since 2018).
We have a few tests incorrectly using sh_addralign=0. We may make our behavior
stricter after fixing these tests.
Linux kernel dumped core files use `p_align=0` notes, so we need to support the
case for compatibility.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D150022
This patch continues implementing DirectX pipeline state validation
information by adding support for resource binding metadata.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D143130
DirectX shader pipeline state validation information is a fairly
complicated to serialize data structure. This patch adds the first bit
of support for reading and writing the runtime info structure which
comes first in the encoded data.
Subsequent patches will flesh out the remaining fields of the data
structure.
There is no official documentation for the format, but the format is
roughly documented in the code comment here:
https://github.com/microsoft/DirectXShaderCompiler/blob/main/include/dxc
/DxilContainer/DxilPipelineStateValidation.h#L731
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D141649
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes check-llvm.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
The exports trie used to be pointed by the information in LC_DYLD_INFO,
but when chained fixups are present, the exports trie is pointed by
LC_DYLD_EXPORTS_TRIE instead.
Modify the Object library to give access to the information pointed by
each of the load commands, and to fallback from one into the other when
the exports are requested.
Modify ObjectYAML to support dumping the export trie when pointed by
LC_DYLD_EXPORTS_TRIE and to parse the existence of a export trie also
when the load command is present.
This is a split of D134250 with improvements on top.
Reviewed By: alexander-shaposhnikov
Differential Revision: https://reviews.llvm.org/D134571
Add basic binary support for chained fixups. This allows basic tests
with chained fixups without trying to create a format for them until the
work on the Object library is considered finished.
Reviewed By: pete
Differential Revision: https://reviews.llvm.org/D134250
DXContainer files contain a part that has an MD5 of the generated
shader. This adds support to the ObjectYAML tooling to expand the hash
part data and hash iteself in preparation for adding hashing support to
DirectX code generation.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D136632
This is a split of D134250.
Supports for parsing and dumping the LC_DATA_IN_CODE contents (as binary
data).
This allows more complete testing of llvm-objdump in D133974.
Reviewed By: Higuoxing
Differential Revision: https://reviews.llvm.org/D134569
DXContainers contain a feature flag part, which stores a bitfield used
to denote what underlying hardware features the shader requires. This
change adds feature flags to the DXContainer YAML tooling to enable
testing generating feature flags during HLSL code generation.
Depends on D133980
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D134315
This patch refactors some of the DXContainer Object and YAML code to
make it easier to add more part parsing.
DXContainer has a whole bunch of constant values, so I've added a
DXContainerConstants.def file which will grow with constant
definitions, but starts with just part identifiers. I've also added a
utility to parse the part magic string into an enum, and converted the
code to use that utility and the enum instead of the part literal
string.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D133980
Similar to D73982 for yaml2obj.
* Hide unrelated options.
* Add an OVERVIEW: message.
* Disallow single-dash long options.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129839
-o is very common among tools. yaml2obj supports -o and it surprised me that
obj2yaml doesn't support -o. Just add it which doesn't take much code.
Differential Revision: https://reviews.llvm.org/D129713
This patchs adds the necessary code for inspecting or creating offloading
binaries using the standing `obj2yaml` and `yaml2obj` features in LLVM.
Depends on D127774
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D127776
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.
Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121346