This patch adds a new flag: `--preserve-input-debuginfo-format`
This flag instructs the tool to not convert the debug info format
(intrinsics/records) of input IR, but to instead determine the format of
the input IR and overwrite the other format-determining flags so that we
process and output the file in the same format that we received it in.
This flag is turned off by llvm-link, llvm-lto, and llvm-lto2, and
should be turned off by any other tool that expects to parse multiple IR
modules and have their debug info formats match.
The motivation for this flag is to allow tools to not convert the debug
info format - verify-uselistorder and llvm-reduce, and any downstream
tools that seek to test or mutate IR as-is, without applying extraneous
modifications to the input. This is a necessary step to using debug
records by default in all (other) LLVM tools.
Before this fix, a duplicate llvm.dbg.value intrinsic referring to an
argument, after an alloca, would be generated with `$noreg`, losing
debug information. Instead, we silently drop the second debug info, so
it doesn't break the first one.
rdar://125375717
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
Reland #82363 after fixing build failure
https://lab.llvm.org/buildbot/#/builders/5/builds/41428.
Memory sanitizer detects usage of `RawData` union member which is not
filled directly. Instead, the code relies on filling `Data` union
member, which is a struct consisting of signing schema parameters.
According to https://en.cppreference.com/w/cpp/language/union, this is
UB:
"It is undefined behavior to read from the member of the union that
wasn't most recently written".
Instead of relying on compiler allowing us to do dirty things, do not
use union and only store `RawData`. Particular ptrauth parameters are
obtained on demand via bit operations.
Original PR description below.
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
- Use computeMaxCallFrameSize() in PEI::calculateCallFrameInfo() instead of duplicating the code.
- Set AdjustsStack in FinalizeISel instead of in computeMaxCallFrameSize().
Fixes: https://github.com/llvm/llvm-project/issues/85254
Hardware loops inserts PHIs at the position `getFirstNonPhi()`, which is
incorrect - instead, `getFirstNonPhiIt()` is required to not insert the
PHI after any debug records that immediately follow the last existing
PHI.
When dumping FDEs, `readelf` prints new location values after
`DW_CFA_advance_loc(*)` instructions, which looks quite convenient:
```
> readelf -wf test.o
...
... FDE ... pc=0000000000000030..0000000000000064
DW_CFA_advance_loc: 4 to 0000000000000034
...
DW_CFA_advance_loc: 4 to 0000000000000038
...
```
This patch makes `llvm-dwarfdump` and `llvm-readobj` do the same.
This patch adds support for parsing the proposed non-instruction debug
info ("RemoveDIs") from textual IR, and adds a test for the parser as well
as a set of verifier tests that are dependent on parsing to fire.
An important detail of this patch is the fact that although we can now
parse in the RemoveDIs (new) and Intrinsic (old) debug info formats, we
will always convert back to the old format at the end of parsing - this
is done for two reasons: firstly to ensure that every tool is able to
process IR printed in the new format, regardless of whether that tool
has had RemoveDIs support added, and secondly to maintain the effect of
the existing flags: for the tools where support for the new format has
been added, we will run LLVM passes in the new format iff
`--try-experimental-debuginfo-iterators=true`, and we will print in the
new format iff `--write-experimental-debuginfo-iterators=true`; the
format of the textual IR input should have no effect on either of these
features.
`DPLabel`, the RemoveDI version of `llvm.dbg.label`, was landed recently
at the same time the RemoveDIs IR printing and verifier patches were
landing. The patches were updated to not miscompile, but did not have
full-featured and correct support for DPLabel built in; this patch makes
the remaining set of changes to enable DPLabel support.
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:
- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
This reapplication changes debug intrinsic declaration removal to only take
place when printing final IR, so that the processing format of the Module
does not affect the output.
This reverts commit d128448efd.
This patch adds support for printing the proposed non-instruction debug
info ("RemoveDIs") out to textual IR. This patch does not add any
bitcode support, parsing support, or documentation.
Printing of the new format is controlled by a flag added in this patch,
`--write-experimental-debuginfo`, which defaults to false. The new
format will be printed *iff* this flag is true, so whether we use the IR
format is completely independent of whether we use non-instruction debug
info during LLVM passes (which is controlled by the
`--try-experimental-debuginfo-iterators` flag).
Even with the flag disabled, some existing tests need to be updated, as this
patch causes debug intrinsic declarations to be changed in a round trip,
such that they always appear at the end of a module and have no attributes
(this has no functional change on the module).
The design of this new IR format was proposed previously on
Discourse, and any further discussion about the design can still be
contributed there:
https://discourse.llvm.org/t/rfc-debuginfo-proposed-changes-to-the-textual-ir-representation-for-debug-values/73491
The function to print DPValues currently tries to incorporate the
function it is part of, which is found through its marker; this means
when we try to print a DPValue with no marker, we dereference a nullptr.
We can print instructions without parents, and so the same should be
true for DPValues; this patch changes DPValue::print to check for a null
marker and avoid dereferencing it.
Fixes issue: https://github.com/llvm/llvm-project/issues/82230
Based on the discussion in
https://github.com/llvm/llvm-project/pull/80229
changed implementation to align with how .debug_abbrev is handled. So
that
.debug_names abbrev tag is a monotonically increasing index. This allows
for
tools like LLDB to access it in constant time using array like data
structure.
clang-19 debug build
before change
[41] .debug_names PROGBITS 0000000000000000 8f9e0350 137fdbe0 00 0 0 4
after change
[41] .debug_names PROGBITS 0000000000000000 8f9e0350 125bfdec 00 0 0 4
Reduction ~19.1MB
Fix crash mentioned in comments on
d759618df7.
The assertion being hit was complaining that we had dangling DPValues;
the DPValues attached to the terminator of StartBlock become dangling
after the terminator is erased, and they're never "flushed" back onto
the new terminator once it's added. Doing that makes the crash go away,
but doesn't replicate existing dbg.* behaviour. See the comment in the
patch.
This change both fixes the crash (because there are now no DPValues left
on the terminator to dangle) and replicates existing behaviour (moves
those DPValues down to the new block).
The current behavior of --verify is that it only verifies debug_info,
debug_abbrev and debug_names. This seems fairly arbitrary and might have
been unintentional, as originally the absence of any section flags
implied "all".
This patch changes the behavior so that the verifier now verifies
everything by default. It revealed two tests that had potentially
invalid DWARF:
1. dwarfdump-str-offsets.s is adding padding between two
debug_str_offset contributions. The standard does not explicitly allow
this behavior. See issue
https://github.com/llvm/llvm-project/issues/81558
2. dwarf5-macro.test uses a checked-in binary that has invalid
debug_str_offsets. One of its entries points to the _middle_ of the
string section:
error: .debug_str_offsets: contribution 0x0: index 0x4: invalid string
offset *0x18 == 0x455D, is neither zero nor immediately following a null
character
If we look at the closest offset to 0x455D in debug_str:
```
0x0000454e: "__SLONG32_TYPE int"
```
0x455D points to "int".
llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to
update the second expression.
Fixes#76545. This is #78606 rebased and with the addition of DPValue handling.
Note the addition of --try-experimental-debuginfo-iterators in the tests and
some shuffling of code in MemoryTaggingSupport.cpp.
This adds the following values to the CodeView.h enums (and updates the
various functions that use them):
* CPUType:
* Added `Unknown`
* This is not currently documented in the online documentation, but this
is present in `cvconst.h` in the latest DIA SDK (Visual Studio 2022,
17.7.6)
* `Unknown` is the CPUType that is emitted by `aliasobj.exe` in the
Compile3Sym records, and can be found in objects that link with
`oldnames.lib`

* SourceLanguage (All of these are documented at
https://learn.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/cv-cfl-lang?view=vs-2022
and are present in `cvconst.h` in the latest DIA SDK (Visual Studio
2022, 17.7.6))
* Added Go
* Added AliasObj
* emitted by `aliasobj.exe` in certain records, can be found in PDBs
that link with `oldnames.lib`
* Changed Swift to the official Microsoft enumeration
* Added `OldSwift`
* The old Swift enumeration of `S` was changed to `OldSwift` to allow
pdb dumping utilities to continue to emit correct source language
information for old PDBs
### WARNING
The `Swift` change is a potentially breaking change, as the swift
compiler will now emit `0x13` for the SourceLanguage type in PDB records
instead of `S`. This could potentially break utilities that relied on
the old enum value.
* CallType
* Added Swift
* This is not currently documented in the online documentation, but this
is present in `cvconst.h` in the latest DIA SDK (Visual Studio 2022,
17.7.6)
At the moment, the emergency spill slot is a fixed object for entry
functions and chain functions, and a regular stack object otherwise.
This patch adopts the latter behaviour for entry/chain functions too. It
seems this was always the intention [1] and it will also save us a bit
of stack space in cases where the first stack object has a large
alignment.
[1]
34c8b835b1
With this, I get a clean test suite running under RemoveDIs, the
non-intrinsic representation of debug-info, including under asan. We've
previously established that we generate identical binaries for some
large projects, so this i just edge-case cleanup. The changes:
* CodeGenPrepare fixups need to apply to dbg.assigns as well as
dbg.values (a dbg.assign is a dbg.value).
* Pin a test for constant-deletion to intrinsic debug-info: this very
rare scenario uses a different kill-location sigil in dbg.value mode to
RemoveDIs mode, which generates spurious test differences.
* Suppress a memory leak in a unit test: the code for dealing with
trailing debug-info in a block is necessarily fiddly, leading to this
leak when testing it. Developer-facing interfaces for moving
instructions around always deal with this behind the scenes.
* SROA, when replacing some vector-loads, needs to insert the
replacement loads ahead of any debug-info records so that their values
remain dominated by a definition. Set the head-bit indicating our
insertion should come before debug-info.
There are some rare circumstances where dbg.assign intrinsics can reach
FastISel. They are a more specialised kind of dbg.value intrinsic with
more information about the originating alloca. They only occur during
optimisation, but might reach FastISel through always_inlining an
optimised function into an optnone function.
This is a slight problem as it's not safe (for debug-info accuracy) to
ignore any intrinsics, and for RemoveDIs (the intrinsic-replacement
project) it causes a crash through an unhandled switch case. To get
around this, we can just treat the dbg.assign as a dbg.value (it's an
actual subclass) and use the variable location information from the
dbg.value fields. This loses a small amount of debug-info about stack
locations, but is more accurate than just ignoring the intrinsic.
(This has popped up deep in an LTO build of a large codebase while
testing RemoveDIs, I figured it'd be good to fix it for the
intrinsic-form at the same time, just to demonstrate the correct
behaviour).
We disabled these extra-special RUNlines due to unexpected interactions
between the various things we've been fixing. Re-enable them (they'll run
on the llvm-new-debug-iterators buildbot) as they all now pass.
The comment and code here seems to match getTypeForExtReturn. The
history shows that at the time this code was added, similar code existed
in SelectionDAGBuilder. SelectionDAGBuiler code has since been
refactored into getTypeForExtReturn.
This patch makes FastISel match SelectionDAGBuilder.
The test changes are because X86 has customization of
getTypeForExtReturn. So now we only extend returns to i8.
Stumbled onto this difference by accident.
The amount and format of output from `llvm-dwarfdump --verify` makes it
quite difficult to know if a change to a tool that produces or modifies
DWARF is causing new problems, or is fixing existing problems. This diff
adds a categorized summary of issues found by the DWARF verifier, on by
default, at the bottom of the error output.
The change includes a new `--error-display` option with 4 settings:
* `--error-display=quiet`: Only display if errors occurred, but no
details or summary are printed.
* `--error-display=summary`: Only display the aggregated summary of
errors with no error detail.
* `--error-display=details`: Only display the detailed error messages
with no summary (previous behavior)
* `--error-display=full`: Display both the detailed error messages and
the aggregated summary of errors (the default)
I changed a handful of tests that were failing due to new output, adding
the flag to use the old behavior for all but a couple. For those two I
added the new aggregated output to the expected output of the test.
The `OutputCategoryAggregator` is a pretty simple little class that
@clayborg suggested to allow code to only be run to dump detail if it's
enabled, while still collating counts of the category. Knowing that the
lambda passed in is only conditionally executed is pretty important
(handling errors has to be done *outside* the lambda). I'm happy to move
this somewhere else (and change/improve it) to be more broadly useful if
folks would like.
---------
Co-authored-by: Kevin Frei <freik@meta.com>
In instcombine, when we sink an instruction into a successor block, we try
to clone and salvage all the variable assignments that use that Value. This
is a behaviour that's (IMO) flawed, but there are important use cases where
we want to avoid regressions, thus we're implementing this for the
non-instruction debug-info representation.
This patch refactors the dbg.value sinking code into it's own function, and
installs a parallel implementation for DPValues, the non-instruction
debug-info container. This is mostly identical to the dbg.value
implementation, except that we don't have an easy-to-access ordering
between DPValues, and have to jump through extra hoops to establish one in
the (rare) cases where that ordering is required.
The test added represents a common use-case in LLVM where these behaviours
are important: a loop has been completely optimised away, leaving several
dbg.values in a row referring to an instruction that's going to sink. The
dbg.values should sink in both dbg.value and RemoveDIs mode, and
additionally only the last assignment should sink.
This commit introduces a helper function to DWARFAcceleratorTable::Entry
which follows DW_IDX_Parent attributes to returns the corresponding
parent Entry in the table.
It is tested by enhancing dwarfdump so that it now prints:
1. When data is corrupt.
2. When parent information is present, but the parent is not indexed.
3. The parent entry offset, when the parent is present and indexed. This
is printed in terms a real entry offset (the same that gets printed at
the start of each entry: "Entry @ 0x..."), instead of the encoded number
in the table (which is an offset from the start off the Entry list).
This makes it easy to visually inspect the dwarfdump and check what the
parent is.
Here's a raft of minor fixes for the RemoveDIs project that's replacing
dbg.value intrinsics with DPValue objects, all IMO trivial:
* When inserting functions or blocks and calling setIsNewDbgInfoFormat,
do that after setting the Parent pointer, just in case conversion from
(or to) dbg.value mode is triggered.
* When transferring DPValues from an empty range in a splice call, don't
transfer if there are no DPValues attached to the source block at all.
* stripNonLineTableDebugInfo should drop DPValues.
* In insertBefore, don't try to transfer DPValues if there aren't any.
This is the final patch for DPVAssign support, implementing the actual
creation of DPVAssigns and allowing them to be converted along with
dbg.values and dbg.declares. Numerous tests landed in previous patches
will no longer be rotten after this patch lands (previously they would
trivially pass due to DPVAssigns not actually being used), and a further
batch of tests have been added here that require the changes in this
patch before they pass.
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
This patch adds support for DPVAssigns across all of
AssignmentTrackingAnalysis except for AssignmentTrackingLowering, which
is implemented in a separate patch. This patch includes handling
DPValues in MemLocFragFill, the removal of redundant DPValues as part of
AssignmentTrackingAnalysis (which is different to the version in
`BasicBlockUtils.cpp`), and preventing the DPVAssigns from being
directly emitted in SelectionDAG (just as we don't emit llvm.dbg.assigns
directly, but receive a set of locations from
AssignmentTrackingAnalysis' output).