[DebugInfo] Enforce implicit constraints on distinct MDNodes

Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the `distinct` property of MDNodes:

* DIExpression can currently be parsed from IR or read from bitcode
  as `distinct`, but this property is silently dropped when printing
  to IR. This causes accepted IR to fail to round-trip. As DIExpression
  appears inline at each use in the canonical form of IR, it cannot
  actually be `distinct` anyway, as there is no syntax to describe it.
* Similarly, DIArgList is conceptually always uniqued. It is currently
  restricted to only appearing in contexts where there is no syntax for
  `distinct`, but for consistency it is treated equivalently to
  DIExpression in this patch.
* DICompileUnit is already restricted to always being `distinct`, but
  along with adding general support for the inverse restriction I went
  ahead and described this in Metadata.def and updated the parser to be
  general. Future nodes which have this restriction can share this
  support.

The new UNIQUED property applies to DIExpression and DIArgList, and
forbids them to be `distinct`. It also implies they are canonically
printed inline at each use, rather than via MDNode ID.

The new DISTINCT property applies to DICompileUnit, and requires it to
be `distinct`.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:

    !named = !{!0}
    !0 = !DIExpression()

Instead we would only accept the equivalent inlined version:

    !named = !{!DIExpression()}

This essentially removes the ability to create a `distinct` DIExpression
by construction, as there is no syntax for `distinct` inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:

    !named = !{!0}
    ; error: 'distinct' not allowed for !DIExpression()
    !0 = distinct !DIExpression()

Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on `distinct` for nodes
where applicable.

Reviewed By: StephenTozer, t-tye

Differential Revision: https://reviews.llvm.org/D104827
This commit is contained in:
Scott Linder
2021-06-28 19:40:45 +00:00
parent aad87328fa
commit 8cd35ad854
17 changed files with 275 additions and 134 deletions

View File

@@ -1758,7 +1758,6 @@ void ModuleBitcodeWriter::writeDIFile(const DIFile *N,
void ModuleBitcodeWriter::writeDICompileUnit(const DICompileUnit *N,
SmallVectorImpl<uint64_t> &Record,
unsigned Abbrev) {
assert(N->isDistinct() && "Expected distinct compile units");
Record.push_back(/* IsDistinct */ true);
Record.push_back(N->getSourceLanguage());
Record.push_back(VE.getMetadataOrNullID(N->getFile()));
@@ -2009,7 +2008,7 @@ void ModuleBitcodeWriter::writeDIExpression(const DIExpression *N,
SmallVectorImpl<uint64_t> &Record,
unsigned Abbrev) {
Record.reserve(N->getElements().size() + 1);
const uint64_t Version = 3 << 1;
const uint64_t Version = 4 << 1;
Record.push_back((uint64_t)N->isDistinct() | Version);
Record.append(N->elements_begin(), N->elements_end());
@@ -2154,6 +2153,20 @@ void ModuleBitcodeWriter::writeMetadataRecords(
if (const MDNode *N = dyn_cast<MDNode>(MD)) {
assert(N->isResolved() && "Expected forward references to be resolved");
#ifndef NDEBUG
switch (N->getMetadataID()) {
#define HANDLE_MDNODE_LEAF_UNIQUED(CLASS) \
case Metadata::CLASS##Kind: \
assert(!N->isDistinct() && "Expected non-distinct " #CLASS); \
break;
#define HANDLE_MDNODE_LEAF_DISTINCT(CLASS) \
case Metadata::CLASS##Kind: \
assert(N->isDistinct() && "Expected distinct " #CLASS); \
break;
#include "llvm/IR/Metadata.def"
}
#endif
switch (N->getMetadataID()) {
default:
llvm_unreachable("Invalid MDNode subclass");