This feature causes clang to crash when compiling Chrome - see
https://crbug.com/1405031 and
https://github.com/llvm/llvm-project/issues/59675
Revert "[clang] Fix a clang crash on invalid code in C++20 mode."
This reverts commit 32d7aae04f.
Revert "[clang] Remove overly restrictive aggregate paren init logic"
This reverts commit c77a91bb7b.
Revert "[clang][C++20] P0960R3 and P1975R0: Allow initializing aggregates from a parenthesized list of values"
This reverts commit 40c52159d3.
Revert "Fix lldb option handling since e953ae5bbc (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"
GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d
The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.
This reverts commit caa713559b.
This reverts commit 06b90e2e9c.
This reverts commit e953ae5bbc.
This avoids the continuous API churn when upgrading things to use
std::optional and makes trivial string replace upgrades possible.
I tested this with GCC 7.5, the oldest supported GCC I had around.
Differential Revision: https://reviews.llvm.org/D140332
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.
This makes `ninja clang` work in the absence of llvm::Optional::value.
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes clang.
This patch implements P0960R3, which allows initialization of aggregates
via parentheses.
As an example:
```
struct S { int i, j; };
S s1(1, 1);
int arr1[2](1, 2);
```
This patch also implements P1975R0, which fixes the wording of P0960R3
for single-argument parenthesized lists so that statements like the
following are allowed:
```
S s2(1);
S s3 = static_cast<S>(1);
S s4 = (S)1;
int (&&arr2)[] = static_cast<int[]>(1);
int (&&arr3)[2] = static_cast<int[2]>(1);
```
This patch was originally authored by @0x59616e and completed by
@ayzhao.
Fixes#54040, Fixes#54041
Co-authored-by: Sheng <ox59616e@gmail.com>
Full write up : https://discourse.llvm.org/t/c-20-rfc-suggestion-desired-regarding-the-implementation-of-p0960r3/63744
Reviewed By: ilya-biryukov
Differential Revision: https://reviews.llvm.org/D129531
With this patch, the solver can infer results for not equal (!=) operator
over Ranges as well. This also fixes the issue of comparison between
different types, by first converting the RangeSets to the resulting type,
which then can be used for comparisons.
Patch by Manas.
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D112621
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
By default, clang assumes that all trailing array objects could be a
FAM. So, an array of undefined size, size 0, size 1, or even size 42 is
considered as FAMs for optimizations at least.
One needs to override the default behavior by supplying the
`-fstrict-flex-arrays=<N>` flag, with `N > 0` value to reduce the set of
FAM candidates. Value `3` is the most restrictive and `0` is the most
permissive on this scale.
0: all trailing arrays are FAMs
1: only incomplete, zero and one-element arrays are FAMs
2: only incomplete, zero-element arrays are FAMs
3: only incomplete arrays are FAMs
If the user is happy with consdering single-element arrays as FAMs, they
just need to remove the
`consider-single-element-arrays-as-flexible-array-members` from the
command line.
Otherwise, if they don't want to recognize such cases as FAMs, they
should specify `-fstrict-flex-arrays` anyway, which will be picked up by
CSA.
Any use of the deprecated analyzer-config value will trigger a warning
explaining what to use instead.
The `-analyzer-config-help` is updated accordingly.
Depends on D138657
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D138659
Casting a pointer to a suitably large integral type by reinterpret-cast
should result in the same value as by using the `__builtin_bit_cast()`.
The compiler exploits this: https://godbolt.org/z/zMP3sG683
However, the analyzer does not bind the same symbolic value to these
expressions, resulting in weird situations, such as failing equality
checks and even results in crashes: https://godbolt.org/z/oeMP7cj8q
Previously, in the `RegionStoreManager::getBinding()` even if `T` was
non-null, we replaced it with `TVR->getValueType()` in case the `MR` was
`TypedValueRegion`.
It doesn't make much sense to auto-detect the type if the type is
already given. By not doing the auto-detection, we would just do the
right thing and perform the load by that type.
This means that we will cast the value to that type.
So, in this patch, I'm proposing to do auto-detection only if the type
was null.
Here is a snippet of code, annotated by the previous and new dump values.
`LocAsInteger` should wrap the `SymRegion`, since we want to load the
address as if it was an integer.
In none of the following cases should type auto-detection be triggered,
hence we should eventually reach an `evalCast()` to lazily cast the loaded
value into that type.
```lang=C++
void LValueToRValueBitCast_dumps(void *p, char (*array)[8]) {
clang_analyzer_dump(p); // remained: &SymRegion{reg_$0<void * p>}
clang_analyzer_dump(array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>}
clang_analyzer_dump((unsigned long)p);
// remained: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
clang_analyzer_dump(__builtin_bit_cast(unsigned long, p)); <--------- change #1
// previously: {{&SymRegion{reg_$0<void * p>}}}
// now: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
clang_analyzer_dump((unsigned long)array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
clang_analyzer_dump(__builtin_bit_cast(unsigned long, array)); <--------- change #2
// previously: {{&SymRegion{reg_$1<char (*)[8] array>}}}
// now: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
}
```
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D136603
Refactor StaticAnalyzer to use clang::SarifDocumentWriter for
serializing sarif diagnostics.
Uses clang::SarifDocumentWriter to generate SARIF output in the
StaticAnalyzer.
Various bugfixes are also made to clang::SarifDocumentWriter.
Summary of changes:
clang/lib/Basic/Sarif.cpp:
* Fix bug in adjustColumnPos introduced from prev move, it now uses
FullSourceLoc::getDecomposedExpansionLoc which provides the correct
location (in the presence of macros) instead of
FullSourceLoc::getDecomposedLoc.
* Fix createTextRegion so that it handles caret ranges correctly,
this should bring it to parity with the previous implementation.
clang/test/Analysis/diagnostics/Inputs/expected-sarif:
* Update the schema URL to the offical website
* Add the emitted defaultConfiguration sections to all rules
* Annotate results with the "level" property
clang/lib/StaticAnalyzer/Core/SarifDiagnostics.cpp:
* Update SarifDiagnostics class to hold a clang::SarifDocumentWriter
that it uses to convert diagnostics to SARIF.
We now skip the destruction of array elements for `delete[] p`,
if the value of `p` is UnknownVal and does not have corresponding region.
This eliminate the crash in `getDynamicElementCount` on that
region and matches the behavior for deleting the array of
non-constant range.
Reviewed By: isuckatcs
Differential Revision: https://reviews.llvm.org/D136671
This revision fixes typos where there are 2 consecutive words which are
duplicated. There should be no code changes in this revision (only
changes to comments and docs). Do let me know if there are any
undesirable changes in this revision. Thanks.
This was done as a test for D137302 and it makes sense to push these changes
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D137491
The -fstrict-flex-arrays=3 is the most restrictive type of flex arrays.
No number, including 0, is allowed in the FAM. In the cases where a "0"
is used, the resulting size is the same as if a zero-sized object were
substituted.
This is needed for proper _FORTIFY_SOURCE coverage in the Linux kernel,
among other reasons. So while the only reason for specifying a
zero-length array at the end of a structure is for specify a FAM,
treating it as such will cause _FORTIFY_SOURCE not to work correctly;
__builtin_object_size will report -1 instead of 0 for a destination
buffer size to keep any kernel internals from using the deprecated
members as fake FAMs.
For example:
struct broken {
int foo;
int fake_fam[0];
struct something oops;
};
There have been bugs where the above struct was created because "oops"
was added after "fake_fam" by someone not realizing. Under
__FORTIFY_SOURCE, doing:
memcpy(p->fake_fam, src, len);
raises no warnings when __builtin_object_size(p->fake_fam, 1) returns -1
and may stomp on "oops."
Omitting a warning when using the (invalid) zero-length array is how GCC
treats -fstrict-flex-arrays=3. A warning in that situation is likely an
irritant, because requesting this option level is explicitly requesting
this behavior.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
Differential Revision: https://reviews.llvm.org/D134902
Discourse mail:
https://discourse.llvm.org/t/analyzer-why-do-we-suck-at-modeling-c-dynamic-memory/65667
malloc() returns a piece of uninitialized dynamic memory. We were (almost)
always able to model this behaviour. Its C++ counterpart, operator new is a
lot more complex, because it allows for initialization, the most complicated of which is the usage of constructors.
We gradually became better in modeling constructors, but for some reason, most
likely for reasons lost in history, we never actually modeled the case when the
memory returned by operator new was just simply uninitialized. This patch
(attempts) to fix this tiny little error.
Differential Revision: https://reviews.llvm.org/D135375
It turns out we can reach the `Init.castAs<nonlock::CompoundVal>()`
expression with other kinds of SVals. Such as by `nonloc::ConcreteInt`
in this example: https://godbolt.org/z/s4fdxrcs9
```lang=C++
int buffer[10];
void b();
void top() {
b(&buffer);
}
void b(int *c) {
*c = 42; // would crash
}
```
In this example, we try to store `42` to the `Elem{buffer, 0}`.
This situation can appear if the CallExpr refers to a function
declaration without prototype. In such cases, the engine will pick the
redecl of the referred function decl which has function body, hence has
a function prototype.
This weird situation will have an interesting effect to the AST, such as
the argument at the callsite will miss a cast, which would cast the
`int (*)[10]` expression into `int *`, which means that when we evaluate
the `*c = 42` expression, we want to bind `42` to an array, causing the
crash.
Look at the AST of the callsite with and without the function prototype:
https://godbolt.org/z/Gncebcbdb
The only difference is that without the proper function prototype, we
will not have the `ImplicitCastExpr` `BitCasting` from `int (*)[10]`
to `int *` to match the expected type of the parameter declaration.
In this patch, I'm proposing to emit a cast in the mentioned edge-case,
to bind the argument value of the expected type to the parameter.
I'm only proposing this if the runtime definition has exactly the same
number of parameters as the callsite feeds it by arguments.
If that's not the case, I believe, we are better off by binding `Unknown`
to those parameters.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D136162
Previously, `LazyCompoundVal` bindings to subregions referred by
`LazyCopoundVals`, were not marked as //lazily copied//.
This change returns `LazyCompoundVals` from `getInterestingValues()`,
so their regions can be marked as //lazily copied// in `RemoveDeadBindingsWorker::VisitBinding()`.
Depends on D134947
Authored by: Tomasz Kamiński <tomasz.kamiński@sonarsource.com>
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D135136
To illustrate our current understanding, let's start with the following program:
https://godbolt.org/z/33f6vheh1
```lang=c++
void clang_analyzer_printState();
struct C {
int x;
int y;
int more_padding;
};
struct D {
C c;
int z;
};
C foo(D d, int new_x, int new_y) {
d.c.x = new_x; // B1
assert(d.c.x < 13); // C1
C c = d.c; // L
assert(d.c.y < 10); // C2
assert(d.z < 5); // C3
d.c.y = new_y; // B2
assert(d.c.y < 10); // C4
return c; // R
}
```
In the code, we create a few bindings to subregions of root region `d` (`B1`, `B2`), a constrain on the values (`C1`, `C2`, ….), and create a `lazyCompoundVal` for the part of the region `d` at point `L`, which is returned at point `R`.
Now, the question is which of these should remain live as long the return value of the `foo` call is live. In perfect a word we should preserve:
# only the bindings of the subregions of `d.c`, which were created before the copy at `L`. In our example, this includes `B1`, and not `B2`. In other words, `new_x` should be live but `new_y` shouldn’t.
# constraints on the values of `d.c`, that are reachable through `c`. This can be created both before the point of making the copy (`L`) or after. In our case, that would be `C1` and `C2`. But not `C3` (`d.z` value is not reachable through `c`) and `C4` (the original value of`d.c.y` was overridden at `B2` after the creation of `c`).
The current code in the `RegionStore` covers the use case (1), by using the `getInterestingValues()` to extract bindings to parts of the referred region present in the store at the point of copy. This also partially covers point (2), in case when constraints are applied to a location that has binding at the point of the copy (in our case `d.c.x` in `C1` that has value `new_x`), but it fails to preserve the constraints that require creating a new symbol for location (`d.c.y` in `C2`).
We introduce the concept of //lazily copied// locations (regions) to the `SymbolReaper`, i.e. for which a program can access the value stored at that location, but not its address. These locations are constructed as a set of regions referred to by `lazyCompoundVal`. A //readable// location (region) is a location that //live// or //lazily copied// . And symbols that refer to values in regions are alive if the region is //readable//.
For simplicity, we follow the current approach to live regions and mark the base region as //lazily copied//, and consider any subregions as //readable//. This makes some symbols falsy live (`d.z` in our example) and keeps the corresponding constraints alive.
The rename `Regions` to `LiveRegions` inside `RegionStore` is NFC change, that was done to make it clear, what is difference between regions stored in this two sets.
Regression Test: https://reviews.llvm.org/D134941
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Reviewed By: martong, xazax.hun
Differential Revision: https://reviews.llvm.org/D134947
The Clang Static Analyzer will crash on this code:
```lang=C++
struct Box {
int value;
};
template <Box V> int get() {
return V.value;
}
template int get<Box{-1}>();
```
https://godbolt.org/z/5Yb1sMMMb
The problem is that we don't account for encountering `TemplateParamObjectDecl`s
within the `DeclRefExpr` handler in the `ExprEngine`.
IMO we should create a new memregion for representing such template
param objects, to model their language semantics.
Such as:
- it should have global static storage
- for two identical values, their addresses should be identical as well
http://eel.is/c%2B%2Bdraft/temp.param#8
I was thinking of introducing a `TemplateParamObjectRegion` under `DeclRegion`
for this purpose. It could have `TemplateParamObjectDecl` as a field.
The `TemplateParamObjectDecl::getValue()` returns `APValue`, which might
represent multiple levels of structures, unions and other goodies -
making the transformation from `APValue` to `SVal` a bit complicated.
That being said, for now, I think having `Unknowns` for such cases is
definitely an improvement to crashing, hence I'm proposing this patch.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D135763
ProcessMemberDtor(), ProcessDeleteDtor(), and ProcessAutomaticObjDtor():
Fix static analyzer warnings with suspicious dereference of pointer
'Pred' in function call before NULL checks - NFCI
Differential Revision: https://reviews.llvm.org/D135290
In case when the prvalue is returned from the function (kind is one
of `SimpleReturnedValueKind`, `CXX17ElidedCopyReturnedValueKind`),
then it construction happens in context of the caller.
We pass `BldrCtx` explicitly, as `currBldrCtx` will always refer to callee
context.
In the following example:
```
struct Result {int value; };
Result create() { return Result{10}; }
int accessValue(Result r) { return r.value; }
void test() {
for (int i = 0; i < 2; ++i)
accessValue(create());
}
```
In case when the returned object was constructed directly into the
argument to a function call `accessValue(create())`, this led to
inappropriate value of `blockCount` being used to locate parameter region,
and as a consequence resulting object (from `create()`) was constructed
into a different region, that was later read by inlined invocation of
outer function (`accessValue`).
This manifests itself only in case when calling block is visited more
than once (loop in above example), as otherwise there is no difference
in `blockCount` value between callee and caller context.
This happens only in case when copy elision is disabled (before C++17).
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D132030
showBRParamDiagnostics assumed stores happen only via function parameters while that
can also happen via implicit parameters like 'self' or 'this'.
The regression test caused a failed assert in the original cast to ParmVarDecl.
Differential Revision: https://reviews.llvm.org/D133815
Most of the state traits used for non-POD array evaluation were
only cleaned up if the ctors/dtors were inlined, since the cleanup
happened in ExprEngine::processCallExit(). This patch makes sure
they are removed even if said functions are not inlined.
Differential Revision: https://reviews.llvm.org/D133643
By this change the `exploded-graph-rewriter` will display the class kind
of the expression of the environment entry. It makes easier to decide if
the given entry corresponds to the lvalue or to the rvalue of some
expression.
It turns out the rewriter already had support for visualizing it, but
probably was never actually used?
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D132109
`LazyCompoundVals` should only appear as `default` bindings in the
store. This fixes the second case in this patch-stack.
Depends on: D132142
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D132143