clang-p2996

Author	SHA1	Message	Date
NagyDonat	9a16c12abe	[NFC] Remove semicolons after function definitions (#87764 ) They were accidentally left behind when https://github.com/llvm/llvm-project/pull/86536 converted some lambdas into stand-alone methods. This fixes warnings from -Wc++98-compat-extra-semi	2024-04-05 12:01:43 +02:00
NagyDonat	163301d785	[analyzer] Remove barely used class 'KnownSVal' (NFC) (#86953 ) The class `KnownSVal` was a very magical abstract class within the `SVal` class hierarchy: with a hacky `classof` method it acted as if it was the common ancestor of the classes `UndefinedSVal` and `DefinedSVal`. However, it was only used in two `getAs<KnownSVal>()` calls and the signatures of two methods, which does not "pay for" its weird behavior, so I created this commit that removes it and replaces its use with more straightforward solutions.	2024-04-05 11:22:08 +02:00
NagyDonat	fb299cae51	[analyzer] Make recognition of hardened __FOO_chk functions explicit (#86536 ) In builds that use source hardening (-D_FORTIFY_SOURCE), many standard functions are implemented as macros that expand to calls of hardened functions that take one additional argument compared to the "usual" variant and perform additional input validation. For example, a `memcpy` call may expand to `__memcpy_chk()` or `__builtin___memcpy_chk()`. Before this commit, `CallDescription`s created with the matching mode `CDM::CLibrary` automatically matched these hardened variants (in addition to the "usual" function) with a fairly lenient heuristic. Unfortunately this heuristic meant that the `CLibrary` matching mode was only usable by checkers that were prepared to handle matches with an unusual number of arguments. This commit limits the recognition of the hardened functions to a separate matching mode `CDM::CLibraryMaybeHardened` and applies this mode for functions that have hardened variants and were previously recognized with `CDM::CLibrary`. This way checkers that are prepared to handle the hardened variants will be able to detect them easily; while other checkers can simply use `CDM::CLibrary` for matching C library functions (and they won't encounter surprising argument counts). The initial motivation for refactoring this area was that previously `CDM::CLibrary` accepted calls with more arguments/parameters than the expected number, so I wasn't able to use it for `malloc` without accidentally matching calls to the 3-argument BSD kernel malloc. After this commit this "may have more args/params" logic will only activate when we're actually matching a hardened variant function (in `CDM::CLibraryMaybeHardened` mode). The recognition of "sprintf()" and "snprintf()" in CStringChecker was refactored, because previously it was abusing the behavior that extra arguments are accepted even if the matched function is not a hardened variant. 
This commit also fixes the oversight that the old code would've recognized e.g. `__wmemcpy_chk` as a hardened variant of `memcpy`. After this commit I'm planning to create several follow-up commits that ensure that checkers looking for C library functions use `CDM::CLibrary` as a "sane default" matching mode. This commit is not truly NFC (it eliminates some buggy corner cases), but it does not intentionally modify the behavior of CSA on real-world non-crazy code. As a minor unrelated change I'm eliminating the argument/variable "IsBuiltin" from the evalSprintf function family in CStringChecker, because it was completely unused. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-04-05 11:20:27 +02:00
Chris B	9434c08347	[HLSL] Implement array temporary support (#79382 ) HLSL constant sized array function parameters do not decay to pointers. Instead constant sized array types are preserved as unique types for overload resolution, template instantiation and name mangling. This implements the change by adding a new `ArrayParameterType` which represents a non-decaying `ConstantArrayType`. The new type behaves the same as `ConstantArrayType` except that it does not decay to a pointer. Values of `ConstantArrayType` in HLSL decay during overload resolution via a new `HLSLArrayRValue` cast to `ArrayParameterType`. `ArrayParameterType` values are passed indirectly by-value to functions in IR generation resulting in callee generated memcpy instructions. The behavior of HLSL function calls is documented in the [draft language specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf) under the Expr.Post.Call heading. Additionally the design of this implementation approach is documented in [Clang's documentation](https://clang.llvm.org/docs/HLSL/FunctionCalls.html) Resolves #70123	2024-04-01 12:10:10 -05:00
Chris B	28ddbd4a86	[NFC] Refactor ConstantArrayType size storage (#85716 ) In PR #79382, I need to add a new type that derives from ConstantArrayType. This means that ConstantArrayType can no longer use `llvm::TrailingObjects` to store the trailing optional Expr*. This change refactors ConstantArrayType to store a 60-bit integer and 4-bits for the integer size in bytes. This replaces the APInt field previously in the type but preserves enough information to recreate it where needed. To reduce the number of places where the APInt is re-constructed I've also added some helper methods to the ConstantArrayType to allow some common use cases that operate on either the stored small integer or the APInt as appropriate. Resolves #85124.	2024-03-26 14:15:56 -05:00
Balazs Benics	32b828306e	[analyzer] Set and display CSA analysis entry points as notes on debugging (#84823 ) When debugging CSA issues, sometimes it would be useful to have a dedicated note for the analysis entry point, aka. the function name you would need to pass as "-analyze-function=XYZ" to reproduce a specific issue. One way we use (or will use) this downstream is to provide tooling on top of creduce to supercharge productivity by automatically reducing cases on crashes, for example. This will be added only if the "-analyzer-note-analysis-entry-points" is set or the "analyzer-display-progress" is on. This additional entry point marker will be the first "note" if enabled, with the following message: "[debug] analyzing from XYZ". They are prefixed by "[debug]" to remind the CSA developer that this is only meant to be visible for them, for debugging purposes. CPP-5012	2024-03-25 15:24:03 +01:00
NagyDonat	e1d4ddb0c6	Reapply "[analyzer] Accept C library functions from the `std` namespace" again (#85791 ) This reapplies `80ab8234ac` again, after fixing a name collision warning in the unit tests (see the revert commit `13ccaf9b9d` for details). In addition to the previously applied changes, this commit also clarifies the code in MallocChecker that distinguishes POSIX "getline()" and C++ standard library "std::getline()" (which are two completely different functions). Note that "std::getline()" was (accidentally) handled correctly even without this clarification; but it's better to explicitly handle and test this corner case. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-25 12:43:51 +01:00
Balazs Benics	e925968e78	[analyzer] Support C++23 static operator calls (#84972 ) Made by following: https://github.com/llvm/llvm-project/pull/83585#issuecomment-1980340866 Thanks for the details Tomek! CPP-5080	2024-03-22 12:04:44 +01:00
Alejandro Álvarez Ayllón	730ca47a0c	[clang][analyzer] Model getline/getdelim preconditions and evaluation (#83027 ) According to POSIX 2018. 1. lineptr, n and stream can not be NULL. 2. If *n is non-zero, *lineptr must point to a region of at least *n bytes, or be a NULL pointer. Additionally, if *lineptr is not NULL, *n must not be undefined.	2024-03-22 11:50:34 +01:00
T-Gruber	86d479fd7c	Adapted MemRegion::getDescriptiveName to handle ElementRegions (#85104 ) Fixes https://github.com/llvm/llvm-project/issues/84463 Changes: - Adapted MemRegion::getDescriptiveName - Added unittest to check name for a given clang::ento::ElementRegion - Some format changes due to clang-format --------- Co-authored-by: Andreas Steinhausen <andreas.steinhausen@concenrio.io> Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-21 18:27:53 +01:00
Balazs Benics	c8772940ee	[analyzer] Wrap SymbolicRegions by ElementRegions before getting a FieldRegion (#85211 ) Inside the ExprEngine when we process the initializers, we create a PostInitializer program-point, which will refer to the field being initialized, see `FieldLoc` inside `ExprEngine::ProcessInitializer`. When a constructor (of which we evaluate the initializer-list) is analyzed in top-level context, then the `this` pointer will be represented by a `SymbolicRegion`, (as it should be). This means that we will form a `FieldRegion{SymbolicRegion{.}}` as the initialized region. ```c++ class Bear { public: void brum() const; }; class Door { public: // PostInitializer would refer to "FieldRegion{SymRegion{this}}" // whereas in the store and everywhere else it would be: // "FieldRegion{ElementRegion{SymRegion{Ty, this}, 0, Ty}". Door() : ptr(nullptr) { ptr->brum(); // Bug } private: Bear *ptr; }; ``` We (as CSA folks) decided to avoid the creation of FieldRegions directly of symbolic regions in the past: `f8643a9b31` --- In this patch, I propose to also canonicalize it as in the mentioned patch, into this: `FieldRegion{ElementRegion{SymbolicRegion{Ty, .}, 0, Ty}` This would mean that FieldRegions will/should never simply wrap a SymbolicRegion directly, but rather an ElementRegion that is sitting in between. This patch should have practically no observable effects, as the store (due to the mentioned patch) was made resilient to this issue, but we use `PostInitializer::getLocationValue()` for an alternative reporting, where we faced this issue. Note that in really rare cases it now suppresses dereference bugs, as demonstrated in the test. It is because in the past we failed to follow the region of the PostInitializer inside the StoreSiteFinder visitor - because it was using this code: ```c++ // If this is a post initializer expression, initializing the region, we // should track the initializer expression. 
if (std::optional<PostInitializer> PIP = Pred->getLocationAs<PostInitializer>()) { const MemRegion *FieldReg = (const MemRegion *)PIP->getLocationValue(); if (FieldReg == R) { StoreSite = Pred; InitE = PIP->getInitializer()->getInit(); } } ``` Notice that the equality check didn't pass for the regions I'm canonicalizing in this patch. Given the nature of this change, we would rather upstream this patch. CPP-4954	2024-03-21 18:22:22 +01:00
huang-me	8f68022f8e	[clang][analyzer] Fix crash in loop unrolling (#82089 ) StaticAnalyzer didn't check if the variable is declared in `CompoundStmt` under `SwitchStmt`, which makes the static analyzer reach the root without finding the declaration. Fixes #68819 --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-14 09:16:40 +01:00
Philip Reames	13ccaf9b9d	Revert "Reapply "[analyzer] Accept C library functions from the `std` namespace"" This reverts commit `e48d5a838f`. Fails to build on x86-64 w/gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04) with the following message: ../llvm-project/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp:41:28: error: declaration of ‘std::unique_ptr<clang::ASTUnit> IsCLibraryFunctionTest::ASTUnit’ changes meaning of ‘ASTUnit’ [-fpermissive] 41 | std::unique_ptr<ASTUnit> ASTUnit; | ^~~~~~~ In file included from ../llvm-project/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp:4: ../llvm-project/clang/include/clang/Frontend/ASTUnit.h:89:7: note: ‘ASTUnit’ declared here as ‘class clang::ASTUnit’ 89 | class ASTUnit { | ^~~~~~~	2024-03-13 10:19:42 -07:00
NagyDonat	e48d5a838f	Reapply "[analyzer] Accept C library functions from the `std` namespace" This reapplies f32b04d4ea91ad1018c25a1d4178cc4392d34968, after fixing the use-after-free of ASTUnit in the unittest. https://github.com/llvm/llvm-project/pull/84469#issuecomment-1992163439 Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-13 14:48:42 +01:00
Diego A. Estrada Rivera	7bee91fadf	[analyzer][NFC] Turn NodeBuilderContext into a class (#84638 ) From issue #73088. I changed `NodeBuilderContext` into a class. Additionally, there were some other mentions of the former being a struct which I also changed into a class. This is my first time working with an issue so I will be open to hearing any advice or changes that need to be done.	2024-03-12 18:21:31 +01:00
NagyDonat	f32b04d4ea	Revert "[analyzer] Accept C library functions from the `std` namespace" (#84926 ) Reverts llvm/llvm-project#84469 because it causes buildbot failures. I'll examine them and re-submit the change.	2024-03-12 16:01:04 +01:00
NagyDonat	80ab8234ac	[analyzer] Accept C library functions from the `std` namespace (#84469 ) Previously, the function `isCLibraryFunction()` and logic relying on it only accepted functions that are declared directly within a TU (i.e. not in a namespace or a class). However C++ headers like <cstdlib> declare many C standard library functions within the namespace `std`, so this commit ensures that functions within the namespace `std` are also accepted. After this commit it will be possible to match functions like `malloc` or `free` with `CallDescription::Mode::CLibrary`. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-03-12 13:51:12 +01:00
Exile	d4687fe7d1	[analyzer] Fix crash on dereference invalid return value of getAdjustedParameterIndex() (#83585 ) Fixes #78810 Thanks for Snape3058 's comment --------- Co-authored-by: miaozhiyuan <miaozhiyuan@feysh.com>	2024-03-06 17:01:30 +01:00
Alejandro Álvarez Ayllón	67c6ad6f30	[clang][analyzer] Model allocation behavior of getdelim/getline (#83138 ) `getdelim` and `getline` may free, allocate, or re-allocate the input buffer, ensuring its size is enough to hold the incoming line, the delimiter, and the null terminator. `*lineptr` must be a valid argument to `free`, which means it can be either 1. `NULL`, in which case these functions perform an allocation equivalent to a call to `malloc` even on failure. 2. A pointer returned by the `malloc` family of functions. Other pointers are UB (`alloca`, a pointer to a static, to a stack variable, etc.)	2024-03-06 16:52:18 +01:00
NagyDonat	ad1b2a8129	[analyzer] Demonstrate superfluous unsigned >= 0 assumption (#78442 ) This commit adds a testcase which highlights the current incorrect behavior of the CSA diagnostic generation: it produces a note which says "Assuming 'arg' is >= 0" in a situation where this is not a fresh assumption because 'arg' is an unsigned integer. I also created ticket 78440 to track this bug.	2024-03-06 16:42:31 +01:00
Balazs Benics	a87dc23a62	[clang][NFC] Trim license header comments to 81 characters (#82919 ) clang-format would format these headers poorly by splitting it into multiple lines.	2024-03-06 16:32:14 +01:00
Balazs Benics	88414c8862	[analyzer][NFC] Remove dead code (#83968 ) Remove the unused method `CoreEngine::ExecuteWorkListWithInitialState`.	2024-03-05 10:30:28 +01:00
NagyDonat	52a460f9d4	[analyzer] Refactor CallDescription match mode (NFC) (#83432 ) The class `CallDescription` is used to define patterns that are used for matching `CallEvent`s. For example, a `CallDescription{{"std", "find_if"}, 3}` matches a call to `std::find_if` with 3 arguments. However, these patterns are somewhat fuzzy, so this pattern could also match something like `std::__1::find_if` (with an additional namespace layer), or, unfortunately, a `CallDescription` for the well-known function `free()` can match a C++ method named `free()`: https://github.com/llvm/llvm-project/issues/81597 To prevent this kind of ambiguity this commit introduces the enum `CallDescription::Mode` which can limit the pattern matching to non-method function calls (or method calls etc.). After this NFC change, one or more follow-up commits will apply the right pattern matching modes in the ~30 checkers that use `CallDescription`s. Note that `CallDescription` previously had a `Flags` field which had only two supported values: - `CDF_None` was the default "match anything" mode, - `CDF_MaybeBuiltin` was a "match only C library functions and accept some inexact matches" mode. This commit preserves `CDF_MaybeBuiltin` under the more descriptive name `CallDescription::Mode::CLibrary` (or `CDM::CLibrary`). Instead of this "Flags" model I'm switching to a plain enumeration because I don't think that there is a natural use case to combine the different matching modes. (Except for the default "match anything" mode, which is currently kept for compatibility, but will be phased out in the follow-up commits.)	2024-03-04 15:43:37 +01:00
Chris B	5c57fd717d	[HLSL] Vector standard conversions (#71098 ) HLSL supports vector truncation and element conversions as part of standard conversion sequences. The vector truncation conversion is a C++ second conversion in the conversion sequence. If a vector truncation is in a conversion sequence an element conversion may occur after it before the standard C++ third conversion. Vector element conversions can be boolean conversions, floating point or integral conversions or promotions. [HLSL Draft Specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf) --------- Co-authored-by: Aaron Ballman <aaron@aaronballman.com>	2024-02-15 14:58:06 -06:00
Artem Dergachev	017675fff1	[attributes][analyzer] Generalize [[clang::suppress]] to declarations. (#80371 ) The attribute is now allowed on an assortment of declarations, to suppress warnings related to declarations themselves, or all warnings in the lexical scope of the declaration. I don't necessarily see a reason to have a list at all, but it does look as if some of those more niche items aren't properly supported by the compiler itself so let's maintain a short safe list for now. The initial implementation raised a question whether the attribute should apply to lexical declaration context vs. "actual" declaration context. I'm using "lexical" here because it results in fewer warnings suppressed, which is the conservative behavior: we can always expand it later if we think this is wrong, without breaking any existing code. I also think that this is the correct behavior that we will probably never want to change, given that the user typically desires to keep the suppressions as localized as possible.	2024-02-13 14:57:55 -08:00
Erich Keane	f655778300	[OpenACC] Implement AST for OpenACC Compute Constructs (#81188 ) 'serial', 'parallel', and 'kernel' constructs are all considered 'Compute' constructs. This patch creates the AST type, plus the required infrastructure for such a type, plus some base types that will be useful in the future for breaking this up. The only difference between the three is the 'kind'( plus some minor clause legalization rules, but those can be differentiated easily enough), so rather than representing them as separate AST nodes, it seems to make sense to make them the same. Additionally, no clause AST functionality is being implemented yet, as that fits better in a separate patch, and this is enough to get the 'naked' constructs implemented. This is otherwise an 'NFC' patch, as it doesn't alter execution at all, so there aren't any tests. I did this to break up the review workload and to get feedback on the layout.	2024-02-13 06:02:13 -08:00
Artem Dergachev	243bfed683	[analyzer][HTMLRewriter] Cache partial rewrite results. (#80220 ) This is a follow-up for `721dd3bc2` [analyzer] NFC: Don't regenerate duplicate HTML reports. Because HTMLRewriter re-runs the Lexer for syntax highlighting and macro expansion purposes, it may get fairly expensive when the rewriter is invoked multiple times on the same file. In the static analyzer (which uses HTMLRewriter for HTML output mode) we only get away with this because there are usually very few reports emitted per file. But if loud checkers are enabled, such as `webkit.*`, this may explode in complexity and even cause the compiler to run over the 32-bit SourceLocation addressing limit. This patch caches intermediate results so that re-lexing only needs to happen once. As the clever __COUNTER__ test demonstrates, "once" is still too many. Ideally we shouldn't re-lex anything at all, which remains a TODO.	2024-02-01 13:07:21 -08:00
Artem Dergachev	56e241a07f	[analyzer] Unbreak [[clang::suppress]] on checkers without decl-with-issue. (#79398 ) There are currently a few checkers that don't fill in the bug report's "decl-with-issue" field (typically a function in which the bug is found). The new attribute `[[clang::suppress]]` uses decl-with-issue to reduce the size of the suppression source range map so that it doesn't need to build the map for the entire translation unit. I'm already seeing a few problems with this approach so I'll probably redesign it at some point as it looks like a premature optimization. Not only should checkers not be required to pass decl-with-issue (consider clang-tidy checkers that never had such a notion), but it's also not necessarily uniquely determined (consider leak suppressions at allocation site). For now I'm adding a simple stop-gap solution that falls back to building the suppression map for the entire TU whenever decl-with-issue isn't specified. This won't happen in the default setup because luckily all default checkers do provide decl-with-issue. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-01-31 13:55:31 -08:00
Andrey Ali Khan Bolshakov	ef67f63fa5	Fix analyzer crash on 'StructuralValue' (#79764 ) `OpaqueValueExpr` doesn't necessarily contain a source expression. Particularly, after #78041, it is used to carry the type and the value kind of a non-type template argument of floating-point type or referring to a subobject (those are so called `StructuralValue` arguments). This fixes #79575.	2024-01-30 13:03:55 +01:00
cor3ntin	ad1a65fcac	[Clang][C++26] Implement Pack Indexing (P2662R3). (#72644 ) Implements https://isocpp.org/files/papers/P2662R3.pdf The feature is exposed as an extension in older language modes. Mangling is not yet supported and that is something we will have to do before release.	2024-01-27 10:23:38 +01:00
NagyDonat	9b71393569	[analyzer] Avoid a crash in a debug printout function (#79446 ) Previously the function `RangeConstraintManager::printValue()` crashed when it encountered an empty rangeset (because `RangeSet::getBitwidth()` and `RangeSet::isUnsigned()` assert that the rangeset is not empty). This commit adds a special case that avoids this behavior. As `printValue()` is only used by the checker debug.ExprInspection (and during manual debugging), the impacts of this commit are very limited. --------- Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2024-01-25 17:03:09 +01:00
Artem Dergachev	721dd3bc2f	[analyzer] NFC: Don't regenerate duplicate HTML reports. This is a performance optimization for HTML diagnostics output mode. Currently they're incredibly inefficient: * The HTMLRewriter is re-run from scratch on every file on every report. Each such re-run involves re-lexing the entire file and producing a syntax-highlighted webpage of the entire file, with text behind macros duplicated as pop-up macro expansion tooltips. Then, warning and note bubbles are injected into the page. Only the bubble part is different across reports; everything else can theoretically be cached. * Additionally, if duplicate reports are emitted (with the same issue hash), HTMLRewriter will be re-run even though the output file is going to be discarded due to filename collision. This is mostly an issue for path-insensitive bug reports because path-sensitive bug reports are already deduplicated by the BugReporter as part of searching for the shortest bug path. But on some translation units almost 80% of bug reports are dry-run here. We only get away with all this because there are usually very few reports emitted per file. But if loud checkers are enabled, such as `webkit.*`, this may explode in complexity and even cause the compiler to run over the 32-bit SourceLocation addressing limit. (We're re-lexing everything each time, remember?) This patch hotfixes the second problem. Adds a FIXME for the first problem, which will require more yak shaving to solve. rdar://120801986	2024-01-11 15:16:10 -08:00
Balazs Benics	8ee3dfd746	[analyzer][NFC] Take SVal and NonLoc by value	2024-01-01 22:00:32 +01:00
Balazs Benics	18f219c5ac	[analyzer][NFC] Cleanup BugType lazy-init patterns (#76655 ) Cleanup most of the lazy-init `BugType` legacy. Some will be preserved, as those are slightly more complicated to refactor. Notice, that the default category for `BugType` is `LogicError`. I omitted setting this explicitly where I could. Please, actually have a look at the diff. I did this manually, and we rarely check the bug type descriptions and stuff in tests, so the testing might be shallow on this one.	2024-01-01 18:53:36 +01:00
Artem Dergachev	ef3f476097	[attributes][analyzer] Implement [[clang::suppress]] - suppress static analysis warnings. The new attribute can be placed on statements in order to suppress arbitrary warnings produced by static analysis tools at those statements. Previously such suppressions were implemented as either informal comments (eg. clang-tidy `// NOLINT:`) or with preprocessor macros (eg. clang static analyzer's `#ifdef __clang_analyzer__`). The attribute provides a universal, formal, flexible and neat-looking suppression mechanism. Implement support for the new attribute in the clang static analyzer; clang-tidy coming soon. The attribute allows specifying which specific warnings to suppress, in the form of free-form strings that are intended to be specific to the tools, but currently none are actually supported; so this is also going to be a future improvement. Differential Revision: https://reviews.llvm.org/D93110	2023-12-13 18:09:16 -08:00
Kazu Hirata	f3dcc2351c	[clang] Use StringRef::{starts,ends}_with (NFC) (#75149 ) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 08:54:13 -08:00
DonatNagyE	67f387c67e	[analyzer] Let the checkers query upper and lower bounds on symbols (#74141 ) This commit extends the class `SValBuilder` with the methods `getMinValue()` and `getMaxValue()` that work like `SValBuilder::getKnownValue()` but return the minimal/maximal possible value when the `SVal` is not perfectly constrained. This extension of the ConstraintManager API is discussed at: https://discourse.llvm.org/t/expose-the-inferred-range-information-in-warning-messages/75192 As a simple proof-of-concept application of this new API, this commit extends a message from `core.BitwiseShift` with some range information that reports the assumptions of the analyzer. My main motivation for adding these methods is that I'll also want to use them in `ArrayBoundCheckerV2` to make the error messages less awkward, but I'm starting with this simpler and less important use case because I want to avoid merge conflicts with my other commit https://github.com/llvm/llvm-project/pull/72107 which is currently under review. The testcase `too_large_right_operand_compound()` shows a situation where querying the range information does not work (and the extra information is not added to the error message). This also affects the debug utility `clang_analyzer_value()`, so the problem isn't in the fresh code. I'll do some investigations to resolve this, but I think that this commit is a step forward even with this limitation.	2023-12-04 17:19:50 +01:00
DonatNagyE	0424546ed4	[analyzer] Use AllocaRegion in MallocChecker (#72402 ) ...to model the results of alloca() and _alloca() calls. Previously it acted as if these functions were returning memory from the heap, which led to alpha.security.ArrayBoundV2 producing incorrect messages.	2023-11-28 16:34:44 +01:00
Gábor Spaits	527fcb8e5d	[analyzer] Add std::variant checker (#66481 ) As my BSc thesis I've implemented a checker for std::variant and std::any, and in the following weeks I'll upload a revised version of them here. # Prelude @Szelethus and I sent out an email with our initial plans here: https://discourse.llvm.org/t/analyzer-new-checker-for-std-any-as-a-bsc-thesis/65613/2 We also created a stub checker patch here: https://reviews.llvm.org/D142354. Upon the recommendation of @haoNoQ , we explored an option where instead of writing a checker, we tried to improve on how the analyzer natively inlined the methods of std::variant and std::any. Our attempt is in this patch https://reviews.llvm.org/D145069, but in a nutshell, this is what happened: The analyzer was able to model much of what happened inside those classes, but our false positive suppression machinery erroneously suppressed it. After months of trying, we could not find a satisfying enhancement on the heuristic without introducing an allowlist/denylist of which functions to not suppress. As a result (and partly on the encouragement of @Xazax-hun) I wrote a dedicated checker! The advantage of the checker is that it is not dependent on the standard's implementation and won't put warnings in the standard library definitions. Also without the checker it would be difficult to create nice user-friendly warnings and NoteTags -- as per the standard's specification, the analysis is sinked by an exception, which we don't model well now. # Design ideas The working of the checker is straightforward: We find the creation of an std::variant instance, store the type of the variable we want to store in it, then save this type for the instance. When retrieving type from the instance we check what type we want to retrieve as, and compare it to the actual type. If the two don't match we emit an error. Distinguishing variants by instance (e.g. MemRegion *) is not the most optimal way. 
Other checkers, like MallocChecker, use a symbol-to-trait map instead of region-to-trait. The upside of using symbols (which would be the value of a variant, not the variant itself) is that the analyzer would take care of modeling copies, moves, invalidation, etc, out of the box. The problem is that for compound types, the analyzer doesn't create a symbol as a result of a constructor call that is fit for this job. MallocChecker in contrast manipulates simple pointers. My colleagues and I considered the option of making adjustments directly to the memory model of the analyzer, but for the time being decided against it, and went with the slightly more cumbersome, but immediately viable option of simply using MemRegions. # Current state and review plan This patch contains an already working checker that can find and report certain variant/any misuses, but still lands it in alpha. I plan to upload the rest of the checker in later patches. The full checker is also able to "follow" the symbolic value held by the std::variant and updates the program state whenever we assign the value stored in the variant. I have also built a library that is meant to model union-like types similar to variant, hence some functions being a bit more multipurpose than is immediately needed. I also intend to publish my std::any checker in a later commit. --------- Co-authored-by: Gabor Spaits <gabor.spaits@ericsson.com> Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>	2023-11-21 14:02:22 +01:00
Vlad Serebrennikov	dda8e3de35	[clang][NFC] Refactor `ImplicitParamDecl::ImplicitParamKind` This patch converts `ImplicitParamDecl::ImplicitParamKind` into a scoped enum at namespace scope, making it eligible for forward declaring. This is useful for `preferred_type` annotations on bit-fields.	2023-11-06 12:01:09 +03:00
Vlad Serebrennikov	a9070f22a2	[clang][NFC] Refactor `CXXConstructExpr::ConstructionKind` This patch converts `CXXConstructExpr::ConstructionKind` into a scoped enum in namespace scope, making it eligible for forward declaring. This is useful in cases like annotating bit-fields with `preferred_type`.	2023-11-05 16:38:45 +03:00
Balazs Benics	bde5717d46	[analyzer][NFC] Rework SVal kind representation (#71039 ) The goal of this patch is to refine how the `SVal` base and sub-kinds are represented by forming one unified enum describing the possible SVals. This means that the `unsigned SVal::Kind` and the attached bit-packing semantics would be replaced by a single unified enum. This is more conventional and leads to a better debugging experience by default. This reduces the need for debug pretty-printers, or for runtime functions doing the printing for us like we do today by calling `Val.dump()` whenever we inspect the values. Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`. The rest of the upper bits represented the sub-kind, where the value represented the index among only the `Loc`s or `NonLoc`s, effectively attaching 2 meanings to the upper bits depending on the base-kind. We don't need to pack these bits, as we have plenty even if we would use just a plain-old `unsigned char`. Consequently, in this patch, I propose to lay out all the (non-abstract) `SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`, `END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar to how we do with the `MemRegions`. Note that in the unified `SVal::Kind` enum, to differentiate `nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with `Loc` and `NonLoc` to resolve this ambiguity. This should not surface in general, because I'm replacing the `nonloc::Kind` enum items with `inline constexpr` global constants to mimic the original behavior - and offer nicer spelling to these enum values. Some `SVal` constructors were not marked explicit, which I now mark as such to follow best practices, and marked others as `/*implicit*/` to clarify the intent. During refactoring, I also found at least one function not marked `LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that. 
The `TypeRetrievingVisitor` visitor had some accidental dead code, namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`. Previously, the `SValVisitor` expected visit handlers of `VisitNonLocXXXXX(nonloc::XXXXX)` and `VisitLocXXXXX(loc::XXXXX)`, where I felt that encoding `NonLoc` and `Loc` in the name is not necessary as the type of the parameter would select the right overload anyway, so I simplified the naming of those visit functions. The rest of the diff is mostly just formatting, because `getKind()`, by nature, frequently appears in switches, which means that the whole switch gets automatically reformatted. I could probably undo the formatting, but I didn't want to deviate from the rule unless explicitly requested.	2023-11-04 15:26:59 +01:00
Balazs Benics	51d15d13de	[analyzer] Fix assertion failure in `CXXInstanceCall::getCXXThisVal` (#70837 ) Work around the case when the `this` pointer is actually a `NonLoc`, by returning `Unknown` instead. The solution isn't ideal, as `this` should really be a `Loc`, but due to how casts work, I feel this is our easiest and best option. As this patch presents, I'm evaluating a cast to transform the `NonLoc`. However, given that `evalCast()` can't cast from `NonLoc` to a pointer type thingy (`Loc`), we end up with `Unknown`. It is because `EvalCastVisitor::VisitNonLocSymbolVal()` only evaluates casts that happen from NonLoc to NonLocs. When I tried to actually implement that case, I figured: 1) Create a `SymbolicRegion` from that `nonloc::SymbolVal`; but the `SymbolicRegion` ctor expects a pointer type for the symbol. 2) Okay, just have a `SymbolCast`, getting us the pointer type; but `SymbolicRegion` expects `SymbolData` symbols, not generic `SymExpr`s, as stated: > // Because pointer arithmetic is represented by ElementRegion layers, > // the base symbol here should not contain any arithmetic. 3) We can't use `ElementRegion`s to perform this cast because to have an `ElementRegion`, you already have to have a `SubRegion` that you want to cast, but the point is that we don't have that. At this point, I gave up, and just left a FIXME instead, while still returning `Unknown` on that path. IMO this is still better than having a crash. Fixes #69922	2023-11-04 11:11:24 +01:00
Ella Ma	b6b31e791b	[analyzer] Fix uninitialized base class with initializer list when ctor is not declared in the base class Fixes #70464 When ctor is not declared in the base class, initializing the base class with the initializer list will not trigger a proper assignment of the base region, as a CXXConstructExpr doing that is not available in the AST. This patch checks whether the init expr is an InitListExpr under a base initializer, and adds a binding if so.	2023-11-01 17:50:01 +08:00
Qizhi Hu	1b6b4d6a08	[analyzer] Loop should contain CXXForRangeStmt (#70190 ) The static analyzer fails to report a diagnostic for a statement that follows a `CXXForRangeStmt` when loop widening is enabled, because `ExprEngine::processCFGBlockEntrance` does not handle `CXXForRangeStmt`: when `AMgr.options.maxBlockVisitOnPath - 1` equals `blockCount`, it cannot widen. On the next iteration, `BlockCount >= AMgr.options.maxBlockVisitOnPath` holds and a sink node is generated. Adding `CXXForRangeStmt` makes it work. Co-authored-by: huqizhi <836744285@qq.com>	2023-10-26 21:11:51 +08:00
Gábor Spaits	c68bc1726c	[analyzer] Fix note for member reference (#68691 ) In the following code: ```cpp int main() { struct Wrapper {char c; int &ref; }; Wrapper w = {.c = 'a', .ref = *(int *)0 }; w.ref = 1; } ``` The clang static analyzer will produce the following warnings and notes: ``` test.cpp:12:11: warning: Dereference of null pointer [core.NullDereference] 12 \| w.ref = 1; \| ~~~~~~^~~ test.cpp:11:5: note: 'w' initialized here 11 \| Wrapper w = {.c = 'a', .ref = *(int *)0 }; \| ^~~~~~~~~ test.cpp:12:11: note: Dereference of null pointer 12 \| w.ref = 1; \| ~~~~~~^~~ 1 warning generated. ``` In the line where `w` is created, the note gives information about the initialization of `w` instead of `w.ref`. Let's compare it to a similar case where a null pointer dereference happens to a pointer member: ```cpp int main() { struct Wrapper {char c; int *ptr; }; Wrapper w = {.c = 'a', .ptr = nullptr }; *w.ptr = 1; } ``` Here the following error and notes are seen: ``` test.cpp:18:12: warning: Dereference of null pointer (loaded from field 'ptr') [core.NullDereference] 18 \| *w.ptr = 1; \| ~~~ ^ test.cpp:17:5: note: 'w.ptr' initialized to a null pointer value 17 \| Wrapper w = {.c = 'a', .ptr = nullptr }; \| ^~~~~~~~~ test.cpp:18:12: note: Dereference of null pointer (loaded from field 'ptr') 18 \| *w.ptr = 1; \| ~~~ ^ 1 warning generated. ``` Here the note shows the initialization of `w.ptr` instead of `w`. This commit aims to produce notes for member references that are similar to the notes for member pointers, so the report looks like the following: ``` test.cpp:12:11: warning: Dereference of null pointer [core.NullDereference] 12 \| w.ref = 1; \| ~~~~~~^~~ test.cpp:11:5: note: 'w.ref' initialized to a null pointer value 11 \| Wrapper w = {.c = 'a', .ref = *(int *)0 }; \| ^~~~~~~~~ test.cpp:12:11: note: Dereference of null pointer 12 \| w.ref = 1; \| ~~~~~~^~~ 1 warning generated. ``` Here the initialization of `w.ref` is shown instead of `w`. 
--------- Authored-by: Gábor Spaits <gabor.spaits@ericsson.com> Reviewed-by: Donát Nagy <donat.nagy@ericsson.com>	2023-10-16 10:55:31 +02:00
vabridgers	dd01633c81	[analyzer] Fix crash in BasicValueFactory.cpp with __int128_t integers (#67212 ) This change avoids a crash in BasicValueFactory by checking the bit width of an APSInt to avoid calling getZExtValue if greater than 64-bits. This was caught by our internal, randomized test generator. Clang invocation clang -cc1 -analyzer-checker=optin.portability.UnixAPI case.c <src-root>/llvm/include/llvm/ADT/APInt.h:1488: uint64_t llvm::APInt::getZExtValue() const: Assertion `getActiveBits() <= 64 && "Too many bits for uint64_t"' failed. ... #9 <address> llvm::APInt::getZExtValue() const <src-root>/llvm/include/llvm/ADT/APInt.h:1488:5 clang::BinaryOperatorKind, llvm::APSInt const&, llvm::APSInt const&) <src-root>/clang/lib/StaticAnalyzer/Core/BasicValueFactory.cpp:307:37 llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::BinaryOperatorKind, clang::ento::NonLoc, clang::ento::NonLoc, clang::QualType) <src-root>/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:531:31 llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::BinaryOperatorKind, clang::ento::SVal, clang::ento::SVal, clang::QualType) <src-root>/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:532:26 ...	2023-10-02 09:54:22 -05:00
Corentin Jabot	af4751738d	[C++] Implement "Deducing this" (P0847R7) This patch implements P0847R7 (partially), CWG2561 and CWG2653. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D140828	2023-10-02 14:33:02 +02:00
DonatNagyE	23b88e8123	[analyzer] Remove inaccurate legacy handling of bad bitwise shifts (#66647 ) Previously, bitwise shifts with constant operands were validated by the checker `core.UndefinedBinaryOperatorResult`. However, this logic was unreliable, and commit `25b9696b61` added the dedicated checker `core.BitwiseShift` which validated the preconditions of all bitwise shifts with more accurate logic (that uses the real types from the AST instead of the unreliable type information encoded in `APSInt` objects). This commit disables the inaccurate logic that could mark bitwise shifts as 'undefined' and removes the redundant shift-related warning messages from core.UndefinedBinaryOperatorResult. The tests that were validating this logic are also deleted by this commit; but I verified that those testcases trigger the expected bug reports from `core.BitwiseShift`. (I didn't convert them to tests of `core.BitwiseShift`, because that checker already has its own extensive test suite with many analogous testcases.) I hope that there will be a time when the constant folding will be reliable, but until then we need hacky solutions like this to improve the quality of results.	2023-09-29 20:02:38 +02:00
vabridgers	da26500aa8	[analyzer] Fix crash analyzing _BitInt() in evalIntegralCast (#66782 ) evalIntegralCast was using makeIntVal, and when _BitInt() types were introduced this exposed a crash in evalIntegralCast as a result. This is a reapply of a previous patch that failed post merge on the arm buildbots, because arm cannot handle large BitInts. Pinning the triple for the testcase solves that problem. Improve evalIntegralCast to use makeIntVal more efficiently to avoid the crash exposed by use of _BitInt. This was caught with our internal randomized testing. <src-root>/llvm/include/llvm/ADT/APInt.h:1510: int64_t llvm::APInt::getSExtValue() const: Assertion `getSignificantBits() <= 64 && "Too many bits for int64_t"' failed.a ... #9 <address> llvm::APInt::getSExtValue() const <src-root>/llvm/include/llvm/ADT/APInt.h:1510:5 llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SVal, clang::QualType, clang::QualType) <src-root>/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp:607:24 clang::Expr const, clang::ento::ExplodedNode, clang::ento::ExplodedNodeSet&) <src-root>/clang/lib/StaticAnalyzer/Core/ExprEngineC.cpp:413:61 ... Fixes: https://github.com/llvm/llvm-project/issues/61960 Reviewed By: donat.nagy	2023-09-20 06:11:39 -05:00


2990 Commits