clang-p2996

Author	SHA1	Message	Date
Kazu Hirata	b48b422c08	[Serialization] Avoid repeated hash lookups (NFC) (#126429 )	2025-02-09 13:33:46 -08:00
David Pagan	a5fc7c3ac1	[clang][OpenMP] New OpenMP 6.0 assumption clause, 'no_openmp_constructs' (#125933 ) Add initial parsing/sema support for the new assumption clause so that it can be specified. For now, it is ignored, just like the others. Added support for 'no_openmp_constructs' to release notes. Testing - Updated appropriate LIT tests. - check-all	2025-02-06 12:41:10 -08:00
Michael Park	a9e249f64e	[C++20][Modules][Serialization] Delay marking pending incomplete decl chains until the end of `finishPendingActions`. (#121245 ) The call to `hasBody` inside `finishPendingActions` bumps the `PendingIncompleteDeclChains` size from `0` to `1`, and also sets `LazyVal->LastGeneration` to `6`, which matches the `LazyVal->ExternalSource->getGeneration()` value of `6`. Later, the iteration over `redecls()` (which calls `getNextRedeclaration`) is expected to trigger the reload, but it does not, since the generation numbers match. The proposed solution is to perform the marking of incomplete decl chains at the end of `finishPendingActions`. This way, all of the incomplete decls are marked incomplete as a post-condition of `finishPendingActions`. It's also safe to delay this operation since any operation being done within `finishPendingActions` has `NumCurrentElementsDeserializing == 1`, which means that any calls to `CompleteDeclChain` would simply add to the `PendingIncompleteDeclChains` without doing anything anyway.	2025-02-03 11:22:02 -08:00
Nikolas Klauser	0865ecc515	[clang] Extend diagnose_if to accept more detailed warning information, take 2 (#119712 ) This is take two of #70976. This iteration of the patch makes sure that custom diagnostics without any warning group don't get promoted by `-Werror` or `-Wfatal-errors`. This implements parts of the extension proposed in https://discourse.llvm.org/t/exposing-the-diagnostic-engine-to-c/73092/7. Specifically, this makes it possible to specify a diagnostic group in an optional third argument.	2025-01-28 08:41:31 +01:00
Ilya Biryukov	f63e8ed16e	Revert "[Modules] Delay deserialization of preferred_name attribute at r… (#122726 )" This reverts commit `c3ba6f378e`. We are seeing performance regressions of up to 40% on some compilations with this patch, we will investigate and reland after fixing performance issues.	2025-01-22 18:17:37 +01:00
Chuanqi Xu	fb2c9d940a	[C++20] [Modules] Makes sure internal declaration won't be found by other TU (#123059 ) Close https://github.com/llvm/llvm-project/issues/61427 This is also helpful for partially implementing https://github.com/llvm/llvm-project/issues/112294. The implementation strategy mimics https://github.com/llvm/llvm-project/pull/122887. This patch splits the internal declarations from the general lookup table so that other TUs can't find the internal declarations.	2025-01-17 21:03:53 +08:00
Viktoriia Bakalova	c3ba6f378e	[Modules] Delay deserialization of preferred_name attribute at r… (#122726 ) …ecord level. This fixes the incorrect diagnostic emitted when compiling the following snippet ``` // string_view.h template<class _CharT> class basic_string_view; typedef basic_string_view<char> string_view; template<class _CharT> class __attribute__((__preferred_name__(string_view))) basic_string_view { public: basic_string_view() { } }; inline basic_string_view<char> foo() { return basic_string_view<char>(); } // A.cppm module; #include "string_view.h" export module A; // Use.cppm module; #include "string_view.h" export module Use; import A; ``` The diagnostic is ``` string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier ``` The underlying issue is that deserialization of the `preferred_name` attribute triggers deserialization of `basic_string_view<char>`, which triggers the deserialization of the `preferred_name` attribute again (since it's attached to the `basic_string_view` template). The deserialization logic is implemented in a way that prevents it from going on a loop in a literal sense (it detects early on that it has already seen the `string_view` typedef when trying to start its deserialization for the second time), but leaves the typedef deserialization in an unfinished state. Subsequently, the `string_view` typedef from the deserialized module cannot be merged with the same typedef from `string_view.h`, resulting in the above diagnostic. This PR resolves the problem by delaying the deserialization of the `preferred_name` attribute until the deserialization of the `basic_string_view` template is completed. As a result of deferring, the deserialization of the `preferred_name` attribute doesn't need to go on a loop since the type of the `string_view` typedef is already known when it's deserialized.	2025-01-17 09:10:58 +01:00
Chuanqi Xu	c5e4afe673	[C++20] [Modules] Support module level lookup (#122887 ) (#123281 ) Close https://github.com/llvm/llvm-project/issues/90154 This patch is also an optimization to the lookup process to utilize the information provided by `export` keyword. Previously, in the lookup process, the `export` keyword only takes part in the check part, it doesn't get involved in the lookup process. That said, previously, in a name lookup for 'name', we would load all of declarations with the name 'name' and check if these declarations are valid or not. It works well. But it is inefficient since it may load declarations that may not be wanted. Note that this patch actually did a trick in the lookup process instead of bring module information to DeclarationName or considering module information when deciding if two declarations are the same. So it may not be a surprise to me if there are missing cases. But it is not a regression. It should be already the case. Issue reports are welcomed. In this patch, I tried to split the big lookup table into a lookup table as before and a module local lookup table, which takes a combination of the ID of the DeclContext and hash value of the primary module name as the key. And refactored `DeclContext::lookup()` method to take the module information. So that a lookup in a DeclContext won't load declarations that are local to other modules. And also I think it is already beneficial to split the big lookup table since it may reduce the conflicts during lookups in the hash table. BTW, this patch introduced a regression for a reachability rule in C++20 but it was false-negative. See 'clang/test/CXX/module/module.interface/p7.cpp' for details. This patch is not expected to introduce any other regressions for non-c++20-modules users since the module local lookup table should be empty for them.	2025-01-17 13:41:44 +08:00
Chuanqi Xu	263fed7ce9	[AST] Add OriginalDC argument to ExternalASTSource::FindExternalVisibleDeclsByName (#123152 ) Part of relanding https://github.com/llvm/llvm-project/pull/122887. I split this out to test where the performance regression comes from if modules are not used.	2025-01-17 12:46:00 +08:00
Chuanqi Xu	731db2a03e	Revert "[C++20] [Modules] Support module level lookup (#122887 )" This reverts commit `7201cae106`.	2025-01-16 10:23:11 +08:00
Chuanqi Xu	7201cae106	[C++20] [Modules] Support module level lookup (#122887 ) Close https://github.com/llvm/llvm-project/issues/90154 This patch is also an optimization to the lookup process to utilize the information provided by `export` keyword. Previously, in the lookup process, the `export` keyword only takes part in the check part, it doesn't get involved in the lookup process. That said, previously, in a name lookup for 'name', we would load all of declarations with the name 'name' and check if these declarations are valid or not. It works well. But it is inefficient since it may load declarations that may not be wanted. Note that this patch actually did a trick in the lookup process instead of bring module information to DeclarationName or considering module information when deciding if two declarations are the same. So it may not be a surprise to me if there are missing cases. But it is not a regression. It should be already the case. Issue reports are welcomed. In this patch, I tried to split the big lookup table into a lookup table as before and a module local lookup table, which takes a combination of the ID of the DeclContext and hash value of the primary module name as the key. And refactored `DeclContext::lookup()` method to take the module information. So that a lookup in a DeclContext won't load declarations that are local to other modules. And also I think it is already beneficial to split the big lookup table since it may reduce the conflicts during lookups in the hash table. BTW, this patch introduced a regression for a reachability rule in C++20 but it was false-negative. See 'clang/test/CXX/module/module.interface/p7.cpp' for details. This patch is not expected to introduce any other regressions for non-c++20-modules users since the module local lookup table should be empty for them. --- On the API side, this patch unfortunately add a maybe-confusing argument `Module NamedModule` to `ExternalASTSource::FindExternalVisibleDeclsByName()`. 
People may think we can get the information from the first argument `const DeclContext *DC`. But sadly there are declarations (e.g., namespaces) that can appear in multiple different modules as a single declaration. So we have to add additional information to indicate this.	2025-01-15 15:15:35 +08:00
David Pagan	ad38e24eb7	[clang][OpenMP] Add 'align' modifier for 'allocate' clause (#121814 ) The 'align' modifier is now accepted in the 'allocate' clause. Added LIT tests covering codegen, PCH, template handling, and serialization for 'align' modifier. Added support for align-modifier to release notes. Testing - New allocate modifier LIT tests. - OpenMP LIT tests. - check-all	2025-01-13 05:44:48 -08:00
erichkeane	be32621ce8	[OpenACC] Implement 'device' and 'host' clauses for 'update' These two clauses just take a 'var-list' and specify where the variables should be copied from/to. This patch implements the AST nodes for them and ensures they properly take a var-list.	2025-01-09 09:28:58 -08:00
erichkeane	2c2accbcc6	[OpenACC] Enable 'self' sema for 'update' construct The 'self' clause is an unfortunately difficult one, as it has a significantly different meaning between 'update' and the other constructs. This patch introduces a way for the 'self' clause to work as both. I considered making this two separate AST nodes (one for 'self' on 'update' and one for the others), however this makes the automated macros/etc for supporting a clause break. Instead, 'self' has the ability to act as either a condition or as a var-list clause. As this is the only one of its kind, it is implemented all within it. If in the future we have more that work like this, we should consider rewriting a lot of the macros that we use to make clauses work, and make them separate ast nodes.	2025-01-08 13:19:33 -08:00
erichkeane	ff24e9a19e	[OpenACC] Implement 'default_async' sema A fairly simple one, only valid on the 'set' construct, this clause takes an int expression. Most of the work was already done as a part of parsing, so this patch ends up being a lot of infrastructure.	2025-01-06 11:03:18 -08:00
Chuanqi Xu	4b35dd57b8	[Serialization] Try to clean up PendingUndeducedFunctionDecls when PendingUndeducedFunctionDecls is not empty Close https://github.com/llvm/llvm-project/issues/120277 This turns out to be a simple oversight initially. See the analysis in `ba1e84fb8f` for the wider background.	2024-12-23 15:14:38 +08:00
erichkeane	bdf2555308	[OpenACC] Implement 'device_num' clause sema for 'init'/'shutdown' This is a very simple sema implementation, and just required AST node plus the existing diagnostics. This patch adds tests and adds the AST node required, plus enables it for 'init' and 'shutdown' (only!)	2024-12-19 12:21:51 -08:00
erichkeane	fbb14dd977	[OpenACC] Implement 'use_device' clause AST/Sema This is a clause that is only valid on 'host_data' constructs, and identifies variables for which the current device address should be used. From a Sema perspective, the only novel thing here is a mild change to how ActOnVar works for this clause; otherwise this is very much like the rest of the 'var-list' clauses.	2024-12-16 09:35:57 -08:00
erichkeane	1ab81f8e7f	[OpenACC] Implement 'delete' AST/Sema for 'exit data' construct 'delete' is another clause that has very little compile-time implication, but needs a full AST that takes a var list. This patch implements it fully, plus adds sufficient test coverage.	2024-12-16 06:44:53 -08:00
Dmitry Polukhin	38b3d87bd3	[C++20][Modules] Load function body from the module that gives canonical decl (#111992 ) Summary: Fix crash from reproducer provided in https://github.com/llvm/llvm-project/pull/109167#issuecomment-2405289565 Also fix issues with inline friend functions merged during deserialization. Test Plan: check-clang	2024-12-16 12:22:43 +00:00
erichkeane	3351b3bf8d	[OpenACC] implement 'detach' clause sema This is another new clause specific to 'exit data' that takes a pointer argument. This patch implements this the same way we do a few other clauses (like attach) that have the same restrictions.	2024-12-13 13:51:41 -08:00
erichkeane	2244d2e75c	[OpenACC] Implement 'if_present' clause sema The 'if_present' clause controls the replacement of addresses in the var-list in current device memory. This clause can only go on 'host_data'. From a Sema perspective, there isn't anything to do beyond adding this to the AST and passing it on.	2024-12-13 13:04:57 -08:00
erichkeane	003eb5e80d	[OpenACC] Implement 'finalize' clause sema This is a very simple clause as far as sema is concerned. It is only valid on 'exit data', and doesn't have any rules involving it, so it is simply applied and passed onto the MLIR.	2024-12-13 10:41:02 -08:00
Chuanqi Xu	20e9049509	[Serialization] Support loading template specializations lazily (#119333 ) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specialization for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization `A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.	2024-12-11 09:40:47 +08:00
Haowei Wu	12bdeba76e	Revert "[Serialization] Support load lazy specialization lazily" This reverts commit `b5bd192111`. It broke multiple LLVM bots, including clang-x64-windows-msvc	2024-12-06 10:33:57 -08:00
Chuanqi Xu	b5bd192111	[Serialization] Support load lazy specialization lazily Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specialization for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization `A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point. The patch was reviewed in https://github.com/llvm/llvm-project/pull/83237 but that PR is a stacked PR. I feel the intention of the stacked PRs got lost during the review process, so I feel it is better to merge the commits into a single commit instead of merging them in the PR page. That makes it easier for us to cherry-pick and revert.	2024-12-06 10:52:35 +08:00
Chuanqi Xu	99de065b85	Revert "[Serialization] Downgrade inconsistent flags from erros to warnings (#115416 )" This reverts commit `74449ab86b`. See the post commit message in https://github.com/llvm/llvm-project/pull/115416	2024-11-27 11:35:49 +08:00
Chuanqi Xu	74449ab86b	[Serialization] Downgrade inconsistent flags from erros to warnings (#115416 ) There were many many "voices" about the too strict flags checking in modules. Although they rarely challenge this, maybe due to they respect to the compiler implementation details. But from my point of view, there are cases it is "fine" to have different flags. Especially we're too conservative to mark almost language options in `clang/include/clang/Basic/LangOptions.def` as incompatible options (see the comments in the front of the file). In my understanding, this should come from PCH initially since it is natural to ask your headers to be compiled with the same flags with your TU. And then, when Apple and Google goes to implement clang module, they don't challenge it too since they have a closed world where they have a strong control over the ecosystem so that they can make it consistent. Yes, consistency is great and ODR violation are awful. But this is the world we're living today. This is the C++'s ecosystem in the open ended world. Image a situation that we're using a third party module and we add a new option to our library, then the build bails out! THIS IS SUPER ANNOYING. And makes it non practical to make a modular C++ ecosystem. ( This was discussed many times in SG15. And the consensus is, the build systems should generate different BMI based on different flags. But this manner can't avoid ODR violation completely and it would add the times of module files that need to be built, which may kill the benefit of faster compilation of modules. However, I think the build systems may need to do the similar things in the end of the day. Considering libc++'s hardening mechanism (https://libcxx.llvm.org/Hardening.html). So the conclusion of the paragraph is, although this seems related to build systems, I think they are actually unrelated story. ) I think we should give our users a chance to disable such checks. It is theoretically unsafe. 
But we've done our job to tell the users that it MAY be bad. Then I feel it is C++-ish to give users more freedom even if they may shoot themselves in the foot. This shouldn't change anything. Users who want the previous behavior can get it easily with `-Werror=`.	2024-11-27 10:53:03 +08:00
Jan Svoboda	b769e3544a	[clang][serialization] Blobify IMPORTS strings and signatures (#116095 ) This PR changes a part of the PCM format to store string-like things in the blob attached to a record instead of VBR6-encoding them into the record itself. Applied to the `IMPORTS` section (which is very hot), this speeds up dependency scanning by 2.8%.	2024-11-18 11:45:41 -08:00
Kadir Cetinkaya	5845688e91	Reapply "[clang] Introduce diagnostics suppression mappings (#112517 )" This reverts commit `5f140ba547`.	2024-11-13 10:35:22 +01:00
Balázs Kéri	7a1fdbb9c0	[clang][AST] Add 'IgnoreTemplateParmDepth' to structural equivalence cache (#115518 ) Structural equivalence check uses a cache to store already found non-equivalent values. This cache can be reused for calls (ASTImporter does this). Value of "IgnoreTemplateParmDepth" can have an effect on the structural equivalence therefore it is wrong to reuse the same cache for checks with different values of 'IgnoreTemplateParmDepth'. The current change adds the 'IgnoreTemplateParmDepth' to the cache key to fix the problem.	2024-11-13 09:25:22 +01:00
Kadir Cetinkaya	5f140ba547	Revert "[clang] Introduce diagnostics suppression mappings (#112517 )" This reverts commit `12e3ed8de8`. This reverts commit `41e3919ded`. There are some buildbot breakages in https://lab.llvm.org/buildbot/#/builders/18/builds/6832.	2024-11-12 18:30:42 +01:00
kadir çetinkaya	41e3919ded	[clang] Introduce diagnostics suppression mappings (#112517 ) This implements https://discourse.llvm.org/t/rfc-add-support-for-controlling-diagnostics-severities-at-file-level-granularity-through-command-line/81292. Users now can suppress warnings for certain headers by providing a mapping with globs, a sample file looks like: ``` [unused] src:* src:clang/=emit ``` This will suppress warnings from `-Wunused` group in all files that aren't under `clang/` directory. This mapping file can be passed to clang via `--warning-suppression-mappings=foo.txt`. At a high level, mapping file is stored in DiagnosticOptions and then processed with rest of the warning flags when creating a DiagnosticsEngine. This is a functor that uses SpecialCaseLists underneath to match against globs coming from the mappings file. This implies processing warning options now performs IO, relevant interfaces are updated to take in a VFS, falling back to RealFileSystem when one is not available.	2024-11-12 10:53:43 +01:00
Jan Svoboda	25d1ac11d5	[clang][deps] Only write preprocessor info into PCMs (#115239 ) This patch builds on top of https://github.com/llvm/llvm-project/pull/115237 and https://github.com/llvm/llvm-project/pull/115235, only passing the `Preprocessor` object to `ASTWriter`. This reduces the size of scanning PCM files by 1/3 and speeds up scans by 16%.	2024-11-11 13:07:08 -08:00
Jan Svoboda	9d4837f47c	[clang][deps][modules] Allocate input file paths lazily (#114457 ) This PR builds on top of #113984 and attempts to avoid allocating input file paths eagerly. Instead, the `InputFileInfo` type used by `ASTReader` now only holds `StringRef`s that point into the PCM file buffer, and the full input file paths get resolved on demand. The dependency scanner makes use of this in a bit of a roundabout way: `ModuleDeps` now only holds (an owning copy of) the short unresolved input file paths, which get resolved lazily. This can be a big win, I'm seeing up to a 5% speedup.	2024-11-11 09:46:50 -08:00
Ilya Biryukov	f02b1cc99e	[ASTWriter] Detect more non-affecting FileIDs to reduce source location duplication (#112015 ) Currently, any FileID that references a module map file that was required for a compilation is considered as affecting. This misses an important opportunity to reduce the source location space taken by the resulting PCM. In particular, consider the situation where the same module map file is passed multiple times in the dependency chain: ```shell $ clang -fmodule-map-file=foo.modulemap ... -o mod1.pcm $ clang -fmodule-map-file=foo.modulemap -fmodule-file=mod1.pcm ... -o mod2.pcm ... $ clang -fmodule-map-file=foo.modulemap -fmodule-file=mod$((N-1)).pcm ... -o mod$N.pcm ``` Because `foo.modulemap` is read before reading any of the `.pcm` files, we have to create a unique `FileID` for it when creating each module. However, when reading the `.pcm` files, we will reuse the `FileID` loaded from it for the same module map file and the `FileID` we created can never be used again, but we will still mark it as affecting and it will take the source location space in the output PCM. For a chain of N dependencies, this results in the file taking `N * (size of file)` source location space, which could be significant. For example, we observed internally that some targets that run out of 2GB of source location space end up wasting up to 20% of that space on module maps as described above. I take extra care to still write the InputFile entries for those files that occupied source location space before. It is required for correctness of clang-scan-deps.	2024-11-08 09:10:37 +01:00
Krystian Stasiowski	44ab3805b5	Revert "Reapply "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585 , #111173 )" (#111852 )" (#115159 ) This reverts commit `2bb3d3a3f3`.	2024-11-06 09:25:29 -05:00
David Pagan	435e58468a	[clang][OpenMP] Add 'allocator' modifier for 'allocate' clause. (#114883 ) The 'allocator' modifier is now accepted in the 'allocate' clause. Added LIT tests covering codegen, PCH, template handling, and serialization for 'allocator' modifier. Added support for allocator-modifier to release notes. Testing - New allocate modifier LIT tests. - OpenMP LIT tests. - check-all - relevant sollve_vv test cases tests/5.2/scope/test_scope_allocate_construct.c	2024-11-05 17:06:41 -08:00
Jan Svoboda	e494e2694a	[clang][lex] Remove `HeaderFileInfo::Framework` (#114460 ) This PR removes the `HeaderFileInfo::Framework` member and reduces the size of this data type from 32B to 16B. This should improve Clang's memory usage in situations where it keeps track of lots of header files. NFCI. Depends on #114459.	2024-10-31 16:33:28 -07:00
Jan Svoboda	19b4f17d4c	[clang][lex] Remove `-index-header-map` (#114459 ) This PR removes the `-index-header-map` functionality from Clang. AFAIK this was only used internally at Apple and is now dead code. The main motivation behind this change is to enable the removal of `HeaderFileInfo::Framework` member and reducing the size of that data structure. rdar://84036149	2024-10-31 16:04:35 -07:00
Jan Svoboda	a553c620b7	[clang][modules] Avoid allocations when reading blob paths (#113984 ) When reading a path from a bitstream blob, `ASTReader` performs up to three allocations: 1. Conversion of the `StringRef` blob into `std::string` to conform to the `ResolveImportedPath()` API that takes `std::string &`. 2. Concatenation of the module file prefix directory and the relative path into a fresh `SmallString<128>` buffer in `ResolveImportedPath()`. 3. Propagating the result out of `ResolveImportedPath()` by calling `std::string::assign()` on the out-parameter. This patch makes it so that we avoid allocations altogether (amortized) by: 1. Avoiding conversion of the `StringRef` blob into `std::string` and changing the `ResolveImportedPath()` API. 2. Using one "global" buffer to hold the concatenation. 3. Returning `StringRef` that points into the buffer and ensuring the contents are not overwritten while it lives. Note that in some places of the bitstream we don't store paths as blobs, but rather as records that get VBR-encoded. This makes the allocation in (1) unavoidable. I plan to fix this in a follow-up PR by changing the PCM format. Moreover, there are some data structures (e.g. `serialization::InputFileInfo`) that store deserialized and resolved paths as `std::string`. If we don't access them frequently, it would be more efficient to store just the unresolved `StringRef` and resolve them on demand (within some kind of shared buffer to prevent allocations). This PR alone improves `clang-scan-deps` performance on my workload by 3.6%.	2024-10-31 10:18:21 -07:00
Jan Svoboda	be60afec92	[clang][modules] De-duplicate some logic in `HeaderFileInfoTrait` (#114330 )	2024-10-31 09:05:06 -07:00
Jan Svoboda	19131c7f36	[clang][modules][lldb] Fix build after #113391 Instead of changing the return type of `ModuleMap::findOrCreateModule`, this patch adds a counterpart that only returns `Module *` and thus has the same signature as `createModule()`, which is important in `ASTReader`.	2024-10-28 12:50:53 -07:00
Jan Svoboda	6c6351ee35	[clang][modules] Optimize construction and usage of the submodule index (#113391 ) This patch avoids eagerly populating the submodule index on `Module` construction. The `StringMap` allocation shows up in my profiles of `clang-scan-deps`, while the index is not necessary most of the time. We still construct it on-demand. Moreover, this patch avoids performing qualified submodule lookup in `ASTReader` whenever we're serializing a module graph whose top-level module is unknown. This is pointless, since that's guaranteed to never find any existing submodules anyway. This speeds up `clang-scan-deps` by ~0.5% on my workload.	2024-10-28 11:47:59 -07:00
Jan Svoboda	da1a16ae10	[clang][modules] Preserve the module map that allowed inferring (#113389 ) With inferred modules, the dependency scanner takes care to replace the fake "__inferred_module.map" path with the file that allowed the module to be inferred. However, this only worked when such a module was imported directly in the TU. Whenever such module got loaded transitively, the scanner would fail to perform the replacement. This is caused by the fact that PCM files are lossy and drop this information. This patch makes sure that PCMs include this file for each submodule (in the `SUBMODULE_DEFINITION` record), fixes one existing test with an incorrect assertion, and does a little drive-by refactoring of `ModuleMap`.	2024-10-28 11:24:27 -07:00
Jan Svoboda	0ffa29fe81	[clang][modules] Timestamp PCM files when writing (#112452 ) Clang uses timestamp files to track the last time an implicitly-built PCM file was verified to be up-to-date with regard to its inputs. With `-fbuild-session-{file,timestamp}=` and `-fmodules-validate-once-per-build-session` this reduces the number of times a PCM file is checked per "build session". The behavior I'm seeing with the current scheme is that when lots of Clang instances wait for the same PCM to be built, they race to validate it as soon as the file lock gets released, causing lots of concurrent IO. This patch makes it so that the timestamp is written by the same Clang instance responsible for building the PCM while still holding the lock. This makes it so that whenever a PCM file gets compiled, it's never re-validated in the same build session. I believe this is as sound as the current scheme. One thing to be aware of is that there might be a time interval between accessing input file N and writing the timestamp file, where changes to input files 0..<N would not result in a rebuild. Since this is also the case with the current scheme, I'm not too concerned about that. I've seen this speed up `clang-scan-deps` by ~27%.	2024-10-22 15:08:02 -07:00
Abhina Sree	46dc91e7d9	[SystemZ][z/OS] Add new openFileForReadBinary function, and pass IsText parameter to getBufferForFile (#111723 ) This patch adds an IsText parameter to the following getBufferForFile, getBufferForFileImpl. We introduce a new virtual function openFileForReadBinary which defaults to openFileForRead except in RealFileSystem which uses the OF_None flag instead of OF_Text. The default is set to OF_Text instead of OF_None, this change in value does not affect any other platforms other than z/OS. Setting this parameter correctly is required to open files on z/OS in the correct encoding. The IsText parameter is based on the context of where we open files, for example, in the ASTReader, HeaderMap requires that files always be opened in binary even though they might be tagged as text.	2024-10-21 08:20:22 -04:00
Boaz Brickner	09cc75e2cc	[clang] Deduplicate the logic that only warns once when stack is almost full (#112552 ) Zero diff in behavior.	2024-10-18 10:11:14 +02:00
Erich Keane	c8cbdc659c	[OpenACC] Implement 'loop' 'vector' clause (#112259 ) The 'vector' clause specifies the iterations to be executed in vector or SIMD mode. There are some limitations on which associated compute contexts may be associated with this and have arguments, but otherwise this is a fairly unrestricted clause. It DOES have region limits like 'gang' and 'worker'.	2024-10-15 06:12:19 -07:00
Erich Keane	cf456ed2a4	[OpenACC] implement loop 'worker' clause. (#112206 ) The worker clause specifies iterations of the loop that are executed in parallel by distributing the iterations among the multiple workers within a single gang. The sema rules for this type are simply that it cannot be combined with a `kernel` construct with a `num_workers` clause, child `loop` clauses cannot contain a `gang` or `worker` clause, and that the argument is only allowed when associated with a `kernel`.	2024-10-14 09:08:24 -07:00
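
The table split described in c5e4afe673 / 7201cae106 (a general lookup table as before, plus a module-local table keyed by the ID of the DeclContext and a hash of the primary module name) can be pictured with a small sketch. This is only an illustration under stated assumptions: `SplitLookupTable`, `DeclContextID`, and the string-valued declarations are hypothetical stand-ins rather than Clang types, and `std::hash` stands in for whatever hash the patch actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Clang types.
using DeclContextID = std::uint32_t;
using DeclList = std::vector<std::string>;

class SplitLookupTable {
public:
  // Declarations that any importer may find go into the general table.
  void addGeneral(DeclContextID DC, const std::string &Name, std::string Decl) {
    General[GeneralKey(DC, Name)].push_back(std::move(Decl));
  }

  // Module-local declarations are additionally keyed by a hash of the
  // primary module name (std::hash is an assumption of this sketch).
  void addModuleLocal(DeclContextID DC, const std::string &PrimaryModule,
                      const std::string &Name, std::string Decl) {
    assert(!PrimaryModule.empty() && "module-local decls need an owning module");
    ModuleLocal[LocalKey(DC, hashModule(PrimaryModule), Name)]
        .push_back(std::move(Decl));
  }

  // A lookup reads the general table plus only the module-local entries of
  // the module performing the lookup; entries local to other modules are
  // never visited.
  DeclList lookup(DeclContextID DC, const std::string &Name,
                  const std::string &CurrentModule) const {
    DeclList Result;
    auto G = General.find(GeneralKey(DC, Name));
    if (G != General.end())
      Result = G->second;
    if (!CurrentModule.empty()) {
      auto L = ModuleLocal.find(LocalKey(DC, hashModule(CurrentModule), Name));
      if (L != ModuleLocal.end())
        Result.insert(Result.end(), L->second.begin(), L->second.end());
    }
    return Result;
  }

private:
  using GeneralKey = std::pair<DeclContextID, std::string>;
  using LocalKey = std::tuple<DeclContextID, std::size_t, std::string>;

  static std::size_t hashModule(const std::string &Module) {
    return std::hash<std::string>()(Module);
  }

  std::map<GeneralKey, DeclList> General;
  std::map<LocalKey, DeclList> ModuleLocal;
};
```

In this sketch a translation unit outside any named module passes an empty module name and therefore only ever consults the general table, which mirrors the commit's remark that the module-local table should be empty for users who do not use C++20 modules.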

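The lazy loading of specializations described in 20e9049509 (only specializations whose template arguments hash to the requested key get deserialized) can be sketched in the same spirit. Everything here is an assumption made for illustration: `LazySpecializationTable` and `hashTemplateArgs` are hypothetical names, template arguments are modeled as plain strings, an in-memory map plays the role of the on-disk hash table, and a simple hash combination replaces ODRHash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Template arguments modeled as strings; a hypothetical simplification.
using TemplateArgs = std::vector<std::string>;

// Stand-in for ODRHash: combines the per-argument hashes into one key.
std::size_t hashTemplateArgs(const TemplateArgs &Args) {
  std::size_t H = 0;
  for (const std::string &A : Args)
    H = H * 31 + std::hash<std::string>()(A);
  return H;
}

class LazySpecializationTable {
public:
  // "Serialize" a specialization under the hash of its template arguments.
  void store(const TemplateArgs &Args) {
    assert(!Args.empty() && "this sketch assumes at least one template argument");
    OnDisk[hashTemplateArgs(Args)].push_back(Args);
  }

  // "Deserialize" only the bucket whose key matches the requested arguments
  // and return how many specializations were loaded. Other buckets stay
  // untouched, so their template arguments are never materialized.
  std::size_t loadMatching(const TemplateArgs &Args) {
    auto It = OnDisk.find(hashTemplateArgs(Args));
    if (It == OnDisk.end())
      return 0;
    std::size_t Count = It->second.size();
    for (TemplateArgs &Spec : It->second)
      Loaded.push_back(std::move(Spec));
    OnDisk.erase(It);
    return Count;
  }

  bool isLoaded(const TemplateArgs &Args) const {
    return std::find(Loaded.begin(), Loaded.end(), Args) != Loaded.end();
  }

private:
  std::unordered_map<std::size_t, std::vector<TemplateArgs>> OnDisk;
  std::vector<TemplateArgs> Loaded;
};
```

With the commit's own example, storing `A<ShouldNotBeLoaded>` and `A<int>` and then requesting `A<int>` would load one bucket and leave `ShouldNotBeLoaded` untouched. Because the key is a hash, a colliding bucket could still pull in an unrelated specialization; the sketch accepts that as extra work rather than a correctness problem.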

1752 Commits