clang-p2996

Author	SHA1	Message	Date
Simon Pilgrim	6ba0b9f68a	[X86][SLM] Fix PBLENDVB uops and throughput SLM PBLENDVB is just as bad as BLENDVPD/PS - so model it as such, fixing the rr vs rm uops diff as well. The Intel AoM appears to have a copy+paste typo with PBLENDW, it doesn't match Agner or InstLatX64. Noticed while investigating some of the weird discrepancies reported by the D103695 helper script (SLM had much better vector shift throughputs than it should).	2021-09-03 11:31:29 +01:00
gbreynoo	e28cd75a50	[OptTable] Reapply Improve error message output for grouped short options This reapplies `71d7fed3bc` which was reverted by `3e2bd82f02`. This change includes the fix for breaking the sanitizer bots. As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current implementation for parsing grouped short options can return unclear error messages. This change fixes the example given in the ticket in which a flag is incorrectly given an argument. Also when parsing a group we now keep reading past the first incorrect option and output errors for all incorrect options in the group. Differential Revision: https://reviews.llvm.org/D108770	2021-09-03 11:13:52 +01:00
Hongtao Yu	7ca8030030	[CSSPGO] Enable loading MD5 CS profile. Adding the compiler support of MD5 CS profile based on previous context split work D107299. An MD5 CS profile is about 40% smaller than the string-based extbinary profile. As a result, the compilation is 15% faster. There are a few conversions from real names to md5 names that have been made on the sample loader and context tracker side to get it to work. Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D108342	2021-09-01 09:19:47 -07:00
Kevin Athey	3e2bd82f02	Revert "[OptTable] Improve error message output for grouped short options" This reverts commit `71d7fed3bc`. Reason: broke sanitizer bots more info: https://reviews.llvm.org/D108770	2021-08-31 14:06:11 -07:00
wlei	964053d56f	[llvm-profgen] Support LBR only perf script This change aims at supporting the LBR-only sample perf script, which is used for regular (non-CS) profile generation. An LBR perf script contains a batch of LBR samples, each starting with a frame pointer followed by a group of 32 LBR entries. The FROM/TO LBR pair and the range between two consecutive entries (the former entry's TO and the latter entry's FROM) will be used to infer function profile info. An example of an LBR perf script (created by `perf script -F ip,brstack -i perf.data`) ``` 40062f 0x40062f/0x4005b0/P/-/-/9 0x400645/0x4005ff/P/-/-/1 0x400637/0x400645/P/-/-/1 ... 4005d7 0x4005d7/0x4005e5/P/-/-/8 0x40062f/0x4005b0/P/-/-/6 0x400645/0x4005ff/P/-/-/1 ... ... ``` For implementation: - Added a new child class `LBRPerfReader` for the sample parsing, reusing all the functionality in `extractLBRStack` except for an extension to parse the leading instruction pointer. - `HybridSample` is reused (the call stack is just left empty) and the parsed samples are still aggregated in `AggregatedSamples`. After that, range samples, branch samples, and address samples are computed and recorded. - Reused `ContextSampleCounterMap` to store the raw profile; since there is no need to aggregate by context, it just registers one sample counter with a fake context key. - Unified on `show-raw-profile` instead of `show-unwinder-output` to dump the intermediate raw profile; see the comments on the format of the raw profile. For CS profile, it still outputs the unwinder output. Profile generation part will come soon. Differential Revision: https://reviews.llvm.org/D108153	2021-08-31 13:28:17 -07:00
gbreynoo	71d7fed3bc	[OptTable] Improve error message output for grouped short options As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current implementation for parsing grouped short options can return unclear error messages. This change fixes the example given in the ticket in which a flag is incorrectly given an argument. Also when parsing a group we now keep reading past the first incorrect option and output errors for all incorrect options in the group. Differential Revision: https://reviews.llvm.org/D108770	2021-08-31 16:41:08 +01:00
Simon Pilgrim	7ec7272b80	[MCA][X86] Add basic coverage for icelake arch Copy the skylake-avx512 tests for icelake-server coverage. Add icelake/rocketlake/tigerlake test coverage to the relevant generic tests as well.	2021-08-31 12:20:09 +01:00
Hongtao Yu	b9db70369b	[CSSPGO] Split context string to deduplicate function name used in the context. Currently context strings contain a lot of duplicated function names and that significantly increases the profile size. This change splits the context into a series of {name, offset, discriminator} tuples so function names used in the context can be replaced by the index into the name table and that significantly reduces the size consumed by context. A follow-up improvement made in the compiler and profiling tools is to avoid reconstructing full context strings which is time- and memory- consuming. Instead a context vector of `StringRef` is adopted to represent the full context in all scenarios. As a result, the previous prevalent profile map which was implemented as a `StringRef` is now engineered as an unordered map keyed by `SampleContext`. `SampleContext` is reshaped to using an `ArrayRef` to represent a full context for CS profile. For non-CS profile, it falls back to use `StringRef` to represent a contextless function name. Both the `ArrayRef` and `StringRef` objects are underpinned by real array and string objects that are stored in producer buffers. For compiler, they are maintained by the sample reader. For llvm-profgen, they are maintained in `ProfiledBinary` and `ProfileGenerator`. Full context strings can be generated only in those cases of debugging and printing. When it comes to profile format, nothing has changed to the text format, though internally CS context is implemented as a vector. Extbinary format is only changed for CS profile, with an additional `SecCSNameTable` section which stores all full contexts logically in the form of `vector<int>`, with each element as an offset that points to `SecNameTable`. All occurrences of contexts elsewhere are redirected to using the offset of `SecCSNameTable`. Testing This is no-diff change in terms of code quality and profile content (for text profile). For our internal large service (aka ads), the profile generation is cut to half, with a 20x smaller string-based extbinary format generated. The compile time of ads is dropped by 25%. Differential Revision: https://reviews.llvm.org/D107299	2021-08-30 20:09:29 -07:00
Keith Smiley	b5da3120b8	[llvm-cov][NFC] Add test for coverage-prefix-map remappings This test acts as a regression test for these fixes: `c75a0a1e9d` `dd388ba3e0` Differential Revision: https://reviews.llvm.org/D108805	2021-08-30 17:19:57 -07:00
Haowei Wu	31e61c58b0	[ifs] Add option to hide undefined symbols This change adds an option to llvm-ifs to hide undefined symbols from its output. Differential Revision: https://reviews.llvm.org/D108428	2021-08-27 11:15:56 -07:00
Roman Lebedev	d4d459e747	[X86] AMD Zen 3: MULX w/ mem operand has the same throughput as with reg op Exegesis is faulty and sometimes when measuring throughput^-1 produces snippets that have loop-carried dependencies, which must be what caused me to incorrectly measure it originally. After looking much more carefully, the inverse throughput should match that of the MULX w/ reg op. As per llvm-exegesis measurements.	2021-08-27 13:27:05 +03:00
Roman Lebedev	0f04936a2d	[X86] AMD Zen 3: MULX produces low part of the result in 3cy, +1cy for high part As per llvm-exegesis measurements.	2021-08-27 13:27:05 +03:00
Roman Lebedev	db2c6cd99c	[NFC][X86][MCA] AMD Zen 3: improve MULX test coverage Latency for MULX isn't right	2021-08-27 13:27:05 +03:00
Andrea Di Biagio	4a5b191703	[X86][MCA] Address the latest issues with MULX reported in PR51495. It turns out that SchedWrite WriteIMulH was always assigned to the low half of the result of a MULX (rather than to the high half). To avoid confusion, this patch swaps the two MULX writes in the tablegen definition of MULX32/64. That way, write names better describe what they actually refer to; this also avoids further complications if in future we decide to reuse the same MulH writes to also model other scalar integer multiply instructions. I also had to swap the latency values for the two MULX writes to make sure that the change is effectively an NFC. In fact, none of the existing x86 tests were affected by this small refactoring. This patch also fixes a bug in MCA: a wrong latency value was propagated for instructions that perform multiple writes to a same register. This last issue was found by Roman while testing MULX on targets that define a different latency for the Low/High part of the result. Differential Revision: https://reviews.llvm.org/D108727	2021-08-26 12:08:20 +01:00
David Green	6ffc6951a3	[AArch64] Remove unpredictable from narrowing instructions. Like other similar instructions the xtn2 family do not have side effects, and explicitly marking them as such can help improve scheduling freedom.	2021-08-26 09:43:44 +01:00
David Green	9474b03d41	[AArch64] Add a Cortex-A55 NEON scheduler test case.	2021-08-26 09:43:44 +01:00
Esme-Yi	b21ed75e10	[llvm-readobj][XCOFF] Add support for `--needed-libs` option. Summary: This patch adds support for the llvm-readobj --needed-libs option under XCOFF. For XCOFF, the needed libraries can be found from the Import File ID Name Table of the Loader Section. Currently, I am using binary inputs in the test since yaml2obj does not yet support writing the Loader Section and the import file table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D106643	2021-08-26 07:17:06 +00:00
Fangrui Song	4a66a11286	[LLVMgold.so][test] Make comdat-nodeduplicate.ll work with binutils<2.27	2021-08-25 16:59:06 -07:00
Andrea Di Biagio	6181427bb9	[X86][MCA] Add more tests for MULX (PR51495). llvm-mca still reports a wrong latency for the case where the two destination registers of MULX are the same.	2021-08-25 21:28:21 +01:00
Alfonso Sánchez-Beato	cdd407286a	[llvm-objcopy] [COFF] Consider section flags when adding section The --set-section-flags option was being ignored when adding a new section. Take it into account if present. Fixes https://llvm.org/PR51244 Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D106942	2021-08-25 23:11:41 +03:00
Rong Xu	24201b6437	[SampleFDO] Set ProfileIsFS bit properly from the internal option We have "-profile-isfs" internal option for text, binary, and compactbinary format (mostly for debug and test purpose). We need to set the related flag in FunctionSamples so that ProfileIsFS is written to the header in extbinary format. Differential Revision: https://reviews.llvm.org/D108707	2021-08-25 09:07:34 -07:00
Wenlei He	a6f15e9a49	[CSSPGO] Use probe inline tree to track zero size fully optimized context for pre-inliner This is a follow up diff for BinarySizeContextTracker to track zero size for fully optimized inlinee. When an inlinee is fully optimized away, we won't be able to get its size through symbolizing instructions, hence we will treat the corresponding context size as unknown. However by traversing the inlined probe forest, we know what the original inlinees are regardless of optimization. If a context shows up in inlined probes, but not during symbolization, we know that it's fully optimized away hence its size is zero instead of unknown. It should provide more accurate size cost estimation for pre-inliner to make better inline decisions in llvm-profgen. Differential Revision: https://reviews.llvm.org/D108350	2021-08-25 09:01:11 -07:00
Andrea Di Biagio	5f848b311f	[X86][SchedModel] Fix latency the Hi register write of MULX (PR51495). Before this patch, WriteIMulH reported a latency value which is correct for the RR variant of MULX, but not for the RM variant. This patch fixes the issue by introducing a new WriteIMulHLd, which is meant to be used only by the RM variant of MULX. Differential Revision: https://reviews.llvm.org/D108701	2021-08-25 16:12:09 +01:00
Vyacheslav Zakharin	2e192ab1f4	[CodeExtractor] Preserve topological order for the return blocks. Differential Revision: https://reviews.llvm.org/D108673	2021-08-25 08:09:01 -07:00
Andrea Di Biagio	fe13b81ed9	[X86][NFC] Pre-commit llvm-mca tests for PR51495. WriteIMulH reports an incorrect latency for RM variants of MULX.	2021-08-25 14:17:17 +01:00
Fangrui Song	9b96b0865d	llvm-xray {convert,extract}: Add --demangle No demangling may be a better default in the future. Add `--demangle` for migration convenience. Reviewed By: Enna1 Differential Revision: https://reviews.llvm.org/D108100	2021-08-24 13:35:19 -07:00
Patrick Holland	e4ebfb5786	[MCA] Adding an AMDGPUCustomBehaviour implementation. This implementation allows mca to model the desired behaviour of the s_waitcnt instruction. This patch also adds the RetireOOO flag to the AMDGPU instructions within the scheduling model. This flag is only used by mca and allows instructions to finish out-of-order which helps mca's simulations more closely model the actual device. Differential Revision: https://reviews.llvm.org/D104730	2021-08-24 13:33:58 -07:00
Arthur Eubanks	d2e103644b	[llvm-reduce] Remove various module data This removes the data layout, target triple, source filename, and module identifier when possible. Reviewed By: swamulism Differential Revision: https://reviews.llvm.org/D108568	2021-08-24 09:45:31 -07:00
David Green	50f4ae58eb	[AArch64] Correct store ReadAdrBase operand It appears that the Read operand for stores was being placed on the first operand (the stored value) not the address base. This adds a ReadST for the stored value operand, allowing the ReadAdrBase to correctly act upon the address. Differential Revision: https://reviews.llvm.org/D108287	2021-08-23 21:07:55 +01:00
David Green	955c9437fd	[AArch64] Add Scheduling tests for Load/Store ReadAdv operands.	2021-08-23 21:07:55 +01:00
Alexey Lapshin	07d44cc0b1	[DWARF][Verifier] Do not add child DieRangeInfo with empty address range to the parent. verifyDieRanges function checks for the intersected address ranges. It adds child DieRangeInfo into parent DieRangeInfo to check whether children have overlapping address ranges. It is safe to not add DieRangeInfo with empty address range into parent's children list. This decreases the number of children which should be navigated and as a result decreases execution time (parents having a lot of children with empty ranges spend much time navigating them). For this command: "llvm-dwarfdump --verify clang-repl" execution time decreased from 220 sec to 75 sec. Differential Revision: https://reviews.llvm.org/D107554	2021-08-22 19:39:21 +03:00
Christian Fetzer	9116211d18	[Coverage][llvm-cov] Correctly export branch coverage in LCOV format Commit `9f2967bcfe` introduced support for branch coverage including export to the LCOV format. This commit corrects the LCOV field name for branches from BFH to BRH. The mistake seems to have slipped in as typo because the correct field name BRH is used in the comment section at the beginning of the file. Differential Revision: https://reviews.llvm.org/D108358	2021-08-20 13:44:25 -05:00
Andrea Di Biagio	35d4292a73	[X86][SchedModels] Fix missing ReadAdvance for MULX and ADCX/ADOX (PR51494) Before this patch, instructions MULX32rm and MULX64rm were missing a ReadAdvance for the implicit read of register EDX/RDX. This patch fixes the issue, and it also introduces a new SchedWrite for the two variants of MULX. The general idea behind this last change is to eventually decrease the number of InstRW in the scheduling models. This patch also adds a ReadAdvance for the implicit read of EFLAGS in ADCX/ADOX. Differential Revision: https://reviews.llvm.org/D108372	2021-08-20 17:39:51 +01:00
Maryam Benimmar	2cdfd0b259	[AIX][XCOFF] 64-bit relocation reading support Support XCOFFDumper relocation reading support This patch is part of D103696 partition Reviewed By: daltenty, Helflym Differential Revision: https://reviews.llvm.org/D104646	2021-08-19 21:56:57 -04:00
Andrzej Warzynski	dcc6b7b1d5	[OptTable] Refine how `printHelp` treats empty help texts Currently, `printHelp` behaves differently for options that: * do not define `HelpText` (such options _are not printed_), and * define their `HelpText` as `HelpText<"">` (such options _are printed_). In practice, both approaches lead to no help text and `printHelp` should treat them consistently. This patch addresses that by making `printHelp` check the length of the help text to be printed. All affected tests have been updated accordingly. The option definitions for llvm-cvtres have been updated with a short description or "Not implemented" for options that are ignored by the tool. Differential Revision: https://reviews.llvm.org/D107557	2021-08-19 09:30:15 +00:00
Wenlei He	eca03d2768	[CSSPGO] Track and use context-sensitive post-optimization function size to drive global pre-inliner in llvm-profgen This change enables llvm-profgen to use accurate context-sensitive post-optimization function byte size as a cost proxy to drive global preinline decisions. To do this, BinarySizeContextTracker is introduced to track function byte size under different inline context during disassembling. In preinliner, we can now query context byte size under switch `context-cost-for-preinliner`. The tracker uses a reverse trie to keep size of functions under different context (callee as parent, caller as child), and it can give best/longest possible matching context size for given input context. The new size cost is off by default. There're a few TODOs that need to be addressed: 1) avoid dangling string from `Offset2LocStackMap`, which will be addressed in split context work; 2) using inlinee's entry probe to make sure we have correct zero size for inlinee that's completely optimized away after inlining. Some tuning is also needed. Differential Revision: https://reviews.llvm.org/D108180	2021-08-18 22:50:57 -07:00
Andrea Di Biagio	2d53e54f0e	[X86][NFC] Pre-commit tests for PR51494	2021-08-18 19:55:21 +01:00
Maryam Benimmar	7151a8aada	[PowerPC][AIX] llvm-readobj: Convert some errors to warnings. Report warnings rather than errors, so that llvm-readobj doesn't bail out on malformed inputs. Differential Revision: https://reviews.llvm.org/D106783	2021-08-18 11:04:08 -04:00
Xu Mingjie	168ee72718	[NFC][llvm-xray] add a llvm-xray convert option `no-demangle` When option `--symbolize` is true, llvm-xray convert will demangle function names by default. This patch adds a llvm-xray convert option `no-demangle` to determine whether to demangle function names when symbolizing function ids from the input log. Reviewed By: MaskRay, smeenai Differential Revision: https://reviews.llvm.org/D108019	2021-08-18 12:22:04 +08:00
Jozef Lawrynowicz	108ba4f4a4	[llvm-readobj] Refactor ELFDumper::printAttributes() The current implementation of printAttributes makes it fiddly to extend attribute support for new targets. By refactoring the code so all target specific variables are initialized in a switch/case statement, it becomes simpler to extend attribute support for new targets. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D107968	2021-08-17 13:28:31 -07:00
Tozer	6d5e31baaa	Fix 2: [MCParser] Correctly handle CRLF line ends when consuming line comments Fixes an issue with revision `5c6f748c` and `ad40cb88`. Adds an mcpu argument to the test command, preventing an invalid default CPU from being used on some platforms.	2021-08-17 17:13:21 +01:00
Fangrui Song	c56b4cfd4b	[llvm-objdump] -T: print symbol versions Similar to D94907 (llvm-nm -D). The output will match GNU objdump 2.37. Older versions don't use ` (version)` for undefined symbols. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D108097	2021-08-17 09:10:50 -07:00
Tozer	ad40cb8821	Fix: [MCParser] Correctly handle CRLF line ends when consuming line comments Fixes an issue with revision `5c6f748c`. Move the test added in the above commit into the X86 folder, ensuring that it is only run on targets where its triple is valid.	2021-08-17 16:16:19 +01:00
Tozer	5c6f748cbc	[MCParser] Correctly handle CRLF line ends when consuming line comments Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=47983 The AsmLexer currently has an issue with lexing line comments in files with CRLF line endings, in which it reads the carriage return as being part of the line comment. This causes an error for certain valid comment layouts; this patch fixes this by excluding the carriage return from the line comment. Differential Revision: https://reviews.llvm.org/D90234	2021-08-17 15:52:51 +01:00
Fangrui Song	54e76cb17a	[split-file] Default to --no-leading-lines It turns out that the --leading-lines may be a bad default. [[#@LINE+-num]] is rarely used.	2021-08-16 19:23:11 -07:00
Hongtao Yu	f27fee623d	[SamplePGO][NFC] Dump function profiles in order Sample profiles are stored in a string map which is basically an unordered map. Printing out profiles by simply walking the string map doesn't enforce an order. I'm sorting the map in the decreasing order of total samples to enable a more stable dump, which is good for comparing two dumps. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D108147	2021-08-16 17:22:30 -07:00
Fangrui Song	935a6d4024	[test] Change llvm-xray options to use the preferred double-dash forms and change -f= to -f	2021-08-15 21:19:04 -07:00
David Blaikie	44d0a99a12	Add missing triple for test	2021-08-15 12:32:12 -07:00
David Blaikie	62a4c2c10e	DWARFVerifier: Check section-relative references at the end of the section This ensures that debug_types references aren't looked for in debug_info section. Behavior is still going to be questionable in an unlinked object file - since cross-cu references could refer to symbols in another .debug_info (or, in theory, .debug_types) chunk - but if a producer only uses ref_addr to refer to things within the same .debug_info chunk in an object file (eg: whole program optimization/LTO - producing two CUs into a single .debug_info section in an object file - the ref_addrs there could be resolved relative to that .debug_info chunk, not needing to consider comdat (DWARFv5 type units or other creatures) chunks of .debug_info, etc)	2021-08-15 11:40:24 -07:00
David Blaikie	2af4db7d5c	Migrate DWARFVerifier tests to lit-based yaml instead of gtest with embedded yaml Improves maintainability (edit/modify the tests without recompiling) and error messages (previously the failure would be a gtest failure mentioning nothing of the input or desired text) and the option to improve tests with more checks. (maybe these tests shouldn't all be in separate files - we could probably have DWARF yaml that contains multiple errors while still being fairly maintainable - the various invalid offsets (ref_addr, rnglists, ranges, etc) could probably be all in one test, but for the simple sake of the migration I just did the mechanical thing here)	2021-08-13 19:09:41 -07:00

5315 Commits