Reland after commit c35214c131
([ELF] Initialize TargetInfo members).
Also rename `TargetInfo *getXXXTargetInfo` to `void setXXXTargetInfo`
and change it to set `ctx.target`. This ensures that when `ctx` becomes
a local variable, two lld invocations will not reuse the function-local
static variable.
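
A minimal sketch of the before/after shape, using stand-in types (the real ones live under lld/ELF; names simplified):

```cpp
#include <memory>

// Stand-in types; the real ones are TargetInfo and the per-arch classes in
// lld/ELF/Target.h and lld/ELF/Arch/.
struct TargetInfo { virtual ~TargetInfo() = default; };
struct X86_64 : TargetInfo {};
struct Ctx { std::unique_ptr<TargetInfo> target; };

// Before: TargetInfo *getX86_64TargetInfo() { static X86_64 t; return &t; }
// The function-local static would be reused by a second link in the same
// process. After: the setter stores a fresh instance in ctx, so the state
// dies with the Ctx.
void setX86_64TargetInfo(Ctx &ctx) { ctx.target = std::make_unique<X86_64>(); }
```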
Remove the global variable `symtab` and add a member variable
(`std::unique_ptr<SymbolTable>`) to `Ctx` instead.
This is one step toward eliminating global state.
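
Sketched with a stand-in type, the new ownership looks roughly like:

```cpp
#include <memory>

struct SymbolTable {}; // stand-in for lld::elf::SymbolTable

struct Ctx {
  // Was: a global `SymbolTable *symtab` with LLVM_LIBRARY_VISIBILITY.
  // Now: owned by Ctx and destroyed when the link invocation ends.
  std::unique_ptr<SymbolTable> symtab;
};
```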
Pull Request: https://github.com/llvm/llvm-project/pull/109612
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
llvm/Support/thread.h includes <thread>, which transitively includes
<sstream> in libc++ and uses ios_base::in, so we cannot use `#define in ctx.sec`.
`symtab, config, ctx` are now the only variables using
LLVM_LIBRARY_VISIBILITY.
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
We now use default-initialization for `LinkerScript` and should pay
attention to non-class types (e.g. `dot` is initialized by commit
503907dc50).
Previously, we selected the Thumb2 PLT sequences if any input object is
marked as not supporting the ARM ISA, which then causes assertion
failures when calls from ARM code in other objects are seen. I think the
intention here was to only use Thumb PLTs when the target does not have
the ARM ISA available, signalled by no objects being marked as having it
available. To do that we need to track which ISAs we have seen as we
parse the build attributes, and defer the decision about PLTs until all
input objects have been parsed.
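
A sketch of the deferred decision, with illustrative names (the real logic parses ARM build attributes in lld/ELF/Arch/ARM.cpp):

```cpp
// Accumulated while parsing each input object's ARM build attributes.
struct ArmIsaSeen {
  bool arm = false;   // some object advertises the ARM (A32) ISA
  bool thumb = false; // some object advertises the Thumb (T32) ISA
};

// Called per input file while parsing; it only records what was seen.
void recordIsa(ArmIsaSeen &seen, bool objHasArm, bool objHasThumb) {
  seen.arm |= objHasArm;
  seen.thumb |= objHasThumb;
}

// Decided once, after all inputs are parsed: use Thumb PLT sequences only
// when no object claims the ARM ISA, not when any single object lacks it.
bool useThumbPlts(const ArmIsaSeen &seen) { return !seen.arm && seen.thumb; }
```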
This bug was triggered by real code in picolibc, which has some
versions of string.h functions built with Thumb2-only build attributes,
so that they are compatible with v7-A, v7-R and v7-M.
Fixes #99008.
Ctx was introduced in March 2022 as a more suitable place for such
singletons. ctx's hidden visibility optimizes generated instructions.
This change fixes a pitfall: certain ElfSym members (e.g.
globalOffsetTable, tlsModuleBase) were not zeroed and might be stale
when lld::elf::link was invoked the second time.
Ctx was introduced in March 2022 as a more suitable place for such
singletons. ctx's hidden visibility optimizes generated instructions.
bufferStart and tlsPhdr, which are not OutputSection, can now be moved
outside of `Out`.
GNU ld since 2.41 supports this option, which is mildly useful. It omits
the section header table and non-ALLOC sections, including .symtab and
.strtab (as --strip-all does).
This option is simple to implement and might be used by LLDB to test
program header parsing without the section header table (#100900).
-z sectionheader, which is the default, is also added.
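
A sketch of the intended effect on the ELF header, using the standard <elf.h> field names; this illustrates the option's observable output, not lld's code:

```cpp
#include <elf.h>

// With -z nosectionheader the output has no section header table, so the
// ELF header fields that describe it are zeroed out.
void dropSectionHeaderTable(Elf64_Ehdr &ehdr) {
  ehdr.e_shoff = 0;            // no section header table offset
  ehdr.e_shnum = 0;            // zero section header entries
  ehdr.e_shentsize = 0;        // entry size no longer meaningful
  ehdr.e_shstrndx = SHN_UNDEF; // no section name string table
}
```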
Pull Request: https://github.com/llvm/llvm-project/pull/101286
GNU ld's relocatable linking behaviors:
* Sections with the `SHF_GROUP` flag are handled like sections matched
by the `--unique=pattern` option. They are processed like orphan
sections and ignored by input section descriptions.
* Section groups' (usually named `.group`) content is updated as the
section indexes are updated. Section groups can be discarded with
`/DISCARD/ : { *(.group) }`.
`-r --force-group-allocation` discards section groups and allows
sections with the `SHF_GROUP` flag to be matched like normal sections.
If two section group members are placed into the same output section,
their relocation sections (if present) are combined as well.
This behavior can be useful when -r output is used as a pseudo shared
object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments).
This patch implements --force-group-allocation:
* Input SHT_GROUP sections are discarded.
* Input sections do not get the SHF_GROUP flag, so `addInputSec`
will combine relocation sections if their relocated section group
members are combined.
The default behavior is:
* Input SHT_GROUP sections are retained.
* Input SHF_GROUP sections can be matched (unlike GNU ld).
* Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec`
will create different OutputDesc copies.
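
A rough sketch of the addInputSec decision described above, with a hypothetical simplified signature:

```cpp
#include <cstdint>

constexpr uint64_t SHF_GROUP = 0x200; // ELF: member of a section group

struct InputSection { uint64_t flags; };

// With --force-group-allocation the SHF_GROUP flag is dropped at parse
// time, so group members behave like ordinary sections and addInputSec may
// combine them (and their relocation sections) into one output section.
// By default the flag survives and addInputSec keeps separate OutputDesc
// copies per group.
bool keepsSeparateOutputDesc(const InputSection &isec,
                             bool forceGroupAllocation) {
  uint64_t flags = forceGroupAllocation ? isec.flags & ~SHF_GROUP : isec.flags;
  return (flags & SHF_GROUP) != 0;
}
```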
GNU ld also provides a `FORCE_GROUP_ALLOCATION` linker script command,
which lld does not implement.
Pull Request: https://github.com/llvm/llvm-project/pull/94704
This reverts commit 7832769d32.
This was previously reverted due to a test failure on the Windows
builder. I think this was because we didn't specify the triple and
assumed Windows. The other tests use the full triple specifying Linux,
so we follow suit here.
---
We are using PLTs for the Cortex-M33, which only supports Thumb. More
specifically, this is for a very restricted use case. There's no MMU so
there's no sharing of virtual addresses between two processes, but this
is fine. The MCU is used for running [chre
nanoapps](https://android.googlesource.com/platform/system/chre/+/HEAD/doc/nanoapp_overview.md)
for Android. Each nanoapp is a shared library (but effectively acts as
an executable containing a test suite) that is loaded and run on the MCU
one binary at a time, and there's only one process running at a time, so
the same text segment is never shared by two different running
executables. GNU ld supports Thumb PLTs, but we want to migrate to a
Clang toolchain and use LLD, so Thumb PLTs are needed.
This adds the -z gcs and -z gcs-report options, which behave similarly
to -z shstk and -z cet-report, except that -z gcs accepts a parameter:
* -z gcs=implicit is the default behaviour, where the GCS bit is
inferred from the input objects.
* -z gcs=never clears the GCS bit, ignoring the input objects.
* -z gcs=always sets the GCS bit, ignoring the input objects.
This is so that there's a means of explicitly disabling GCS even when
all input objects have the GCS bit set.
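
The policy can be sketched as follows; the GNU_PROPERTY_AARCH64_FEATURE_1_GCS constant is from the AArch64 GNU property ABI, and the helper is illustrative:

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_GCS = 1u << 2;

enum class GcsPolicy { Implicit, Never, Always };

// -z gcs=implicit: AND of all input objects' feature bits (any object
// lacking the bit disables GCS in the output).
// -z gcs=never / -z gcs=always: the inputs are ignored.
uint32_t computeGcsBit(GcsPolicy policy,
                       const std::vector<uint32_t> &inputFeatures) {
  switch (policy) {
  case GcsPolicy::Never:
    return 0;
  case GcsPolicy::Always:
    return GNU_PROPERTY_AARCH64_FEATURE_1_GCS;
  case GcsPolicy::Implicit: {
    uint32_t bits = ~0u;
    for (uint32_t f : inputFeatures)
      bits &= f;
    return bits & GNU_PROPERTY_AARCH64_FEATURE_1_GCS;
  }
  }
  return 0;
}
```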
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.
This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:
- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).
- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.
- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.
The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.
Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.
A few notable feature interactions occur:
- Stubs affect alignment, ONLY_IF_RO, etc., broadly as if a copy of the
input section were actually placed there.
- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).
- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.
- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.
- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
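
A high-level sketch of the fixed-point loop described above, with illustrative stand-in types (lld's real implementation operates on LinkerScript's state):

```cpp
#include <vector>

// Illustrative stand-ins; the real types are in lld/ELF/LinkerScript.h.
struct MemoryRegion { /* name, origin, length, current usage ... */ };
struct Script {
  void assignAddresses(); // full address assignment pass
  std::vector<MemoryRegion *> overflowedRegions() const;
  // Replaces a spill stub with its original input section, shrinking the
  // overflowed region; sections are visited in reverse assignment order.
  bool spillOne(const std::vector<MemoryRegion *> &overflowed);
};

// Fixed-point driver: any spill invalidates addresses, so reassign and
// re-check until every memory region fits (or nothing is left to spill).
void assignWithSpilling(Script &s) {
  for (;;) {
    s.assignAddresses();
    auto overflowed = s.overflowedRegions();
    if (overflowed.empty())
      return; // fixed point reached
    if (!s.spillOne(overflowed))
      return; // nothing spillable; the link reports an overflow error
  }
}
```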
zstd excels at scaling from low-ratio-very-fast to
high-ratio-pretty-slow. Some users prioritize speed and prefer
compression that keeps up with disk read speed, while others focus on
achieving the highest compression ratio possible, similar to traditional
high-ratio codecs like LZMA.
Add an optional `level` to `--compress-sections` (#84855) to cater to
these diverse needs. While we initially aimed for a one-size-fits-all
approach, this no longer seems to work.
(https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)
When --compress-debug-sections is also used, --compress-sections takes
precedence, since --compress-sections is usually more specific.
Remove the level distinction between -O/-O1 and -O2 for
--compress-debug-sections=zlib for a more consistent user experience.
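
For illustration, the requested level plumbs through to LLVM's compression API roughly like this, assuming the llvm::compression::zstd::compress overload that takes a level (see llvm/Support/Compression.h); compressChunk is a hypothetical helper:

```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Support/Compression.h"

using namespace llvm;

// Compress one section's bytes at the requested level; a level of 0 here
// stands for "use the implementation default".
SmallVector<uint8_t, 0> compressChunk(ArrayRef<uint8_t> input, int level) {
  SmallVector<uint8_t, 0> out;
  compression::zstd::compress(
      input, out, level ? level : compression::zstd::DefaultCompression);
  return out;
}
```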
Pull Request: https://github.com/llvm/llvm-project/pull/90567
`clang -g -gpubnames` (with optional -gsplit-dwarf) creates the
`.debug_names` section ("per-CU" index). By default lld concatenates
input `.debug_names` sections into an output `.debug_names` section.
LLDB can consume the concatenated section, but lookup performance is
poor.
This patch adds --debug-names to create a per-module index by combining
the per-CU indexes into a single index that covers the entire load
module. The produced `.debug_names` is a replacement for `.gdb_index`.
Type units (-fdebug-types-section) are not handled yet.
Co-authored-by: Fangrui Song <i@maskray.me>
Unknown section types may require special linking rules, and
rejecting such sections for older linkers may be desired. For example,
if we introduce a new section type to replace a control structure (e.g.
relocations), it would be nice for older linkers to reject the new
section type. GNU ld allows certain unknown section types:
* [SHT_LOUSER,SHT_HIUSER] and non-SHF_ALLOC
* [SHT_LOOS,SHT_HIOS] and non-SHF_OS_NONCONFORMING
but reports errors and stops linking for others (unless
--no-warn-mismatch is specified). Port its behavior. For convenience, we
additionally allow all [SHT_LOPROC,SHT_HIPROC] types so that we don't
have to hard code all known types for each processor.
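
The acceptance rule can be summarized as a predicate; the constants are from the ELF specification and the helper is illustrative:

```cpp
#include <cstdint>

// ELF constants (see <elf.h> or llvm/BinaryFormat/ELF.h).
constexpr uint32_t SHT_LOOS = 0x60000000, SHT_HIOS = 0x6fffffff;
constexpr uint32_t SHT_LOPROC = 0x70000000, SHT_HIPROC = 0x7fffffff;
constexpr uint32_t SHT_LOUSER = 0x80000000; // SHT_HIUSER == 0xffffffff
constexpr uint64_t SHF_ALLOC = 0x2, SHF_OS_NONCONFORMING = 0x100;

// Returns true if an otherwise-unknown section type is accepted; known
// types are handled elsewhere. Anything rejected here is a link error.
bool acceptUnknownSectionType(uint32_t type, uint64_t flags) {
  if (type >= SHT_LOUSER) // [SHT_LOUSER, SHT_HIUSER]: non-ALLOC only
    return (flags & SHF_ALLOC) == 0;
  if (type >= SHT_LOOS && type <= SHT_HIOS) // OS range
    return (flags & SHF_OS_NONCONFORMING) == 0;
  if (type >= SHT_LOPROC && type <= SHT_HIPROC) // all processor types
    return true;
  return false;
}
```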
Close https://github.com/llvm/llvm-project/issues/84812
--compress-sections <section-glob>=[none|zlib|zstd] is similar to
--compress-debug-sections but applies to broader sections without the
SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is
matched. An interesting use case is to compress `.strtab`/`.symtab`,
which consume a significant portion of the file size (15.1% for a
release build of Clang).
An older revision is available at https://reviews.llvm.org/D154641 .
This patch focuses on non-allocated sections for safety. Moving
`maybeCompress` as D154641 does not handle STT_SECTION symbols for
`-r --compress-debug-sections=zlib` (see `relocatable-section-symbol.s`
from #66804).
Since different output sections may use different compression
algorithms, we need CompressedData::type to generalize
config->compressDebugSections.
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674
Pull Request: https://github.com/llvm/llvm-project/pull/84855
https://reviews.llvm.org/D150510 places .lrodata before .rodata to
minimize the number of permission transitions in the memory image.
However, this layout is less ideal for -fno-pic code (which is still
important).
Small code model -fno-pic code has R_X86_64_32S relocations with a range
of `[0,2**31)` (if we ignore the negative area). Placing `.lrodata`
earlier exerts relocation pressure on such code. Non-x86 64-bit
architectures generally have a similar `[0,2**31)` limitation if they
don't use PC-relative relocations.
If we place .lrodata later, we will need one extra PT_LOAD. Two layouts
are appealing:
* .bss/.lbss/.lrodata/.ldata (GNU ld)
* .bss/.ldata/.lbss/.lrodata
The GNU ld layout has the nice property that there is only one BSS
(except .tbss/.relro_padding). Add -z lrodata-after-bss to support
this layout.
Since a read-only PT_LOAD segment (for large data sections) may appear
after RW PT_LOAD segments, the placement of `_etext` has to be adjusted.
Pull Request: https://github.com/llvm/llvm-project/pull/81224
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way the basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.
First let us review the current encoding, which emits the address of each
function and its number of basic blocks, followed by basic block entries
for each basic block.
|  |  |
|--|--|
| Address of the function | Function Address |
| Number of basic blocks in this function | NumBlocks |
| BB entry 1 | |
| BB entry 2 | |
| ... | |
| BB entry #NumBlocks | |
To make this work for basic block sections, we treat each basic block
section similarly to a function, except that basic block sections of the
same function must be encapsulated in the same structure so we can map
all of them to their single function.
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section as before: the base address of the section, its number of
blocks, and BB entries for its basic blocks. The first section in the BB
address map is always the function entry section.
|  |  |
|--|--|
| Number of sections for this function | NumBBRanges |
| Section 1 begin address | BaseAddress[1] |
| Number of basic blocks in section 1 | NumBlocks[1] |
| BB entries for Section 1 | |
| ... | |
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges | NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges | |
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
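
In struct form, the proposed per-function layout is roughly the following (illustrative only; the real section ULEB128-encodes most fields):

```cpp
#include <cstdint>
#include <vector>

// Conceptual layout of one function's entry in SHT_LLVM_BB_ADDR_MAP
// under the proposed encoding.
struct BBEntry {
  uint32_t id;      // basic block ID
  uint64_t offset;  // now relative to its containing BB range's base
  uint64_t size;
  uint32_t metadata;
};

struct BBRangeEntry {
  uint64_t baseAddress;           // begin address of this BB section
  std::vector<BBEntry> bbEntries; // NumBlocks entries
};

struct FunctionEntry {
  // NumBBRanges ranges; the first is always the function entry section.
  std::vector<BBRangeEntry> bbRanges;
};
```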
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new TargetOption field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compilation
unit, or for none (depending on `TargetOptions::BBAddrMap`).
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.
We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handling into their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
- New tests added:
- Two tests basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
`-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.
Based on https://reviews.llvm.org/D45375 . Introduce a new InputFile
kind `InternalKind`, use it for
* `ctx.internalFile`: for linker-defined symbols and some synthesized
`Undefined`
* `createInternalFile`: for symbol assignments and --defsym
I picked "internal" instead of "synthetic" to avoid confusion with
SyntheticSection.
Currently a symbol's file is one of: nullptr, ObjKind, SharedKind,
BitcodeKind, BinaryKind. Now it's non-null (I plan to add an
`assert(file)` to Symbol::Symbol and change `toString(const InputFile *)`
separately).
Debugging and error reporting are improved. The immediate user-facing
difference is a more descriptive "File" column in the --cref output. This
patch may unlock further simplification.
Currently each symbol assignment gets its own
`createInternalFile(cmd->location)`; two symbol assignments in a linker
script do not share the same file. Making the file the same would be
nice but would require non-trivial code.
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
llvm::support::endianness with llvm::endianness.
We are bringing a new algorithm for function layout (reordering) based on the
call graph (extracted from profile data). The algorithm is an improvement on
top of a known heuristic, C^3. It tries to co-locate functions that are hot and
frequently executed together in the resulting ordering. Unlike C^3, it explores
a larger search space and has an objective closely tied to the performance of
instruction and i-TLB caches. Hence the name CDS = Cache-Directed Sort.
The algorithm can be used at the linking or post-linking (e.g., BOLT) stage.
Refer to https://reviews.llvm.org/D152834 for the actual implementation of the
reordering algorithm.
This diff adds a linker option to replace the existing C^3 heuristic with CDS.
The new behavior can be turned on by passing "--use-cache-directed-sort".
(The plan is to make it the default in a subsequent diff.)
**Perf-impact**
clang-10 binary (built with LTO+AutoFDO/CSSPGO): wins on top of C^3 in [0.3%..0.8%]
rocksDB-8 binary (built with LTO+CSSPGO): wins on top of C^3 in [0.8%..1.5%]
Note that function layout affects performance the most on older machines (with
smaller instruction/iTLB caches) and when huge pages are not enabled. The impact
on newer processors with huge pages enabled is likely neutral/minor.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D152840
Change the flag --call-graph-profile-sort to the form --call-graph-profile-sort={none,hfsort}.
This will be extended to support llvm/lib/Transforms/Utils/CodeLayout.cpp.
--call-graph-profile-sort is not used in the wild but
--no-call-graph-profile-sort is (Chromium). Make --no-call-graph-profile-sort an
alias for --call-graph-profile-sort=none.
Reviewed By: rahmanl
Differential Revision: https://reviews.llvm.org/D159544
Discussion about this approach: https://discourse.llvm.org/t/rfc-safer-whole-program-class-hierarchy-analysis/65144/18
When enabling WPD in an environment where native binaries are present, types we want to optimize can be derived from inside these native files, and devirtualizing them can lead to correctness issues. RTTI can be used to determine all such types in native files and exclude them from WPD, providing a safe, checked way to enable WPD.
The approach is:
1. In the linker, identify if RTTI is available for all native types. If not, under `--lto-validate-all-vtables-have-type-infos` `--lto-whole-program-visibility` is automatically disabled. This is done by examining all .symtab symbols in object files and .dynsym symbols in DSOs for vtable (_ZTV) and typeinfo (_ZTI) symbols and ensuring there's always a match for every vtable symbol.
2. During thinlink, if `--lto-validate-all-vtables-have-type-infos` is set and RTTI is available for all native types, identify all typename (_ZTS) symbols via their corresponding typeinfo (_ZTI) symbols that are used natively or outside of our summary and exclude them from WPD.
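Step 1 amounts to a check of roughly this shape over the native symbol tables (illustrative sketch; the real code walks .symtab/.dynsym entries):

```cpp
#include <set>
#include <string>
#include <string_view>

// Every native vtable symbol _ZTV<Name> must have a matching typeinfo
// symbol _ZTI<Name>; otherwise RTTI is incomplete and
// --lto-whole-program-visibility is disabled under
// --lto-validate-all-vtables-have-type-infos.
bool allVTablesHaveTypeInfos(const std::set<std::string> &nativeSymbols) {
  for (const std::string &sym : nativeSymbols) {
    std::string_view s(sym);
    if (s.substr(0, 4) == "_ZTV" &&
        !nativeSymbols.count("_ZTI" + std::string(s.substr(4))))
      return false;
  }
  return true;
}
```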
Testing:
- ninja check-all
- A large Meta service that uses boost, glog, and libstdc++.so runs
successfully with WPD via --lto-whole-program-visibility. Previously,
native types in boost caused incorrect devirtualization that led to
crashes.
Reviewed By: MaskRay, tejohnson
Differential Revision: https://reviews.llvm.org/D155659
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
Fix #64600: the current implementation is minimal (see
https://reviews.llvm.org/D83758), and an assignment like
`__TEXT_REGION_ORIGIN__ = DEFINED(__TEXT_REGION_ORIGIN__) ? __TEXT_REGION_ORIGIN__ : 0;`
(used by avr-ld[1]) leads to a value of zero (default value in `declareSymbol`),
which is unexpected.
Assign orders to symbol assignments and references so that,
for a script-defined symbol, the `DEFINED` results match users'
expectations. I am unclear about GNU ld's exact behavior, but this
hopefully matches it in the majority of cases.
[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/scripttempl/avr.sc
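
One way to model the intended semantics (purely illustrative; not lld's actual code):

```cpp
#include <map>
#include <string>

// Every definition and every DEFINED() reference gets an increasing order
// number; DEFINED(sym) is true only if sym's definition precedes the
// reference. Symbols defined by input objects or --defsym get order 0, so
// they predate all script references.
struct OrderModel {
  int counter = 0;
  std::map<std::string, int> defOrder;

  void defineFromInput(const std::string &name) { defOrder.emplace(name, 0); }
  void scriptAssign(const std::string &name) {
    defOrder.emplace(name, ++counter); // first assignment fixes the order
  }
  bool definedAt(const std::string &name) {
    int ref = ++counter; // the reference is evaluated before the assignment
    auto it = defOrder.find(name);
    return it != defOrder.end() && it->second < ref;
  }
};
```

Under this model, `sym = DEFINED(sym) ? sym : 0;` evaluates DEFINED before recording the assignment, so it is false unless sym was already defined earlier, which is the expected behavior.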