clang-p2996

Author	SHA1	Message	Date
Fangrui Song	dcc45faa30	[ELF] PROVIDE: fix spurious "symbol not found" When archive member extraction involving ENTRY happens after `addScriptReferencedSymbolsToSymTable`, `addScriptReferencedSymbolsToSymTable` may fail to define some PROVIDE symbols used by ENTRY. This is an edge case that regressed after #84512. (The interaction with PROVIDE and ENTRY-in-archive was not considered before). While here, also ensure that --undefined-glob extracted object files are parsed before `addScriptReferencedSymbolsToSymTable`. Fixes: `ebb326a51f` Pull Request: https://github.com/llvm/llvm-project/pull/87530	2024-04-04 09:38:01 -07:00
Parth Arora	ebb326a51f	[ELF] Fix unnecessary inclusion of unreferenced provide symbols Previously, linker was unnecessarily including a PROVIDE symbol which was referenced by another unused PROVIDE symbol. For example, if a linker script contained the below code and 'not_used_sym' provide symbol is not included, then linker was still unnecessarily including 'foo' PROVIDE symbol because it was referenced by 'not_used_sym'. This commit fixes this behavior. PROVIDE(not_used_sym = foo) PROVIDE(foo = 0x1000) This commit fixes this behavior by using dfs-like algorithm to find all the symbols referenced in provide expressions of included provide symbols. This commit also fixes the issue of unused section not being garbage-collected if a symbol of the section is referenced by an unused PROVIDE symbol. Closes #74771 Closes #84730 Co-authored-by: Fangrui Song <i@maskray.me>	2024-03-25 16:11:21 -07:00
Fangrui Song	e115c00565	[ELF] Reject certain unknown section types (#85173) Unknown section types may require special linking rules, and rejecting such sections for older linkers may be desired. For example, if we introduce a new section type to replace a control structure (e.g. relocations), it would be nice for older linkers to reject the new section type. GNU ld allows certain unknown section types: * [SHT_LOUSER,SHT_HIUSER] and non-SHF_ALLOC * [SHT_LOOS,SHT_HIOS] and non-SHF_OS_NONCONFORMING but reports errors and stops linking for others (unless --no-warn-mismatch is specified). Port its behavior. For convenience, we additionally allow all [SHT_LOPROC,SHT_HIPROC] types so that we don't have to hard code all known types for each processor. Close https://github.com/llvm/llvm-project/issues/84812	2024-03-15 09:50:23 -07:00
Fangrui Song	8fe3e70e81	[ELF] Eliminate symbols demoted due to /DISCARD/ discarded sections (#85167) #69295 demoted Defined symbols relative to discarded sections. If such a symbol is unreferenced, the desired behavior is to eliminate it from .symtab just like --gc-sections discarded definitions. Linux kernel's CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y configuration expects that the unreferenced `unused` is not emitted to .symtab (https://github.com/ClangBuiltLinux/linux/issues/2006). For relocations referencing demoted symbols, the symbol index restores to 0 like older lld (`R_X86_64_64 0` in `discard-section.s`). Fix #85048	2024-03-14 09:51:27 -07:00
Fangrui Song	f1ca2a0967	[ELF] Add --compress-sections to compress matched non-SHF_ALLOC sections --compress-sections <section-glob>=[none|zlib|zstd] is similar to --compress-debug-sections but applies to broader sections without the SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is matched. An interesting use case is to compress `.strtab`/`.symtab`, which consume a significant portion of the file size (15.1% for a release build of Clang). An older revision is available at https://reviews.llvm.org/D154641 . This patch focuses on non-allocated sections for safety. Moving `maybeCompress` as D154641 does not handle STT_SECTION symbols for `-r --compress-debug-sections=zlib` (see `relocatable-section-symbol.s` from #66804). Since different output sections may use different compression algorithms, we need CompressedData::type to generalize config->compressDebugSections. GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452 Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674 Pull Request: https://github.com/llvm/llvm-project/pull/84855	2024-03-12 10:56:14 -07:00
Fangrui Song	551e20d190	[ELF] Reject error-prone meta characters in input section description The lexer is overly permissive. When parsing file patterns in an input section description and there is a missing `)`, we would accept many non-sensible tokens (e.g. `}`) as patterns, leading to confusion, e.g. `(SORT_BY_ALIGNMENT(SORT_BY_NAME(.text)) } PROVIDE_HIDDEN(__code_end = .)` (#81804). Ideally, the lexer should be stateful to report more errors like GNU ld and get rid of hacks like `ScriptLexer::maybeSplitExpr`, but that would require a large rewrite of the lexer. For now, just reject certain non-wildcard meta characters to detect common mistakes. Pull Request: https://github.com/llvm/llvm-project/pull/84130	2024-03-06 17:19:59 -08:00
Fangrui Song	d3e79e4cc3	[ELF] Improve wildcard test	2024-03-05 23:43:39 -08:00
Fangrui Song	5ff3f6604b	[ELF] Improve wildcard tests for input section descriptions	2024-03-05 23:39:19 -08:00
Fangrui Song	dee8786f70	[ELF] Fix compareSections assertion failure when OutputDescs in sectionCommands are non-contiguous In a `--defsym y0=0 -T a.lds` link where a.lds contains only INSERT commands, the `script->sectionCommands` layout may be: ``` orphan sections SymbolAssignment due to --defsym sections created by INSERT commands ``` The `OutputDesc` objects are not contiguous in sortInputSections, and `compareSections` will be called with a SymbolAssignment argument, leading to an assertion failure.	2024-02-01 21:20:27 -08:00
Fangrui Song	43b13341fb	[ELF] Add internal InputFile (#78944) Based on https://reviews.llvm.org/D45375 . Introduce a new InputFile kind `InternalKind`, use it for * `ctx.internalFile`: for linker-defined symbols and some synthesized `Undefined` * `createInternalFile`: for symbol assignments and --defsym I picked "internal" instead of "synthetic" to avoid confusion with SyntheticSection. Currently a symbol's file is one of: nullptr, ObjKind, SharedKind, BitcodeKind, BinaryKind. Now it's non-null (I plan to add an `assert(file)` to Symbol::Symbol and change `toString(const InputFile *)` separately). Debugging and error reporting get improved. The immediate user-facing difference is more descriptive "File" column in the --cref output. This patch may unlock further simplification. Currently each symbol assignment gets its own `createInternalFile(cmd->location)`. Two symbol assignments in a linker script do not share the same file. Making the file the same would be nice, but would require non-trivial code.	2024-01-22 09:09:46 -08:00
Fangrui Song	7c89b20e02	[ELF] OVERLAY: support optional start address and LMA https://reviews.llvm.org/D44780 implemented rudimentary support for OVERLAY. The start address and `AT(ldaddr)` in `OVERLAY [start] : [NOCROSSREFS] [AT ( ldaddr )]` are not optional. In addition, there are two issues: * When the start address is `.`, subsequent sections don't share the address of the first overlay section. * When the first overlay section is empty and discardable, `p_paddr` is incorrectly zero. This is because a discarded section has a zero address, causing `prev->getLMA() + prev->size` where `prev` refers to the first section to evaluate to zero. This patch supports optional start address and LMA and fixes the issues. Close #77265 Pull Request: https://github.com/llvm/llvm-project/pull/77272	2024-01-08 16:12:49 -08:00
Fangrui Song	1dfb949833	[ELF] Improve OVERLAY tests Also test two issues: * When the start address is `.`, subsequent sections don't share the address of the first overlay section. * When the first overlay section is empty and discardable, `p_paddr` is incorrectly zero. This is because a discarded section has a zero address, causing `prev->getLMA() + prev->size` where `prev` refers to the first section to evaluate to zero.	2024-01-07 21:36:33 -08:00
Fangrui Song	01c8af5739	[ELF,test] Improve duplicate "symbol not found" error tests	2023-12-16 13:12:17 -08:00
Fangrui Song	215c565644	[ELF,test] Improve PROVIDE tests	2023-12-12 22:29:36 -08:00
Fangrui Song	b8dface221	[ELF] -r: rename orphan SHT_REL/SHT_RELA when the relocated input section is placed in an output section This ports https://reviews.llvm.org/D40652 (--emit-relocs) to -r and matches GNU ld. Close #67910	2023-11-17 22:38:15 -08:00
Fangrui Song	a40f651a06	[ELF] adjustOutputSections: don't copy SHF_EXECINSTR when an output does not contain input sections (#70911) For an output section with no input section, GNU ld eliminates the output section when there are only symbol assignments (e.g. `.foo : { symbol = 42; }`) but not for `.foo : { . += 42; }` (`SHF_ALLOC|SHF_WRITE`). We choose to retain such an output section with a symbol assignment (unless unreferenced `PROVIDE`). We copy the previous section flag (see https://reviews.llvm.org/D37736) to hopefully make the current PT_LOAD segment extend to the current output section: * decrease the number of PT_LOAD segments * If a new PT_LOAD segment is introduced without a page-size alignment as a separator, there may be a run-time crash. However, this `flags` copying behavior is not suitable for `.foo : { . += 42; }` when `flags` contains `SHF_EXECINSTR`. The executable bit is surprising (https://discourse.llvm.org/t/lld-output-section-flag-assignment-behavior/74359). I think we should drop SHF_EXECINSTR when copying `flags`. The risk is a code section followed by `.foo : { symbol = 42; }` will be broken, which I believe is unrelated as such uses are almost always related to data sections. For data-command-only output sections (e.g. `.foo : { QUAD(42) }`), we keep allowing copyable SHF_WRITE. Some tests are updated to drop the SHF_EXECINSTR flag. GNU ld doesn't set SHF_EXECINSTR as well, though it sets SHF_WRITE for some tests while we don't.	2023-11-01 22:35:28 -07:00
Fangrui Song	9220e0e647	[ELF][test] Improve flag propagation when an output section does not contain input sections	2023-11-01 09:29:33 -07:00
Fangrui Song	1981b1b6b9	[ELF] Demote symbols in /DISCARD/ discarded sections to Undefined (#69295) When an input section is matched by /DISCARD/ in a linker script, GNU ld reports errors for relocations referencing symbols defined in the section: `.aaa' referenced in section `.bbb' of a.o: defined in discarded section `.aaa' of a.o Implement the error by demoting eligible symbols to `Undefined` and changing STB_WEAK to STB_GLOBAL. As a side benefit, in relocatable links, relocations referencing symbols defined relative to /DISCARD/ discarded sections no longer set symbol/type to zeros. It's arguable whether a weak reference to a discarded symbol should lead to errors. GNU ld reports an error and our demoting approach reports an error as well. Close #58891 Co-authored-by: Bevin Hansson <bevin.hansson@ericsson.com>	2023-10-17 14:10:52 -07:00
Fangrui Song	0996ceece6	[ELF][test] Improve relocatable link & /DISCARD/ test Check that #69295 will fix symbols referenced by relocations that are defined in discarded sections.	2023-10-17 12:49:17 -07:00
Fangrui Song	557299c9b6	[ELF][test] Test relocations referencing weak symbol, which is defined relative to a section discarded by /DISCARD/	2023-10-14 14:59:10 -07:00
Fangrui Song	4fb49f44fd	[ELF][test] Test relocations referencing symbols relative to sections discarded by /DISCARD/	2023-10-14 14:30:44 -07:00
Fangrui Song	0de0b6dded	[ELF] Postpone "unable to move location counter backward" error (#66854) The size of .ARM.exidx may shrink across `assignAddress` calls. It is possible that the initial iteration has a larger location counter, causing `__code_size = __code_end - .; osec : { . += __code_size; }` to report an error, while the error would have been suppressed for subsequent `assignAddress` iterations. Other sections like .relr.dyn may change sizes across `assignAddress` calls as well. However, their initial size is zero, so it is difficult to trigger a similar error. Similar to https://reviews.llvm.org/D152170, postpone the error reporting. Fix #66836. While here, add more information to the error message.	2023-09-20 09:06:45 -07:00
Fangrui Song	309d1c43bd	[ELF][test] Add a test to demonstrate #66836	2023-09-20 09:04:41 -07:00
Douglas Yung	72bbac4738	Mark test added in `5a58e98` as requiring ppc, not x86 since it tries to use the powerpc64le target.	2023-09-14 14:47:20 -07:00
Fangrui Song	5a58e98c20	[ELF] Align the end of PT_GNU_RELRO associated PT_LOAD to a common-page-size boundary (#66042) Close #57618: currently we align the end of PT_GNU_RELRO to a common-page-size boundary, but do not align the end of the associated PT_LOAD. This is benign when runtime_page_size >= common-page-size. However, when runtime_page_size < common-page-size, it is possible that `alignUp(end(PT_LOAD), page_size) < alignDown(end(PT_GNU_RELRO), page_size)`. In this case, rtld's mprotect call for PT_GNU_RELRO will apply to unmapped regions and lead to an error, e.g. ``` error while loading shared libraries: cannot apply additional memory protection after relocation: Cannot allocate memory ``` To fix the issue, add a padding section .relro_padding like mold, which is contained in the PT_GNU_RELRO segment and the associated PT_LOAD segment. The section also prevents strip from corrupting PT_LOAD program headers. .relro_padding has the largest `sortRank` among RELRO sections. Therefore, it is naturally placed at the end of `PT_GNU_RELRO` segment in the absence of `PHDRS`/`SECTIONS` commands. In the presence of `SECTIONS` commands, we place .relro_padding immediately before a symbol assignment using DATA_SEGMENT_RELRO_END (see also https://reviews.llvm.org/D124656), if present. DATA_SEGMENT_RELRO_END is changed to align to max-page-size instead of common-page-size. Some edge cases worth mentioning: * ppc64-toc-addis-nop.s: when PHDRS is present, do not append .relro_padding * avoid-empty-program-headers.s: when the only RELRO section is .tbss, it is not part of PT_LOAD segment, therefore we do not append .relro_padding. --- Close #65002: GNU ld from 2.39 onwards aligns the end of PT_GNU_RELRO to a max-page-size boundary (https://sourceware.org/PR28824) so that the last page is protected even if runtime_page_size > common-page-size. In my opinion, losing protection for the last page when the runtime page size is larger than common-page-size is not really an issue. Double mapping a page of up to max-common-page for the protection could cause undesired VM waste. Internally we had users complaining about 2MiB max-page-size applying to shared objects. Therefore, the end of .relro_padding is padded to a common-page-size boundary. Users who are really anxious can set common-page-size to match their runtime page size. --- 17 tests need updating as there are lots of change detectors.	2023-09-14 10:33:11 -07:00
Fangrui Song	65a15a56d5	[ELF] Respect orders of symbol assignments and DEFINED (#65866) Fix #64600: the current implementation is minimal (see https://reviews.llvm.org/D83758), and an assignment like `__TEXT_REGION_ORIGIN__ = DEFINED(__TEXT_REGION_ORIGIN__) ? __TEXT_REGION_ORIGIN__ : 0;` (used by avr-ld[1]) leads to a value of zero (default value in `declareSymbol`), which is unexpected. Assign orders to symbol assignments and references so that for a script-defined symbol, the `DEFINED` results match users' expectation. I am unclear about GNU ld's exact behavior, but this hopefully matches its behavior in the majority of cases. [1]: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/scripttempl/avr.sc	2023-09-11 10:54:49 -07:00
Fangrui Song	eabaf3ba85	[ELF] splitNonStrings: switch to xxh3_64bits This is primarily used for .rodata.cst* duplicate elimination. The sections are usually much smaller than .debug_str (D154813), so the speedup is negligible. We do this switch for consistency as we want to eliminate xxh64 in lld.	2023-07-19 11:28:47 -07:00
Fangrui Song	fae96104d4	[ELF] Support operator ^ and ^= GNU ld added ^ support in July 2023 and it looks like ^= is in plan as well. For now, we don't support `a^=0` (^= without a preceding space).	2023-07-15 14:10:40 -07:00
Roger Pau Monne	7cab385a8f	[lld/elf] support quote usage in section names Section names used in ELF linker scripts can be quoted, but such quotes must not be propagated to the binary ELF section names. As such strip the quotes from the section names when processing them, and also strip them from linker script functions that take section names as parameters. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D124266	2023-07-05 14:56:16 -07:00
Fangrui Song	daba24ee7b	[ELF] << >>: make RHS less than 64 The left/right shift linker script operators may trigger UB. E.g. in linkerscript/end-overflow-check.test, the initial REGION1__PADDED_SR_SHIFT is uint64_t(-3), causing the following expression to trigger an out-of-range shift in a ubsan build of lld. REGION1__PADDED_SR_SIZE = MAX(1 << REGION1__PADDED_SR_SHIFT, 32); Protect such UBs by making RHS less than 64.	2023-06-15 10:34:33 -07:00
Andreu Carminati	e4118a7ac0	[ELF] Fix early overflow check in finalizeAddressDependentContent LLD terminates with errors when it detects overflows in the finalizeAddressDependentContent calculation. Although, sometimes, those errors are not really errors, but an intermediate result of an ongoing address calculation. If we continue the fixed-point algorithm we can converge to the correct result. This patch * Removes the verification inside the fixed point algorithm. * Calls checkMemoryRegions at the end. Reviewed By: peter.smith, MaskRay Differential Revision: https://reviews.llvm.org/D152170	2023-06-14 15:26:31 -07:00
Andreu Carminati	58789ed62a	[ELF] Refine warning condition for memory region assignment for non-allocatable section The warning "ignoring memory region assignment for non-allocatable section" should be generated under the following conditions: * sections without SHF_ALLOC attribute and, * presence of input sections or data commands (ByteCommand) The goal of the change is to reduce spurious warnings that are generated for some output sections that have no input section. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D151802	2023-06-14 15:23:14 -07:00
Fangrui Song	361a8226f9	[ELF][test] Add -NEXT and -NOT after D150644 (--print-memory-usage)	2023-05-25 13:05:43 -07:00
Petr Hosek	811cbfc262	[lld][ELF] Implement --print-memory-usage This option was introduced in GNU ld in https://sourceware.org/legacy-ml/binutils/2015-06/msg00086.html and is often used in embedded development. This change implements this option in LLD matching the GNU ld output verbatim. Differential Revision: https://reviews.llvm.org/D150644	2023-05-25 07:14:18 +00:00
Leonard Chan	b9249a69cc	[lld][ELF] Do not emit warning for NOLOAD output sections Much of NOLOAD's intended use is to explicitly change the type of an output section, so we shouldn't flag these as warnings. Differential Revision: https://reviews.llvm.org/D151144	2023-05-23 20:41:20 +00:00
Peter Smith	ca39168a7c	[LLD][ELF] change CHECK to CHECK-NEXT in overlay-phdr.test NFCI A code-review comment to change a couple of CHECK to CHECK-NEXT that I forgot to apply prior to committing. Differential Revision: https://reviews.llvm.org/D150445	2023-05-15 19:08:16 +01:00
Peter Smith	e16af8a281	[LLD][ELF] Add missing program header parsing to OVERLAY In D72756 the change to add INPUT_SECTION_FLAGS inadvertently removed the line to parse the program header assignment information for OutputSections within an OVERLAY. This change adds back the missing line and adds a test for it. Differential Revision: https://reviews.llvm.org/D150445	2023-05-15 10:04:33 +01:00
Justin Cady	447aa48b4a	[ELF] Add REVERSE input section description keyword The `REVERSE` keyword is described here: https://sourceware.org/bugzilla/show_bug.cgi?id=27565 It complements `SORT` by allowing the order of input sections to be reversed. This is particularly useful for order-dependent sections such as .init_array, where `REVERSE` can be used to either detect static initialization order fiasco issues or as a mechanism to maintain .ctors element order while transitioning to the modern .init_array. Such a transition is described here: https://discourse.llvm.org/t/is-it-possible-to-manually-specify-init-array-order/68649 Differential Revision: https://reviews.llvm.org/D145381	2023-03-07 12:44:02 -05:00
Fangrui Song	ffa1118330	[ELF] Mention section name for STT_SECTION in reportRangeError() D73518 mentioned non-STT_SECTION symbol names. This patch extends the code to handle STT_SECTION symbols, where we report the section name. This change helps at least the following cases with very little code. * Whether an out-of-range relocation is due to code or data. * For a relocation in .debug_info, which referenced `.debug_*` section (due to DWARF32 limitation) causes the problem. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D145199	2023-03-03 12:35:05 -08:00
Peter Collingbourne	82c2fcffc2	ELF: Respect MEMORY command when specified without a SECTIONS command. We were previously ignoring the MEMORY command unless SECTIONS was also specified. Fix it. Differential Revision: https://reviews.llvm.org/D145132	2023-03-01 22:40:32 -08:00
Ben Shi	366d34b39e	[AVR][MC] Add ELF flag 'EF_AVR_LINKRELAX_PREPARED' to OBJ files This is in accordance with avr-gcc, even '-mno-relax' is specified to avr-gcc, this flag will also be added to the output relocatables. With this flag set, the GNU ld will perform long call -> short call optimization for AVR, otherwise not. Fixes https://github.com/llvm/llvm-project/issues/54508 Reviewed By: MaskRay, jacquesguan, aykevl Differential Revision: https://reviews.llvm.org/D144617	2023-02-24 11:16:42 +08:00
Fangrui Song	d60ef9338d	[ELF] Support quoted output section names Similar to `e7a7ad134f` and `2bf06d9345` for other linker script syntax. Close https://github.com/llvm/llvm-project/issues/60496	2023-02-03 11:03:00 -08:00
Fangrui Song	926a77b76c	[ELF][test] Clean up PT_OPENBSD tests	2022-11-19 18:51:35 +00:00
Daniel Thornburgh	75cdab6dc2	[llvm-objdump] Add --no-print-imm-hex to tests depending on it. This prepares for an upcoming change to make --print-imm-hex the default behavior of llvm-objdump. These tests were updated in a semi-automatic fashion. See D136972 for details.	2022-10-29 15:40:26 -07:00
Fangrui Song	1837333dac	[ELF] --check-sections: allow address 0xffffffff for ELFCLASS32 Fix https://github.com/llvm/llvm-project/issues/58101	2022-10-01 15:37:07 -07:00
Fangrui Song	3b4d800911	[ELF] Parallelize writes of different OutputSections We currently process one OutputSection at a time and for each OutputSection write contained input sections in parallel. This strategy does not leverage multi-threading well. Instead, parallelize writes of different OutputSections. The default TaskSize for parallelFor often leads to inferior sharding. We prepare the task in the caller instead. * Move llvm::parallel::detail::TaskGroup to llvm::parallel::TaskGroup * Add llvm::parallel::TaskGroup::execute. * Change writeSections to declare TaskGroup and pass it to writeTo. Speed-up with --threads=8: * clang -DCMAKE_BUILD_TYPE=Release: 1.11x as fast * clang -DCMAKE_BUILD_TYPE=Debug: 1.10x as fast * chrome -DCMAKE_BUILD_TYPE=Release: 1.04x as fast * scylladb build/release: 1.09x as fast On M1, many benchmarks are a small fraction of a percentage faster. Mozilla showed the largest difference with the patch being about 1.03x as fast. Differential Revision: https://reviews.llvm.org/D131247	2022-08-24 09:40:03 -07:00
Fangrui Song	85cfd91723	[ELF] Optimize some non-constant alignTo with alignToPowerOf2. NFC My x86-64 lld executable is 2KiB smaller. .eh_frame writing gets faster as there were lots of divisions.	2022-07-24 11:20:49 -07:00
Fangrui Song	b95cca03cd	[ELF] Improve compound assignment tests Also use strchr instead of is_contained.	2022-06-25 22:30:52 -07:00
Fangrui Song	0a0effdd5b	[ELF] Support -= *= /= <<= >>= &= |= in symbol assignments	2022-06-25 22:22:59 -07:00
Fangrui Song	77295c5486	[ELF] Allow ? without adjacent space GNU ld allows 1 ? 2?3:4 : 5?6 :7	2022-06-25 21:16:59 -07:00
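
The PROVIDE handling described in ebb326a51f (a DFS-like walk over the expressions of included PROVIDE symbols) can be illustrated with a minimal Python sketch. This is not lld's code: the function name `live_provide_symbols` and the dictionary representation of a linker script are hypothetical, chosen only to show the idea that a PROVIDE symbol is kept only if it is reachable from a symbol that is actually referenced.

```python
from typing import Dict, Iterable, List, Set


def live_provide_symbols(
    provides: Dict[str, List[str]], referenced: Iterable[str]
) -> Set[str]:
    """Return the PROVIDE symbols that need to be defined.

    `provides` maps each PROVIDE symbol to the symbols its expression
    mentions (hypothetical representation); `referenced` holds the symbols
    that input files or ENTRY actually use.
    """
    live: Set[str] = set()
    # Start only from PROVIDE symbols that something really references.
    stack = [sym for sym in referenced if sym in provides]
    while stack:
        sym = stack.pop()
        if sym in live:
            continue
        live.add(sym)
        # Follow references made by this PROVIDE expression (depth-first).
        stack.extend(
            dep for dep in provides[sym] if dep in provides and dep not in live
        )
    return live


# The example from the commit message:
#   PROVIDE(not_used_sym = foo)
#   PROVIDE(foo = 0x1000)
script = {"not_used_sym": ["foo"], "foo": []}
print(sorted(live_provide_symbols(script, [])))
print(sorted(live_provide_symbols(script, ["not_used_sym"])))
```

With nothing referencing `not_used_sym`, neither symbol is kept, which is the behavior the commit describes; once `not_used_sym` is referenced, `foo` is pulled in through its expression.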

717 Commits