clang-p2996

Author	SHA1	Message	Date
Jonas Paulsson	5ecd363295	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." This reverts commit `122efef8ee`. - Patch fixed to not reuse definitions from predecessors in EH landing pads. - Late review suggestions (by MaskRay) have been addressed. - M68k/pipeline.ll test updated. - Init captures added in processBlock() to avoid capturing structured bindings. - RISCV has this disabled for now. Original commit message: A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameInde(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-05 12:53:50 -06:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit `17db0de330`. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Jonas Paulsson	8ef4632681	Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." Temporarily revert and fix buildbot failure. This reverts commit `6d12599fd4`.	2022-12-01 13:29:24 -05:00
Jonas Paulsson	6d12599fd4	[CodeGen] Add new pass for late cleanup of redundant definitions. A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameInde(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-01 13:21:35 -05:00
Alex Richardson	d77b7cac27	[BPF] Avoid checking for intrinsics using string comparisons. NFC Use a dyn_cast<> to IntrinsicInst and an enum compare instead. While touching this code also re-generate the test to use positive check lines instead of negative ones and remove some unneeded metadata. Reviewed By: yonghong-song Differential Revision: https://reviews.llvm.org/D138565	2022-11-25 11:34:55 +00:00
Daniel Thornburgh	75cdab6dc2	[llvm-objdump] Add --no-print-imm-hex to tests depending on it. This prepares for an upcoming change to make --print-imm-hex the default behavior of llvm-objdump. These tests were updated in a semi-automatic fashion. See D136972 for details.	2022-10-29 15:40:26 -07:00
Simon Pilgrim	78739fdb4d	[DAG] Enable combineShiftOfShiftedLogic folds after type legalization This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides. Fixes #57872 Differential Revision: https://reviews.llvm.org/D136042	2022-10-29 12:30:04 +01:00
Simon Pilgrim	986ca95e06	[BPF] Add (failing) testcase for Issue #57872	2022-10-16 18:16:18 +01:00
Simon Pilgrim	8727248b79	[UpdateTestChecks] Add basic BPF triple handling Working on Issue #57872 - its really useful to be able to autogenerate checks	2022-10-12 15:57:52 +01:00
Simon Pilgrim	01631a83f1	[BPF] memcmp.ll - add checks for all loads Noticed while triaging alignment issues for #57872	2022-09-24 18:55:25 +01:00
Simon Pilgrim	9e090dc699	[BPF] ex1.ll - add checks for stores Noticed while triaging alignment issues for #57872	2022-09-24 18:53:59 +01:00
Yonghong Song	481d67d310	[Clang][BPF] Support record argument with direct values Currently, record arguments are always passed by reference by allocating space for record values in the caller. This is less efficient for small records which may take one or two registers. For example, for x86_64 and aarch64, for a record size up to 16 bytes, the record values can be passed by values directly on the registers. This patch added BPF support of record argument with direct values for up to 16 byte record size. If record size is 0, that record will not take any register, which is the same behavior for x86_64 and aarch64. If the record size is greater than 16 bytes, the record argument will be passed by reference. Differential Revision: https://reviews.llvm.org/D132144	2022-08-18 19:11:50 -07:00
Yonghong Song	6e6c1efe04	[BPF] Handle anon record for CO-RE relocations When doing experiment in kernel, for kernel data structure sockptr_t in CO-RE operation, I hit an assertion error. The sockptr_t definition and usage look like below: #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record) typedef struct { union { void kernel; void user; }; unsigned is_kernel : 1; } sockptr_t; #pragma clang attribute pop int test(sockptr_t arg) { return arg->is_kernel; } The assertion error looks like clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:878: llvm::Value {anonymous}::BPFAbstractMemberAccess::computeBaseAndAccessKey(llvm::CallInst, {anonymous}::BPFAbstractMemberAccess::CallInfo&, std::__cxx11::string&, llvm::MDNode&): Assertion `TypeName.size()' failed. In this particular, the clang frontend attach the debuginfo metadata associated with anon structure with the preserve_access_info IR intrinsic. But the first debuginfo type has to be a named type so libbpf can have a sound start to do CO-RE relocation. Besides the above approach using pragma to push attribute, the below typedef/struct definition can have preserve_access_index directly applying to the anon struct. typedef struct { union { void kernel; void user; }; unsigned is_kernel : 1; } __attribute__((preserve_access_index) sockptr_t; This patch fixed the issue by preprocessing function argument/return types and local variable types used by other CO-RE intrinsics. For any typedef struct/union { ... } typedef_name an association of <anon struct/union, typedef> is recorded to replace the IR intrinsic metadata 'anon struct/union' to 'typedef'. It is possible that two different 'typedef' types may have identical anon struct/union type. For such a case, the association will be <anon struct/union, nullptr> to indicate the invalid case. Differential Revision: https://reviews.llvm.org/D129621	2022-07-13 15:16:16 -07:00
Daniel Müller	d129ac27e8	[BPF] Introduce support for type match relocations Among others, BPF currently supports the type-exists CO-RE relocation (e.g., see D83878 & D83242). Its intention, as the name tries to convey, is to be used for checking existence of a type in a target. While that check is useful and has its place, we would also like to be able to perform stricter type queries: instead of just checking mere existence, we want to make sure that members match up in composite types, that enum variants are present, etc. We refer to this as "type match". This change proposes the addition of a new relocation variant/value that we intend to use for establishing this match relation. Differential Revision: https://reviews.llvm.org/D126838	2022-06-29 18:23:08 -07:00
Martin Sebor	b19194c032	[InstCombine] handle subobjects of constant aggregates Remove the known limitation of the library function call folders to only work with top-level arrays of characters (as per the TODO comment in the code) and allows them to also fold calls involving subobjects of constant aggregates such as member arrays.	2022-06-21 11:55:14 -06:00
Yonghong Song	dc1c43d726	[BPF] Add BTF 64bit enum value support Current BTF only supports 32-bit value. For example, enum T { VAL = 0xffffFFFF00000008 }; the generated BTF looks like .long 16 # BTF_KIND_ENUM(id = 4) .long 100663297 # 0x6000001 .long 8 .long 18 .long 8 The encoded value is 8 which equals to (uint32_t)0xffffFFFF00000008 and this is incorrect. This patch introduced BTF_KIND_ENUM64 which permits to encode 64-bit value. The format for each enumerator looks like: .long name_offset .long (uint32_t)value # lower-32 bit value .long value >> 32 # high-32 bit value We use two 32-bit values to represent a 64-bit value as current BTF type subsection has 4-byte alignment and gaps are not permitted in the subsection. This patch also added support for kflag (the bit 31 of CommonType.Info) such that kflag = 1 implies the value is signed and kflag = 0 implies the value is unsigned. The kernel UAPI enumerator definition is struct btf_enum { __u32 name_off; __s32 val; }; so kflag = 0 with unsigned value provides backward compatability. With this patch, for enum T { VAL = 0xffffFFFF00000008 }; the generated BTF looks like .long 16 # BTF_KIND_ENUM64(id = 4) .long 3187671053 # 0x13000001 .long 8 .long 18 .long 8 # 0x8 .long 4294967295 # 0xffffffff and the enumerator value and signedness are encoded correctly. Differential Revision: https://reviews.llvm.org/D124641	2022-06-06 11:35:50 -07:00
Brad Smith	c2d27c8959	[BPF] Enable IAS in backend Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D123845	2022-06-05 23:28:53 -04:00
Nuno Lopes	80b3dcc045	[Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace There are a few places where we use report_fatal_error when the input is broken. Currently, this function always crashes LLVM with an abort signal, which then triggers the backtrace printing code. I think this is excessive, as wrong input shouldn't give a link to LLVM's github issue URL and tell users to file a bug report. We shouldn't print a stack trace either. This patch changes report_fatal_error so it uses exit() rather than abort() when its argument GenCrashDiag=false. Reviewed by: nikic, MaskRay, RKSimon Differential Revision: https://reviews.llvm.org/D126550	2022-05-30 19:19:23 +01:00
Ivan Kosarev	ad1d60c3be	[FileCheck] Catch missspelled directives. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D125604	2022-05-26 11:37:19 +01:00
Jim Lin	50f5cef391	[BPF] Implement mod operation Implement BPF_MOD instruction to fix lack of assembly parser support mentioned in https://github.com/llvm/llvm-project/issues/55192. Reviewed By: ast Differential Revision: https://reviews.llvm.org/D125207	2022-05-12 10:59:18 +08:00
Eduard Zingerman	256a18997e	[BPF] Add a test for making FI_ri as isPseudo Commit `8a63326150` ("[BPF] Mark FI_ri as isPseudo to avoid assertion during disassembly") added isPseudo to FI_ri insn in BPFInstrInfo.td file. This patch added the missing test file. Differential Revision: https://reviews.llvm.org/D125185	2022-05-10 17:46:07 -07:00
Peter Klausler	497a5f0415	[BPF] Fix a bug in BPFMISimplifyPatchable pass LLVM BPF pass SimplifyPatchable is used to do necessary code conversion for CO-RE operations. When studying bpf selftest 'exhandler', I found a corner case not handled properly. The following is the C code, modified from original 'exhandler' code. int g; int test(struct t1 p) { struct t2 q = p->q; if (q) return 0; struct t3 f = q->f; if (!f) g = 5; return 0; } For code: struct t3 f = q->f; if (!f) ... The IR before BPFMISimplifyPatchable pass looks like: %5:gpr = LD_imm64 @"llvm.t2:0:8$0:1" %6:gpr = LDD killed %5:gpr, 0 %7:gpr = LDD killed %6:gpr, 0 JNE_ri killed %7:gpr, 0, %bb.3 JMP %bb.2 Note that compiler knows q = 0 based dataflow and value analysis. The correct generated code after the pass should be %5:gpr = LD_imm64 @"llvm.t2:0:8$0:1" %7:gpr = LDD killed %5:gpr, 0 JNE_ri killed %7:gpr, 0, %bb.3 JMP %bb.2 But the current implementation did further optimization for the above code and generates %5:gpr = LD_imm64 @"llvm.t2:0:8$0:1" JNE_ri killed %5:gpr, 0, %bb.3 JMP %bb.2 which is incorrect. This patch added a cache to remember those load insns not associated with CO-RE offset value and will skip these load insns during transformation. Differential Revision: https://reviews.llvm.org/D123883	2022-04-19 15:24:26 -07:00
Yonghong Song	6ee71e53e5	[BPF] handle opaque-pointer for __builtin_preserve_enum_value Opaque pointer [1] is enabled as the default with commit [2]. Andrii found that current __builtin_preserve_enum_value() can only handle non opaque pointer code pattern and will segfault with latest llvm main branch where opaque-pointer is enabled by default. This patch added the opaque pointer support. Besides llvm selftests, also verified with bpf-next bpf selftests. [1] https://llvm.org/docs/OpaquePointers.html [2] https://reviews.llvm.org/D123122 Differential Revision: https://reviews.llvm.org/D123800	2022-04-14 11:34:32 -07:00
Yonghong Song	5898979387	BPF: support inlining __builtin_memcmp intrinsic call Delyan Kratunov reported an issue where __builtin_memcmp is not inlined into simple load/compare instructions. This is a known issue. In the current state, __builtin_memcmp will be converted to memcmp call which won't work for bpf programs. This patch added support for expanding __builtin_memcmp with actual loads and compares up to currently maximum 128 total loads. The implementation is identical to PowerPC. Differential Revision: https://reviews.llvm.org/D122676	2022-03-29 15:03:26 -07:00
Yonghong Song	2e94d8e67a	[BPF] handle unsigned icmp ops in BPFAdjustOpt pass When investigating an issue with bcc tool inject.py, I found a verifier failure with latest clang. The portion of code can be illustrated as below: struct pid_struct { u64 curr_call; u64 conds_met; u64 stack[2]; }; struct pid_struct bpf_map_lookup_elem(); int foo() { struct pid_struct p = bpf_map_lookup_elem(); if (!p) return 0; p->curr_call--; if (p->conds_met < 1 \|\| p->conds_met >= 3) return 0; if (p->stack[p->conds_met - 1] == p->curr_call) p->conds_met--; ... } The verifier failure looks like: ... 8: (79) r1 = (u64 )(r0 +0) R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R10=fp0 fp-8=mmmm???? 9: (07) r1 += -1 10: (7b) (u64 )(r0 +0) = r1 R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmm???? 11: (79) r2 = (u64 )(r0 +8) R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmm???? 12: (bf) r3 = r2 13: (07) r3 += -3 14: (b7) r4 = -2 15: (2d) if r4 > r3 goto pc+13 R0=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1=inv(id=0) R2=inv(id=2) R3=inv(id=0,umin_value=18446744073709551614,var_off=(0xffffffff00000000; 0xffffffff)) R4=inv-2 R10=fp0 fp-8=mmmm???? 16: (07) r2 += -1 17: (bf) r3 = r2 18: (67) r3 <<= 3 19: (bf) r4 = r0 20: (0f) r4 += r3 math between map_value pointer and register with unbounded min value is not allowed Here the compiler optimized "p->conds_met < 1 \|\| p->conds_met >= 3" to r2 = p->conds_met r3 = r2 r3 += -3 r4 = -2 if (r3 < r4) return 0 r2 += -1 r3 = r2 ... In the above, r3 is initially equal to r2, but is modified used by the comparison. But later on r2 is used again. This caused verification failure. BPF backend has a pass, AdjustOpt, to prevent such transformation, but only focused on signed integers since typical bpf helper returns signed integers. To fix this case, let us handle unsigned integers as well. Differential Revision: https://reviews.llvm.org/D121937	2022-03-17 16:24:39 -07:00
Yonghong Song	d2b4a675a8	[BPF] Fix a bug in BPFAdjustOpt pass for icmp transformation When checking a bcc issue related to bcc tool inject.py, I found a bug in BPFAdjustOpt pass for icmp transformation, caused by typo's. For the following condition: Cond2Op != ICmpInst::ICMP_SLT && Cond1Op != ICmpInst::ICMP_SLE it should be Cond2Op != ICmpInst::ICMP_SLT && Cond2Op != ICmpInst::ICMP_SLE This patch fixed the problem and a test case is added. Differential Revision: https://reviews.llvm.org/D121883	2022-03-17 09:25:18 -07:00
Yonghong Song	98e2274458	[BPF] fix a CO-RE bitfield relocation error with >8 record alignment Jussi Maki reported a fatal error like below for a bitfield CO-RE relocation: fatal error: error in backend: Unsupported field expression for llvm.bpf.preserve.field.info, requiring too big alignment The failure is related to kernel struct thread_struct. The following is a simplied example. Suppose we have below structure: struct t2 { int a[8]; } __attribute__((aligned(64))) __attribute__((preserve_access_index)); struct t1 { int f1:1; int f2:2; struct t2 f3; } __attribute__((preserve_access_index)); Note that struct t2 has aligned 64, which is used sometimes in the kernel to enforce cache line alignment. The above struct will be encoded into BTF and the following is what C code looks like and the struct will appear in the file like vmlinux.h. struct t2 { int a[8]; long: 64; long: 64; long: 64; long: 64; } __attribute__((preserve_access_index)); struct t1 { int f1: 1; int f2: 2; long: 61; long: 64; long: 64; long: 64; long: 64; long: 64; long: 64; long: 64; struct t2 f3; } __attribute__((preserve_access_index)); Note that after origin_source -> BTF -> new_source transition, the new source has the same memory layout as the old one but the alignment interpretation inside the compiler could be different. The bpf program will use the later explicitly padded structure as in vmlinux.h. In the above case, the compiler internal ABI alignment for new struct t1 is 16 while it is 4 for old struct t1. I didn't do a thorough investigation why the ABI alignment is 16 and I suspect it is related to anonymous padding in the above. Current BPF bitfield CO-RE handling requires alignment <= 8 so proper bitfield operatin can be performed. Therefore, alignment 16 will cause a compiler fatal error. To fix the ABI alignment >=16, let us check whether the bitfield can be held within a 8-byte-aligned range. If this is the case, we can use alignment 8. Otherwise, a fatal error will be reported. Differential Revision: https://reviews.llvm.org/D121821	2022-03-16 12:16:46 -07:00
Reid Kleckner	f58fb8ae7f	[BPF] Fix tests that fail if /tmp/t.c exists IMO the BPF backend shouldn't read random source files referenced from debug info. I filed llvm.org/pr54092 about this.	2022-02-25 14:55:53 -08:00
Yonghong Song	3671bdbcd2	[BPF] Fix a BTF type pruning bug In BPF backend, BTF type generation may skip some debuginfo types if they are the pointee type of a struct member. For example, struct task_struct { ... struct mm_struct mm; ... }; BPF backend may generate a forward decl for 'struct mm_struct' instead of full type if there are no other usage of 'struct mm_struct'. The reason is to avoid bringing too much unneeded types in BTF. Alexei found a pruning bug where we may miss some full type generation. The following is an illustrating example: struct t1 { ... } struct t2 { struct t1 p; }; struct t2 g; void foo(struct t1 *arg) { ... } In the above case, we will have partial debuginfo chain like below: struct t2 -> member p \ -> ptr -> struct t1 / foo -> argument arg During traversing struct t2 -> member p -> ptr -> struct t1 The corresponding BTF types are generated except 'struct t1' which will be in FixUp stage. Later, when traversing foo -> argument arg -> ptr -> struct t1 The 'ptr' BTF type has been generated and currently implementation ignores 'pointer' type hence 'struct t1' is not generated. This patch fixed the issue not just for the above case, but for general case with multiple derived types, e.g., struct t2 -> member p \ -> const -> ptr -> volatile -> struct t1 / foo -> argument arg Differential Revision: https://reviews.llvm.org/D119986	2022-02-16 17:23:34 -08:00
Yonghong Song	f419029fcd	[BPF] Fix a bug in BTF_KIND_TYPE_TAG generation Kumar Kartikeya Dwivedi reported a bug ([1]) where BTF_KIND_TYPE_TAG types are not generated. Currently, BPF backend only generates BTF types which are used by the program, e.g., global variables, functions and some builtin functions. For example, suppose we have struct task_struct { ... struct task_group sched_task_group; struct mm_struct mm; ... pid_t pid; pid_t tgid; ... } If BPF program intends to access task_struct->pid and task_struct->tgid, there really no need to generate BTF types for struct task_group and mm_struct. In BPF backend, during BTF generation, when generating BTF for struct task_struct, if types for task_group and mm_struct have not been generated yet, a Fixup structure will be created, which will be reexamined later to instantiate into either a full type or a forward type. In current implementation, if we have something like struct foo { struct bar __tag1 *f; }; and when generating types for struct foo, struct bar type has not been generated, the __tag1 will be lost during later Fixup instantiation. This patch fixed this issue by properly handling btf_type_tag's during Fixup instantiation stage. [1] https://lore.kernel.org/bpf/20220210232411.pmhzj7v5uptqby7r@apollo.legion/ Differential Revision: https://reviews.llvm.org/D119799	2022-02-14 19:43:57 -08:00
Nikita Popov	f430c1eb64	[Tests] Add elementtype attribute to indirect inline asm operands (NFC) This updates LLVM tests for D116531 by adding elementtype attributes to operands that correspond to indirect asm constraints.	2022-01-06 14:23:51 +01:00
Yonghong Song	8fb3f84484	BPF: Workaround InstCombine trunc+icmp => mask+icmp Optimization Patch [1] added further InstCombine trunc+icmp => mask+icmp optimization and this caused a couple of bpf selftest failure. Previous llvm BPF backend patch [2] introduced llvm.bpf.compare builtin to handle such situations. This patch further added support ">" and ">=" icmp opcodes. Tested with bpf selftests and all tests are passed including two previously failed ones. Note Patch [1] also added optimization if the to-be-compared constant is negative-power-of-2 (-C) or not-of-power-of-2 (~C). This patch didn't implement these two cases as typical bpf program compares a scalar to a positive length or boundary value, and this scalar later is used as a index into an array buffer or packet buffer. [1] https://reviews.llvm.org/D112634 [2] https://reviews.llvm.org/D112938 Differential Revision: https://reviews.llvm.org/D114215	2021-11-18 20:25:28 -08:00
Yonghong Song	8d499bd5bc	BPF: change btf_type_tag BTF output format For the declaration like below: int __tag1 * __tag1 __tag2 g Commit `41860e602a` ("BPF: Support btf_type_tag attribute") implemented the following encoding: VAR(g) -> __tag1 --> __tag2 -> pointer -> __tag1 -> pointer -> int Some further experiments with linux btf_type_tag support, esp. with generating attributes in vmlinux.h, and also some internal discussion showed the following format is more desirable: VAR(g) -> pointer -> __tag2 -> __tag1 -> pointer -> __tag1 -> int The format makes it similar to other modifier like 'const', e.g., const int g which has encoding VAR(g) -> PTR -> CONST -> int Differential Revision: https://reviews.llvm.org/D113496	2021-11-09 11:34:25 -08:00
Nikita Popov	1376301c87	[InstCombine] Canonicalize range test idiom InstCombine converts range tests of the form (X > C1 && X < C2) or (X < C1 \|\| X > C2) into checks of the form (X + C3 < C4) or (X + C3 > C4). It is possible to express all range tests in either of these forms (with different choices of constants), but currently neither of them is considered canonical. We may have equivalent range tests using either ult or ugt. This proposes to canonicalize all range tests to use ult. An alternative would be to canonicalize to either ult or ugt depending on the specific constants involved -- e.g. in practice we currently generate ult for && style ranges and ugt for \|\| style ranges when going through the insertRangeTest() helper. In fact, the "clamp like" fold was relying on this, which is why I had to tweak it to not assume whether inversion is needed based on just the predicate. Proof: https://alive2.llvm.org/ce/z/_SP_rQ Differential Revision: https://reviews.llvm.org/D113366	2021-11-08 21:15:46 +01:00
Yonghong Song	41860e602a	BPF: Support btf_type_tag attribute A new kind BTF_KIND_TYPE_TAG is defined. The tags associated with a pointer type are emitted in their IR order as modifiers. For example, for the following declaration: int __tag1 * __tag1 __tag2 *g; The BTF type chain will look like VAR(g) -> __tag1 --> __tag2 -> pointer -> __tag1 -> pointer -> int In the above "->" means BTF CommonType.Type which indicates the point-to type. Differential Revision: https://reviews.llvm.org/D113222	2021-11-04 17:01:36 -07:00
Yonghong Song	f63405f6e3	BPF: Workaround an InstCombine ICmp transformation with llvm.bpf.compare builtin Commit `acabad9ff6` ("[InstCombine] try to canonicalize icmp with trunc op into mask and cmp") added a transformation to convert "(conv)a < power_2_const" to "a & <const>" in certain cases and bpf kernel verifier has to handle the resulted code conservatively and this may reject otherwise legitimate program. This commit tries to prevent such a transformation. A bpf backend builtin llvm.bpf.compare is added. The ICMP insn, which is subject to above InstCombine transformation, is converted to the builtin function. The builtin function is later lowered to original ICMP insn, certainly after InstCombine pass. With this change, all affected bpf strobemeta* selftests are passed now. Differential Revision: https://reviews.llvm.org/D112938	2021-11-01 14:46:20 -07:00
Yonghong Song	0472e83ffc	BPF: emit BTF_KIND_DECL_TAG for typedef types If a typedef type has __attribute__((btf_decl_tag("str"))) with bpf target, emit BTF_KIND_DECL_TAG for that type in the BTF. Differential Revision: https://reviews.llvm.org/D112259	2021-10-21 12:09:42 -07:00
Yonghong Song	cd40b5a712	BPF: set .BTF and .BTF.ext section alignment to 4 Currently, .BTF and .BTF.ext has default alignment of 1. For example, $ cat t.c int foo() { return 0; } $ clang -target bpf -O2 -c -g t.c $ llvm-readelf -S t.o ... Section Headers: [Nr] Name Type Address Off Size ES Flg Lk Inf Al ... [ 7] .BTF PROGBITS 0000000000000000 000167 00008b 00 0 0 1 [ 8] .BTF.ext PROGBITS 0000000000000000 0001f2 000050 00 0 0 1 But to have no misaligned data access, .BTF and .BTF.ext actually requires alignment of 4. Misalignment is not an issue for architecture like x64/arm64 as it can handle it well. But some architectures like mips may incur a trap if .BTF/.BTF.ext is not properly aligned. This patch explicitly forced .BTF and .BTF.ext alignment to be 4. For the above example, we will have [ 7] .BTF PROGBITS 0000000000000000 000168 00008b 00 0 0 4 [ 8] .BTF.ext PROGBITS 0000000000000000 0001f4 000050 00 0 0 4 Differential Revision: https://reviews.llvm.org/D112106	2021-10-19 16:26:01 -07:00
Yonghong Song	009f3a89d8	BPF: remove intrindics @llvm.stacksave() and @llvm.stackrestore() Paul Chaignon reported a bpf verifier failure ([1]) due to using non-ABI register R11. For the test case, llvm11 is okay while llvm12 and later generates verifier unfriendly code. The failure is related to variable length array size. The following mimics the variable length array definition in the test case: struct t { char a[20]; }; void foo(void *); int test() { const int a = 8; char tmp[AA + sizeof(struct t) + a]; foo(tmp); ... } Paul helped bisect that the following llvm commit is responsible: `552c6c2328` ("PR44406: Follow behavior of array bound constant folding in more recent versions of GCC.") Basically, before the above commit, clang frontend did constant folding for array size "AA + sizeof(struct t) + a" to be 68, so used alloca for stack allocation. After the above commit, clang frontend didn't do constant folding for array size any more, which results in a VLA and llvm.stacksave/llvm.stackrestore is generated. BPF architecture API does not support stack pointer (sp) register. The LLVM internally used R11 to indicate sp register but it should not be in the final code. Otherwise, kernel verifier will reject it. The early patch ([2]) tried to fix the issue in clang frontend. But the upstream discussion considered frontend fix is really a hack and the backend should properly undo llvm.stacksave/llvm.stackrestore. This patch implemented a bpf IR phase to remove these intrinsics unconditionally. If eventually the alloca can be resolved with constant size, r11 will not be generated. If alloca cannot be resolved with constant size, SelectionDag will complain, the same as without this patch. [1] https://lore.kernel.org/bpf/20210809151202.GB1012999@Mem/ [2] https://reviews.llvm.org/D107882 Differential Revision: https://reviews.llvm.org/D111897	2021-10-18 09:51:19 -07:00
Yonghong Song	1321e47298	BPF: rename BTF_KIND_TAG to BTF_KIND_DECL_TAG Per discussion in https://reviews.llvm.org/D111199, the existing btf_tag attribute will be renamed to btf_decl_tag. This patch updated BTF backend to use btf_decl_tag attribute name and also renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG. Differential Revision: https://reviews.llvm.org/D111592	2021-10-11 21:33:39 -07:00
Yonghong Song	ea72b0319d	BPF: make 32bit register spill with 64bit alignment In llvm, for non-alu32 mode, the stack alignment is 64bit so only one 64bit spill per 64bit slot. For alu32 mode, the stack alignment is 32bit, so it is possible to have two 32bit spills per 64bit slot. Currently, bpf kernel verifier does not preserve register states for 32bit spills. That is, one 32bit register may hold a constant value or a bounded range before spill. After reload from the stack, the information is lost and sometimes this may cause verifier failure. For 64bit register spill, the verifier indeed tries to preserve the register state for reloading. The current verifier can be modestly changed to handle one 32bit spill per 64bit stack slot with state-preserving reload. Handling two 32bit spills per 64bit stack slot will require substantial changes. This patch changes stack alignment for alu32 to be 64bit. This way, for any 64bit slot in alu32 mode, only one 32bit or 64bit register values can be saved. Together with previous-mentioned verifier enhancement, 32bit spill can be handled with state preserving. Note that llvm stack slot coallescing seems only doing adjacent packing which may leave some holes in the stack. For example, stack slot 8 <== 8 bytes stack slot 4 <== 8 bytes with 4 byte hole stack slot 8 <== 8 bytes stack slot 4 <== 4 bytes Differential Revision: https://reviews.llvm.org/D109073	2021-09-20 21:00:25 -07:00
Nikita Popov	90ec6dff86	[OpaquePtr] Forbid mixing typed and opaque pointers Currently, opaque pointers are supported in two forms: The -force-opaque-pointers mode, where all pointers are opaque and typed pointers do not exist. And as a simple ptr type that can coexist with typed pointers. This patch removes support for the mixed mode. You either get typed pointers, or you get opaque pointers, but not both. In the (current) default mode, using ptr is forbidden. In -opaque-pointers mode, all pointers are opaque. The motivation here is that the mixed mode introduces additional issues that don't exist in fully opaque mode. D105155 is an example of a design problem. Looking at D109259, it would probably need additional work to support mixed mode (e.g. to generate GEPs for typed base but opaque result). Mixed mode will also end up inserting many casts between i8* and ptr, which would require significant additional work to consistently avoid. I don't think the mixed mode is particularly valuable, as it doesn't align with our end goal. The only thing I've found it to be moderately useful for is adding some opaque pointer tests in between typed pointer tests, but I think we can live without that. Differential Revision: https://reviews.llvm.org/D109290	2021-09-10 15:18:23 +02:00
Yonghong Song	e52617c31d	BPF: change BTF_KIND_TAG format Previously we have the following binary representation: struct bpf_type { name, info, type } struct btf_tag { __u32 component_idx; } If the tag points to a struct/union/var/func type, we will have kflag = 1, component_idx = 0 if the tag points to struct/union member or func argument, we will have kflag = 0, component_idx = 0, ..., vlen - 1 The above rather makes interface complex to have both kflag and component needed to determine its legality and index. This patch simplifies the interface by removing kflag involvement. component_idx = (u32)-1 : tag pointing to a type component_idx = 0 ... vlen - 1 : tag pointing to a member or argument and kflag is always 0 and there is no need to check. Differential Revision: https://reviews.llvm.org/D109560	2021-09-09 19:03:57 -07:00
Yonghong Song	4948927058	[BPF] support btf_tag attribute in .BTF section A new kind BTF_KIND_TAG is added to .BTF to encode btf_tag attributes. The format looks like CommonType.name : attribute string CommonType.type : attached to a struct/union/func/var. CommonType.info : encoding BTF_KIND_TAG kflag == 1 to indicate the attribute is for CommonType.type, or kflag == 0 for struct/union member or func argument. one uint32_t : to encode which member/argument starting from 0. If one particular type or member/argument has more than one attribute, multiple BTF_KIND_TAG will be generated. Differential Revision: https://reviews.llvm.org/D106622	2021-08-28 21:02:27 -07:00
Yonghong Song	e52946b9ab	BPF: avoid NE/EQ loop exit condition Kuniyuki Iwashima reported in [1] that llvm compiler may convert a loop exit condition with "i < bound" to "i != bound", where "i" is the loop index variable and "bound" is the upper bound. In case that "bound" is not a constant, verifier will always have "i != bound" true, which will cause verifier failure since to verifier this is an infinite loop. The fix is to avoid transforming "i < bound" to "i != bound". In llvm, the transformation is done by IndVarSimplify pass. The compiler checks loop condition cost (i = i + 1) and if the cost is lower, it may transform "i < bound" to "i != bound". This patch implemented getArithmeticInstrCost() in BPF TargetTransformInfo class to return a higher cost for such an operation, which will prevent the transformation for the test case added in this patch. [1] https://lore.kernel.org/netdev/1994df05-8f01-371f-3c3b-d33d7836878c@fb.com/ Differential Revision: https://reviews.llvm.org/D107483	2021-08-04 16:54:16 -07:00
Nikita Popov	be5af50e7d	[BPF] Use elementtype attribute for preserve.array/struct.index intrinsics Use the elementtype attribute introduced in D105407 for the llvm.preserve.array/struct.index intrinsics. It carries the element type of the GEP these intrinsics effectively encode. This patch: * Adds a verifier check that the attribute is required. * Adds it in the IRBuilder methods for these intrinsics. * Autoupgrades old bitcode without the attribute. * Updates the lowering code to use the attribute rather than the pointer element type. * Updates lots of tests to specify the attribute. * Adds -force-opaque-pointers to the intrinsic-array.ll test to demonstrate they work now. https://reviews.llvm.org/D106184	2021-07-17 11:09:18 +02:00
Simon Pilgrim	d561b6fbdb	[InstCombine] Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) (PR50183) (REAPPLIED) As discussed on PR50183, we already fold to prefer 'select-of-idx' vs 'select-of-gep': define <4 x i32>* @select0a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) { %gep0 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1 %gep1 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a3 %sel = select i1 %a2, <4 x i32>* %gep0, <4 x i32>* %gep1 ret <4 x i32>* %sel } --> define <4 x i32>* @select1a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) { %sel = select i1 %a2, i64 %a1, i64 %a3 %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel ret <4 x i32>* %gep } This patch adds basic handling for the 'fallthrough' cases where the gep idx == 0 has been folded away to the base address: define <4 x i32>* @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) { %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1 %sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep ret <4 x i32>* %sel } --> define <4 x i32>* @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) { %sel = select i1 %a2, i64 0, i64 %a1 %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel ret <4 x i32>* %gep } Reapplied with a fix for the bpf "-bpf-disable-avoid-speculation" tests Differential Revision: https://reviews.llvm.org/D105901	2021-07-14 12:21:01 +01:00
Yonghong Song	6a2ea84600	BPF: Add more relocation kinds Currently, BPF only contains three relocations: R_BPF_NONE for no relocation R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation R_BPF_64_32 for call insn and normal 32-bit data relocation Also .BTF and .BTF.ext sections contain symbols in allocated program and data sections. These two sections reserved 32bit space to hold the offset relative to the symbol's section. When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld may attempt to resolve relocations for .BTF and .BTF.ext, which we want to prevent. So we used R_BPF_NONE for such relocations. This all works fine until when we try to do linking of multiple objects. . R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data is different, so lld target->relocate() needs more context to do a correct job. . The same for R_BPF_64_32. More context is needed for lld target->relocate() to differentiate call insn vs. normal 32-bit data relocation. . Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE, they will not be relocated properly when multiple .BTF/.BTF.ext sections are merged by lld. This patch intends to address this issue by adding additional relocation kinds: R_BPF_64_ABS64 for normal 64-bit data relocation R_BPF_64_ABS32 for normal 32-bit data relocation R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations. The old R_BPF_64_{64,32} semantics: R_BPF_64_64 for LD_imm64 relocation R_BPF_64_32 for call insn relocation The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values is maintained. They are the most common use cases for bpf programs and we want to maintain backward compatibility as much as possible. ExecutionEngine RuntimeDyld BPF relocations are adjusted as well. R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and other relocations will be ignored. Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!" fatal error. FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp are removed as they are not triggered in BPF backend. BPF backend used FK_SecRel_8 for LD_imm64 instruction operands. Differential Revision: https://reviews.llvm.org/D102712	2021-05-25 08:19:13 -07:00
serge-sans-paille	4ab3041acb	Revert "[NFC] remove explicit default value for strboolattr attribute in tests" This reverts commit `bda6e5bee0`. See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance	2021-05-24 19:43:40 +02:00

1 2 3 4 5 ...

268 Commits