Summary:
[llvm][TextAPI] adding inlining reexported libraries support
* this patch adds reader/writer support for MachO tbd files.
The usecase is to represent reexported libraries in top level library
that won't need to exist for linker indirection because all of the
needed content will be inlined in the same document.
Reviewers: ributzka, steven_wu, jhenderson
Reviewed By: ributzka
Subscribers: JDevlieghere, hiraditya, mgrang, dexonsmith, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67646
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77171
This patch adds
- New arguments to getMinPrefetchStride() to let the target decide on a
per-loop basis if software prefetching should be done even with a stride
within the limit of the hw prefetcher.
- New TTI hook enableWritePrefetching() to let a target do write prefetching
by default (defaults to false).
- In LoopDataPrefetch:
- A search through the whole loop to gather information before emitting any
prefetches. This way the target can get information via new arguments to
getMinPrefetchStride() and emit prefetches more selectively. Collected
information includes: Does the loop have a call, how many memory
accesses, how many of them are strided, how many prefetches will cover
them. This is NFC to before as long as the target does not change its
definition of getMinPrefetchStride().
- If a previous access to the same exact address was 'read', and the
current one is 'write', make it a 'write' prefetch.
- If two accesses that are covered by the same prefetch do not dominate
each other, put the prefetch in a block that dominates both of them.
- If a ConstantMaxTripCount is less than ItersAhead, then skip the loop.
- A SystemZ implementation of getMinPrefetchStride().
Review: Ulrich Weigand, Michael Kruse
Differential Revision: https://reviews.llvm.org/D70228
This removes some includes/forward-declarations that don't seem to be necessary in the MCA core headers
Based off a cppclean report
Differential Revision: https://reviews.llvm.org/D77073
Different file formats have different naming style for the debug
sections. The method is implemented for ELF, COFF and Mach-O formats.
Differential Revision: https://reviews.llvm.org/D76276
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Differential Revision: https://reviews.llvm.org/D77112
If we have a must-tail call the callee and caller need to have matching
ABIs. Part of that is alignment which we might modify when we deduce
alignment of arguments of either. Since we would need to keep them in
sync, which is not as simple, we simply avoid deducing alignment for
arguments of the must-tail caller or callee.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D76673
We create a lot of AbstractAttributes and they live as long as
the Attributor does. It seems reasonable to allocate them via a
BumpPtrAllocator owned by the Attributor.
Reviewed By: baziotis
Differential Revision: https://reviews.llvm.org/D76589
Summary:
Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised
with GetPromotedInteger, which leaves the upper bits of the value
undefind. Since this is used for comparing in an LR/SC loop with a
full-width comparison, we must sign extend it. We introduce a new
getExtendForAtomicCmpSwapArg to complement getExtendForAtomicOps, since
many targets have compare-and-swap instructions (or pseudos) that
correctly handle an any-extend input, and the existing function
determines the extension of the result, whereas we are concerned with
the input.
This is related to https://reviews.llvm.org/D58829, which solved the
issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler
ATOMIC_CMP_SWAP.
Reviewers: asb, lenary, efriedma
Reviewed By: asb
Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74453
SUMMARY:
run clang format on the file llvm/include/llvm/MC/MCDirectives.h
Reviewers: Jason liu
Subscribers: rupprecht, seiyai,hiraditya
Differential Revision: https://reviews.llvm.org/D77170
getCallCost is only used within the different layers of TTI, with no
backend implementing it so fold the base implementation into
getUserCost. I think this is an NFC.
Differential Revision: https://reviews.llvm.org/D77050
Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template
Reviewers: spatel, efriedma, RKSimon
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D77183
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.
This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.
I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors. Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it. Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.
Differential Revision: https://reviews.llvm.org/D72467
Summary:
CGProfilePass is run by default in certain new pass manager optimization pipeline. Assemblers other than llvm as (such as gnu as) cannot recognize the .cgprofile entries generated and emitted from this pass, causing build time error.
This patch adds new options in clang CodeGenOpts and PassBuilder options so that we can turn cgprofile off when not using integrated assembler.
Reviewers: Bigcheese, xur, george.burgess.iv, chandlerc, manojgupta
Reviewed By: manojgupta
Subscribers: manojgupta, void, hiraditya, dexonsmith, llvm-commits, tcwang, llozano
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D62627
Summary: this patch preserve information from various places in EarlyCSE into assume bundles.
Reviewers: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76769
--no-threads is a name copied from gold.
gold has --no-thread, --thread-count and several other --thread-count-*.
There are needs to customize the number of threads (running several lld
processes concurrently or customizing the number of LTO threads).
Having a single --threads=N is a straightforward replacement of gold's
--no-threads + --thread-count.
--no-threads is used rarely. So just delete --no-threads instead of
keeping it for compatibility for a while.
If --threads= is specified (ELF,wasm; COFF /threads: is similar),
--thinlto-jobs= defaults to --threads=,
otherwise all available hardware threads are used.
There is currently no way to override a --threads={1,2,...}. It is still
a debate whether we should use --threads=all.
Reviewed By: rnk, aganea
Differential Revision: https://reviews.llvm.org/D76885
This is the Waymarking algorithm implemented as an independent utility.
The utility is operating on a range of sequential elements.
First we "tag" the elements, by calling `fillWaymarks`.
Then we can "follow" the tags from every element inside the tagged
range, and reach the "head" (the first element), by calling
`followWaymarks`.
Differential Revision: https://reviews.llvm.org/D74415
This patch updates ValueLattice to distinguish between ranges that are
guaranteed to not include undef and ranges that may include undef.
A constant range guaranteed to not contain undef can be used to simplify
instructions to arbitrary values. A constant range that may contain
undef can only be used to simplify to a constant. If the value can be
undef, it might take a value outside the range. For example, consider
the snipped below
define i32 @f(i32 %a, i1 %c) {
br i1 %c, label %true, label %false
true:
%a.255 = and i32 %a, 255
br label %exit
false:
br label %exit
exit:
%p = phi i32 [ %a.255, %true ], [ undef, %false ]
%f.1 = icmp eq i32 %p, 300
call void @use(i1 %f.1)
%res = and i32 %p, 255
ret i32 %res
}
In the exit block, %p would be a constant range [0, 256) including undef as
%p could be undef. We can use the range information to replace %f.1 with
false because we remove the compare, effectively forcing the use of the
constant to be != 300. We cannot replace %res with %p however, because
if %a would be undef %cond may be true but the second use might not be
< 256.
Currently LazyValueInfo uses the new behavior just when simplifying AND
instructions and does not distinguish between constant ranges with and
without undef otherwise. I think we should address the remaining issues
in LVI incrementally.
Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D76931
Add a new llvm.amdgcn.ballot intrinsic modeled on the ballot function
in GLSL and other shader languages. It returns a bitfield containing the
result of its boolean argument in all active lanes, and zero in all
inactive lanes.
This is intended to replace the existing llvm.amdgcn.icmp and
llvm.amdgcn.fcmp intrinsics after a suitable transition period.
Use the new intrinsic in the atomic optimizer pass.
Differential Revision: https://reviews.llvm.org/D65088
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
Compbinary format uses MD5 to represent strings in name table. That gives smaller profile without the need of compression/decompression when writing/reading the profile. The patch adds the support in extbinary format. It is off by default but user can choose to enable it.
Note the feature of using MD5 in name table can bring very small chance of name conflict leading to profile mismatch. Besides, profile using the feature won't have the profile remapping support.
Differential Revision: https://reviews.llvm.org/D76255
When we have
```
a = G_OR x, x
```
or
```
b = G_AND y, y
```
We can drop the G_OR/G_AND and just use x/y respectively.
Also update arm64-fallback.ll because there was an or in there which hits this
transformation.
Differential Revision: https://reviews.llvm.org/D77105
Implement identity combines for operations like the following:
```
%a = G_SUB %b, 0
```
This can just be replaced with %b.
Over CTMark, this gives some minor size improvements at -O3.
Differential Revision: https://reviews.llvm.org/D76640
Summary:
This code was throwing away the opcode for a boolean, which was then
reconstructing the opcode from that boolean. Just pass the opcode, and
forget the boolean.
Reviewers: srhines
Reviewed By: srhines
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77100
Extend the FileCollector's API with addDirectory which adds a directory
and its contents to the VFS mapping.
Differential revision: https://reviews.llvm.org/D76671
Also add a test case to wasm-ld that asserts without this change.
Internally wasm-ld builds a StringMap of exported functions and it seems
like allowing empty string in the set is preferable to adding checks.
This assert looks like it was most likely just a historical accident.
It started life here purely to support InputLanguagesSet:
eeac27e38c
Then got extracted here:
e57a403338
Then got moved to AST here
5c48bae209
With the `InLang` paramater name still intact which suggested is
InputLanguagesSet origins.
Differential Revision: https://reviews.llvm.org/D74589
Summary:
Code frequently relies upon the results of "is.constant" intrinsics to
DCE invalid code paths. We don't want the intrinsic to be made control-
dependent on any additional values. For instance, we can't split a PHI
into a "constant" and "non-constant" part via jump threading in order
to "optimize" the constant part, because the "is.constant" intrinsic is
meant to return "false".
Reviewers: wmi, kazu, MaskRay
Reviewed By: kazu
Subscribers: jdoerfert, efriedma, joerg, lebedev.ri, nikic, xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75799