The tcMultiplyPart routine has a flag that says whether to add to the
accumulator or overwrite it. By using the overwrite mode on the first
iteration we don't need to initialize the accumulator to zero.
Note, the initialization in tcFullMultiply was only initializing the
first rhsParts of dst. tcMultiplyPart always overwrites the rhsParts+1
part that just contains the last carry. The first write to each part of
dst past rhsParts is a carry write so that's how the upper part of dst
is initialized.
Improvements include
* Enable `ListeningSocket::accept` to timeout after a specified amount
of time or block indefinitely
* Enable `ListeningSocket::createUnix` to handle instances where the
target socket address already exists and differentiate between
situations where the existing file does and does not already have a
bound socket
* Doxygen comments
Functionality added for the module build daemon
---------
Co-authored-by: Michael Spencer <bigcheesegs@gmail.com>
This allows the subtract to reuse the storage of T. T will be assigned
over by the condition on the next iteration. I think assigning over a
moved from value should be ok.
The commit f11b056c (#76304) breaks `clang` and other tools if they are
used from a RAMDrive. `GetFinalPathNameByHandleW()` may return 0 and
GetLastError 0x28. This patch fixes that issue. Note `real_path()` uses
`openFileForRead()` but it reports the error only if failed to open a
file. Getting `RealPath` is optional functionality.
BTW, `sys::fs::real_path()` resolves not only symlinks, but also network
drives and virtual drives created by the `subst` tool. It may break an
automation. It is better to detect symlinks and resolve only symlinks.
It's useful to have some significant build options visible in the
version when investigating problems with a specific compiler artifact.
This makes it easy to see if assertions, expensive checks, sanitizers,
etc. are enabled when checking a compiler version.
Example config line output:
Build configuration: +unoptimized, +assertions, +asan, +ubsan
All callers have been changed to use the new simpler overload with an
implicit modulus of 2^BitWidth. The old form was never used or tested
with non-power-of-two modulus anyway.
The current APInt::multiplicativeInverse takes a modulus which can be
any value, but all in-tree callers use a power of two. Moreover, most
callers want to use two to the power of the width of an existing APInt,
which is awkward because 2^N is not representable as an N-bit APInt.
Add a new overload of multiplicativeInverse which implicitly uses
2^BitWidth as the modulus.
Some support code, e.g. llvm/Support/Endian.h, uses
llvm::support::detail, but the format-related code uses llvm::detail. On
VS2019, when a C++ file includes both headers, a `detail::` from
`namespace llvm { ... }` becomes ambiguous.
44253a9c breaks TensorFlow and
[JAX](https://github.com/google/jax/actions/runs/8507773013/job/23300219405)
build because of this.
Since llvm::X::detail seems like a cleaner solution and is used in other
places as well (e.g. llvm::yaml::detail), we should probably migrate all
llvm::detail usages to llvm::X::detail.
I was profiling our compiler and noticed that `llvm::get_threadid` was
at the top of the hotlist, taking up a surprising 5% (7 seconds) in the
profile trace. It seems that computing this on MacOS systems is
non-trivial, so cache the result in a thread_local.
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
The color methods in formatted_raw_ostream were forwarding directly to
the underlying stream without considering existing buffered output. This
would cause incorrect colored output for buffered uses of
formatted_raw_ostream.
Fix this issue by applying the color to the formatted_raw_ostream itself
and temporarily disabling scanning of any color related output so as not
to affect the position tracking.
This fix means that workarounds that forced formatted_raw_ostream
buffering to be disabled can be removed. In the case of llvm-objdump,
this can improve disassembly performance when redirecting to a file by
more than an order of magnitude on both Windows and Linux. This
improvement restores the disassembly performance when redirecting to a
file to a level similar to before color support was added.
Fixes https://github.com/llvm/llvm-project/issues/83046
There is a race condition when calling `GetFileAttributesW` that can
cause it to return `ERROR_ACCESS_DENIED` on a path which exists, which
is unexpected for callers using this function to check for file
existence by passing `AccessMode::Exist`. This was manifesting as a
compiler crash on Windows downstream in the Swift compiler when using
the `-index-store-path` flag (more information in
https://github.com/apple/llvm-project/issues/8224).
I looked for alternate APIs to avoid bringing in `shlwapi.h`, but didn't
see any good candidates. I'm not tied at all to this solution, any
feedback and alternative approaches are more than welcome.
Defines a subset of attributes and emits them to a section called
.hexagon.attributes.
The current attributes recorded are the attributes needed by
llvm-objdump to automatically determine target features and eliminate
the need to manually pass features.
When we lookup in the external filesystem, do not remove . and ..
components from the original path. For .. this is a correctness issue in
the presence of symlinks, while for . it is simply better practice to
preserve the original path to better match the behaviour of other
filesystems. The only modification we need is to apply the working
directory, since it could differ from the external filesystem.
rdar://123655660
Removing the extension to FullWidth should make them much more efficient
in the 64-bit case, because 65-bit APInts use a separate allocation for
their bits.
Include the `LLVM_REPOSITORY` and `LLVM_REVISION` in the version output
of tools using `cl::PrintVersionMessage()` such as dwarfdump and
dsymutil.
Before:
```
$ llvm-dwarfdump --version
LLVM (http://llvm.org/):
LLVM version 19.0.0git
Optimized build with assertions.
```
After:
```
$ llvm-dwarfdump --version
LLVM (http://llvm.org/):
LLVM version 19.0.0git (git@github.com:llvm/llvm-project.git 8467457afc)
Optimized build with assertions.
```
rdar://121526866
These were in LLVM 17 but removed from LLVM 18 due to an incorrect
extension name being used.
This restores them with new extension names that match SiFive's
downstream compiler. The extension name has been used internally for
some time. It uses XSiFive instead of XSf like the newer extensions.
`cease` did not have an internal extension name so its using the `XSf`
convention.
The spec for the instructions is here
https://sifive.cdn.prismic.io/sifive/767804da-53b2-4893-97d5-b7c030ae0a94_s76mc_core_complex_manual_21G3.pdf
though the extension name is not listed.
Column width in the extension printing had to be changed to accommodate
a longer extension name.
When I created KnownBits::absdiff, I totally missed that we already have ISD::ABDS/ABDU nodes, and we use this term in other places/targets as well.
I've added the KnownBits::abds implementation and renamed KnownBits::absdiff to KnownBits::abdu.
Followup to #84791
The exact flag basically allows us to set an upper bound on shift
amount when we have a known 1 in `LHS`.
Typically we deduce exact using knownbits (on non-exact incoming
shifts), so this is particularly impactful, but may be useful in some
circumstances.
Closes#84254
Added --offload-compression-level= option to clang and
-compression-level=
option to clang-offload-bundler for controlling compression level.
Added support of long distance matching (LDM) for llvm::zstd which is
off
by default. Enable it for clang-offload-bundler by default since it
improves compression rate in general.
Change default compression level to 3 for zstd for clang-offload-bundler
since it works well for bundle entry size from 1KB to 32MB, which should
cover most of the clang-offload-bundler usage. Users can still specify
compression level by -compression-level= option if necessary.
LLVM is inconsistent about how it converts `errno` to `std::error_code`.
This can cause problems because values outside of `std::errc` compare
differently if one is system and one is generic on POSIX systems.
This is even more of a problem on Windows where use of the system
category is just wrong, as that is for Windows errors, which have a
completely different mapping than POSIX/generic errors. This patch fixes
one instance of this mistake in `JSONTransport.cpp`.
This patch adds `errnoAsErrorCode()` which makes it so people do not
need to think about this issue in the future. It also cleans up a lot of
usage of `errno` in LLVM and Clang.
On many current Linux systems, coredumps are no longer dumped in the CWD
but instead piped to a utility such as systemd-coredumpd that stores
them in a deterministic location. This can be done by setting the
kernel.core_pattern sysctl to start with a '|'. However, when using such
a setup the kernel ignores a coredump limit of 0 (since there is no file
being written) and we can end up piping many gigabytes of data to
systemd-coredumpd which causes the test suite to freeze for a long time.
While most piped coredump handlers do respect the crashing processes'
RLIMIT_CORE, this is notable not the case for Debian's systemd-coredump
due to a local patch that changes sysctl.d/50-coredump.conf to ignore
the specified limit and instead use RLIM_INFINITY
(https://salsa.debian.org/systemd-team/systemd/-/commit/64599ffe44f0d).
Fortunately there is a workaround: the kernel recognizes the magic value
of 1 for RLIMIT_CORE to disable coredumps when piping. One byte is also
too small to generate any coredump, so it effectively behaves as if we
had set the value to zero.
The alternative to using RLIMIT_CORE=1 would be to use prctl() with the
PR_SET_DUMPABLE flag, however that also prevents ptrace(), so makes it
impossible to attach a debugger.
See https://github.com/llvm/llvm-project/pull/83701 and
https://github.com/llvm/llvm-project/issues/45797
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/83703
Relands #81708, which was reverted by
f410f74cd5, now with a corrected unit
test. Origionally the test failed on Windows when run with lit as
`GetConsoleWindow` could not retrieve a window handle regardless of
whether `DetachProcess` was `true` or `false`. The test now uses
`GetStdHandle(STD_OUTPUT_HANDLE)` which does not rely on a console
window existing. Original commit message below.
Adds a new parameter, `bool DetachProcess` with a default option of
`false`, to `llvm::sys::ExecuteNoWait`, which, when set to `true`,
executes the specified program without a controlling terminal.
Functionality added so that the module build daemon can be run without a
controlling terminal.
The base class llvm::ThreadPoolInterface will be renamed
llvm::ThreadPool in a subsequent commit.
This is a breaking change: clients who use to create a ThreadPool must
now create a DefaultThreadPool instead.
If the "march" has some extension with version that is not supported, it
returns the error message like: "error: invalid arch name 'some_arch',
unsupported version number 2.0 for extension 'some_arch'", which is not
precise enough, it should return the message that only tells users "the
extension is not supported".
Add functionality to APInt::toString() that allows it to insert
separators between groups of digits, using the C++ literal
separator ' between groups.
Fixes issue #58228
Reviewers: @AaronBallman, @cjdb, @tbaederr
As per the comment in `raw_fd_ostream`'s destructor, you must call
`clear_error()` to prevent a call to `report_fatal_error()`. There's not
really a way to test this, but we did encounter it in the wild.
rdar://117347895
Previously, the function `install_bad_alloc_error_handler` was asserting that a bad_alloc error handler was not already installed. However, it was actually checking for the fatal-error handler, not for the bad_alloc error handler, causing an assertion failure if a fatal-error handler was already installed.
Fixes#83040
In a04879ce7d, dsymutil switched from using TempFile to
createTemporaryFile. This caused a regression because the two use
different permissions:
- TempFile opens the file as all_read | all_write
- createTemporaryFile opens the file as owner_read | owner_write
The latter turns out to be problematic for dsymutil because it either
promotes the temporary to a proper output file, or it would pass it to
`lipo` to create a universal binary and `lipo` preserves the permissions
of the input files. Either way, this caused issues when the build system
was run as a different user than the one ingesting the resulting
binaries.
I did some version control archeology and I couldn't find evidence that
these permissions were chosen purposely. Both could be considered
reasonable default.
This patch changes the permissions to `all read | all write` to make the
two consistent and match the one currently used by the higher level
abstraction (TempFile).
rdar://123722848
Primary change is to add a flag `--pretty-pgo-analysis-map` to
llvm-readobj and llvm-objdump that prints block frequencies and branch
probabilities in the same manner as BFI and BPI respectively. This can
be helpful if you are manually inspecting the outputs from the tools.
In order to print, I moved the `printBlockFreqImpl` function from
Analysis to Support and renamed it to `printRelativeBlockFreq`.