Previously, the SparseTensorUtils.cpp library contained a C++ core implementation but hid it in an anonymous namespace and exposed only a C-API for accessing it. We are now factoring that C++ core out into a standalone C++ library so that downstream clients can use it directly (per the request of one such client). This refactoring has been decomposed into a stack of differentials to simplify code review; however, the full stack of changes should be considered together.
* D133462: Part 1: split one file into several
* (this): Part 2: Reorder chunks within files
* D133831: Part 3: General code cleanup
* D133833: Part 4: Update documentation
This part moves chunks of code within files, but again aims to make no other changes. Many of these movements are part of a stylistic shift to reorder the components of class definitions as follows: data members, ctors/factories, getters, other public methods, private methods.
Depends On D133462
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D133830
Previously, the SparseTensorUtils.cpp library contained a C++ core implementation but hid it in an anonymous namespace and exposed only a C-API for accessing it. We are now factoring that C++ core out into a standalone C++ library so that downstream clients can use it directly (per the request of one such client). This refactoring has been decomposed into a stack of differentials to simplify code review; however, the full stack of changes should be considered together.
* (this): Part 1: split one file into several
* D133830: Part 2: Reorder chunks within files
* D133831: Part 3: General code cleanup
* D133833: Part 4: Update documentation
This part aims to make no changes other than the 1:N file splitting, and things which are forced to accompany that change.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D133462
Instead of using find_package(HIP) to find FindHIP.cmake, which
doesn't seem to be the preferred way to find HIP anymore, use
find_package(hip CONFIG) to find the HIP configuration. Give
preference to ${ROCM_PATH} over ${ROCM_PATH}/hip in order to handle
the fact that newer ROCm versions prefer the include path to use
${ROCM_PATH}/include/hip over ${ROCM_PATH}/hip/include/hip (the
latter throws up a bunch of deprecation warnings).
Then, instead of trying to find the host-side headers and
runtime library by hand, use the hip::host and hip::amdhip64 libraries
that the config module defines.
This makes the CMake config much less error-prone and brings it in
line with the recommended approach to finding HIP.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D134753
The push/pop context APIs are deprecated in HIP, and keeping the
default device set is handled in HIP using hipSetDevice().
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D134747
With the CMake file as written, if code elsewhere had set ROCM_PATH,
then HIP_PATH would not be set, breaking the rest of the ROCm
execution utility handling.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D134674
A number of mlir tests `FAIL` on Solaris/sparcv9 with `Target has no JIT
support`. This patch fixes that by mimicking `clang/test/lit.cfg.py`, which
implements a `host-supports-jit` keyword for this. The gtest-based unit
tests don't support `REQUIRES:`, so lack of support needs to be hardcoded
there.
Tested on `amd64-pc-solaris2.11` (`check-mlir` results unchanged) and
`sparcv9-sun-solaris2.11` (only one unrelated failure left).
Differential Revision: https://reviews.llvm.org/D131151
c7ec6e19d5 made LLVM adhere to the x86
psABI and pass bf16 in SSE registers instead of GPRs. This breaks the
custom versions of runtime functions we have for bf16 conversion. A
great fix for this would be to use __bf16 types instead which carry the
right ABI, but that type isn't widely available.
Instead, just pretend it's a 32-bit float on the ABI boundary and
carefully cast it to the right type.
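As a rough illustration of the workaround, the sketch below truncates a float to its bf16 bit pattern and then carries those 16 bits inside a 32-bit float, so that the value would cross the ABI boundary the way a float does. The helper names `floatToBF16Bits`, `packBF16AsFloat`, and `unpackBF16FromFloat` are hypothetical, and plain truncation is used instead of proper rounding; this is not the actual runtime code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: keep the upper 16 bits of the IEEE-754 float.
// Plain truncation only; a real conversion would also round and handle NaN.
static uint16_t floatToBF16Bits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return static_cast<uint16_t>(bits >> 16);
}

// Hypothetical helper: store the 16 bf16 bits in the first two bytes of a
// float-sized carrier, so the value is passed like a 32-bit float.
static float packBF16AsFloat(uint16_t bf) {
  float carrier = 0.0f;
  std::memcpy(&carrier, &bf, sizeof(bf));
  return carrier;
}

// Hypothetical helper: read the bf16 bits back out of the carrier.
static uint16_t unpackBF16FromFloat(float carrier) {
  uint16_t bf;
  std::memcpy(&bf, &carrier, sizeof(bf));
  return bf;
}
```

Using `std::memcpy` for every reinterpretation keeps the casts well-defined, which is the "carefully cast" part of the approach.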
Fixes #57042
Using if (TARGET ${LLVM_NATIVE_ARCH}) only works if MLIR is built
together with LLVM, but not for standalone builds of MLIR. The
correct way to check this is
if (${LLVM_NATIVE_ARCH} IN_LIST LLVM_TARGETS_TO_BUILD), as the
LLVM build system exports LLVM_TARGETS_TO_BUILD.
To avoid repeating the same check many times, add a
MLIR_ENABLE_EXECUTION_ENGINE variable.
Differential Revision: https://reviews.llvm.org/D131071
aligned_alloc was added in macOS 10.15, but some users want to support older
versions. The runtime functions make this easy, so just put in a call
to posix_memalign, which provides the same functionality.
These functions don't depend on the C++ runtime and therefore belong to
CRunnerUtils. Clean up the macros on the way as `_MSC_VER` indicates the
compiler, not the platform, which is indicated by `_WIN32` and will be
present when, e.g., compiling with MinGW.
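A minimal sketch of the idea, assuming a wrapper pair named `alignedAllocCompat` / `alignedFreeCompat` (hypothetical names, not the actual runtime entry points): use `posix_memalign` where it is available and branch on `_WIN32` (the platform) rather than `_MSC_VER` (the compiler) for the Windows path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#ifdef _WIN32
#include <malloc.h> // _aligned_malloc / _aligned_free
#endif

// Hypothetical wrapper: allocate `size` bytes aligned to `alignment`
// (assumed to be a power of two). Returns nullptr on failure.
void *alignedAllocCompat(std::size_t alignment, std::size_t size) {
#ifdef _WIN32
  return _aligned_malloc(size, alignment);
#else
  // posix_memalign requires the alignment to be a multiple of sizeof(void *).
  if (alignment < sizeof(void *))
    alignment = sizeof(void *);
  void *ptr = nullptr;
  if (posix_memalign(&ptr, alignment, size) != 0)
    return nullptr;
  return ptr;
#endif
}

// Hypothetical wrapper: release memory obtained from alignedAllocCompat.
void alignedFreeCompat(void *ptr) {
#ifdef _WIN32
  _aligned_free(ptr);
#else
  free(ptr);
#endif
}
```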
Reviewed By: rdzhabarov
Differential Revision: https://reviews.llvm.org/D130025
When converted to the LLVM dialect, the memref.alloc and memref.free operations generated calls to the hardcoded 'malloc' and 'free' functions, which left users no way to provide a custom implementation. Those operations now convert into calls to '_mlir_alloc' and '_mlir_free' functions, which have also been implemented in the runtime support library as wrappers around 'malloc' and 'free'. The same has been done for the 'aligned_alloc' function.
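To illustrate what a user-supplied replacement might look like, the sketch below defines '_mlir_alloc' and '_mlir_free' as thin wrappers over 'malloc' and 'free' that also count live allocations. The function names come from the description above; the exact signatures and the `liveAllocations` counter are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping, only to show that a custom implementation can
// do more than forward to malloc/free.
static long liveAllocations = 0;

// Assumed signature: same shape as malloc.
extern "C" void *_mlir_alloc(std::size_t size) {
  void *ptr = std::malloc(size);
  if (ptr)
    ++liveAllocations;
  return ptr;
}

// Assumed signature: same shape as free.
extern "C" void _mlir_free(void *ptr) {
  if (ptr)
    --liveAllocations;
  std::free(ptr);
}
```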
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D128791
Building MLIR currently produces a lot of warnings.
This change fixes the warnings listed below.
.../SparseTensorUtils.cpp: In instantiation of ‘...::openSparseTensorCOO(...) [with ...]’:
.../SparseTensorUtils.cpp:1672:3: required from here
.../SparseTensorUtils.cpp:87:21: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘PrimaryType’ [-Wformat=]
.../OptUtils.cpp:36:5: warning: this statement may fall through [-Wimplicit-fallthrough=]
.../AffineOps.cpp:1741:32: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
Reviewed By: aartbik, wrengr, aeubanks
Differential Revision: https://reviews.llvm.org/D128993
This adds weak versions of the truncation libcalls in case the runtime
environment doesn't have them.
Differential Revision: https://reviews.llvm.org/D128091
This fixes all sorts of ABI issues caused by passing by value
(memrefs are now passed exclusively by reference).
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D128018
Support complex numbers for Matrix Market Exchange Formats. Add a test case.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D127138
This is the first PR to add `F16` and `BF16` support to the sparse codegen. Some problems remain in supporting these two data types; for example, `BF16` is not quite working yet.
Add test cases.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D127010
The previous macro definition using `{...}` would fail to compile when the callsite uses a semicolon followed by an else-statement (i.e., `if (...) FATAL(...); else ...;`). Replacing the simple braces with `do{...}while(0)` (n.b., semicolon not included in the macro definition) enables callsites to use the semicolon plus else-statement syntax without problems. The new definition now requires the semicolon at all callsites, but since it was already being called that way nothing changes.
For more explanation, see <https://gcc.gnu.org/onlinedocs/cpp/Swallowing-the-Semicolon.html>
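The pattern can be sketched as follows. `LOG_ERROR`, `errorCount`, and `classify` are hypothetical stand-ins (the real macro is `FATAL`, whose body is not shown here); the point is only that the `do { ... } while (0)` wrapper lets the call site write `LOG_ERROR(...);` directly before an `else`.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical counter so the effect of the macro is observable.
static int errorCount = 0;

// Hypothetical stand-in for FATAL. The trailing semicolon is deliberately
// left out of the definition; the call site supplies it.
#define LOG_ERROR(...)                                                         \
  do {                                                                         \
    std::fprintf(stderr, __VA_ARGS__);                                         \
    ++errorCount;                                                              \
  } while (0)

// With a plain `{ ... }` macro body, the `;` after LOG_ERROR(...) would end
// the if-statement and the `else` below would fail to compile.
int classify(int x) {
  if (x < 0)
    LOG_ERROR("negative input: %d\n", x);
  else
    return 1;
  return -1;
}
```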
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D126514
The primary goal of this change is to define readSparseTensorShape; the SparseTensorFile class is merely introduced along the way to reduce code duplication.
Depends On D126106
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D126233
The semicolons were introduced in D126105 in order to correct clang-format, but I forgot this file must be compiled as C++98 rather than C++11.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D126561
This is a followup to D126105 to move functions in SparseTensorUtils.cpp to match their locations in SparseTensorUtils.h
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D126106
This change makes the public API of SparseTensorUtils.cpp explicit, whereas before the publicity of these functions was only implicit. Implicit publicity is sufficient for mlir-opt to generate calls to these functions, but it's not enough to enable C/C++ code to call them directly in the usual way (i.e., without going through codegen). Thus, leaving the publicity implicit prevents development of other tools (e.g., microbenchmarks).
This change also marks the functions MLIR_CRUNNERUTILS_EXPORT, which is required by the JIT under certain configurations (albeit not for anything in our test suite).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D126105
By closing over the `rank` itself rather than `this`, we save a method call on each iteration. A minor optimization, but one that adds up.
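A small sketch of the difference, using a hypothetical `Permuter` class rather than the actual code: one lambda captures `this` and calls `getRank()` on every loop iteration, the other captures the rank by value once.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical class, for illustration only.
class Permuter {
public:
  explicit Permuter(uint64_t rank) : rank(rank) {}
  uint64_t getRank() const { return rank; }

  // Closes over `this`: the loop condition calls getRank() each iteration.
  std::function<uint64_t(const std::vector<uint64_t> &)> sumViaThis() const {
    return [this](const std::vector<uint64_t> &v) {
      uint64_t sum = 0;
      for (uint64_t i = 0; i < getRank(); ++i)
        sum += v[i];
      return sum;
    };
  }

  // Closes over the rank value itself: no method call inside the loop, and
  // the closure no longer depends on the lifetime of the object.
  std::function<uint64_t(const std::vector<uint64_t> &)> sumViaRank() const {
    const uint64_t r = rank;
    return [r](const std::vector<uint64_t> &v) {
      uint64_t sum = 0;
      for (uint64_t i = 0; i < r; ++i)
        sum += v[i];
      return sum;
    };
  }

private:
  uint64_t rank;
};
```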
Depends On D126016
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D126019
This is a followup to D125431, to keep from confusing the machinery that generates diffs (since combining these two changes into one would obfuscate the changes actually made in the previous differential).
Depends On D125431
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D125432
In addition to reducing code repetition, this also helps ensure that the various API functions follow the naming convention of mlir::sparse_tensor::primaryTypeFunctionSuffix (e.g., by avoiding typos in the repetitious code).
Depends On D125428
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D125431
This enables the compiler to perform devirtualization, and benchmarks
indicate that devirtualization can sometimes give a considerable speedup.
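The message does not say how devirtualization is enabled. One common way, assumed here purely for illustration, is to mark a derived class `final`, so that calls made through a reference of that exact type can be resolved statically. `StorageBase`, `DenseStorage`, and `rankOf` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical base class with a virtual accessor.
struct StorageBase {
  virtual ~StorageBase() = default;
  virtual uint64_t getRank() const = 0;
};

// `final` tells the compiler that no further override can exist.
struct DenseStorage final : StorageBase {
  explicit DenseStorage(uint64_t rank) : rank(rank) {}
  uint64_t getRank() const override { return rank; }

private:
  uint64_t rank;
};

// Because DenseStorage is final, the compiler is free to turn this virtual
// call into a direct (and possibly inlined) call.
uint64_t rankOf(const DenseStorage &storage) { return storage.getRank(); }
```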
Depends On D122061
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D125428
This is the first implementation of complex (f64 and f32) support
in the sparse compiler, with complex add/mul as first operations.
Note that various features are still TBD, such as other ops, and
reading in complex values from file. Also, note that the
std::complex<float> had a bit of an ABI issue when passed as a
single argument. It is still TBD if better solutions are possible.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D125596