clice/scripts/setup-llvm.py
ykiko 592b37417e feat: cross-compile & upgrade LLVM to 21.1.8 (#390)
## Summary

This PR adds cross-compilation support for three new target platforms,
upgrades LLVM to 21.1.8, and overhauls the CI pipelines around
cross-builds and testing.

## Cross-compilation

New target triples accepted via `-DCLICE_TARGET_TRIPLE=...`:

| Target triple | Host | Output |
|---|---|---|
| `x86_64-apple-darwin` | macos-15 (arm64) | macOS x64 |
| `aarch64-linux-gnu` | ubuntu-24.04 (x64) | Linux arm64 |
| `aarch64-pc-windows-msvc` | windows-2025 (x64) | Windows arm64 |

- `cmake/toolchain.cmake` — maps `CLICE_TARGET_TRIPLE` to
`CMAKE_SYSTEM_NAME`/`CMAKE_SYSTEM_PROCESSOR`/compiler `--target`; picks
up conda aarch64 sysroot when cross-compiling Linux.
- `cmake/llvm.cmake` — forwards target platform/arch to `setup-llvm.py`
so the right prebuilt LLVM is downloaded for the target.
- `CMakeLists.txt` — uses a host-side `flatc` from `PATH` under
`CMAKE_CROSSCOMPILING` instead of the in-tree target build.
- `pixi.toml`:
  - Adds `osx-64`, `linux-aarch64`, `win-arm64` platforms.
  - New environments: `cross-macos-x64`, `cross-linux-aarch64` (adds
    `gcc_linux-aarch64` + `sysroot_linux-aarch64`), `cross-windows-arm64`.
  - New lightweight `test-run` env used on native ARM/x64 runners to
    execute cross-built artifacts (pulls in upstream clang+lld on macOS so
    tests don't fall back to Apple clang).
- `scripts/activate_cross_linux.sh` — exports `CONDA_PREFIX`-relative
paths for the aarch64 toolchain.
- `scripts/build-llvm.py` — `--target-triple` support and a
`build_native_tools()` helper that produces host `llvm-tblgen` /
`clang-tblgen` needed when cross-compiling LLVM itself.
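
The triple-to-platform/arch forwarding described above can be sketched in Python. This is only an illustration: the real mapping lives in `cmake/llvm.cmake`, and the helper name `split_target_triple` and both lookup tables are assumptions. The output spellings (`macosx`, `linux`, `windows`, `x64`, `arm64`) follow the `--target-platform` / `--target-arch` help text in `setup-llvm.py`.

```python
# Hypothetical helper: the actual mapping is done in CMake and may differ.
ARCH_NAMES = {"x86_64": "x64", "aarch64": "arm64", "arm64": "arm64"}
OS_NAMES = {"darwin": "macosx", "linux": "linux", "windows": "windows"}


def split_target_triple(triple: str) -> tuple[str, str]:
    """Return (platform, arch) in the spelling setup-llvm.py accepts."""
    arch_part, *rest = triple.split("-")
    arch = ARCH_NAMES.get(arch_part)
    if arch is None:
        raise ValueError(f"Unsupported architecture in triple: {triple}")
    # The OS component is not always in the same position
    # (e.g. "aarch64-linux-gnu" vs "aarch64-pc-windows-msvc"), so scan for it.
    for part in rest:
        for os_key, platform_name in OS_NAMES.items():
            if part.startswith(os_key):
                return platform_name, arch
    raise ValueError(f"Unsupported OS in triple: {triple}")


if __name__ == "__main__":
    for triple in (
        "x86_64-apple-darwin",
        "aarch64-linux-gnu",
        "aarch64-pc-windows-msvc",
    ):
        platform_name, arch = split_target_triple(triple)
        print(f"{triple}: --target-platform {platform_name} --target-arch {arch}")
```

Scanning the components rather than indexing a fixed position keeps the sketch tolerant of triples with and without a vendor field.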

## LLVM upgrade 21.1.4 → 21.1.8

- `cmake/package.cmake` bumps `setup_llvm("21.1.8")`.
- `config/llvm-manifest.json` regenerated with 6 new cross-compiled
entries and a new `arch` field on every entry so lookup is `(version,
platform, arch, lto, build_type)`.
- `scripts/setup-llvm.py` — honours the new `arch` field when resolving
artifacts.
- `scripts/update-llvm-version.py` (new) — single-call version bump
across `package.cmake` + manifest.
- `scripts/validate-llvm-components.py` (new) — scans the LLVM source
tree for library targets and diffs them against
`scripts/llvm-components.json` to catch stale/misspelled component names
before a build.
- `scripts/llvm-components.json` (new) — explicit allow-list of required
LLVM/Clang library targets used by `build-llvm.py`.
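
As a rough illustration of the component check described above, the sketch below collects library target names from CMake text and reports allow-list entries that have no matching target. It is not the contents of `scripts/validate-llvm-components.py`: the function names are hypothetical, and it assumes the allow-list is a flat list of names and that targets are declared via `add_llvm_component_library(...)` or `add_clang_library(...)`.

```python
import re

# Matches the first argument of the two declaration macros (an assumption about
# which macros matter; the real script may recognise more forms).
LIBRARY_DECL = re.compile(
    r"add_(?:llvm_component|clang)_library\(\s*([A-Za-z0-9_]+)"
)


def find_library_targets(cmake_text: str) -> set[str]:
    """Collect library target names declared in a chunk of CMake source."""
    return set(LIBRARY_DECL.findall(cmake_text))


def missing_components(required: list[str], available: set[str]) -> list[str]:
    """Return allow-list names that no declared target matches, sorted."""
    return sorted(name for name in required if name not in available)


if __name__ == "__main__":
    sample = (
        "add_llvm_component_library(LLVMSupport\n  APInt.cpp\n)\n"
        "add_clang_library(clangSema\n  Sema.cpp\n)\n"
    )
    declared = find_library_targets(sample)
    # "clangSemaa" is a deliberate misspelling to show what gets flagged.
    print(missing_components(["LLVMSupport", "clangSemaa"], declared))
```

Running such a check before the build surfaces a stale or misspelled name immediately instead of partway through a long LLVM build.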

## CI changes

- `.github/workflows/build-llvm.yml`:
  - Adds `workflow_dispatch` with `llvm_version`, `skip_upload`, `skip_pr`
    inputs.
  - Matrix extended with the 6 cross-compile entries (2 per new platform:
    RelWithDebInfo ± LTO).
  - `build clice` / test / prune steps gated on `!matrix.target_triple`
    for cross-builds; cross-built LTO entries apply the native prune
    manifest (arch-independent).
  - Cross-compiled binary architecture is verified with `file(1)`.
  - New `upload` job triggered by `workflow_dispatch` pushes artifacts to
    `clice-io/clice-llvm` and hands the manifest off to the next job.
- `.github/workflows/test-cmake.yml`:
  - Build matrix gains three `build_only: true` cross entries that upload
    `bin/` + `lib/` artifacts.
  - New `test-cross` job runs on native `macos-15-intel`,
    `ubuntu-24.04-arm`, `windows-11-arm` runners, downloads the cross-built
    artifacts, and runs unit / integration / smoke tests under the
    `test-run` pixi env.
  - Cache keys now include `target_triple` so native and cross builds
    don't collide.
- `.github/workflows/publish-clice.yml`:
  - Three additional release artifacts for the new targets
    (`clice-x86_64-macos-darwin`, `clice-aarch64-linux-gnu`,
    `clice-aarch64-windows-msvc`), each with a matching `-symbol` archive.

## Compatibility

- All existing native builds and tests are preserved; cross entries are
additive.
- `Debug` + ASAN remains disabled on Windows (`llvm_mode == Debug && os
== windows-*` no longer appends `-asan`).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 00:17:39 +08:00


#!/usr/bin/env python3
import argparse
import hashlib
import json
import os
import shutil
import subprocess
import sys
import tarfile
from pathlib import Path
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

PRIVATE_CLANG_FILES = [
    "Sema/CoroutineStmtBuilder.h",
    "Sema/TypeLocBuilder.h",
    "Sema/TreeTransform.h",
]


def log(message: str) -> None:
    print(f"[setup-llvm] {message}", flush=True)


def read_manifest(path: Path) -> list[dict]:
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)


def detect_platform() -> str:
    plat = sys.platform
    if plat.startswith("win"):
        return "Windows"
    if plat == "darwin":
        return "macosx"
    if plat.startswith("linux"):
        return "Linux"
    raise RuntimeError(f"Unsupported platform: {plat}")


def detect_arch() -> str:
    import platform

    machine = platform.machine().lower()
    if machine in ("x86_64", "amd64"):
        return "x64"
    if machine in ("aarch64", "arm64"):
        return "arm64"
    raise RuntimeError(f"Unsupported architecture: {machine}")


def pick_artifact(
    manifest: list[dict],
    version: str,
    build_type: str,
    is_lto: bool,
    platform: str,
    arch: str,
) -> dict:
    base_version = version.split("+", 1)[0]
    saw_missing_arch = False
    for entry in manifest:
        if entry.get("version") != version:
            continue
        if entry.get("platform") != platform.lower():
            continue
        entry_arch = entry.get("arch")
        if entry_arch is None:
            # Old-format entry without an architecture; remember it for the error message.
            saw_missing_arch = True
            continue
        if entry_arch != arch:
            continue
        if entry.get("build_type") != build_type:
            continue
        if bool(entry.get("lto")) != is_lto:
            continue
        return entry
    if saw_missing_arch:
        raise RuntimeError(
            f"Manifest contains entries without an 'arch' field for version={base_version}, "
            f"platform={platform}. The manifest format changed to require explicit "
            f"architectures; regenerate it via scripts/update-llvm-version.py."
        )
    raise RuntimeError(
        f"No matching LLVM artifact in manifest for version={base_version}, platform={platform}, "
        f"arch={arch}, build_type={build_type}, lto={is_lto}"
    )


def sha256sum(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large archives are never fully loaded into memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download(url: str, dest: Path, token: str | None) -> None:
    log(f"Start download: {url} -> {dest}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    headers = {"User-Agent": "clice-setup-llvm"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = Request(url, headers=headers)
    try:
        with urlopen(request) as response, dest.open("wb") as handle:
            total_bytes = response.length
            if total_bytes is None:
                header_len = response.getheader("Content-Length")
                if header_len and header_len.isdigit():
                    total_bytes = int(header_len)
            downloaded = 0
            next_percent = 10
            next_unknown_mark = 10 * 1024 * 1024  # 10MB steps when size is unknown
            while True:
                chunk = response.read(1024 * 512)
                if not chunk:
                    break
                handle.write(chunk)
                downloaded += len(chunk)
                if total_bytes:
                    percent = int(downloaded * 100 / total_bytes)
                    while percent >= next_percent and next_percent <= 100:
                        log(
                            f"Download progress: {next_percent}% "
                            f"({downloaded / 1024 / 1024:.1f}MB/"
                            f"{total_bytes / 1024 / 1024:.1f}MB)"
                        )
                        next_percent += 10
                else:
                    if downloaded >= next_unknown_mark:
                        log(
                            f"Downloaded {downloaded / 1024 / 1024:.1f}MB (size unknown)"
                        )
                        next_unknown_mark += 10 * 1024 * 1024
            # Emit a final 100% line if the loop ended before logging it.
            # Integrity is checked separately via SHA256 in ensure_download().
            if total_bytes and next_percent <= 100:
                log("Download progress: 100%")
        log(f"Finished download: {dest} ({downloaded / 1024 / 1024:.1f}MB)")
    except HTTPError as err:
        raise RuntimeError(f"HTTP error {err.code} while downloading {url}") from err
    except URLError as err:
        raise RuntimeError(f"Failed to download {url}: {err.reason}") from err


def ensure_download(
    url: str, dest: Path, expected_sha256: str, token: str | None
) -> None:
    if dest.exists():
        current = sha256sum(dest)
        if current == expected_sha256:
            return
        # Cached file is stale or corrupt; discard it and download again.
        dest.unlink()
    download(url, dest, token)
    current = sha256sum(dest)
    if current != expected_sha256:
        dest.unlink(missing_ok=True)
        raise RuntimeError(
            f"SHA256 mismatch for {dest.name}: expected {expected_sha256}, got {current}"
        )


def extract_archive(archive: Path, dest_dir: Path) -> None:
    log(f"Extracting {archive.name} to {dest_dir}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    name = archive.name.lower()
    if name.endswith(".tar.xz") or name.endswith(".tar.gz") or name.endswith(".tar"):
        with tarfile.open(archive, "r:*") as tar:
            tar.extractall(path=dest_dir)
        log("Extraction complete")
        return
    raise RuntimeError(f"Unsupported archive format: {archive}")


def flatten_install_dir(dest_dir: Path) -> None:
    # Some archives add an extra root directory (llvm-install, build-install, etc.).
    for name in ("llvm-install", "build-install"):
        nested = dest_dir / name
        if not nested.is_dir():
            continue
        log(f"Flattening nested install directory: {nested}")
        for entry in nested.iterdir():
            target = dest_dir / entry.name
            if target.exists():
                raise RuntimeError(
                    f"Cannot flatten {nested}: target already exists: {target}"
                )
            shutil.move(str(entry), str(target))
        nested.rmdir()
        break


def parse_version_tuple(text: str) -> tuple[int, ...]:
    digits = []
    current = ""
    for ch in text:
        if ch.isdigit():
            current += ch
        else:
            if current:
                digits.append(int(current))
                current = ""
            if ch in {".", "-"}:
                continue
    if current:
        digits.append(int(current))
    return tuple(digits)


def system_llvm_ok(required_version: str, build_type: str) -> Path | None:
    if build_type.lower().startswith("debug"):
        return None
    llvm_config = shutil.which("llvm-config")
    if not llvm_config:
        return None
    try:
        version = subprocess.check_output([llvm_config, "--version"], text=True).strip()
        prefix = subprocess.check_output([llvm_config, "--prefix"], text=True).strip()
    except (subprocess.CalledProcessError, OSError):
        return None
    required = parse_version_tuple(required_version.split("+", 1)[0])
    found = parse_version_tuple(version)
    if not found or found < required:
        return None
    return Path(prefix)


def github_api(url: str, token: str | None) -> dict:
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "clice-setup-llvm",
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = Request(url, headers=headers)
    with urlopen(request) as response:
        return json.load(response)


def lookup_llvm_commit(version: str, token: str | None) -> str | None:
    tag_version = version.split("+", 1)[0]
    tag = f"llvmorg-{tag_version}"
    ref_url = f"https://api.github.com/repos/llvm/llvm-project/git/ref/tags/{tag}"
    try:
        ref = github_api(ref_url, token)
    except Exception:
        return None
    obj = ref.get("object") or {}
    obj_type = obj.get("type")
    obj_sha = obj.get("sha")
    if obj_type == "commit":
        return obj_sha
    if obj_type == "tag" and obj_sha:
        # Annotated tag: dereference the tag object to reach the commit it points at.
        tag_url = f"https://api.github.com/repos/llvm/llvm-project/git/tags/{obj_sha}"
        try:
            tag_info = github_api(tag_url, token)
        except Exception:
            return None
        return tag_info.get("object", {}).get("sha") or tag_info.get("sha")
    return None


def ensure_private_headers(
    install_path: Path, work_dir: Path, version: str, token: str | None, offline: bool
) -> None:
    missing = []
    for rel in PRIVATE_CLANG_FILES:
        if (install_path / "include" / "clang" / rel).exists():
            continue
        if (work_dir / "include" / "clang" / rel).exists():
            continue
        missing.append(rel)
    if not missing or offline:
        return
    commit = lookup_llvm_commit(version, token)
    if not commit:
        return
    for rel in missing:
        dest = work_dir / "include" / "clang" / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        url = f"https://raw.githubusercontent.com/llvm/llvm-project/{commit}/clang/lib/{rel}"
        log(f"Fetching private header: {url}")
        download(url, dest, token)


def main() -> None:
    parser = argparse.ArgumentParser(description="Setup LLVM dependencies for CMake")
    parser.add_argument("--version", required=True)
    parser.add_argument("--build-type", required=True)
    parser.add_argument("--binary-dir", required=True)
    parser.add_argument("--manifest", required=True)
    parser.add_argument("--install-path")
    parser.add_argument("--enable-lto", action="store_true")
    parser.add_argument("--offline", action="store_true")
    parser.add_argument(
        "--target-platform",
        help="Override platform for cross-compilation (e.g. macosx, linux, windows)",
    )
    parser.add_argument(
        "--target-arch",
        help="Override architecture for cross-compilation (e.g. x64, arm64)",
    )
    parser.add_argument("--output", required=True)
    args = parser.parse_args()
    log(
        "Args: "
        f"version={args.version}, build_type={args.build_type}, "
        f"binary_dir={args.binary_dir}, install_path={args.install_path or '(auto)'}, "
        f"enable_lto={args.enable_lto}, offline={args.offline}"
    )
    token = os.environ.get("GH_TOKEN") or os.environ.get("GITHUB_TOKEN")
    build_type = args.build_type
    platform_name = args.target_platform if args.target_platform else detect_platform()
    arch_name = args.target_arch if args.target_arch else detect_arch()
    log(
        f"Platform: {platform_name}, arch: {arch_name}, normalized build type: {build_type}"
    )
    manifest = read_manifest(Path(args.manifest))
    binary_dir = Path(args.binary_dir).resolve()
    install_root = binary_dir / ".llvm"
    install_path: Path | None = None
    needs_install = False
    if args.install_path:
        candidate = Path(args.install_path)
        if candidate.exists():
            log(f"Using provided LLVM install at {candidate}")
        else:
            log(
                f"Provided LLVM install path does not exist; will install to {candidate}"
            )
            needs_install = True
        install_path = candidate
    else:
        detected = system_llvm_ok(args.version, build_type)
        if detected:
            log(f"Found suitable system LLVM at {detected}")
            install_path = detected
    artifact = None
    if install_path is None:
        # No usable install found: download the prebuilt artifact into <binary-dir>/.llvm.
        needs_install = True
        artifact = pick_artifact(
            manifest,
            args.version,
            build_type,
            args.enable_lto,
            platform_name,
            arch_name,
        )
        log(f"Selected artifact: {artifact.get('filename')} for download")
        filename = artifact["filename"]
        url_version = args.version.replace("+", "%2B")
        url = f"https://github.com/clice-io/clice-llvm/releases/download/{url_version}/{filename}"
        download_path = binary_dir / filename
        ensure_download(url, download_path, artifact["sha256"], token)
        extract_archive(download_path, install_root)
        flatten_install_dir(install_root)
        install_path = install_root
    elif needs_install:
        # An explicit --install-path was given but does not exist yet: install there.
        artifact = pick_artifact(
            manifest,
            args.version,
            build_type,
            args.enable_lto,
            platform_name,
            arch_name,
        )
        log(f"Selected artifact: {artifact.get('filename')} for download")
        filename = artifact["filename"]
        url_version = args.version.replace("+", "%2B")
        url = f"https://github.com/clice-io/clice-llvm/releases/download/{url_version}/{filename}"
        download_path = binary_dir / filename
        ensure_download(url, download_path, artifact["sha256"], token)
        target_dir = install_path.resolve()
        extract_archive(download_path, target_dir)
        flatten_install_dir(target_dir)
        install_path = target_dir
    else:
        install_path = install_path.resolve()
        log(f"Using existing LLVM install at {install_path}")
    cmake_dir = install_path / "lib" / "cmake" / "llvm"
    ensure_private_headers(install_path, binary_dir, args.version, token, args.offline)
    output = Path(args.output)
    output.parent.mkdir(parents=True, exist_ok=True)
    with output.open("w", encoding="utf-8") as handle:
        json.dump(
            {
                "install_path": str(install_path),
                "cmake_dir": str(cmake_dir),
                "artifact": artifact or {},
            },
            handle,
            indent=2,
        )
        handle.write("\n")


if __name__ == "__main__":
    main()