Files
clice/scripts/validate-llvm-components.py
ykiko 592b37417e feat: cross-compile & upgrade LLVM to 21.1.8 (#390)
## Summary

This PR adds cross-compilation support for three new target platforms,
upgrades LLVM to 21.1.8, and overhauls the CI pipelines around
cross-builds and testing.

## Cross-compilation

New target triples accepted via `-DCLICE_TARGET_TRIPLE=...`:

| Target triple | Host | Output |
|---|---|---|
| `x86_64-apple-darwin` | macos-15 (arm64) | macOS x64 |
| `aarch64-linux-gnu` | ubuntu-24.04 (x64) | Linux arm64 |
| `aarch64-pc-windows-msvc` | windows-2025 (x64) | Windows arm64 |

- `cmake/toolchain.cmake` — maps `CLICE_TARGET_TRIPLE` to
`CMAKE_SYSTEM_NAME`/`CMAKE_SYSTEM_PROCESSOR`/compiler `--target`; picks
up conda aarch64 sysroot when cross-compiling Linux.
- `cmake/llvm.cmake` — forwards target platform/arch to `setup-llvm.py`
so the right prebuilt LLVM is downloaded for the target.
- `CMakeLists.txt` — uses a host-side `flatc` from `PATH` under
`CMAKE_CROSSCOMPILING` instead of the in-tree target build.
- `pixi.toml`:
  - Adds `osx-64`, `linux-aarch64`, `win-arm64` platforms.
- New environments: `cross-macos-x64`, `cross-linux-aarch64` (adds
`gcc_linux-aarch64` + `sysroot_linux-aarch64`), `cross-windows-arm64`.
- New lightweight `test-run` env used on native ARM/x64 runners to
execute cross-built artifacts (pulls in upstream clang+lld on macOS so
tests don't fall back to Apple clang).
- `scripts/activate_cross_linux.sh` — exports `CONDA_PREFIX`-relative
paths for the aarch64 toolchain.
- `scripts/build-llvm.py` — `--target-triple` support and a
`build_native_tools()` helper that produces host `llvm-tblgen` /
`clang-tblgen` needed when cross-compiling LLVM itself.

## LLVM upgrade 21.1.4 → 21.1.8

- `cmake/package.cmake` bumps `setup_llvm("21.1.8")`.
- `config/llvm-manifest.json` regenerated with 6 new cross-compiled
entries and a new `arch` field on every entry so lookup is `(version,
platform, arch, lto, build_type)`.
- `scripts/setup-llvm.py` — honours the new `arch` field when resolving
artifacts.
- `scripts/update-llvm-version.py` (new) — single-call version bump
across `package.cmake` + manifest.
- `scripts/validate-llvm-components.py` (new) — scans the LLVM source
tree for library targets and diffs them against
`scripts/llvm-components.json` to catch stale/misspelled component names
before a build.
- `scripts/llvm-components.json` (new) — explicit allow-list of required
LLVM/Clang library targets used by `build-llvm.py`.

## CI changes

- `.github/workflows/build-llvm.yml`:
- Adds `workflow_dispatch` with `llvm_version`, `skip_upload`, `skip_pr`
inputs.
- Matrix extended with the 6 cross-compile entries (2 per new platform:
RelWithDebInfo ± LTO).
- `build clice` / test / prune steps gated on `!matrix.target_triple`
for cross-builds; cross-built LTO entries apply the native prune
manifest (arch-independent).
  - Cross-compiled binary architecture is verified with `file(1)`.
- New `upload` job triggered by `workflow_dispatch` pushes artifacts to
`clice-io/clice-llvm` and hands the manifest off to the next job.
- `.github/workflows/test-cmake.yml`:
- Build matrix gains three `build_only: true` cross entries that upload
`bin/` + `lib/` artifacts.
- New `test-cross` job runs on native `macos-15-intel`,
`ubuntu-24.04-arm`, `windows-11-arm` runners, downloads the cross-built
artifacts, and runs unit / integration / smoke tests under the
`test-run` pixi env.
- Cache keys now include `target_triple` so native and cross builds
don't collide.
- `.github/workflows/publish-clice.yml`:
- Three additional release artifacts for the new targets
(`clice-x86_64-macos-darwin`, `clice-aarch64-linux-gnu`,
`clice-aarch64-windows-msvc`), each with a matching `-symbol` archive.

## Compatibility

- All existing native builds and tests are preserved; cross entries are
additive.
- `Debug` + ASAN remains disabled on Windows (`llvm_mode == Debug && os
== windows-*` no longer appends `-asan`).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 00:17:39 +08:00

164 lines
5.4 KiB
Python
Executable File

#!/usr/bin/env python3
"""
Validate the LLVM distribution component list against the actual LLVM source tree.
Scans the LLVM source for CMake library targets and compares them against
a components JSON file to detect stale or misspelled entries.
"""
import argparse
import difflib
import json
import re
import sys
from pathlib import Path
# CMake function calls that define library targets.
# The captured group uses [^\s)]+ to grab the target name without
# trailing parentheses or whitespace.
LLVM_LIB_PATTERNS = [
re.compile(r"add_llvm_component_library\(\s*([^\s)]+)"),
re.compile(r"add_llvm_library\(\s*([^\s)]+)"),
]
CLANG_LIB_PATTERNS = [
re.compile(r"add_clang_library\(\s*([^\s)]+)"),
]
# Header-only / custom install targets.
HEADER_PATTERNS = [
re.compile(r"add_llvm_install_targets\(\s*([^\s)]+)"),
re.compile(r"add_custom_target\(\s*([^\s)]+)"),
re.compile(r"add_library\(\s*([^\s)]+)"),
]
# Targets we recognise as header-only distribution components.
KNOWN_HEADER_TARGETS = {
"llvm-headers",
"clang-headers",
"clang-tidy-headers",
"clang-resource-headers",
}
def scan_targets(directory: Path, patterns: list[re.Pattern]) -> set[str]:
"""Recursively scan *directory* for CMakeLists.txt files and extract target names."""
targets: set[str] = set()
if not directory.is_dir():
return targets
for cmake_file in directory.rglob("CMakeLists.txt"):
text = cmake_file.read_text(errors="replace")
for pattern in patterns:
for match in pattern.finditer(text):
targets.add(match.group(1))
return targets
def scan_header_targets(llvm_src: Path) -> set[str]:
"""Scan for well-known header / custom-install targets across the tree."""
found: set[str] = set()
for cmake_file in llvm_src.rglob("CMakeLists.txt"):
text = cmake_file.read_text(errors="replace")
for pattern in HEADER_PATTERNS:
for match in pattern.finditer(text):
name = match.group(1)
if name in KNOWN_HEADER_TARGETS:
found.add(name)
return found
def collect_source_targets(llvm_src: Path) -> set[str]:
"""Return the full set of library / header targets found in the LLVM source tree."""
targets: set[str] = set()
targets |= scan_targets(llvm_src / "llvm" / "lib", LLVM_LIB_PATTERNS)
targets |= scan_targets(llvm_src / "clang" / "lib", CLANG_LIB_PATTERNS)
targets |= scan_targets(llvm_src / "clang-tools-extra", CLANG_LIB_PATTERNS)
targets |= scan_header_targets(llvm_src)
return targets
def load_components(path: Path) -> list[str]:
with path.open("r", encoding="utf-8") as handle:
data = json.load(handle)
if isinstance(data, dict):
data = data.get("components", [])
if not isinstance(data, list) or not data:
print(f"Error: no component list found in {path}", file=sys.stderr)
sys.exit(1)
return data
def main() -> None:
parser = argparse.ArgumentParser(
description="Validate LLVM distribution components against the source tree."
)
parser.add_argument(
"--llvm-src",
required=True,
help="Path to the llvm-project source root",
)
parser.add_argument(
"--components-file",
required=True,
help="Path to llvm-components.json",
)
args = parser.parse_args()
llvm_src = Path(args.llvm_src).expanduser().resolve()
components_file = Path(args.components_file).expanduser().resolve()
if not llvm_src.is_dir():
print(f"Error: LLVM source directory not found: {llvm_src}")
sys.exit(1)
if not (llvm_src / "llvm" / "CMakeLists.txt").exists():
print(f"Error: {llvm_src} does not look like an llvm-project root.")
sys.exit(1)
if not components_file.is_file():
print(f"Error: components file not found: {components_file}")
sys.exit(1)
components = load_components(components_file)
source_targets = collect_source_targets(llvm_src)
print(f"Found {len(source_targets)} targets in LLVM source tree")
print(f"Components file lists {len(components)} entries")
# Check for components that are missing from the source tree.
missing: list[tuple[str, list[str]]] = []
for name in components:
if name not in source_targets:
suggestions = difflib.get_close_matches(
name, source_targets, n=3, cutoff=0.6
)
missing.append((name, suggestions))
if missing:
print(f"\nError: {len(missing)} component(s) not found in the source tree:\n")
for name, suggestions in missing:
print(f" - {name}")
if suggestions:
print(f" Did you mean: {', '.join(suggestions)}?")
sys.exit(1)
# Warn about source targets not present in the component list.
component_set = set(components)
new_targets = sorted(source_targets - component_set - KNOWN_HEADER_TARGETS)
# Filter to targets that follow LLVM/Clang naming conventions to reduce noise.
noteworthy = [t for t in new_targets if t.startswith(("LLVM", "clang", "Clang"))]
if noteworthy:
print(
f"\nWarning: {len(noteworthy)} target(s) in source not listed in components:"
)
for name in noteworthy:
print(f" + {name}")
print("\nAll components validated successfully.")
sys.exit(0)
if __name__ == "__main__":
main()