From 2b8bff6f66fd90ac658d0ae0d7f9a83ffadfd77f Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Thu, 22 May 2025 12:28:32 -0700
Subject: [PATCH] [doc][mlgo] Document the logger (serialization) and expose
 the doc (#141094)

---
 llvm/docs/MLGO.rst      | 92 ++++++++++++++++++++++++++++++++++++++---
 llvm/docs/Reference.rst |  6 ++-
 2 files changed, 92 insertions(+), 6 deletions(-)

diff --git a/llvm/docs/MLGO.rst b/llvm/docs/MLGO.rst
index 49efea3519c5..43b45d505ea3 100644
--- a/llvm/docs/MLGO.rst
+++ b/llvm/docs/MLGO.rst
@@ -314,7 +314,7 @@ features.

``MLModelRunner`` implementations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-We currently feature 3 implementations:
+We currently feature 4 implementations:

- ``ModelUnderTrainingRunner``. This requires the compiler be built with TFLite
  support. It allows loading a TFLite model dynamically and is primarily
@@ -338,15 +338,97 @@ requiring no out of tree build-time dependencies.
presumably a python training algorithm. We do not envision using this in a
production environment.

- ``NoInferenceModelRunner``. This serves as a store for feature values, and its
  ``evaluate`` should never be called. It's used in training scenarios where we
  want to capture the behavior of the default (non-ML) heuristic.

Note that we leave it to the training infrastructure to handle distributed
computing. The assumed architecture has python processes communicating remotely
between themselves, while managing local communication with clang.

-..
-    TODO(mtrofin):
-    - logging, and the use in interactive mode.
-    - discuss an example (like the inliner)

Logging Facility
----------------

When training models, we need to expose the features we will want to use during
inference, as well as outcomes, to guide reward-based learning techniques. This
can happen in two forms:

- when running the compiler on some input, as a capture of the features and
  actions taken by some policy or a model currently being used.
  For example, see ``DevelopmentModeInlineAdvisor`` or ``DevelopmentModeEvictAdvisor``
  in ``MLRegallocEvictAdvisor.cpp``. In more detail, in the former case, if
  ``-training-log`` is specified, the features and actions (inline/no inline)
  from each inlining decision are saved to the specified file. Since
  ``MLModelRunner`` implementations hold on to feature values (they are not
  cleared by ``evaluate``), logging is easily supported by just looping over the
  model runner's features and passing the tensor buffers to the logger. Note how
  we use the ``NoInferenceModelRunner`` to capture the features observed when
  using the default policy.

- as a serialization mechanism for the ``InteractiveModelRunner``. Here, we need
  to pass the observed features over IPC (a file descriptor, likely a named
  pipe).

Both cases require serializing the same kind of data, and we support both with
``Analysis/Utils/TrainingLogger``.

The logger was designed to avoid any new dependency, and to optimize for the
tensor scenario, i.e. exchanging potentially large, fixed-size buffers
containing scalars. We explicitly assume the reader of the format has the same
endianness as the compiler host, and we further expect the reader and the
compiler to run on the same host. This is because we expect training scenarios
to have a (typically python) process managing the compiler process, and we
leave it to the training side to handle remoting.

The logger produces the following sequence:

- a header describing the structure of the log. This is a one-line textual JSON
  dictionary with the following elements:

  - ``features``: a list of JSON-serialized ``TensorSpec`` values. The position
    in the list matters, as it determines the order in which values are
    subsequently recorded. If we are just logging (i.e. not using the
    ``InteractiveModelRunner``), the last feature should be that of the action
    (e.g.
    "inline/no inline" or "index of evicted live range")
  - (optional) ``score``: a ``TensorSpec`` describing a value we will include to
    help formulate a reward. This could be a size estimate or a latency
    estimate.
  - (optional) ``advice``: a ``TensorSpec`` describing the action. This is used
    for the ``InteractiveModelRunner``, in which case it shouldn't be in the
    ``features`` list.

- a sequence of ``contexts``. Contexts are independent traces of the
  optimization problem. For module passes, there is only one context; for
  function passes, there is a context per function. The start of a context is
  marked with a one-line JSON dictionary of the form
  ``{"context": <context name>}``.

  Each context has a sequence of:

  - ``observations``. An observation is:

    - a one-line JSON dictionary ``{"observation": <observation number>}``
    - a binary dump of the tensor buffers, in the order in which they were
      specified in the header
    - a new line character
    - if ``score`` was specified in the header:

      - a one-line JSON object ``{"outcome": <value>}``, where the ``value``
        conforms to the ``TensorSpec`` defined for the ``score`` in the header
      - the outcome value, as a binary dump
      - a new line character.

The format uses a mix of textual JSON (for headers) and binary dumps (for
tensors) because the headers are not expected to dominate the payload - the
tensor values are. We also wanted to avoid burdening the log reader - likely
python - with additional dependencies, and the one-line JSON headers make it
possible to rudimentarily inspect a log without additional tooling.

A python utility for reading logs, used for tests, is available at
``Analysis/models/log_reader.py``. A utility showcasing the
``InteractiveModelRunner``, which uses this reader as well, is at
``Analysis/models/interactive_host.py``. The latter is also used in tests.

There is no C++ implementation of a log reader; we do not have a scenario
motivating one.
IR2Vec Embeddings
=================

diff --git a/llvm/docs/Reference.rst b/llvm/docs/Reference.rst
index 565d5c6876d6..cb9576b15d70 100644
--- a/llvm/docs/Reference.rst
+++ b/llvm/docs/Reference.rst
@@ -40,8 +40,8 @@ LLVM and API reference documentation.

    PCSectionsMetadata
    PDB/index
    PointerAuth
-   ScudoHardenedAllocator
    MLGO
+   ScudoHardenedAllocator
    MemoryModelRelaxationAnnotations
    MemTagSanitizer
    Security
@@ -239,3 +239,7 @@ Additional Topics

 :doc:`ConvergenceAndUniformity`
    A description of uniformity analysis in the presence of irreducible control
    flow, and its implementation.
+
+:doc:`MLGO`
+   Facilities for ML-Guided Optimization, such as collecting IR corpora from a
+   build, interfacing with ML models, and exposing features for training.