[IR2Vec] Scale embeddings once in vocab analysis instead of repetitive scaling (#143986)
Scale opcodes, types, and arguments once in `IR2VecVocabAnalysis` so that they are not rescaled every time embeddings are computed. This PR refactors the vocabulary to explicitly define three sections (Opcodes, Types, and Arguments) used for computing embeddings. (Tracking issue: #141817; partly fixes #141832)
Committed by: GitHub
Parent: 56ef00a59d
Commit: 0745eb501d
@@ -448,7 +448,16 @@ downstream tasks, including ML-guided compiler optimizations.
 
 The core components are:
 - **Vocabulary**: A mapping from IR entities (opcodes, types, etc.) to their
-  vector representations. This is managed by ``IR2VecVocabAnalysis``.
+  vector representations. This is managed by ``IR2VecVocabAnalysis``. The
+  vocabulary (.json file) contains three sections -- Opcodes, Types, and
+  Arguments, each containing the representations of the corresponding
+  entities.
+
+  .. note::
+
+     It is mandatory to have these three sections present in the vocabulary file
+     for it to be valid; order in which they appear does not matter.
+
 - **Embedder**: A class (``ir2vec::Embedder``) that uses the vocabulary to
   compute embeddings for instructions, basic blocks, and functions.
 
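The idea behind this change can be sketched in a few lines of Python. This is an illustration only, not the LLVM implementation: the section names come from the documentation above, while the JSON layout (entity name mapped to a list of floats), the entity names, the weight values, and the function name `load_and_scale_vocab` are all assumptions made for the example. It checks that the three mandatory sections are present in any order and applies each section's weight once at load time, so later embedding computations can use the stored vectors directly.

```python
import json

# Section names taken from the documentation; all three are mandatory.
REQUIRED_SECTIONS = ("Opcodes", "Types", "Arguments")

# Hypothetical vocabulary text. The entity names and the two-element
# vectors are made up; note that section order does not matter.
EXAMPLE_VOCAB = """
{
  "Types":     {"FloatTy":  [2.0, 4.0]},
  "Arguments": {"Variable": [2.0, 4.0]},
  "Opcodes":   {"add":      [2.0, 4.0]}
}
"""

# Illustrative per-section weights (assumed values, not LLVM defaults).
EXAMPLE_WEIGHTS = {"Opcodes": 1.0, "Types": 0.5, "Arguments": 0.25}


def load_and_scale_vocab(text, weights):
    """Parse a vocabulary, validate its sections, and scale each one once."""
    vocab = json.loads(text)

    # Reject the file if any of the three sections is absent.
    missing = [s for s in REQUIRED_SECTIONS if s not in vocab]
    if missing:
        raise ValueError(f"vocabulary is missing sections: {missing}")

    # Multiply every vector by its section weight a single time, up front.
    return {
        section: {
            name: [weights[section] * x for x in vec]
            for name, vec in vocab[section].items()
        }
        for section in REQUIRED_SECTIONS
    }


if __name__ == "__main__":
    scaled = load_and_scale_vocab(EXAMPLE_VOCAB, EXAMPLE_WEIGHTS)
    for section, entries in scaled.items():
        print(section, entries)
```

Because the scaling happens while the vocabulary is loaded, code that later sums entity vectors into an instruction embedding would not need to know the weights at all.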