
[CGData] Lazy loading support for stable function map #151660


Open
nocchijiang wants to merge 1 commit into main
Conversation

nocchijiang (Contributor)

The stable function map can be huge for a large application. Fully loading it is slow and consumes a significant amount of memory, much of which is unnecessary work that drastically slows down compilation, especially for non-LTO and distributed-ThinLTO setups. This patch introduces opt-in lazy loading for the stable function map. The detailed changes are:

- `StableFunctionMap`
  - The map now stores entries in an `EntryStorage` struct, which includes the offsets of the serialized entries and a `std::once_flag` for thread-safe lazy loading.
  - The underlying map type is changed from `DenseMap` to `std::unordered_map` for compatibility with `std::once_flag`.
  - `contains()`, `size()` and `at()` are implemented to load only the requested entries on demand (see the lookup sketch after this list).
- Lazy Loading Mechanism
  - When reading indexed codegen data, if the newly introduced `-indexed-codegen-data-lazy-loading` flag is set, the stable function map is not fully deserialized up front. The binary format for the stable function map now includes offsets and sizes to support lazy loading (see the layout sketch below).
  - The safety of lazy loading is guarded by the once flag per function hash. This guarantees that even in a multi-threaded environment, the deserialization for a given function hash happens exactly once: the first thread to request it performs the load, and subsequent threads wait for it to complete before using the data. For single-threaded builds, the overhead is negligible (a single check on the once flag). For multi-threaded scenarios, users can omit the flag to retain the previous eager-loading behavior.
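To make the concurrency story concrete, here is a minimal, self-contained sketch (not the PR's actual code) of the per-hash `std::call_once` pattern described above. The entry type is simplified to `int`, and `deserializeAt` is a hypothetical stand-in for the real record decoding:

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for decoding one serialized entry at a byte offset.
static int deserializeAt(uint64_t Offset) { return static_cast<int>(Offset); }

struct EntryStorage {
  std::vector<int> Entries;      // deserialized entries (simplified type)
  std::vector<uint64_t> Offsets; // where the serialized entries live
  std::once_flag LazyLoadFlag;   // guards the one-time deserialization
};

// std::once_flag is neither copyable nor movable, so the value type cannot
// satisfy DenseMap's copy requirement; std::unordered_map only needs
// in-place construction and keeps elements at stable addresses.
using MapTy = std::unordered_map<uint64_t, EntryStorage>;

const std::vector<int> &lookup(MapTy &Map, uint64_t Hash) {
  auto It = Map.find(Hash);
  assert(It != Map.end() && "caller is expected to check contains() first");
  // The first caller deserializes; concurrent callers block until it is done.
  std::call_once(It->second.LazyLoadFlag, [&] {
    for (uint64_t Off : It->second.Offsets)
      It->second.Entries.push_back(deserializeAt(Off));
  });
  return It->second.Entries;
}
```

Because all keys (hashes) are inserted when the map is opened and only the payloads load lazily, concurrent lookups never mutate the hash table itself.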

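The new on-disk layout makes the lazy path possible by writing all fixed-size fields, including a relative offset to each entry's variable-sized `IndexOperandHashes`, ahead of the payload blob. Below is a hedged sketch of how a reader can index that Version 4 layout without touching the payloads. Field widths follow the layout comment in `serialize()` in the diff; `readU32`/`readU64` are hypothetical unaligned little-endian helpers (the patch itself uses `llvm::support::endian`), and a little-endian host is assumed for brevity:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical unaligned readers; assume a little-endian host.
static uint32_t readU32(const unsigned char *&P) {
  uint32_t V;
  std::memcpy(&V, P, sizeof(V));
  P += sizeof(V);
  return V;
}
static uint64_t readU64(const unsigned char *&P) {
  uint64_t V;
  std::memcpy(&V, P, sizeof(V));
  P += sizeof(V);
  return V;
}

struct EntryIndex {
  uint64_t Hash;                    // the entry's stable function hash
  const unsigned char *FixedFields; // start of its fixed-size fields
};

// Record where each entry's fixed-size fields start, then skip the whole
// variable-sized IndexOperandHashes blob using its recorded total size.
std::vector<EntryIndex> indexEntries(const unsigned char *P) {
  uint32_t NumEntries = readU32(P);
  std::vector<uint64_t> Hashes(NumEntries);
  for (auto &H : Hashes)
    H = readU64(P); // all hashes are written up front
  std::vector<EntryIndex> Index(NumEntries);
  for (uint32_t I = 0; I < NumEntries; ++I) {
    Index[I] = {Hashes[I], P};
    // FunctionNameId, ModuleNameId, InstCount (u32 each), then the relative
    // offset to this entry's IndexOperandHashes (u64).
    P += 4 + 4 + 4 + 8;
  }
  uint64_t PayloadBytes = readU64(P);
  P += PayloadBytes; // payloads stay untouched until an entry is requested
  return Index;
}
```

In the patch, these per-entry positions end up in `EntryStorage::Offsets`, and `deserializeLazyLoadingEntry()` later decodes the payload for a single hash on first access.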
@llvmbot added the llvm:codegen and LTO (Link time optimization: regular/full LTO or ThinLTO) labels on Aug 1, 2025
llvmbot (Member) commented Aug 1, 2025

@llvm/pr-subscribers-lto

Author: Zhaoxuan Jiang (nocchijiang)


Patch is 37.50 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151660.diff

20 Files Affected:

  • (modified) llvm/include/llvm/CGData/CodeGenData.h (+3)
  • (modified) llvm/include/llvm/CGData/CodeGenData.inc (+1-1)
  • (modified) llvm/include/llvm/CGData/StableFunctionMap.h (+46-5)
  • (modified) llvm/include/llvm/CGData/StableFunctionMapRecord.h (+19)
  • (modified) llvm/lib/CGData/CodeGenData.cpp (+1-1)
  • (modified) llvm/lib/CGData/CodeGenDataReader.cpp (+16-1)
  • (modified) llvm/lib/CGData/StableFunctionMap.cpp (+55-13)
  • (modified) llvm/lib/CGData/StableFunctionMapRecord.cpp (+122-41)
  • (modified) llvm/lib/CodeGen/GlobalMergeFunctions.cpp (+4-6)
  • (modified) llvm/test/ThinLTO/AArch64/cgdata-merge-write.ll (+2)
  • (modified) llvm/test/tools/llvm-cgdata/empty.test (+2-2)
  • (modified) llvm/test/tools/llvm-cgdata/error.test (+2-2)
  • (modified) llvm/test/tools/llvm-cgdata/merge-combined-funcmap-hashtree.test (+3-1)
  • (modified) llvm/test/tools/llvm-cgdata/merge-funcmap-archive.test (+4-4)
  • (modified) llvm/test/tools/llvm-cgdata/merge-funcmap-concat.test (+4-2)
  • (modified) llvm/test/tools/llvm-cgdata/merge-funcmap-double.test (+4-3)
  • (modified) llvm/test/tools/llvm-cgdata/merge-funcmap-single.test (+3-1)
  • (modified) llvm/tools/llvm-cgdata/Opts.td (+1)
  • (modified) llvm/tools/llvm-cgdata/llvm-cgdata.cpp (+5)
  • (modified) llvm/unittests/CGData/StableFunctionMapTest.cpp (+1-1)
diff --git a/llvm/include/llvm/CGData/CodeGenData.h b/llvm/include/llvm/CGData/CodeGenData.h
index 38b96b72ccac6..e44497a408245 100644
--- a/llvm/include/llvm/CGData/CodeGenData.h
+++ b/llvm/include/llvm/CGData/CodeGenData.h
@@ -285,6 +285,9 @@ enum CGDataVersion {
   // Version 3 adds the total size of the Names in the stable function map so
   // we can skip reading them into the memory for non-assertion builds.
   Version3 = 3,
+  // Version 4 adjusts the structure of stable function merging map for
+  // efficient lazy loading support.
+  Version4 = 4,
   CurrentVersion = CG_DATA_INDEX_VERSION
 };
 const uint64_t Version = CGDataVersion::CurrentVersion;
diff --git a/llvm/include/llvm/CGData/CodeGenData.inc b/llvm/include/llvm/CGData/CodeGenData.inc
index 94de4c0b017a2..d5fbe2fb97718 100644
--- a/llvm/include/llvm/CGData/CodeGenData.inc
+++ b/llvm/include/llvm/CGData/CodeGenData.inc
@@ -49,4 +49,4 @@ CG_DATA_SECT_ENTRY(CG_merge, CG_DATA_QUOTE(CG_DATA_MERGE_COMMON),
 #endif
 
 /* Indexed codegen data format version (start from 1). */
-#define CG_DATA_INDEX_VERSION 3
+#define CG_DATA_INDEX_VERSION 4
diff --git a/llvm/include/llvm/CGData/StableFunctionMap.h b/llvm/include/llvm/CGData/StableFunctionMap.h
index bcb72e8216973..b28e71fe8579c 100644
--- a/llvm/include/llvm/CGData/StableFunctionMap.h
+++ b/llvm/include/llvm/CGData/StableFunctionMap.h
@@ -20,6 +20,8 @@
 #include "llvm/ADT/StringMap.h"
 #include "llvm/IR/StructuralHash.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/MemoryBuffer.h"
+#include <mutex>
 
 namespace llvm {
 
@@ -72,11 +74,29 @@ struct StableFunctionMap {
           IndexOperandHashMap(std::move(IndexOperandHashMap)) {}
   };
 
-  using HashFuncsMapType =
-      DenseMap<stable_hash, SmallVector<std::unique_ptr<StableFunctionEntry>>>;
+  using StableFunctionEntries =
+      SmallVector<std::unique_ptr<StableFunctionEntry>>;
+
+  /// In addition to the deserialized StableFunctionEntry, the struct stores
+  /// the offsets of corresponding serialized stable function entries, and a
+  /// once flag for safe lazy loading in a multithreaded environment.
+  struct EntryStorage {
+    StableFunctionEntries Entries;
+
+  private:
+    SmallVector<uint64_t> Offsets;
+    std::once_flag LazyLoadFlag;
+    friend struct StableFunctionMap;
+    friend struct StableFunctionMapRecord;
+  };
+
+  // Note: DenseMap requires value type to be copyable even if only using
+  // in-place insertion. Use STL instead. This also affects the
+  // deletion-while-iteration in finalize().
+  using HashFuncsMapType = std::unordered_map<stable_hash, EntryStorage>;
 
   /// Get the HashToFuncs map for serialization.
-  const HashFuncsMapType &getFunctionMap() const { return HashToFuncs; }
+  const HashFuncsMapType &getFunctionMap() const;
 
   /// Get the NameToId vector for serialization.
   ArrayRef<std::string> getNames() const { return IdToName; }
@@ -99,6 +119,13 @@ struct StableFunctionMap {
   /// \returns true if there is no stable function entry.
   bool empty() const { return size() == 0; }
 
+  bool contains(HashFuncsMapType::key_type FunctionHash) const {
+    return HashToFuncs.count(FunctionHash) > 0;
+  }
+
+  const StableFunctionEntries &
+  at(HashFuncsMapType::key_type FunctionHash) const;
+
   enum SizeType {
     UniqueHashCount,        // The number of unique hashes in HashToFuncs.
     TotalFunctionCount,     // The number of total functions in HashToFuncs.
@@ -119,17 +146,31 @@ struct StableFunctionMap {
   /// `StableFunctionEntry` is ready for insertion.
   void insert(std::unique_ptr<StableFunctionEntry> FuncEntry) {
     assert(!Finalized && "Cannot insert after finalization");
-    HashToFuncs[FuncEntry->Hash].emplace_back(std::move(FuncEntry));
+    HashToFuncs[FuncEntry->Hash].Entries.emplace_back(std::move(FuncEntry));
   }
 
+  void deserializeLazyLoadingEntry(HashFuncsMapType::iterator It);
+
+  /// Eagerly deserialize all the unloaded entries in the lazy loading map.
+  void deserializeLazyLoadingEntries();
+
+  bool isLazilyLoaded() const { return (bool)Buffer; }
+
   /// A map from a stable_hash to a vector of functions with that hash.
-  HashFuncsMapType HashToFuncs;
+  mutable HashFuncsMapType HashToFuncs;
   /// A vector of strings to hold names.
   SmallVector<std::string> IdToName;
   /// A map from StringRef (name) to an ID.
   StringMap<unsigned> NameToId;
   /// True if the function map is finalized with minimal content.
   bool Finalized = false;
+  /// The memory buffer that contains the serialized stable function map for
+  /// lazy loading.
+  /// Non-empty only if this StableFunctionMap is created from a MemoryBuffer
+  /// (i.e. by IndexedCodeGenDataReader::read()) and lazily deserialized.
+  std::shared_ptr<MemoryBuffer> Buffer;
+  /// Whether to read stable function names from the buffer.
+  bool ReadStableFunctionMapNames = true;
 
   friend struct StableFunctionMapRecord;
 };
diff --git a/llvm/include/llvm/CGData/StableFunctionMapRecord.h b/llvm/include/llvm/CGData/StableFunctionMapRecord.h
index a75cb12a70ba6..5a2176574c9e6 100644
--- a/llvm/include/llvm/CGData/StableFunctionMapRecord.h
+++ b/llvm/include/llvm/CGData/StableFunctionMapRecord.h
@@ -40,6 +40,14 @@ struct StableFunctionMapRecord {
                                  const StableFunctionMap *FunctionMap,
                                  std::vector<CGDataPatchItem> &PatchItems);
 
+  /// A static helper function to deserialize the stable function map entry.
+  /// Ptr should be pointing to the start of the fixed-sized fields of the
+  /// entry when passed in.
+  LLVM_ABI static void deserializeEntry(const unsigned char *Ptr,
+                                        stable_hash Hash,
+                                        StableFunctionMap *FunctionMap,
+                                        bool ReadStableFunctionMapNames = true);
+
   /// Serialize the stable function map to a raw_ostream.
   LLVM_ABI void serialize(raw_ostream &OS,
                           std::vector<CGDataPatchItem> &PatchItems) const;
@@ -48,6 +56,13 @@ struct StableFunctionMapRecord {
   LLVM_ABI void deserialize(const unsigned char *&Ptr,
                             bool ReadStableFunctionMapNames = true);
 
+  /// Lazily deserialize the stable function map from `Buffer` starting at
+  /// `Offset`. Individual stable function entries will be read lazily from
+  /// `Buffer` when the function map is accessed.
+  LLVM_ABI void lazyDeserialize(std::shared_ptr<MemoryBuffer> Buffer,
+                                uint64_t Offset,
+                                bool ReadStableFunctionMapNames = true);
+
   /// Serialize the stable function map to a YAML stream.
   LLVM_ABI void serializeYAML(yaml::Output &YOS) const;
 
@@ -70,6 +85,10 @@ struct StableFunctionMapRecord {
     yaml::Output YOS(OS);
     serializeYAML(YOS);
   }
+
+private:
+  void deserialize(const unsigned char *&Ptr, bool ReadStableFunctionMapNames,
+                   bool Lazy);
 };
 
 } // namespace llvm
diff --git a/llvm/lib/CGData/CodeGenData.cpp b/llvm/lib/CGData/CodeGenData.cpp
index cd012342e1958..b4f08c3d13b0d 100644
--- a/llvm/lib/CGData/CodeGenData.cpp
+++ b/llvm/lib/CGData/CodeGenData.cpp
@@ -186,7 +186,7 @@ Expected<Header> Header::readFromBuffer(const unsigned char *Curr) {
     return make_error<CGDataError>(cgdata_error::unsupported_version);
   H.DataKind = endian::readNext<uint32_t, endianness::little, unaligned>(Curr);
 
-  static_assert(IndexedCGData::CGDataVersion::CurrentVersion == Version3,
+  static_assert(IndexedCGData::CGDataVersion::CurrentVersion == Version4,
                 "Please update the offset computation below if a new field has "
                 "been added to the header.");
   H.OutlinedHashTreeOffset =
diff --git a/llvm/lib/CGData/CodeGenDataReader.cpp b/llvm/lib/CGData/CodeGenDataReader.cpp
index 0ab35499c8986..c7c0383930d50 100644
--- a/llvm/lib/CGData/CodeGenDataReader.cpp
+++ b/llvm/lib/CGData/CodeGenDataReader.cpp
@@ -26,6 +26,12 @@ static cl::opt<bool> IndexedCodeGenDataReadFunctionMapNames(
              "disabled to save memory and time for final consumption of the "
              "indexed CodeGenData in production."));
 
+cl::opt<bool> IndexedCodeGenDataLazyLoading(
+    "indexed-codegen-data-lazy-loading", cl::init(false), cl::Hidden,
+    cl::desc(
+        "Lazily load indexed CodeGenData. Enable to save memory and time "
+        "for final consumption of the indexed CodeGenData in production."));
+
 namespace llvm {
 
 static Expected<std::unique_ptr<MemoryBuffer>>
@@ -109,11 +115,20 @@ Error IndexedCodeGenDataReader::read() {
       return error(cgdata_error::eof);
     HashTreeRecord.deserialize(Ptr);
   }
+
+  // TODO: lazy loading support for outlined hash tree.
+  std::shared_ptr<MemoryBuffer> SharedDataBuffer = std::move(DataBuffer);
   if (hasStableFunctionMap()) {
     const unsigned char *Ptr = Start + Header.StableFunctionMapOffset;
     if (Ptr >= End)
       return error(cgdata_error::eof);
-    FunctionMapRecord.deserialize(Ptr, IndexedCodeGenDataReadFunctionMapNames);
+    if (IndexedCodeGenDataLazyLoading)
+      FunctionMapRecord.lazyDeserialize(SharedDataBuffer,
+                                        Header.StableFunctionMapOffset,
+                                        IndexedCodeGenDataReadFunctionMapNames);
+    else
+      FunctionMapRecord.deserialize(Ptr,
+                                    IndexedCodeGenDataReadFunctionMapNames);
   }
 
   return success();
diff --git a/llvm/lib/CGData/StableFunctionMap.cpp b/llvm/lib/CGData/StableFunctionMap.cpp
index 87f1e76afb60b..801c2a3bcfb41 100644
--- a/llvm/lib/CGData/StableFunctionMap.cpp
+++ b/llvm/lib/CGData/StableFunctionMap.cpp
@@ -15,8 +15,10 @@
 
 #include "llvm/CGData/StableFunctionMap.h"
 #include "llvm/ADT/SmallSet.h"
+#include "llvm/CGData/StableFunctionMapRecord.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
+#include <mutex>
 
 #define DEBUG_TYPE "stable-function-map"
 
@@ -93,9 +95,10 @@ void StableFunctionMap::insert(const StableFunction &Func) {
 
 void StableFunctionMap::merge(const StableFunctionMap &OtherMap) {
   assert(!Finalized && "Cannot merge after finalization");
+  deserializeLazyLoadingEntries();
   for (auto &[Hash, Funcs] : OtherMap.HashToFuncs) {
-    auto &ThisFuncs = HashToFuncs[Hash];
-    for (auto &Func : Funcs) {
+    auto &ThisFuncs = HashToFuncs[Hash].Entries;
+    for (auto &Func : Funcs.Entries) {
       auto FuncNameId =
           getIdOrCreateForName(*OtherMap.getNameForId(Func->FunctionNameId));
       auto ModuleNameId =
@@ -114,25 +117,61 @@ size_t StableFunctionMap::size(SizeType Type) const {
   case UniqueHashCount:
     return HashToFuncs.size();
   case TotalFunctionCount: {
+    const_cast<StableFunctionMap *>(this)->deserializeLazyLoadingEntries();
     size_t Count = 0;
     for (auto &Funcs : HashToFuncs)
-      Count += Funcs.second.size();
+      Count += Funcs.second.Entries.size();
     return Count;
   }
   case MergeableFunctionCount: {
+    const_cast<StableFunctionMap *>(this)->deserializeLazyLoadingEntries();
     size_t Count = 0;
     for (auto &[Hash, Funcs] : HashToFuncs)
-      if (Funcs.size() >= 2)
-        Count += Funcs.size();
+      if (Funcs.Entries.size() >= 2)
+        Count += Funcs.Entries.size();
     return Count;
   }
   }
   llvm_unreachable("Unhandled size type");
 }
 
+const StableFunctionMap::StableFunctionEntries &
+StableFunctionMap::at(HashFuncsMapType::key_type FunctionHash) const {
+  auto It = HashToFuncs.find(FunctionHash);
+  if (isLazilyLoaded())
+    const_cast<StableFunctionMap *>(this)->deserializeLazyLoadingEntry(It);
+  return It->second.Entries;
+}
+
+void StableFunctionMap::deserializeLazyLoadingEntry(
+    HashFuncsMapType::iterator It) {
+  assert(isLazilyLoaded() && "Cannot deserialize non-lazily-loaded map");
+  std::call_once(It->second.LazyLoadFlag, [this, It]() {
+    for (auto Offset : It->second.Offsets)
+      StableFunctionMapRecord::deserializeEntry(
+          reinterpret_cast<const unsigned char *>(Offset), It->first, this,
+          ReadStableFunctionMapNames);
+  });
+}
+
+void StableFunctionMap::deserializeLazyLoadingEntries() {
+  if (!isLazilyLoaded())
+    return;
+  for (auto It = HashToFuncs.begin(); It != HashToFuncs.end(); ++It)
+    deserializeLazyLoadingEntry(It);
+}
+
+const StableFunctionMap::HashFuncsMapType &
+StableFunctionMap::getFunctionMap() const {
+  // Ensure all entries are deserialized before returning the raw map.
+  if (isLazilyLoaded())
+    const_cast<StableFunctionMap *>(this)->deserializeLazyLoadingEntries();
+  return HashToFuncs;
+}
+
 using ParamLocs = SmallVector<IndexPair>;
-static void removeIdenticalIndexPair(
-    SmallVector<std::unique_ptr<StableFunctionMap::StableFunctionEntry>> &SFS) {
+static void
+removeIdenticalIndexPair(StableFunctionMap::StableFunctionEntries &SFS) {
   auto &RSF = SFS[0];
   unsigned StableFunctionCount = SFS.size();
 
@@ -159,9 +198,7 @@ static void removeIdenticalIndexPair(
       SF->IndexOperandHashMap->erase(Pair);
 }
 
-static bool isProfitable(
-    const SmallVector<std::unique_ptr<StableFunctionMap::StableFunctionEntry>>
-        &SFS) {
+static bool isProfitable(const StableFunctionMap::StableFunctionEntries &SFS) {
   unsigned StableFunctionCount = SFS.size();
   if (StableFunctionCount < GlobalMergingMinMerges)
     return false;
@@ -202,8 +239,11 @@ static bool isProfitable(
 }
 
 void StableFunctionMap::finalize(bool SkipTrim) {
+  deserializeLazyLoadingEntries();
+  SmallVector<HashFuncsMapType::iterator> ToDelete;
   for (auto It = HashToFuncs.begin(); It != HashToFuncs.end(); ++It) {
-    auto &[StableHash, SFS] = *It;
+    auto &[StableHash, Storage] = *It;
+    auto &SFS = Storage.Entries;
 
     // Group stable functions by ModuleIdentifier.
     llvm::stable_sort(SFS, [&](const std::unique_ptr<StableFunctionEntry> &L,
@@ -236,7 +276,7 @@ void StableFunctionMap::finalize(bool SkipTrim) {
       }
     }
     if (Invalid) {
-      HashToFuncs.erase(It);
+      ToDelete.push_back(It);
       continue;
     }
 
@@ -248,8 +288,10 @@ void StableFunctionMap::finalize(bool SkipTrim) {
     removeIdenticalIndexPair(SFS);
 
     if (!isProfitable(SFS))
-      HashToFuncs.erase(It);
+      ToDelete.push_back(It);
   }
+  for (auto It : ToDelete)
+    HashToFuncs.erase(It);
 
   Finalized = true;
 }
diff --git a/llvm/lib/CGData/StableFunctionMapRecord.cpp b/llvm/lib/CGData/StableFunctionMapRecord.cpp
index 423e068023088..d60e4a30453d8 100644
--- a/llvm/lib/CGData/StableFunctionMapRecord.cpp
+++ b/llvm/lib/CGData/StableFunctionMapRecord.cpp
@@ -53,7 +53,7 @@ static SmallVector<const StableFunctionMap::StableFunctionEntry *>
 getStableFunctionEntries(const StableFunctionMap &SFM) {
   SmallVector<const StableFunctionMap::StableFunctionEntry *> FuncEntries;
   for (const auto &P : SFM.getFunctionMap())
-    for (auto &Func : P.second)
+    for (auto &Func : P.second.Entries)
       FuncEntries.emplace_back(Func.get());
 
   llvm::stable_sort(
@@ -104,17 +104,39 @@ void StableFunctionMapRecord::serialize(
       Writer.OS.tell() - NamesByteSizeOffset - sizeof(NamesByteSizeOffset);
   PatchItems.emplace_back(NamesByteSizeOffset, &NamesByteSize, 1);
 
-  // Write StableFunctionEntries whose pointers are sorted.
+  // Write StableFunctionEntries. The structure is:
+  // - Number of StableFunctionEntries
+  // - Hashes of StableFunctionEntries
+  // - Fixed-size fields for each StableFunctionEntry
+  //   - FunctionNameId
+  //   - ModuleNameId
+  //   - InstCount
+  //   - Relative offset to IndexOperandHashes
+  // - Total size of variable-sized IndexOperandHashes for lazy-loading support
+  // - Variable-sized IndexOperandHashes for each StableFunctionEntry
+  //   - Number of IndexOperandHashes
+  //   - Contents of each IndexOperandHashes
   auto FuncEntries = getStableFunctionEntries(*FunctionMap);
   Writer.write<uint32_t>(FuncEntries.size());
-
-  for (const auto *FuncRef : FuncEntries) {
+  for (const auto *FuncRef : FuncEntries)
     Writer.write<stable_hash>(FuncRef->Hash);
+  std::vector<uint64_t> IndexOperandHashesOffsets;
+  IndexOperandHashesOffsets.reserve(FuncEntries.size());
+  for (const auto *FuncRef : FuncEntries) {
     Writer.write<uint32_t>(FuncRef->FunctionNameId);
     Writer.write<uint32_t>(FuncRef->ModuleNameId);
     Writer.write<uint32_t>(FuncRef->InstCount);
-
+    const uint64_t Offset = Writer.OS.tell();
+    IndexOperandHashesOffsets.push_back(Offset);
+    Writer.write<uint64_t>(0);
+  }
+  const uint64_t IndexOperandHashesByteSizeOffset = Writer.OS.tell();
+  Writer.write<uint64_t>(0);
+  for (size_t I = 0; I < FuncEntries.size(); ++I) {
+    const uint64_t Offset = Writer.OS.tell() - IndexOperandHashesOffsets[I];
+    PatchItems.emplace_back(IndexOperandHashesOffsets[I], &Offset, 1);
     // Emit IndexOperandHashes sorted from IndexOperandHashMap.
+    const auto *FuncRef = FuncEntries[I];
     IndexOperandHashVecType IndexOperandHashes =
         getStableIndexOperandHashes(FuncRef);
     Writer.write<uint32_t>(IndexOperandHashes.size());
@@ -124,10 +146,64 @@ void StableFunctionMapRecord::serialize(
       Writer.write<stable_hash>(IndexOperandHash.second);
     }
   }
+  // Write the total size of IndexOperandHashes.
+  const uint64_t IndexOperandHashesByteSize =
+      Writer.OS.tell() - IndexOperandHashesByteSizeOffset - sizeof(uint64_t);
+  PatchItems.emplace_back(IndexOperandHashesByteSizeOffset,
+                          &IndexOperandHashesByteSize, 1);
+}
+
+void StableFunctionMapRecord::deserializeEntry(
+    const unsigned char *Ptr, stable_hash Hash, StableFunctionMap *FunctionMap,
+    bool ReadStableFunctionMapNames) {
+  assert(FunctionMap->ReadStableFunctionMapNames == ReadStableFunctionMapNames);
+  auto FunctionNameId =
+      endian::readNext<uint32_t, endianness::little, unaligned>(Ptr);
+  if (ReadStableFunctionMapNames)
+    assert(FunctionMap->getNameForId(FunctionNameId) &&
+           "FunctionNameId out of range");
+  auto ModuleNameId =
+      endian::readNext<uint32_t, endianness::little, unaligned>(Ptr);
+  if (ReadStableFunctionMapNames)
+    assert(FunctionMap->getNameForId(ModuleNameId) &&
+           "ModuleNameId out of range");
+  auto InstCount =
+      endian::readNext<uint32_t, endianness::little, unaligned>(Ptr);
+
+  // Read IndexOperandHashes to build IndexOperandHashMap
+  auto CurrentPosition = reinterpret_cast<uintptr_t>(Ptr);
+  auto IndexOperandHashesOffset =
+      endian::readNext<uint64_t, endianness::little, unaligned>(Ptr);
+  auto *IndexOperandHashesPtr = reinterpret_cast<const unsigned char *>(
+      CurrentPosition + IndexOperandHashesOffset);
+  auto NumIndexOperandHashes =
+      endian::readNext<uint32_t, endianness::little, unaligned>(
+          IndexOperandHashesPtr);
+  auto IndexOperandHashMap = std::make_unique<IndexOperandHashMapType>();
+  for (unsigned J = 0; J < NumIndexOperandHashes; ++J) {
+    auto InstIndex = endian::readNext<uint32_t, endianness::little, unaligned>(
+        IndexOperandHashesPtr);
+    auto OpndIndex = endian::readNext<uint32_t, endianness::little, unaligned>(
+        IndexOperandHashesPtr);
+    auto OpndHash =
+        endian::readNext<stable_hash, endianness::little, unaligned>(
+            IndexOperandHashesPtr);
+    assert(InstIndex < InstCount && "InstIndex out of range");
+
+    IndexOperandHashMap->try_emplace({InstIndex, OpndIndex}, OpndHash);
+  }
+
+  // Insert a new StableFunctionEntry into the map.
+  auto FuncEntry = std::make_unique<StableFunctionMap::StableFunctionEntry>(
+      Hash, FunctionNameId, ModuleNameId, InstCount,
+      std::move(IndexOperandHashMap));
+
+  FunctionMap->insert(std::move(FuncEntry));
 }
 
 void StableFunctionMapRecord::deserialize(const unsigned char *&Ptr,
-                                          bool ReadStableFunctionMapNames) {
+                                          bool ReadStableFunctionMapNames,
+                                          bool Lazy) {
   // Assert that Ptr is 4-byte aligned
   assert(((uintptr_t)Ptr % 4) == 0);
   // Read Names.
@@ -139,6 +215,7 @@ void StableFunctionMapRecord::deserialize(const unsigned char *&Ptr,
   const auto NamesByteSize =
       endian::readNext<uint64_t, endianness::little, unaligned>(Ptr);
   const auto NamesOffset = reinterpret_cast<uintptr_t>(Ptr);
+  FunctionMap->ReadStableFunctionMapNames = ReadStableFunctionMapNames;
   if (ReadStableFunctionMapNam...
[truncated]

nocchijiang (Contributor, Author)

@kyulee-com
