How LLVM's Type-Based Alias Analysis Actually Works || Packed Bits

Type-Based Alias Analysis (TBAA) is one of LLVM’s quietest but most important optimizations. It’s what lets the compiler prove that a float * and an int * cannot point at the same memory, which unlocks dead-store elimination, load forwarding, vectorization, and a long list of other tricks. If you’ve ever wondered why -fno-strict-aliasing slows your C code down, TBAA is the answer.

This post walks through how TBAA actually works in the LLVM source tree. We’ll start from a real test case, follow the metadata Clang emits, and then step through the analysis code line by line. By the end you’ll be able to read an arbitrary !tbaa node and predict what LLVM will conclude.

All source citations point to the LLVM monorepo. The key files are:

llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp — the analysis itself
llvm/include/llvm/Analysis/TypeBasedAliasAnalysis.h — public interface
clang/lib/CodeGen/CodeGenTBAA.cpp — Clang-side metadata generation
llvm/test/Analysis/TypeBasedAliasAnalysis/ — regression tests we’ll use as worked examples

The problem TBAA is trying to solve

Consider this C++ fragment:

struct foo { char v; };
struct bar { char v; };

char example(foo *a, bar *b) {
  char x = a->v;
  b->v = 0;
  char y = a->v;
  return x + y;
}

A naive compiler has to assume b->v = 0 might overwrite a->v, so it has to reload a->v into y. But C and C++ have a “strict aliasing” rule: an object may only be accessed through an lvalue of a compatible type (with some well-known exceptions like char*). Under that rule, foo* and bar* cannot overlap, so y must equal x and the second load is redundant.

TBAA is how Clang and LLVM communicate this information: Clang annotates every load and store with metadata describing the type that was used to access memory, and the analysis answers questions like “could these two accesses refer to the same byte?”.

The simplest possible test case

Open llvm/test/Analysis/TypeBasedAliasAnalysis/aliastest.ll. Here’s the first function, lines 7–13:

; CHECK: @test0_yes
; CHECK: add i8 %x, %x
define i8 @test0_yes(ptr %a, ptr %b) nounwind {
  %x = load i8, ptr %a, !tbaa !1
  store i8 0, ptr %b, !tbaa !2
  %y = load i8, ptr %a, !tbaa !1
  %z = add i8 %x, %y
  ret i8 %z
}

The CHECK lines tell us the expected outcome after opt -passes=gvn: the two loads collapse into one, so the add becomes %x + %x instead of %x + %y. That only happens if TBAA can prove the store to %b doesn’t clobber %a.

The metadata is at the bottom of the file, lines 47–68:

; Root note.
!0 = !{ }
; Some type.
!1 = !{!7, !7, i64 0}
; Some other non-aliasing type.
!2 = !{!8, !8, i64 0}

!7 = !{ !"foo", !0 }
!8 = !{ !"bar", !0 }

Three kinds of nodes are in play:

!0 is a root node — just an empty marker that says “everything below me is one family of types”
!7 and !8 are scalar type nodes — each carries a name and a pointer to its parent node
!1 and !2 are access tags — what !tbaa actually points to; they describe “the access is of type X, rooted at base type Y, at offset Z”

If we draw the type nodes using their parent pointers, they form a tree:

        !0   (root)
       /  \
     !7    !8
    (foo) (bar)

Throughout this post I’ll call this the type tree. It’s the picture to keep in your head: TBAA’s job is to compare two leaves of this tree and decide whether the accesses they describe can touch the same memory.

(Strictly speaking, when you add struct-path TBAA later, some nodes become reachable by more than one path — so it’s technically a DAG, not a tree — but for the parent-walking logic that decides aliasing, “tree” is exactly the right mental picture.)

In this simplified test, each access tag has the same base type and access type (that’s what the {!7, !7, i64 0} triple means). Real Clang-generated metadata uses the same shape but the base and access types often differ for struct field accesses.

How Clang emits this metadata

clang/lib/CodeGen/CodeGenTBAA.cpp is where the compiler builds these nodes. Three functions are worth looking at.

The root node (CodeGenTBAA.cpp:47-58):

llvm::MDNode *CodeGenTBAA::getRoot() {
  if (!Root) {
    if (Features.CPlusPlus)
      Root = MDHelper.createTBAARoot("Simple C++ TBAA");
    else
      Root = MDHelper.createTBAARoot("Simple C/C++ TBAA");
  }
  return Root;
}

Each translation unit has one root. A different frontend (say, a Rust compiler emitting LLVM IR) would pick a different root string, and TBAA uses that difference as a signal that the two resulting type trees are unrelated — more on that in a moment.

The “omnipotent char” node (CodeGenTBAA.cpp:63-68):

llvm::MDNode *CodeGenTBAA::getChar() {
  if (!Char)
    Char = createScalarTypeNode("omnipotent char", getRoot(), 1);
  return Char;
}

This is the special-case implementation of C99 §6.5p7: an lvalue of character type may access the stored value of any object. In the type tree, char sits right below the root and becomes the parent of virtually every other scalar type, which means every scalar type is a descendant of char. We’ll see in a minute why that descendant relationship makes char-typed accesses alias everything.

A realistic Clang-emitted tree looks more like this:

           root ("Simple C++ TBAA")
            |
           char ("omnipotent char")
          / |  \
        int short long ...

Each scalar type’s parent pointer eventually lands at the root by passing through char.

Scalar types (CodeGenTBAA.cpp:159-371, getTypeInfoHelper):

if (const BuiltinType *BTy = dyn_cast<BuiltinType>(Ty)) {
  switch (BTy->getKind()) {
  case BuiltinType::Char_U:
  case BuiltinType::Char_S:
  case BuiltinType::UChar:
  case BuiltinType::SChar:
    return getChar();
  case BuiltinType::UShort:
    return getTypeInfo(Context.ShortTy);
  case BuiltinType::UInt:
    return getTypeInfo(Context.IntTy);
  // ... etc.
  }
}

Two design decisions to notice: every flavor of char collapses to the same node (they all alias everything), and each unsigned integer type shares a node with its signed counterpart (since C says int and unsigned int may alias).

Access tags (CodeGenTBAA.cpp:629-655, getAccessTagInfo):

llvm::MDNode *CodeGenTBAA::getAccessTagInfo(TBAAAccessInfo Info) {
  ...
  if (Info.isMayAlias())
    Info = TBAAAccessInfo(getChar(), Info.Size);
  ...
  return N = MDHelper.createTBAAStructTagNode(
      Info.BaseType, Info.AccessType, Info.Offset);
}

This is the function whose output ends up attached to a load or store instruction. Notice the escape hatch at the top: if the access is tagged as may_alias (for example because the C source used __attribute__((may_alias))), the access type is replaced with omnipotent char, effectively disabling TBAA for that access.

Where the analysis gets invoked

The optimizer never calls TBAA directly — it asks the aggregate AAResults object, which chains alias-analysis providers. TBAA is one provider. Its entry point is TypeBasedAAResult::alias (TypeBasedAliasAnalysis.cpp:376-387):

AliasResult TypeBasedAAResult::alias(const MemoryLocation &LocA,
                                     const MemoryLocation &LocB,
                                     AAQueryInfo &AAQI, const Instruction *) {
  if (!shouldUseTBAA())
    return AliasResult::MayAlias;

  if (Aliases(LocA.AATags.TBAA, LocB.AATags.TBAA))
    return AliasResult::MayAlias;

  // Otherwise return a definitive result.
  return AliasResult::NoAlias;
}

The shape of this function is worth staring at: TBAA is a proof of negation. It only ever returns NoAlias when it can prove non-aliasing; any uncertainty falls back to MayAlias and lets other analyses have a shot.

Aliases() is a one-line wrapper around matchAccessTags (TypeBasedAliasAnalysis.cpp:728-730):

bool TypeBasedAAResult::Aliases(const MDNode *A, const MDNode *B) const {
  return matchAccessTags(A, B);
}

Walking through `matchAccessTags`

This is where the real work happens. The function is at TypeBasedAliasAnalysis.cpp:679-724. Simplified:

static bool matchAccessTags(const MDNode *A, const MDNode *B,
                            const MDNode **GenericTag) {
  if (A == B) {                           // ①
    if (GenericTag) *GenericTag = A;
    return true;
  }

  if (!A || !B) {                         // ②
    if (GenericTag) *GenericTag = nullptr;
    return true;
  }

  TBAAStructTagNode TagA(A), TagB(B);
  const MDNode *CommonType = getLeastCommonType(
      TagA.getAccessType(), TagB.getAccessType());   // ③

  if (!CommonType) {                      // ④
    if (GenericTag) *GenericTag = nullptr;
    return true;
  }

  bool MayAlias;
  if (mayBeAccessToSubobjectOf(TagA, TagB, CommonType, GenericTag, MayAlias) || // ⑤
      mayBeAccessToSubobjectOf(TagB, TagA, CommonType, GenericTag, MayAlias))
    return MayAlias;

  if (GenericTag) *GenericTag = createAccessTag(CommonType);   // ⑥
  return false;
}

Let’s run the two functions from aliastest.ll through this.

Tracing `test0_yes`

Inputs: A = !1 = {!7, !7, 0} (type foo), B = !2 = {!8, !8, 0} (type bar). Both !7 and !8 have parent !0 (the same root).

Step ①: !1 != !2, skip.
Step ②: both non-null, skip.
Step ③: compute the least common type of !7 (foo) and !8 (bar).

The LCA computation is at TypeBasedAliasAnalysis.cpp:503-540:

static const MDNode *getLeastCommonType(const MDNode *A, const MDNode *B) {
  if (!A || !B) return nullptr;
  if (A == B) return A;

  SmallSetVector<const MDNode *, 4> PathA;
  TBAANode TA(A);
  while (TA.getNode()) {
    if (!PathA.insert(TA.getNode()))
      report_fatal_error("Cycle found in TBAA metadata.");
    TA = TA.getParent();
  }
  SmallSetVector<const MDNode *, 4> PathB;
  TBAANode TB(B);
  while (TB.getNode()) {
    if (!PathB.insert(TB.getNode()))
      report_fatal_error("Cycle found in TBAA metadata.");
    TB = TB.getParent();
  }

  int IA = PathA.size() - 1;
  int IB = PathB.size() - 1;
  const MDNode *Ret = nullptr;
  while (IA >= 0 && IB >= 0) {
    if (PathA[IA] == PathB[IB]) Ret = PathA[IA];
    else break;
    --IA; --IB;
  }
  return Ret;
}

Tracing:

PathA = [!7(foo), !0(root)]
PathB = [!8(bar), !0(root)]
Comparing from the root end: PathA[1] == PathB[1] (both !0) → Ret = !0. Then PathA[0] != PathB[0] (foo vs bar) → break.
Returns !0 (the root is the common ancestor).

Back in matchAccessTags:

Step ④: CommonType = !0 is non-null, skip.
Step ⑤: call mayBeAccessToSubobjectOf(foo, bar, root, ...).

That function (TypeBasedAliasAnalysis.cpp:607-674) tries to prove that one access could be reaching a subfield of the other. For scalar types with no fields, it walks getField() once, gets nothing, and bails out. Neither direction succeeds.

Step ⑥: reached. Return false.

matchAccessTags returns false → Aliases returns false → alias() returns NoAlias. GVN is now allowed to forward the first load to the second one, giving %z = add i8 %x, %x. That’s exactly what the test expects.

Tracing `test0_no`

Same shape as test0_yes, but the two accesses live under different roots — so they belong to two unrelated type trees:

!3 = !{!9, !9, i64 0}
!4 = !{!10, !10, i64 0}
!9  = !{ !"foo", !0 }      ; rooted at !0
!10 = !{ !"bar", !12 }     ; rooted at a DIFFERENT root
!12 = !{!"different"}

Run through getLeastCommonType(!9, !10):

PathA = [!9, !0]
PathB = [!10, !12]
Comparing from the top: !0 != !12 — no match, break immediately.
Returns nullptr.

So step ④ fires:

if (!CommonType) return true;

matchAccessTags returns true (they may alias), alias() returns MayAlias, GVN leaves both loads in place, and the CHECK: add i8 %x, %y assertion in the test passes.

The lesson: nodes from different trees are never comparable — if there’s no shared ancestor, TBAA gives up and says “may alias”. This is the safety net that lets LLVM link IR from different frontends without accidentally concluding that a Rust i32 doesn’t alias a C++ int.

Struct-path TBAA: adding byte offsets

The aliastest.ll example uses the old “scalar” format. Real C++ code uses struct-path TBAA, which adds field offsets so the analysis can reason about StructA.f32 vs StructB.f16 precisely.

Open llvm/test/Analysis/TypeBasedAliasAnalysis/tbaa-path-new.ll:271-290 for the metadata side:

!2 = !{!3, !3, i64 0, i64 4}              ; access tag: uint32_t
!3 = !{!4, i64 4, !"int"}                 ; scalar type: int, parent=char
!4 = !{!5, i64 1, !"omnipotent char"}     ; omnipotent char, parent=root
!5 = !{!"Simple C++ TBAA"}                ; root

!6 = !{!7, !3, i64 4, i64 4}              ; access tag: StructA.f32 (offset 4, size 4)
!7 = !{!4, i64 16, !"_ZTS7StructA",
       !8, i64 0, i64 2,                  ; short f16 @ offset 0
       !3, i64 4, i64 4,                  ; int   f32 @ offset 4
       !8, i64 8, i64 2,
       !3, i64 12, i64 4}

Type nodes now carry their total size and a list of (field-type, offset, size) triples. Access tags now include the base type, the access type, the offset, and the size of the access. The mangled name _ZTS7StructA (Itanium C++ ABI) makes struct identity stable across translation units, which is what keeps TBAA working under LTO.

The matching code for these tags is mayBeAccessToSubobjectOf (TypeBasedAliasAnalysis.cpp:607-674). The core loop is:

TBAAStructTypeNode BaseType(BaseTag.getBaseType());
uint64_t OffsetInBase = BaseTag.getOffset();

for (;;) {
  if (!BaseType.getNode()) break;

  if (BaseType.getNode() == SubobjectTag.getBaseType()) {
    MayAlias = OffsetInBase == SubobjectTag.getOffset() ||
               BaseType.getNode() == BaseTag.getAccessType() ||
               SubobjectTag.getBaseType() == SubobjectTag.getAccessType();
    return true;
  }

  if (NewFormat && BaseType.getNode() == BaseTag.getAccessType())
    break;

  BaseType = BaseType.getField(OffsetInBase);
}

TBAAStructTypeNode::getField(Offset&) (TypeBasedAliasAnalysis.cpp:306-362) finds the field containing Offset, descends into it, and adjusts Offset to be relative to that field. This lets the analysis follow StructB.a.f32 by literally walking down byte-by-byte: enter StructB at offset 8, subtract 4 (where a starts within StructB), land inside StructA at offset 4, find the int field. If at any point the chain of containments cannot justify the other access’s offset, mayBeAccessToSubobjectOf returns false and we get NoAlias.

The second test in tbaa-path-new.ll shows what this precision buys you:

store i32 1, ptr %s, align 4, !tbaa !2    ; writing an int
store i16 4, ptr %A, align 4, !tbaa !9    ; writing a short
ret i32 1

The two accesses have overlapping addresses but different access types. In the type tree, int and short are both children of char — siblings of each other, not ancestors. Since neither sits above the other, neither access can be reaching “through” the other’s type, so matchAccessTags concludes NoAlias. The store i32 1 is then known to be live across the store i16 4, and constant propagation can fold the return value to 1.

Helper classes that appear everywhere

A few small wrappers deserve a quick look (TypeBasedAliasAnalysis.cpp:150-362):

template<typename MDNodeTy>
class TBAANodeImpl {
  MDNodeTy *Node = nullptr;
public:
  TBAANodeImpl<MDNodeTy> getParent() const {
    if (isNewFormat())
      return TBAANodeImpl(cast<MDNodeTy>(Node->getOperand(0)));
    if (Node->getNumOperands() < 2)
      return TBAANodeImpl<MDNodeTy>();
    MDNodeTy *P = dyn_cast_or_null<MDNodeTy>(Node->getOperand(1));
    return TBAANodeImpl<MDNodeTy>(P);
  }

  bool isTypeImmutable() const {
    if (Node->getNumOperands() < 3) return false;
    ConstantInt *CI = mdconst::dyn_extract<ConstantInt>(Node->getOperand(2));
    return CI && CI->getValue()[0];
  }
};

TBAANode is the generic traversal helper — it’s what getLeastCommonType uses. TBAAStructTagNode does the same for access tags (reading base type, access type, offset, size), and TBAAStructTypeNode exposes the field navigation used by mayBeAccessToSubobjectOf. Whenever you see A->getOperand(2) style code in other parts of LLVM dealing with TBAA, it’s almost always going through one of these wrappers.

The mental model

Walk away with these four points:

TBAA is a proof of NoAlias. The analysis is only useful when it can definitively disprove aliasing; any ambiguity becomes MayAlias.
Types form a tree rooted at a per-frontend root. char sits just below the root and is the ancestor of almost every other scalar type, which is how C’s character-access special case is encoded — anything reachable by walking up from a type passes through char.
Access tags, not types, are what the optimizer compares. A tag is {base type, access type, offset, size} and is attached to every load and store.
Struct-path TBAA adds offsets. Comparing two struct-path access tags reduces to walking down through the struct layout following those offsets; when the walk can’t reconcile the two accesses, they’re proven disjoint.

If you want to read more of the source on your own, the three functions worth tracing by hand are matchAccessTags (the dispatcher), getLeastCommonType (comparability check), and mayBeAccessToSubobjectOf (the precision engine). Everything else is bookkeeping and metadata plumbing.

How LLVM's Type-Based Alias Analysis Actually Works