KnownBits: The Per-Bit Abstract Domain That Half the LLVM Optimizer Relies On || Packed Bits

If you spend time reading LLVM optimization passes, you’ll run into the same pattern over and over: a fold starts with something like

KnownBits Known = computeKnownBits(X, DL, &AC, CxtI, &DT);
if ((SomeMask | Known.Zero).isAllOnes())
  return rewriteThing(...);

and the fold is only legal because of what Known told us. The KnownBits struct is one of LLVM’s most used internal abstractions. It shows up in InstCombine, InstructionSimplify, SelectionDAG’s demanded-bits analysis, GlobalISel, DemandedBits, the Attributor, and a long tail of other passes. It’s the compiler’s way of reasoning about integer values at the bit level without actually running them.

This post walks through what KnownBits is, how it’s computed, and — the fun part — how the rest of the optimizer uses it. I’ll cite a lot of real code across several files. If you’ve read my InstCombine post, you’ve already seen computeKnownBits called once. Here I want to pull back and show you the whole cast.

All line numbers refer to files under /data/dev/llvm-project/.

The struct

llvm/include/llvm/Support/KnownBits.h, line 24:

struct KnownBits {
  APInt Zero;
  APInt One;
  // ...
};

Two APInts of the same bit width as the value being tracked. For each bit position i:

Zero[i] == 1 means bit i is provably 0
One[i] == 1 means bit i is provably 1
both 0 means bit i is unknown — could be either
both 1 means conflict — the value is contradictory, i.e. unreachable

The central invariant is Zero.intersects(One) == false. When a fold proves that the invariant would be violated, it’s discovered a piece of the program that can never execute — and that’s itself a useful signal, which the struct exposes as hasConflict() (KnownBits.h:51):

bool hasConflict() const { return Zero.intersects(One); }

InstructionSimplify actually uses this to fold things to poison — more on that later.

The public API

KnownBits.h is worth skimming in full, but here are the operations you’ll see everywhere:

Queries. These are all constant-time:

bool isZero()                 // All bits provably 0
bool isAllOnes()              // All bits provably 1
bool isConstant()             // Every bit is known
APInt getConstant()           // Retrieve the constant (asserts isConstant())
bool isNegative()             // Sign bit known 1
bool isNonNegative()          // Sign bit known 0
APInt getMinValue()           // Unknown bits assumed 0 → minimum possible value
APInt getMaxValue()           // Unknown bits assumed 1 → maximum possible value
unsigned countMinTrailingZeros()   // Guaranteed trailing zeros
unsigned countMinLeadingZeros()    // Guaranteed leading zeros
unsigned countMinPopulation()      // Guaranteed popcount (lower bound)

Mutators:

void setAllZero()             // Value is zero
void setAllOnes()             // Value is all-ones (max unsigned)
void makeConstant(const APInt &C)
void makeNegative()           // Set sign bit known-1

Combining two KnownBits — two values, at two program points, with known bit sets; merge them:

// Both values must agree on every known bit → intersection
KnownBits intersectWith(const KnownBits &RHS) const;

// Either value is possible → union  
KnownBits unionWith(const KnownBits &RHS) const;

intersectWith is what you use for phi nodes — the merge has to be consistent with every incoming value. unionWith is for computing “this is either X or Y” — e.g., tracking what a value could be after a diamond if-then-else.

Static semantic operators. These implement the semantics of IR operations on the abstract domain:

static KnownBits add(const KnownBits &LHS, const KnownBits &RHS, bool NSW, bool NUW);
static KnownBits sub(const KnownBits &LHS, const KnownBits &RHS, bool NSW, bool NUW);
static KnownBits mul(const KnownBits &LHS, const KnownBits &RHS, bool NUW);
static KnownBits shl(const KnownBits &LHS, const KnownBits &RHS, bool NUW, bool NSW, bool ShAmtNonZero);
static KnownBits lshr(const KnownBits &LHS, const KnownBits &RHS, ...);
static KnownBits ashr(...);

This is the important part. KnownBits::shl(X, C) says “if you shift a value with these known bits left by a shift amount with these known bits, here are the known bits of the result”. It handles all the subtleties: when the shift amount itself is unknown, the result combines the possibilities from every possible shift.

There are even predicate versions — KnownBits::ult, KnownBits::sgt, etc. — which return an std::optional<bool>. If the ranges implied by both operands are fully disjoint on the specified comparison, the answer is a definite true or false. If they overlap, nullopt (unknown).

How `computeKnownBits` actually computes

The computation lives in llvm/lib/Analysis/ValueTracking.cpp. The top-level entry (ValueTracking.cpp:141-150):

void llvm::computeKnownBits(const Value *V, KnownBits &Known,
                            const SimplifyQuery &Q, unsigned Depth) {
  auto *FVTy = dyn_cast<FixedVectorType>(V->getType());
  APInt DemandedElts =
      FVTy ? APInt::getAllOnes(FVTy->getNumElements()) : APInt(1, 1);
  ::computeKnownBits(V, DemandedElts, Known, Q, Depth);
}

The SimplifyQuery carries the DataLayout, AssumptionCache, DominatorTree, and a context instruction. Those together let the analysis answer “what’s known about V at this particular program point”. The same SSA value can have different KnownBits at different points in the CFG (see the section on @llvm.assume below).

The heart of the computation is computeKnownBitsFromOperator (ValueTracking.cpp:1403-1702), a big switch over instruction opcodes. A few cases to ground the idea:

Instruction::Add (ValueTracking.cpp:1673-1678):

case Instruction::Add: {
  bool NSW = Q.IIQ.hasNoSignedWrap(cast<OverflowingBinaryOperator>(I));
  bool NUW = Q.IIQ.hasNoUnsignedWrap(cast<OverflowingBinaryOperator>(I));
  computeKnownBitsAddSub(true, I->getOperand(0), I->getOperand(1), NSW, NUW,
                         DemandedElts, Known, Known2, Q, Depth);
  break;
}

Recursively computes KnownBits of both operands, then calls the semantic operator KnownBits::computeForAddSub. The NSW / NUW flags sharpen the result — if the add can’t overflow, you can derive tighter bounds.

Instruction::Shl (ValueTracking.cpp:1627-1640):

case Instruction::Shl: {
  bool NUW = Q.IIQ.hasNoUnsignedWrap(cast<OverflowingBinaryOperator>(I));
  bool NSW = Q.IIQ.hasNoSignedWrap(cast<OverflowingBinaryOperator>(I));
  auto KF = [NUW, NSW](const KnownBits &KnownVal, const KnownBits &KnownAmt,
                       bool ShAmtNonZero) {
    return KnownBits::shl(KnownVal, KnownAmt, NUW, NSW, ShAmtNonZero);
  };
  computeKnownBitsFromShiftOperator(I, DemandedElts, Known, Known2, Q, Depth, KF);
  const APInt *C;
  if (match(I->getOperand(0), m_APInt(C)))
    Known.Zero.setLowBits(C->countr_zero());
  break;
}

Shift amounts can themselves be unknown; the helper deals with that. The bonus at the bottom — “if the thing being shifted is a constant, we can set trailing-zero bits directly” — is one of many small opportunistic refinements.

Instruction::Load with range metadata (ValueTracking.cpp:1413-1416):

case Instruction::Load:
  if (MDNode *MD =
          Q.IIQ.getMetadata(cast<LoadInst>(I), LLVMContext::MD_range))
    computeKnownBitsFromRangeMetadata(*MD, Known);
  break;

When the load has !range !0 metadata saying, e.g., !0 = !{i32 0, i32 256}, computeKnownBitsFromRangeMetadata (ValueTracking.cpp:580-607) extracts known bits directly:

unsigned CommonPrefixBits =
    (Range.getUnsignedMax() ^ Range.getUnsignedMin()).countl_zero();
APInt Mask = APInt::getHighBitsSet(BitWidth, CommonPrefixBits);
APInt UnsignedMax = Range.getUnsignedMax().zextOrTrunc(BitWidth);
Known.One &= UnsignedMax & Mask;
Known.Zero &= ~UnsignedMax & Mask;

For [0, 256): the common prefix of 0 and 255 is 24 leading zero bits. So the high 24 bits are known zero. The low 8 bits are whatever the value is — unknown.

Every IR opcode has a case. There are over 80 of them, each doing the same kind of work: recursively compute operand bits, run the semantic operator, return. Enforced depth limit (MaxRecursionDepth = 6 in the default) keeps it bounded.

Context: branches and `@llvm.assume`

The same SSA value can have different known bits at different program points — because dominating branches and @llvm.assume calls can narrow what’s possible. This is handled by computeKnownBitsFromContext (ValueTracking.cpp:1049-1074):

void llvm::computeKnownBitsFromContext(const Value *V, KnownBits &Known,
                                       const SimplifyQuery &Q, unsigned Depth) {
  if (Q.CC && Q.CC->AffectedValues.contains(V))
    computeKnownBitsFromCond(V, Q.CC->Cond, Known, Q, Q.CC->Invert, Depth);

  if (Q.DC && Q.DT) {
    for (CondBrInst *BI : Q.DC->conditionsFor(V)) {
      BasicBlockEdge Edge0(BI->getParent(), BI->getSuccessor(0));
      if (Q.DT->dominates(Edge0, Q.CxtI->getParent()))
        computeKnownBitsFromCond(V, BI->getCondition(), Known, Q,
                                 /*Invert*/ false, Depth);
      // ... similar for Edge1 with Invert=true
    }
  }

  if (!Q.AC)
    return;
  for (AssumptionCache::ResultElem &Elem : Q.AC->assumptionsFor(V)) {
    // ... process @llvm.assume intrinsics
  }
}

Three sources of context tighten the bits:

The “cond context” (Q.CC) — the current simplification is happening inside a conditional arm; use the branch condition to narrow.
Dominating branches — iterate every conditional branch in the dominance chain whose condition mentions V; refine based on which side we’re on.
Assumptions — @llvm.assume(cond) is equivalent to “assume cond holds from here on”; the AssumptionCache indexes these by value.

The recursive computeKnownBitsFromCond (ValueTracking.cpp:1006-1047) handles compound conditions via and/or trees, eventually bottoming out at an ICmpInst or similar. So a branch like if ((x & 15) == 0) { ...; use(x); } lets computeKnownBits(x) inside the then-branch know the low 4 bits are zero.

Alignment assumptions get special treatment. From ValueTracking.cpp:1084-1100:

if (RetainedKnowledge RK = getKnowledgeFromBundle(...)) {
  if (RK.WasOn == V && RK.AttrKind == Attribute::Alignment &&
      isPowerOf2_64(RK.ArgValue) &&
      isValidAssumeForContext(I, Q.CxtI, Q.DT, /*AllowEphemerals*/ true))
    Known.Zero.setLowBits(Log2_64(RK.ArgValue));
}

@llvm.assume(ptr align(64)) → alignment 64 → low 6 bits of the pointer are known zero. This directly plugs into the KnownBits representation as a bit-level fact.

This context integration is what makes @llvm.assume a useful optimization hint, not just a compiler-side comment. When you write assert(x < 256) and compile with asserts on, that assertion becomes an assume, which becomes a KnownBits fact, which unlocks every downstream fold that can take advantage of “high 24 bits of x are zero”.

Sibling analyses

A few cousins you’ll see alongside KnownBits:

MaskedValueIsZero (ValueTracking.cpp:318-323):

bool llvm::MaskedValueIsZero(const Value *V, const APInt &Mask,
                             const SimplifyQuery &SQ, unsigned Depth) {
  KnownBits Known(Mask.getBitWidth());
  computeKnownBits(V, Known, SQ, Depth);
  return Mask.isSubsetOf(Known.Zero);
}

Convenience wrapper for “are these specific bits known zero?”. Used constantly.

isKnownNonZero(V) — returns true if KnownBits shows V != 0. Either Known.One has any bit set, or the value is otherwise provably nonzero via a recursive structural analysis.

isKnownToBeAPowerOfTwo(V) — true if exactly one bit is set. Used by InstCombine patterns that want to replace divisions by shifts.

ComputeNumSignBits (ValueTracking.cpp:4238) — a different representation. Instead of tracking each bit’s zero/one status, it asks “how many of the leading bits are guaranteed to equal the sign bit?”. For a value known to fit in i17, ComputeNumSignBits returns 15 (the top 15 bits match the sign bit). This is a coarser representation, but for many arithmetic patterns — sign-extend-in-register, ashr folds — it’s cheaper and sufficient. KnownBits and NumSignBits are complementary; most passes query both.

Consumer 1: InstCombine

This is where I first saw computeKnownBits called in my own reading. Most InstCombine folds are guarded by a KnownBits precondition. Three worked examples.

`add (xor X, MASK), C` → `sub (MASK + C), X`

From InstCombineAddSub.cpp:954-960:

if (C2->isMask()) {
  KnownBits LHSKnown = computeKnownBits(X, &Add);
  if ((*C2 | LHSKnown.Zero).isAllOnes())
    return BinaryOperator::CreateSub(ConstantInt::get(Ty, *C2 + *C), X);
}

Input:

%xor = xor i32 %X, 15   ; C2 = 15 = 0b1111 (a mask)
%add = add i32 %xor, 2  ; C = 2

Output (if KnownBits check passes):

%sub = sub i32 17, %X

Why the check is essential. The identity (X xor mask) + C == (mask + C) - X only holds when X has no bits set outside mask. If X has higher bits set, the xor leaves them alone but the subtraction on the right would give a completely different result. The check (*C2 | LHSKnown.Zero).isAllOnes() asks exactly that: every bit is either in C2 (so xor flips it) or known zero in X (so nothing changes). Pass this check, and the algebraic identity is valid.

Without KnownBits, InstCombine would have to skip this fold conservatively on all inputs. With KnownBits, it’s applied whenever the pass can prove the precondition.

`or X, C` → `xor X, C` when high bits are zero

From InstCombineAndOrXor.cpp:2536-2544:

if (match(Op0, m_APInt(Op0C))) {
  if (Op0C->isMask()) {
    KnownBits RHSKnown = llvm::computeKnownBits(
        Op1, SQ.getWithInstruction(&I).getWithoutDomCondCache());
    if ((*Op0C | RHSKnown.Zero).isAllOnes())
      return BinaryOperator::CreateXor(Op1, Op0);
  }
}

Input:

%or = or i32 %Y, 255

Output (if check passes):

%xor = xor i32 %Y, 255

Why. or X, MASK and xor X, MASK produce the same result if and only if every bit of MASK is already zero in X. The xor form is preferred because it’s canonical and because xor has nicer algebraic properties for subsequent folds (it’s its own inverse, xor X, MASK, MASK == X).

Paired power-of-two bit tests

From InstCombineAndOrXor.cpp:708-718:

if (Mask & Mask_NotAllZeros &&
    isKnownToBeAPowerOfTwo(B, /*OrZero=*/false, Q) &&
    isKnownToBeAPowerOfTwo(D, /*OrZero=*/false, Q)) {
  Value *Mask = Builder.CreateOr(B, D);
  Value *Masked = Builder.CreateAnd(A, Mask);
  return Builder.CreateICmp(NewCC, Masked, Mask);
}

Input:

%ab = and i32 %A, %B             ; B known power-of-two
%cmpb = icmp ne i32 %ab, 0
%ad = and i32 %A, %D             ; D known power-of-two
%cmpd = icmp ne i32 %ad, 0
%result = and i1 %cmpb, %cmpd    ; both bits set?

Output:

%Mask = or i32 %B, %D
%Masked = and i32 %A, %Mask
%result = icmp eq i32 %Masked, %Mask

Three instructions instead of five. Why the KnownBits guards matter: the transformation requires that each of B and D is a single bit. If B had two bits set, (A & B) != 0 means “at least one of these two bits is set in A”, but the rewrite checks “both bits of B are set in A”. Without isKnownToBeAPowerOfTwo, you don’t have the guarantee.

Consumer 2: InstructionSimplify

InstructionSimplify.cpp is the “try to prove this thing reduces to a simpler value without creating new instructions” pass. It uses KnownBits heavily for shift folds. From InstructionSimplify.cpp:1340-1365:

KnownBits KnownAmt = computeKnownBits(Op1, Q);
if (KnownAmt.getMinValue().uge(KnownAmt.getBitWidth()))
  return PoisonValue::get(Op0->getType());

unsigned NumValidShiftBits = Log2_32_Ceil(KnownAmt.getBitWidth());
if (KnownAmt.countMinTrailingZeros() >= NumValidShiftBits)
  return Op0;  // Shift by 0 is identity

if (IsNSW) {
  KnownBits KnownVal = computeKnownBits(Op0, Q);
  KnownBits KnownShl = KnownBits::shl(KnownVal, KnownAmt);
  if (KnownShl.hasConflict())
    return PoisonValue::get(Op0->getType());
}

Three distinct facts extracted from KnownBits:

Shift amount guaranteed >= bitwidth → poison. Shifting by too much is undefined behavior; the result is poison.
Shift amount’s low bits are all known zero → shift by 0, identity fold.
The nsw shl would create contradictory bits → hasConflict() → poison.

That last one is interesting. nsw promises “no signed wrap” — the sign bit can’t flip unexpectedly. If the analysis can prove the sign bit must be both 0 and 1 (e.g., because the original sign bit gets shifted out and the next bit must be 1), then the nsw promise is violated on every possible input, and the whole operation is poison. hasConflict() surfaces this directly.

Consumer 3: SelectionDAG and SimplifyDemandedBits

The backend has its own parallel universe of KnownBits that operates on SDNode instead of Value. Same representation, different IR. From llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:3340-3350:

KnownBits SelectionDAG::computeKnownBits(SDValue Op, const APInt &DemandedElts,
                                         unsigned Depth) const {
  unsigned BitWidth = Op.getScalarValueSizeInBits();
  KnownBits Known(BitWidth);

  if (auto OptAPInt = Op->bitcastToAPInt()) {
    return KnownBits::makeConstant(*std::move(OptAPInt));
  }

  if (Depth >= MaxRecursionDepth)
    return Known;
  // ... dispatch on SDNode opcode ...
}

Parallel infrastructure, same principles. Targets can add their own cases via TargetLowering::computeKnownBitsForTargetNode, which lets them express facts about target-specific instructions (e.g., AArch64’s “this instruction always produces a zero-extended value”).

The really interesting backend consumer is SimplifyDemandedBits — the dual of KnownBits. Known bits tell you what a value produces; demanded bits tell you what its users actually look at. If a user only reads the low 16 bits, the producing instruction can be narrowed to produce only those 16 bits, even if it could produce more. From TargetLowering.cpp:658-705:

bool TargetLowering::SimplifyDemandedBits(SDValue Op, const APInt &DemandedBits,
                                          DAGCombinerInfo &DCI) const {
  SelectionDAG &DAG = DCI.DAG;
  TargetLoweringOpt TLO(DAG, !DCI.isBeforeLegalize(),
                        !DCI.isBeforeLegalizeOps());
  KnownBits Known;

  bool Simplified = SimplifyDemandedBits(Op, DemandedBits, Known, TLO);
  if (Simplified) {
    DCI.AddToWorklist(Op.getNode());
    DCI.CommitTargetLoweringOpt(TLO);
  }
  return Simplified;
}

The full SimplifyDemandedBits implementation is hundreds of lines of opcode dispatch. For each SDNode kind, it knows which input bits are demanded given the output demand, recursively simplifies the inputs, and uses the resulting KnownBits to prove the node can be replaced.

Classic example: trunc i32 %x to i16 demands only the low 16 bits of %x. If %x is an add of two values, and those values’ high 16 bits are known zero, the add can be narrowed to i16. The demand-bits analysis drives this rewrite, and KnownBits is what proves it’s safe.

GlobalISel — the newer backend code-gen framework — has the same infrastructure in GISelKnownBits (llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp), operating on Machine IR.

Consumer 4: DemandedBits analysis

The middle-end has its own demanded-bits pass: llvm/lib/Analysis/DemandedBits.cpp. The header comment (lines 1-19) is the clearest explanation I’ve seen:

// This pass implements a demanded bits analysis. A demanded bit is one that
// contributes to a result; bits that are not demanded can be either zero or
// one without affecting control or data flow. For example in this sequence:
//
//   %1 = add i32 %x, %y
//   %2 = trunc i32 %1 to i16
//
// Only the lowest 16 bits of %1 are demanded; the rest are removed by the trunc.

It uses KnownBits at its core (DemandedBits.cpp:52-78):

auto ComputeKnownBits =
    [&](unsigned BitWidth, const Value *V1, const Value *V2) {
      if (KnownBitsComputed) return;
      KnownBitsComputed = true;

      Known = KnownBits(BitWidth);
      computeKnownBits(V1, Known, DL, &AC, UserI, &DT);

      if (V2) {
        Known2 = KnownBits(BitWidth);
        computeKnownBits(V2, Known2, DL, &AC, UserI, &DT);
      }
    };

DemandedBits caches KnownBits queries and uses them to decide, for each operand, which bits actually propagate to the final result. Downstream passes (BDCE, specifically — “bit-tracking DCE”) use DemandedBits to delete instructions whose results aren’t actually observed.

Consumer 5: bit-tests and integer narrowing

InstCombine uses MaskedValueIsZero liberally to choose between expensive and cheap equivalents. One example (InstCombineAddSub.cpp:974-978):

if (ShAmt &&
    MaskedValueIsZero(X, APInt::getHighBitsSet(BitWidth, ShAmt), &Add)) {
  Constant *ShAmtC = ConstantInt::get(Ty, ShAmt);
  Value *NewShl = Builder.CreateShl(X, ShAmtC, "sext");
  return BinaryOperator::CreateAShr(NewShl, ShAmtC);
}

This is a sign-extend-in-register rewrite: if the high ShAmt bits of X are zero, we can model the sign extension as ashr(shl(X, ShAmt), ShAmt). That pattern is canonical and often gets mapped to a single hardware instruction (like x86’s MOVSX variants).

Pattern: when you see this:

if (MaskedValueIsZero(V, SomeMask, Q)) { ... }

what it’s really asking is “can I assume V’s bits in SomeMask are zero?”. And the answer is “yes, if KnownBits proves it at this program point”. It’s a one-line check against a property that would otherwise require ad-hoc proof.

End-to-end: tracing an optimization through KnownBits

Let me walk through the test case from llvm/test/Transforms/InstCombine/all-bits-shift.ll. The input:

define signext i32 @main() {
  %a = load i32, ptr @a
  %cond = icmp eq i32 %a, 0
  %shift_amt = zext i1 %cond to i32   ; 0 or 1
  %shr1 = lshr i32 2072, %shift_amt
  %shr2 = lshr i32 %shr1, 7
  %and = and i32 %shr2, 1
  ret i32 %and
}

The optimizer wants to prove %and is a constant. Tracing the KnownBits reasoning:

%shift_amt is zext i1 ... to i32. Any zext of an i1 has high 31 bits known zero and the low bit unknown. So KnownBits for %shift_amt is: Zero = 0xFFFFFFFE, One = 0.
%shr1 = lshr 2072, %shift_amt. 2072 = 0b100000011000. Applying KnownBits::lshr(const, unknown_in_{0,1}):
- Shift by 0: 0b100000011000
- Shift by 1: 0b010000001100
- Union of possibilities: Zero = 0xFFFFF003, One = 0 (some bits might be 1 on either path; no bit is always 1).
%shr2 = lshr %shr1, 7. KnownBits::lshr(known_from_step_2, const 7). The low 7 bits from step 2’s high bits shift into position. In both paths, the top bits of the value are zero, so after shifting right 7, the high bits are definitely zero. The low bits… depend on bits 7+ of %shr1, which by step 2 are also known zero for most positions. The result has high bits zero and low bit unknown.
%and = and %shr2, 1. KnownBits::and(shr2_known, 1_known). For bit 0, both operands have bit 0 potentially set. For bits 1 and above, 1’s KnownBits has Zero = 0xFFFFFFFE. AND with a known-zero bit produces known-zero.

After all that: %and has high 31 bits known zero and low bit… hmm, looking more carefully — after the chain of shifts, bit 0 of %shr2 is 0 on both paths (2072 » 7 = 16 = 0b10000 has bit 0 = 0; 1036 » 7 = 8 = 0b1000 has bit 0 = 0). So bit 0 is known zero too. Therefore %and has all 32 bits known zero, which means %and == 0.

The optimizer replaces the whole function with ret i32 0.

This is the kind of reasoning that’s purely mechanical once you have the KnownBits infrastructure. Each step is one KnownBits::someOp call. Build up the abstract interpretation and the proof falls out. Without KnownBits, proving this would require hand-coded special-case reasoning about each instruction in the chain — which is exactly the kind of code you want to avoid writing in an optimizer.

Why KnownBits is the right abstraction

A few design choices worth appreciating:

It’s a lattice. “All bits unknown” is the top; “specific constant” is the bottom (before any conflict). Every operation is monotone — more input information never produces less output information. This is exactly the shape abstract interpretation needs.

Conflict is bottom. When Zero.intersects(One), we’ve proven the value is contradictory, i.e., that the code point is unreachable. The same KnownBits struct naturally encodes “known dead code” at no cost.

It composes cleanly. The static operators (KnownBits::add, ::shl, etc.) are pure functions on the abstract domain. They don’t need an IR context. You can compose them freely: KnownBits::add(KnownBits::shl(a, 2), b) is fine. This makes it easy to use the same infrastructure from IR passes and SelectionDAG and GlobalISel, all of which have different in-memory representations but identical arithmetic semantics.

It has a tight upper bound on cost. computeKnownBits has a depth limit (6 by default). Beyond that, it returns “all unknown”. This keeps analysis bounded even on degenerate input.

It integrates with dominators, assumes, and metadata. The same call site that reads bits can be enriched by branch conditions, @llvm.assume, range metadata, alignment bundles, and so on. Users get “what bits can this value have at this program point”, which is almost always what they wanted.

Why so many consumers?

Almost every optimization in LLVM reduces, at some level, to “can I prove enough about this value to transform it?”. KnownBits is one of the most general answers to that question, so it shows up everywhere. A rough taxonomy of the consumers I’ve traced:

Consumer	What it uses KnownBits for
InstCombine	Precondition checks for rewrites (disjoint masks, high-bits-zero, etc.)
InstructionSimplify	Proving results are constant, or proving `poison`
DemandedBits	Tracks which bits are actually observed; KnownBits completes the dual
SelectionDAG / GlobalISel	Backend-level narrowing, constant folding through patterns
SimplifyDemandedBits (backend)	Drops input bits the output doesn’t care about
Attributor / FunctionAttrs	Argument attribute inference (alignment, bounds)
Constant folding	`ConstantFoldInstOperands` uses it transitively
BDCE (Bit-tracking DCE)	Deletes instructions whose bits are never read

Any time you read an LLVM pass and see a precondition check that feels bit-level — “but only if the high bits are zero”, “but only if this is a power of two”, “but only if this can’t overflow” — odds are good there’s a KnownBits query underneath.

KnownBits: The Per-Bit Abstract Domain That Half the LLVM Optimizer Relies On

The struct

The public API

How computeKnownBits actually computes

Context: branches and @llvm.assume

Sibling analyses

Consumer 1: InstCombine

add (xor X, MASK), C → sub (MASK + C), X

or X, C → xor X, C when high bits are zero

Paired power-of-two bit tests

Consumer 2: InstructionSimplify

Consumer 3: SelectionDAG and SimplifyDemandedBits

Consumer 4: DemandedBits analysis

Consumer 5: bit-tests and integer narrowing

End-to-end: tracing an optimization through KnownBits

Why KnownBits is the right abstraction

Why so many consumers?

Further reading

How `computeKnownBits` actually computes

Context: branches and `@llvm.assume`

`add (xor X, MASK), C` → `sub (MASK + C), X`

`or X, C` → `xor X, C` when high bits are zero