This article explains how to compute expected distance in a graph random walk. The core conclusion is simple: if you derive the expectation backward from the terminal node, the recurrence works directly; if you derive it forward from the start node and ignore arrival probabilities, the result is wrong. It addresses a common pain point: a recurrence can look correct but still produce the wrong answer. Keywords: random walk, conditional expectation, Markov chain.
Technical Specification Snapshot
| Parameter | Value |
|---|---|
| Topic | Expected distance in a graph random walk |
| Mathematical Model | Absorbing Markov chain, DAG, law of total expectation |
| Language | Chinese |
| Protocol / License | Originally published as a blog post; attribution is required for reposting |
| Core Dependencies | Probability theory, conditional expectation, path probability decomposition |
Correctly Modeling This Problem Requires Starting From the Current State
The original problem asks for the expected path length of a random walk from a start node to a terminal node in a weighted directed acyclic graph. Each outgoing edge carries a transition probability, and the probabilities on each node's outgoing edges sum to 1.
The key issue is not merely how to write the recurrence. The real issue is whether the state definition matches the probabilistic meaning. Once the meaning of the state and the conditioning event become misaligned, the recurrence may still look symmetric while producing the wrong answer.
```
1 -> 2, weight 1, probability 0.5
1 -> 3, weight 1, probability 0.5
2 -> 4, weight 1, probability 1.0
3 -> 4, weight 0, probability 1.0
```
This example is the minimal counterexample: the two paths have the same probability, but different total lengths, which is enough to expose the flawed recurrence.
Figure: a 4-node DAG random-walk model. Node 1 splits to nodes 2 and 3 with equal probability, and nodes 2 and 3 then move deterministically to terminal node 4. The two candidate paths have total lengths 2 and 1, respectively, making this the smallest verifiable example for testing whether forward and backward recurrences are truly equivalent.
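To make the counterexample checkable, here is a minimal sketch that brute-forces every path of the example; the adjacency-dict encoding is an assumption for illustration, not code from the original post.

```python
# Hypothetical encoding of the 4-node example: graph[u] lists
# (successor, transition probability, edge weight) triples.
graph = {
    1: [(2, 0.5, 1), (3, 0.5, 1)],
    2: [(4, 1.0, 1)],
    3: [(4, 1.0, 0)],
    4: [],  # terminal node
}

def enumerate_paths(u, prob=1.0, length=0.0, terminal=4):
    """Yield (probability, total weight) for every complete path from u."""
    if u == terminal:
        yield prob, length
        return
    for v, p, w in graph[u]:
        yield from enumerate_paths(v, prob * p, length + w, terminal)

paths = list(enumerate_paths(1))
expected = sum(p * length for p, length in paths)
# Two paths: probability 0.5 with length 2 (via node 2) and
# probability 0.5 with length 1 (via node 3), so expected == 1.5
```

Path enumeration is exponential in general, but on a counterexample this small it is the most direct ground truth to test any recurrence against.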
The Backward Recurrence From the Terminal Node Is Mathematically Sound
Define $E^{out}(u)$ as the expected distance from node $u$ to the terminal node $t$. This is fundamentally a conditional expectation: you already know that the current state is $u$.
Therefore you only need to condition on which successor is chosen on the first step and expand, via the law of total expectation, over the outgoing edge probabilities of the current node:
```python
def expected_out(u, graph, memo):
    # The expected distance from the terminal node to itself is 0
    if graph[u]["terminal"]:
        return 0.0
    if u in memo:
        return memo[u]
    ans = 0.0
    for v, p, w in graph[u]["edges"]:
        # Contribution of this edge = transition probability * (edge weight + future expectation)
        ans += p * (w + expected_out(v, graph, memo))
    memo[u] = ans
    return ans
```
This code corresponds exactly to the standard recurrence: $E^{out}(u) = \sum_{v} p_{u,v}\,(w_{u,v} + E^{out}(v))$.
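Applied to the 4-node counterexample, the backward recurrence returns 1.5, matching the path-by-path ground truth. The function is repeated here so the snippet runs standalone; the graph encoding is an assumption for illustration.

```python
def expected_out(u, graph, memo):
    # Conditional expectation of the remaining distance, given we are at u
    if graph[u]["terminal"]:
        return 0.0
    if u in memo:
        return memo[u]
    ans = 0.0
    for v, p, w in graph[u]["edges"]:
        ans += p * (w + expected_out(v, graph, memo))
    memo[u] = ans
    return ans

# Hypothetical encoding of the example: node 1 -> {2, 3} -> 4 (terminal)
graph = {
    1: {"terminal": False, "edges": [(2, 0.5, 1), (3, 0.5, 1)]},
    2: {"terminal": False, "edges": [(4, 1.0, 1)]},
    3: {"terminal": False, "edges": [(4, 1.0, 0)]},
    4: {"terminal": True, "edges": []},
}
result = expected_out(1, graph, {})
# E_out(2) = 1, E_out(3) = 0, so E_out(1) = 0.5*(1+1) + 0.5*(1+0) = 1.5
```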
The Reason It Works Is That Suffix Path Probabilities Are Naturally Normalized
If you start from node $u$, every valid path must eventually reach the terminal node. After partitioning the path set by the first edge, the probabilities of all paths from each successor $v$ to the terminal sum to 1.
This step is critical. It means the suffix expectation does not need to be divided by any additional arrival probability, because "being at $v$" is already the world in which the conditioning event holds.
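This normalization can be verified numerically: summing the probabilities of all suffix paths from each node gives exactly 1. A small sketch, under the same assumed encoding of the example:

```python
# Assumed encoding of the example graph: (successor, probability, weight)
graph = {
    1: [(2, 0.5, 1), (3, 0.5, 1)],
    2: [(4, 1.0, 1)],
    3: [(4, 1.0, 0)],
    4: [],
}

def suffix_path_probability(u, terminal=4):
    """Total probability mass of all paths from u to the terminal node."""
    if u == terminal:
        return 1.0
    return sum(p * suffix_path_probability(v, terminal) for v, p, _ in graph[u])

# Every non-terminal node carries a fully normalized suffix distribution,
# which is why the backward recurrence needs no extra division.
```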
Forward Derivation From the Start Node Usually Fails Because It Omits Arrival Probability
Many people define $E^{in}(u)$ as the expected distance accumulated from the start node to node $u$, and then write something like this:
$E^{in}(u) = \sum_{v} p_{v,u}\,(E^{in}(v) + w_{v,u})$
This is wrong because it treats "reaching $v$" as if it were guaranteed. In reality, a random walk starting from the source does not necessarily visit every intermediate node $v$, so $E^{in}(v)$ is not a properly normalized conditional expectation.
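On the 4-node counterexample, the naive forward recurrence overshoots: it reports 2.0 instead of the true 1.5. A minimal sketch, with predecessor lists as an assumed encoding:

```python
# preds[u] lists (predecessor v, probability p_{v,u}, weight w_{v,u});
# nodes are processed in topological order.
preds = {
    2: [(1, 0.5, 1)],
    3: [(1, 0.5, 1)],
    4: [(2, 1.0, 1), (3, 1.0, 0)],
}
E_in = {u: 0.0 for u in (1, 2, 3, 4)}
for u in (1, 2, 3, 4):
    for v, p, w in preds.get(u, []):
        E_in[u] += p * (E_in[v] + w)  # missing the arrival probability pi(v)
# E_in[4] = 1.0*(0.5 + 1) + 1.0*(0.5 + 0) = 2.0, not the correct 1.5:
# the weight of edge 2->4 is charged with probability 1 instead of 0.5.
```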
The Correct Fix Requires Introducing the Arrival Probability $\pi(u)$
Let $\pi(u)$ denote the probability of eventually reaching node $u$ from the start node. It satisfies its own probability recurrence along incoming edges. Only when you explicitly put that probability back into the formula does the forward derivation become valid.
```python
def forward_probability_and_expectation(topo, preds, start):
    pi = {u: 0.0 for u in topo}
    E = {u: 0.0 for u in topo}
    pi[start] = 1.0
    for u in topo:
        for v, p, w in preds.get(u, []):
            # Arrival probability recurrence: first reach v, then move to u
            pi[u] += pi[v] * p
            # Absolute expectation recurrence: the edge weight must be
            # scaled by the prior arrival probability
            E[u] += p * (E[v] + w * pi[v])
    return pi, E
```
This code shows why the edge-weight term must include $\pi(v)$: the edge contributes only when the walk has already reached $v$.
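Running the corrected recurrence on the counterexample confirms the fix: the absolute expectation at the terminal is 1.5, and since the terminal is reached with probability 1 it coincides with the conditional expectation. The function is repeated so the sketch runs standalone; the predecessor encoding is an assumption.

```python
def forward_probability_and_expectation(topo, preds, start):
    pi = {u: 0.0 for u in topo}
    E = {u: 0.0 for u in topo}
    pi[start] = 1.0
    for u in topo:
        for v, p, w in preds.get(u, []):
            pi[u] += pi[v] * p          # probability of ever reaching u
            E[u] += p * (E[v] + w * pi[v])  # weight scaled by arrival probability
    return pi, E

# Assumed predecessor lists for the 4-node example
preds = {
    2: [(1, 0.5, 1)],
    3: [(1, 0.5, 1)],
    4: [(2, 1.0, 1), (3, 1.0, 0)],
}
pi, E = forward_probability_and_expectation([1, 2, 3, 4], preds, 1)
# pi = {1: 1.0, 2: 0.5, 3: 0.5, 4: 1.0}; E[4] = 1.5, the correct answer
```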
The Real Issue Is Confusing Conditional Expectation With Absolute Expectation
From the perspective of probability theory, $E^{out}(u)$ is $\mathbb{E}[d \mid X_0 = u]$, which is a conditional expectation. By contrast, if $E^{in}(u)$ is defined as the accumulated contribution to node $u$ from the start node, it is closer to an unnormalized absolute expectation.
The most important difference is this: the former already fixes the fact that the walker is currently at that node, while the latter does not guarantee that the node is visited at all. That is why two recurrences with similar surface forms can have completely different meanings.
The Most Practical Test Is to Check Whether the State Is Already Normalized
If the state is defined as “starting from the current node,” you can usually apply the law of total expectation directly to the next step. If the state is defined as “accumulated from the source to some node,” you must first ask whether that node is guaranteed to be reached. If not, you cannot omit the probability weight.
Conditional expectation: $\mathbb{E}[d \mid \text{the walk has reached } u]$
Absolute expectation: $\mathbb{E}[\text{total contribution of } d \text{ up to } u]$
Relationship: $E_{abs}(u) = \pi(u) \cdot E_{cond}(u)$
This relationship is the fastest way to detect a flawed derivation.
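The identity can be checked numerically on the running example: dividing the absolute expectation by the arrival probability recovers the conditional expectation at every reachable node. A sketch under the same assumed predecessor-list encoding:

```python
# Corrected forward recurrence over the example, then divide out pi(u)
preds = {2: [(1, 0.5, 1)], 3: [(1, 0.5, 1)], 4: [(2, 1.0, 1), (3, 1.0, 0)]}
topo = [1, 2, 3, 4]
pi = {u: 0.0 for u in topo}
pi[1] = 1.0
E_abs = {u: 0.0 for u in topo}
for u in topo:
    for v, p, w in preds.get(u, []):
        pi[u] += pi[v] * p
        E_abs[u] += p * (E_abs[v] + w * pi[v])
E_cond = {u: E_abs[u] / pi[u] for u in topo if pi[u] > 0}
# E_cond[2] == 1.0: given the walk reached node 2, it paid exactly weight 1.
# E_cond[4] == 1.5: the overall expected distance from start to terminal.
```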
Path Decomposition Is a Better Validation Tool Than Formal Symmetry
Forward recurrence and backward recurrence both appear to be “sums of neighbor contributions,” but the only reliable validation method is path-set decomposition: determine exactly which group of paths each term represents, whether the probabilities are complete, and whether the expectation has already been normalized.
If any step confuses “what must happen after reaching a node” with “what may or may not have happened before reaching a node,” then a conditional expectation has been incorrectly written as an absolute expectation, and the path contributions will be overcounted or undercounted.
Practical Guidance for Engineering and Algorithm Design
In programming contests and engineering models, prefer a state design that derives backward from the terminal node. It naturally satisfies the Markov property and probability normalization, produces a shorter recurrence, and is less error-prone.
If the problem must be derived along incoming edges, model arrival probability and accumulated contribution separately. Do not try to use a single state to represent both reachability and average cost after arrival.
FAQ
Q1: Why does $E^{out}(u)$ not need to be multiplied by an additional arrival probability?
A: Because it is already conditioned on "currently being at $u$," so it is a conditional expectation. The total probability of all future paths from $u$ is 1, which means the normalization is already built in.
Q2: Why must the edge-weight term in $E^{in}(u)$ include $\pi(v)$?
A: Because the cost of edge $(v,u)$ is incurred only if the walk first reaches $v$. Since the probability of reaching $v$ is generally less than 1, the edge contribution must be multiplied by that prior arrival probability.
Q3: How can I quickly tell whether my expectation recurrence confuses conditional with absolute expectation?
A: Check whether the state means “average cost given that the process is already in this state.” If it does not, then check whether you forgot the probability that the state is actually visited. Once a node is not guaranteed to be reached, you usually cannot directly apply the conditional-expectation form.
Core Summary: This article analyzes expected distance in DAG random walks and explains, in a unified way, why backward recurrence works, why naive forward recurrence fails, and how path probability, arrival probability, and conditional expectation resolve the most common derivation errors.