Chapter 6: The Hawkes Process

The Hawkes process, introduced by Alan Hawkes in 1971, is a self-exciting point process: each event temporarily raises the rate at which future events occur. The intensity jumps with each arrival and then decays back toward the baseline. This captures the clustering seen in aftershocks, information cascades, order-book microstructure, and contagion.


6.1 Definition: The Conditional Intensity

The univariate Hawkes process has conditional intensity:

λ*(t) = μ + ∫_{-∞}^t φ(t − s) dN(s)
       = μ + Σ_{tᵢ < t} φ(t − tᵢ)

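where μ > 0 is the background (baseline) rate, φ(·) ≥ 0 is the excitation kernel, N(s) is the counting process, and tᵢ are the times of past events.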

Each event at time tᵢ contributes φ(t − tᵢ) to the intensity at all future times t > tᵢ. The kernel φ decays toward zero as t − tᵢ grows, so the influence of past events fades.
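As a minimal sketch of this definition, the sum form can be evaluated directly for any kernel. The function name hawkes_intensity, its argument names, and the example kernel parameters are illustrative assumptions, not part of the chapter's code.

```python
import numpy as np

def hawkes_intensity(t, event_times, mu, kernel):
    """Evaluate mu + sum of kernel(t - t_i) over events with t_i < t."""
    event_times = np.asarray(event_times, dtype=float)
    past = event_times[event_times < t]   # strict inequality: an event at t itself is excluded
    return mu + float(np.sum(kernel(t - past)))

def example_kernel(u):
    """Hypothetical kernel used only for illustration: 0.8 * exp(-1.2 * u)."""
    return 0.8 * np.exp(-1.2 * u)

events = [1.0, 2.5, 2.7]

before_first = hawkes_intensity(0.5, events, mu=0.5, kernel=example_kernel)  # no past events yet
after_burst = hawkes_intensity(3.0, events, mu=0.5, kernel=example_kernel)   # three past events
```

Before the first event the intensity equals the baseline μ; after the burst it sits above μ by the sum of the three decayed contributions.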


6.2 The Exponential Kernel

The most widely used kernel is the exponential:

φ(t) = α · exp(−β · t),   t > 0

where α > 0 controls the jump magnitude and β > 0 controls the decay rate. The conditional intensity becomes:

λ*(t) = μ + α · Σ_{tᵢ < t} exp(−β · (t − tᵢ))

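The exponential kernel has a crucial recursive structure. Define R(t) = Σ_{tᵢ < t} exp(−β(t − tᵢ)); at the event times it satisfies R(tₖ₊₁) = exp(−β(tₖ₊₁ − tₖ)) · (R(tₖ) + 1), with R(t₁) = 0, so each value follows from the previous one in constant time. In terms of R: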

λ*(t) = μ + α · R(t)

Between consecutive events (tₖ, tₖ₊₁), the intensity decays exponentially:

λ*(t) = μ + (λ*(tₖ⁺) − μ) · exp(−β · (t − tₖ))   for t ∈ (tₖ, tₖ₊₁)

At each new event at time tₖ₊₁, the intensity jumps:

λ*(tₖ₊₁⁺) = λ*(tₖ₊₁⁻) + α

This piecewise-exponential structure makes simulation and likelihood computation tractable in O(n) time.
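A minimal sketch of the O(n) computation, assuming sorted event times: an event adds 1 to the running sum R, and a gap of length Δ multiplies it by exp(−βΔ), so R at each event time can be updated from its value at the previous event. The function names and example values below are illustrative.

```python
import numpy as np

def recursive_R(event_times, beta):
    """R(t_k) = sum_{i<k} exp(-beta * (t_k - t_i)) at every event time, in one O(n) pass."""
    event_times = np.asarray(event_times, dtype=float)
    R = np.zeros(len(event_times))
    for k in range(1, len(event_times)):
        gap = event_times[k] - event_times[k - 1]
        R[k] = np.exp(-beta * gap) * (R[k - 1] + 1.0)
    return R

def direct_R(event_times, beta):
    """Same quantity from the defining sum, O(n^2); used only as a cross-check."""
    event_times = np.asarray(event_times, dtype=float)
    return np.array([np.sum(np.exp(-beta * (t - event_times[:k])))
                     for k, t in enumerate(event_times)])

times = np.array([0.4, 1.1, 1.3, 2.9, 3.0])
mu, alpha, beta = 0.5, 0.8, 1.2   # example values, chosen arbitrarily

# Intensity just before each event, lambda*(t_k^-) = mu + alpha * R(t_k)
intensity_before_events = mu + alpha * recursive_R(times, beta)
```

The two functions return the same values; only the recursive version scales linearly in the number of events.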


6.3 The Branching Ratio and Stationarity

Define the branching ratio:

n* = ∫₀^∞ φ(t) dt

For the exponential kernel: n* = α / β.

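Interpretation: n* is the expected number of offspring (“daughters”) that a single event generates. This connects the Hawkes process to branching processes: n* < 1 is the subcritical regime, in which each cascade dies out and has finite expected size; n* = 1 is critical; and n* > 1 is supercritical, where a cascade can continue forever with positive probability.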

For a stationary Hawkes process (requires n* < 1), the mean intensity is:

E[λ*(t)] = μ / (1 − n*)

The background rate μ drives immigrants; self-excitation amplifies the total rate by a factor of 1/(1 − n*).
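One way to see where the factor 1/(1 − n*) comes from: immigrants arrive at rate μ, their children at rate μ·n*, grandchildren at rate μ·n*², and so on, and the geometric series sums to μ/(1 − n*). A short numerical check, with arbitrary example values:

```python
mu = 0.5        # example background rate (arbitrary)
n_star = 0.8    # example branching ratio (arbitrary, must be below 1)

# Generation k of descendants contributes events at rate mu * n_star**k
generation_rates = [mu * n_star**k for k in range(200)]
partial_sum = sum(generation_rates)

# Closed form for the mean intensity of the stationary process
closed_form = mu / (1.0 - n_star)
```

With these values the partial sum over 200 generations agrees with the closed form to within floating-point error.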


6.4 Branching Process Interpretation

Every Hawkes process has an equivalent cluster (branching) representation:

  1. Immigrants arrive as an HPP with rate μ
  2. Each immigrant spawns a Poisson-random number of offspring with mean n*
  3. Each offspring independently spawns more offspring (same distribution)
  4. Offspring arrive at a delay distributed as φ(t) / n* relative to their parent

The resulting process of all events (immigrants + all generations of offspring) is the Hawkes process. This representation makes it clear why n* < 1 is needed for stationarity: it is the subcriticality condition of the branching process, under which every cluster dies out with probability one and has finite expected size 1/(1 − n*).
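The four steps above can be sketched directly for the exponential kernel, where φ(t)/n* = β·exp(−βt) is an Exponential(β) delay density. This is a minimal illustration under stated assumptions, not the chapter's reference implementation: the function name, the seed argument, and the choice to discard offspring that fall beyond T are all assumptions made here.

```python
import numpy as np

def simulate_hawkes_cluster(mu, alpha, beta, T, seed=None):
    """Simulate on [0, T] via the immigrant/offspring representation (requires alpha < beta)."""
    if alpha >= beta:
        raise ValueError("branching ratio alpha / beta must be below 1")
    rng = np.random.default_rng(seed)
    n_star = alpha / beta

    # Step 1: immigrants form a homogeneous Poisson process with rate mu on [0, T]
    n_immigrants = rng.poisson(mu * T)
    current = rng.uniform(0.0, T, size=n_immigrants)
    generations = [current]

    # Steps 2-4: every event spawns Poisson(n_star) offspring at Exponential(beta) delays
    while current.size > 0:
        counts = rng.poisson(n_star, size=current.size)
        parents = np.repeat(current, counts)
        children = parents + rng.exponential(1.0 / beta, size=parents.size)
        current = children[children <= T]   # offspring past T cannot affect [0, T]
        generations.append(current)

    return np.sort(np.concatenate(generations))

events = simulate_hawkes_cluster(mu=0.5, alpha=0.8, beta=1.2, T=200.0, seed=0)
```

Working generation by generation keeps the code vectorized; the loop ends once a generation produces no offspring inside [0, T].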


6.5 Ogata’s Thinning Algorithm

The standard simulation method for Hawkes processes is Ogata’s modified thinning (1981):

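import numpy as np

def simulate_hawkes(mu, alpha, beta, T):
    """Simulate an exponential-kernel Hawkes process on [0, T] by thinning."""
    events = []
    t = 0.0
    excitation = 0.0   # alpha * R(t): the self-excited part of the intensity
    lambda_bar = mu    # upper bound on the intensity until the next event

    while True:
        # Step 1: propose the next candidate from HPP(lambda_bar)
        dt = np.random.exponential(1.0 / lambda_bar)
        t += dt
        if t > T:
            break

        # Step 2: decay the excitation to the candidate time (O(1) update,
        # equivalent to summing exp(-beta * (t - ti)) over all past events)
        excitation *= np.exp(-beta * dt)
        intensity = mu + excitation

        # Step 3: accept with probability intensity / lambda_bar
        if np.random.uniform() < intensity / lambda_bar:
            events.append(t)
            excitation += alpha              # the accepted event adds a jump of alpha
            lambda_bar = intensity + alpha   # new upper bound after the jump
        else:
            lambda_bar = intensity           # tighten the bound after a rejection
    return np.array(events)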

The key insight: right after an accepted event, λ*(t⁺) = λ*(t⁻) + α, so we set λ_bar = λ*(t⁺). Between events, λ*(t) only decays, so its current value remains a valid upper bound until the next event.

See code/06_hawkes_univariate.py for the full vectorized implementation with intensity trajectory visualization.


6.6 Effect of the Branching Ratio

The branching ratio n* dramatically shapes the qualitative behavior:

n*     Character                             Expected cluster size E[cluster]
0.1    Mostly background, rare clustering    1.11
0.5    Moderate self-excitation              2
0.8    Strong clustering, long bursts        5
0.95   Near-critical, very long cascades     20

For seismological applications, n* ≈ 0.9 is typical. For financial order books, n* can exceed 0.95.
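The E[cluster] column of the table is 1/(1 − n*), the expected size of one immigrant's cluster including the immigrant itself. A quick check that reproduces the tabulated values (the function name is illustrative):

```python
def expected_cluster_size(n_star):
    """Expected total size of a cluster (immigrant plus all descendants), for n_star < 1."""
    if not 0.0 <= n_star < 1.0:
        raise ValueError("expected cluster size is finite only for 0 <= n_star < 1")
    return 1.0 / (1.0 - n_star)

# Rounded to two decimals to match the table
sizes = {n: round(expected_cluster_size(n), 2) for n in (0.1, 0.5, 0.8, 0.95)}
```

The steep growth as n* approaches 1 is why near-critical processes produce very long cascades.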


6.7 Key Takeaways

