Chapter 8: Multivariate Hawkes Processes

In many applications, there are multiple interacting event streams. A neural population has many neurons that excite and inhibit each other. A limit order book has buy orders and sell orders that cross-excite. Seismic networks observe events at multiple stations with correlated activity. The multivariate Hawkes process models all of these simultaneously.


8.1 Definition

Consider M event streams N₁(t), ..., Nₘ(t). The multivariate Hawkes process assigns each stream a conditional intensity:

λᵢ*(t) = μᵢ + Σⱼ₌₁ᴹ Σ_{tⱼₖ < t} φᵢⱼ(t − tⱼₖ)

where:

The M × M matrix of kernels Φ = [φᵢⱼ] is the kernel matrix. Diagonal terms φᵢᵢ represent self-excitation (as in the univariate case); off-diagonal terms φᵢⱼ represent cross-excitation from stream j to stream i.


8.2 Exponential Kernels

Using the exponential kernel for each pair:

φᵢⱼ(t) = αᵢⱼ · exp(−βᵢⱼ · t)

The integrated kernel matrix is A with entries Aᵢⱼ = αᵢⱼ / βᵢⱼ.

Stability condition: The process is stationary if and only if:

ρ(A) < 1

where ρ(A) is the spectral radius (largest eigenvalue in absolute value) of A. This generalizes the univariate condition n* < 1.


8.3 Mean Rates in Stationarity

In stationarity, the mean rate vector Λ = (E[λ₁*], ..., E[λₘ*])ᵀ satisfies:

Λ = (I − Aᵀ)⁻¹ μ

where μ = (μ₁, ..., μₘ)ᵀ. This linear system shows how background rates are amplified by the network of excitations.

Example (2D): For equal self-excitation a₁₁ = a₂₂ = a and mutual excitation a₁₂ = a₂₁ = b:

A = [[a, b],
     [b, a]]

ρ(A) = a + b   (largest eigenvalue)

Mean rates: Λ₁ = Λ₂ = μ / (1 − a − b)   (when μ₁ = μ₂ = μ)

8.4 Granger Causality

The kernel matrix A defines a Granger causality graph: j → i if Aᵢⱼ > 0 (events in stream j cause future events in stream i).

This gives a principled way to recover the causal structure of interacting event streams. Hypothesis test: H₀: Aᵢⱼ = 0 vs. H₁: Aᵢⱼ > 0 can be done via a likelihood ratio test.


8.5 Simulation: Multi-Dimensional Ogata Thinning

The thinning algorithm extends naturally to M dimensions:

def simulate_multivariate_hawkes(mu, alpha, beta, T, M):
    # events[i] = list of event times for stream i
    events = [[] for _ in range(M)]
    t = 0.0
    lambda_bar = sum(mu)  # initial upper bound on total intensity

    while True:
        t += np.random.exponential(1.0 / lambda_bar)
        if t > T:
            break

        # Compute current intensities for each stream
        intensities = np.array([
            mu[i] + sum(
                alpha[i, j] * sum(np.exp(-beta[i,j] * (t - s)) for s in events[j])
                for j in range(M)
            ) for i in range(M)
        ])
        total_intensity = intensities.sum()

        if np.random.uniform() < total_intensity / lambda_bar:
            # Accept: pick which stream
            stream = np.random.choice(M, p=intensities / total_intensity)
            events[stream].append(t)
            lambda_bar = total_intensity + alpha[:, stream].sum()
        else:
            lambda_bar = total_intensity

    return [np.array(ev) for ev in events]

See code/08_multivariate_hawkes.py for the full optimized implementation.


8.6 MLE for the Multivariate Case

The log-likelihood separates across streams (for the exponential kernel):

ℓ(θ) = Σᵢ₌₁ᴹ ℓᵢ(μᵢ, {αᵢⱼ}, {βᵢⱼ})

where each ℓᵢ is the log-likelihood for stream i:

ℓᵢ = Σₖ log λᵢ*(tᵢₖ) − ∫₀ᵀ λᵢ*(t) dt

The compensator for stream i is:

∫₀ᵀ λᵢ*(t) dt = μᵢ T + Σⱼ (αᵢⱼ/βᵢⱼ) Σₖ [1 − exp(−βᵢⱼ(T − tⱼₖ))]

The recursion for Rᵢⱼ (the contribution of stream j events to stream i intensity) generalizes the univariate Rᵢ:

Rᵢⱼ(tᵢₖ) = Σ_{tⱼₗ < tᵢₖ} exp(−βᵢⱼ(tᵢₖ − tⱼₗ))

These can be computed by merging the event sequences and updating the relevant Rᵢⱼ values at each event.


8.7 Application: Limit Order Book

A canonical application is the 2D Hawkes model for bid/ask arrivals:

The kernel matrix reveals:

Empirically, A₁₂ ≈ A₂₁ (symmetric cross-excitation), and ρ(A) can be as high as 0.95 in liquid markets.


8.8 Key Takeaways


← Chapter 7 Table of Contents Chapter 9: Cox Processes →