Hawkes processes achieve burstiness through self-excitation: events cause events. But there is another mechanism: the arrival rate itself may be randomly varying, driven by an unobserved environment. A Cox process (also called a doubly stochastic Poisson process) captures exactly this — the intensity is a random process, and conditional on it, events are Poisson.
A Cox process directed by the random intensity process {Λ(t)}_{t≥0} is defined as:
Conditional on the realization {Λ(t) = λ(t)}, the event process N is an inhomogeneous Poisson process with intensity λ(t).
In other words, there are two levels of randomness:
1. Λ(t) is drawn from some distribution over functions.
2. Conditional on Λ(t) = λ(t), events occur as a standard NHPP.

A key property of Cox processes is overdispersion: the variance of event counts exceeds the mean.
For any interval (a, b]:
E[N(a,b]] = E[Λ(a,b]]
Var[N(a,b]] = E[Λ(a,b]] + Var[Λ(a,b]]
The first term is the Poisson variance (equidispersion); the second term is additional variance from the randomness of the intensity. Thus Var > Mean whenever the intensity is genuinely random, i.e., whenever Var[Λ(a,b]] > 0.
Fano factor: F = 1 + Var[Λ(a,b]] / E[Λ(a,b]] > 1.
This is the diagnostic for Cox processes: observe F > 1 in the data, and the Poisson model is inadequate. Note that Hawkes processes also produce overdispersion (via clustering), so F > 1 alone does not distinguish Cox from Hawkes — the ACF structure does.
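The overdispersion diagnostic is easy to check numerically. The following sketch (parameter choices are illustrative, not from the text) compares count windows from a homogeneous Poisson process against a simple Cox-type model in which the rate itself is Gamma-distributed per window, i.e., a mixed Poisson:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000  # number of observation windows

# Homogeneous Poisson counts: fixed rate of 5 events per window
poisson_counts = rng.poisson(5.0, size=n)

# Simple Cox counts: the rate is random, drawn fresh per window.
# Gamma(shape=2, scale=2.5) has mean 5 and variance 12.5.
rates = rng.gamma(shape=2.0, scale=2.5, size=n)
cox_counts = rng.poisson(rates)

def fano(x):
    """Fano factor: variance-to-mean ratio of counts."""
    return x.var() / x.mean()

print(fano(poisson_counts))  # close to 1 (equidispersed)
print(fano(cox_counts))      # close to 1 + Var[rate]/E[rate] = 1 + 12.5/5 = 3.5
```

The Cox counts reproduce the formula F = 1 + Var[Λ]/E[Λ] from above, while the Poisson counts sit at F ≈ 1.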
The most popular Cox process in applications is the Log-Gaussian Cox Process (LGCP), where:
log Λ(t) = G(t)
and G(t) is a Gaussian process with mean function m(t) and covariance kernel k(t, s).
Why log-Gaussian? The log transformation ensures Λ(t) > 0 for all t. The Gaussian process provides a flexible model for smooth random variation.
Common kernels:
- Squared exponential: k(t,s) = σ²·exp(−(t−s)²/(2ℓ²)) — smooth, infinitely differentiable
- Matérn 3/2: k(t,s) = σ²·(1+√3|t−s|/ℓ)·exp(−√3|t−s|/ℓ) — rougher, once differentiable
- Periodic: k(t,s) = σ²·exp(−2sin²(π|t−s|/p)/ℓ²) — for periodic intensity functions

Simulation is a two-step procedure:
```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def simulate_lgcp(T, n_grid=500, sigma=1.0, length_scale=5.0, mu_log=1.0):
    # Step 1: sample a GP path on a fine grid
    t_grid = np.linspace(0, T, n_grid).reshape(-1, 1)
    kernel = ConstantKernel(sigma**2) * RBF(length_scale)
    gp = GaussianProcessRegressor(kernel=kernel)
    log_lambda = gp.sample_y(t_grid, n_samples=1).flatten() + mu_log

    # Step 2: simulate NHPP via thinning
    lambda_grid = np.exp(log_lambda)
    lambda_bar = lambda_grid.max() * 1.05  # safety margin over the grid maximum
    dt = T / (n_grid - 1)                  # spacing of the linspace grid
    events = []
    t = 0.0
    while t < T:
        t += np.random.exponential(1.0 / lambda_bar)
        if t >= T:
            break
        idx = min(int(t / dt), n_grid - 1)
        if np.random.uniform() < lambda_grid[idx] / lambda_bar:
            events.append(t)
    return np.array(events), t_grid.flatten(), lambda_grid
```
Step 1 samples a random intensity path; Step 2 simulates events from that path. See code/09_cox_processes.py for the full implementation.
An alternative to the LGCP is the shot noise Cox process, where the intensity is driven by a superposition of decaying pulses:
Λ(t) = μ + Σᵢ h(t − sᵢ)
where {sᵢ} are the events of a Poisson process (the “shocks”), and h(t) is a deterministic response function (e.g., h(t) = a · exp(−b·t) · 1{t>0}).
This resembles the Hawkes process but with a key difference: the shocks {sᵢ} are not the observed events N. The shocks are latent (unobserved). The Cox process models an unobserved random environment; the Hawkes process models observed self-excitation.
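A shot-noise intensity path is straightforward to evaluate directly from its definition. The sketch below (function name and parameter values are illustrative; the shock rate ν is an assumption not fixed in the text) draws latent shocks from a homogeneous Poisson process and superposes exponential pulses h(t) = a·exp(−b·t)·1{t>0}; events could then be simulated by thinning exactly as in the LGCP code above:

```python
import numpy as np

def shot_noise_intensity(t_grid, T, nu=0.5, mu=0.2, a=2.0, b=1.0, rng=None):
    """Evaluate Lambda(t) = mu + sum_i h(t - s_i) on a grid, where the latent
    shocks {s_i} come from a homogeneous Poisson process of rate nu on [0, T]
    and h(t) = a * exp(-b * t) for t > 0."""
    rng = rng or np.random.default_rng()
    n_shocks = rng.poisson(nu * T)                 # Poisson number of shocks
    shocks = np.sort(rng.uniform(0.0, T, n_shocks))
    lam = np.full_like(t_grid, mu, dtype=float)    # baseline level
    for s in shocks:
        mask = t_grid > s
        lam[mask] += a * np.exp(-b * (t_grid[mask] - s))
    return lam, shocks

t_grid = np.linspace(0, 100, 2000)
lam, shocks = shot_noise_intensity(t_grid, T=100.0, rng=np.random.default_rng(1))
# Long-run mean intensity is mu + nu * (a / b): shock rate times pulse area
print(lam.mean())
```

Because each pulse is nonnegative, Λ(t) ≥ μ everywhere, so no log transform is needed to keep the intensity positive.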
Both Cox processes and Hawkes processes can produce overdispersion (F > 1) and clustering. The key difference lies in the autocorrelation function (ACF) of the counting process:
- Cox (LGCP): the ACF of binned counts N(t, t+h] at lag Δ reflects the GP kernel k(t, t+Δ) — typically smooth and slowly decaying.
- Hawkes (exponential kernel): the ACF decays like exp(−β(1−n*)·Δ) — governed by the excitation kernel, often faster-decaying and with a specific shape.

Non-parametric estimation of the second-order structure (Bartlett spectrum or pair correlation function) can distinguish the two.
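Computing the empirical ACF of binned counts is a useful first diagnostic before fitting either model. A minimal sketch (the helper name and bin settings are illustrative, not from the text), sanity-checked on a homogeneous Poisson process where the ACF should vanish at positive lags:

```python
import numpy as np

def count_acf(events, T, bin_width=1.0, max_lag=20):
    """Empirical autocorrelation of binned event counts at lags 0..max_lag."""
    n_bins = int(T / bin_width)
    counts, _ = np.histogram(events, bins=n_bins, range=(0.0, T))
    c = counts - counts.mean()                 # center the count series
    denom = np.sum(c * c)                      # lag-0 normalization
    return np.array([np.sum(c[:len(c) - k] * c[k:]) / denom
                     for k in range(max_lag + 1)])

# Homogeneous Poisson process, rate 5 on [0, 1000]
rng = np.random.default_rng(2)
events = np.sort(rng.uniform(0.0, 1000.0, rng.poisson(5 * 1000)))
acf = count_acf(events, T=1000.0)
print(acf[0])   # 1.0 by construction; positive lags fluctuate near 0
```

Applied to Cox or Hawkes data, the same function would reveal the slow kernel-shaped decay versus the faster exponential decay described above.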
- A Cox process has a random intensity Λ(t); conditional on it, events are Poisson.
- The variance decomposition Var[N(A)] = E[Λ(A)] + Var[Λ(A)] explains overdispersion.
- Both Cox and Hawkes processes give F > 1, but the ACF structure distinguishes them.

| ← Chapter 8 | Table of Contents | Chapter 10: Marked Point Processes → |