Theory and algorithms only become real when applied to concrete problems. This chapter walks through four end-to-end case studies — seismology, financial markets, neuroscience, and system monitoring — each illustrating a different point process family and a different set of practical challenges.
Problem: Given a synthetic earthquake catalog, fit the ETAS model and forecast aftershock activity.
Data: A mainshock at t=0, m=6.5, followed by aftershocks over 30 days (simulated via the branching ETAS model).
Model: ETAS (chapter 10) with Gutenberg-Richter magnitudes.
Pipeline:
```python
import numpy as np
from scipy import stats

# 1. Load catalog
times, magnitudes = load_catalog("synthetic_quakes.csv")

# 2. Fit ETAS parameters via MLE
params_hat = fit_etas(times, magnitudes, Mc=2.0, T=30.0)

# 3. Goodness of fit: time-rescaling theorem
tau = compute_rescaled_times_etas(times, magnitudes, params_hat, Mc=2.0)
u = 1 - np.exp(-np.diff(np.concatenate([[0], tau])))  # should be Uniform(0, 1) if the model fits
ks_stat, p_value = stats.kstest(u, 'uniform')

# 4. Forecast: expected number of M≥3 events in next 7 days
lambda_forecast = forecast_etas(params_hat, times, magnitudes, t_start=30, T_fore=7)
```
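The helpers above (`load_catalog`, `fit_etas`, `forecast_etas`) live in the companion code; the quantity they all revolve around is the ETAS conditional intensity λ(t) = μ + Σ_{tᵢ<t} K·exp(α(mᵢ−Mc))·(t−tᵢ+c)^(−p). A minimal sketch, with an illustrative function name and parameter values (not the fitted ones):

```python
import numpy as np

def etas_intensity(t, times, mags, mu, K, alpha, c, p, Mc):
    """ETAS conditional intensity: background rate mu plus the Omori-decaying
    contribution of every past event, scaled by its magnitude productivity."""
    past = times < t
    dt = t - times[past]
    return mu + np.sum(K * np.exp(alpha * (mags[past] - Mc)) * (dt + c) ** (-p))

# One m=6.5 mainshock at t=0: one day later the intensity is still
# dominated by its aftershock term, far above the background mu
lam = etas_intensity(1.0, np.array([0.0]), np.array([6.5]),
                     mu=0.1, K=0.02, alpha=1.0, c=0.01, p=1.1, Mc=2.0)
print(lam)
```

With no past events the function reduces to the background rate μ, which makes it easy to sanity-check.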
Key findings:
- Fitted Omori-law exponent p ≈ 1.1, typical of aftershock sequences
- The branching ratio n* = K·E[exp(α(m−Mc))] quantifies cascade productivity

Problem: Model the interplay between bid and ask order arrivals in a limit order book.
Data: Synthetic bid/ask arrival sequences generated from a 2D Hawkes model with known parameters.
Model: Bivariate Hawkes with exponential kernels (chapter 8).
Key parameters:
- A = [[0.6, 0.3], [0.3, 0.6]] — strong self-excitation, moderate cross-excitation
- ρ(A) = 0.9 — near the stability boundary (realistic for liquid markets)
- Stationary rate μ / (1 − 0.9) = 10μ — the cascade amplifies the baseline tenfold

```python
# Fit bivariate Hawkes to synthetic bid/ask arrivals
events_bid, events_ask = simulate_bivariate_hawkes(
    mu=[0.1, 0.1], alpha=[[0.6, 0.3], [0.3, 0.6]],
    beta=[[1.0, 1.0], [1.0, 1.0]], T=1000)
params = fit_bivariate_hawkes(events_bid, events_ask, T=1000)

A_hat = params.alpha / params.beta  # element-wise: kernel integral matrix
print("Estimated kernel matrix A:")
print(A_hat)
print(f"Spectral radius: {np.max(np.abs(np.linalg.eigvals(A_hat))):.3f}")
```
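The tenfold amplification quoted above follows from the stationarity equation λ̄ = (I − A)⁻¹μ, valid when ρ(A) < 1. A quick check with the true parameters from the simulation:

```python
import numpy as np

A = np.array([[0.6, 0.3], [0.3, 0.6]])   # true kernel integral matrix
mu = np.array([0.1, 0.1])                # baseline (exogenous) rates

rho = np.max(np.abs(np.linalg.eigvals(A)))    # spectral radius: 0.9
lam_bar = np.linalg.solve(np.eye(2) - A, mu)  # stationary mean intensities
print(f"rho = {rho:.2f}, stationary rates = {lam_bar}")  # each rate = 10 x mu
```

The solve avoids forming the inverse explicitly, which is the usual numerically preferable choice.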
Key findings:
- A₁₂ ≈ A₂₁ (symmetric) — buy and sell sides excite each other equally
- ρ(A) near 1 means the market is “reflexive”: a single large order triggers a long cascade
- A₁₂ > 0 and A₂₁ > 0 — excitation flows across the book in both directions

Problem: Determine whether a neuron’s spike train is better modeled as Poisson or as a renewal process with a refractory period.
Data: Synthetic spike train generated from a Gamma renewal process with k=5 (regular, short refractory period).
Model comparison: HPP (Poisson null) vs. Gamma renewal.
```python
import numpy as np
from scipy import stats
from scipy.stats import gamma

# Inter-spike intervals (prepend 0 so the first waiting time is included)
isi = np.diff(np.concatenate([[0], spike_times]))

# Fit HPP: rate = 1 / mean ISI
rate_hat = 1.0 / isi.mean()

# KS test for exponential inter-spike intervals (Poisson hypothesis)
ks_stat, p_val = stats.kstest(isi, 'expon', args=(0, 1/rate_hat))
print(f"Poisson KS test: p = {p_val:.4f}")  # expect p << 0.05: Poisson rejected

# Fit Gamma renewal (shape k, scale; location fixed at 0)
k_hat, _, scale_hat = gamma.fit(isi, floc=0)
print(f"Gamma fit: k={k_hat:.2f}, mean={k_hat*scale_hat:.3f}")
print(f"CV = {1/np.sqrt(k_hat):.3f}")  # expect CV ≈ 1/√5 ≈ 0.45 < 1
```
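One hedged way to turn the two fits into an explicit model comparison is AIC. The snippet below regenerates synthetic Gamma(k=5) ISIs so it is self-contained; the scale and sample size are illustrative, not the chapter's:

```python
import numpy as np
from scipy.stats import expon, gamma

rng = np.random.default_rng(0)
isi = gamma.rvs(a=5, scale=0.02, size=500, random_state=rng)  # Gamma(k=5) ISIs

# Exponential (Poisson) model: 1 free parameter (the rate)
ll_expon = expon.logpdf(isi, loc=0, scale=isi.mean()).sum()
aic_expon = 2 * 1 - 2 * ll_expon

# Gamma renewal model: 2 free parameters (shape and scale, location fixed at 0)
k_hat, _, scale_hat = gamma.fit(isi, floc=0)
ll_gamma = gamma.logpdf(isi, k_hat, loc=0, scale=scale_hat).sum()
aic_gamma = 2 * 2 - 2 * ll_gamma

print(f"AIC exponential: {aic_expon:.1f}, AIC gamma: {aic_gamma:.1f}")
# for k = 5 data the Gamma model should win by a wide margin
```

Lower AIC is better; the 2-per-parameter penalty makes the comparison fair despite the Gamma model's extra degree of freedom.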
Key findings:
- k ≈ 5, CV ≈ 0.45 — significantly more regular than Poisson
- Fano factor F ≈ CV² ≈ 0.2 — well below 1, confirming sub-Poisson regularity

Problem: Detect anomalous traffic bursts in server request logs modeled as an NHPP with a daily periodic rate.
Data: Synthetic server requests at a sinusoidal rate λ(t) = 10 · (1 + 0.7·sin(2πt/24)) with occasional injected traffic spikes.
Approach: Fit an NHPP with sinusoidal rate, then flag periods where the observed count significantly exceeds the forecast.
```python
# Fit NHPP with sinusoidal rate
from scipy.optimize import minimize

def nhpp_negloglik(params, events, T):
    lambda0, A, period = params
    intensity = lambda t: lambda0 * (1 + A * np.sin(2*np.pi*t/period))
    # Compensator (analytic for the sinusoidal rate)
    Lambda = lambda t: lambda0 * (t - A*period/(2*np.pi) * np.cos(2*np.pi*t/period))
    term1 = np.sum(np.log(intensity(np.asarray(events))))
    term2 = Lambda(T) - Lambda(0)
    return -(term1 - term2)

result = minimize(nhpp_negloglik, [10.0, 0.5, 24.0], args=(events, T),
                  bounds=[(0.1, 100), (0.0, 0.99), (1, 100)])
lambda0_hat, A_hat, period_hat = result.x

# Flag anomalous hours: observed count vs. expected count Λ(h+1) − Λ(h) under the fit
Lambda_hat = lambda t: lambda0_hat * (t - A_hat*period_hat/(2*np.pi)
                                      * np.cos(2*np.pi*t/period_hat))
for hour in range(int(T)):
    obs = count_events(events, hour, hour + 1)
    expected = Lambda_hat(hour + 1) - Lambda_hat(hour)
    p_val = 1 - stats.poisson.cdf(obs - 1, expected)  # P(N >= obs)
    if p_val < 0.01:
        print(f"Hour {hour}: {obs} events (expected ~{expected:.1f}), p={p_val:.4f} *** ANOMALY")
```
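For completeness, the synthetic traffic itself can be generated by Lewis–Shedler thinning. The function name is illustrative; the rate matches the case-study setup, and `lam_max` must dominate λ(t) everywhere (here 10 · 1.7 = 17):

```python
import numpy as np

def simulate_nhpp_thinning(lam, lam_max, T, rng):
    """Simulate an NHPP on [0, T]: propose HPP(lam_max) points,
    accept each proposal t with probability lam(t) / lam_max."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > T:
            return np.array(events)
        if rng.random() < lam(t) / lam_max:
            events.append(t)

rng = np.random.default_rng(42)
lam = lambda t: 10.0 * (1 + 0.7 * np.sin(2 * np.pi * t / 24))
events = simulate_nhpp_thinning(lam, lam_max=17.0, T=24 * 7, rng=rng)
print(len(events))  # averages ∫λ dt = 10 · 168 = 1680 events over one week
```

Injected spikes can then be simulated by appending a short burst of extra points in a chosen hour.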
See code/14_case_studies.py for the full implementations of all four case studies.
Edge effects: The Hawkes process is fit as if started from an empty history. If data begins mid-sequence (e.g., you start recording 6 hours into a seismic sequence), the early intensity estimates are biased downward. Fix: burn in by conditioning on events before the observation window, or discard early events.
Stationarity assumption: The Hawkes model assumes stationarity (n* < 1), but near major mainshocks n* may temporarily exceed 1 (supercritical cascade). This violates the model but is physically meaningful.
Non-identifiability: For very short observation windows or low event counts, μ, α, and β cannot be separately identified — only n* = α/β and μ/(1−n*) can be estimated reliably.
Magnitude threshold: In seismology, always apply a completeness magnitude cut Mc before fitting. Incomplete catalogs bias all parameter estimates.
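The completeness cut is a one-line filter, and once applied, the Gutenberg-Richter b-value has a closed-form MLE (Aki's estimator, b = log₁₀e / (m̄ − Mc)). A sketch with an illustrative helper name and a synthetic catalog:

```python
import numpy as np

def completeness_cut(times, mags, Mc):
    """Drop events below the completeness magnitude Mc and estimate the
    Gutenberg-Richter b-value via Aki's MLE: b = log10(e) / (mean(m) - Mc)."""
    mask = mags >= Mc
    t, m = times[mask], mags[mask]
    b_hat = np.log10(np.e) / (m.mean() - Mc)
    return t, m, b_hat

# Synthetic catalog with b = 1 above Mc = 2: G-R magnitude exceedances
# are exponential with rate b * ln(10)
rng = np.random.default_rng(1)
mags = 2.0 + rng.exponential(scale=1 / np.log(10), size=5000)
times = np.sort(rng.uniform(0, 30, size=5000))
_, _, b_hat = completeness_cut(times, mags, Mc=2.0)
print(f"b ≈ {b_hat:.2f}")  # should recover b ≈ 1
```

If the catalog is incomplete below Mc, skipping this cut inflates the apparent deficit of small events and biases both b and the ETAS productivity parameters.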
| ← Chapter 13 | Table of Contents | Chapter 15: Action Plan → |