Theory and algorithms only become real when applied to concrete problems. This chapter walks through four end-to-end case studies — seismology, financial markets, neuroscience, and system monitoring — each illustrating a different point process family and a different set of practical challenges.
Problem: Given a synthetic earthquake catalog, fit the ETAS model and forecast aftershock activity.
Data: A mainshock at t=0, m=6.5, followed by aftershocks over 30 days (simulated via the branching ETAS model).
Model: ETAS (chapter 10) with Gutenberg-Richter magnitudes.
Pipeline:
```python
import numpy as np
from scipy import stats

# 1. Load catalog
times, magnitudes = load_catalog("synthetic_quakes.csv")

# 2. Fit ETAS parameters via MLE
params_hat = fit_etas(times, magnitudes, Mc=2.0, T=30.0)

# 3. Goodness of fit: time-rescaling theorem
tau = compute_rescaled_times_etas(times, magnitudes, params_hat, Mc=2.0)
u = 1 - np.exp(-np.diff(np.concatenate([[0], tau])))  # should be Uniform(0, 1) if the model fits
ks_stat, p_value = stats.kstest(u, 'uniform')

# 4. Forecast: expected number of M≥3 events in next 7 days
lambda_forecast = forecast_etas(params_hat, times, magnitudes, t_start=30, T_fore=7)
```
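The helpers above (`load_catalog`, `fit_etas`, `forecast_etas`) live in the companion code; the quantity they all revolve around is the ETAS conditional intensity λ(t) = μ + Σ_{tᵢ<t} K·exp(α(mᵢ−Mc))·(t−tᵢ+c)^(−p). A minimal sketch, with an illustrative function name and parameter values (not the fitted ones):

```python
import numpy as np

def etas_intensity(t, times, mags, mu, K, alpha, c, p, Mc):
    """ETAS conditional intensity: background rate mu plus the Omori-decaying
    contribution of every past event, scaled by its magnitude productivity."""
    past = times < t
    dt = t - times[past]
    return mu + np.sum(K * np.exp(alpha * (mags[past] - Mc)) * (dt + c) ** (-p))

# One m=6.5 mainshock at t=0: one day later the intensity is still
# dominated by its aftershock term, far above the background mu
lam = etas_intensity(1.0, np.array([0.0]), np.array([6.5]),
                     mu=0.1, K=0.02, alpha=1.0, c=0.01, p=1.1, Mc=2.0)
print(lam)
```

With no past events the function reduces to the background rate μ, which makes it easy to sanity-check.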
Key findings:
- Fitted Omori-law exponent p ≈ 1.1, typical of aftershock sequences
- The branching ratio n* = K·E[exp(α(m−Mc))] quantifies cascade productivity

Problem: Model the interplay between bid and ask order arrivals in a limit order book.
Data: Synthetic bid/ask arrival sequences generated from a 2D Hawkes model with known parameters.
Model: Bivariate Hawkes with exponential kernels (chapter 8).
Key parameters:
- A = [[0.6, 0.3], [0.3, 0.6]] — strong self-excitation, moderate cross-excitation
- ρ(A) = 0.9 — near the stability boundary (realistic for liquid markets)
- Stationary rate μ / (1 − 0.9) = 10μ — the cascade amplifies the baseline tenfold

```python
# Fit bivariate Hawkes to synthetic bid/ask arrivals
events_bid, events_ask = simulate_bivariate_hawkes(
    mu=[0.1, 0.1], alpha=[[0.6, 0.3], [0.3, 0.6]],
    beta=[[1.0, 1.0], [1.0, 1.0]], T=1000)
params = fit_bivariate_hawkes(events_bid, events_ask, T=1000)

A_hat = params.alpha / params.beta  # element-wise: kernel integral matrix
print("Estimated kernel matrix A:")
print(A_hat)
print(f"Spectral radius: {np.max(np.abs(np.linalg.eigvals(A_hat))):.3f}")
```
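The tenfold amplification quoted above follows from the stationarity equation λ̄ = (I − A)⁻¹μ, valid when ρ(A) < 1. A quick check with the true parameters from the simulation:

```python
import numpy as np

A = np.array([[0.6, 0.3], [0.3, 0.6]])   # true kernel integral matrix
mu = np.array([0.1, 0.1])                # baseline (exogenous) rates

rho = np.max(np.abs(np.linalg.eigvals(A)))    # spectral radius: 0.9
lam_bar = np.linalg.solve(np.eye(2) - A, mu)  # stationary mean intensities
print(f"rho = {rho:.2f}, stationary rates = {lam_bar}")  # each rate = 10 x mu
```

The solve avoids forming the inverse explicitly, which is the usual numerically preferable choice.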
Key findings:
- A₁₂ ≈ A₂₁ (symmetric) — buy and sell sides excite each other equally
- ρ(A) near 1 means the market is “reflexive”: a single large order triggers a long cascade
- A₁₂ > 0 and A₂₁ > 0 — excitation flows across the book in both directions

Problem: Determine whether a neuron’s spike train is better modeled as Poisson or as a renewal process with a refractory period.
Data: Synthetic spike train generated from a Gamma renewal process with k=5 (regular, short refractory period).
Model comparison: HPP (Poisson null) vs. Gamma renewal.
```python
import numpy as np
from scipy import stats
from scipy.stats import gamma

# Inter-spike intervals (prepend 0 so the first waiting time is included)
isi = np.diff(np.concatenate([[0], spike_times]))

# Fit HPP: rate = 1 / mean ISI
rate_hat = 1.0 / isi.mean()

# KS test for exponential inter-spike intervals (Poisson hypothesis)
ks_stat, p_val = stats.kstest(isi, 'expon', args=(0, 1/rate_hat))
print(f"Poisson KS test: p = {p_val:.4f}")  # expect p << 0.05: Poisson rejected

# Fit Gamma renewal (shape k, scale; location fixed at 0)
k_hat, _, scale_hat = gamma.fit(isi, floc=0)
print(f"Gamma fit: k={k_hat:.2f}, mean={k_hat*scale_hat:.3f}")
print(f"CV = {1/np.sqrt(k_hat):.3f}")  # expect CV ≈ 1/√5 ≈ 0.45 < 1
```
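One hedged way to turn the two fits into an explicit model comparison is AIC. The snippet below regenerates synthetic Gamma(k=5) ISIs so it is self-contained; the scale and sample size are illustrative, not the chapter's:

```python
import numpy as np
from scipy.stats import expon, gamma

rng = np.random.default_rng(0)
isi = gamma.rvs(a=5, scale=0.02, size=500, random_state=rng)  # Gamma(k=5) ISIs

# Exponential (Poisson) model: 1 free parameter (the rate)
ll_expon = expon.logpdf(isi, loc=0, scale=isi.mean()).sum()
aic_expon = 2 * 1 - 2 * ll_expon

# Gamma renewal model: 2 free parameters (shape and scale, location fixed at 0)
k_hat, _, scale_hat = gamma.fit(isi, floc=0)
ll_gamma = gamma.logpdf(isi, k_hat, loc=0, scale=scale_hat).sum()
aic_gamma = 2 * 2 - 2 * ll_gamma

print(f"AIC exponential: {aic_expon:.1f}, AIC gamma: {aic_gamma:.1f}")
# for k = 5 data the Gamma model should win by a wide margin
```

Lower AIC is better; the 2-per-parameter penalty makes the comparison fair despite the Gamma model's extra degree of freedom.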
Key findings:
- k ≈ 5, CV ≈ 0.45 — significantly more regular than Poisson
- Fano factor F ≈ CV² ≈ 0.2 — well below 1, confirming sub-Poisson regularity

Problem: Detect anomalous traffic bursts in server request logs modeled as an NHPP with a daily periodic rate.
Data: Synthetic server requests at a sinusoidal rate λ(t) = 10 · (1 + 0.7·sin(2πt/24)) with occasional injected traffic spikes.
Approach: Fit an NHPP with sinusoidal rate, then flag periods where the observed count significantly exceeds the forecast.
```python
# Fit NHPP with sinusoidal rate
from scipy.optimize import minimize

def nhpp_negloglik(params, events, T):
    lambda0, A, period = params
    intensity = lambda t: lambda0 * (1 + A * np.sin(2*np.pi*t/period))
    # Compensator (analytic for the sinusoidal rate)
    Lambda = lambda t: lambda0 * (t - A*period/(2*np.pi) * np.cos(2*np.pi*t/period))
    term1 = np.sum(np.log(intensity(np.asarray(events))))
    term2 = Lambda(T) - Lambda(0)
    return -(term1 - term2)

result = minimize(nhpp_negloglik, [10.0, 0.5, 24.0], args=(events, T),
                  bounds=[(0.1, 100), (0.0, 0.99), (1, 100)])
lambda0_hat, A_hat, period_hat = result.x

# Flag anomalous hours: observed count vs. expected count Λ(h+1) − Λ(h) under the fit
Lambda_hat = lambda t: lambda0_hat * (t - A_hat*period_hat/(2*np.pi)
                                      * np.cos(2*np.pi*t/period_hat))
for hour in range(int(T)):
    obs = count_events(events, hour, hour + 1)
    expected = Lambda_hat(hour + 1) - Lambda_hat(hour)
    p_val = 1 - stats.poisson.cdf(obs - 1, expected)  # P(N >= obs)
    if p_val < 0.01:
        print(f"Hour {hour}: {obs} events (expected ~{expected:.1f}), p={p_val:.4f} *** ANOMALY")
```
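For completeness, the synthetic traffic itself can be generated by Lewis–Shedler thinning. The function name is illustrative; the rate matches the case-study setup, and `lam_max` must dominate λ(t) everywhere (here 10 · 1.7 = 17):

```python
import numpy as np

def simulate_nhpp_thinning(lam, lam_max, T, rng):
    """Simulate an NHPP on [0, T]: propose HPP(lam_max) points,
    accept each proposal t with probability lam(t) / lam_max."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > T:
            return np.array(events)
        if rng.random() < lam(t) / lam_max:
            events.append(t)

rng = np.random.default_rng(42)
lam = lambda t: 10.0 * (1 + 0.7 * np.sin(2 * np.pi * t / 24))
events = simulate_nhpp_thinning(lam, lam_max=17.0, T=24 * 7, rng=rng)
print(len(events))  # averages ∫λ dt = 10 · 168 = 1680 events over one week
```

Injected spikes can then be simulated by appending a short burst of extra points in a chosen hour.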
See code/14_case_studies.py for the full implementations of all four case studies.
Edge effects: The Hawkes process is fit as if started from an empty history. If data begins mid-sequence (e.g., you start recording 6 hours into a seismic sequence), the early intensity estimates are biased downward. Fix: burn in by conditioning on events before the observation window, or discard early events.
Stationarity assumption: The Hawkes model assumes stationarity (n* < 1), but near major mainshocks n* may temporarily exceed 1 (supercritical cascade). This violates the model but is physically meaningful.
Non-identifiability: For very short observation windows or low event counts, μ, α, and β cannot be separately identified — only n* = α/β and μ/(1−n*) can be estimated reliably.
Magnitude threshold: In seismology, always apply a completeness magnitude cut Mc before fitting. Incomplete catalogs bias all parameter estimates.
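The completeness cut is a one-line filter, and once applied, the Gutenberg-Richter b-value has a closed-form MLE (Aki's estimator, b = log₁₀e / (m̄ − Mc)). A sketch with an illustrative helper name and a synthetic catalog:

```python
import numpy as np

def completeness_cut(times, mags, Mc):
    """Drop events below the completeness magnitude Mc and estimate the
    Gutenberg-Richter b-value via Aki's MLE: b = log10(e) / (mean(m) - Mc)."""
    mask = mags >= Mc
    t, m = times[mask], mags[mask]
    b_hat = np.log10(np.e) / (m.mean() - Mc)
    return t, m, b_hat

# Synthetic catalog with b = 1 above Mc = 2: G-R magnitude exceedances
# are exponential with rate b * ln(10)
rng = np.random.default_rng(1)
mags = 2.0 + rng.exponential(scale=1 / np.log(10), size=5000)
times = np.sort(rng.uniform(0, 30, size=5000))
_, _, b_hat = completeness_cut(times, mags, Mc=2.0)
print(f"b ≈ {b_hat:.2f}")  # should recover b ≈ 1
```

If the catalog is incomplete below Mc, skipping this cut inflates the apparent deficit of small events and biases both b and the ETAS productivity parameters.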
| ← Chapter 13 | Table of Contents | Chapter 15: Action Plan → |