So far, every event has been just a time. In practice, events carry additional information: an earthquake has a magnitude, a trade has a size and direction, a neural spike has a waveform. A marked point process augments the event times with marks — random variables attached to each event.
A marked point process is a collection of pairs {(tᵢ, mᵢ)}, where:
tᵢ ∈ [0, ∞) is the event time (the ground process)mᵢ ∈ M is the mark (a real number, a vector, a category, etc.)The ground process N_g(t) = |{i : tᵢ ≤ t}| is the point process of times, ignoring marks.
The full conditional intensity factors as:
λ*(t, m) = λ_g*(t) · p(m | t, F_{t-})
where:
λ_g*(t) is the conditional intensity of the ground processp(m | t, F_{t-}) is the conditional mark distribution given the current time and historyThe simplest case is independent marks: the mark distribution does not depend on the event time or past history:
p(m | t, F_{t-}) = p(m)
In this case, the log-likelihood separates:
ℓ = ℓ_ground + ℓ_marks
ℓ_ground = Σᵢ log λ_g*(tᵢ) − ∫ λ_g*(t) dt
ℓ_marks = Σᵢ log p(mᵢ)
Ground process and mark distribution can be estimated independently. The ground process can be any of the models covered in previous chapters (HPP, NHPP, Hawkes, Cox).
More interesting is when marks depend on the history. In seismology, for example, larger mainshocks tend to produce larger aftershocks — the mark distribution changes as the process evolves.
If p(m | t, F_{t-}) = p(m | λ_g*(t)), i.e., marks depend on the current intensity, the log-likelihood no longer separates, and joint estimation is required.
The most influential marked point process model in seismology is the Epidemic-Type Aftershock Sequence (ETAS) model (Ogata, 1988). Each earthquake at time tᵢ with magnitude mᵢ contributes to future seismicity according to:
λ*(t) = μ + Σ_{tᵢ < t} K · exp(α(mᵢ − Mc)) / (t − tᵢ + c)^p
where:
μ is the background rateK is the productivity parameterα controls magnitude-dependent productivityMc is the completeness magnitude thresholdc, p are the modified Omori law parameters (typically p ≈ 1.1)p(m) = β_GR · exp(−β_GR(m − Mc)) for m ≥ McThe key feature: large events (mᵢ ≫ Mc) generate exponentially more offspring. This is why major earthquakes produce long, intense aftershock sequences.
Simulation combines branching structure with magnitude-dependent productivity:
def simulate_etas(mu, K, alpha, c, p, beta_gr, Mc, T):
events = [] # list of (time, magnitude)
def add_offspring(t_parent, m_parent, depth=0):
if depth > 50:
return # safety limit
n_offspring = np.random.poisson(K * np.exp(alpha * (m_parent - Mc)))
for _ in range(n_offspring):
# Sample delay from modified Omori (truncated power law)
delay = c * ((np.random.uniform()**(-1/(p-1))) - 1)
t_child = t_parent + delay
if t_child < T:
m_child = Mc + np.random.exponential(1.0 / beta_gr)
events.append((t_child, m_child))
add_offspring(t_child, m_child, depth + 1)
# Background events
n_bg = np.random.poisson(mu * T)
bg_times = np.random.uniform(0, T, n_bg)
for t in bg_times:
m = Mc + np.random.exponential(1.0 / beta_gr)
events.append((t, m))
add_offspring(t, m)
events.sort(key=lambda x: x[0])
return events
See code/10_marked_point_processes.py for the full implementation.
In seismology, earthquake magnitudes follow the Gutenberg-Richter law:
log₁₀ N(≥m) = a − b · m
where N(≥m) is the number of earthquakes with magnitude ≥ m, and b ≈ 1 is the b-value (nearly universal). This is an exponential distribution on the magnitude scale:
p(m) = β_GR · exp(−β_GR · (m − Mc)), m ≥ Mc
β_GR = b · ln(10)
The MLE of β_GR from observed magnitudes m₁, ..., mₙ (all ≥ Mc) is:
β̂_GR = n / Σᵢ(mᵢ − Mc) = 1 / (m̄ − Mc)
λ*(t, m) = λ_g*(t) · p(m | t, history).| ← Chapter 9 | Table of Contents | Chapter 11: Maximum Likelihood Estimation → |