Chapter 1: Mathematical Foundations

Before diving into specific models, we need a precise mathematical vocabulary. This chapter introduces the key objects — counting processes, compensators, and the conditional intensity — and shows how they fit together. The level is rigorous enough to support the rest of the book, but we prioritize intuition over full measure-theoretic generality.


1.1 Counting Processes

A point process on [0, ∞) is a random locally finite counting measure: a random way of placing a finite number of points in any bounded interval.

We represent it as a counting process N(t):

N(t) = |{i : tᵢ ≤ t}|  (number of events in (0, t])

N(t) has the following structural properties: N(0) = 0; N(t) is integer-valued and non-decreasing; and its sample paths are right-continuous, jumping by exactly 1 at each event time (for a simple process, in which no two events coincide).

The increment N(a, b] = N(b) − N(a) counts events in the half-open interval (a, b].
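These definitions translate directly into code. The sketch below (with made-up event times) computes N(t) and the increment N(a, b] for a sorted list of event times; the function names are illustrative, not from any library:

```python
import bisect

# Hypothetical event times of a point process on (0, T], sorted.
event_times = [0.4, 1.1, 1.7, 3.2, 4.9]

def N(t, times=event_times):
    """Counting process N(t) = |{i : t_i <= t}|."""
    return bisect.bisect_right(times, t)

def increment(a, b, times=event_times):
    """N(a, b] = N(b) - N(a): number of events in the half-open interval (a, b]."""
    return N(b, times) - N(a, times)

print(N(2.0))              # 3 events in (0, 2]
print(increment(1.0, 4.0)) # 3 events in (1, 4]
```

Using bisect_right implements the "≤ t" convention exactly: an event at time t itself is counted by N(t), matching right-continuity of the sample paths.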


1.2 The History (Filtration)

To talk about conditioning on the past, we need the concept of a filtration {F_t}_{t≥0}. Intuitively, F_t is “everything we know up to and including time t”: the event times, the marks (if any), and any external covariates up to that point.

We write F_{t-} for the history strictly before time t — what we know just before a potential event at t.


1.3 The Conditional Intensity Function

The central object in point process theory is the conditional intensity function (also called the hazard rate or stochastic intensity):

λ*(t) = lim_{dt→0} E[N(t, t+dt] | F_{t-}] / dt

The star notation λ*(t) emphasizes that this is a random variable: it depends on the history F_{t-}. For a Poisson process it reduces to a deterministic function; for a Hawkes process it depends on all past event times.

Interpretation: λ*(t) · dt is the probability of an event in the tiny interval (t, t+dt], given everything that happened before t.

The conditional intensity function completely characterizes the distribution of a simple point process. If two simple point processes induce the same λ*(t) (as a functional of the history) for all t, they have the same distribution.
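To make the dependence on history concrete, here is a sketch of the conditional intensity of an exponential-kernel Hawkes process, previewed above; the parameter values are arbitrary illustrations. Note the strict inequality tᵢ < t, which makes λ*(t) depend only on F_{t-}:

```python
import math

def hawkes_intensity(t, history, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity lambda*(t) = mu + alpha * sum_{t_i < t} exp(-beta (t - t_i)).

    mu, alpha, beta are illustrative parameters (mu > 0, alpha >= 0, beta > 0).
    Only events strictly before t contribute: lambda*(t) is F_{t-}-measurable.
    """
    return mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in history if ti < t)

history = [1.0, 2.5]
print(hawkes_intensity(0.5, history))  # no past events yet: baseline mu = 0.5
print(hawkes_intensity(3.0, history))  # elevated: both past events still excite
```

Evaluating at two times shows the key contrast with a Poisson process: before any event the intensity equals the deterministic baseline, while after events it is a random quantity determined by the realized history.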


1.4 The Compensator and Doob-Meyer Decomposition

Define the compensator (or cumulative conditional intensity):

Λ*(t) = ∫₀ᵗ λ*(s) ds

The Doob-Meyer decomposition theorem tells us that every counting process N(t) (a submartingale, since it is non-decreasing) can be decomposed uniquely as:

N(t) = Λ*(t) + M(t)

where M(t) is a martingale (a process with no predictable drift: E[M(t) | F_s] = M(s) for s ≤ t).

This decomposition is fundamental: Λ*(t) is the “predictable” part of N(t), and M(t) is pure noise. In particular, E[N(t)] = E[Λ*(t)]: the compensator tracks the expected cumulative number of events given the evolving history.
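The martingale property of M(t) = N(t) − Λ*(t) can be checked by simulation. The sketch below uses a homogeneous Poisson process (where Λ*(t) = λt) and verifies that M(t) averages to zero over many realizations; the rate, horizon, and run count are arbitrary choices:

```python
import random

random.seed(0)

def poisson_count(rate, t):
    """N(t) for a homogeneous Poisson process on (0, t],
    simulated via i.i.d. exponential inter-event times."""
    n, s = 0, 0.0
    while True:
        s += random.expovariate(rate)
        if s > t:
            return n
        n += 1

# For this process the compensator is Lambda*(t) = rate * t, so
# M(t) = N(t) - rate * t should have mean zero.
rate, t, runs = 2.0, 5.0, 20000
m_bar = sum(poisson_count(rate, t) - rate * t for _ in range(runs)) / runs
print(abs(m_bar))  # close to 0
```

The residual sample mean shrinks like 1/√runs, which is exactly the "pure noise" behavior the decomposition promises.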


1.5 Campbell’s Theorem

A key result for computing expectations involving point processes is Campbell’s theorem: for a non-negative measurable function f,

E[∑ᵢ f(tᵢ)] = E[∫ f(t) λ*(t) dt]

For a Poisson process with deterministic intensity λ(t), this simplifies to:

E[∑ᵢ f(tᵢ)] = ∫ f(t) λ(t) dt

Campbell’s theorem is a Fubini-type result for point processes: it lets us exchange expectation with summation over the random points, and it underlies the derivation of log-likelihoods.
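The Poisson form of Campbell's theorem is easy to check by Monte Carlo. The sketch below takes f(t) = e⁻ᵗ and a rate-3 Poisson process on (0, 10], for which the right-hand side ∫ f(t) λ dt = 3(1 − e⁻¹⁰) has a closed form; all numerical choices here are illustrative:

```python
import math
import random

random.seed(1)

def poisson_times(rate, t_max):
    """Event times of a homogeneous Poisson process on (0, t_max]."""
    times, s = [], 0.0
    while True:
        s += random.expovariate(rate)
        if s > t_max:
            return times
        times.append(s)

# Check E[sum_i f(t_i)] = integral of f(t) * lambda dt for f(t) = exp(-t).
rate, t_max, runs = 3.0, 10.0, 20000
lhs = sum(sum(math.exp(-ti) for ti in poisson_times(rate, t_max))
          for _ in range(runs)) / runs
rhs = rate * (1.0 - math.exp(-t_max))
print(lhs, rhs)  # the two sides should agree to within Monte Carlo error
```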


1.6 The Log-Likelihood of a Point Process

Given observations t_1 < t_2 < ... < t_n on [0, T], the log-likelihood under a model with conditional intensity λ*(t; θ) is:

ℓ(θ) = ∑ᵢ₌₁ⁿ log λ*(tᵢ; θ) − ∫₀ᵀ λ*(t; θ) dt

This formula is the cornerstone of inference for point processes. The first term rewards the model for assigning high intensity at the observed event times. The second term penalizes for predicting too many events overall (the integral is the expected number of events under the model).

This formula will recur throughout the book; each model family yields a different structure for λ*(t) that makes the integral more or less tractable.
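For the simplest case, a homogeneous Poisson model λ*(t; θ) = θ, the integral is just θT and the maximizer is the empirical rate n/T. A minimal sketch, with made-up event times:

```python
import math

def poisson_loglik(rate, times, t_end):
    """Log-likelihood of a homogeneous Poisson model lambda*(t) = rate:
    sum_i log(rate) - rate * T."""
    return len(times) * math.log(rate) - rate * t_end

# Hypothetical observed event times on (0, 10].
times, t_end = [0.7, 1.9, 3.4, 5.2, 8.8], 10.0

# The MLE n/T maximizes the log-likelihood in this model.
mle = len(times) / t_end  # 0.5
print(poisson_loglik(mle, times, t_end) >= poisson_loglik(0.3, times, t_end))  # True
print(poisson_loglik(mle, times, t_end) >= poisson_loglik(0.8, times, t_end))  # True
```

The two terms of the formula are visible in the function body: the sum of log-intensities at the events, minus the integral of the intensity over the observation window.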


1.7 Stationarity

A point process is stationary (in the wide sense) if its distribution is invariant under time shifts: for any τ, the process {N(t+τ) − N(τ)}_{t≥0} has the same distribution as {N(t)}_{t≥0}.

For a stationary process, the mean rate is well-defined:

λ̄ = lim_{T→∞} N(T) / T  (almost surely, provided the process is ergodic)

Stationarity is a key assumption for the Hawkes process: we require the branching ratio n* < 1 to ensure the process does not explode.
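The mean-rate limit above suggests the obvious estimator λ̄ ≈ N(T)/T over a long horizon. A quick sketch with a homogeneous Poisson process (true rate 1.5, chosen arbitrarily):

```python
import random

random.seed(2)

# Estimate the mean rate of a stationary process as N(T)/T
# for a long horizon T; here the process is homogeneous Poisson.
rate, t_max = 1.5, 50000.0
n, s = 0, 0.0
while True:
    s += random.expovariate(rate)
    if s > t_max:
        break
    n += 1
print(n / t_max)  # close to the true rate 1.5
```

For a stationary Hawkes process with branching ratio n* < 1, the same estimator converges to the stationary rate μ/(1 − n*), a fact used repeatedly in later chapters.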


1.8 Key Takeaways

A counting process N(t) records the cumulative number of events up to time t, and the filtration F_t encodes everything known by time t.

The conditional intensity λ*(t) is the instantaneous event rate given the history F_{t-}; it completely characterizes the distribution of a simple point process.

The compensator Λ*(t) = ∫₀ᵗ λ*(s) ds is the predictable part of N(t), and N(t) − Λ*(t) is a martingale (Doob-Meyer).

Campbell’s theorem converts expectations of sums over events into integrals against the intensity, and the log-likelihood ℓ(θ) = ∑ᵢ log λ*(tᵢ; θ) − ∫₀ᵀ λ*(t; θ) dt is the basis of inference throughout the book.
