Choosing the right database and designing an efficient schema is crucial for a comment system that performs well and scales with your needs.
PostgreSQL is the gold standard for comment systems:
SQLite works well for smaller sites:
MySQL/MariaDB remains popular:
MongoDB offers flexibility:
Redis for caching layer:
Firestore/Firebase for serverless:
| Service | Free Tier | Best For |
|---|---|---|
| Supabase | 500MB | PostgreSQL + Auth |
| PlanetScale | 5GB | MySQL, branching |
| Turso | 8GB | Edge SQLite |
| Neon | 512MB | Serverless PostgreSQL |
| MongoDB Atlas | 512MB | Document storage |
CREATE TABLE comments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
page_id VARCHAR(255) NOT NULL,
parent_id UUID REFERENCES comments(id),
-- Author information
author_name VARCHAR(100) NOT NULL,
author_email VARCHAR(255),
author_url VARCHAR(500),
author_avatar VARCHAR(500),
-- Content
content TEXT NOT NULL,
content_html TEXT, -- Pre-rendered HTML
-- Metadata
status VARCHAR(20) DEFAULT 'pending',
ip_address INET,
user_agent TEXT,
-- Timestamps
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
published_at TIMESTAMP WITH TIME ZONE,
-- Indexes
CONSTRAINT valid_status CHECK (status IN ('pending', 'approved', 'spam', 'deleted'))
);
CREATE INDEX idx_comments_page ON comments(page_id);
CREATE INDEX idx_comments_parent ON comments(parent_id);
CREATE INDEX idx_comments_status ON comments(status);
CREATE INDEX idx_comments_created ON comments(created_at DESC);
Pages Table - Track pages with comments:
CREATE TABLE pages (
id VARCHAR(255) PRIMARY KEY, -- URL path or slug
url VARCHAR(500) NOT NULL,
title VARCHAR(255),
comment_count INTEGER DEFAULT 0,
last_comment_at TIMESTAMP WITH TIME ZONE,
settings JSONB DEFAULT '{}',
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
Users Table - For authenticated comments:
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
provider VARCHAR(50) NOT NULL, -- github, google, email
provider_id VARCHAR(255) NOT NULL,
email VARCHAR(255),
name VARCHAR(100),
avatar_url VARCHAR(500),
role VARCHAR(20) DEFAULT 'commenter',
banned_at TIMESTAMP WITH TIME ZONE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
UNIQUE(provider, provider_id)
);
Reactions Table - Likes, upvotes, etc.:
CREATE TABLE reactions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
comment_id UUID REFERENCES comments(id) ON DELETE CASCADE,
user_identifier VARCHAR(255) NOT NULL, -- User ID or hash
reaction_type VARCHAR(20) NOT NULL, -- like, heart, helpful
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
UNIQUE(comment_id, user_identifier, reaction_type)
);
Simple structure, comments displayed chronologically:
Comment 1
Comment 2
Comment 3
Comment 4
Implementation: parent_id is always NULL.
Common in blogs - replies grouped under parent:
Comment 1
├── Reply 1.1
├── Reply 1.2
└── Reply 1.3
Comment 2
└── Reply 2.1
Implementation: parent_id references top-level comment only.
Unlimited nesting levels:
Comment 1
├── Reply 1.1
│ └── Reply 1.1.1
│ └── Reply 1.1.1.1
├── Reply 1.2
Implementation: Use recursive queries or materialized path.
Store the path to each comment for efficient queries:
ALTER TABLE comments ADD COLUMN path VARCHAR(1000);
-- Example paths:
-- Comment 1: '001'
-- Reply 1.1: '001.001'
-- Reply 1.1.1: '001.001.001'
Query all descendants:
SELECT * FROM comments
WHERE path LIKE '001.%'
ORDER BY path;
For complex hierarchies:
CREATE TABLE comment_closure (
ancestor_id UUID REFERENCES comments(id),
descendant_id UUID REFERENCES comments(id),
depth INTEGER NOT NULL,
PRIMARY KEY (ancestor_id, descendant_id)
);
Comment counts - Maintain counters:
-- Trigger to update page comment count
CREATE OR REPLACE FUNCTION update_comment_count()
RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' AND NEW.status = 'approved' THEN
UPDATE pages SET comment_count = comment_count + 1
WHERE id = NEW.page_id;
ELSIF TG_OP = 'UPDATE' THEN
IF OLD.status != 'approved' AND NEW.status = 'approved' THEN
UPDATE pages SET comment_count = comment_count + 1
WHERE id = NEW.page_id;
ELSIF OLD.status = 'approved' AND NEW.status != 'approved' THEN
UPDATE pages SET comment_count = comment_count - 1
WHERE id = NEW.page_id;
END IF;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Reply counts - Track per comment:
ALTER TABLE comments ADD COLUMN reply_count INTEGER DEFAULT 0;
Essential indexes:
-- Composite index for common query
CREATE INDEX idx_comments_page_status_created
ON comments(page_id, status, created_at DESC);
-- Partial index for approved comments only
CREATE INDEX idx_comments_approved
ON comments(page_id, created_at DESC)
WHERE status = 'approved';
-- Full-text search index
CREATE INDEX idx_comments_content_search
ON comments USING gin(to_tsvector('english', content));
Efficient comment fetching:
-- Get comments with reply counts
SELECT
c.*,
COUNT(r.id) AS reply_count
FROM comments c
LEFT JOIN comments r ON r.parent_id = c.id AND r.status = 'approved'
WHERE c.page_id = $1
AND c.status = 'approved'
AND c.parent_id IS NULL
GROUP BY c.id
ORDER BY c.created_at DESC
LIMIT 50;
Handle deletions gracefully:
-- Soft delete pattern
ALTER TABLE comments
ADD COLUMN deleted_at TIMESTAMP WITH TIME ZONE;
-- Archive instead of delete
CREATE TABLE comments_archive (LIKE comments INCLUDING ALL);
Track all changes:
CREATE TABLE comment_history (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
comment_id UUID NOT NULL,
action VARCHAR(20) NOT NULL, -- created, edited, moderated
old_data JSONB,
new_data JSONB,
actor_id UUID, -- Moderator or system
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
Disqus provides XML export:
Direct database access:
-- WordPress comments structure
SELECT
comment_ID,
comment_post_ID,
comment_parent,
comment_author,
comment_author_email,
comment_content,
comment_date
FROM wp_comments
WHERE comment_approved = '1';
For static site generators:
import json
import psycopg2
def import_comments(json_file, db_conn):
with open(json_file) as f:
comments = json.load(f)
cursor = db_conn.cursor()
for comment in comments:
cursor.execute("""
INSERT INTO comments (page_id, author_name, content, created_at)
VALUES (%s, %s, %s, %s)
""", (comment['page'], comment['author'],
comment['text'], comment['date']))
db_conn.commit()
| Aspect | Recommendation |
|---|---|
| Database | PostgreSQL for features, SQLite for simplicity |
| Threading | Two-level for blogs, materialized path for forums |
| Indexes | Composite on (page_id, status, created_at) |
| Counts | Denormalize with triggers |
| Deletions | Soft delete with archive |
In the next chapter, we’ll implement the backend API using these schemas.
Navigation: