# Database Optimization Techniques for Better Performance

Database performance can make or break your application. Here are essential optimization techniques every developer should know.
## Indexing

### PostgreSQL Indexes

```sql
-- Single-column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (column order matters!)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Partial index (indexes a subset of rows)
CREATE INDEX idx_active_users ON users(email) WHERE active = true;

-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- Expression index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
```

### MongoDB Indexes
```javascript
// Single-field index
db.users.createIndex({ email: 1 });

// Compound index
db.orders.createIndex({ userId: 1, createdAt: -1 });

// Text index for search
db.posts.createIndex({ title: "text", content: "text" });

// TTL index (documents expire automatically)
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 });
```

### When to Index
✅ Index columns used in WHERE clauses
✅ Index columns used in JOIN conditions
✅ Index columns used in ORDER BY
✅ Index foreign keys
❌ Don't index columns with low selectivity (few distinct values, such as a boolean flag)
❌ Don't over-index (every extra index slows down writes)
❌ Don't index small tables (a sequential scan is often faster)
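Selectivity can be estimated directly from your data: the ratio of distinct values to total rows. A quick TypeScript sketch of the idea (a helper written for this post, not a library function):

```typescript
// Estimate column selectivity: distinct values / total rows.
// Near 1 (e.g. emails): indexes well. Near 0 (e.g. a boolean flag): doesn't.
function selectivity(values: unknown[]): number {
  if (values.length === 0) return 0;
  return new Set(values).size / values.length;
}

const emails = ['a@x.com', 'b@x.com', 'c@x.com', 'd@x.com'];
const activeFlags = [true, true, false, true];

console.log(selectivity(emails));      // 1   (every value distinct: great candidate)
console.log(selectivity(activeFlags)); // 0.5 (poor candidate)
```

Databases track the same statistic themselves (PostgreSQL exposes it as `n_distinct` in `pg_stats`), which is what the planner uses to decide whether an index is worth consulting.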
## Query Optimization

### Use EXPLAIN

```sql
-- PostgreSQL
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';

-- Look for:
-- - Seq Scan (bad for large tables)
-- - Index Scan (good)
-- - estimated cost vs. actual time
```

### Avoid SELECT *
```sql
-- Bad
SELECT * FROM users WHERE id = 1;

-- Good
SELECT id, name, email FROM users WHERE id = 1;
```

### Pagination
```sql
-- Bad (OFFSET scans and discards every skipped row, so it gets slower as the offset grows)
SELECT * FROM posts ORDER BY created_at DESC OFFSET 10000 LIMIT 20;

-- Good (cursor-based / keyset pagination)
SELECT * FROM posts
WHERE created_at < '2024-08-15'
ORDER BY created_at DESC
LIMIT 20;
-- Tip: if created_at isn't unique, paginate on (created_at, id) so rows aren't skipped
```

### Batch Operations
```sql
-- Bad (one round trip per row)
INSERT INTO logs (message) VALUES ('log 1');
INSERT INTO logs (message) VALUES ('log 2');
INSERT INTO logs (message) VALUES ('log 3');

-- Good (one statement, one round trip)
INSERT INTO logs (message) VALUES
  ('log 1'),
  ('log 2'),
  ('log 3');
```

## N+1 Query Problem
### The Problem

```javascript
// Bad - N+1 queries
const posts = await db.query('SELECT * FROM posts');
for (const post of posts) {
  // This runs N times!
  const author = await db.query('SELECT * FROM users WHERE id = ?', [post.authorId]);
}
```

### The Solution
```javascript
// Good - Single query with JOIN
const posts = await db.query(`
  SELECT posts.*, users.name AS author_name
  FROM posts
  JOIN users ON posts.author_id = users.id
`);

// Or use an IN clause (two queries total, no matter how many posts)
const posts = await db.query('SELECT * FROM posts');
const authorIds = posts.map(p => p.authorId);
const authors = await db.query('SELECT * FROM users WHERE id IN (?)', [authorIds]);
// Attach authors back to their posts in memory
const authorsById = new Map(authors.map(a => [a.id, a]));
for (const post of posts) {
  post.author = authorsById.get(post.authorId);
}
```

### With ORM (Prisma)
```javascript
// Bad
const posts = await prisma.post.findMany();
// Accessing each post's author in a loop triggers N extra queries

// Good - eagerly include the relation
const posts = await prisma.post.findMany({
  include: { author: true }
});
```

## Connection Pooling
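Opening a database connection is expensive (TCP handshake, authentication, backend process startup), so a pool keeps a fixed set of connections open and hands them out per query. Conceptually it's just a free list with a wait queue; a toy generic sketch of the idea (not how any real driver is implemented):

```typescript
// A toy pool: hands out items from a free list; callers wait when it's empty.
class SimplePool<T> {
  private free: T[];
  private waiters: ((item: T) => void)[] = [];

  constructor(items: T[]) {
    this.free = [...items];
  }

  acquire(): Promise<T> {
    const item = this.free.pop();
    if (item !== undefined) return Promise.resolve(item);
    // Nothing free: wait until someone releases
    return new Promise(resolve => this.waiters.push(resolve));
  }

  release(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(item); // hand straight to the next waiter
    else this.free.push(item);
  }
}
```

Real pools (like the one in `pg` below) add idle timeouts, health checks, and lazy connection creation, but the acquire/release cycle is the core idea.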
### Node.js with pg

```typescript
import { Pool } from 'pg';

const pool = new Pool({
  host: 'localhost',
  database: 'mydb',
  max: 20,                       // Max connections in the pool
  idleTimeoutMillis: 30000,      // Close idle connections after 30s
  connectionTimeoutMillis: 2000, // Fail if no connection is free within 2s
});

// Use the pool for queries; connections are acquired and released automatically
const result = await pool.query('SELECT * FROM users');
```

### Prisma Connection Pool
```
DATABASE_URL="postgresql://user:pass@localhost:5432/mydb?connection_limit=20&pool_timeout=10"
```
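How large should the pool be? Bigger is not better: each connection costs server memory, and too many active connections cause contention. A widely cited starting point (popularized by the HikariCP "About Pool Sizing" guidance) is roughly `cores * 2 + effective_spindle_count`; as a sketch:

```typescript
// Rule-of-thumb starting point for pool size; benchmark and adjust from here.
function suggestedPoolSize(cpuCores: number, spindleCount = 1): number {
  return cpuCores * 2 + spindleCount;
}

console.log(suggestedPoolSize(8)); // 17
```

Treat the result as a baseline to load-test against, not a law; workloads dominated by slow queries or idle transactions behave differently.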
## Caching Strategies
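The workhorse pattern here is cache-aside: check the cache first, and on a miss load from the database and populate the cache. The logic is independent of the store; here it is over a plain in-memory `Map` with a pluggable loader (the class and method names are mine, not from any library):

```typescript
// Cache-aside with TTL over a plain Map; `load` stands in for a DB query.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  async getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // cache hit

    const value = await load(); // cache miss: fetch from the source
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    return value;
  }

  invalidate(key: string): void {
    this.store.delete(key); // call after writes so reads don't go stale
  }
}
```

Swap the `Map` for Redis (as below) and nothing about the call sites changes.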
### Redis Cache Layer

```typescript
import Redis from 'ioredis';

const redis = new Redis();

async function getUser(id: string) {
  // Check cache first
  const cached = await redis.get(`user:${id}`);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss: query the database
  const user = await db.query('SELECT * FROM users WHERE id = ?', [id]);

  // Cache for 1 hour
  await redis.setex(`user:${id}`, 3600, JSON.stringify(user));
  return user;
}

// Invalidate the cache on update so readers never see stale data
async function updateUser(id: string, data: any) {
  await db.query('UPDATE users SET ... WHERE id = ?', [id]);
  await redis.del(`user:${id}`);
}
```

## Database Design
### Normalization

```sql
-- Bad (denormalized: user and product data duplicated on every order)
CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  user_name VARCHAR(255),
  user_email VARCHAR(255),
  product_name VARCHAR(255),
  product_price DECIMAL
);

-- Good (normalized)
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255),
  email VARCHAR(255)
);

CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255),
  price DECIMAL
);

CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES users(id),
  product_id INTEGER REFERENCES products(id)
);
```

### Denormalization for Read Performance
```sql
-- Sometimes denormalization is worth it for read-heavy workloads
CREATE TABLE order_summaries (
  id SERIAL PRIMARY KEY,
  order_id INTEGER,
  user_name VARCHAR(255), -- Denormalized: avoids a JOIN on every read
  total_amount DECIMAL,
  item_count INTEGER
);
```

## Monitoring Queries
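Besides database-side logs, you can catch slow queries in application code by timing each call and reporting outliers. A minimal wrapper sketch (the wrapped `query` function, threshold, and `report` callback are illustrative, not a library API):

```typescript
// Wraps a query function and reports any call slower than thresholdMs.
function withSlowQueryLog<T>(
  query: (sql: string) => Promise<T>,
  thresholdMs: number,
  report: (sql: string, ms: number) => void
): (sql: string) => Promise<T> {
  return async (sql: string) => {
    const start = Date.now();
    try {
      return await query(sql);
    } finally {
      // Runs even if the query throws, so failures are timed too
      const elapsed = Date.now() - start;
      if (elapsed >= thresholdMs) report(sql, elapsed);
    }
  };
}
```

In production you would point `report` at your metrics system rather than the console; the database-side log below catches what the application misses.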
### PostgreSQL Slow Query Log

```ini
# In postgresql.conf
log_min_duration_statement = 1000  # Log queries taking longer than 1 second
```

```sql
-- Find slow queries (requires the pg_stat_statements extension;
-- on PostgreSQL 12 and older the columns are mean_time / total_time)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

## Best Practices Summary
- **Always use indexes** on frequently queried columns
- **Avoid `SELECT *`** - fetch only the columns you need
- **Use EXPLAIN** to analyze query performance
- **Implement connection pooling** for efficiency
- **Cache frequently accessed data** with Redis
- **Watch for N+1 queries** - use JOINs or batching
- **Use cursor-based pagination** for large datasets
- **Monitor slow queries** and optimize regularly
- **Normalize for writes, denormalize for reads** when needed
- **Keep your database statistics updated** (e.g. run ANALYZE after bulk changes)

Optimized databases lead to faster applications and happier users!