Database DesignSQL vs NoSQL

SQL databases provide ACID guarantees and relational data modeling. NoSQL databases trade some consistency guarantees for horizontal scalability and flexible schemas. Knowing when to use each is fundamental to system design.

SQL (Relational) Databases

SQL databases organize data into tables with rows and columns, with strict schemas and relationships enforced through foreign keys.

ACID Properties:

  • Atomicity: Transactions are all-or-nothing
  • Consistency: Data always transitions from one valid state to another
  • Isolation: Concurrent transactions don't interfere
  • Durability: Committed data persists through failures

Examples: PostgreSQL, MySQL, SQLite, Oracle, SQL Server

Strengths:

  • Complex queries with JOINs across related data
  • Strong consistency guarantees
  • Mature tooling, widespread expertise
  • ACID transactions (critical for financial systems)
  • Schema enforcement catches data errors early

Weaknesses:

  • Harder to scale horizontally (sharding is complex)
  • Schema changes are painful at scale
  • Slower writes for very high write loads
  • Poor fit for unstructured or highly variable data

NoSQL Database Categories

Document Stores

Store data as JSON/BSON documents. Flexible schema — each document can have different fields.

Examples: MongoDB, Firestore, CouchDB

Best for: Content management, user profiles, product catalogs, any data that varies in shape

// User document — fields vary per user
{
  "_id": "user123",
  "name": "Alice",
  "preferences": { "theme": "dark" },
  "social": { "twitter": "@alice" }
}

Key-Value Stores

Simple key→value pairs. Extremely fast.

Examples: Redis, DynamoDB (also document), Memcached

Best for: Sessions, caching, real-time leaderboards, feature flags

Column-Family Stores

Data organized by column families rather than rows. Optimized for time-series and analytical workloads.

Examples: Cassandra, HBase, Bigtable

Best for: Time-series data, IoT sensor data, activity feeds, write-heavy workloads at massive scale

Graph Databases

Nodes and edges model relationships directly.

Examples: Neo4j, Amazon Neptune

Best for: Social networks, recommendation engines, fraud detection — any problem where relationships are first-class

Decision Framework

| Need | Choose | |---|---| | Complex queries, JOINs, reporting | SQL | | Financial transactions, strong consistency | SQL | | Flexible/evolving schema | Document (MongoDB) | | Extreme write throughput | Column-family (Cassandra) | | Key-value lookups, caching | Redis, DynamoDB | | Graph traversals | Neo4j | | Time-series data | InfluxDB, TimescaleDB, Cassandra | | Multi-region active-active | Cassandra, DynamoDB |

Common Combination Patterns

Real systems often use both:

  • PostgreSQL + Redis: PostgreSQL for source of truth, Redis for caching hot data
  • PostgreSQL + Elasticsearch: PostgreSQL for storage, Elasticsearch for full-text search
  • Cassandra + PostgreSQL: Cassandra for write-heavy time-series data, PostgreSQL for user accounts and billing

Interview Tips

  • Never dismiss SQL — it scales to billions of rows with proper indexing and read replicas
  • Justify your database choice with specific requirements: "We need ACID transactions for payments, so PostgreSQL"
  • Know that most NoSQL databases sacrifice some ACID properties — know which ones and when that's acceptable
  • For most CRUD applications, PostgreSQL or MySQL is the right answer unless you have a specific reason for NoSQL
  • Cassandra is the go-to answer for "massive write throughput, multi-region" scenarios