Question 1

What is PostgreSQL and what are its core features?

Accepted Answer

PostgreSQL is a powerful, open-source object-relational database system. Its core features include full ACID compliance, support for SQL standards, robust transactional concurrency, foreign key constraints, schemas, and rich extensions (like PostGIS for geospatial data).

Question 2

Explain the difference between Clustered and Non-Clustered indexes in relational databases.

Accepted Answer

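- Clustered Index: Determines the physical order of the table's data rows by key value (often the Primary Key). A table can have only one clustered index because the rows can be stored in only one order.
- Non-Clustered Index: A separate structure that stores index keys with pointers to the data rows, so a table can have many of them. PostgreSQL stores tables as unordered heaps, so all of its indexes are non-clustered; the `CLUSTER` command reorders a table by an index once but does not maintain that order afterwards.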

Question 3

Explain how to perform basic CRUD operations in SQL.

Accepted Answer

CRUD stands for Create, Read, Update, Delete:
- Create: `INSERT INTO users (name, email) VALUES ('John', 'john@example.com');`.
- Read: `SELECT * FROM users WHERE status = 'active';`.
- Update: `UPDATE users SET status = 'active' WHERE id = 1;`.
- Delete: `DELETE FROM users WHERE id = 1;`.

Question 4

What is a Primary Key and how does it differ from a Unique constraint?

Accepted Answer

- Primary Key: Uniquely identifies each row in a table. It cannot contain `NULL` values, and a table can have only one primary key.
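- Unique Constraint: Enforces uniqueness for a column or group of columns but allows `NULL` values. In PostgreSQL, multiple `NULL`s are accepted by default because `NULL`s are not considered equal to each other. A table can have several unique constraints.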
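
The difference can be sketched in a few lines. This is a minimal illustration, assuming an in-memory SQLite database as a stand-in so that it runs without a PostgreSQL server; the `users` table and the `try_insert` helper are hypothetical.

```python
import sqlite3

# In-memory SQLite keeps the sketch self-contained; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT UNIQUE"  # unique, but nullable
    ")"
)

# Two rows with a NULL email are accepted: NULLs are not considered equal.
conn.execute("INSERT INTO users (id, email) VALUES (1, NULL)")
conn.execute("INSERT INTO users (id, email) VALUES (2, NULL)")
conn.execute("INSERT INTO users (id, email) VALUES (3, 'a@example.com')")


def try_insert(sql):
    """Return True if the insert succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_email_ok = try_insert("INSERT INTO users (id, email) VALUES (4, 'a@example.com')")
duplicate_pk_ok = try_insert("INSERT INTO users (id, email) VALUES (1, 'b@example.com')")
null_email_rows = conn.execute(
    "SELECT COUNT(*) FROM users WHERE email IS NULL"
).fetchone()[0]
```

Both duplicate inserts are rejected, while the two `NULL` emails coexist.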

Question 5

What is a Foreign Key and how does it enforce referential integrity?

Accepted Answer

A Foreign Key is a column or group of columns in one table that references the Primary Key (or another unique key) of another table. It enforces referential integrity by rejecting actions that would leave orphaned child records, such as inserting a child that points to a nonexistent parent or deleting a parent record still referenced by children.

Question 6

What are database joins and what are the main types?

Accepted Answer

Joins combine rows from multiple tables based on related columns:
- `INNER JOIN`: Returns rows with matching values in both tables.
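- `LEFT JOIN`: Returns all rows from the left table and the matching rows from the right table; right-side columns are `NULL` where there is no match.
- `RIGHT JOIN`: Returns all rows from the right table and the matching rows from the left table; left-side columns are `NULL` where there is no match.
- `FULL JOIN`: Returns all rows from both tables, pairing them where they match and filling the missing side with `NULL` where they do not.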
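
A small sketch of the first two join types, assuming in-memory SQLite as a stand-in and hypothetical `users` and `orders` tables. `RIGHT JOIN` and `FULL JOIN` follow the same pattern with the preserved side changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total INTEGER);
    INSERT INTO users  VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 50);
""")

# INNER JOIN keeps only users that have at least one order.
inner_rows = conn.execute(
    "SELECT u.name, o.total FROM users u "
    "INNER JOIN orders o ON o.user_id = u.id ORDER BY u.id"
).fetchall()

# LEFT JOIN keeps every user; Bob has no order, so his total is NULL (None).
left_rows = conn.execute(
    "SELECT u.name, o.total FROM users u "
    "LEFT JOIN orders o ON o.user_id = u.id ORDER BY u.id"
).fetchall()
```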

Question 7

Explain the GROUP BY clause and aggregate functions in SQL.

Accepted Answer

The `GROUP BY` clause groups rows sharing the same values into summary rows (like grouping users by country). It is used with aggregate functions (like `COUNT()`, `SUM()`, `AVG()`, `MAX()`, `MIN()`) to perform calculations on each group.

Question 8

What is the difference between WHERE and HAVING clauses?

Accepted Answer

- `WHERE` filters rows before any groupings are applied.
- `HAVING` filters group summary rows after the `GROUP BY` clause has executed, often used with aggregate functions.
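
The two filters can appear in the same query. The sketch below assumes in-memory SQLite as a stand-in and a hypothetical `users` table with `country` and `status` columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT, status TEXT);
    INSERT INTO users (country, status) VALUES
        ('DE', 'active'), ('DE', 'active'), ('DE', 'inactive'),
        ('FR', 'active'), ('US', 'inactive');
""")

rows = conn.execute("""
    SELECT country, COUNT(*) AS active_users
    FROM users
    WHERE status = 'active'      -- row filter, applied before grouping
    GROUP BY country
    HAVING COUNT(*) >= 2         -- group filter, applied after aggregation
    ORDER BY country
""").fetchall()
```

Only `DE` survives: `WHERE` first drops inactive rows, and `HAVING` then drops groups with fewer than two remaining rows.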

Question 9

Explain database transactions and the ACID properties.

Accepted Answer

A transaction is a unit of database work. ACID properties guarantee reliability:
- Atomicity: Either all operations succeed or all roll back.
- Consistency: Transactions move the database from one valid state to another.
- Isolation: Concurrent transactions do not interfere.
- Durability: Committed updates are permanent.
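
Atomicity is the easiest property to demonstrate. This is a minimal sketch, assuming in-memory SQLite as a stand-in and a hypothetical `accounts` table whose `CHECK` constraint forbids negative balances.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()


def transfer(amount):
    """Move `amount` from account 1 to account 2 as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = 2", (amount,))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False


ok_small = transfer(30)    # both updates commit
ok_large = transfer(500)   # the debit violates the CHECK, so the credit is undone too
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
```

After the failed transfer the balances are unchanged from the first, successful one: the credit to account 2 did not survive on its own.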

Question 10

What is the difference between CHAR, VARCHAR, and TEXT data types?

Accepted Answer

- `CHAR(n)`: Fixed-length string, padded with spaces if shorter than `n`.
- `VARCHAR(n)`: Variable-length string with a maximum limit of `n` characters.
- `TEXT`: Variable-length string with no declared length limit. In PostgreSQL there is no performance difference between `TEXT` and `VARCHAR`; `CHAR(n)` only adds the cost of storing the padding.

Question 11

What is a Database Schema in PostgreSQL?

Accepted Answer

A schema is a namespace that contains database objects (tables, views, indexes, functions) within a database, allowing you to organize tables into logical groups and control access permissions.

Question 12

Explain how pagination works in PostgreSQL using LIMIT and OFFSET.

Accepted Answer

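Use the `LIMIT` and `OFFSET` clauses together with `ORDER BY`, since row order is otherwise not guaranteed: `SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 20;`. This skips the first 20 rows and returns the next 10. Performance degrades at large offsets because the skipped rows must still be read and discarded.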
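
A common way around the large-offset cost is keyset (seek) pagination, which is a different technique from `OFFSET`: the client remembers the last key it saw and asks for rows after it. The sketch below compares the two, assuming in-memory SQLite as a stand-in and a hypothetical `users` table with 100 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 101)],
)

PAGE_SIZE = 10

# OFFSET pagination: the database still has to walk past the skipped rows.
offset_page = conn.execute(
    "SELECT id FROM users ORDER BY id LIMIT ? OFFSET ?", (PAGE_SIZE, 20)
).fetchall()

# Keyset pagination: remember the last id already seen and seek past it.
last_seen_id = 20
keyset_page = conn.execute(
    "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?", (last_seen_id, PAGE_SIZE)
).fetchall()
```

Both queries return ids 21 to 30, but the keyset form can use the primary key index to jump straight to the start of the page.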

Question 13

What is the role of the transaction isolation level Read Committed?

Accepted Answer

Read Committed is PostgreSQL's default isolation level. Each statement sees only data committed before that statement began, which prevents dirty reads (reading uncommitted data). It still allows non-repeatable reads and phantom reads: data read twice in the same transaction can change if another transaction commits updates in between.

Question 14

What is the difference between DELETE and TRUNCATE commands?

Accepted Answer

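- `DELETE`: DML command that removes the rows matching a filter and fires row-level `DELETE` triggers. The deleted rows become dead tuples, so their space is reclaimed only by a later `VACUUM`.
- `TRUNCATE`: Removes all rows by discarding the table's storage, releasing disk space immediately. It does not fire `ON DELETE` triggers (only `ON TRUNCATE` triggers), takes an `ACCESS EXCLUSIVE` lock, and in PostgreSQL can still be rolled back inside a transaction.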

Question 15

What are database views in PostgreSQL?

Accepted Answer

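A view is a virtual table defined by a stored SQL query. It does not store data physically (unless it is a materialized view); the query runs each time the view is referenced, so a view acts as a layer that simplifies complex queries and can restrict which columns or rows users see.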

Question 16

Explain the use of the COALESCE function.

Accepted Answer

The `COALESCE` function accepts a list of arguments and returns the first non-null value: `SELECT COALESCE(phone, 'N/A') FROM users;`. It is useful for formatting null values in queries.

Question 17

What is the purpose of the EXPLAIN command in PostgreSQL?

Accepted Answer

`EXPLAIN` displays the query execution plan generated by the PostgreSQL planner, showing whether it will perform index scans (`Index Scan`) or full table scans (`Seq Scan`) to retrieve data.

Question 18

Explain Multi-Version Concurrency Control (MVCC) in PostgreSQL.

Accepted Answer

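MVCC allows concurrent readers and writers without blocking each other. When a row is updated, PostgreSQL does not overwrite the existing record. Instead, it writes a new version of the row and marks the old version as expired by setting its `xmax`. Every row version carries the system columns `xmin` (the transaction that created it) and `xmax` (the transaction that deleted or replaced it), which determine visibility. Readers see only the row versions that were committed when their snapshot was taken: per statement in Read Committed, per transaction in Repeatable Read and Serializable.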
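
The visibility rule can be sketched as a toy model. This is a deliberate simplification and not PostgreSQL's actual implementation: real snapshots also track which transactions were still in progress, and the `RowVersion` class, `visible` function, and transaction ids below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                    # id of the transaction that created this version
    xmax: Optional[int] = None   # id of the transaction that deleted/replaced it


def visible(version, snapshot_xid, committed):
    """Toy rule: visible if the creator committed before the snapshot and
    no transaction that committed before the snapshot removed the version."""
    created = version.xmin in committed and version.xmin < snapshot_xid
    removed = (
        version.xmax is not None
        and version.xmax in committed
        and version.xmax < snapshot_xid
    )
    return created and not removed


# Transaction 100 inserted the row; transaction 105 updated it.
old = RowVersion("alice@old.example", xmin=100, xmax=105)
new = RowVersion("alice@new.example", xmin=105)
committed = {100, 105}


def read(snapshot_xid):
    """Return the values a reader with the given snapshot would see."""
    return [v.value for v in (old, new) if visible(v, snapshot_xid, committed)]


before_update = read(103)   # snapshot taken before transaction 105
after_update = read(110)    # snapshot taken after transaction 105 committed
```

Each reader sees exactly one version of the row, and neither had to wait for the writer.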

Question 19

What is the VACUUM command in PostgreSQL and what is Table Bloat?

Accepted Answer

Due to MVCC, deleted or updated rows leave 'dead tuples' in the table's data files on disk. Table Bloat occurs when dead tuples accumulate, increasing file sizes and slowing queries. The `VACUUM` command scans tables and marks the space occupied by dead tuples as reusable for new writes; planner statistics are refreshed separately by `ANALYZE` (or together with `VACUUM ANALYZE`). `VACUUM FULL` takes an exclusive lock on the table and rewrites it to release disk space to the OS.

Question 20

Explain how to optimize query performance using EXPLAIN ANALYZE.

Accepted Answer

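Append `ANALYZE` to `EXPLAIN` to execute the query and return actual durations and row counts: `EXPLAIN ANALYZE SELECT * FROM users;`. Because the statement really runs, wrap data-modifying statements in a transaction and roll it back. Audit the output:
- Seq Scan: Indicates a sequential scan. On a large table with a selective filter this suggests a missing index; on small tables, or when most rows match, it is often the cheapest plan.
- Actual time: Traces bottlenecks to specific join or sort stages. A large gap between estimated and actual row counts points to stale statistics, which helps decide whether to refine indexes or run `ANALYZE`.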

Question 21

How do you write database integration tests in Java/Spring using Testcontainers?

Accepted Answer

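Use the Testcontainers library. Annotate the test class with `@Testcontainers` and declare a container field with `@Container`: `static PostgreSQLContainer<?> container = new PostgreSQLContainer<>("postgres:15");`. Pass the container's JDBC URL, username, and password to Spring through `@DynamicPropertySource` (or `@ServiceConnection` on Spring Boot 3.1 and later). The container starts before the tests, Spring Boot boots against it and runs Flyway migrations if Flyway is configured, the repository tests execute, and the container stops afterwards.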

Question 22

Explain PostgreSQL transaction isolation levels: Read Committed vs Serializable.

Accepted Answer

- Read Committed (Default): Prevents reading uncommitted data, but allows non-repeatable reads.
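- Serializable: Enforces strict isolation. It monitors concurrent transactions; if it detects a dependency pattern that could produce a non-serializable result (such as write skew), PostgreSQL aborts one transaction with a serialization failure (SQLSTATE `40001`), and the application must retry it.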

Question 23

Explain B-Tree, GIN, and Hash indexes in PostgreSQL.

Accepted Answer

- B-Tree (Default): Self-balancing trees optimized for sorting, range checks, and equality.
- GIN (Generalized Inverted Index): Optimized for values that contain many elements, such as arrays, JSONB documents, and full-text search vectors (`tsvector`).
- Hash: Fast equality-only index, not supporting ranges.

Question 24

How do you detect slow SQL queries using pg_stat_statements?

Accepted Answer

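Add `pg_stat_statements` to `shared_preload_libraries` in `postgresql.conf`, restart the server, and run `CREATE EXTENSION pg_stat_statements;` in the database. Then query the statistics view: `SELECT query, total_exec_time FROM pg_stat_statements ORDER BY total_exec_time DESC;` to find the queries consuming the most cumulative execution time. The column is named `total_exec_time` on PostgreSQL 13 and later (`total_time` on older versions); the `calls` and `mean_exec_time` columns help separate frequently run queries from individually slow ones.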

Question 25

How do you mock database repositories in Java unit tests using Mockito?

Accepted Answer

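In the test class, declare the repository as a field annotated with `@Mock` and inject it into the service under test with `@InjectMocks`. Use Mockito to stub CRUD methods (like `findById` or `save`) with `when(...).thenReturn(...)`, returning prepared entities so that service class tests run without a database.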

Question 26

Explain foreign key constraints and cascade actions.

Accepted Answer

Foreign keys enforce integrity. Define cascade actions: `ON DELETE CASCADE` automatically deletes child records if the parent is deleted. `ON DELETE SET NULL` resets child reference columns to null.
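
Both cascade actions can be seen side by side. The sketch assumes in-memory SQLite as a stand-in (where foreign key enforcement has to be switched on per connection, unlike PostgreSQL) and hypothetical `authors`, `posts`, and `reviews` tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-only: enforcement is off by default
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES authors(id) ON DELETE CASCADE
    );
    CREATE TABLE reviews (
        id INTEGER PRIMARY KEY,
        reviewer_id INTEGER REFERENCES authors(id) ON DELETE SET NULL
    );
    INSERT INTO authors VALUES (1), (2);
    INSERT INTO posts   VALUES (10, 1), (11, 2);
    INSERT INTO reviews VALUES (20, 1), (21, 2);
""")

# Deleting the parent row triggers both actions on the child tables.
conn.execute("DELETE FROM authors WHERE id = 1")

posts = conn.execute("SELECT id, author_id FROM posts ORDER BY id").fetchall()
reviews = conn.execute("SELECT id, reviewer_id FROM reviews ORDER BY id").fetchall()
```

The post owned by author 1 disappears, while the review written by author 1 stays with its reference set to `NULL`.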

Question 27

What is connection pooling and how do you configure PgBouncer?

Accepted Answer

PostgreSQL creates a process per connection, which is memory expensive. PgBouncer is a connection pooler that maintains a pool of active connections to the database, distributing them to incoming client connections to save memory.

Question 28

Explain PostgreSQL partition tables and how they optimize large datasets.

Accepted Answer

Partitioning splits a massive table into smaller physical tables (e.g. partitioning orders by year). The planner only scans partitions matching query dates (partition pruning), optimizing query speeds.

Question 29

What is write-ahead logging (WAL) and how is it used in replication?

Accepted Answer

The write-ahead log (WAL) records every modification before it is written to the data pages, so that changes can be replayed after a crash. During replication, the primary server streams WAL records to standby replica nodes (streaming replication), which replay them locally to stay in sync.

Question 30

How do you test database triggers in PostgreSQL integration tests?

Accepted Answer

Write integration tests that insert, update, or delete data against a real PostgreSQL instance. The trigger fires automatically as part of the statement. Then run queries to assert that the audit log rows or calculated fields written by the trigger are correct.

Question 31

Explain how to write custom SQL functions and procedures.

Accepted Answer

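Use `CREATE FUNCTION` for routines that return a value and are called inside a query; functions can modify data but cannot commit or roll back. Use `CREATE PROCEDURE` (PostgreSQL 11 and later) for routines invoked with `CALL`, which return no value and can issue `COMMIT` and `ROLLBACK`. Write the body in PL/pgSQL to handle variables and conditional statements.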

Question 32

What is the difference between JSON and JSONB data types in PostgreSQL?

Accepted Answer

- `JSON` stores text representations, parsing JSON strings on every query.
- `JSONB` stores binary representations, which is slower to write but faster to query and supports index lookups (GIN), making it preferred.

Question 33

How do you manage database migrations using Liquibase or Flyway?

Accepted Answer

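With Flyway, write migration scripts as versioned SQL files (e.g., `V1__init.sql`); with Liquibase, describe changes as changesets in a changelog file (XML, YAML, JSON, or SQL). Either tool runs pending migrations in order on startup and records executed migrations in a database table (`flyway_schema_history` or `DATABASECHANGELOG`) to avoid duplicate runs.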

Question 34

Explain the PostgreSQL planner optimization process, detailing how Statistics (pg_statistic, pg_stats) and Join algorithms (Nested Loop, Hash Join, Merge Join) are selected.

Accepted Answer

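The planner turns SQL queries into execution plans. It relies on statistics gathered by `ANALYZE`, which the autovacuum daemon runs automatically; they are stored in `pg_statistic` and exposed in readable form through the `pg_stats` view. The statistics include each column's null fraction, number of distinct values, most common values, and a histogram of value distribution.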

Based on cost estimates, the planner selects join algorithms:
1. Nested Loop: Scans one row in the outer table, then searches matching rows in the inner table. Optimal for small tables or when indexes are available.
2. Hash Join: Builds an in-memory hash table of the smaller table, then scans the larger table to match keys. Efficient for large datasets without sorting.
3. Merge Join: Sorts both tables on join keys, then merges them. Optimal when tables are pre-sorted or indexed.

If statistics are stale, the planner may select the wrong algorithm, causing slow queries. Fix by running `ANALYZE table_name`.
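
The three join strategies can be sketched in plain Python over lists of dictionaries. These are toy versions meant to show the shape of each algorithm, not PostgreSQL's executor; the function names and the sample `users` and `orders` data are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """For every outer row, scan the inner rows for matches: O(n * m)."""
    return [(o, i) for o in outer for i in inner if o[key] == i[key]]


def hash_join(build, probe, key):
    """Build a hash table on one input, then probe it with the other."""
    table = defaultdict(list)
    for b in build:
        table[b[key]].append(b)
    return [(b, p) for p in probe for b in table.get(p[key], [])]


def merge_join(left, right, key):
    """Sort both inputs on the key, then advance two cursors in step."""
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair the current left row with every right row sharing the key.
            k = j
            while k < len(right) and right[k][key] == lk:
                out.append((left[i], right[k]))
                k += 1
            i += 1
    return out


users = [{"user_id": 1, "name": "Ann"}, {"user_id": 2, "name": "Bob"}, {"user_id": 3, "name": "Cy"}]
orders = [{"user_id": 1, "total": 50}, {"user_id": 3, "total": 5}, {"user_id": 1, "total": 20}]


def pairs(result):
    """Normalize a join result so the three algorithms can be compared."""
    return sorted((u["name"], o["total"]) for u, o in result)


nl = pairs(nested_loop_join(users, orders, "user_id"))
hj = pairs(hash_join(users, orders, "user_id"))
mj = pairs(merge_join(users, orders, "user_id"))
```

All three produce the same rows; they differ only in how much work they do to find them, which is what the planner's cost estimates try to predict.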

Question 35

How would you optimize a high-write PostgreSQL database experiencing lock contention and table bloat (100M+ rows)?

Accepted Answer

To optimize high-write databases:
1. Autovacuum Tuning: Configure autovacuum aggressively to clean dead tuples quickly: set `autovacuum_vacuum_scale_factor = 0.05` (vacuum starts once dead tuples exceed roughly 5% of the table, instead of the default 20%) and increase `autovacuum_max_workers`.
2. Prevent Lock Contention: Avoid long-running transactions, which hold locks and stop vacuum from removing dead tuples. Build indexes with `CREATE INDEX CONCURRENTLY` so that writes to the table are not blocked during the build.
3. Partitioning: Partition tables by date ranges. Dropping old data by dropping partitions avoids generating dead tuples and bypasses vacuum overhead.
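
To see what the scale factor means for a table of this size, the documented trigger formula (`autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of tuples`) can be worked through directly. The helper name below is hypothetical, and the base threshold of 50 is the PostgreSQL default.

```python
def autovacuum_trigger_point(reltuples, scale_factor, base_threshold=50):
    """Dead tuples that must accumulate before autovacuum vacuums a table."""
    return round(base_threshold + scale_factor * reltuples)


rows = 100_000_000

default_trigger = autovacuum_trigger_point(rows, 0.2)    # default scale factor
tuned_trigger = autovacuum_trigger_point(rows, 0.05)     # value suggested above

print(f"default: vacuum after {default_trigger:,} dead tuples")
print(f"tuned:   vacuum after {tuned_trigger:,} dead tuples")
```

With the default setting a 100M-row table accumulates about 20 million dead tuples before autovacuum starts; with 0.05 it starts at about 5 million. On very large tables an even smaller per-table value may be worth considering.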

Question 36

Explain PostgreSQL MVCC write amplification and how HOT (Heap-Only Tuples) updates optimize performance.

Accepted Answer

When a row is updated in PostgreSQL, the engine writes a new row version. If the table has indexes, the new version requires updating all indexes to point to the new row address (write amplification). HOT (Heap-Only Tuples) updates resolve this. If the update does not modify indexed columns and the new row version is stored on the same physical page as the old version, the index pointers remain unchanged, avoiding index write overhead.

Question 37

Explain how to secure a PostgreSQL database in production, focusing on SSL/TLS, pg_hba.conf, and role-based permissions.

Accepted Answer

Secure PostgreSQL by:
1. pg_hba.conf: Restrict connection origins. Block public access and allow only authorized IPs, using `scram-sha-256` password authentication in preference to the older, weaker `md5` method.
2. SSL/TLS: Enable SSL (`ssl = on` in config) to encrypt traffic in transit, and use `hostssl` entries in `pg_hba.conf` so that unencrypted connections are rejected.
3. Role-Based Access Control (RBAC): Create separate read-only and read-write roles, grant them privileges only on the schemas and tables they need, and do not let applications connect as superuser.

Question 38

How would you design a high-availability PostgreSQL cluster supporting replication and connection pooling?

Accepted Answer

To design a high-availability cluster:
- Replication: Configure streaming replication (one active Primary, multiple standby Replicas).
- Connection Pooling: Deploy PgBouncer in transaction mode in front of the database nodes to manage client connections, saving database memory.
- Automatic Failover: Use a cluster manager such as Patroni, backed by a distributed configuration store like Consul or etcd, to monitor node health and promote a standby replica to primary if the primary crashes. Clients are then routed to the current primary.

Question 39

Explain how the PostgreSQL Query Planner decides between index scans and sequential scans.

Accepted Answer

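The planner estimates a cost for each candidate plan and picks the cheapest. If a query matches a large share of the rows, the planner selects a sequential scan (`Seq Scan`) instead of an index scan, since reading the index and then jumping to scattered heap pages (random I/O) is slower than reading pages sequentially. There is no fixed cutoff: figures such as 20% of the table are only a rough illustration, and the real break-even point depends on table statistics and on settings like `random_page_cost` and `seq_page_cost`.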
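
A toy cost comparison shows why the break-even point moves. This is a heavily simplified model, not the planner's real formula: it assumes every matched row costs one random heap page read, and the function names and table sizes are hypothetical. The cost constants are PostgreSQL's default values for the corresponding settings.

```python
# PostgreSQL default cost settings.
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0
CPU_TUPLE_COST = 0.01
CPU_INDEX_TUPLE_COST = 0.005


def seq_scan_cost(pages, rows):
    """Read every page sequentially and process every row."""
    return pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST


def index_scan_cost(matched_rows):
    """Toy worst case: one random heap page read per matched row."""
    return matched_rows * (RANDOM_PAGE_COST + CPU_INDEX_TUPLE_COST + CPU_TUPLE_COST)


def choose_scan(selectivity, rows=1_000_000, pages=10_000):
    """Pick the cheaper scan for a filter matching `selectivity` of the rows."""
    matched = rows * selectivity
    if index_scan_cost(matched) < seq_scan_cost(pages, rows):
        return "Index Scan"
    return "Seq Scan"


selective = choose_scan(0.001)   # 0.1% of rows match
broad = choose_scan(0.2)         # 20% of rows match
```

In this toy model the switch happens well below 20% of the table; lowering `RANDOM_PAGE_COST` (as is often done for SSD storage) pushes the break-even point higher.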

Question 40

How do you execute online database schema migrations without downtime?

Accepted Answer

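Perform migrations in non-blocking steps: add the column as nullable first, deploy code that writes the new field, backfill existing records in small batches with a background script, and finally enforce `NOT NULL`. To avoid a long table lock in the last step, add a `CHECK (col IS NOT NULL)` constraint as `NOT VALID`, run `VALIDATE CONSTRAINT` (which does not block writes), and then set the column `NOT NULL`; PostgreSQL 12 and later can use the validated check to skip the full-table scan.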

Question 41

Explain how to debug lock issues using pg_locks and pg_stat_activity.

Accepted Answer

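Query the lock views to find blocked processes: `SELECT blocked_locks.pid, blocking_locks.pid AS blocking_pid FROM pg_catalog.pg_locks blocked_locks JOIN pg_catalog.pg_locks blocking_locks ON...`. Join the result with `pg_stat_activity` on `pid` to see each session's query text and state; on PostgreSQL 9.6 and later, `pg_blocking_pids(pid)` returns the blocking sessions directly. Resolve by cancelling the blocking query with `pg_cancel_backend(pid)` or, if that is not enough, terminating the session with `pg_terminate_backend(pid)`.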

Question 42

How do you implement row-level security (RLS) in PostgreSQL?

Accepted Answer

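Enable RLS on tables: `ALTER TABLE users ENABLE ROW LEVEL SECURITY;`. Define policies using expressions: `CREATE POLICY user_policy ON users TO web_user USING (tenant_id = current_setting('app.current_tenant_id'));`. `current_setting` returns `text`, so cast it if `tenant_id` is not a text column. The application sets the value per session or transaction before querying. Superusers and roles with `BYPASSRLS` always bypass policies, and the table owner does too unless `FORCE ROW LEVEL SECURITY` is set on the table.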

Question 43

How do you optimize memory settings (shared_buffers, work_mem) in postgresql.conf?

Accepted Answer

Optimize settings based on hardware:
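- `shared_buffers`: Start at about 25% of system RAM on a dedicated database server and adjust from there.
- `work_mem`: Increase (e.g., 64MB) to allow complex sorts and joins to execute in memory, preventing writes to temporary disk files. The limit applies per sort or hash operation, not per connection, so a single query can use several multiples of it; size it with the number of concurrent sessions in mind.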

Question 44

Explain how to write custom aggregates in PostgreSQL.

Accepted Answer

Define state transition functions (`SFUNC`) and state data types (`STYPE`). Register them using `CREATE AGGREGATE`, specifying how PostgreSQL accumulates values across groups.

Question 45

How do you trace and fix memory leaks in PgBouncer?

Accepted Answer

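Monitor memory usage of the PgBouncer process or container over time. What looks like a leak is often a growing number of client connections, so inspect the admin console (`SHOW POOLS;`, `SHOW CLIENTS;`, `SHOW MEM;`) to see where memory goes. Cap connections with `max_client_conn`, and configure timeouts such as `client_idle_timeout` to close inactive sessions.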

Question 46

Explain the difference between logical replication and physical replication.

Accepted Answer

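- Physical Replication: Ships WAL records that describe block-level changes, producing byte-for-byte identical standby replicas of the whole cluster.
- Logical Replication: Decodes WAL into row-level changes and delivers them through publications and subscriptions, allowing replication of specific tables or between different PostgreSQL major versions.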

Question 47

How do you implement full-text search indexes in PostgreSQL using tsvector?

Accepted Answer

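Convert text to a search document with `to_tsvector` and create a GIN index, either on a stored `tsvector` column or on the expression itself: `CREATE INDEX fts_idx ON articles USING gin(to_tsvector('english', body));`. Queries must use the same expression to benefit from the index, for example `WHERE to_tsvector('english', body) @@ to_tsquery('english', 'postgres')`.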

Question 48

How do you build a custom foreign data wrapper (FDW)?

Accepted Answer

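Building a foreign data wrapper from scratch means writing a C extension with a handler function that returns the planning and scan callbacks (the `FdwRoutine` struct). In most cases an existing wrapper such as `postgres_fdw` is enough: install it with `CREATE EXTENSION`, define the remote server with `CREATE SERVER`, map credentials with `CREATE USER MAPPING`, and expose remote tables with `CREATE FOREIGN TABLE` or `IMPORT FOREIGN SCHEMA`. You can then query tables in external databases directly using standard SQL queries.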

Question 49

Explain PostgreSQL table inheritance.

Accepted Answer

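PostgreSQL supports table inheritance: `CREATE TABLE child () INHERITS (parent)`. Child tables inherit all columns defined on the parent, and queries on the parent table include rows from its children by default (use `SELECT ... FROM ONLY parent` to exclude them). Indexes, unique constraints, and foreign keys are not inherited, so they must be defined on each child.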

Question 50

How do you debug circular foreign key references?

Accepted Answer

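Circular references make inserts order-dependent (each row needs the other to exist first) and block `TRUNCATE` on a single table in the cycle. Resolve by declaring the constraints `DEFERRABLE` and deferring their checks inside a transaction with `SET CONSTRAINTS ALL DEFERRED`, so that the checks run at commit, after all changes are in place.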
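
Deferred checking can be sketched with two tables that reference each other. The example assumes in-memory SQLite as a stand-in and hypothetical `departments` and `employees` tables. SQLite has no `SET CONSTRAINTS`, so the constraints are declared `DEFERRABLE INITIALLY DEFERRED` instead, and it accepts a reference to a table that does not exist yet, whereas PostgreSQL would need the second constraint added with `ALTER TABLE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-only: enforcement is off by default
conn.executescript("""
    CREATE TABLE departments (
        id INTEGER PRIMARY KEY,
        manager_id INTEGER NOT NULL
            REFERENCES employees(id) DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        department_id INTEGER NOT NULL
            REFERENCES departments(id) DEFERRABLE INITIALLY DEFERRED
    );
""")

# Inside one transaction the first insert points at a row that does not
# exist yet; the foreign key checks run only when the transaction commits.
with conn:
    conn.execute("INSERT INTO departments VALUES (1, 100)")
    conn.execute("INSERT INTO employees VALUES (100, 1)")

department_count = conn.execute("SELECT COUNT(*) FROM departments").fetchone()[0]
employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
```

With immediate checking the first insert would be rejected; with deferred checking both rows commit because the cycle is complete by the end of the transaction.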

PostgreSQL Interview Questions for 2–5 Years Experience (2026)
