SQL Injection: How databases work

Tables, rows, the parser, the planner, the executor - what actually happens between SELECT and the result. The mental model every later lesson builds on.

Beginner10 min

SQLiRDBMSQuery Plan

Loading lesson…

Challenges Learn

Scoreboard

Teams

SQL Injection·Lesson 2 of 20

SQL Injection: How databases work

Tables, rows, the parser, the planner, the executor - what actually happens between SELECT and the result. The mental model every later lesson builds on.

Beginner10 min

SQLiRDBMSQuery Plan

Every SQLi payload you craft in the next 18 lessons lands in a database that thinks in three steps: parse, plan, execute. The string you send over the wire is the source code of the database - and you, the attacker, are a developer of that source code.

What you'll be able to do

Describe the parse → plan → execute pipeline of a SQL query.
Explain why a SQL string is the program the database runs.
Recognise the difference between a row, a column, a WHERE clause, and a JOIN.
Predict which rows a simple query returns before you run it.

Key terms

RDBMS: A relational database management system - PostgreSQL, MySQL, SQLite, SQL Server, Oracle. They all speak SQL with slight dialect differences.
Query plan: The tree of operations (scan, filter, sort, join) the planner picks. SQLi exploitation often hinges on knowing what the planner can and cannot do.
String literal: A value wrapped in quotes inside a SQL query - e.g. 'alice'. The quote characters delimit data; crossing them turns data into code.
Result set: The rows a SELECT returns. SQLi often doesn't care about the result set directly - it cares about whether one was returned at all, or how long it took.

What is it?

The three-stage query pipeline

When your application sends SELECT * FROM users WHERE id = 1, the database does three things:

Parser - turns the text into tokens (SELECT, *, FROM, users, WHERE, id, =, 1) and checks the grammar. A malformed payload dies here with a syntax error.
Planner - chooses an execution strategy. For a single-row lookup it might use the primary key index; for an unbounded LIKE '%foo%' it has to scan every row. The plan is visible in EXPLAIN output and is one of the most useful debugging tools an attacker has.
Executor - reads rows, applies filters, returns results. This is the only stage that touches actual data. Every row in the result set is a row you can observe from the network - through a length difference, a redirect, a timing offset, or an error message.

How a query reaches the result set

The application is the only piece the developer wrote. Everything to the right of it is a black box that does what it is told. That black box is the only place where attacker input becomes code.

Try it

Run six queries in the playground

Pick a query and watch the database walk through parse, plan, and execute. The users table has five rows; the orders table has three. The Filter by role = admin query is the one to memorise - it is the shape of every classic SQLi payload.

~/projects/learn-idedev

postgres@localhost:5432/acme

Pick a query

SQL

SELECT id, username, email FROM users WHERE role = 'admin';

Execution plan

1Parser: detect WHERE clause + string literal
2Planner: scan users, apply filter role = "admin"
3Executor: emit alice (1) and carol (3) only

Parser

tokenise the SQL text

Planner

pick an execution strategy

Executor

read rows, apply filters

Result2 rows

id	username	email	role
1	alice	alice@corp.io	admin
3	carol	carol@corp.io	admin

Real-world relevance

Why this matters for the next 18 lessons

When an attacker probes a vulnerable endpoint, the only output they get back is a result set - or a partial signal of one. The plan determines what signals are observable. A UNION payload works only if the database returns data through the page; a boolean payload works only if the WHERE filter changes something visible; a time payload works only if the executor will spend CPU on a function call the attacker controls. Every later lesson assumes you have this pipeline in your head.

Two dialect differences matter throughout the course. MySQL uses SLEEP(n) for time-based payloads; PostgreSQL uses pg_sleep(n). MySQL comments are -- (with a trailing space) and /* ... */; PostgreSQL accepts both plus -- without the trailing space. The sim above uses MySQL syntax; the defence in every lesson is the same.

Mitigation

How parameterisation changes the pipeline

With parameterised queries, the parser still tokenises the SQL template - but the user value never enters the parser as a token. The driver sends a separate wire-protocol message that the executor binds as a typed value. The string literal is not built by concatenation; the database knows the boundary.

javascriptparameterised

// VULNERABLE - string concatenation
const query = "SELECT * FROM users WHERE role = '" + role + "'";

// SAFE - parameterised
const query = 'SELECT * FROM users WHERE role = $1';
await db.query(query, [role]);

What to remember

A SQL string is the program the database parses and runs - that is the leverage SQLi has.
The pipeline is parse → plan → execute. Every signal an attacker observes comes from one of these three stages.
A WHERE clause is a filter; a JOIN is a filter across two tables; an ORDER BY is a sort.
The result set is the only thing the application can return - but attackers can infer it through side channels (length, time, status code) even when nothing is returned at all.

Knowledge check

0/3 answered · 0 correct

1.In the parse → plan → execute pipeline, which stage is the one that reads rows from disk?
2.A WHERE clause is best described as…
3.Why is parameterised query syntax preferred over string concatenation?