Author: This chapter originated as part of
Enhancement of the ANSI SQL Implementation of PostgreSQL, Stefan Simkovics'
Master's Thesis prepared at Vienna University of Technology under the direction
of O.Univ.Prof.Dr. Georg Gottlob and Univ.Ass. Mag. Katrin Seyr.
This chapter gives an overview of the internal structure of the
backend of PostgreSQL. After reading the following sections, you
should have an idea of how a query is processed. This chapter does
not aim to provide a detailed
description of the internal operation of
PostgreSQL, as such a document would be
very extensive. Rather, this chapter is intended to help the reader
understand the general sequence of operations that occur within the
backend from the point at which a query is received, to the point
at which the results are returned to the client.
Here we give a short overview of the stages a query has to pass
through in order to produce a result.
A connection from an application program to the PostgreSQL
server has to be established. The application program transmits a
query to the server and waits to receive the results sent back by the
server.
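The request/response pattern described above can be sketched as follows. This is only a toy illustration: it uses a local socket pair rather than a real server, and it says nothing about PostgreSQL's actual wire protocol. The names roundtrip, serve_one, and run_query are hypothetical and exist only for this example.

```python
import socket


def serve_one(server_sock, run_query):
    """Stand-in for the server: read one query, run it, send back the result."""
    query = server_sock.recv(4096).decode()
    server_sock.sendall(run_query(query).encode())


def roundtrip(query, run_query):
    """Send a query over an established connection and wait for the result."""
    # socketpair() gives two already-connected endpoints, standing in for
    # the connection between the application program and the server.
    client, server = socket.socketpair()
    try:
        client.sendall(query.encode())      # application transmits the query
        serve_one(server, run_query)        # server processes it (hypothetical)
        return client.recv(4096).decode()   # application receives the result
    finally:
        client.close()
        server.close()


# Usage: the "server" here just echoes a description of what it was asked.
print(roundtrip("SELECT 1", lambda q: "result of " + q))
```

The messages are small enough to fit in the socket buffers, which is why this sketch can run both sides in a single thread.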
The parser stage checks the query
transmitted by the application
program for correct syntax and creates
a query tree.
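To make the idea of "check the syntax, then build a query tree" concrete, here is a minimal sketch that accepts only statements of the form SELECT col, ... FROM table. The function name parse_select and the dictionary layout of the tree are assumptions made for this example. PostgreSQL's real parser handles the full SQL grammar and produces a far richer structure.

```python
import re


def parse_select(sql):
    """Check a very restricted SELECT for syntax and return a tiny query tree."""
    m = re.fullmatch(r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)\s*;?\s*", sql,
                     re.IGNORECASE)
    if m is None:
        raise SyntaxError(f"syntax error in query: {sql!r}")
    # Each target must be a plain column name or '*'.
    targets = [c.strip() for c in m.group(1).split(",")]
    for target in targets:
        if not re.fullmatch(r"\w+|\*", target):
            raise SyntaxError(f"syntax error near {target!r}")
    # Hypothetical tree shape: command type, target list, source relation.
    return {"command": "select", "targets": targets, "from": m.group(2)}


print(parse_select("SELECT name, age FROM people;"))
# {'command': 'select', 'targets': ['name', 'age'], 'from': 'people'}
```

A query that does not fit the grammar, such as "SELEC name FROM people", is rejected with a SyntaxError before any tree is built.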
The rewrite system takes
the query tree created by the parser stage and looks for
any rules (stored in the
system catalogs) to apply to
the query tree. It performs the
transformations given in the rule bodies.
One application of the rewrite system is in the realization of
views.
Whenever a query against a view
(i.e., a virtual table) is made,
the rewrite system rewrites the user's query into
a query that accesses the base tables given in
the view definition instead.
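The view case can be illustrated with a small sketch, assuming a query tree represented as a dictionary and a hypothetical views mapping that plays the role of the rules stored in the system catalogs. The function rewrite and both data layouts are invented for this example and are not PostgreSQL's actual structures.

```python
def rewrite(query, views):
    """Replace a reference to a view with the base table of its definition."""
    view = views.get(query["from"])
    if view is None:
        return query  # not a view, so there is nothing to rewrite
    # Combine the view's own qualifications with those of the user's query.
    quals = view["where"] + query.get("where", [])
    rewritten = {**query, "from": view["from"], "where": quals}
    # A view may itself be defined over another view, so rewrite again.
    return rewrite(rewritten, views)


# Hypothetical rule: the view "adults" selects people aged 18 or older.
views = {"adults": {"from": "people", "where": ["age >= 18"]}}
user_query = {"command": "select", "targets": ["name"],
              "from": "adults", "where": ["city = 'Vienna'"]}
print(rewrite(user_query, views))
```

After rewriting, the query reads from the base table people and carries both the view's condition and the user's condition.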
The planner/optimizer takes
the (rewritten) query tree and creates a
query plan that will be the input to the
executor.
It does so by first creating all possible paths
leading to the same result. For example, if there is an index on a
relation to be scanned, there are two paths for the
scan. One possibility is a simple sequential scan and the other
possibility is to use the index. Next, the cost of executing
each path is estimated and the cheapest path is chosen. The cheapest
path is expanded into a complete plan that the executor can use.
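The choice between the two scan paths can be sketched with a deliberately simplified cost model. The constants borrow the names and default values of PostgreSQL's seq_page_cost, random_page_cost, and cpu_tuple_cost settings, but the formulas, the function names, and the selectivity parameter are assumptions for illustration only. The real planner's cost estimation is considerably more detailed.

```python
SEQ_PAGE_COST = 1.0      # cost of reading one page sequentially
RANDOM_PAGE_COST = 4.0   # cost of reading one page at random
CPU_TUPLE_COST = 0.01    # cost of processing one row


def candidate_paths(pages, rows, selectivity, has_index):
    """List (name, estimated cost) for every way of scanning the relation."""
    # A sequential scan reads every page and processes every row.
    paths = [("seq_scan", pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST)]
    if has_index:
        # Pessimistic assumption: each matching row needs one random page read.
        matched = rows * selectivity
        paths.append(("index_scan",
                      matched * (RANDOM_PAGE_COST + CPU_TUPLE_COST)))
    return paths


def cheapest_path(paths):
    """Pick the path with the lowest estimated cost."""
    return min(paths, key=lambda path: path[1])


# A selective condition favours the index, an unselective one does not.
print(cheapest_path(candidate_paths(1000, 100000, 0.001, True))[0])
print(cheapest_path(candidate_paths(1000, 100000, 0.5, True))[0])
```

Under these assumptions the first call chooses the index scan and the second falls back to the sequential scan, because fetching half of the table through random page reads costs more than reading it once from start to end.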
The executor recursively steps through
the plan tree and
retrieves rows in the way represented by the plan.
The executor makes use of the
storage system while scanning
relations, performs sorts and joins,
evaluates qualifications, and finally hands back the derived rows.
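The recursive walk over a plan tree can be sketched as follows, assuming a plan made of nested dictionaries and an in-memory tables mapping that stands in for the storage system. The node types, the dictionary keys, and the function execute are hypothetical. Joins are omitted to keep the example short.

```python
def execute(node, tables):
    """Recursively produce the rows described by a plan node."""
    kind = node["type"]
    if kind == "seq_scan":
        # Leaf node: fetch rows from the (stand-in) storage system.
        yield from tables[node["table"]]
    elif kind == "filter":
        # Evaluate the qualification on every row of the child node.
        for row in execute(node["child"], tables):
            if node["qual"](row):
                yield row
    elif kind == "sort":
        # Sorting needs all child rows before it can hand any back.
        rows = list(execute(node["child"], tables))
        yield from sorted(rows, key=lambda row: row[node["key"]])
    else:
        raise ValueError(f"unknown plan node: {kind}")


tables = {"people": [{"name": "Ann", "age": 31},
                     {"name": "Bob", "age": 17},
                     {"name": "Cy", "age": 25}]}
plan = {"type": "sort", "key": "age",
        "child": {"type": "filter", "qual": lambda row: row["age"] >= 18,
                  "child": {"type": "seq_scan", "table": "people"}}}
print([row["name"] for row in execute(plan, tables)])
# ['Cy', 'Ann']
```

Each node asks its child for rows only when it needs them, so the top of the tree drives the whole computation.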
In the following sections we will cover each of the items listed above
in more detail to give a better understanding of PostgreSQL's internal
control and data structures.