# index.html.md
# postgast
[GitHub](https://github.com/eddieland/postgast) | [PyPI](https://pypi.org/project/postgast/)
BSD-licensed Python bindings to [libpg_query](https://github.com/pganalyze/libpg_query),
the PostgreSQL parser extracted as a standalone C library.
Parse, deparse, normalize, fingerprint, split, and scan PostgreSQL SQL
statements from Python with a minimal dependency footprint — just `protobuf`
and the vendored C library.
```python
import postgast
# Parse a query into an AST
tree = postgast.parse("SELECT id, name FROM users WHERE active = true")
# Deparse an AST back to SQL
sql = postgast.deparse(tree)
# Normalize a query (replace constants with placeholders)
postgast.normalize("SELECT * FROM users WHERE id = 42")
# => "SELECT * FROM users WHERE id = $1"
# Fingerprint a query
fp = postgast.fingerprint("SELECT * FROM users WHERE id = 42")
# Split a multi-statement string
postgast.split("SELECT 1; SELECT 2;")
# => ["SELECT 1", "SELECT 2"]
```
## Installation
```bash
pip install postgast
```
## LLM-Friendly Docs
Machine-readable documentation is available for AI coding assistants:
- [llms.txt](/llms.txt) — concise overview
- [llms-full.txt](/llms-full.txt) — complete documentation in a single file
These files follow the [llms.txt standard](https://llmstxt.org/).
## Contents
* [User Guide](guide.md)
* [Parsing](guide.md#parsing)
* [Deparsing](guide.md#deparsing)
* [Normalization](guide.md#normalization)
* [Fingerprinting](guide.md#fingerprinting)
* [Splitting](guide.md#splitting)
* [Scanning](guide.md#scanning)
* [Tree Walking](guide.md#tree-walking)
* [AST Helpers](guide.md#ast-helpers)
* [DDL Helpers](guide.md#ddl-helpers)
* [Error Handling](guide.md#error-handling)
* [Common Usage Patterns](examples.md)
* [Query Analysis](examples.md#query-analysis)
* [Query Monitoring](examples.md#query-monitoring)
* [SQL Formatting](examples.md#sql-formatting)
* [Batch Processing](examples.md#batch-processing)
* [DDL Tooling](examples.md#ddl-tooling)
* [Tree Walking](examples.md#tree-walking)
* [Working with the Protobuf AST](examples.md#working-with-the-protobuf-ast)
* [PL/pgSQL Parsing](examples.md#pl-pgsql-parsing)
* [Error Handling](examples.md#error-handling)
* [Tokenization](examples.md#tokenization)
* [API Reference](api.md)
* [Core Functions](api.md#core-functions)
* [Tree Walking](api.md#tree-walking)
* [AST Helpers](api.md#ast-helpers)
* [DDL Helpers](api.md#ddl-helpers)
* [Types](api.md#types)
* [Exceptions](api.md#exceptions)
* [Architecture](architecture.md)
* [Why ctypes (not Cython, Rust, or C extensions)](architecture.md#why-ctypes-not-cython-rust-or-c-extensions)
* [How the binding layer works](architecture.md#how-the-binding-layer-works)
* [Alternatives considered](architecture.md#alternatives-considered)
* [Upgrade Policy](upgrade-policy.md)
* [Python](upgrade-policy.md#python)
* [PostgreSQL (libpg_query)](upgrade-policy.md#postgresql-libpg-query)
* [Versioning](upgrade-policy.md#versioning)
* [What this means in practice](upgrade-policy.md#what-this-means-in-practice)
* [Contributing](contributing.md)
* [Quick Start](contributing.md#quick-start)
* [Development Commands](contributing.md#development-commands)
# api.html.md
# API Reference
## Core Functions
### postgast.parse(query)
Parse a SQL query into a protobuf AST.
Calls libpg_query’s `pg_query_parse_protobuf` to parse the query and returns the deserialized `ParseResult`
protobuf message containing the abstract syntax tree.
* **Parameters:**
**query** ([`str`](https://docs.python.org/3/library/stdtypes.html#str)) – A SQL query string.
* **Return type:**
`ParseResult`
* **Returns:**
A `ParseResult` protobuf message with `version` (int) and `stmts` (list of `RawStmt`) fields.
* **Raises:**
[**PgQueryError**](#postgast.PgQueryError) – If the query contains a syntax error.
### Example
```pycon
>>> tree = parse("SELECT id, name FROM users WHERE active = true")
>>> len(tree.stmts)
1
>>> tree.stmts[0].stmt.HasField("select_stmt")
True
```
### postgast.deparse(tree)
Convert a protobuf parse tree back into a SQL string.
Calls libpg_query’s `pg_query_deparse_protobuf` to convert a `ParseResult` AST back into SQL text. This is the
inverse of [`postgast.parse()`](#postgast.parse).
#### NOTE
The deparsed SQL is canonicalized by libpg_query and may differ from the original query in whitespace, casing,
or parenthesization while remaining semantically equivalent.
* **Parameters:**
**tree** (`ParseResult` | `AstNode`) – A `ParseResult` protobuf message (as returned by [`postgast.parse()`](#postgast.parse)), or a typed `AstNode`
wrapper.
* **Return type:**
[`str`](https://docs.python.org/3/library/stdtypes.html#str)
* **Returns:**
The deparsed SQL string.
* **Raises:**
[**PgQueryError**](#postgast.PgQueryError) – If the parse tree cannot be deparsed.
### Example
```pycon
>>> from postgast import parse, deparse
>>> tree = parse("SELECT id FROM users")
>>> deparse(tree)
'SELECT id FROM users'
```
### postgast.normalize(query)
Normalize a SQL query by replacing literal constants with placeholders.
Calls libpg_query’s `pg_query_normalize` to replace literal values (strings, numbers, etc.) with parameter
placeholders (`$1`, `$2`, …). This is useful for grouping structurally equivalent queries.
* **Parameters:**
**query** ([`str`](https://docs.python.org/3/library/stdtypes.html#str)) – A SQL query string.
* **Return type:**
[`str`](https://docs.python.org/3/library/stdtypes.html#str)
* **Returns:**
The normalized query with constants replaced by positional placeholders.
* **Raises:**
[**PgQueryError**](#postgast.PgQueryError) – If the query cannot be parsed.
### Example
```pycon
>>> normalize("SELECT * FROM users WHERE id = 42 AND name = 'Alice'")
'SELECT * FROM users WHERE id = $1 AND name = $2'
```
### postgast.fingerprint(query)
Compute a structural fingerprint of a SQL query.
Calls libpg_query’s `pg_query_fingerprint` to produce a hash that identifies structurally equivalent queries
regardless of literal values.
* **Parameters:**
**query** ([`str`](https://docs.python.org/3/library/stdtypes.html#str)) – A SQL query string.
* **Return type:**
[`FingerprintResult`](#postgast.FingerprintResult)
* **Returns:**
A `FingerprintResult` containing the numeric fingerprint and its hex string representation.
* **Raises:**
[**PgQueryError**](#postgast.PgQueryError) – If the query cannot be parsed.
### Example
```pycon
>>> result = fingerprint("SELECT * FROM users WHERE id = 1")
>>> result.hex
'0ca858a0484f5826'
>>> result == fingerprint("SELECT * FROM users WHERE id = 2")
True
```
### postgast.split(sql, , method='parser')
Split a multi-statement SQL string into individual statements.
Calls the selected libpg_query split function to split the input into individual SQL statements. The `"parser"`
method (default) uses the full PostgreSQL parser for improved accuracy, while `"scanner"` uses a faster
scanner-based approach that tolerates invalid SQL.
* **Parameters:**
* **sql** ([`str`](https://docs.python.org/3/library/stdtypes.html#str)) – A SQL string potentially containing multiple statements.
* **method** ([`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)[`'scanner'`, `'parser'`]) – Which libpg_query splitter to use. `"parser"` (default) calls `pg_query_split_with_parser` for
improved accuracy on valid SQL. `"scanner"` calls `pg_query_split_with_scanner`, which tolerates
malformed SQL but may miss some edge cases.
* **Return type:**
[`list`](https://docs.python.org/3/library/stdtypes.html#list)[[`str`](https://docs.python.org/3/library/stdtypes.html#str)]
* **Returns:**
A list of individual SQL statement strings.
* **Raises:**
* [**PgQueryError**](#postgast.PgQueryError) – If the SQL causes a parse/scanner error.
* [**ValueError**](https://docs.python.org/3/library/exceptions.html#ValueError) – If *method* is not `"scanner"` or `"parser"`.
### Example
```pycon
>>> split("SELECT 1; SELECT 2")
['SELECT 1', ' SELECT 2']
>>> split("SELECT 'hello;world'")
["SELECT 'hello;world'"]
```
### postgast.scan(sql)
Tokenize a SQL string into a sequence of scan tokens.
Calls libpg_query’s `pg_query_scan` to tokenize the input and returns the deserialized `ScanResult` protobuf
message containing a list of `ScanToken` objects with token type, keyword kind, and byte positions.
* **Parameters:**
**sql** ([`str`](https://docs.python.org/3/library/stdtypes.html#str)) – A SQL string to tokenize.
* **Return type:**
`ScanResult`
* **Returns:**
A `ScanResult` protobuf message with `version` (int) and `tokens` (list of `ScanToken`) fields.
* **Raises:**
[**PgQueryError**](#postgast.PgQueryError) – If the input contains a scan error (e.g., unterminated string literal).
### Example
```pycon
>>> result = scan("SELECT 1")
>>> len(result.tokens)
2
>>> result.tokens[0].start, result.tokens[0].end
(0, 6)
```
## Tree Walking
### postgast.walk(node)
Depth-first pre-order traversal of a protobuf message tree.
Yields `(field_name, message)` tuples for every protobuf message encountered. The *field_name* is the protobuf
field name that led to the message (e.g. `"where_clause"`, `"target_list"`), or an empty string for the root.
`Node` oneof wrappers are transparently unwrapped so that only concrete message types appear in the output.
* **Parameters:**
**node** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` instance (`ParseResult`, `SelectStmt`, etc.).
* **Yields:**
`(field_name, message)` tuples in depth-first pre-order.
* **Return type:**
[`Generator`](https://docs.python.org/3/library/collections.abc.html#collections.abc.Generator)[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)], [`None`](https://docs.python.org/3/library/constants.html#None), [`None`](https://docs.python.org/3/library/constants.html#None)]
### Example
```pycon
>>> from postgast import parse, walk
>>> tree = parse("SELECT 1")
>>> for field_name, node in walk(tree):
... if field_name:
... print(f"{field_name}: {type(node).__name__}")
stmts: RawStmt
stmt: SelectStmt
target_list: ResTarget
val: A_Const
ival: Integer
```
### *class* postgast.Visitor
Base class for protobuf parse tree visitors.
Subclass and override `visit_` methods (e.g. `visit_SelectStmt`, `visit_ColumnRef`) to handle
specific node types. Unhandled types fall through to [`generic_visit()`](#postgast.Visitor.generic_visit) which recurses into children.
Call [`visit()`](#postgast.Visitor.visit) on a root message to start traversal:
```default
class TableCollector(Visitor):
def __init__(self):
self.tables = []
def visit_RangeVar(self, node):
self.tables.append(node.relname)
collector = TableCollector()
collector.visit(parse_result)
```
#### visit(node)
Dispatch *node* to `visit_` or [`generic_visit()`](#postgast.Visitor.generic_visit).
Looks up a method named `visit_` (where `` matches the protobuf descriptor name, e.g.
`visit_SelectStmt`). Falls back to [`generic_visit()`](#postgast.Visitor.generic_visit) if no specific handler exists.
* **Parameters:**
**node** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` instance.
* **Return type:**
[`None`](https://docs.python.org/3/library/constants.html#None)
#### generic_visit(node)
Visit all message-typed children of *node*.
Override this method to customize the default traversal behavior. Call `super().generic_visit(node)` from a
`visit_*` handler to continue recursion into a node’s children after custom processing.
* **Parameters:**
**node** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` instance.
* **Return type:**
[`None`](https://docs.python.org/3/library/constants.html#None)
## AST Helpers
### postgast.find_nodes(tree, node_type)
Yield all protobuf messages matching *node_type* from a parse tree.
Walks the tree in depth-first pre-order (same as [`walk()`](#postgast.walk)) and yields every message that is an instance of
*node_type*.
* **Parameters:**
* **tree** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` (`ParseResult`, `SelectStmt`, etc.).
* **node_type** ([`type`](https://docs.python.org/3/library/functions.html#type)[[`TypeVar`](https://docs.python.org/3/library/typing.html#typing.TypeVar)(`_M`, bound= [`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message))]) – Protobuf message class to match (e.g., `RangeVar`).
* **Yields:**
Matching instances in depth-first pre-order.
* **Return type:**
[`Generator`](https://docs.python.org/3/library/collections.abc.html#collections.abc.Generator)[[`TypeVar`](https://docs.python.org/3/library/typing.html#typing.TypeVar)(`_M`, bound= [`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)), [`None`](https://docs.python.org/3/library/constants.html#None), [`None`](https://docs.python.org/3/library/constants.html#None)]
### Example
```pycon
>>> from postgast import find_nodes, parse
>>> from postgast.pg_query_pb2 import RangeVar
>>> tree = parse("SELECT * FROM users JOIN orders ON users.id = orders.uid")
>>> [n.relname for n in find_nodes(tree, RangeVar)]
['users', 'orders']
```
### postgast.extract_tables(tree)
Yield table names referenced in a parse tree.
Walks all `RangeVar` nodes and yields their names as dot-joined strings (`"schema.table"` when
schema-qualified, `"table"` otherwise).
Results preserve encounter order and include duplicates. Use `set()` on the result to get unique table names.
* **Parameters:**
**tree** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` (`ParseResult`, `SelectStmt`, etc.).
* **Yields:**
Table names in encounter order.
* **Return type:**
[`Generator`](https://docs.python.org/3/library/collections.abc.html#collections.abc.Generator)[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`None`](https://docs.python.org/3/library/constants.html#None), [`None`](https://docs.python.org/3/library/constants.html#None)]
### Example
```pycon
>>> from postgast import extract_tables, parse
>>> tree = parse("SELECT * FROM public.users JOIN orders ON true")
>>> list(extract_tables(tree))
['public.users', 'orders']
```
### postgast.extract_columns(tree)
Yield column references found in a parse tree.
Walks all `ColumnRef` nodes and yields their names as dot-joined strings. `SELECT *` produces `"*"`;
`t.*` produces `"t.*"`.
Results preserve encounter order and include duplicates.
* **Parameters:**
**tree** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` (`ParseResult`, `SelectStmt`, etc.).
* **Yields:**
Column references in encounter order.
* **Return type:**
[`Generator`](https://docs.python.org/3/library/collections.abc.html#collections.abc.Generator)[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`None`](https://docs.python.org/3/library/constants.html#None), [`None`](https://docs.python.org/3/library/constants.html#None)]
### Example
```pycon
>>> from postgast import extract_columns, parse
>>> tree = parse("SELECT u.name, age FROM users u WHERE age > 18")
>>> list(extract_columns(tree))
['u.name', 'age', 'age']
```
### postgast.extract_functions(tree)
Yield function call names found in a parse tree.
Walks all `FuncCall` nodes and yields their names as dot-joined strings (`"schema.func"` when
schema-qualified, `"func"` otherwise).
Results preserve encounter order and include duplicates.
* **Parameters:**
**tree** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` (`ParseResult`, `SelectStmt`, etc.).
* **Yields:**
Function names in encounter order.
* **Return type:**
[`Generator`](https://docs.python.org/3/library/collections.abc.html#collections.abc.Generator)[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`None`](https://docs.python.org/3/library/constants.html#None), [`None`](https://docs.python.org/3/library/constants.html#None)]
### Example
```pycon
>>> from postgast import extract_functions, parse
>>> tree = parse("SELECT lower(name), count(*) FROM users")
>>> list(extract_functions(tree))
['lower', 'count']
```
### postgast.extract_function_identity(tree)
Return the identity of the first `CREATE FUNCTION` statement in a parse tree.
Finds the first `CreateFunctionStmt` node where `is_procedure` is `False` and returns a
[`FunctionIdentity`](#postgast.FunctionIdentity) with the schema and function name.
* **Parameters:**
**tree** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` (`ParseResult`, `SelectStmt`, etc.).
* **Return type:**
[`FunctionIdentity`](#postgast.FunctionIdentity) | [`None`](https://docs.python.org/3/library/constants.html#None)
* **Returns:**
A [`FunctionIdentity`](#postgast.FunctionIdentity) or `None` if no matching node is found.
### Example
```pycon
>>> from postgast import extract_function_identity, parse
>>> sql = "CREATE FUNCTION public.add(a int, b int) RETURNS int LANGUAGE sql AS $$ SELECT a + b $$"
>>> identity = extract_function_identity(parse(sql))
>>> identity.schema, identity.name
('public', 'add')
```
### postgast.extract_trigger_identity(tree)
Return the identity of the first `CREATE TRIGGER` statement in a parse tree.
Finds the first `CreateTrigStmt` node and returns a [`TriggerIdentity`](#postgast.TriggerIdentity) with the trigger name, schema,
and table name.
* **Parameters:**
**tree** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – Any protobuf `Message` (`ParseResult`, `SelectStmt`, etc.).
* **Return type:**
[`TriggerIdentity`](#postgast.TriggerIdentity) | [`None`](https://docs.python.org/3/library/constants.html#None)
* **Returns:**
A [`TriggerIdentity`](#postgast.TriggerIdentity) or `None` if no matching node is found.
### Example
```pycon
>>> from postgast import extract_trigger_identity, parse
>>> sql = "CREATE TRIGGER my_trg AFTER INSERT ON orders FOR EACH ROW EXECUTE FUNCTION notify()"
>>> identity = extract_trigger_identity(parse(sql))
>>> identity.trigger, identity.table
('my_trg', 'orders')
```
## DDL Helpers
### postgast.set_or_replace(tree)
Set `replace = True` on eligible DDL nodes in a parse tree.
Walks *tree* and flips the `replace` flag on `CreateFunctionStmt`, `CreateTrigStmt`, and `ViewStmt` nodes
where it is currently `False`.
* **Parameters:**
**tree** ([`Message`](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)) – A protobuf `Message` (typically a `ParseResult`).
* **Return type:**
[`int`](https://docs.python.org/3/library/functions.html#int)
* **Returns:**
Number of nodes that were modified.
### Example
```pycon
>>> from postgast import set_or_replace, parse, deparse
>>> tree = parse("CREATE VIEW v AS SELECT 1")
>>> set_or_replace(tree)
1
>>> "OR REPLACE" in deparse(tree)
True
```
### postgast.ensure_or_replace(sql)
Return *sql* with all eligible `CREATE` statements rewritten to `CREATE OR REPLACE`.
Parses the input, sets `replace = True` on `CreateFunctionStmt`, `CreateTrigStmt`, and `ViewStmt` nodes,
and deparses back to SQL.
* **Parameters:**
**sql** ([`str`](https://docs.python.org/3/library/stdtypes.html#str)) – One or more SQL statements.
* **Return type:**
[`str`](https://docs.python.org/3/library/stdtypes.html#str)
* **Returns:**
The rewritten SQL text.
* **Raises:**
[**PgQueryError**](#postgast.PgQueryError) – If *sql* cannot be parsed.
### Example
```pycon
>>> from postgast import ensure_or_replace
>>> ensure_or_replace("CREATE VIEW v AS SELECT 1")
'CREATE OR REPLACE VIEW v AS SELECT 1'
```
### postgast.to_drop(sql)
Return the `DROP` statement corresponding to a `CREATE` statement.
Parses *sql*, builds a `DropStmt` protobuf from the parsed AST, and deparses it back to SQL. Supports:
- `CREATE FUNCTION` / `CREATE PROCEDURE`
- `CREATE TRIGGER`
- `CREATE VIEW`
- `CREATE TABLE`
- `CREATE INDEX`
- `CREATE SEQUENCE`
- `CREATE SCHEMA`
- `CREATE EXTENSION`
- `CREATE TYPE` (enum, range, composite)
- `CREATE MATERIALIZED VIEW ... AS`
All `OR REPLACE` and `IF NOT EXISTS` variants are accepted.
* **Parameters:**
**sql** ([`str`](https://docs.python.org/3/library/stdtypes.html#str)) – A single CREATE statement.
* **Return type:**
[`str`](https://docs.python.org/3/library/stdtypes.html#str)
* **Returns:**
The corresponding DROP statement.
* **Raises:**
* [**ValueError**](https://docs.python.org/3/library/exceptions.html#ValueError) – If *sql* contains zero or more than one statement, or if the statement is not a supported CREATE
type.
* [**PgQueryError**](#postgast.PgQueryError) – If *sql* is not valid SQL.
### Example
```pycon
>>> from postgast import to_drop
>>> to_drop("CREATE TABLE public.users (id int)")
'DROP TABLE public.users'
```
## Types
### *class* postgast.FingerprintResult(fingerprint, hex)
Result of fingerprinting a SQL query.
* **Parameters:**
* **fingerprint** ([*int*](https://docs.python.org/3/library/functions.html#int))
* **hex** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
#### fingerprint
The uint64 numeric hash.
#### hex
The hexadecimal string representation of the fingerprint.
### *class* postgast.FunctionIdentity(schema, name)
Identity parts of a `CREATE FUNCTION` statement.
* **Parameters:**
* **schema** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
* **name** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
#### schema
Schema name, or `None` for unqualified functions.
#### name
Function name.
### *class* postgast.TriggerIdentity(trigger, schema, table)
Identity parts of a `CREATE TRIGGER` statement.
* **Parameters:**
* **trigger** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
* **schema** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
* **table** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
#### trigger
Trigger name.
#### schema
Schema qualifying the target table, or `None`.
#### table
Target table name.
## Exceptions
### *class* postgast.PgQueryError(message, , cursorpos=0, context=None, funcname=None, filename=None, lineno=0)
Structured error raised when libpg_query rejects a SQL statement.
Every postgast function that calls into libpg_query ([`parse()`](#postgast.parse), [`deparse()`](#postgast.deparse),
[`normalize()`](#postgast.normalize), [`fingerprint()`](#postgast.fingerprint), [`split()`](#postgast.split), [`scan()`](#postgast.scan),
`parse_plpgsql()`, and `format_sql()`) may raise this exception. The error carries the
same structured fields that the C library provides, so callers can build precise diagnostics (e.g., underlining the
offending token) without parsing the message string.
`cursorpos` is a **1-based byte offset** into the original SQL string pointing to the token where the error was
detected. When it is `0` the position is unknown. Because it counts *bytes*, `e.cursorpos - 1` only equals the
corresponding Python string index when the SQL is pure ASCII. For SQL containing multibyte UTF-8 characters
(e.g., Unicode identifiers or string literals), index into the UTF-8-encoded `bytes` representation instead:
```default
pos = sql.encode("utf-8")[: e.cursorpos - 1].decode("utf-8")
char_offset = len(pos)
```
The `funcname`, `filename`, and `lineno` fields refer to the *internal C source* of
libpg_query / PostgreSQL’s parser, not to your Python code. They are mainly useful for filing upstream bug reports.
* **Parameters:**
* **message** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
* **cursorpos** ([*int*](https://docs.python.org/3/library/functions.html#int))
* **context** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
* **funcname** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
* **filename** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
* **lineno** ([*int*](https://docs.python.org/3/library/functions.html#int))
#### message
Human-readable error description from the PostgreSQL parser.
#### cursorpos
1-based byte offset in the SQL string where the error was detected
(`0` when the position is unavailable).
#### context
Additional context from the parser (e.g., PL/pgSQL function name), or `None`.
#### funcname
Internal C function name where the error originated, or `None`.
#### filename
Internal C source file where the error originated, or `None`.
#### lineno
Line number in the internal C source file (`0` when unavailable).
### Examples
Catch a syntax error and inspect the cursor position:
```pycon
>>> from postgast import parse, PgQueryError
>>> try:
... parse("SELECT FROM")
... except PgQueryError as e:
... print(e.cursorpos)
12
```
Use `cursorpos` to highlight the error location (ASCII-safe shortcut):
```pycon
>>> from postgast import parse, PgQueryError
>>> sql = "SELECT * FORM users"
>>> try:
... parse(sql)
... except PgQueryError as e:
... idx = max(e.cursorpos - 1, 0)
... print(sql)
... print(" " * idx + "^")
... print(e.message)
SELECT * FORM users
^
syntax error at or near "FORM"
```
For SQL that may contain non-ASCII characters, convert via the encoded bytes to get the correct
character offset:
```default
sql = "SELECT 'ü' FORM t"
try:
parse(sql)
except PgQueryError as e:
idx = len(sql.encode("utf-8")[: e.cursorpos - 1].decode("utf-8"))
print(sql)
print(" " * idx + "^")
```
# architecture.html.md
# Architecture
This page covers the main technical decisions behind `postgast` and why we
made them.
## Why ctypes (not Cython, Rust, or C extensions)
`postgast` binds to `libpg_query` using Python’s built-in `ctypes`
module rather than Cython, PyO3/Rust, or a hand-written CPython C extension.
This was a deliberate choice. Here’s the reasoning.
### Pure-Python packaging
With `ctypes` the only compiled artifact is the vendored `libpg_query`
shared library itself. Everything above it (struct definitions, function
signatures, error handling, protobuf deserialization) is plain Python. This
means:
- **No compiler toolchain at install time.** Users never need Cython, a Rust
toolchain, or a C compiler to install `postgast`. `pip install postgast`
delivers a pre-built wheel containing the shared library and pure-Python
code.
- **Simpler CI matrix.** Wheels are built by compiling a single C library
(`libpg_query`) per platform. There’s no second compilation step for a
Python extension module, which removes an entire class of ABI-compatibility
issues (limited API, stable ABI, per-interpreter builds, etc.).
- **Easier debugging.** Every line between the C boundary and the public API
is inspectable Python. You can step through `native.py` with a normal
debugger, no mixed C/Python stack frames to deal with.
### Minimal dependency footprint
The only runtime dependency is `protobuf`. There’s no build-time dependency
on Cython or `setuptools-rust`, no transitive dependency on `cffi`, and
no compiled glue code. Fewer moving parts means fewer ways the install can
break.
### BSD licensing
`pglast`, the most established `libpg_query` wrapper for Python, is
licensed under GPLv3. That makes it unusable in many commercial and
permissively-licensed projects. By keeping the binding layer to `ctypes`
(stdlib) plus `protobuf` (BSD-compatible), `postgast` can ship under the
BSD 2-Clause license with no copyleft obligations.
### Trade-offs
`ctypes` isn’t free of downsides:
- **No compile-time type checking at the C boundary.** If the
`libpg_query` struct layout changes between versions, the ctypes bindings
break silently at runtime rather than failing to compile. We mitigate this
by pinning to a specific `libpg_query` version and testing across
platforms in CI.
- **Per-call overhead.** Each `ctypes` call has slightly more overhead than
a direct C extension call. In practice this is negligible because the real
work happens inside `libpg_query` (parsing a full PostgreSQL grammar) and
the `ctypes` marshalling cost is dwarfed by the parser itself.
- **Manual struct definitions.** The ctypes `Structure` classes in
`native.py` must mirror the C structs exactly. It’s a small amount of
code (~200 lines) maintained by hand. A Cython `.pxd` or Rust
`bindgen` would generate these, but at the cost of the toolchain
complexity described above.
On balance the simplicity, portability, and licensing benefits outweigh the
minor ergonomic costs.
## How the binding layer works
All C interop lives in a single internal module, `native.py`. It does three
things:
1. **Loads the shared library.** Checks for a vendored copy bundled in the
wheel first, then falls back to `ctypes.util.find_library` for
system-installed libraries.
2. **Defines ctypes Structure classes** that mirror every `libpg_query`
result type (`PgQueryParseResult`, `PgQueryNormalizeResult`, etc.).
3. **Declares function signatures** (`argtypes` / `restype`) for each
public C function so that calls are type-checked at the Python level.
Higher-level modules (`parse.py`, `deparse.py`, `normalize.py`, …)
import `native.lib` and follow a consistent pattern:
```python
# Pseudocode (actual code is in each module)
result = native.lib.pg_query_parse_protobuf(sql.encode())
try:
check_error(result) # raise PgQueryError if result.error is set
payload = extract(result) # read return value
finally:
native.lib.pg_query_free_protobuf_parse_result(result)
```
The `finally` block ensures the C-allocated memory is always freed, even
when an error is raised.
### Protobuf deserialization
`libpg_query` returns parse trees as serialized Protocol Buffer messages.
`postgast` deserializes them using the official `protobuf` library into
generated Python message classes (`pg_query_pb2`). This avoids writing a
custom deserializer and tracks the upstream `.proto` schema exactly.
Binary payloads are read with `ctypes.string_at(data, length)` rather than
`c_char_p` because protobuf data can contain embedded null bytes that
`c_char_p` would silently truncate.
## Alternatives considered
### Cython
Cython would give compile-time type safety at the C boundary and marginally
faster call overhead. But it requires a C compiler at wheel-build time *and*
introduces a Cython build dependency. For a thin binding layer (~200 lines of
struct definitions and function signatures), the added build complexity isn’t
justified.
### Rust (PyO3 / maturin)
A Rust extension via PyO3 would provide memory safety guarantees and strong
typing. But `libpg_query` is a C library, so the Rust layer would still
call C via FFI. Adding Rust introduces a second toolchain (`cargo`), a
`maturin` build backend, and complicates cross-compilation. The binding
layer is too thin to benefit from Rust’s strengths.
### CFFI
CFFI is a popular alternative to `ctypes` that offers an ABI mode (similar
to `ctypes`) and an API mode (generates a C extension). ABI mode provides
no real advantage over `ctypes` for this use case, and API mode
reintroduces the C compiler requirement. Staying with `ctypes` avoids
adding `cffi` as a dependency.
### Hand-written CPython C extension
A C extension would be the fastest option, but it ties the code to CPython
internals, requires careful reference counting, and complicates building
wheels for multiple Python versions. The performance difference is immaterial
for this library’s workload.
# contributing.html.md
# Contributing
See the [contributing guide on GitHub](https://github.com/eddieland/postgast/blob/main/CONTRIBUTING.md)
for full details on setting up a development environment, running tests, and
submitting pull requests.
## Quick Start
```bash
git clone https://github.com/eddieland/postgast.git
cd postgast
make install # Install dependencies
make build-native # Compile vendored libpg_query
make test # Run tests
```
## Development Commands
```bash
make fmt # Autoformat (mdformat, codespell, ruff)
make lint # Format + type-check (basedpyright)
make test # Run tests
make coverage # Tests with coverage report
make all # install + lint + test
```
# examples.html.md
# Common Usage Patterns
This page demonstrates practical patterns for common tasks. For API
fundamentals, see the [User Guide](guide.md).
## Query Analysis
### Audit which tables a query touches
```python
import postgast
sql = """
SELECT o.id, c.name, p.title
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN products p ON o.product_id = p.id
WHERE o.created_at > '2024-01-01'
"""
tree = postgast.parse(sql)
tables = postgast.extract_tables(tree)
columns = postgast.extract_columns(tree)
functions = postgast.extract_functions(tree)
print("Tables:", tables) # ['orders', 'customers', 'products']
print("Columns:", columns) # ['o.id', 'c.name', 'p.title', ...]
print("Functions:", functions) # []
```
### Detect queries that use subqueries
```python
from postgast import parse, find_nodes
from postgast.pg_query_pb2 import SelectStmt
def has_subquery(sql: str) -> bool:
"""Return True if the SQL contains a nested SELECT."""
tree = parse(sql)
select_count = sum(1 for _ in find_nodes(tree, SelectStmt))
return select_count > 1
has_subquery("SELECT * FROM users")
# => False
has_subquery("SELECT * FROM (SELECT id FROM users) AS sub")
# => True
```
### Find all function calls in a query
```python
import postgast
sql = "SELECT upper(name), count(*) FROM users GROUP BY upper(name)"
tree = postgast.parse(sql)
print(postgast.extract_functions(tree))
# ['upper', 'count', 'upper']
# Unique function names:
print(set(postgast.extract_functions(tree)))
# {'upper', 'count'}
```
## Query Monitoring
### Group queries with normalization
Replace literal values with positional placeholders so that structurally
identical queries collapse into a single group:
```python
import postgast
queries = [
"SELECT * FROM users WHERE id = 42",
"SELECT * FROM users WHERE id = 99",
"SELECT * FROM orders WHERE status = 'pending'",
]
for q in queries:
print(postgast.normalize(q))
# SELECT * FROM users WHERE id = $1
# SELECT * FROM users WHERE id = $1
# SELECT * FROM orders WHERE status = $1
```
### Fingerprint queries for deduplication
Two queries are structurally equivalent when they have the same fingerprint,
regardless of literal values, whitespace, or formatting:
```python
import postgast
fp1 = postgast.fingerprint("SELECT * FROM users WHERE id = 1")
fp2 = postgast.fingerprint("select * from users where id = 999")
assert fp1.hex == fp2.hex # same structure
fp3 = postgast.fingerprint("SELECT * FROM orders WHERE id = 1")
assert fp1.hex != fp3.hex # different table
```
## SQL Formatting
### Pretty-print SQL
```python
import postgast
ugly = "select u.id,u.name,o.total from users u join orders o on u.id=o.user_id where o.total>100 order by o.total desc"
print(postgast.format_sql(ugly))
```
Output:
```sql
SELECT
u.id,
u.name,
o.total
FROM
users u
JOIN orders o ON u.id = o.user_id
WHERE
o.total > 100
ORDER BY
o.total DESC;
```
### Format an already-parsed tree
`format_sql` also accepts a `ParseResult`, so you can format after
making AST modifications:
```python
import postgast
tree = postgast.parse("CREATE VIEW v AS SELECT 1")
postgast.set_or_replace(tree)
print(postgast.format_sql(tree))
```
Output:
```sql
CREATE OR REPLACE VIEW v AS
SELECT
1;
```
## Batch Processing
### Process a SQL migration file
Use [`split()`](api.md#postgast.split) to break a multi-statement file into individual
statements, then analyze each one:
```python
import postgast
migration = """
CREATE TABLE users (
id serial PRIMARY KEY,
name text NOT NULL
);
CREATE INDEX idx_users_name ON users (name);
INSERT INTO users (name) VALUES ('alice'), ('bob');
"""
for stmt in postgast.split(migration):
tree = postgast.parse(stmt)
tables = postgast.extract_tables(tree)
print(f"Tables: {tables!r:30s} SQL: {stmt.strip()[:60]}...")
```
### Split with tolerance for invalid SQL
The `"scanner"` method splits on semicolons without parsing, so it works
even when the SQL contains syntax errors:
```python
import postgast
broken = "SELECT 1; INVALID SYNTAX HERE; SELECT 2"
stmts = postgast.split(broken, method="scanner")
print(stmts)
# ['SELECT 1', ' INVALID SYNTAX HERE', ' SELECT 2']
```
## DDL Tooling
### Generate rollback DROP statements
Automatically produce `DROP` statements from `CREATE` DDL for migration
rollback scripts:
```python
import postgast
creates = [
"CREATE FUNCTION public.add(a int, b int) RETURNS int LANGUAGE sql AS $$ SELECT a + b $$",
"CREATE VIEW active_users AS SELECT * FROM users WHERE active",
"CREATE TRIGGER audit_trg BEFORE UPDATE ON users FOR EACH ROW EXECUTE FUNCTION audit()",
]
for sql in creates:
print(postgast.to_drop(sql))
# DROP FUNCTION public.add(int, int)
# DROP VIEW active_users
# DROP TRIGGER audit_trg ON users
```
### Make CREATE statements idempotent
Add `OR REPLACE` to `CREATE FUNCTION`, `CREATE TRIGGER`, and
`CREATE VIEW` statements so they can be re-run safely:
```python
import postgast
sql = "CREATE VIEW active_users AS SELECT * FROM users WHERE active"
print(postgast.ensure_or_replace(sql))
# CREATE OR REPLACE VIEW active_users AS SELECT * FROM users WHERE active = true
# Already idempotent input is unchanged:
postgast.ensure_or_replace(postgast.ensure_or_replace(sql))
```
## Tree Walking
### Walk the AST to inspect structure
[`walk()`](api.md#postgast.walk) yields every node in the tree with its parent field
name, useful for debugging or generic transforms:
```python
import postgast
tree = postgast.parse("SELECT a FROM t WHERE x = 1")
for field_name, node in postgast.walk(tree):
if field_name:
print(f" {field_name}: {type(node).__name__}")
```
Output:
```default
stmts: RawStmt
stmt: SelectStmt
target_list: ResTarget
val: ColumnRef
fields: String
from_clause: RangeVar
where_clause: A_Expr
lexpr: ColumnRef
fields: String
rexpr: A_Const
```
### Collect information with the Visitor pattern
Create a [`Visitor`](api.md#postgast.Visitor) subclass with `visit_`
methods. Unhandled node types automatically recurse into their children:
```python
import postgast
class QueryAnalyzer(postgast.Visitor):
def __init__(self):
self.tables = []
self.columns = []
self.has_where = False
def visit_RangeVar(self, node):
self.tables.append(node.relname)
def visit_ColumnRef(self, node):
parts = []
for f in node.fields:
inner = getattr(f, f.WhichOneof("node"))
if hasattr(inner, "sval"):
parts.append(inner.sval)
self.columns.append(".".join(parts))
def visit_SelectStmt(self, node):
if node.HasField("where_clause"):
self.has_where = True
self.generic_visit(node) # continue into children
tree = postgast.parse("SELECT u.name FROM users u WHERE u.active = true")
analyzer = QueryAnalyzer()
analyzer.visit(tree)
print(analyzer.tables) # ['users']
print(analyzer.columns) # ['u.name', 'u.active']
print(analyzer.has_where) # True
```
### Control traversal depth
Omitting the call to `self.generic_visit(node)` in a handler stops
recursion into that node’s children. This lets you skip subtrees:
```python
import postgast
class TopLevelTables(postgast.Visitor):
"""Collect tables from the top-level FROM clause only, ignoring subqueries."""
def __init__(self):
self.tables = []
def visit_RangeVar(self, node):
self.tables.append(node.relname)
def visit_SubLink(self, _node):
pass # don't recurse into subqueries
tree = postgast.parse(
"SELECT * FROM orders WHERE customer_id IN (SELECT id FROM vip_customers)"
)
v = TopLevelTables()
v.visit(tree)
print(v.tables) # ['orders'] — vip_customers is skipped
```
### Use typed AST wrappers
Wrap a parse tree with `wrap()` for typed attribute access.
Works with `walk_typed()` and `TypedVisitor`:
```python
from postgast import parse, wrap, walk_typed, TypedVisitor
tree = wrap(parse("SELECT a, b FROM t"))
for field_name, node in walk_typed(tree):
if field_name:
print(f" {field_name}: {type(node).__name__}")
```
## Working with the Protobuf AST
### Access raw protobuf nodes
The parse tree is a standard protobuf `Message`. You can inspect it
using all the usual protobuf APIs:
```python
import postgast
tree = postgast.parse("SELECT id, name FROM users WHERE active = true")
# Navigate to the SelectStmt
raw_stmt = tree.stmts[0]
select = raw_stmt.stmt.select_stmt
# Inspect the target list (SELECT columns)
for target in select.target_list:
col = target.res_target.val.column_ref
name = col.fields[0].string.sval
print(f"Column: {name}")
# Inspect the FROM clause
table = select.from_clause[0].range_var
print(f"Table: {table.relname}") # 'users'
```
### Find specific node types
[`find_nodes()`](api.md#postgast.find_nodes) filters the walk to a single protobuf
message type:
```python
from postgast import parse, find_nodes
from postgast.pg_query_pb2 import FuncCall, RangeVar
tree = parse("SELECT lower(name), count(*) FROM users GROUP BY lower(name)")
# All table references
for rv in find_nodes(tree, RangeVar):
print(f"Table: {rv.relname}")
# All function calls
for fc in find_nodes(tree, FuncCall):
func_name = fc.funcname[0].string.sval
print(f"Function: {func_name}")
```
## PL/pgSQL Parsing
### Parse a PL/pgSQL function body
`parse_plpgsql()` returns a structured representation of a
PL/pgSQL function’s body, including declarations, assignments, and control
flow:
```python
import json
import postgast
sql = """
CREATE FUNCTION greet(name text) RETURNS text LANGUAGE plpgsql AS $$
DECLARE
result text;
BEGIN
result := 'Hello, ' || name;
RETURN result;
END;
$$
"""
parsed = postgast.parse_plpgsql(sql)
print(json.dumps(parsed, indent=2))
```
## Error Handling
### Catch and inspect parse errors
All functions raise [`PgQueryError`](api.md#postgast.PgQueryError) on invalid SQL. The
exception carries the error message, cursor position, and source location:
```python
import postgast
try:
postgast.parse("SELECT * FORM users")
except postgast.PgQueryError as e:
print(f"Error: {e.message}")
print(f"Position: {e.cursorpos}")
# Error: syntax error at or near "users"
# Position: 15
```
### Validate SQL before execution
Use `parse` as a fast syntax check without hitting the database:
```python
import postgast
def is_valid_sql(sql: str) -> bool:
try:
postgast.parse(sql)
return True
except postgast.PgQueryError:
return False
is_valid_sql("SELECT * FROM users") # True
is_valid_sql("SLECT * FORM users") # False
is_valid_sql("SELECT 1; DROP TABLE users; --") # True (valid SQL!)
```
## Tokenization
### Scan SQL into tokens
[`scan()`](api.md#postgast.scan) returns the raw token stream, useful for syntax
highlighting, keyword detection, or building custom splitters:
```python
import postgast
result = postgast.scan("SELECT id FROM users WHERE active = true")
for token in result.tokens:
# Extract the token text using byte positions
text = "SELECT id FROM users WHERE active = true"[token.start:token.end]
print(f"{text:12s} token={token.token} keyword={token.keyword_kind}")
```
# guide.html.md
# User Guide
## Parsing
Parse a SQL query into a protobuf AST:
```python
import postgast
tree = postgast.parse("SELECT id, name FROM users WHERE active = true")
# tree.stmts[0] contains the first (and only) statement
```
The returned `ParseResult` is a protobuf message. Each statement in
`tree.stmts` is a `RawStmt` wrapping a `Node` oneof.
## Deparsing
Convert a parse tree back into SQL text:
```python
sql = postgast.deparse(tree)
# "SELECT id, name FROM users WHERE active = true"
```
The deparsed SQL is canonicalized by libpg_query and may differ from the
original in whitespace, casing, or parenthesization while remaining
semantically equivalent.
## Normalization
Replace literal constants with positional placeholders:
```python
postgast.normalize("SELECT * FROM users WHERE id = 42 AND name = 'alice'")
# => "SELECT * FROM users WHERE id = $1 AND name = $2"
```
This is useful for grouping structurally equivalent queries.
## Fingerprinting
Compute a structural hash that identifies equivalent queries regardless of
literal values:
```python
fp = postgast.fingerprint("SELECT * FROM users WHERE id = 42")
fp.fingerprint # uint64 hash
fp.hex # hex string representation
```
## Splitting
Split a multi-statement SQL string into individual statements:
```python
postgast.split("SELECT 1; SELECT 2;")
# => ["SELECT 1", "SELECT 2"]
```
Two methods are available via the `method` parameter:
- `"parser"` (default) — uses the full PostgreSQL parser for accuracy
- `"scanner"` — faster, tolerates invalid SQL
```python
postgast.split("SELECT 1; SELECT 2;", method="scanner")
```
## Scanning
Tokenize a SQL string:
```python
result = postgast.scan("SELECT 1")
for token in result.tokens:
print(token.token, token.keyword_kind, token.start, token.end)
```
## Tree Walking
Walk all nodes in a parse tree with depth-first traversal:
```python
for field_name, node in postgast.walk(tree):
print(type(node).__name__, field_name)
```
Or use the visitor pattern:
```python
class TableCollector(postgast.Visitor):
def __init__(self):
self.tables = []
def visit_RangeVar(self, node):
self.tables.append(node.relname)
collector = TableCollector()
collector.visit(tree)
print(collector.tables)
```
## AST Helpers
Extract common information from parse trees:
```python
tree = postgast.parse("SELECT u.id, u.name FROM users u JOIN orders o ON u.id = o.user_id")
postgast.extract_tables(tree) # ["users", "orders"]
postgast.extract_columns(tree) # ["u.id", "u.name", "u.id", "o.user_id"]
postgast.extract_functions(tree) # []
```
Find specific node types:
```python
from postgast.pg_query_pb2 import RangeVar
for node in postgast.find_nodes(tree, RangeVar):
print(node.relname, node.alias.aliasname if node.alias.aliasname else "")
```
## DDL Helpers
Generate `DROP` statements from `CREATE` DDL:
```python
postgast.to_drop("CREATE FUNCTION add(a int, b int) RETURNS int AS $$ SELECT a + b $$ LANGUAGE sql")
# => "DROP FUNCTION add(int, int)"
```
Rewrite `CREATE` to `CREATE OR REPLACE`:
```python
postgast.ensure_or_replace("CREATE FUNCTION add(a int, b int) RETURNS int AS $$ SELECT a + b $$ LANGUAGE sql")
# => "CREATE OR REPLACE FUNCTION ..."
```
## Error Handling
All functions raise [`PgQueryError`](api.md#postgast.PgQueryError) on invalid SQL:
```python
try:
postgast.parse("SELECT FROM")
except postgast.PgQueryError as e:
print(e.message) # Error message
print(e.cursorpos) # Position in the SQL string
```
# upgrade-policy.html.md
# Upgrade Policy
This page describes how `postgast` tracks upstream dependencies and which
Python versions are supported.
## Python
`postgast` currently supports **Python 3.10** through the latest GA release
(currently 3.14).
The minimum supported version will move to **Python 3.12** in an upcoming
release. When this happens the major version will be bumped to signal the
change. Pin `postgast<2` if you need to stay on Python 3.10 or 3.11.
## PostgreSQL (libpg_query)
`postgast` always vendors the latest available version of
[libpg_query](https://github.com/pganalyze/libpg_query). The vendored
version determines which PostgreSQL grammar is used for parsing, deparsing,
and all other operations.
Because `postgast` delegates all parsing to `libpg_query`, the PostgreSQL
syntax it understands is dictated entirely by the vendored library version.
There is no separate PostgreSQL version knob to configure.
Backwards-incompatible changes to the PostgreSQL parser are exceedingly rare.
In practice a `libpg_query` upgrade means *new* syntax is accepted, not that
existing syntax breaks. Still, if you need to pin a specific parser version
you can pin the `postgast` version that vendors it.
## Versioning
`postgast` follows [Semantic Versioning](https://semver.org/):
- **Patch** releases contain bug fixes and `libpg_query` patch updates.
- **Minor** releases add new features or upgrade `libpg_query` to a new
PostgreSQL major version.
- **Major** releases include breaking API changes or Python support-range
changes (such as dropping a Python version).
## What this means in practice
- **Most users** can track the latest `postgast` release with no issues.
`libpg_query` parser upgrades almost never break existing SQL.
- **Users on older Python versions** should pin to the last major version that
supports their interpreter once the minimum is raised.
- **Users who need a specific PostgreSQL parser version** should pin the
`postgast` version that vendors the corresponding `libpg_query` release.