PolarDB supports three data types for working with JSON data: json, jsonb, and jsonpath. Use json or jsonb to store JSON documents, and jsonpath to express queries over those documents. For most workloads, jsonb is the better choice because it processes queries faster and supports indexing.
Choose between json and jsonb
Both types accept the same input values but store data differently:
json | jsonb | |
|---|---|---|
| Storage format | Exact copy of input text | Decomposed binary format |
| Input speed | Faster | Slightly slower (conversion overhead) |
| Query speed | Slower (reparses on every execution) | Significantly faster (no reparsing) |
| Indexing | Not supported | Supported (GIN, B-tree, hash) |
| Whitespace | Preserved | Not preserved |
| Key order | Preserved | Not preserved |
| Duplicate keys | All kept; last value is operative | Only last value kept |
Use jsonb for most applications. Use json only if you need to preserve key ordering or whitespace exactly as input — for example, to satisfy legacy assumptions about object key ordering.
JSON primitive types and PostgreSQL types
PolarDB maps JSON primitive types to PostgreSQL types when storing JSON data:
| JSON primitive type | PostgreSQL type | Notes |
|---|---|---|
string | text | \u0000 is disallowed; Unicode escapes for characters unavailable in the database encoding are also disallowed |
number | numeric | NaN and infinity are disallowed |
boolean | boolean | Only lowercase true and false are accepted |
null | (none) | SQL NULL is a different concept |
jsonb rejects numbers outside the range of the PostgreSQL numeric type. json does not perform this check.
JSON strings must be encoded in UTF-8 as specified by RFC 7159. Thejsontype allows Unicode escapes regardless of database encoding and checks only for syntactic correctness (four hex digits after\u). Thejsonbtype is stricter: it disallows Unicode escapes for characters that cannot be represented in the database encoding, rejects\u0000, and requires correct Unicode surrogate pairs. Valid Unicode escapes are converted to the equivalent single character on storage.
Input and output syntax
JSON input and output syntax follows RFC 7159. The following are all valid json (or jsonb) expressions:
-- Scalar/primitive value
-- Primitive values can be numbers, quoted strings, true, false, or null
SELECT '5'::json;
-- Array of zero or more elements (elements need not be of the same type)
SELECT '[1, 2, "foo", null]'::json;
-- Object containing key/value pairs
-- Object keys must always be quoted strings
SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json;
-- Arbitrarily nested arrays and objects
SELECT '{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}'::json;json outputs input text as-is; jsonb normalizes it. For example:
SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::json;
-- json
-- -------------------------------------------------
-- {"bar": "baz", "balance": 7.77, "active":false}
SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::jsonb;
-- jsonb
-- --------------------------------------------------
-- {"bar": "baz", "active": false, "balance": 7.77}jsonb also normalizes numbers: values entered with E notation are printed without it.
SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb;
-- json | jsonb
-- -----------------------+-------------------------
-- {"reading": 1.230e-5} | {"reading": 0.00001230}Trailing fractional zeroes are preserved in jsonb output even though they are semantically insignificant.
Designing JSON documents
JSON's flexible schema is useful when requirements change frequently, but documents with a predictable structure are easier to query. Even when structure is unenforced, maintaining consistency makes it simpler to summarize sets of documents in a table.
Keep JSON documents small. Any row update acquires a row-level lock on the entire row, so large documents increase lock contention. Structure each document as an atomic unit that cannot reasonably be subdivided further.
jsonb containment and existence
jsonb supports containment (@>) and existence (?) operators for querying nested data. These operators have no equivalent for json.
Containment
The @> operator tests whether one jsonb document is contained within another. The contained object must match the containing object in structure and data, though non-matching array elements or key/value pairs in the containing object are ignored. Array element order does not matter, and duplicate array elements are considered only once.
-- Simple scalar values contain only the identical value
SELECT '"foo"'::jsonb @> '"foo"'::jsonb; -- true
-- The right-side array is contained within the left-side array
SELECT '[1, 2, 3]'::jsonb @> '[1, 3]'::jsonb; -- true
-- Array element order is not significant
SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb; -- true
-- Duplicate array elements don't matter
SELECT '[1, 2, 3]'::jsonb @> '[1, 2, 2]'::jsonb; -- true
-- A single-pair object on the right is contained within the object on the left
SELECT '{"product": "PostgreSQL", "version": 9.4, "jsonb": true}'::jsonb @> '{"version": 9.4}'::jsonb; -- true
-- A nested array is NOT considered contained at the outer level
SELECT '[1, 2, [1, 3]]'::jsonb @> '[1, 3]'::jsonb; -- false
-- But matching the nesting level works
SELECT '[1, 2, [1, 3]]'::jsonb @> '[[1, 3]]'::jsonb; -- true
-- Containment does not cross object nesting boundaries
SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"bar": "baz"}'::jsonb; -- false
-- A top-level key with an empty object is contained
SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"foo": {}}'::jsonb; -- trueAs a special exception, an array may contain a primitive value (but not the reverse):
-- An array contains a primitive string value
SELECT '["foo", "bar"]'::jsonb @> '"bar"'::jsonb; -- true
-- Non-containment: a primitive does not contain an array
SELECT '"bar"'::jsonb @> '["bar"]'::jsonb; -- falseBecause containment is nested, queries can skip explicit sub-object selection. For example, to find documents where a tags array contains sub-objects with both "term":"paris" and "term":"food":
SELECT doc->'site_name' FROM websites
WHERE doc @> '{"tags":[{"term":"paris"}, {"term":"food"}]}';Existence
The ? operator tests whether a string appears as an object key or array element at the top level of a jsonb value. Existence is not nested.
-- String exists as an array element
SELECT '["foo", "bar", "baz"]'::jsonb ? 'bar'; -- true
-- String exists as an object key
SELECT '{"foo": "bar"}'::jsonb ? 'foo'; -- true
-- Object values are not checked
SELECT '{"foo": "bar"}'::jsonb ? 'bar'; -- false
-- Existence matches only at the top level
SELECT '{"foo": {"bar": "baz"}}'::jsonb ? 'bar'; -- false
-- A string matches a primitive JSON string
SELECT '"foo"'::jsonb ? 'foo'; -- trueJSON objects are better suited than arrays for containment and existence tests because they are internally optimized for key lookup and do not require a linear search.
jsonb indexing
Generalized Inverted Index (GIN) indexes efficiently search for keys or key/value pairs across large numbers of jsonb documents. Two operator classes are available with different performance characteristics.
Choose a GIN operator class
jsonb_ops (default) | jsonb_path_ops | |
|---|---|---|
| Supported operators | @>, ?, ?&, ?| | @>, @?, @@ |
| Key-exists operators | Supported | Not supported |
| Index size | Larger | Smaller |
| Search specificity | Lower | Higher |
| Indexed items | Independent item per key and per value | One item per value (hash of value + key path) |
| Empty-value structures | Indexed | Not indexed (requires full scan) |
Use jsonb_ops when queries use ?, ?&, or ?|. Use jsonb_path_ops when queries use only @>, @?, or @@ — it produces a smaller index with better search specificity, especially when queried keys appear frequently in the data.
jsonb_path_ops does not create index entries for JSON structures that contain no values (for example, {"a": {}}). Queries for such structures require a full index scan.
Create a GIN index
Default operator class (`jsonb_ops`):
CREATE INDEX idxgin ON api USING gin (jdoc);`jsonb_path_ops` operator class:
CREATE INDEX idxginp ON api USING gin (jdoc jsonb_path_ops);Expression index on a sub-key:
CREATE INDEX idxgintags ON api USING gin ((jdoc -> 'tags'));Use expression indexes when queries consistently target a specific key. A jdoc -> 'tags' expression index is smaller and faster than a full-document GIN index for queries like jdoc -> 'tags' ? 'qui'.
Query examples
Given a jsonb column jdoc in a table api:
-- Find documents where key "company" has value "Magnafone"
-- Uses the GIN index on jdoc
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"company": "Magnafone"}';
-- Find documents where "tags" contains key or array element "qui"
-- Cannot use a plain GIN index on jdoc; requires an expression index on (jdoc -> 'tags')
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> 'tags' ? 'qui';
-- jsonpath matching with GIN index support
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @? '$.tags[*] ? (@ == "qui")';
-- Containment as an alternative to the existence query above
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qui"]}';GIN indexes also support the @? and @@ operators for jsonpath matching. When processing these operators, a GIN index extracts clauses of the form accessors_chain = constant from the jsonpath pattern and searches based on the keys and values mentioned in those clauses. The accessors chain may include .key, [*], and [index] accessors. The jsonb_ops operator class also supports .* and .** accessors, but the jsonb_path_ops operator class does not.
B-tree and hash indexes
jsonb also supports B-tree and hash indexes, useful primarily for equality checks on complete JSON documents.
The B-tree ordering for jsonb values is:
Object > Array > Boolean > Number > String > Null
Object with n pairs > object with n - 1 pairs
Array with n elements > array with n - 1 elementsObjects with equal numbers of pairs are compared in the order:
key-1, value-1, key-2 ...Note that shorter keys are stored before longer keys, which can produce unintuitive ordering results:
{ "aa": 1, "c": 1} > {"b": 1, "d": 1}Arrays with equal numbers of elements are compared element by element:
element-1, element-2 ...Primitive JSON values are compared using the same rules as the underlying PostgreSQL type. Strings use the default database collation.
Transforms
Additional extensions implement transforms for jsonb for use in procedural languages.
jsonb_plperl is a trusted extension — non-superusers with CREATE privilege on the current database can install it. All other transform extensions require superuser privilege.
jsonpath type
The jsonpath type implements the SQL/JSON path language in PostgreSQL. It stores a parsed binary representation of a path expression, which the path engine uses to retrieve items from JSON data.
Syntax conventions
SQL/JSON path semantics follow SQL, but the syntax uses JavaScript conventions:
.for member access[]for array accessArrays are 0-relative (unlike SQL arrays, which start from 1)
Path expressions are written as SQL string literals enclosed in single quotes. Any single quote within the value must be doubled. Embedded string literals within path expressions follow JavaScript/ECMAScript conventions — they use double quotes, and backslash escapes are supported.
Supported escape sequences: \b, \f, \n, \r, \t, \v for ASCII control characters; \u*NNNN* for a Unicode code point. Two additional sequences not in standard JSON are also supported: \x*NN* (two hex digits) and \u{*N...*} (one to six hex digits).
jsonpath variables
| Variable | Description |
|---|---|
$ | The context item — the JSON value being queried |
$varname | A named variable; its value is set by the vars parameter of JSON processing functions |
@ | The result of path evaluation in filter expressions |
jsonpath accessor operators
| Accessor | Description |
|---|---|
.key / ."$varname" | Member accessor — returns the object member with the specified key. Enclose the key in double quotes if it starts with $ or does not meet JavaScript identifier rules. |
.* | Wildcard member accessor — returns all members at the top level of the current object |
.** | Recursive wildcard member accessor — returns all member values at all nesting levels of the current object. This is a PostgreSQL extension of the SQL/JSON standard. |
.**{level} / .**{start_level to end_level} | Like .**, but restricted to specified nesting levels. Level zero is the current object. Use the last keyword for the deepest level. This is a PostgreSQL extension of the SQL/JSON standard. |
[subscript, ...] | Array element accessor. Use index for a single element or start_index to end_index for a slice. Index zero is the first element. Use last for the last element. |
[*] | Wildcard array element accessor — returns all array elements |
Path expression elements
A path expression is a sequence of the following elements:
JSON primitive literals: Unicode text, numeric,
true,false, ornullPath variables
Accessor operators
jsonpathoperators and methodsParentheses for filter expressions or to define evaluation order