JavaScript Object Notation (JSON) data types are widely used on the Internet and Internet of Things (IoT). For more information about the protocols used for JSON, visit Introducing JSON. AnalyticDB for PostgreSQL supports JSON data types. AnalyticDB for PostgreSQL V6.0 also supports JSON-B data types. This topic describes how to manage JSON and JSON-B data, including:

Differences and similarities between JSON and JSON-B

JSON and JSON-B data types are similar in usage but different in storage methods. JSON data is stored as an exact copy of the input text, while JSON-B data is stored in a binary format. The JSON-B data type is more efficient and processes faster than the JSON data type. Also, the JSON-B data type supports indexing. Therefore, the JSON-B data type is preferred for AnalyticDB for PostgreSQL V6.0.

JSON input and output syntax

For more information, visit RFC 7159.

A JSON value must be an object, array, number, string, or one of the following literal names in lowercase: true, null, and false. The following statements are valid JSON expressions:

-- A simple scalar or value
-- A simple value must be a number, a string enclosed in a pair of quotation marks, or a literal name (true, false, or null).
SELECT '5'::json;

-- An array of zero or more elements (The elements can be of the same type or different types.)
SELECT '[1, 2, "foo", null]'::json;

-- An object that contains key-value pairs
-- The key in a key-value pair of an object must be a string enclosed in a pair of quotation marks.
SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json;

-- An array or object nested in each other
SELECT '{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}'::json;

The preceding JSON values can all be converted into JSON-B values. Example:

-- A simple scalar or value in the JSON-B format
SELECT '5'::jsonb;

JSON operators

The following table describes the operators available for use with JSON and JSON-B data types.

Operator Right operand type Description Example Result
-> int Obtains a JSON array element indexed from zero. '[{"a":"foo"}, {"b":"bar"}, {"c":"baz"}]'::json->2 {"c":"baz"}
-> text Obtains a JSON object field based on a key. '{"a": {"b":"foo"}}'::json->'a' {"b":"foo"}
->> int Obtains a JSON array element as text. '[1,2,3]'::json->>2 3
->> text Obtains a JSON object field as text. '{"a":1,"b":2}'::json->>'b' 2
#> text[] Obtains a JSON object from a specified path. '{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}' {"c": "foo"}
#>> text[] Obtains a JSON object as text from a specified path. '{"a":[1,2,3], "b":[4,5,6]}'::json#>>'{a,2}' 3

The following table describes the operators available for use with JSON-B data types.

JSON-B operators

The following table describes the operators available for use with JSON-B data types.

Operator Right operand type Description Example
= jsonb Specifies whether two JSON values are the same. '[1,2]'::jsonb= '[1,2]'::jsonb
@> jsonb Specifies whether the left JSON value contains the right JSON value. '{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb
<@ jsonb Specifies whether the left JSON value is contained within the right JSON value. '{"b":2}'::jsonb <@ '{"a":1, "b":2}'::jsonb
? text Specifies whether the string exists as a key or as a string within the JSON value '{"a":1, "b":2}'::jsonb ? 'b'
? | text[] Specifies whether any of the right array strings exist as keys or as strings within the JSON value. '{"a":1, "b":2, "c":3}'::jsonb ? | array['b', 'c']
? & text[] Specifies whether all of the right array strings exist as keys or as strings within the JSON value. '["a", "b"]'::jsonb ? & array['a', 'b']

JSON creation functions

The following table describes the functions used to create JSON values.

Function Description Example Result
to_json (anyelement) Returns a value as a valid JSON object. Arrays and composites are converted recursively to arrays and objects. If a cast function is provided, that cast function is invoked to convert the input value into a JSON object. Otherwise, a scalar value is produced. If the scalar value is not a number, Boolean value, or null, it is represented by JSON text with quotation marks and escape characters that are used to make it a valid JSON string. to_json ('Fred said "Hi."'::text) "Fred said \"Hi. \""
array_to_json (anyarray [, pretty_bool]) Returns an array as a JSON array. If you enter a multidimensional array, a JSON array of arrays is returned.
Note If the value of the pretty_bool parameter is true, line feeds are added between dimension-1 elements.
array_to_json ('{{1,5},{99,100}}'::int[]) [[1,5],[99,100]]
row_to_json (record [, pretty_bool]) Returns the row as a JSON object.
Note If the value of the pretty_bool parameter is true, line feeds are added between dimension-1 elements.
row_to_json (row(1,'foo')) {"f1":1,"f2":"foo"}

JSON processing functions

The following table describes the functions used to process JSON values.

Function Return value type Description Example Return result example
json_each(json) set of key text, value json set of key text, value jsonb Expands the outermost JSON object into a set of key-value pairs. select * from json_each('{"a":"foo", "b":"bar"}')
 key | value
-----+-------
 a   | "foo"
 b   | "bar"
json_each_text(json) set of key text, value text Expands the outermost JSON object into a set of key-value pairs. The return values are of the TEXT type. select * from json_each_text('{"a":"foo", "b":"bar"}')
 key | value
-----+-------
 a   | foo
 b   | bar
json_extract_path(from_json json, VARIADIC path_elems text[]) json Returns the JSON value specified by the path_elems parameter. This function is equivalent to the #> operator. json_extract_path('{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}','f4')
{"f5":99,"f6":"foo"}
json_extract_path_text(from_json json, VARIADIC path_elems text[]) text Returns the JSON value specified by the path_elems parameter as JSON text. This function is equivalent to the #>> operator. json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}','f4', 'f6')
foo
json_object_keys(json) setof text Returns a set of keys in the outermost JSON object. json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}')
 json_object_keys
------------------
 f1
 f2
json_populate_record(base anyelement, from_json json) anyelement Expands the object in the from_json parameter into a row with columns that match the record type defined by the base parameter. select * from json_populate_record(null::myrowtype, '{"a":1,"b":2}')
 a | b
---+---
 1 | 2
json_populate_recordset(base anyelement, from_json json) set of anyelement Expands the outermost array of objects in the from_json parameter into a set of rows with columns that match the record type defined by the base parameter. select * from json_populate_recordset(null::myrowtype, '[{"a":1,"b":2},{"a":3,"b":4}]')
 a | b
---+---
 1 | 2
 3 | 4
json_array_elements(json) set of json Expands a JSON array into a set of JSON values. select * from json_array_elements('[1,true, [2,false]]')
   value
-----------
 1
 true
 [2,false]

Create a JSON-B function index

You can create GIN and B-tree indexes on JSON-B columns. You can execute the following statements to create a GIN index on a JSON-B column:

CREATE INDEX idx_name ON table_name USING gin (idx_col);
CREATE INDEX idx_name ON table_name USING gin (idx_col jsonb_path_ops);
Note You can use one of the following operators to create a GIN index on a JSON-B column: the default jsonb_ops operator and the jsonb_path_ops operator. The difference between a jsonb_ops and a jsonb_path_ops GIN index is that the former creates independent index items for each key and value in the data, while the latter creates index items only for each value in the data.

Examples

Create a table.

create table tj(id serial, ary int[], obj json, num integer);
=> insert into tj(ary, obj, num) values('{1,5}'::int[], '{"obj":1}', 5);
INSERT 0 1
=> select row_to_json(q) from (select id, ary, obj, num from tj) as q;
                row_to_json
-------------------------------------------
 {"f1":1,"f2":[1,5],"f3":{"obj":1},"f4":5}
(1 row)
=> insert into tj(ary, obj, num) values('{2,5}'::int[], '{"obj":2}', 5);
INSERT 0 1
=> select row_to_json(q) from (select id, ary, obj, num from tj) as q;
                row_to_json
-------------------------------------------
 {"f1":1,"f2":[1,5],"f3":{"obj":1},"f4":5}
 {"f1":2,"f2":[2,5],"f3":{"obj":2},"f4":5}
(2 rows)
Note JSON data cannot be used as partition keys and does not support JSON aggregate functions.

Join multiple tables.

create table tj2(id serial, ary int[], obj json, num integer);
=> insert into tj2(ary, obj, num) values('{2,5}'::int[], '{"obj":2}', 5);
INSERT 0 1
=> select * from tj, tj2 where tj.obj->>'obj' = tj2.obj->>'obj';
 id |  ary  |    obj    | num | id |  ary  |    obj    | num
----+-------+-----------+-----+----+-------+-----------+-----
  2 | {2,5} | {"obj":2} |   5 |  1 | {2,5} | {"obj":2} |   5
(1 row)
=> select * from tj, tj2 where json_object_field_text(tj.obj, 'obj') = json_object_field_text(tj2.obj, 'obj');
 id |  ary  |    obj    | num | id |  ary  |    obj    | num
----+-------+-----------+-----+----+-------+-----------+-----
  2 | {2,5} | {"obj":2} |   5 |  1 | {2,5} | {"obj":2} |   5
(1 row)

Create a JSON function index.

CREATE TEMP TABLE test_json (
       json_type text,
       obj json
);
=> insert into test_json values('aa', '{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}');
INSERT 0 1
=> insert into test_json values('cc', '{"f7":{"f3":1},"f8":{"f5":99,"f6":"foo"}}');
INSERT 0 1
=> select obj->'f2' from test_json where json_type = 'aa';
 ? column?
----------
 {"f3":1}
(1 row)
=> create index i on test_json (json_extract_path_text(obj, '{f4}'));
CREATE INDEX
=> select * from test_json where json_extract_path_text(obj, '{f4}') = '{"f5":99,"f6":"foo"}';
 json_type |                 obj
-----------+-------------------------------------------
 aa        | {"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}
(1 row)

Create a JSON-B function index.

-- Create a test table and generate data.
CREATE TABLE jtest1 (
    id int,
    jdoc json
);

CREATE OR REPLACE FUNCTION random_string(INTEGER)
RETURNS TEXT AS
$BODY$
SELECT array_to_string(
    ARRAY (
        SELECT substring(
            '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
            FROM (ceil(random()*62))::int FOR 1
        )
        FROM generate_series(1, $1)
    ),
    ''
)
$BODY$
LANGUAGE sql VOLATILE;

insert into jtest1 select t.seq, ('{"a":{"a1":"a1a1", "a2":"a2a2"}, 
"name":"'||random_string(10)||'","b":"bbbbb"}')::json from
generate_series(1, 10000000) as t(seq);

CREATE TABLE jtest2 (
    id int,
    jdoc jsonb
);

CREATE TABLE jtest3 (
    id int,
    jdoc jsonb
);

insert into jtest2 select id, jdoc::jsonb from jtest1;
insert into jtest3 select id, jdoc::jsonb from jtest1;

-- Create an index.
CREATE INDEX idx_jtest2 ON jtest2 USING gin(jdoc);
CREATE INDEX idx_jtest3 ON jtest3 USING gin(jdoc jsonb_path_ops);

-- Execute a query plan without an index.
EXPLAIN ANALYZE SELECT * FROM jtest1 where jdoc @> '{"name":"N9WP5txmVu"}';
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..162065.73 rows=10100 width=88) (actual time=1343.248..1777.605 rows=1 loops=1)
   ->  Seq Scan on jtest2  (cost=0.00..162065.73 rows=5050 width=88) (actual time=0.042..1342.426 rows=1 loops=1)
         Filter: (jdoc @> '{"name": "N9WP5txmVu"}'::jsonb)
 Planning time: 0.172 ms
   (slice0)    Executor memory: 59K bytes.
   (slice1)    Executor memory: 91K bytes avg x 2 workers, 91K bytes max (seg0).
 Memory used:  2047000kB
 Optimizer: Postgres query optimizer
 Execution time: 1778.234 ms
(9 rows)

-- Execute a query plan by using the jsonb_ops operator.
EXPLAIN ANALYZE SELECT * FROM jtest2 where jdoc @> '{"name":"N9WP5txmVu"}';
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 2:1  (slice1; segments: 2)  (cost=88.27..13517.81 rows=10100 width=88) (actual time=0.655..0.659 rows=1 loops=1)
   ->  Bitmap Heap Scan on jtest2  (cost=88.27..13517.81 rows=5050 width=88) (actual time=0.171..0.172 rows=1 loops=1)
         Recheck Cond: (jdoc @> '{"name": "N9WP5txmVu"}'::jsonb)
         ->  Bitmap Index Scan on idx_jtest2  (cost=0.00..85.75 rows=5050 width=0) (actual time=0.217..0.217 rows=1 loops=1)
               Index Cond: (jdoc @> '{"name": "N9WP5txmVu"}'::jsonb)
 Planning time: 0.151 ms
   (slice0)    Executor memory: 69K bytes.
   (slice1)    Executor memory: 628K bytes avg x 2 workers, 632K bytes max (seg1).  Work_mem: 9K bytes max.
 Memory used:  2047000kB
 Optimizer: Postgres query optimizer
 Execution time: 1.266 ms
(11 rows)

-- Execute the query plan by using the jsonb_path_ops operator.
EXPLAIN ANALYZE SELECT * FROM jtest3 where jdoc @> '{"name":"N9WP5txmVu"}';
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 2:1  (slice1; segments: 2)  (cost=84.28..13513.81 rows=10101 width=88) (actual time=0.710..0.711 rows=1 loops=1)
   ->  Bitmap Heap Scan on jtest3  (cost=84.28..13513.81 rows=5051 width=88) (actual time=0.179..0.181 rows=1 loops=1)
         Recheck Cond: (jdoc @> '{"name": "N9WP5txmVu"}'::jsonb)
         ->  Bitmap Index Scan on idx_jtest3  (cost=0.00..81.75 rows=5051 width=0) (actual time=0.106..0.106 rows=1 loops=1)
               Index Cond: (jdoc @> '{"name": "N9WP5txmVu"}'::jsonb)
 Planning time: 0.144 ms
   (slice0)    Executor memory: 69K bytes.
   (slice1)    Executor memory: 305K bytes avg x 2 workers, 309K bytes max (seg1).  Work_mem: 9K bytes max.
 Memory used:  2047000kB
 Optimizer: Postgres query optimizer
 Execution time: 1.291 ms
(11 rows)

			

The following example shows how to use Python to access a database:

#! /bin/env python
import time
import json
import psycopg2
def gpquery(sql):
    conn = None
    try:
        conn = psycopg2.connect("dbname=sanity1x2")
        conn.autocommit = True
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    except Exception as e:
        if conn:
            try:
                conn.close()
            except:
                pass
            time.sleep(10)
        print e
    return None
def main():
    sql = "select obj from tj;"
    #rows = Connection(host, port, user, pwd, dbname).query(sql)
    rows = gpquery(sql)
    for row in rows:
        print json.loads(row[0])
if __name__ == "__main__":
    main()