In the game industry, application code commonly serializes data using Protobuf — sometimes with additional zlib compression — and writes it into BLOB columns. Reading that data through SQL normally requires external ETL components or deserializing it in the application layer, which is inconvenient for software debugging and development. The readable Protobuf feature in PolarDB for MySQL eliminates both requirements. Use the PROTO_TO_JSON() function to convert BLOB data to JSON inline, then use standard MySQL JSON functions (JSON_EXTRACT(), JSON_UNQUOTE()) to filter, index, or create virtual columns on top of it — all within SQL.
Prerequisites
Before you begin, make sure you have:
A PolarDB for MySQL 8.0 cluster running revision version 8.0.2.2.5 or later. To check your version, see Query the engine version.
How it works
PROTO_TO_JSON(blob_field) takes a BLOB column that holds Protobuf-serialized data and returns a JSON string. The function handles both raw Protobuf bytes and zlib-compressed Protobuf bytes automatically — no change to your query is needed between the two cases.
After converting BLOB data to JSON, use standard MySQL JSON functions to query fields, build indexes, or generate virtual columns on the Protobuf data.
Configure a Protobuf schema for a column
Before calling PROTO_TO_JSON(), associate a Protobuf schema with the BLOB column using ALTER TABLE ... ALTER COLUMN. This step tells PolarDB for MySQL how to interpret the binary data in that column.
Syntax:
ALTER TABLE table_name ALTER COLUMN column_name
[PROTO_NAME = 'protobuf_schema_name']
PROTO_TEXT = 'protobuf_schema_definition'
PROTO_MESSAGE = 'protobuf_message'
[COMPRESSION = 'zlib']Parameters:
| Parameter | Required | Description |
|---|---|---|
PROTO_NAME | No | A name for the Protobuf schema. |
PROTO_TEXT | Yes | The full Protobuf schema definition. |
PROTO_MESSAGE | Yes | The top-level Protobuf message type to deserialize. |
COMPRESSION | No | Set to zlib if the data was compressed with zlib before being written to the column. Omit this parameter for uncompressed data. |
Zlib-compressed data can also be decompressed manually using UNCOMPRESS(), which returns the raw bytes as hexadecimal.Remove a Protobuf schema from a column
To remove the schema, set all schema parameters to empty strings:
ALTER TABLE table_name ALTER COLUMN column_name
PROTO_NAME=""
PROTO_TEXT=""
PROTO_MESSAGE='';Before removing the schema, make sure the column is no longer referenced by any indexes or virtual columns.
View the Protobuf schema for a column
Enable schema display for the session:
SET display_readable_proto_info = true;Show the column definitions:
SHOW columns FROM table_name;
Examples
The following examples use the addressbook.proto schema from the Protobuf community to walk through a complete workflow: create a table, configure the schema, insert data, and query it.
Schema used in all examples:
syntax = "proto2";
package tutorial;
message Person {
optional string name = 1;
optional int32 id = 2;
optional string email = 3;
enum PhoneType {
MOBILE = 0;
HOME = 1;
WORK = 2;
}
message PhoneNumber {
optional string number = 1;
optional PhoneType type = 2 [default = HOME];
}
repeated PhoneNumber phones = 4;
}
message AddressBook {
repeated Person people = 1;
}Step 1: Create the table
CREATE TABLE t1 (c1 INT, c2 BLOB);c2 holds Protobuf-serialized data.
Step 2: Attach the schema to the column
Choose the statement that matches whether your data is compressed.
Without zlib compression:
ALTER TABLE t1 ALTER COLUMN c2
PROTO_NAME="AddressBook"
PROTO_TEXT="syntax = \"proto2\";\n\npackage tutorial;\n\nmessage Person {\n optional string name = 1;\n optional int32 id = 2;\n optional string email = 3;\n\n enum PhoneType {\n MOBILE = 0;\n HOME = 1;\n WORK = 2;\n }\n\n message PhoneNumber {\n optional string number = 1;\n optional PhoneType type = 2 [default = HOME];\n }\n\n repeated PhoneNumber phones = 4;\n}\n\nmessage AddressBook {\n repeated Person people = 1;\n}"
PROTO_MESSAGE='AddressBook';With zlib compression:
ALTER TABLE t1 ALTER COLUMN c2
PROTO_NAME="AddressBook"
PROTO_TEXT="syntax = \"proto2\";\n\npackage tutorial;\n\nmessage Person {\n optional string name = 1;\n optional int32 id = 2;\n optional string email = 3;\n\n enum PhoneType {\n MOBILE = 0;\n HOME = 1;\n WORK = 2;\n }\n\n message PhoneNumber {\n optional string number = 1;\n optional PhoneType type = 2 [default = HOME];\n }\n\n repeated PhoneNumber phones = 4;\n}\n\nmessage AddressBook {\n repeated Person people = 1;\n}"
PROTO_MESSAGE='AddressBook'
COMPRESSION='zlib';Step 3: Insert data
Without zlib compression:
INSERT INTO t1 VALUES(1, X'0a380a0b56697375616c50726f746f10011a1776697375616c70726f746f40706f6c617264622e636f6d220e0a0a313233343536373839301002');With zlib compression:
INSERT INTO t1 VALUES(1, X'3C000000785ee3b2e0e20ecb2c2e4dcc0928ca2fc9176094122f03730b405c8782fc9cc4a29424bde4fc5c253e2e2e432363135333730b4b03012600183d10de');To verify the compressed bytes, decompress them with UNCOMPRESS():
SELECT HEX(UNCOMPRESS(X'3C000000785ee3b2e0e20ecb2c2e4dcc0928ca2fc9176094122f03730b405c8782fc9cc4a29424bde4fc5c253e2e2e432363135333730b4b03012600183d10de')) AS UNCOMPRESS_DATA;Result:
+----------------------------------------------------------------------------------------------------------------------+
| UNCOMPRESS_DATA |
+----------------------------------------------------------------------------------------------------------------------+
| 0A380A0B56697375616C50726F746F10011A1776697375616C70726F746F40706F6C617264622E636F6D220E0A0A313233343536373839301002 |
+----------------------------------------------------------------------------------------------------------------------+Step 4: Read and query the data
Read BLOB data without PROTO_TO_JSON()
Without the readable Protobuf feature, querying the c2 column directly returns binary data that cannot be interpreted:
Without zlib compression:
SELECT c2 FROM t1\GResult:
*************************** 1. row ***************************
c2:
8
VisualProtovisualproto@polardb.com"
1234567890With zlib compression:
SELECT c2 FROM t1\GResult:
*************************** 1. row ***************************
c2: < x^...The compressed output is binary garbage that cannot be directly read.
Read BLOB data as JSON
PROTO_TO_JSON() works for both compressed and uncompressed data — no change needed in your query:
SELECT PROTO_TO_JSON(c2) FROM t1;Result:
+------------------------------------------------------------------------------------------------------------------------------------------+
| PROTO_TO_JSON(c2) |
+------------------------------------------------------------------------------------------------------------------------------------------+
| {"people": [{"id": 1, "name": "VisualProto", "email": "visualproto@polardb.com", "phones": [{"type": "WORK", "number": "1234567890"}]}]} |
+------------------------------------------------------------------------------------------------------------------------------------------+PROTO_TO_JSON() can read both zlib-compressed and non-compressed data.Extract a specific field
Use JSON_EXTRACT() to pull out individual fields from the JSON output:
SELECT JSON_EXTRACT(PROTO_TO_JSON(c2), '$.people[0].name') FROM t1;Result:
+-----------------------------------------------------+
| json_extract(PROTO_TO_JSON(c2), '$.people[0].name') |
+-----------------------------------------------------+
| "VisualProto" |
+-----------------------------------------------------+Create an index on a Protobuf field
Build a functional index on the email field:
CREATE INDEX i_email ON t1 ((CAST(JSON_UNQUOTE(JSON_EXTRACT(PROTO_TO_JSON(c2), '$.people[0].email')) AS CHAR(100))));Verify the index is used by running EXPLAIN:
EXPLAIN SELECT * FROM t1 WHERE CAST(JSON_UNQUOTE(JSON_EXTRACT(PROTO_TO_JSON(c2), '$.people[0].email')) AS CHAR(100)) = 'visualproto@polardb.com';Result:
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | t1 | NULL | ref | i_email | i_email | 403 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+Create a virtual column
Generate a virtual column that exposes a Protobuf field as a regular column:
ALTER TABLE t1 ADD COLUMN c3 VARCHAR(100) AS (JSON_EXTRACT(PROTO_TO_JSON(`c2`), _utf8mb4'$.people[0].email'));Verify the schema:
DESC t1;Result:
+-------+--------------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------------------+
| c1 | int(11) | YES | | NULL | |
| c2 | blob | YES | | NULL | |
| c3 | varchar(100) | YES | | NULL | VIRTUAL GENERATED |
+-------+--------------+------+-----+---------+-------------------+c3 is a virtual generated column backed by the Protobuf data in c2.