AnalyticDB for PostgreSQL provides a graph analysis engine as an extension, which lets you query and manipulate graph data efficiently by using the Cypher query language. Graph analysis engines process highly interconnected datasets and are widely used in scenarios such as social networks, fraud detection, recommendation engines, knowledge graphs, and network and IT O&M. For example, in a typical social network, graph queries help you quickly analyze complex social relationships, which improves query efficiency and flexibility.
What is a graph?
The basic components of a graph are nodes and edges. Nodes represent objects or entities, and edges describe the relationships between nodes.
As shown in the preceding figure, the nodes on the left side represent various entities, including Product A, Company A, Employee A, Employee E, and Employee E's Friend A. Edges represent the relationships between these nodes, such as employment between companies and employees, friendship between people, and R&D relationships between products and employees. Properties can be attached to nodes and edges to carry additional information. For example, you can add employee numbers to nodes or record employment start times on edges. A graph with properties is called a property graph.
Graphs describe entities and their relationships in an abstract yet intuitive manner. Graph models support multiple node types, edge types, and properties, which makes them expressive and well suited to highly interconnected data.
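The property graph described above can be sketched in a few lines of Python. This is only an in-memory illustration, not the API of the graph analysis engine: the `Node` and `Edge` classes, the edge label names (`EMPLOYS`, `FRIEND_OF`, `DEVELOPS`), the edge directions, and all property values such as employee numbers and start dates are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                       # entity type, for example "Person"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str                      # node_id of the start node
    target: str                      # node_id of the end node
    label: str                       # relationship type
    properties: dict = field(default_factory=dict)


# Entities from the example; the property values are invented.
nodes = [
    Node("company_a", "Company", {"name": "Company A"}),
    Node("employee_a", "Person", {"name": "Employee A", "employee_number": 1001}),
    Node("employee_e", "Person", {"name": "Employee E", "employee_number": 1005}),
    Node("friend_a", "Person", {"name": "Friend A"}),
    Node("product_a", "Product", {"name": "Product A"}),
]

# Relationships; only the employment edges carry a property here.
edges = [
    Edge("company_a", "employee_a", "EMPLOYS", {"start_date": "2021-03-01"}),
    Edge("company_a", "employee_e", "EMPLOYS", {"start_date": "2022-07-15"}),
    Edge("employee_e", "friend_a", "FRIEND_OF"),
    Edge("employee_a", "product_a", "DEVELOPS"),
]

by_id = {node.node_id: node for node in nodes}
for edge in edges:
    print(
        by_id[edge.source].properties["name"],
        edge.label,
        by_id[edge.target].properties["name"],
        edge.properties,
    )
```

Both nodes and edges hold a free-form `properties` dictionary, which is what distinguishes a property graph from a plain graph of bare nodes and edges.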
Graph analysis engine elements
Element | Description |
Graph | A graph is a data structure composed of nodes and edges, which represent entities and their relationships, respectively. A typical example of a graph structure is a social network. Each person can be represented as a node, whereas relationships between individuals, such as friends, family members, and colleagues, are represented by edges. |
Node | Nodes are core elements of a graph analysis engine that represent entities in a database. Labels can be attached to each node to store entity information. For example, nodes in a social network represent users, companies, or organizations, whereas labels include information such as a user's age, a company's name, or an organization's address. |
Edge | Edges connect nodes and can describe relationship characteristics, such as weight and direction, by using labels. For example, edges in a social network represent the relationships between users, such as following, friendship, or fan relationships. Weight reflects the strength of a relationship, such as the interaction frequency. Direction indicates which way a relationship points, such as a "following" relationship from one user to another. |
Label | Labels are a type of property used to categorize and identify nodes or edges to provide semantic information to data and improve query efficiency and understanding. For example, node labels in a social network are used to distinguish different entity types, such as Person or Company, whereas edge labels describe the nature of relationships, such as Knows or Works In. |
Property graph | A graph whose nodes or edges carry labels or other properties is called a property graph. |
Pattern | In the Cypher language, queries are centered around patterns. Patterns are defined to match specific graph structures. When structures that match a pattern are found or created, the results can be used for further data processing or analysis. |
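To make the idea of a pattern concrete, the following minimal sketch matches the structure "a Person node connected to a Company node by a Works In edge", which Cypher would write as `(p:Person)-[:WORKS_IN]->(c:Company)`. The code is a plain Python illustration of what matching means, not how the engine executes queries; the node names, the `WORKS_IN` and `KNOWS` label spellings, and the `match_pattern` helper are hypothetical.

```python
# Node labels keyed by node name (sample data, invented for the example).
nodes = {
    "alice": "Person",
    "bob": "Person",
    "acme": "Company",
}

# Directed edges as (source, edge_label, target).
edges = [
    ("alice", "WORKS_IN", "acme"),
    ("bob", "WORKS_IN", "acme"),
    ("alice", "KNOWS", "bob"),
]


def match_pattern(src_label, edge_label, dst_label):
    """Return (source, target) pairs whose labels fit the pattern
    (src_label)-[edge_label]->(dst_label)."""
    return [
        (source, target)
        for source, label, target in edges
        if label == edge_label
        and nodes[source] == src_label
        and nodes[target] == dst_label
    ]


matches = match_pattern("Person", "WORKS_IN", "Company")
print(matches)  # [('alice', 'acme'), ('bob', 'acme')]
```

Each returned pair is one subgraph that fits the pattern; a query would then filter, aggregate, or return these matches for further processing.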
Advantages
Compared with the table schema of traditional relational databases, the graph structure of AnalyticDB for PostgreSQL is more flexible in data modeling and operations. It organizes data as nodes and edges and centers data access and operations on nodes. It supports create, read, update, and delete operations on graph data.
For example, in graph data operations, a node can quickly reach its directly connected adjacent nodes by following its outgoing edges. This node-and-edge approach expresses complex relationships between entities intuitively and handles highly interconnected data efficiently.
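The access pattern described above, reaching adjacent nodes through a node's outgoing edges, can be sketched with an adjacency index. This is an illustrative Python model under assumed data, not a description of the engine's storage layout; the edge labels and the `neighbors` helper are hypothetical.

```python
from collections import defaultdict

# Directed edges as (source, edge_label, target); sample data only.
edges = [
    ("employee_e", "FRIEND_OF", "friend_a"),
    ("employee_e", "WORKS_IN", "company_a"),
    ("employee_a", "WORKS_IN", "company_a"),
    ("employee_a", "DEVELOPS", "product_a"),
]

# Index every edge under its source node so that a lookup touches only
# the edges that leave that node.
outgoing = defaultdict(list)
for source, label, target in edges:
    outgoing[source].append((label, target))


def neighbors(node_id, edge_label=None):
    """Return nodes reachable over one outgoing edge, optionally
    restricted to a single edge label."""
    return [
        target
        for label, target in outgoing.get(node_id, [])
        if edge_label is None or label == edge_label
    ]


print(neighbors("employee_e"))              # ['friend_a', 'company_a']
print(neighbors("employee_e", "WORKS_IN"))  # ['company_a']
```

Because the lookup starts from a node and reads only its own edge list, the work depends on how many edges that node has rather than on the total number of edges in the graph.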
Difference | Graph analysis engine | Relational database |
Focus | Focuses on data entities and their relationships. | Focuses on how data is classified and stored. Tables must be predefined with schemas. |
Flexibility | Provides high flexibility. When data changes significantly, you only need to add nodes, edges, and properties and assign them the corresponding types. | Provides low flexibility. When data changes significantly, you must adjust table schemas or create additional tables, and these schema modifications are costly. |
Intuitiveness and complexity | Using graphs to express real-world relationships is more direct and natural. Data analysis and queries can be directly performed based on the topologies of nodes and edges. The required data can be quickly located by using intuitive connection relationships, simplifying the processing of complex relationships. | You must create entity tables and relationship tables, and then associate data by using complex mappings. This process requires a high level of abstract thinking. |
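The relational side of this comparison can be illustrated with a small sketch that uses Python's built-in `sqlite3` module as a stand-in for a relational database. The table names, column names, and rows are assumptions made for the example. Answering "who works in which company" requires two entity tables, one relationship table, and a two-way join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two entity tables plus one relationship table that maps between them.
conn.executescript(
    """
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE works_in (
        person_id  INTEGER REFERENCES person(id),
        company_id INTEGER REFERENCES company(id)
    );
    INSERT INTO person  VALUES (1, 'Employee A'), (2, 'Employee E');
    INSERT INTO company VALUES (10, 'Company A');
    INSERT INTO works_in VALUES (1, 10), (2, 10);
    """
)

# The relationship is recovered by joining through the mapping table.
rows = conn.execute(
    """
    SELECT p.name, c.name
    FROM person AS p
    JOIN works_in AS w ON w.person_id = p.id
    JOIN company  AS c ON c.id = w.company_id
    ORDER BY p.name
    """
).fetchall()

print(rows)  # [('Employee A', 'Company A'), ('Employee E', 'Company A')]
conn.close()
```

In a graph model, the same question is a single pattern over Person nodes, Works In edges, and Company nodes, with no separate mapping table to define or join.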