Founded by the Linux Foundation in December 2015, the Hyperledger Project is an open-source blockchain incubation project dedicated to the collaborative development of blockchain-based distributed ledgers. Its Fabric project aims to build a platform that provides distributed ledger solutions.
Status quo of blockchain
Bitcoin can be called the first practical blockchain project, but most introductory material on blockchain is terse and sweeping. When blockchain comes up, the familiar phrases are "tamper-proof data", "public chain", "distributed data", "consensus mechanism", and so on. The sad thing for the technology itself is that most Bitcoin users are not there because they believe in the reliability of the technology, but because the platform has a large number of users and a large transaction volume. Here I would like to briefly describe my understanding of blockchain and the application scenarios it suits, so that readers can more easily understand what blockchain is and what it can do, and then explain how blockchain technically solves problems in real business.
Business expectation: the trust problem
First of all, let's start with Bitcoin. Most people are familiar with Bitcoin's proof of work (PoW), not to mention the large amount of resources it consumes. From the perspective of the consensus mechanism, whoever holds more than 50% of the computing power can control the entire Bitcoin network, which makes it a highly risky mechanism in both technical and business terms. Yet in the magical world of finance nobody touches this bottom line, because once someone holds more than 50% of the computing power, Bitcoin may no longer be playable :)
What are the actual business needs? Take bank clearing and settlement as an example. When a cross-bank transaction occurs in a traditional banking system, a clearing system has to reconcile the transaction data periodically to ensure that both parties' records are synchronized and correct, so a cross-bank transfer may take T+1 to arrive. The main reason is that the two parties' systems and data are independent of each other and neither side's data is "trusted" by the other, so a clearing system is needed to verify the transaction data. Blockchain can be seen as a technology created to solve this "trust" problem: the data is authenticated while the two parties transact, so real-time transfers and similar functions become possible without settlement after the fact. WeBank (Weizhong Bank), among others, has released its own blockchain solution; I have not studied WeBank's open-source blockchain code in depth and will not comment on it. Below, we use more business scenarios as examples to describe what technical measures are needed to solve the "trust" problem.
What are the technical features required to achieve data trustworthiness?
Taking the clearing system above as an example, some may suggest that sharing one distributed database would achieve real-time data synchronization. Indeed, being distributed is one of the features of blockchain, but how does it differ from a traditional distributed database? Back to the business: if the banks on both sides used such a distributed database for their transactions, wouldn't each worry that the other side might secretly change the data? If you answer that logs can trace modification records, the logs themselves can easily be modified first, and once that happens the loss cannot be recovered and may run to millions or even more than 100 million yuan. Traditional distributed databases are therefore "untrusted" from an enterprise's point of view. So a blockchain needs a series of features: data cannot be tampered with, users are authenticated, transactions are traceable, and participation is controlled by permissions.
Analysis of technical principles
Some of the blockchain features mentioned above complement each other and cannot be cleanly separated by function. So how should these features be implemented? Next, we elaborate on some specific implementations in Fabric.
Tamper-proof data starts with the data structure, which is the first reason a blockchain is called a blockchain. As shown in the following figure, each storage unit contains the hash value of the previous storage unit (the mapping between hash values in the figure is not completely accurate and is only illustrative) together with its own block of transaction data. It looks as if all the data blocks were linked together, hence the name "block chain", and the result is a chain-like transaction record that can be traced. The data in this chain structure is called the Ledger data and stores all transaction records. In addition, there is a "world state", which is essentially a key-value database that maintains the final state of the transaction data so that queries and similar operations are easy, and each entry has a corresponding version number.
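The chain structure and the world state described above can be sketched in a few lines of Python. This is only an illustration, not Fabric's actual storage format: the block layout, the function names, and the simple integer version counter are all assumptions made for the example.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's contents (previous hash plus transactions) deterministically."""
    payload = json.dumps(
        {"prev_hash": block["prev_hash"], "transactions": block["transactions"]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a block that stores the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev_hash, "transactions": transactions})


def chain_is_valid(chain):
    """Every block must reference the actual hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


def build_world_state(chain):
    """Replay the ledger into a key-value store of (value, version) pairs."""
    state = {}
    for block in chain:
        for tx in block["transactions"]:
            for key, value in tx.items():
                _, version = state.get(key, (None, 0))
                # Hypothetical versioning: a counter bumped on every write.
                state[key] = (value, version + 1)
    return state


ledger = []
append_block(ledger, [{"A": 100, "B": 0}])
append_block(ledger, [{"A": 0, "B": 100}])
append_block(ledger, [{"B": 60, "C": 40}])
```

Changing the transactions in an earlier block changes that block's hash, so the next block's stored hash no longer matches and `chain_is_valid` returns False. The world state is derived data: it can always be rebuilt by replaying the ledger.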
Main modules of Fabric
In general, the various blockchain implementations on the market are based on this data structure. However, the data structure alone cannot guarantee that data is tamper-proof. Another important factor is the consensus mechanism. A good consensus mechanism is the foundation on which the whole business runs; it is equivalent to the contract or agreement signed by both parties, and only when both parties abide by the agreement can they cooperate on the business. Common examples such as PoW, PoS, and PBFT are all consensus mechanisms, and their principles and drawbacks are not described in detail here. We mainly explain Fabric's design and principles in detail, starting with the concepts behind some specific terms.
First, the concept of the "smart contract". Consider a traditional centralized system: Alipay user A transfers 100 yuan to user B, where A starts with 100 yuan and B with 0 yuan. Calling the transfer function in Alipay might go like this: transfer(A,B,100) reads the account balances of users A and B and then writes the new balances, which we can express as input(A,B,100), read(A:100,B:0), write(A:0,B:100). Here the transfer is complete as soon as it has been executed inside the Alipay system. So how does this become a contract? When A and B sign a contract, both parties need to sign and approve it. In program terms, the equivalent is: user A executes transfer(A,B,100) locally, obtains input(A,B,100), read(A:100,B:0), write(A:0,B:100), and signs the result. User B likewise executes transfer(A,B,100) locally, obtains input(A,B,100), read(A:100,B:0), write(A:0,B:100), and signs the result. Both parties then send their results to each other, check that the other party's result is consistent with their own, and verify that the signature is correct. Once the contract is reached, each party writes the result to its local store. Generally speaking, a piece of core code is extracted, all participants execute it, and the results are signed and compared; this is called executing a smart contract, and the shared code is the "contract".
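The transfer example can be written down as a minimal Python sketch. The `transfer` function and the dictionary layout are my own assumptions for illustration, and the signing step is left out to keep the example short: each party simulates the contract against its local data, and the write is applied only when both results agree.

```python
def transfer(state, sender, receiver, amount):
    """Simulate the contract without modifying state; return the read and write sets."""
    read_set = {sender: state[sender], receiver: state[receiver]}
    if read_set[sender] < amount:
        raise ValueError("insufficient balance")
    write_set = {
        sender: read_set[sender] - amount,
        receiver: read_set[receiver] + amount,
    }
    return {"input": (sender, receiver, amount), "read": read_set, "write": write_set}


# Each party keeps its own local copy of the data.
state_a = {"A": 100, "B": 0}
state_b = {"A": 100, "B": 0}

result_a = transfer(state_a, "A", "B", 100)  # executed locally by user A
result_b = transfer(state_b, "A", "B", 100)  # executed locally by user B

# The contract is reached only if both parties computed the same result.
if result_a == result_b:
    state_a.update(result_a["write"])
    state_b.update(result_b["write"])
```

The point of separating simulation from writing is that nothing is committed until the results have been compared, which is what turns a plain function call into an agreement between the parties.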
Next, the concept of the "endorsement policy". Given the shared code above, do all users have to execute it for every transaction? For example, when users A and B transfer money, users C and D obviously do not need to take part in executing the contract. The endorsement policy specifies which members must sign and endorse the result of a smart contract for the transaction to be considered successful.
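As a rough sketch, an endorsement policy can be modeled as the set of members whose matching results are required. The function name `satisfies_policy` and the "all listed members must agree" rule are assumptions for this example; real policies can express richer combinations than this.

```python
def satisfies_policy(required_endorsers, endorsements):
    """Return True if every required member endorsed, and all endorsed the same result."""
    results = []
    for member in required_endorsers:
        if member not in endorsements:
            return False  # a required endorsement is missing
        results.append(endorsements[member])
    return all(result == results[0] for result in results)


policy = {"A", "B"}  # C and D are not required for a transfer between A and B
write = {"A": 0, "B": 100}

complete = satisfies_policy(policy, {"A": write, "B": write})
missing = satisfies_policy(policy, {"A": write})
conflicting = satisfies_policy(policy, {"A": write, "B": {"A": 100, "B": 0}})
```

Here only the first call succeeds: the second lacks B's endorsement, and in the third the two members disagree about the result.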
- Peer node
*** This node represents a participant in transactions; it can be said to stand for each member that takes part in the chain. It stores the complete Ledger data, that is, the blockchain data, and executes smart contracts during the consensus phase. All Peer nodes maintain the complete Ledger data and in that role are called Committers, while the endorsement policy, defined according to the specific business, determines which Peers act as endorsers.
- Orderer node
*** This node collects transaction requests, sorts them, and packages them to produce new blocks. Its main function is to order transactions so that data on the Peer nodes stays consistent. It also includes an ACL for access control.
- CA node
*** This node is responsible for authorizing and authenticating all nodes that join the chain, including the upper-layer client. Each node has its own certificate, which identifies it during transactions.
*** Fabric provides an SDK for the client so that developers can more easily connect to the blockchain and take part in transactions. Transactions are initiated through the SDK.
Application in supply chain finance
The functions of each module are briefly described above; in practice they include more supporting functions. Next, supply chain finance is used as an example to better understand the significance of each node.
Consider a simple supply chain model. A core enterprise purchases goods worth 1000W (10 million) from its supplier and, under the credit sales contract, settles the bill six months after receiving the goods. During the six-month payment term the supplier runs short of working capital, so it pledges the bank acceptance bill issued by the core enterprise as collateral for financing. Once the bank approves the application, 95% of the bill amount (950W, or 9.5 million) is transferred to the supplier immediately, and after six months the core enterprise pays the bank directly, which completes one supply chain financing transaction.
In practice most of this business is handled through a mix of online and offline steps, which consumes a lot of manpower and time. How can we turn it into a fully online, electronic process? Some may say it would be enough for a bank to provide such a platform service. But suppose the platform involves more than this one bank: if all banks and enterprises are to operate on the same platform, handing the service over to any single bank seems inappropriate. OK, let's use Fabric to implement such a system and see how the deployment is laid out.
The above is an idealized model. Of course, such a deployment may not hold in practice; for example, suppliers may not have the ability to run servers inside the system. We still use it as an example to illustrate the significance of each node. In the figure, each participant deploys a Peer node locally and connects its business system's client to it. Each Peer node therefore maintains all transaction data, so a participant can check the data locally and can also compare it against the data on other participants' Peer nodes. The central CA node issues certificates to every node, including the clients, so that they can authenticate each other during transactions and prevent malicious outsiders from viewing data or taking part in transactions. The Orderer nodes are connected to all Peer nodes and receive transaction results in order to sequence them. This touches on the overall transaction flow, which is explained with reference to the official sample diagram.
To describe the transaction flow in the preceding figure: the client initiates a transaction request. The endorsement policy in the figure requires Peer1, Peer2, and Peer3 to participate in the transaction, so the client sends the request to Peer1, Peer2, and Peer3. After receiving the request, the three Peers each execute the corresponding smart contract, sign the result, and return it to the client. Once the client has received all the execution results, it packages them and sends them to the Orderer. The Orderer sorts the transactions it has received in the transaction pool, packages them into a new block, and sends the new block to all Peer nodes. After receiving the new block, each Peer node verifies that the signatures on each transaction result satisfy the endorsement policy and checks that the versions in the Read-Write Set match its local versions. If all conditions are met, the new block is written into the local Ledger and the transaction is complete.
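The flow above (endorse, order, validate, commit) can be walked through with a small in-memory simulation. The `Peer` and `Orderer` classes and their methods are hypothetical names for this sketch, not the Fabric SDK. Signatures are represented only by the endorser's name, and the version check is left out for brevity.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.state = {"A": 100, "B": 0}
        self.ledger = []

    def endorse(self, sender, receiver, amount):
        """Simulate the contract and return the result tagged with this peer's name."""
        write = {
            sender: self.state[sender] - amount,
            receiver: self.state[receiver] + amount,
        }
        return {"endorser": self.name, "write": write}

    def commit(self, block, policy):
        """Apply each transaction that satisfies the policy, then append the block."""
        for tx in block:
            endorsers = {e["endorser"] for e in tx}
            writes = [e["write"] for e in tx]
            if policy <= endorsers and all(w == writes[0] for w in writes):
                self.state.update(writes[0])
        self.ledger.append(block)


class Orderer:
    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []

    def submit(self, tx):
        self.pending.append(tx)

    def cut_block(self):
        """Package the oldest pending transactions, in arrival order, into a block."""
        block = self.pending[: self.batch_size]
        self.pending = self.pending[self.batch_size :]
        return block


peers = [Peer("Peer1"), Peer("Peer2"), Peer("Peer3")]
policy = {"Peer1", "Peer2", "Peer3"}
orderer = Orderer(batch_size=10)

# 1. The client collects endorsements from the peers named in the policy.
endorsements = [peer.endorse("A", "B", 30) for peer in peers]
# 2. The client sends the packaged results to the orderer.
orderer.submit(endorsements)
# 3. The orderer cuts a block and every peer validates and commits it.
block = orderer.cut_block()
for peer in peers:
    peer.commit(block, policy)
```

Note that the orderer never looks inside the transaction; it only decides the order, and each peer makes its own decision about validity when it commits.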
The above is a fairly rough description of the transaction flow, and many details still have to be handled in practice. Some may also ask: where is the consensus node, and why is there a central node such as the Orderer? If you think about it carefully, you will find that the consensus mechanism has been built into the entire transaction flow, which is the advantage of this design. Let's analyze it: assuming the Orderer node is malicious, can it manipulate transactions to produce "false accounts"? Look at what the Orderer does: it receives transaction data, sorts it, and packages it into blocks. If the Orderer wants to forge data, it has to get past the endorsement policy check that each Peer node performs before writing data, so the data must carry the signatures of the nodes required by the endorsement policy. The Orderer, however, cannot obtain the private keys of the Peer nodes and so cannot generate those signatures. The Orderer therefore cannot forge transactions on the chain. It can be regarded as a utility service that is not involved in any business process, and the only concern is its stability. If data must be kept confidential from the Orderer nodes, you need to encrypt it yourself. Because the endorsement policy can be set to match a specific business scenario precisely, in my view such a scenario is protected against malicious node intrusion in any form, which differs from PoW or Byzantine fault tolerance, where malicious nodes can manipulate the outcome under certain conditions.
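The argument that the Orderer cannot forge results can be illustrated with a short sketch. One important caveat: Python's standard library has no public-key signatures, so HMAC with per-peer secrets is used here purely as a stand-in. Real Peers sign with private keys and others verify with certificates; the key values and function names below are assumptions for the example.

```python
import hashlib
import hmac
import json

# Hypothetical per-peer secrets. In a real deployment these would be private
# keys held only by each peer; HMAC is just a standard-library stand-in.
PEER_KEYS = {"Peer1": b"key-1", "Peer2": b"key-2"}


def sign(peer, result):
    """Produce the peer's signature over a canonical encoding of the result."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hmac.new(PEER_KEYS[peer], payload, hashlib.sha256).hexdigest()


def verify(peer, result, signature):
    """Check that the signature matches this exact result for this peer."""
    return hmac.compare_digest(sign(peer, result), signature)


def transaction_is_valid(tx, policy):
    """All peers named in the policy must have signed exactly this result."""
    return all(
        peer in tx["signatures"]
        and verify(peer, tx["result"], tx["signatures"][peer])
        for peer in policy
    )


policy = ["Peer1", "Peer2"]
result = {"A": 0, "B": 100}
honest_tx = {"result": result, "signatures": {p: sign(p, result) for p in policy}}

# A malicious orderer changes the result but cannot produce new signatures,
# so it can only reuse the old ones, which no longer match.
forged_tx = {
    "result": {"A": 0, "B": 1000000},
    "signatures": honest_tx["signatures"],
}
```

Validating `honest_tx` succeeds and validating `forged_tx` fails, because the signatures were computed over the original result and the Orderer has no key with which to sign the altered one.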
Core basic services
Now that we have some understanding of Fabric's main modules and processes, let's explore some of its functions in more detail. Making the whole framework work naturally requires more technical machinery; here we mainly discuss several core functions.
- Gossip Protocol
In the transaction flow chart above, the Orderer sorts and packages the transaction data and distributes it to each Peer node. If the Orderer nodes had to distribute blocks to hundreds of Peer nodes or more, the first question is whether that single point could bear the load, and the second is how to resynchronize after a failure. In Fabric's implementation, Peer nodes synchronize with one another instead of having the Orderer nodes distribute every message. Each Peer node maintains information about the other Peer nodes, randomly exchanges block information with some of them, and uses peer-to-peer transfer to speed up data transmission. The Orderer nodes send packaged blocks only to a specific Leader Peer (which can be manually specified or chosen dynamically by election), and the Peer nodes then exchange data with each other through the Gossip protocol to reach eventual consistency.
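To get a feel for how random exchange spreads blocks, here is a toy push-style gossip simulation. It is not Fabric's actual protocol: the fan-out of 2, the round structure, and the peer names are assumptions made for the sketch, and the random generator is seeded only so that runs are repeatable.

```python
import random


def gossip_round(peers, rng, fanout=2):
    """Each peer pushes the blocks it knows to a few randomly chosen peers."""
    names = list(peers)
    for name in names:
        others = [n for n in names if n != name]
        for target in rng.sample(others, min(fanout, len(others))):
            peers[target] |= peers[name]


rng = random.Random(42)
peers = {f"peer{i}": set() for i in range(20)}
peers["peer0"] = {1, 2, 3}  # only the leader received blocks from the orderer

rounds = 0
# The cap on rounds is a safety net so the loop always terminates.
while any(blocks != {1, 2, 3} for blocks in peers.values()) and rounds < 1000:
    gossip_round(peers, rng)
    rounds += 1
```

The Orderer talks to a single peer, yet after a handful of rounds every peer holds the same blocks. The cost is that peers converge at slightly different times, which is the "eventual" part of eventual consistency.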
As the description of the Gossip protocol shows, the Peer nodes may write a block at different times. So when the client makes a business decision (for example, in transaction logic), how does it know whether a specific transaction has been written to a Peer node? Each Peer maintains an EventHub connection with the client. After the Peer node has processed the transaction and written the block into the ledger, it sends a message to each client. Note, however, that callbacks should never be relied on: messages may be lost, and Fabric does not guarantee that they eventually arrive.
- Read-Write Set
Before a Peer node writes a block into the ledger, it verifies the endorsement policy as described above, which keeps out malicious nodes and yields a permissioned chain with real-name transactions. Besides verifying each node's signature, it must check that the results produced by the nodes are consistent. How do we measure whether results are consistent? This is where the read/write set comes in. A program can be reduced to its I/O: if the same input yields the same output, we can agree that the function behaves identically in that particular case. Here the input is not what matters; as long as the data written or modified is the same, consensus can be reached. The Write Set therefore holds the data that is to be written or modified and is used to compare whether the nodes' results agree. The Read Set records which data each node read while executing the contract, together with the version of that data at the time. Before a Peer node writes the block, it also checks that the versions recorded in the Read Set still match those in its current state, which prevents the confusion that concurrent transactions would otherwise cause.
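The version check on the Read Set can be sketched as follows. The function name, the dictionary layout, and the use of a plain integer as the version are assumptions for illustration only.

```python
def validate_and_commit(world_state, tx):
    """Commit the write set only if every version in the read set is still current."""
    for key, version_read in tx["read_set"].items():
        _, current_version = world_state[key]
        if current_version != version_read:
            return False  # stale read: the transaction is rejected
    for key, value in tx["write_set"].items():
        _, version = world_state.get(key, (None, 0))
        world_state[key] = (value, version + 1)
    return True


# World state maps each key to a (value, version) pair.
world_state = {"A": (100, 1), "B": (0, 1)}

# Two transactions were both simulated against version 1 of A's balance.
tx1 = {"read_set": {"A": 1, "B": 1}, "write_set": {"A": 0, "B": 100}}
tx2 = {"read_set": {"A": 1}, "write_set": {"A": 50}}

first = validate_and_commit(world_state, tx1)
second = validate_and_commit(world_state, tx2)
```

The first transaction commits and bumps the version of A to 2. The second was computed from a balance that no longer exists, so its Read Set version is stale and it is rejected instead of overwriting the result of the first.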
People who have just come into contact with blockchain may have the notion that a blockchain must be fair, just, and open, and so think only of the "public chain", such as Bitcoin, where anyone can participate and everything is transparent to all. But blockchain is not limited to public chains. Most business scenarios call for a consortium chain, that is, a business in which specific members participate and permissions are assigned. Take reconciliation between banks A, B, and C: banks A and B are naturally unwilling to disclose the transactions between them to bank C, while all transactions of A, B, and C may have to be reported to the central bank. A "public chain" is clearly not advisable here. How, then, do we ensure fairness, justice, and openness? First, the code must be open source to members, and every member can build and run all the services itself. Stakeholders can jointly review the "smart contract". All members agree on the endorsement policy. Members authorize each other, or are authorized by a trusted third-party certification authority. The underlying certification system is therefore particularly important. Let's look at how Fabric implements it.
- PKI(Public Key Infrastructure)
- X.509 certificate
- Membership Service Providers (MSP)
Fabric uses the MSP to define a member when dividing up the membership structure. The recommended best practice is for each enterprise or organization to be a separate MSP. In the supply chain case above, for example, the core enterprise is one MSP, and the bank and the supplier are each an MSP. An MSP can contain multiple Peer nodes, and different authorizations grant different capabilities. The specific application scenarios of MSP are as follows.
- When deploying or initializing a smart contract, you must hold a certificate authorized by the corresponding CA. By default this is the PeerAdmin certificate.
- When registering a certificate for a new node or user, the certificate performing the operation (typically the Admin user's) must have been granted that permission by the CA.
- In the endorsement policy, an MSP can be used to denote an endorsing member. You can let a single Peer node endorse on behalf of its MSP, or require all of the MSP's Peer nodes to endorse.
- For communication between Peer nodes across MSPs, the Anchor Peer designated in each MSP first collects the list of Peers in its own MSP. The Anchor Peers then exchange these lists with one another, and once the Peer lists of the other MSPs have been synchronized to the internal Peers, the Peer nodes communicate with randomly chosen peers through the Gossip protocol.
- Each MSP has its own independent CA node, which serves all of that MSP's certificate needs. The MSPs share the root certificates of their CA nodes with one another for mutual authentication.
- Anonymous transactions. A transaction includes the certificate of every user that took part in endorsing it, so it can be regarded as a public, real-name transaction: all members of the chain can see who took part in each transaction. How, then, can anonymous transactions be achieved? Fabric 0.6 has the concepts of ECert and TCert. The ECert is the user's own certificate, while the TCert is used for anonymous transactions. A user can apply to the CA for a batch of TCerts to transact with, and a TCert contains no user information. When an identity needs to be verified, the CA can identify the user. (This feature has not been implemented in version 1.0.)
- Certificate revocation. The biggest advantage of a PKI system is that it works offline: once a certificate has been issued, it can be verified locally without the CA node being present. The big problem is that when a certificate is revoked, every node has to be notified of the revocation. The current practice is to keep the CA nodes online and in communication with each node. (The CA node must also be online for TCerts to be obtained.)
The configuration of each blockchain also covers the division into MSPs. It is relatively complex and will not be described here. If you are interested, please refer to the official documentation :)
Difficulties and issues to be solved
The article above is mainly meant to give readers a basic understanding of Fabric's overall framework, and many smaller points could not be discussed one by one. Of course, blockchain technology is far from perfect while it has not yet been applied to the market on a large scale, and Fabric still has many difficult problems to solve.
In the officially recommended practice, data is isolated at the granularity of the ledger: unrelated transactions live in different ledgers. In real business, however, there are always scenarios that need data isolation within a single ledger. We have seen design documents for this published earlier, but we are not sure whether the feature will be completed properly before the official release. For now, data can only be encrypted and isolated in the business logic.
When two sets of data are isolated in separate ledgers, it is hard for them to interact. For cross-ledger calls, the first problem to solve is how to integrate the authentication models.
At present, the cost of adopting blockchain is still very high. Most functions of the Fabric project cannot be configured visually, so you need to understand many of the underlying details to build and configure the environment correctly.
I believe that readers who have made it to the end either have done some research on blockchain or are very interested in it. You are welcome to join the discussion. If you have more questions about Fabric, I will do my best to answer them.
Finally, you are welcome to join us to promote the development of the blockchain :)