
7 Best Practices of Using Alibaba Cloud Container Service to Develop Blockchain Hyperledger Fabric Solutions

In this article, we will be exploring the best practices of using Alibaba Cloud Container Service to develop blockchain applications and solutions based on Hyperledger Fabric.

Alibaba Cloud Container Service lets users deploy and configure a Hyperledger Fabric blockchain network automatically with the Blockchain Solution, or through a manual, self-built approach. The following sections walk through seven best practices for developing blockchain applications and solutions based on Hyperledger Fabric with Container Service.

Best Practice 1: Generate Sample YAML for Hyperledger Fabric Deployment in Kubernetes

The Blockchain Solution for Kubernetes clusters of Alibaba Cloud Container Service is released in the application catalog of Container Service in the form of a Helm chart. The Blockchain Solution can configure and deploy blockchain networks in Kubernetes clusters of Alibaba Cloud Container Service with one click. It can also serve as an auxiliary YAML generation tool that produces customizable sample YAML for Kubernetes. If you are new to Kubernetes and Hyperledger Fabric, this practice provides verified sample YAML that you can reference to quickly develop your own.

This practice relies on the helm install --dry-run command and the schelm tool.

The procedure is as follows:

Install Golang and Git on the master node of a Kubernetes cluster of Alibaba Cloud Container Service (the following operations are performed as the root account).

yum install golang
yum install git

Add the bin directory of Golang to the PATH environment variable.

vi ~/.bash_profile

Add the following value to PATH:

PATH=$PATH:$HOME/bin:$HOME/go/bin

Save and exit. Run the following command to make the variable take effect:

source ~/.bash_profile

Run the following command to install schelm:

go get -u github.com/databus23/schelm
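Before moving on, you may want to confirm the toolchain is ready. The following sketch checks each tool and degrades gracefully if one is missing:

```shell
# Check that the tools installed above are on the PATH; print a hint for any
# that are missing instead of failing outright.
for tool in go git schelm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: missing - install it before proceeding"
  fi
done
```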

Generate the default YAML used to deploy Hyperledger Fabric.

helm install --name blockchain-network01 --dry-run --debug incubator/acs-hyperledger-fabric 2>&1 | schelm -f output/

The command has executed successfully if output similar to the following is returned and an output folder is generated:

2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/ca-deploy-svc.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/fabric-network-generator-deploy-svc.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/kafka-deploy-svc.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/orderer-deploy-svc.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/peer-deploy-svc.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/zookeeper-deploy-svc.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/fabric-image-downloader-ds.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/cli-pod.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/fabric-init-pod.yaml
2017/12/29 11:16:41 Creating output/acs-hyperledger-fabric/templates/fabric-utils-pod.yaml

If no output is returned or no output folder is generated, rerun the command without the schelm part and inspect the error message. For example:

helm install --name blockchain-network01 --dry-run --debug incubator/acs-hyperledger-fabric

The following is a list of properly generated YAML files:

|____output
| |____acs-hyperledger-fabric
| | |____templates
| | | |____kafka-deploy-svc.yaml
| | | |____peer-deploy-svc.yaml
| | | |____fabric-network-generator-deploy-svc.yaml
| | | |____ca-deploy-svc.yaml
| | | |____orderer-deploy-svc.yaml
| | | |____fabric-image-downloader-ds.yaml
| | | |____fabric-init-pod.yaml
| | | |____cli-pod.yaml
| | | |____fabric-utils-pod.yaml
| | | |____zookeeper-deploy-svc.yaml

If you want to customize the blockchain network configuration, you can write a custom values YAML file in accordance with the Blockchain Solution's configuration and deployment documentation. For example:

# sample network01.yaml
fabricNetwork: network01
fabricChannel: tradechannel
orgNum: 3
ordererNum: 4
ordererDomain: shop
peerDomain: shop
externalAddress: 11.22.33.44
caExternalPortList: ["31054", "31064", "31074"]
ordererExternalPortList: ["31050", "31060", "31070", "31080"]
peerExternalGrpcPortList: ["31051", "31061", "31071", "31081", "31091", "31101"]
peerExternalEventPortList: ["31053", "31063", "31073", "31083", "31093", "31103"]

Run the following command, passing in the values YAML file, to generate the YAML deployment files for Kubernetes:

helm install --name blockchain-network01 --values network01.yaml --dry-run --debug incubator/acs-hyperledger-fabric 2>&1 | schelm -f output/

Note: The Blockchain Solution can be deployed only in the Alibaba Cloud environment. The YAML files generated with the preceding method cannot be used directly in non-Alibaba Cloud environments; they are mainly intended for reference.

Best Practice 2: Select Storage for the Deployed Blockchain

Currently, the Blockchain Solution uses Network Attached Storage (NAS) based on the Network File System (NFS) protocol to:

  1. Share blockchain network configurations, certificates, and keys among different blockchain nodes.
  2. Provide data persistence storage for main blockchain nodes.

You can select different types of storage based on your service and technical needs, including:

  1. Block storage, such as Alibaba Cloud disks, which is suitable for data persistence on individual nodes.
  2. File storage, such as Alibaba Cloud NAS, which supports access by file path and flexible mounting.
  3. Object storage, such as Alibaba Cloud OSS, which supports access by file object.

Container Service provides flexible support for all of the preceding storage types. For mounting and usage instructions, see Storage Management.
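For illustration, a shared NFS/NAS volume of the kind the Blockchain Solution uses can be declared as a Kubernetes PersistentVolume. This is a minimal sketch only; the NAS mount target address is a placeholder, and the actual solution generates its own storage definitions:

```shell
# Write a minimal NFS-backed PersistentVolume manifest. The NAS mount target
# below is a placeholder; replace it with your own before applying.
cat > nas-pv.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fabric-shared-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany               # shared among CA, peer, and orderer pods
  nfs:
    server: example.cn-hangzhou.nas.aliyuncs.com   # placeholder mount target
    path: /
EOF
echo "wrote nas-pv.yaml; apply it with: kubectl apply -f nas-pv.yaml"
```

ReadWriteMany access is what allows the network configuration, certificates, and keys to be shared among different blockchain node pods.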

Best Practice 3: Select a Billing Method for the Deployed Blockchain

Container Service

When you deploy a blockchain in a Kubernetes cluster of Alibaba Cloud Container Service, Container Service itself is free of charge in most cases. The main billing items are the underlying resources and services, such as ECS, storage, Server Load Balancer, and Internet addresses. For details, see Billing.

ECS

By default, the ECS instances automatically created along with a Kubernetes cluster are billed in Pay-As-You-Go mode. If you are considering a switch to Subscription mode to save costs, note that currently you can neither select existing ECS instances when creating a Kubernetes cluster nor add existing ECS instances to one afterward. Support for adding existing ECS instances to Kubernetes clusters of Container Service is planned; follow the product documentation for updates.

Alibaba Cloud Disk

If you buy cloud disks for blockchains, the disks are billed in Pay-As-You-Go mode by default, and you can repeatedly attach them to and detach them from different ECS instances. If you attach a cloud disk to an ECS instance and switch the instance to Subscription mode, the disk is also billed in Subscription mode and can no longer be repeatedly attached and detached. For details, see Billing.

Cloud disks are priced the same in Pay-As-You-Go and Subscription modes, so you can keep the default Pay-As-You-Go billing.

Best Practice 4: Select a Region for the Deployed Blockchain

When you plan the production deployment of a blockchain system, you should choose a Container Service cluster in a specific region. In theory, Container Service provides the same capabilities in every region that supports it, so you can select a region based on your business or technical needs: if your end users are geographically dispersed, select a region in a relatively central location; if they are concentrated, select the region closest to them.

Best Practice 5: Allow Access to Blockchain Services from Public Networks

After a Hyperledger Fabric blockchain network is deployed in a Kubernetes cluster, its services (such as CA, peer, and orderer) can be accessed within the cluster by service name: the cluster DNS resolves each service name to a cluster IP address, and kube-proxy routes the requests to the backing pods for processing.

To enable access to the services of blockchain network nodes from outside the cluster, you need to define NodePorts in an external access control list (see the configuration document) and configure public IP addresses that permit external access.

For how to create an EIP and bind it to a worker node, see Environment Preparation in the Blockchain Solution documentation. That method suits development and test environments: it is easy to use, requires little configuration, and, after simple configuration of VPC security group rules, enables access to NodePorts within a specified range.

The method has the following limitations:

  1. If that worker node fails, the entire cluster can no longer provide services externally.
  2. The NodePorts within the specified range are exposed to public networks regardless of whether the ports have running services.

If you plan a production deployment, we recommend creating a Server Load Balancer instance to distribute external requests across all worker nodes, achieving high availability, load balancing, and port security.


The procedure is as follows:

  1. Create a Server Load Balancer instance.
  2. Record the public IP address of the Server Load Balancer instance as the external access address (externalAddress) of the Blockchain Solution.
  3. Add all the worker nodes in batches as the backend servers of the Server Load Balancer instance.
  4. Select TCP for the listening port of the Server Load Balancer instance and specify the same NodePort as the frontend and backend ports. Use the default settings for other configuration items.
  5. Repeat Step 4 to create a listener for each NodePort of a blockchain network service that must accept requests from the Internet.
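To see which NodePorts need listeners in Step 5, you can enumerate them from the cluster. This sketch is guarded so it only queries a cluster when kubectl is actually configured:

```shell
# List services that expose NodePorts, one candidate SLB listener per line.
if command -v kubectl >/dev/null 2>&1 && kubectl get svc >/dev/null 2>&1; then
  kubectl get svc --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.ports[*].nodePort}{"\n"}{end}' \
    | awk 'NF > 1'   # keep only services that actually have a NodePort
else
  echo "kubectl is not configured here; run this on the cluster master node"
fi
```

Each printed port corresponds to one TCP listener on the Server Load Balancer instance, with the same NodePort as the frontend and backend port.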

Best Practice 6: Troubleshooting the "Connection Reset by Peer" Message in the Blockchain Log

If Server Load Balancer listeners are enabled for the NodePorts of all the services in a blockchain network, the logs of some services (such as CA and orderer), viewable with the kubectl logs or docker logs command, may contain many messages similar to the following:

2017-12-28 08:09:16.321 UTC [grpc] Printf -> DEBU 173 grpc: Server.Serve failed to complete security handshake from "172.20.3.1:50028": read tcp 172.20.3.160:7050->172.20.3.1:50028: read: connection reset by peer
2017-12-28 08:09:16.494 UTC [grpc] Printf -> DEBU 174 grpc: Server.Serve failed to complete security handshake from "172.20.3.1:24688": read tcp 172.20.3.160:7050->172.20.3.1:24688: read: connection reset by peer
2017-12-28 08:09:16.599 UTC [grpc] Printf -> DEBU 175 grpc: Server.Serve failed to complete security handshake from "172.20.3.1:58678": read tcp 172.20.3.160:7050->172.20.3.1:58678: read: connection reset by peer

A large number of such messages may affect log O&M and error diagnosis.

After analysis and research, we confirmed that these messages are caused by the TCP health checks of Server Load Balancer.

The Health checks of TCP listeners document of Server Load Balancer has the following description:

The health check process of a TCP listener is as follows:

1. The LVS node server sends a TCP SYN packet to the intranet IP address and health check port of a backend ECS instance.
2. After receiving the request, the backend server returns a TCP SYN and ACK packet if the corresponding port is listening normally.
3. If the LVS node server does not receive the packet from the backend ECS instance within the specified response timeout period, it determines that the service is not responding and the health check fails. The server then sends an RST packet to the backend ECS instance to terminate the TCP connection.
4. If the LVS node server receives the packet from the backend ECS instance within the specified response timeout period, it determines that the service is running properly and the health check succeeds. The server then sends an RST packet to the backend ECS instance to terminate the TCP connection.

Note: In general, a TCP three-way handshake is conducted to establish a TCP connection. After the LVS node server receives the SYN and ACK packet from the backend ECS instance, it sends an ACK packet and then immediately sends an RST packet to terminate the TCP connection.
This process may cause the backend server to conclude that an error occurred in the TCP connection, such as an abnormal exit, and to report a corresponding error message, such as Connection reset by peer.
Solution:
- Use HTTP health checks.
- If you have enabled the function of obtaining real IP addresses, you can ignore the connection errors caused by access from the preceding SLB CIDR block.

By referring to the preceding document and the handling suggestions in What can I do if health checks generate an excessive number of logs?, we arrived at the following handling methods applicable to Hyperledger Fabric:

  1. Access to the Hyperledger Fabric service ports requires TCP, so in Server Load Balancer the TCP health check cannot simply be disabled the way an HTTP health check can.
  2. The Hyperledger Fabric source code does not filter or suppress log messages of specific types, so the messages cannot be silenced on the Fabric side.
  3. Instead, modify the health check settings of the listening port on the Server Load Balancer instance: increase the interval from the default of 2 seconds toward the maximum of 50 seconds to reduce the frequency at which these log entries are generated.
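Until the health check interval is adjusted, a simple workaround when inspecting logs is to filter the noise out with grep. The sample below reuses a log line quoted above plus a hypothetical normal INFO line; against a live cluster you would pipe kubectl logs <pod> through the same filter:

```shell
# Build a small sample log (first line reproduced from the orderer log above,
# second line is a hypothetical normal entry), then filter the noise out.
cat > orderer-sample.log <<'EOF'
2017-12-28 08:09:16.321 UTC [grpc] Printf -> DEBU 173 grpc: Server.Serve failed to complete security handshake from "172.20.3.1:50028": read tcp 172.20.3.160:7050->172.20.3.1:50028: read: connection reset by peer
2017-12-28 08:09:17.000 UTC [orderer/main] main -> INFO 001 Starting orderer
EOF
grep -v 'connection reset by peer' orderer-sample.log
```

The filter leaves only the INFO line, which makes genuine errors easier to spot during O&M.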

Best Practice 7: Troubleshooting the Error "x509: Certificate Has Expired or Is Not Yet Valid" during Blockchain SDK Application Access

When a blockchain network is deployed in a Kubernetes cluster by using the Blockchain Solution and SDK applications are tested against it, the error x509: certificate has expired or is not yet valid occasionally occurs while invoking chaincode. The error messages in a Node.js SDK application look like this:

[2017-12-15 19:49:06.410] [INFO] Helper - Successfully loaded member from persistence
[2017-12-15 19:49:06.410] [DEBUG] invoke-chaincode - Sending transaction "{"_nonce":{"type":"Buffer","data":[78,166,195,28,221,100,129,247,157,103,26,157,64,16,173,220,177,46,245,144,92,3,233,64]},"_transaction_id":"8e77f6a9a2b7ce5db73350f1493e3ef4fe9e7137d370e4e9d9b9e50c25aab92a"}"
[2017-12-15 19:49:06.416] [DEBUG] Helper - [crypto_ecdsa_aes]: ecdsa signature:  Signature {
  r: <BN: 19b5aeb922b56a83e1bc79b5a24a18d98104ad30a62682738e9b6f6cd2d741a3>,
  s: <BN: 681eb988eb292b78d6063fdc61157db389828efefdb5d264d83bdcbc293a5aca>,
  recoveryParam: 0 }
error: [client-utils.js]: sendPeersProposal - Promise is rejected: Error: Failed to deserialize creator identity, err The supplied identity is not valid, Verify() returned x509: certificate has expired or is not yet valid
    at /Users/yushan/Temp/solution-blockchain-demo/balance-transfer-app/node_modules/grpc/src/node/src/client.js:554:15
error: [client-utils.js]: sendPeersProposal - Promise is rejected: Error: Failed to deserialize creator identity, err The supplied identity is not valid, Verify() returned x509: certificate has expired or is not yet valid
    at /Users/yushan/Temp/solution-blockchain-demo/balance-transfer-app/node_modules/grpc/src/node/src/client.js:554:15
[2017-12-15 19:49:06.583] [ERROR] invoke-chaincode - transaction proposal was bad
[2017-12-15 19:49:06.584] [ERROR] invoke-chaincode - transaction proposal was bad
[2017-12-15 19:49:06.585] [ERROR] invoke-chaincode - Failed to send Proposal or receive valid response. Response null or status is not 200. exiting...
[2017-12-15 19:49:06.585] [ERROR] invoke-chaincode - Failed to order the transaction. Error code: undefined

Extensive comparison, testing, and in-depth code analysis showed that the effective start time (the NotBefore field) of the user certificate enrolled by the SDK application is earlier than that of the fabric-ca certificate that signed it. Here is an example:

User certificate of the SDK application, effective from 06:34:00 on December 17, 2017:

        Issuer: C=US, ST=California, L=San Francisco, O=org1.alibaba.com, CN=ca.org1.alibaba.com
        Validity
            Not Before: Dec 17 06:34:00 2017 GMT
            Not After : Dec 17 06:34:00 2018 GMT
        Subject: CN=Jim

Certificate of fabric-ca, effective from 06:34:32 on December 17, 2017:

        Issuer: C=US, ST=California, L=San Francisco, O=org1.alibaba.com, CN=ca.org1.alibaba.com
        Validity
            Not Before: Dec 17 06:34:32 2017 GMT
            Not After : Dec 15 06:34:32 2027 GMT
        Subject: C=US, ST=California, L=San Francisco, O=org1.alibaba.com, CN=ca.org1.alibaba.com

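The two NotBefore values above can be compared directly. As a sketch, the snippet below feeds the timestamps (in the format printed by openssl x509 -noout -startdate) to GNU date and computes the gap:

```shell
# Compare the NotBefore times of the SDK user certificate and the fabric-ca
# certificate shown above; any positive gap means verification must fail.
user_nb="Dec 17 06:34:00 2017 GMT"   # SDK user certificate NotBefore
ca_nb="Dec 17 06:34:32 2017 GMT"     # fabric-ca certificate NotBefore
u=$(date -u -d "$user_nb" +%s)       # GNU date parses openssl's date format
c=$(date -u -d "$ca_nb" +%s)
if [ "$u" -lt "$c" ]; then
  echo "user certificate becomes valid $((c - u))s before its issuing CA"
fi
```

Here it reports a 32-second gap, matching the two certificates above.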
Further analysis identified the direct cause: fabric-ca backdates the NotBefore field by 5 minutes when signing the SDK user certificate; that is, NotBefore equals the actual signing time minus 5 minutes. The core code is shown below. The method call chain and the analysis of the conditional branches are omitted here; if you are interested, contact us for further exploration.

hyperledger/fabric-ca/vendor/github.com/cloudflare/cfssl/signer/signer.go

func FillTemplate(template *x509.Certificate, defaultProfile, profile *config.SigningProfile) error
...
    // Default to a 5-minute backdate when the signing profile specifies none.
    if backdate = profile.Backdate; backdate == 0 {
        backdate = -5 * time.Minute
    } else {
        backdate = -1 * profile.Backdate
    }

    // NotBefore defaults to "now, rounded to the minute, minus the backdate".
    if !profile.NotBefore.IsZero() {
        notBefore = profile.NotBefore.UTC()
    } else {
        notBefore = time.Now().Round(time.Minute).Add(backdate).UTC()
    }

The intermittent nature of the error baffled us at first. After analyzing the time consumed by the entire process, we found the cause.

The test based on the Blockchain Solution is a complete end-to-end process consisting of the following steps:

  1. Call tools automatically based on user parameters.
  2. Generate a certificate and network configurations.
  3. Generate a YAML file dynamically for Hyperledger Fabric deployment in Kubernetes.
  4. Create a blockchain network in Kubernetes based on the YAML file.
  5. Download the certificate to the SDK application and configure application access to the blockchain network.
  6. Run the SDK application to test the blockchain network.

In a traditional Hyperledger Fabric example that uses a pre-generated certificate, the fabric-ca certificate took effect well before the test runs. Even after the 5-minute backdate is subtracted from the actual signing time of the SDK user certificate, the result is not earlier than the effective start time of the fabric-ca certificate.

When Hyperledger Fabric is deployed manually, the custom configuration and the end-to-end process take more than 5 minutes to complete, often dozens of minutes, several hours, or even days. The signing time of the SDK user certificate minus the 5-minute backdate is therefore not earlier than the effective start time of the fabric-ca certificate.

The Blockchain Solution of Container Service for Kubernetes shortens the custom configuration and end-to-end process (including manual operations) to 2 to 3 minutes. In this case, the signing time of the SDK user certificate minus the 5-minute backdate may fall before the effective start time of the fabric-ca certificate, which causes the preceding error.

The error occurs only occasionally because the time spent on manual operations varies from test to test, for example when the test engineer is interrupted by other work.

Based on the preceding analysis, we have worked out the following solutions:

  1. In Hyperledger Fabric 1.0.3 and later versions, the cryptogen tool includes a fix that subtracts 5 minutes from the effective start time (NotBefore) of the generated fabric-ca certificate. For details, visit https://jira.hyperledger.org/browse/FAB-6251.
  2. If you use a Hyperledger Fabric version earlier than 1.0.3, wait more than 5 minutes after creating the blockchain network in the Kubernetes cluster before running the SDK application, which avoids the error.
  3. The error should occur only in test and development environments with fast end-to-end testing, which differ from actual use cases, especially service usage after production deployment. This error analysis and solution are intended to help you gain a deeper understanding of how Hyperledger Fabric works.
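Solution 2 above can be scripted. This is a sketch only; the SDK entry point name is hypothetical, and the wait is commented out so the snippet can be read without blocking:

```shell
# Wait out the 5-minute backdate window before starting the SDK test.
BACKDATE_SECS=300   # matches the -5 * time.Minute default in cfssl's signer
echo "waiting ${BACKDATE_SECS}s so the user cert NotBefore cannot precede the CA cert NotBefore"
# sleep "$BACKDATE_SECS"       # uncomment for a real run
# node invoke-chaincode.js     # hypothetical SDK test entry point
```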

Conclusion

In this article, we have explored the best practices of using Alibaba Cloud Container Service to develop blockchain applications and solutions based on Hyperledger Fabric. The content and information in this article are contributed by Chen Kai, Dong Zhenhua, and Dai Jianwu from the Hyperledger community.

To learn more about this solution, visit the Container Service Blockchain Solution documentation page.
