SAP Data Hub And Data Intelligence Deployment Guide

Last Updated: Apr 01, 2020

Version Control:

Version Revision Date Types Of Changes Effective Date
1.0 Release 2019/11/18
2.0 Add SAP Data Intelligence installation section 2020/03/27

Overview

SAP Data Hub lets you build data-driven processes and pipelines across complex enterprise landscapes with unified governance.

SAP Data Hub fulfills the following requirements:

  • Enterprise-wide data insight from diverse landscapes, including both big data and enterprise data stores.
  • The need for ingestion, transformation, and processing capabilities within data lakes and big data stores.
  • Manage complex system landscapes with differing security and privacy requirements.

Starting with version 3.0, SAP Data Hub has been upgraded to SAP Data Intelligence. SAP Data Intelligence is a platform that delivers data intelligence through AI and information management. It enables data-driven innovation and intelligent processes with data orchestration and machine learning services.

The following are new and enhanced features that are available in SAP Data Intelligence compared to the last release of SAP Data Hub 2.7:

  • More connectivity and integration methods
  • Improved metadata and data governance
  • Support of machine learning and data science
  • More system services

For more details, see https://www.sap.com/products/data-intelligence.html.

This series of tutorials describes how to deploy SAP Data Hub and SAP Data Intelligence on Alibaba Cloud and consists of several chapters. This chapter is the first part: deployment.

The content is intended for readers who have basic knowledge of SAP enterprise products and Alibaba Cloud and who plan to run SAP Data Intelligence on Alibaba Cloud.

Alibaba Cloud Basic Concepts

Overview of Alibaba Cloud

Alibaba Cloud is built on a global infrastructure that provides a broad range of IaaS products and services. Alibaba Cloud services are available in different geographical regions across the globe. Before you deploy SAP Data Hub on Alibaba Cloud, make sure you understand the following basic concepts:

  • Alibaba Cloud Elastic Compute Service (ECS)

Alibaba Cloud Elastic Compute Service (ECS) is a web service that provides resizable compute capacity in the cloud. Its simple web service interface allows you to obtain and configure computing capacity with minimal effort. You are able to quickly scale capacity up and down as your computing requirements change, and you only pay for capacity that you actually need.

You can deploy ECS instances with the standard Alibaba Cloud methods, including the ECS Console (the Cloud Platform Console web UI) and the REST API.

  • Alibaba Cloud Block Storage (Cloud Disk)

Alibaba Cloud Block Storage (Cloud Disk) provides persistent block-level storage volumes for use with Alibaba Cloud ECS instances. Cloud Disk volumes provide the consistent, low-latency performance needed to run your workloads. With Cloud Disk, you can scale your usage up or down within minutes and pay only for what you provision.

  • Alibaba Cloud Virtual Private Cloud (VPC)

Virtual Private Cloud (VPC) creates an isolated network environment for users on Alibaba Cloud. You can select an IP address range, divide the network into subnets, and configure the route table and gateway. SAP NetWeaver and the Alibaba Cloud services work together in particular ways to deliver combined business application and infrastructure capabilities to our customers.

  • Alibaba Cloud Object Storage Service (OSS)

Alibaba Cloud Object Storage Service (OSS) is a network-based data access service. OSS enables you to store and retrieve structured and unstructured data, including text files, images, audio, and video.

Deployment Architecture

The following figure shows the minimum-sizing architecture for deploying SAP Data Hub on Alibaba Cloud in a development or testing environment. (Note: to run SAP Data Hub in production, refer to the SAP Data Hub Sizing Guide at https://help.sap.com/viewer/1f833eab23244ef2ad66fe982dd14873/2.7.latest/en-US)

  • A VPC network on Alibaba Cloud
  • An installation host (an ECS instance) with a public IP address, 100 Mbps bandwidth, 4 GB main memory, 2 cores, and a 200 GB disk. Because the existing Alibaba Cloud container registry service does not meet the SAP Data Hub installation requirements, a local private Docker registry is required.

Note: The private Docker registry can run on any node that has internet access. In this tutorial, it runs on the installation host to save ECS resources. For details, refer to https://help.sap.com/viewer/e66c399612e84a83a8abe97c0eeb443a/2.7.latest/en-US/40cc1c6cd72546378182f0de584ced05.html

[Figure: deployment architecture]

Set Up Alibaba Cloud Environment

Create a VPC and vSwitch

In the Alibaba Cloud console, click ‘Virtual Private Cloud’. Choose the region and click ‘Create VPC’ at the top of the page to open the VPC creation page. Select the region based on the latency requirements of your business workload. In this tutorial, “China (Zhangjiakou)” is selected. Enter the VPC name and vSwitch name, choose the vSwitch zone, then click ‘OK’.

[Figure: VPC creation]

[Figure: vSwitch creation]

Create Installation Host

In the Alibaba Cloud console, click ‘Elastic Compute Service’. On the ECS overview page, click ‘Create Instance’ as shown below:

[Figure: ECS instance creation]

Create an instance with the following configuration:

Configuration     Value
Billing Method    Pay-As-You-Go
Region            Zhangjiakou
Instance Type     ecs.c5.large
OS Image          Public image, CentOS 7.6 64-bit
Storage           200 GB SSD

Afterwards, you will have one ECS instance, as shown below:

[Figure: installation host]

Create Kubernetes Cluster

In the Alibaba Cloud console, click ‘Container Service for Kubernetes’. Click ‘Create Kubernetes Cluster’ at the top of the ‘Clusters’ page to open the Kubernetes cluster creation wizard. Select ‘Standard Managed Cluster’ and click ‘Create’ as shown below:

[Figure: cluster creation wizard]

Then enter the cluster name. Select the same region, VPC, and vSwitch that you used for the installation host. This example uses the following parameters and values for cluster creation:

Configuration        Value
Kubernetes           1.14.8-aliyun.1 (note: any version not earlier than this should be fine)
Container Runtime    Docker
Worker Instance      Create Instance
Instance Type        x86 architecture, 4 cores, 32 GB (this tutorial uses ecs.r5.xlarge)
System Disk          Ultra Disk, 100 GiB (Standard SSD would perform better)
Operating System     CentOS 7.6
Logon Type           Password
Network Plugin       Flannel
Configure SNAT       Configure SNAT for VPC

[Figure: cluster creation parameters (five screenshots)]

Afterwards, you should see that the cluster was created successfully, as shown below:

[Figure: cluster creation finished]

[Figure: cluster list]

Join Installation Host to Security Group of Kubernetes Cluster

The next step is to set up the connection between the installation host and the cluster.

  • Go to the detail page of the cluster you just created and click ‘Nodes’, then click any node in the list.

[Figure: cluster node list]

  • Click ‘Security Groups’ and note the Security Group ID.

[Figure: security group of the node]

  • Go to the security group page of the installation host created earlier and click the ‘Add to Security Group’ button, as shown below:

[Figure: adding the installation host to the security group]

Create OSS Bucket

In the Alibaba Cloud console, click ‘Object Storage Service’. Then click ‘Create Bucket’ on the right side of the page. Enter the bucket name and region, and keep the defaults for the other options. Note the Internet access endpoint, which is used together with your AccessKeyId and AccessKeySecret during the setup process.

[Figure: OSS bucket creation]

[Figure: OSS endpoint]

Prepare Installation Environment

After finishing the steps above, you should be able to connect to the installation host through its public IP address, and from there log on to the cluster nodes over SSH using their private IP addresses.

Install Docker

Install Docker on the installation host according to https://kubernetes.io/docs/setup/production-environment/container-runtimes/#docker. In this example, the following commands are executed:

    # Install Docker CE
    ## Set up the repository
    ### Install required packages.
    yum install yum-utils device-mapper-persistent-data lvm2
    ### Add Docker repository.
    yum-config-manager \
      --add-repo \
      https://download.docker.com/linux/centos/docker-ce.repo
    ## Install Docker CE.
    yum update && yum install docker-ce
    ## Create /etc/docker directory.
    mkdir /etc/docker
    # Setup daemon.
    cat > /etc/docker/daemon.json <<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m"
      },
      "storage-driver": "overlay2",
      "storage-opts": [
        "overlay2.override_kernel_check=true"
      ]
    }
    EOF
    mkdir -p /etc/systemd/system/docker.service.d
    # Restart Docker
    systemctl daemon-reload
    systemctl restart docker
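
After the restart, it can be useful to confirm that Docker actually picked up the systemd cgroup driver from daemon.json. The following is a minimal sketch, not part of the original procedure; check_cgroup_driver is a hypothetical helper name, and it relies only on the standard `docker info --format` template output.

```shell
# Sketch: verify the cgroup driver reported by the Docker daemon.
# check_cgroup_driver is a hypothetical helper, not a Docker command.
check_cgroup_driver() {
    expected="${1:-systemd}"
    # `docker info --format` renders a Go template over the daemon info.
    actual="$(docker info --format '{{.CgroupDriver}}')" || return 1
    if [ "$actual" = "$expected" ]; then
        echo "cgroup driver OK: $actual"
    else
        echo "cgroup driver is '$actual', expected '$expected'" >&2
        return 1
    fi
}
```

Call `check_cgroup_driver` on the installation host after `systemctl restart docker`; a non-zero exit status suggests that daemon.json was not read or contains a different driver.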

Create Local Private Docker Registry for SAP Data Hub

This part applies to the SAP Data Hub installation. For the SAP Data Intelligence installation, refer to the section "Create a Private Safe Docker Registry on Installation Host for SAP Data Intelligence".

  • On the installation host, run the following command:

docker run -d -p 5000:5000 --restart always --name registry registry:2

  • On every worker node and on the installation host, log on to the machine (by SSH) and add the following line to /etc/docker/daemon.json to permit access to the insecure Docker registry:

"insecure-registries": ["ip_of_installation_host_machine:5000"]

[Figure: insecure-registries entry in daemon.json]

  • Restart the Docker service on each node and on the installation host for the change to take effect:

systemctl restart docker
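
The daemon.json edit described above is easy to get wrong, because the new key must be separated from the previous entry by a comma for the file to remain valid JSON. As a hedged illustration, the sketch below writes the complete file for the installation host, combining the settings from the Docker installation step with the insecure-registries entry. write_daemon_json is a hypothetical helper name, and the registry address is a placeholder. On worker nodes the existing daemon.json may contain other settings, so merge the key by hand there instead of overwriting the file.

```shell
# Sketch: write a daemon.json that includes the insecure-registries entry.
# write_daemon_json is a hypothetical helper; arguments are placeholders.
write_daemon_json() {
    registry="$1"                          # e.g. <installation_host_ip>:5000
    out="${2:-/etc/docker/daemon.json}"    # target file (overwritten!)
    cat > "$out" <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "insecure-registries": ["$registry"]
}
EOF
}
```

For example, `write_daemon_json "<installation_host_ip>:5000"` followed by `systemctl restart docker`. Writing to a scratch path first (second argument) lets you inspect the result before replacing the live file.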

Install kubectl

Install kubectl on the installation host as follows:

    # Download kubectl version 1.14.8
    curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.14.8/bin/linux/amd64/kubectl
    # Make the kubectl binary executable
    chmod +x ./kubectl
    # Move the binary into your PATH
    sudo mv ./kubectl /usr/local/bin/kubectl

  • If you get the error "The connection to the server was refused - did you specify the right host or port?", go to the cluster you created and copy the kubeconfig content to ~/.kube/config on the installation host.

[Figure: kubeconfig of the cluster]

  • After the configuration is done, you can use kubectl on the installation host to access the Kubernetes cluster, as shown below:

[Figure: kubectl node list]
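
The kubeconfig step above can be sketched as a small helper. This is an illustration under assumptions: install_kubeconfig is a hypothetical name, and the source file is wherever you saved the kubeconfig content copied from the cluster page.

```shell
# Sketch: place a saved kubeconfig where kubectl looks for it by default.
# install_kubeconfig is a hypothetical helper, not part of kubectl.
install_kubeconfig() {
    src="$1"                            # file holding the copied kubeconfig
    dest="${2:-$HOME/.kube/config}"     # kubectl's default location
    if [ ! -s "$src" ]; then
        echo "kubeconfig source '$src' is missing or empty" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")"
    cp "$src" "$dest"
    chmod 600 "$dest"                   # the file contains cluster credentials
    echo "kubeconfig installed at $dest"
}
```

A typical call would be `install_kubeconfig ./kubeconfig.txt && kubectl get nodes`, where kubeconfig.txt is a placeholder file name.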

Install helm

Download and install Helm 2.14 on the installation host as follows:

    wget https://get.helm.sh/helm-v2.14.3-linux-amd64.tar.gz
    tar -xzvf helm-v2.14.3-linux-amd64.tar.gz
    cd linux-amd64/
    mv helm /usr/local/bin/helm

Create a Private Safe Docker Registry on Installation Host for SAP Data Intelligence

In contrast to SAP Data Hub, SAP Data Intelligence requires a secure (TLS-enabled) Docker registry.

  • Prepare a domain name for docker registry on Alibaba Cloud according to the instructions in the link below:

    https://www.alibabacloud.com/help/doc-detail/54068.htm

    In this tutorial, we have purchased the domain name docker-registry-on-alicloud.io.

  • Prepare a certificate for the domain name on Alibaba Cloud according to the instructions in the link below:

    https://www.alibabacloud.com/help/doc-detail/129370.htm

  • Prepare a DNS service on Alibaba Cloud according to the instructions in the link below:

    https://www.alibabacloud.com/help/doc-detail/58131.htm

  • Create a secret for TLS (e.g. tls-encrypt-key):

    kubectl create secret tls tls-encrypt-key --cert=<YOUR_CERT_FILE> --key=<YOUR_KEY_FILE>

    Note: --cert and --key are the certificate and key files generated in the steps above.

  • Create a username and password for your Docker registry with htpasswd using the commands below. In this tutorial, the username is ‘Sapbigdata’ and the password is ‘sapbigdata’.

    yum install httpd-tools -y
    touch htpasswd_file; htpasswd -Bb htpasswd_file Sapbigdata sapbigdata
    cat htpasswd_file

    You should see a result like the following:

[Figure: contents of htpasswd_file]

  • Create a .yaml file as shown in the figure below. In this tutorial, the file is named chart_values.yaml:

[Figure: chart_values.yaml]

  • Run the following command on the installation host:

    helm install stable/docker-registry -f chart_values.yaml --set service.type=LoadBalancer \
        --set tlsSecretName=tls-encrypt-key -n docker-registry

    You should see a result like the following:

[Figure: helm install output]

  • Run the following command on the installation host to get the IP address of the LoadBalancer:

    kubectl get svc | awk '/docker-registry/ {print $4}'

  • Register the domain name ‘docker-registry-on-alicloud.io’ with the IP address of the LoadBalancer in the DNS service that you set up in the step above.
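
The contents of chart_values.yaml are only shown as a screenshot in the original tutorial, so the following is an assumption about what such a file might contain, not a reproduction of it. The sketch builds a values file from htpasswd_file using the `secrets.htpasswd` and `service.port` values of the stable/docker-registry chart. write_chart_values is a hypothetical helper name; check the key names against the chart version you install.

```shell
# Sketch: generate a Helm values file that enables htpasswd authentication.
# write_chart_values is a hypothetical helper; the value keys are assumed
# to match the stable/docker-registry chart.
write_chart_values() {
    htpasswd_file="$1"
    out="${2:-chart_values.yaml}"
    if [ ! -s "$htpasswd_file" ]; then
        echo "htpasswd file '$htpasswd_file' is missing or empty" >&2
        return 1
    fi
    {
        echo "secrets:"
        echo "  htpasswd: |-"
        # Indent every credential line so it sits inside the YAML block scalar.
        sed 's/^/    /' "$htpasswd_file"
        echo "service:"
        echo "  port: 5000"
    } > "$out"
}
```

Running `write_chart_values htpasswd_file` before the `helm install` step would produce a chart_values.yaml in the current directory; review it before use.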

Verify Your Safe Docker Registry Configuration on Installation Host for SAP Data Intelligence

To verify your Docker registry configuration, log in to the registry from the installation host with the following command:

    docker login <DOMAIN-NAME>:5000 -u <USERNAME> -p <PASSWORD>

In this tutorial, DOMAIN-NAME is ‘docker-registry-on-alicloud.io’, USERNAME is ‘Sapbigdata’, and PASSWORD is ‘sapbigdata’. You should therefore see the result below after running the following command:

    docker login docker-registry-on-alicloud.io:5000 -u Sapbigdata -p sapbigdata

[Figure: docker login output]
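
Passing the password with -p leaves it in the shell history and process list. As an optional variation, the sketch below feeds the password through standard input using Docker's `--password-stdin` flag. registry_login and the REGISTRY_PASSWORD variable are hypothetical names chosen for this illustration.

```shell
# Sketch: log in to the registry without putting the password on the command line.
# registry_login and REGISTRY_PASSWORD are hypothetical names.
registry_login() {
    registry="$1"     # e.g. <DOMAIN-NAME>:5000
    user="$2"
    if [ -z "${REGISTRY_PASSWORD:-}" ]; then
        echo "set REGISTRY_PASSWORD before calling registry_login" >&2
        return 1
    fi
    # --password-stdin makes docker read the password from standard input.
    printf '%s' "$REGISTRY_PASSWORD" |
        docker login "$registry" -u "$user" --password-stdin
}
```

With the tutorial's values this would be `REGISTRY_PASSWORD=... registry_login docker-registry-on-alicloud.io:5000 Sapbigdata`.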

SAP Data Hub Installation

Get Installation Package

Set Default Storage Class

Set a storage class as the default on the installation host, as shown below:

[Figure: setting the default storage class]

Run Setup Script

On the installation host, unpack the DHFOUNDATION07_*.ZIP file, change to the slplugin/workdir directory, and run the setup script:

    cd <DH_FOLDER>/slplugin/workdir/
    ./setup.sh

Choose the configuration options that suit your environment. This example uses the values below.

[Figure: setup configuration (four screenshots)]

The whole setup process runs for about 50 minutes. At the end, you can see that the installation finished successfully, as shown below:

[Figure: SAP Data Hub installation finished]

SAP Data Intelligence Installation

General Prerequisites

Set Default Storage Class for Kubernetes Cluster

Log on to your installation host by SSH and list the available storage classes in the Alibaba Cloud Kubernetes cluster:

    kubectl get storageclass

Then choose one storage class and set it as the default. Keep the JSON patch on a single line, because a backslash inside single quotes is passed to kubectl literally and would invalidate the JSON:

    kubectl patch storageclass alicloud-disk-efficiency -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class":"true"}}}'

[Figure: default storage class set]
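
To confirm the result, you can look for the "(default)" marker that kubectl prints next to the default storage class. The following sketch is an illustration only; default_storage_classes is a hypothetical helper name, and it assumes the usual table layout of `kubectl get storageclass`, where the marker appears as the second whitespace-separated field.

```shell
# Sketch: print the names of all storage classes marked as default.
# default_storage_classes is a hypothetical helper, not a kubectl command.
default_storage_classes() {
    # --no-headers drops the column header row; the default class is shown
    # as "<name> (default)", so "(default)" is the second field.
    kubectl get storageclass --no-headers |
        awk '$2 == "(default)" {print $1}'
}
```

Exactly one name should be printed; if several classes are marked as default, remove the annotation from all but one.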

Run SLCB on Installation Host

In the directory that contains SLCB01_43-70003322.EXE, rename the file, make it executable, and create the inifile:

    mv SLCB01_43-70003322.EXE slcb
    chmod 0744 slcb
    touch inifile
    vi inifile

The following table describes the parameters that need to be inserted into the inifile:

Config Item                             Input in this tutorial (demo)
SLP_BRIDGE_REPOSITORY                   docker-registry-on-cloud.io:5000
SLP_DOCKER_REGISTRY                     "docker-registry-20200325.infra.datahub.sapcloud.io:5000"
ADMIN_USER                              input slcb username
ADMIN_PASSWORD                          input slcb password
SLP_ACTIVITY                            "INSTALL"
KUBECONFIG                              /root/.kube/config
SLP_NAMESPACE                           input a namespace
SAP_DOCKER_REPO                         input the Docker repository of SAP
DOCKER_USER                             input the Docker user of SAP_DOCKER_REPO
DOCKER_USER_PASS                        input the Docker user's password of SAP_DOCKER_REPO
SYSTEM_ADMIN_PASSWORD                   input the system admin's password in Data Intelligence
SLP_VORA_ADMIN_USERNAME                 input the admin's username in Data Intelligence
DEFAULT_ADMIN_PASSWORD                  input the default admin's password in Data Intelligence
SLP_ENABLE_NETWORK_POLICIES             true
SLP_ENABLE_KANIKO                       true
SLP_ENABLE_CHECKPOINT_STORE             true
SLP_VALIDATE_CHECKPOINT_STORE           true
SLP_CHECKPOINT_STORE_TYPE_RAW           oss
SLP_CHECKPOINT_OSS_ACCESS_KEY           input the Alibaba Cloud OSS Access Key you obtained earlier
SLP_CHECKPOINT_OSS_SECRET_ACCESS_KEY    input the Alibaba Cloud OSS Secret Access Key you obtained earlier
SLP_CHECKPOINT_OSS_HOST                 input the Internet access endpoint of the bucket created previously (this tutorial uses https://oss-cn-zhangjiakou.aliyuncs.com)
SLP_CHECKPOINT_OSS_PATH                 input [bucket name]/directory
SLP_PV_STORAGE_CLASS                    "alicloud-disk-efficiency"
SLP_VSYSTEM_STORAGE_CLASS               "alicloud-disk-efficiency"
SLP_DLOG_STORAGE_CLASS                  "alicloud-disk-efficiency"
SLP_DISK_STORAGE_CLASS                  "alicloud-disk-efficiency"
SLP_HANA_STORAGE_CLASS                  "alicloud-disk-efficiency"
SLP_DIAGNOSTIC_STORAGE_CLASS            "alicloud-disk-efficiency"
SLP_EXTRA_PARAMETERS                    "-e vsystem.pvcSize=20Gi -e hana.traceStorage=20Gi -e diagnostic.volumes.prometheusServer.size=20Gi"
                                        (Because of the Alibaba Cloud disk size limit, see https://www.alibabacloud.com/help/doc-detail/25513.htm, these storage sizes must be set to 20 GiB, the Alibaba Cloud lower limit, in the ‘Additional Installation Parameters’ option.)
SLP_CERT_DOMAIN                         "vora.sap.com"
LICENSE_AGREEMENT                       true
NON_INTERACTIVE_MODE                    true
SLP_TIMEOUT                             1800
SLP_CUSTOM_PROFILE                      |-
    #!baseProfile: di-platform-full
    hana:
      memoryLimit: 6Gi
      memoryRequest: 4Gi
      overrides: |-
        profile: dev
        resources:
          requests:
            cpu: 1
    uaa:
      overrides: |-
        cpu:
          min: 0.5

You must insert these config items into the inifile as shown in the figure below:

[Figure: inifile contents]

After that, run the init command:

    ./slcb init -l debug -i inifile --SAP_PV_DOCKER_REPO <SAP_DOCKER_REPO> -u none 2>&1

You should see a result like the following:

[Figure: slcb init output]

Then run the copy command below. This step copies the images from SAP_PV_DOCKER_REPO to your Docker registry.

    ./slcb copy -l debug -i inifile --SAP_PV_DOCKER_REPO <SAP_DOCKER_REPO> --useBridgeImage com.sap.datahub.linuxx86_64/di-platform-full-product-bridge:3.0.10 -u none 2>&1

You should see a result like the following:

[Figure: slcb copy output]

Run the execute command below. This step starts the installation of SAP Data Intelligence on Alibaba Cloud.

    ./slcb execute -l debug -x -i inifile --namespace <SLP_NAMESPACE> --SAP_PV_DOCKER_REPO <SAP_DOCKER_REPO> --useBridgeImage com.sap.datahub.linuxx86_64/di-platform-full-product-bridge:3.0.10 -u none 2>&1

The whole setup process runs for about 20 minutes. You should see a result like the following:

[Figure: SAP Data Intelligence installation finished]
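
Besides the installer output, a quick look at the pods in the installation namespace can indicate whether everything came up. The sketch below is an assumption-laden illustration, not an official health check: list_unhealthy_pods is a hypothetical helper name, and it treats any pod whose STATUS column is neither Running nor Completed as worth investigating.

```shell
# Sketch: list pods in a namespace that are neither Running nor Completed.
# list_unhealthy_pods is a hypothetical helper, not a kubectl command.
list_unhealthy_pods() {
    namespace="$1"    # the value used for SLP_NAMESPACE
    # Columns of `kubectl get pods`: NAME READY STATUS RESTARTS AGE
    kubectl get pods -n "$namespace" --no-headers |
        awk '$3 != "Running" && $3 != "Completed" {print $1 " " $3}'
}
```

Empty output from `list_unhealthy_pods <SLP_NAMESPACE>` suggests that all pods have started; any listed pod can be inspected further with `kubectl describe pod`.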

Data Intelligence Help URL

If you have questions during installation or use, refer to the SAP Data Intelligence help at https://help.sap.com/viewer/product/SAP_DATA_INTELLIGENCE_ON-PREMISE/3.0.latest/en-US.

Post-Installation

Expose Launchpad by Load Balancing

In the Alibaba Cloud console, click ‘Container Service for Kubernetes’. Then, under ‘Ingress and Load Balancing’, click ‘Services’. Select the cluster and the namespace created during the setup process. Then click ‘Create’ to create a new Service.

[Figure: service creation]

[Figure: service configuration]

A new LoadBalancer Service is created. Find the service (launchpad) and get its external IP from the ‘ExternalEndpoint’ column.

[Figure: launchpad external endpoint]

Access the Data Hub launchpad at https://<launchpad_ExternalEndpoint_ip>. Use the initial tenant and the initial tenant user created during the setup process to log on.
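
The external IP can also be read from the command line instead of the console. The following sketch assumes the Service is named launchpad, as in the text above, and that the load balancer reports an IP address rather than a hostname. launchpad_url is a hypothetical helper name.

```shell
# Sketch: build the launchpad URL from the LoadBalancer Service status.
# launchpad_url is a hypothetical helper; the service name is an assumption.
launchpad_url() {
    namespace="$1"
    service="${2:-launchpad}"
    # The jsonpath expression selects the first ingress IP of the load balancer.
    ip="$(kubectl -n "$namespace" get service "$service" \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}')" || return 1
    if [ -z "$ip" ]; then
        echo "service '$service' has no external IP yet" >&2
        return 1
    fi
    echo "https://$ip"
}
```

For example, `launchpad_url <SLP_NAMESPACE>` prints the URL once the load balancer has been provisioned, and fails with a message while the address is still pending.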

[Figure: Data Hub login page]

After a successful logon, you should see the following page:

[Figure: Data Hub launchpad after login]

Expose by Ingress

In the Alibaba Cloud console, click ‘Container Service for Kubernetes’. Then, under ‘Ingress and Load Balancing’, click ‘Ingresses’. Select the cluster and the namespace created during the setup process. Then click ‘Create’ to create an Ingress.

[Figure: Ingress creation]

Provide the following configuration parameters:

  • Domain: a domain name, which you must prepare in advance.
  • Services: ‘vsystem’ for Service Name and ‘8797’ for Service Port.
  • Enable TLS: select the secret, which you must create in Kubernetes in advance from your TLS certificate.

It may take a few minutes for the Ingress to get a backend address.

After the external address appears, update your DNS with an entry that maps the domain you provided above to the external IP of the Ingress.
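
For readers who prefer a manifest over the console, the parameters above could be expressed roughly as follows. This is a sketch under assumptions, not the configuration the console generates: it uses the extensions/v1beta1 Ingress API available in Kubernetes 1.14, assumes an NGINX-based Ingress controller, and assumes that vsystem expects HTTPS from the controller (hence the backend-protocol annotation). write_vsystem_ingress, the domain, and the secret name are hypothetical.

```shell
# Sketch: write an Ingress manifest for the vsystem service on port 8797.
# write_vsystem_ingress is a hypothetical helper; arguments are placeholders.
write_vsystem_ingress() {
    domain="$1"          # the domain prepared in advance
    tls_secret="$2"      # name of the TLS secret created in advance
    out="${3:-vsystem-ingress.yaml}"
    cat > "$out" <<EOF
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: vsystem
  annotations:
    # Assumption: the NGINX controller must talk HTTPS to the backend.
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  tls:
  - hosts:
    - $domain
    secretName: $tls_secret
  rules:
  - host: $domain
    http:
      paths:
      - path: /
        backend:
          serviceName: vsystem
          servicePort: 8797
EOF
}
```

After reviewing the generated file, it could be applied with `kubectl -n <namespace> apply -f vsystem-ingress.yaml`.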

Access the Data Hub launchpad over HTTPS at the domain you configured above. For details on creating the Ingress manually, refer to https://help.sap.com/viewer/e66c399612e84a83a8abe97c0eeb443a/2.7.latest/en-US/7ea7d26ecb874d9aa046ab88c3bf4704.html.