
RAID and How to Set It Up on Alibaba Cloud

In this tutorial, we'll look at RAID, a data storage technology used to protect against drive failure, and go through how you can set it up on Alibaba Cloud.

By Hitesh Jethva, Alibaba Cloud Community Blog author.

RAID, or Redundant Array of Inexpensive Disks, is a data storage technology that combines multiple physical hard disk drives into one or more logical units to protect data in the case of a drive failure. RAID works by replicating or distributing data across two or more physical hard drives linked together by a RAID controller, which can be either hardware based or software based. Because data is stored on multiple drives, I/O operations can overlap in a balanced way, improving performance and increasing fault tolerance. The disks can be combined into the array in different ways, known as RAID levels. RAID can be used to guard against data loss caused by disk failure while also improving performance, and it works with SATA, SAS and SSD drives.

RAID Storage Techniques

RAID arrays organize data using three storage techniques:

  • Striping: A method of splitting the flow of data into blocks and spreading those blocks across multiple storage devices. This increases read/write performance.
  • Mirroring: Replicates the same data onto two or more disks. This method suits applications that require high performance and high availability.
  • Parity: Provides fault tolerance by calculating a checksum across the data on two drives and storing the result on a third.
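The parity technique can be illustrated with a bitwise XOR, which is what common RAID implementations use: the parity block is the XOR of the data blocks, and XOR-ing the parity with the surviving data recovers a lost block. A minimal sketch with hypothetical byte values:

```shell
# Hypothetical data bytes stored on two drives
D1=$((0xA5))
D2=$((0x3C))

# Parity stored on a third drive is the XOR of the data
PARITY=$((D1 ^ D2))

# If drive 2 is lost, its data is D1 XOR parity
RECOVERED=$((D1 ^ PARITY))
printf 'recovered: 0x%02X\n' "$RECOVERED"   # prints: recovered: 0x3C
```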

Hardware and Software Implementations of RAID

RAID can be implemented as either hardware RAID or software RAID.

A hardware implementation of RAID uses a high-performance RAID controller, typically built as a PCI Express card or as a complex standalone controller. Such a controller can be equipped with its own CPU and battery-backed cache memory, and typically supports hot-swapping. Hardware RAID lets you install an operating system on top of the array, which can increase system uptime. In these setups, logical disks are configured and mirrored outside of the operating system: the physical RAID controller manages the array and presents it to applications and operating systems as logical units.

A software implementation of RAID is provided by an operating system driver. It is supported on most modern operating systems and is a less costly, relatively more versatile option. Because there is no dedicated controller, software RAID performs all processing on the host's CPU, but you do not need to install any extra physical hardware. Software RAID is generally slower than hardware RAID, because the host CPU must manage the extra disk I/O.

RAID Levels

RAID can be implemented at various levels. The most commonly used are RAID0, RAID1, RAID5, RAID6 and RAID10.

RAID0

RAID0 is also known as striping. It offers great performance for both read and write operations, but it provides no fault tolerance: if one disk drive fails, all data in the RAID0 array is lost. Data is split across the member disks, so with two disks, half of each file's blocks are stored on one disk and half on the other. You will need a minimum of two hard disks to create a RAID0 array.

RAID1

RAID1 is also known as mirroring. In this method, data is replicated across two or more disks; if one drive fails, data can be recovered from the mirror drive. You will need a minimum of two hard disks to create a RAID1 array, and you get only half of the total drive capacity because all data is written twice. It provides full fault tolerance and excellent read speed.

RAID5

RAID5 is also known as distributed parity. You will need a minimum of three disks, and can use up to 16, to implement RAID5. Data blocks are striped across the disks, and for each stripe a parity checksum of the data blocks is stored, with the parity rotated across all of the disks rather than kept on a single drive. If one disk fails, data can be recovered from the parity information stored on the remaining disks. RAID5 provides good performance and tolerates a single drive failure, making it a common choice for file and application servers with a limited number of data drives.
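Since one disk's worth of space holds parity, the usable capacity of a RAID5 array is (n − 1) times the size of the smallest member disk. A quick sketch for a hypothetical array of four 2 GiB drives:

```shell
DISKS=4        # number of member disks (hypothetical)
DISK_GIB=2     # size of the smallest disk in GiB
USABLE=$(( (DISKS - 1) * DISK_GIB ))
echo "RAID5 usable capacity: ${USABLE} GiB"   # prints: RAID5 usable capacity: 6 GiB
```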

RAID6

RAID6 is very similar to RAID5, but the parity data is written to two drives: it uses block-level striping and distributes two parity blocks across the disks in each stripe. You will need a minimum of four disks to implement RAID6, and if any two disks fail, you can recover the data by replacing them with new disks. Because two disks' worth of space is consumed by parity, RAID6 tends to be more costly, and write performance is relatively poor because two parity blocks must be calculated and written for every write. It survives any two simultaneous drive failures and offers very good read speed.

RAID10

RAID10, also known as RAID 1+0, combines disk mirroring and disk striping to protect data. It is very secure, because it duplicates all of your data, and also very fast, because the data is striped across multiple disks. It is fault tolerant and provides good read/write performance. You will need a minimum of four disks to implement RAID10.

Setting up RAID0 Array

Now that we have discussed what RAID is, let's look at how you can set up RAID0 and RAID6 on Alibaba Cloud Elastic Compute Service (ECS) instances running Ubuntu 16.04. We will break this into two main sections; this section covers how to set up RAID0.

Building off of what we discussed before, RAID0 provides high performance but zero fault tolerance: if any one disk fails, you cannot recover the data. RAID0 is an ideal solution for non-critical data that must be read and written at high speed. You will need a minimum of two drives to implement a RAID0 array.

Prerequisites

Before you can set up RAID0 on Alibaba Cloud, you need to first have the following items:

  • A newly created Alibaba Cloud ECS instance running Ubuntu 16.04.
  • A minimum of two extra hard drives attached to the instance.
  • A root password set up for the instance.

For reference purposes, see how to create a new ECS instance and connect to your instance. Once you are logged into your Ubuntu 16.04 instance, run the apt-get update -y command to update your base system with the latest available packages.

Getting Started

Before starting, make sure you have two extra hard drives attached to your instance. You can check by running the fdisk -l | grep -i sd command; you should see the two external hard drives /dev/sdb and /dev/sdc in the following output:

Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors
/dev/sda1  *      2048   499711   497664  243M 83 Linux
/dev/sda2       501758 20969471 20467714  9.8G  5 Extended
/dev/sda5       501760 20969471 20467712  9.8G 8e Linux LVM
Disk /dev/sdb: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sdc: 2 GiB, 2147483648 bytes, 4194304 sectors

Install Mdadm

Mdadm is a Linux utility used to create, manage and monitor software RAID devices. By default, mdadm is available in the Ubuntu 16.04 default repository. You can install it by just running the apt-get install mdadm -y command, and after the installation is completed, you can proceed to the next step.

Create RAID0 Array

Now, let's create the RAID0 array with the device name /dev/md0, RAID level 0, and the disks /dev/sdb and /dev/sdc:

mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc

The output is as follows:

mdadm: chunk size defaults to 512K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

You can now check the status of RAID array with the following command:

cat /proc/mdstat

You should see the following output:

Personalities : [raid0] 
md0 : active raid0 sdc[1] sdb[0]
      4190208 blocks super 1.2 512k chunks
      
unused devices: <none>

You can also check detailed information about each member disk with the following command:

mdadm -E /dev/sd[b-c]

The output is as follows:

/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a4d636f7:eb05e54a:28cdb118:46076be1
           Name : Node2:0  (local to host Node2)
  Creation Time : Sun Nov 18 12:11:01 2018
     Raid Level : raid0
   Raid Devices : 2

 Avail Dev Size : 4190208 (2046.34 MiB 2145.39 MB)
    Data Offset : 4096 sectors
   Super Offset : 8 sectors
   Unused Space : before=4008 sectors, after=0 sectors
          State : clean
    Device UUID : 8b79b1f4:d313744a:5559fc18:cff176ba

    Update Time : Sun Nov 18 12:11:01 2018
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 15bc99ce - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a4d636f7:eb05e54a:28cdb118:46076be1
           Name : Node2:0  (local to host Node2)
  Creation Time : Sun Nov 18 12:11:01 2018
     Raid Level : raid0
   Raid Devices : 2

 Avail Dev Size : 4190208 (2046.34 MiB 2145.39 MB)
    Data Offset : 4096 sectors
   Super Offset : 8 sectors
   Unused Space : before=4008 sectors, after=0 sectors
          State : clean
    Device UUID : 528b5008:af0043f3:dbb81ab4:01c7cb32

    Update Time : Sun Nov 18 12:11:01 2018
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : e59dcd29 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

You can also run the following command:

mdadm --detail /dev/md0

The output is as follows:

/dev/md0:
        Version : 1.2
  Creation Time : Sun Nov 18 12:11:01 2018
     Raid Level : raid0
     Array Size : 4190208 (4.00 GiB 4.29 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sun Nov 18 12:11:01 2018
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : Node2:0  (local to host Node2)
           UUID : a4d636f7:eb05e54a:28cdb118:46076be1
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc

Create a Filesystem on RAID Array

The RAID0 array is now configured and working. It's time to create a filesystem on the RAID array (/dev/md0).

You can do this by running the following command:

mkfs.ext4 /dev/md0
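Optionally, ext4 can be told about the array geometry so allocations align with the stripe. With the default 512K chunk, a 4K filesystem block, and two data disks, stride = 512/4 = 128 and stripe_width = 128 × 2 = 256. A sketch that computes the figures and prints the resulting command rather than running it:

```shell
CHUNK_KIB=512    # mdadm default chunk size, as shown in the output above
BLOCK_KIB=4      # default ext4 block size
DATA_DISKS=2     # RAID0 has no parity, so both disks carry data

STRIDE=$(( CHUNK_KIB / BLOCK_KIB ))
STRIPE_WIDTH=$(( STRIDE * DATA_DISKS ))
echo "mkfs.ext4 -E stride=${STRIDE},stripe_width=${STRIPE_WIDTH} /dev/md0"
# prints: mkfs.ext4 -E stride=128,stripe_width=256 /dev/md0
```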

Next, create a mount point and mount the RAID array on this:

mkdir /opt/raid0
mount /dev/md0 /opt/raid0

Next, check the mounted RAID array with the following command:

df -h

The output is as follows:

Filesystem                  Size  Used Avail Use% Mounted on
udev                        478M     0  478M   0% /dev
tmpfs                       100M  3.3M   96M   4% /run
/dev/mapper/Node1--vg-root  9.0G  5.8G  2.8G  69% /
tmpfs                       497M  4.0K  497M   1% /dev/shm
tmpfs                       5.0M     0  5.0M   0% /run/lock
tmpfs                       497M     0  497M   0% /sys/fs/cgroup
/dev/sda1                   236M   85M  139M  38% /boot
cgmfs                       100K     0  100K   0% /run/cgmanager/fs
tmpfs                       100M     0  100M   0% /run/user/0
/dev/md0                    3.9G  8.0M  3.7G   1% /opt/raid0

Next, you will need to update the /etc/mdadm/mdadm.conf file so that the RAID array is reassembled automatically at boot time. You can do this by running the following command:

mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf

The output will be as so:

ARRAY /dev/md0 metadata=1.2 name=Node2:0 UUID=a4d636f7:eb05e54a:28cdb118:46076be1

Next, update the initramfs to apply the changes with the update-initramfs -u command. Then create an entry for /dev/md0 in the /etc/fstab file for automatic mounting at boot:

nano /etc/fstab

Add the following line:

/dev/md0  /opt/raid0     ext4    defaults,nofail,discard   0  0

Save and close the file when you are finished.
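You can verify the fstab entry without rebooting by unmounting the array and letting mount re-read the file (this assumes the entry shown above has been saved):

```shell
umount /opt/raid0        # drop the manual mount
mount -a                 # mount everything listed in /etc/fstab
df -h | grep /opt/raid0  # the array should be mounted again
```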

Delete RAID0 Array

First, take a backup of any data on the RAID device, then unmount it from the filesystem with the following command:

umount /dev/md0

Next, deactivate the RAID device with the following command:

mdadm --stop /dev/md0

The output is as follows:

mdadm: stopped /dev/md0

Finally, remove the superblocks from all disks with the following command:

mdadm --zero-superblock /dev/sdb /dev/sdc

Now, check the status of RAID array with the following command:

cat /proc/mdstat 

The output is as follows:

Personalities : [raid0] 
unused devices: <none>
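If you previously added the array to /etc/mdadm/mdadm.conf and /etc/fstab, remove those entries as well so the system does not try to assemble or mount the deleted array at boot. A sketch using sed; adjust the patterns if your entries differ:

```shell
# Delete the ARRAY line for /dev/md0 from the mdadm configuration
sed -i '/^ARRAY \/dev\/md0/d' /etc/mdadm/mdadm.conf

# Delete the /dev/md0 mount entry from fstab
sed -i '/^\/dev\/md0/d' /etc/fstab

# Rebuild the initramfs so the change takes effect at boot
update-initramfs -u
```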

Setting Up RAID6 Array

Now let's set up a RAID6 array on an ECS instance.

As discussed before, RAID6 is an upgraded version of RAID5 in which the parity data is written to two drives. RAID6 stores two parity records on different disk drives, which allows two simultaneous disk drive failures in the same RAID group to be recovered. RAID6 provides high performance in read operations but poorer performance in write operations. It remains fault tolerant even after two disks fail; you can rebuild the data from parity after replacing the failed disks. You will need a minimum of four disks to implement a RAID6 array.
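With two disks' worth of space used for parity, usable RAID6 capacity is (n − 2) times the disk size. For the four 2 GiB disks used in this tutorial, that matches the 4.00 GiB array size that mdadm reports later in this section:

```shell
DISKS=4        # member disks attached to the instance
DISK_GIB=2     # each disk is 2 GiB
USABLE=$(( (DISKS - 2) * DISK_GIB ))
echo "RAID6 usable capacity: ${USABLE} GiB"   # prints: RAID6 usable capacity: 4 GiB
```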

Prerequisites

Before you can set up RAID6 on Alibaba Cloud, you need to first have the following items:

  • A fresh Alibaba Cloud ECS instance running Ubuntu 16.04.
  • A minimum of four extra hard drives attached to the instance.
  • A root password set up for the instance.

For reference, see how to create a new ECS instance and connect to your instance. Once you are logged into your Ubuntu 16.04 instance, run the apt-get update -y command to update your base system with the latest available packages.

Getting Started

Before starting, make sure you have four hard drives attached to your instance. You can check by running the fdisk -l | grep -i sd command. You should see the four external hard drives /dev/sdb, /dev/sdc, /dev/sdd and /dev/sde in the following output:

Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors
/dev/sda1  *      2048   499711   497664  243M 83 Linux
/dev/sda2       501758 20969471 20467714  9.8G  5 Extended
/dev/sda5       501760 20969471 20467712  9.8G 8e Linux LVM
Disk /dev/sdb: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sdc: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sdd: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sde: 2 GiB, 2147483648 bytes, 4194304 sectors

Install Mdadm

Mdadm is a Linux utility used to create, manage and monitor software RAID devices. By default, mdadm is available in the Ubuntu 16.04 default repository. You can install it by running the apt-get install mdadm -y command, and after the installation is complete, you can proceed to the next step.

Create RAID6 Array

Now, let's create the RAID6 array with the device name /dev/md0, RAID level 6, and the disks /dev/sdb, /dev/sdc, /dev/sdd and /dev/sde:

mdadm --create --verbose /dev/md0 --level=6 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde

The output is as follows:

mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: size set to 2095104K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

You can now check the status of RAID array with the following command:

cat /proc/mdstat

You should see the following output:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid6 sde[3] sdd[2] sdc[1] sdb[0]
      4190208 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
      
unused devices: <none>

You can also check detailed information about the RAID array with the following command:

mdadm --detail /dev/md0

The output is as follows:

/dev/md0:
        Version : 1.2
  Creation Time : Sun Nov 18 13:03:32 2018
     Raid Level : raid6
     Array Size : 4190208 (4.00 GiB 4.29 GB)
  Used Dev Size : 2095104 (2046.34 MiB 2145.39 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Sun Nov 18 13:04:35 2018
          State : clean 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : Node2:0  (local to host Node2)
           UUID : e7d5fc59:661516bc:3b1001a4:8cd03659
         Events : 17

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
       2       8       48        2      active sync   /dev/sdd
       3       8       64        3      active sync   /dev/sde

Create a Filesystem on RAID Array

The RAID6 array is now configured and working. It's time to create a filesystem on the RAID array (/dev/md0).

You can do this by running the following command:

mkfs.ext4 /dev/md0

Next, create a mount point and mount the RAID array on this:

mkdir /opt/raid6
mount /dev/md0 /opt/raid6

Next, check the mounted RAID array with the following command:

df -h

The output is as follows:

Filesystem                  Size  Used Avail Use% Mounted on
udev                        478M     0  478M   0% /dev
tmpfs                       100M  3.3M   96M   4% /run
/dev/mapper/Node1--vg-root  9.0G  5.8G  2.8G  69% /
tmpfs                       497M  4.0K  497M   1% /dev/shm
tmpfs                       5.0M     0  5.0M   0% /run/lock
tmpfs                       497M     0  497M   0% /sys/fs/cgroup
/dev/sda1                   236M   85M  139M  38% /boot
cgmfs                       100K     0  100K   0% /run/cgmanager/fs
tmpfs                       100M     0  100M   0% /run/user/0
/dev/md0                    3.9G  8.0M  3.7G   1% /opt/raid6

Next, you will need to update the /etc/mdadm/mdadm.conf file so that the RAID array is reassembled automatically at boot time. You can do this by running the following command:

mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf

The output is as follows:

ARRAY /dev/md0 metadata=1.2 name=Node2:0 UUID=e7d5fc59:661516bc:3b1001a4:8cd03659

Next, update the initramfs to apply the changes with the following command:

update-initramfs -u

Next, you will need to create a mount point for /dev/md0 in /etc/fstab file for automatic mounting at boot:

nano /etc/fstab

Add the following line:

/dev/md0  /opt/raid6     ext4    defaults,nofail,discard   0  0

Add a Spare Drive to RAID Array

Currently, we have four drives in the RAID array. It is recommended to add a spare drive to a RAID6 array, because if any one of the disks fails, the array can rebuild its data onto the spare.

First, shut down the system, attach a spare drive, and boot the instance back up.
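Once the instance is back up, you can confirm the new disk is visible before adding it (this assumes it appears as /dev/sdf):

```shell
lsblk /dev/sdf           # the new, empty disk should be listed
fdisk -l | grep -i sdf   # and show up in the fdisk output
```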

Next, add the spare drive (/dev/sdf) to RAID6 array with the following command:

mdadm --add /dev/md0 /dev/sdf

Next, check the added device with the following command:

mdadm --detail /dev/md0

The output is as follows:

/dev/md0:
        Version : 1.2
  Creation Time : Sun Nov 18 13:03:32 2018
     Raid Level : raid6
     Array Size : 4190208 (4.00 GiB 4.29 GB)
  Used Dev Size : 2095104 (2046.34 MiB 2145.39 MB)
   Raid Devices : 4
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Sun Nov 18 13:20:49 2018
          State : clean 
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : Node2:0  (local to host Node2)
           UUID : e7d5fc59:661516bc:3b1001a4:8cd03659
         Events : 18

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
       2       8       48        2      active sync   /dev/sdd
       3       8       64        3      active sync   /dev/sde

       4       8       80        -      spare   /dev/sdf

Test RAID6 Fault Tolerance

Let's mark /dev/sdc as a failed drive and test whether the spare drive takes over automatically.

You can mark /dev/sdc as a failed drive with the following command:

mdadm --manage --fail /dev/md0 /dev/sdc

The output is as follows:

mdadm: set /dev/sdc faulty in /dev/md0

Now, check the details of RAID6 array with the following command:

mdadm --detail /dev/md0

You should see the faulty drive and spare rebuilding process in the following output:

/dev/md0:
        Version : 1.2
  Creation Time : Sun Nov 18 13:03:32 2018
     Raid Level : raid6
     Array Size : 4190208 (4.00 GiB 4.29 GB)
  Used Dev Size : 2095104 (2046.34 MiB 2145.39 MB)
   Raid Devices : 4
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Sun Nov 18 13:25:59 2018
          State : clean, degraded, recovering 
 Active Devices : 3
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

 Rebuild Status : 57% complete

           Name : Node2:0  (local to host Node2)
           UUID : e7d5fc59:661516bc:3b1001a4:8cd03659
         Events : 29

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       4       8       80        1      spare rebuilding   /dev/sdf
       2       8       48        2      active sync   /dev/sdd
       3       8       64        3      active sync   /dev/sde

       1       8       32        -      faulty   /dev/sdc
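Once the rebuild onto the spare has finished, remove the faulty member from the array; after the physical disk is replaced, it can be added back as the new spare (a sketch, assuming the replacement also appears as /dev/sdc):

```shell
# Remove the failed disk from the array
mdadm --manage /dev/md0 --remove /dev/sdc

# After swapping in a replacement disk, add it back as the spare
mdadm --manage /dev/md0 --add /dev/sdc
```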