By Hitesh Jethva, Alibaba Cloud Community Blog author.
RAID, or Redundant Array of Inexpensive Disks, is a data storage technology that combines multiple physical hard disk drives into one or more logical units to protect data in the event of a drive failure. RAID works by distributing or replicating data across two or more physical drives linked together by a RAID controller, which can be either hardware based or software based. Because data is spread across multiple drives, I/O operations can overlap in a balanced way, improving performance and increasing fault tolerance. The different ways the disks can be combined into an array are known as RAID levels. RAID can help prevent data loss caused by disk failure and can also improve performance, and it can be used with SATA, SAS, and SSD drives.
RAID arrays organize data using three storage techniques: striping (splitting data across disks), mirroring (keeping identical copies on separate disks), and parity (storing checksum information from which lost data can be rebuilt).
RAID can be implemented as hardware RAID or software RAID.
A hardware implementation of RAID uses a high-performance RAID controller, typically a PCI Express card or a standalone controller. Such a controller has its own CPU and battery-backed cache memory, and usually supports hot-swapping. Hardware RAID allows you to install the operating system on top of the array, which can increase system uptime. In this setup, logical disks are configured and mirrored outside of the operating system: the physical RAID controller manages the array and presents it to the operating system and applications as logical units.
A software implementation of RAID is provided by an operating system driver. It is supported on most modern operating systems and is a less costly and relatively more versatile option. Software RAID performs all RAID processing on the host's CPU rather than on a dedicated controller, so no additional physical hardware needs to be installed. Because the host CPU has to manage the extra disk I/O, software RAID is generally slower than hardware RAID.
RAID can be implemented at various levels. The most commonly used levels are RAID0, RAID1, RAID5, RAID6, and RAID10.
RAID0 is also known as striping. It offers great performance for both read and write operations, but it provides no fault tolerance: if one disk drive fails, all data in the RAID0 array is lost. In this method, data is split across the disks, so with two disks, half of the content is stored on one disk and the other half on the other. You need a minimum of two hard disks to create a RAID0 array.
RAID1 is also known as mirroring. In this method, data is replicated across two or more disks; if one drive fails, the data can be recovered from the mirror drive. You need a minimum of two hard disks to create a RAID1 array. Because all data is written twice, only half of the total drive capacity is usable. RAID1 provides full fault tolerance and excellent read speed.
RAID5 is also known as distributed parity. You need a minimum of three disks, and can use up to 16, to implement RAID5. Data blocks are striped across the disks, and a parity checksum for each stripe is distributed among them. If one disk fails, the data can be rebuilt from the parity information stored on the other disks. RAID5 provides good performance and fault tolerance, and is an ideal solution for file and application servers that have a limited number of data drives.
RAID6 is very similar to RAID5, but the parity data is written twice: it uses block-level striping and distributes two parity blocks across the disks in the array. You need a minimum of four disks to implement RAID6, and even if any two disks fail, the data can be recovered after replacing them. Because two disks' worth of capacity is consumed by parity, RAID6 tends to be more costly, and write performance is relatively poor because every write must update two parity blocks. It provides full fault tolerance and very good read speed.
RAID10, also known as RAID 1+0, combines disk mirroring and disk striping to protect data. RAID10 is very secure because it duplicates all your data, and fast because the data is striped across multiple disks. It is fault tolerant and provides good read/write performance. You need a minimum of four disks to implement RAID10.
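To make the capacity trade-offs between these levels concrete, here is a small shell sketch. The values are hypothetical (four identical disks of 2 GiB each, matching the disk sizes used later in this tutorial); adjust N and S for your own setup:

```shell
# Usable capacity for common RAID levels, given N identical disks of S GiB each.
N=4; S=2
raid0=$((N * S))          # striping: all capacity usable, no redundancy
raid1=$((S))              # mirroring: one disk's worth, however many copies
raid5=$(((N - 1) * S))    # one disk's capacity lost to parity
raid6=$(((N - 2) * S))    # two disks' capacity lost to parity
raid10=$((N * S / 2))     # mirrored stripes: half the raw capacity
echo "RAID0=${raid0}G RAID1=${raid1}G RAID5=${raid5}G RAID6=${raid6}G RAID10=${raid10}G"
# → RAID0=8G RAID1=2G RAID5=6G RAID6=4G RAID10=4G
```

Note how RAID6 and RAID10 end up with the same usable capacity for four disks; they differ in failure modes and write behavior, not raw space.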
Now that we have discussed what RAID is, let's look at how to set up RAID0 and RAID6 on Alibaba Cloud Elastic Compute Service (ECS) instances running Ubuntu 16.04. We will break this up into two main sections. This section discusses how to set up RAID0.
Building off of what we discussed before, RAID0 provides high performance and zero fault tolerance: if any one disk fails, you cannot recover the data. RAID0 is an ideal solution for non-critical data that needs to be read and written at high speed. You need a minimum of two drives to implement a RAID0 array.
Before you can set up RAID0 on Alibaba Cloud, you first need an ECS instance running Ubuntu 16.04, two additional data disks attached to the instance, and root access to the instance.
For reference purposes, see the documentation on creating a new ECS instance and connecting to your instance. Also, once you are logged into your Ubuntu 16.04 instance, run the apt-get update -y
command to update your base system with the latest available packages.
Before starting, make sure you have two data disks attached to your instance. You can check by running the fdisk -l | grep -i sd
command, and then you should see the two additional disks /dev/sdb and /dev/sdc in the following output:
Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors
/dev/sda1 * 2048 499711 497664 243M 83 Linux
/dev/sda2 501758 20969471 20467714 9.8G 5 Extended
/dev/sda5 501760 20969471 20467712 9.8G 8e Linux LVM
Disk /dev/sdb: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sdc: 2 GiB, 2147483648 bytes, 4194304 sectors
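If you prefer a filtered view, you can pipe the fdisk output through awk to list only the whole-disk lines. The sketch below runs on a sample string that matches the output above; on a real instance you would pipe fdisk -l into the same awk filter:

```shell
# Filter `fdisk -l` output down to whole-disk device names and sizes.
# Here fed from a sample matching the tutorial's output; in practice:
#   fdisk -l | awk '/^Disk \/dev\/sd/ {gsub(/[:,]/,""); print $2, $3 $4}'
fdisk_sample='Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors
Disk /dev/sdb: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sdc: 2 GiB, 2147483648 bytes, 4194304 sectors'
echo "$fdisk_sample" | awk '/^Disk \/dev\/sd/ {gsub(/[:,]/,""); print $2, $3 $4}'
# → /dev/sda 10GiB
#   /dev/sdb 2GiB
#   /dev/sdc 2GiB
```

The lsblk command offers a similar overview of attached block devices if you prefer a tree view.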
Mdadm is a Linux utility used to create, manage, and monitor software RAID devices. It is available in the default Ubuntu 16.04 repository, so you can install it by running the apt-get install mdadm -y
command, and after the installation is complete, you can proceed to the next step.
Now, let's create the RAID0 array with the device name /dev/md0, RAID level 0, and the disks /dev/sdb and /dev/sdc:
mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
The output is as follows:
mdadm: chunk size defaults to 512K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
You can now check the status of the RAID array with the following command:
cat /proc/mdstat
You should see the following output:
Personalities : [raid0]
md0 : active raid0 sdc[1] sdb[0]
4190208 blocks super 1.2 512k chunks
unused devices: <none>
You can also check detailed information about the RAID array with the following command:
mdadm -E /dev/sd[b-c]
The output is as follows:
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a4d636f7:eb05e54a:28cdb118:46076be1
Name : Node2:0 (local to host Node2)
Creation Time : Sun Nov 18 12:11:01 2018
Raid Level : raid0
Raid Devices : 2
Avail Dev Size : 4190208 (2046.34 MiB 2145.39 MB)
Data Offset : 4096 sectors
Super Offset : 8 sectors
Unused Space : before=4008 sectors, after=0 sectors
State : clean
Device UUID : 8b79b1f4:d313744a:5559fc18:cff176ba
Update Time : Sun Nov 18 12:11:01 2018
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 15bc99ce - correct
Events : 0
Chunk Size : 512K
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a4d636f7:eb05e54a:28cdb118:46076be1
Name : Node2:0 (local to host Node2)
Creation Time : Sun Nov 18 12:11:01 2018
Raid Level : raid0
Raid Devices : 2
Avail Dev Size : 4190208 (2046.34 MiB 2145.39 MB)
Data Offset : 4096 sectors
Super Offset : 8 sectors
Unused Space : before=4008 sectors, after=0 sectors
State : clean
Device UUID : 528b5008:af0043f3:dbb81ab4:01c7cb32
Update Time : Sun Nov 18 12:11:01 2018
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : e59dcd29 - correct
Events : 0
Chunk Size : 512K
Device Role : Active device 1
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
You can also run the following command:
mdadm --detail /dev/md0
The output is as follows:
/dev/md0:
Version : 1.2
Creation Time : Sun Nov 18 12:11:01 2018
Raid Level : raid0
Array Size : 4190208 (4.00 GiB 4.29 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Sun Nov 18 12:11:01 2018
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Chunk Size : 512K
Name : Node2:0 (local to host Node2)
UUID : a4d636f7:eb05e54a:28cdb118:46076be1
Events : 0
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
The RAID0 array is now configured and working. It's time to create a filesystem on the RAID array (/dev/md0).
You can do this by running the following command:
mkfs.ext4 /dev/md0
Next, create a mount point and mount the RAID array on it:
mkdir /opt/raid0
mount /dev/md0 /opt/raid0
Next, check the mounted RAID array with the following command:
df -h
The output is as follows:
Filesystem Size Used Avail Use% Mounted on
udev 478M 0 478M 0% /dev
tmpfs 100M 3.3M 96M 4% /run
/dev/mapper/Node1--vg-root 9.0G 5.8G 2.8G 69% /
tmpfs 497M 4.0K 497M 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 497M 0 497M 0% /sys/fs/cgroup
/dev/sda1 236M 85M 139M 38% /boot
cgmfs 100K 0 100K 0% /run/cgmanager/fs
tmpfs 100M 0 100M 0% /run/user/0
/dev/md0 3.9G 8.0M 3.7G 1% /opt/raid0
Next, you will need to configure the /etc/mdadm/mdadm.conf file so that the RAID array is reassembled automatically at boot time. You can do this by running the following command:
mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf
The output will be as so:
ARRAY /dev/md0 metadata=1.2 name=Node2:0 UUID=a4d636f7:eb05e54a:28cdb118:46076be1
Next, update the initramfs to apply the changes with the update-initramfs -u command. Then, create an entry for /dev/md0 in the /etc/fstab file so that the array is mounted automatically at boot:
nano /etc/fstab
Add the following line:
/dev/md0 /opt/raid0 ext4 defaults,nofail,discard 0 0
Save and close the file when you are finished.
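Before rebooting, it can be worth sanity-checking the new entry, since a malformed fstab line can interfere with a clean boot. The sketch below checks only that the line has the six expected fields (device, mount point, filesystem type, options, dump, pass):

```shell
# A minimal sanity check of the fstab entry added above (field count only).
line='/dev/md0 /opt/raid0 ext4 defaults,nofail,discard 0 0'
echo "$line" | awk 'NF==6 {print "fstab entry OK (" $2 " on " $1 ")"} NF!=6 {print "malformed entry"}'
# → fstab entry OK (/opt/raid0 on /dev/md0)
```

After saving the real file, you can also run mount -a, which attempts to mount everything listed in /etc/fstab and reports any errors immediately, which is a safer check than rebooting blind.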
If you ever want to dismantle the RAID0 array, first take a backup of the data on the RAID device, then unmount it from the filesystem with the following command:
umount /dev/md0
Next, deactivate the RAID device with the following command:
mdadm --stop /dev/md0
The output is as follows:
mdadm: stopped /dev/md0
Finally, remove the superblocks from all disks with the following command:
mdadm --zero-superblock /dev/sdb /dev/sdc
Now, check the status of the RAID array with the following command:
cat /proc/mdstat
The output is as follows:
Personalities : [raid0]
unused devices: <none>
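If you appended an ARRAY line to /etc/mdadm/mdadm.conf and an entry to /etc/fstab earlier, remember to remove them as well, or the system may look for the vanished array at boot. The sketch below demonstrates the sed commands on temporary copies of those lines; run the same sed expressions against the real files (as root) on your instance:

```shell
# Remove the stale ARRAY definition and fstab entry left over after
# dismantling the array. Demonstrated on temporary copies of the files.
tmpconf=$(mktemp); tmpfstab=$(mktemp)
echo 'ARRAY /dev/md0 metadata=1.2 name=Node2:0 UUID=a4d636f7:eb05e54a:28cdb118:46076be1' > "$tmpconf"
echo '/dev/md0 /opt/raid0 ext4 defaults,nofail,discard 0 0' > "$tmpfstab"
sed -i '\|^ARRAY /dev/md0|d' "$tmpconf"       # drop the array definition
sed -i '\|^/dev/md0 |d' "$tmpfstab"           # drop the mount entry
conf_lines=$(wc -l < "$tmpconf"); fstab_lines=$(wc -l < "$tmpfstab")
echo "$conf_lines $fstab_lines"               # both files are now empty: prints "0 0"
rm -f "$tmpconf" "$tmpfstab"
```

After editing the real mdadm.conf, run update-initramfs -u again so the change is reflected in the boot image.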
Now let's set up a RAID6 array on an ECS instance.
As discussed before, RAID6 is an upgraded version of RAID5 in which the parity data is written twice. RAID6 stores two parity records on different disk drives, which allows recovery from two simultaneous drive failures in the same RAID group. RAID6 provides high performance for read operations but poorer performance for writes. It remains fault tolerant even after two disks fail; you can rebuild the data from parity after replacing the failed disks. You need a minimum of four disks to implement a RAID6 array.
Before you can set up RAID6 on Alibaba Cloud, you first need an ECS instance running Ubuntu 16.04, four additional data disks attached to the instance, and root access to the instance.
For reference, see create a new ECS instance and connect to your instance. Also, once you are logged into your Ubuntu 16.04 instance, run the apt-get update -y command to update your base system with the latest available packages.
Before starting, make sure you have four data disks attached to your instance. You can check by running the fdisk -l | grep -i sd
command. You should see the four additional disks /dev/sdb, /dev/sdc, /dev/sdd, and /dev/sde in the following output:
Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors
/dev/sda1 * 2048 499711 497664 243M 83 Linux
/dev/sda2 501758 20969471 20467714 9.8G 5 Extended
/dev/sda5 501760 20969471 20467712 9.8G 8e Linux LVM
Disk /dev/sdb: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sdc: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sdd: 2 GiB, 2147483648 bytes, 4194304 sectors
Disk /dev/sde: 2 GiB, 2147483648 bytes, 4194304 sectors
Mdadm is a Linux utility used to create, manage, and monitor software RAID devices. It is available in the default Ubuntu 16.04 repository, so you can install it by running the apt-get install mdadm -y
command, and after the installation is complete, you can proceed to the next step.
Now, let's create the RAID6 array with the device name /dev/md0, RAID level 6, and the disks /dev/sdb, /dev/sdc, /dev/sdd, and /dev/sde:
mdadm --create --verbose /dev/md0 --level=6 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
The output is as follows:
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: size set to 2095104K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
You can now check the status of the RAID array with the following command:
cat /proc/mdstat
You should see the following output:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sde[3] sdd[2] sdc[1] sdb[0]
4190208 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
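In this output, [4/4] [UUUU] means all four members are active; a failed member would show as an underscore (for example [UU_U]). A quick health-check sketch, fed here with the sample mdstat line from above rather than the live /proc/mdstat:

```shell
# Check an mdstat status line for a degraded member (shown as "_").
# On a real system: grep '_' /proc/mdstat to spot degraded arrays.
mdstat_line='      4190208 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]'
if echo "$mdstat_line" | grep -q '_'; then
  status="array degraded"
else
  status="array healthy"
fi
echo "$status"
# → array healthy
```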
You can also check detailed information about the RAID array with the following command:
mdadm --detail /dev/md0
The output is as follows:
/dev/md0:
Version : 1.2
Creation Time : Sun Nov 18 13:03:32 2018
Raid Level : raid6
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 2095104 (2046.34 MiB 2145.39 MB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Update Time : Sun Nov 18 13:04:35 2018
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : Node2:0 (local to host Node2)
UUID : e7d5fc59:661516bc:3b1001a4:8cd03659
Events : 17
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
The RAID6 array is now configured and working. It's time to create a filesystem on the RAID array (/dev/md0).
You can do this by running the following command:
mkfs.ext4 /dev/md0
Next, create a mount point and mount the RAID array on it:
mkdir /opt/raid6
mount /dev/md0 /opt/raid6
Next, check the mounted RAID array with the following command:
df -h
The output is as follows:
Filesystem Size Used Avail Use% Mounted on
udev 478M 0 478M 0% /dev
tmpfs 100M 3.3M 96M 4% /run
/dev/mapper/Node1--vg-root 9.0G 5.8G 2.8G 69% /
tmpfs 497M 4.0K 497M 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 497M 0 497M 0% /sys/fs/cgroup
/dev/sda1 236M 85M 139M 38% /boot
cgmfs 100K 0 100K 0% /run/cgmanager/fs
tmpfs 100M 0 100M 0% /run/user/0
/dev/md0 3.9G 8.0M 3.7G 1% /opt/raid6
Next, you will need to configure the /etc/mdadm/mdadm.conf file so that the RAID array is reassembled automatically at boot time. You can do this by running the following command:
mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf
The output is as follows:
ARRAY /dev/md0 metadata=1.2 name=Node2:0 UUID=e7d5fc59:661516bc:3b1001a4:8cd03659
Next, update the initramfs to apply the changes with the following command:
update-initramfs -u
Next, you will need to create an entry for /dev/md0 in the /etc/fstab file so that the array is mounted automatically at boot:
nano /etc/fstab
Add the following line:
/dev/md0 /opt/raid6 ext4 defaults,nofail,discard 0 0
Currently, we have four drives in the RAID array. It is recommended to add a spare drive to a RAID6 array, because if one of the disks fails, the array can be rebuilt onto the spare automatically.
First, shut down the instance, attach an additional data disk, and start the instance again.
Next, add the spare drive (/dev/sdf) to the RAID6 array with the following command:
mdadm --add /dev/md0 /dev/sdf
Next, check the added device with the following command:
mdadm --detail /dev/md0
The output is as follows:
/dev/md0:
Version : 1.2
Creation Time : Sun Nov 18 13:03:32 2018
Raid Level : raid6
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 2095104 (2046.34 MiB 2145.39 MB)
Raid Devices : 4
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Sun Nov 18 13:20:49 2018
State : clean
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Name : Node2:0 (local to host Node2)
UUID : e7d5fc59:661516bc:3b1001a4:8cd03659
Events : 18
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
4 8 80 - spare /dev/sdf
Let's mark /dev/sdc as a failed drive and test whether the spare drive takes over automatically.
You can mark /dev/sdc as a failed drive with the following command:
mdadm --manage --fail /dev/md0 /dev/sdc
The output is as follows:
mdadm: set /dev/sdc faulty in /dev/md0
Now, check the details of RAID6 array with the following command:
mdadm --detail /dev/md0
You should see the faulty drive and spare rebuilding process in the following output:
/dev/md0:
Version : 1.2
Creation Time : Sun Nov 18 13:03:32 2018
Raid Level : raid6
Array Size : 4190208 (4.00 GiB 4.29 GB)
Used Dev Size : 2095104 (2046.34 MiB 2145.39 MB)
Raid Devices : 4
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Sun Nov 18 13:25:59 2018
State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 1
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Rebuild Status : 57% complete
Name : Node2:0 (local to host Node2)
UUID : e7d5fc59:661516bc:3b1001a4:8cd03659
Events : 29
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
4 8 80 1 spare rebuilding /dev/sdf
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
1 8 32 - faulty /dev/sdc
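Once the rebuild onto /dev/sdf completes, the faulty disk should be removed from the array. The following sketch wraps the cleanup commands in a dry-run helper: with DRYRUN=1 set, each command is only printed; unset it to actually execute them (as root, against the real array). Re-adding the wiped disk as a new spare at the end is optional:

```shell
# Post-rebuild cleanup sketch. DRYRUN=1 only prints the commands;
# unset it to run them for real (requires root and the live array).
DRYRUN=1
run() { if [ -n "$DRYRUN" ]; then echo "+ $*"; else "$@"; fi; }
run mdadm --manage /dev/md0 --remove /dev/sdc   # detach the faulty member
run mdadm --zero-superblock /dev/sdc            # wipe its stale RAID metadata
run mdadm --manage /dev/md0 --add /dev/sdc      # optionally re-add it as a new spare
```

Zeroing the superblock before reusing the disk prevents mdadm from later mistaking it for a member of the old array.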