Posted by Tejashwini on 21 June 2023
Amazon Elastic Block Store (EBS) is a reliable, durable, and easy-to-use cloud-based storage service designed for Amazon Web Services (AWS) applications. EBS snapshots are integral to the proper functioning of EBS and are used to safeguard data, make backups, and serve as a foundation for EBS volume creation. In this article, we show how to determine the actual storage space your EBS snapshots consume, so that you can manage storage accurately.
EBS snapshots are point-in-time copies of your EBS volumes. They are incremental: each snapshot captures only the blocks that changed since the previous snapshot, which reduces both the amount of data transferred and the storage cost. EBS snapshots are stored in Amazon Simple Storage Service (S3).
EBS snapshots are a crucial feature of EBS and offer several benefits. They are an efficient way to back up data, enable fast data recovery, and are used as a building block for creating new EBS volumes. Additionally, EBS snapshots are cost-effective and are designed to work with AWS workloads.
Importance of EBS Snapshots
EBS snapshots help maintain the integrity, privacy, and security of data in any AWS application. They support disaster recovery, backup, data replication, and data archiving.
There are two types of EBS snapshots: manual and automated. Manual snapshots are initiated by the user, while automated snapshots occur automatically according to a scheduled frequency.
Several procedures can help you determine the actual storage space for your EBS snapshots. You can use the AWS Management Console, AWS CLI, AWS API, or third-party tools such as CloudCheckr, which can provide resource-level data to help understand snapshot consumption at the volume level.
Amazon CloudWatch provides some monitoring of EBS snapshot storage consumption, but it is limited to billing-level metrics. CloudyDoors created a small tool that finds the snapshots in any region and estimates the space each snapshot uses to store data. It is based on counting changed blocks: using the AWS CLI, we can get the number of changed blocks between any two snapshots. The script developed by CloudyDoors simplifies this. It lists all the volumes in a region, compares the latest snapshots of each volume for block changes, and calculates the associated storage space.
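The changed-block arithmetic described above can be sketched as follows. This is a minimal illustration rather than part of the CloudyDoors script: the function names `blocks_to_gib` and `percent_of_volume` are hypothetical, and the sketch assumes the 512 KiB block size used by the EBS direct APIs and a volume size reported in GiB.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only (not part of the original script).
# Assumption: every changed block is 512 KiB, the block size of the EBS direct APIs.

# Convert a changed-block count to GiB, rounded to two decimals.
blocks_to_gib() {
    awk -v blocks="$1" 'BEGIN { printf "%.2f\n", blocks * 512 / 1048576 }'
}

# Express a size in GiB as a percentage of the volume size in GiB.
percent_of_volume() {
    awk -v size="$1" -v vol="$2" 'BEGIN { printf "%.2f\n", size * 100 / vol }'
}

# Example: 2048 changed blocks on an 8 GiB volume.
blocks_to_gib 2048           # 2048 * 512 KiB = 1 GiB, prints 1.00
percent_of_volume 1.00 8     # prints 12.50
```

Using `awk` here avoids the dependency on `bc`, at the cost of floating-point rather than arbitrary-precision arithmetic, which should be adequate for sizes of this order.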
The script is shown below. It lists all EBS volumes in the region and compares the snapshots of each one.
#!/bin/bash
############## List all volumes and compare two snapshots of each, based on inputs ##########
# REGION = AWS region
WORK_DIR=SNAP_INFO
FINAL_FILE=$WORK_DIR/FINAL_FILE
SNAP_LIST=$WORK_DIR/SNAP_LIST
SNAP_OUT=$WORK_DIR/SNAP_OUT
SNAP_FILE_ALL=$WORK_DIR/SNAP_FILE_ALL
VOLUME_LIST=$WORK_DIR/VOLUME_LIST
mkdir -p $WORK_DIR
>$WORK_DIR/FINAL_FILE
echo "What is your AWS region : "
read REGION
#echo "Type in the difference in snapshot to compare: "
NUMBER=2
echo "Installing bc "
sudo yum install bc -y
#######create a file listing Volumes####
# Take VolumeId from the volume itself so that unattached volumes are listed too (InstanceID is then None)
aws ec2 describe-volumes --region $REGION --query "Volumes[*].{VolumeID:VolumeId,InstanceID:Attachments[0].InstanceId}" --output text| sed -e 's/\s\+/\//g' >$VOLUME_LIST
#######using Volumes from volume list, and taking various attributes and comparing its snapshots #####
for VOLUME in `cat $VOLUME_LIST| cut -f 2 -d /`
do
INSTANCE_ID=`cat $VOLUME_LIST |grep $VOLUME | cut -f 1 -d /`
# Sort by StartTime so that tail picks the most recent snapshots
aws ec2 describe-snapshots --region $REGION --filters "Name=volume-id,Values=$VOLUME" --query 'sort_by(Snapshots,&StartTime)[*].{SnapshotId:SnapshotId,StartTime:StartTime,VolumeSize:VolumeSize}' --output text | tail -$NUMBER | sed -e 's/\s\+/\//g' >$SNAP_LIST
MIN_SNAP_NUMBER=`wc -l $SNAP_LIST| awk '{print $1}'`
if [ $MIN_SNAP_NUMBER -ne $NUMBER ]
then
echo "$INSTANCE_ID:$VOLUME:Less Than $NUMBER SNAP">>$WORK_DIR/FINAL_FILE
else
SNAP1=`cat $SNAP_LIST|head -1 | cut -f 1 -d /`
SNAP1_DATE=`cat $SNAP_LIST|head -1 | cut -f 2 -d /`
VOLUME_SIZE=`cat $SNAP_LIST|head -1 | cut -f 3 -d /`
SNAP2=`cat $SNAP_LIST|tail -1 | cut -f 1 -d /`
SNAP2_DATE=`cat $SNAP_LIST|tail -1 | cut -f 2 -d /`
##########Compare SNAP1 and SNAP2
>$SNAP_OUT
TOTAL_CB=0
#SNAP1=`echo $i |cut -f 1 -d :`
#SNAP2=`echo $i |cut -f 2 -d :`
#aws ebs list-changed-blocks --first-snapshot-id $SNAP1 --second-snapshot-id $SNAP2 --region ap-southeast-1 >snap-file
aws ebs list-changed-blocks --first-snapshot-id $SNAP2 --second-snapshot-id $SNAP1 --region $REGION >$SNAP_FILE_ALL
#CHANGED_BLOCKS=`cat snap-file | grep ChangedBlocks wc -l`
#TOTAL_CB=`expr $TOTAL_CB + $CHANGED_BLOCKS`
NEXT_TOKEN=`cat $SNAP_FILE_ALL| grep Next|tail -1 | awk '{print $2}' |sed 's/[",]//g'`
OLD_TOKEN=ABC
until [ -z "$NEXT_TOKEN" ]
do
if [ "$OLD_TOKEN" == "$NEXT_TOKEN" ]
then
break
else
#aws ebs list-changed-blocks --first-snapshot-id $SNAP1 --second-snapshot-id $SNAP2 --next-token $NEXT_TOKEN --region ap-southeast-1 >snap-file
aws ebs list-changed-blocks --first-snapshot-id $SNAP2 --second-snapshot-id $SNAP1 --next-token $NEXT_TOKEN --region $REGION >>$SNAP_FILE_ALL
#CHANGED_BLOCKS=`cat snap-file | grep ChangedBlocks | wc -l`
#TOTAL_CB=`expr $TOTAL_CB + $CHANGED_BLOCKS`
OLD_TOKEN=$NEXT_TOKEN
NEXT_TOKEN=`cat $SNAP_FILE_ALL| grep Next |tail -1| awk '{print $2}'|sed 's/[",]//g'`
fi
done
#######Calculations #################
TOTAL_CB=`cat $SNAP_FILE_ALL | grep FirstBlockToken | wc -l `
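# Each changed block is 512 KiB, so this is the changed size in KiB
SNAP_SIZE_KB=`expr $TOTAL_CB \* 512 `
# Convert KiB to GiB (1 GiB = 1048576 KiB) to match VOLUME_SIZE, which is in GiB
SNAP_SIZE=`echo "scale=2 ; $SNAP_SIZE_KB / 1048576" | bc`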
PERCENTAGE=`echo "scale=2; $SNAP_SIZE * 100 / $VOLUME_SIZE" | bc `
#PERCENTAGE=`expr $PERCENTAGE_CHANGE \* 100 `
################# Formatting and sharing to output file ################
#echo "INSTANCE ID | VOLUME | VOLUME_SIZE | SNAP1 | SNAP2 | SNAP1 DATE | SNAP2 DATE | SNAP SIZE | PERCENTAGE" > $WORK_DIR/FINAL_FILE
echo "INSTANCE ID - $INSTANCE_ID | VOLUME - $VOLUME | VOLUME_SIZE - $VOLUME_SIZE | SNAPSHOT 1 - $SNAP1 | SNAPSHOT 2 - $SNAP2 | SNAPSHOT 1 DATE - $SNAP1_DATE | SNAPSHOT 2 DATE - $SNAP2_DATE | SNAPSHOT SIZE - $SNAP_SIZE | PERCENTAGE OF SNAPSHOT - $PERCENTAGE" >> $WORK_DIR/FINAL_FILE
################## On screen status ####################
echo "The result so far is below"
cat $WORK_DIR/FINAL_FILE
echo "Continuing with remaining Volumes.. Please wait..."
fi
done
###################end of script ####################
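The pagination loop in the script depends on extracting `NextToken` from the saved JSON output. A stricter way to parse those saved pages might look like the sketch below. It assumes the pretty-printed JSON that the AWS CLI writes by default, with one key per line. The helper names `next_token_from` and `count_changed_blocks` are hypothetical, and counting `BlockIndex` entries is a variation on the script's approach, which counts `FirstBlockToken`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only (not part of the original script).
# Assumption: the file holds pretty-printed JSON pages saved from
# `aws ebs list-changed-blocks`, with one key per line.

# Print the last NextToken found in the file, or nothing if there is none.
# The pattern captures only the text between the quotes, so a trailing
# comma after the value does not end up in the token.
next_token_from() {
    sed -n 's/.*"NextToken"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | tail -n 1
}

# Count changed blocks by counting lines with a BlockIndex key.
# Every changed block has a BlockIndex, even when one of its tokens is absent.
count_changed_blocks() {
    grep -c '"BlockIndex"' "$1" || true
}

# Possible usage inside the loop (snapshot IDs come from the script's variables):
#   aws ebs list-changed-blocks --first-snapshot-id "$SNAP2" \
#       --second-snapshot-id "$SNAP1" --region "$REGION" > "$SNAP_FILE_ALL"
#   NEXT_TOKEN=$(next_token_from "$SNAP_FILE_ALL")
#   TOTAL_CB=$(count_changed_blocks "$SNAP_FILE_ALL")
```

Because both helpers read from a file, they can be checked against a saved response without calling AWS.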
In conclusion, proper EBS snapshot usage and accurate measurement of snapshot storage are critical for AWS applications. The approach described here is based on counting changed blocks: using the AWS CLI, the script lists all the volumes in a region, compares the latest snapshots of each volume for block changes, and calculates the associated storage space. With this information, you can reduce infrastructure costs and make better use of your storage resources.