EBS Snapshot Usage: How to Get the Actual Storage Space


Posted by Tejashwini on 21 June 2023



Introduction

Amazon Elastic Block Store (EBS) is a reliable, durable, and easy-to-use cloud-based storage service designed for Amazon Web Services (AWS) applications. EBS snapshots are used to safeguard data, make backups, and serve as a foundation for creating EBS volumes. In this article, we look at how to get the actual storage space your snapshots use, so that you can manage storage accurately.

Understanding EBS Snapshots

EBS snapshots are point-in-time copies of your EBS volumes. They are incremental: each snapshot stores only the blocks that changed since the previous snapshot, which reduces both the amount of data transferred and the storage cost. EBS snapshots are stored in Amazon Simple Storage Service (S3).
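As a rough illustration of why incremental snapshots make "actual" usage hard to read off the volume size, the sketch below adds up what a chain of snapshots would store. The function name and the numbers are hypothetical, and the model is simplified: it ignores blocks that are released when older snapshots are deleted.

```shell
#!/usr/bin/env bash
# Simplified model: the first snapshot stores all used blocks, and each
# later snapshot stores only the data that changed since the previous one.
# Arguments are sizes in GiB: first snapshot, then one value per later snapshot.
incremental_total_gib() {
    local total=0 size
    for size in "$@"; do
        total=$((total + size))
    done
    echo "$total"
}

# Hypothetical volume: 40 GiB of used blocks in the first snapshot,
# then 3 GiB and 5 GiB of changes before the next two snapshots.
incremental_total_gib 40 3 5
```

Under these assumptions the three snapshots together store 48 GiB, even though each one can restore the full volume.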


EBS snapshots are a crucial feature of EBS and offer several benefits. They are an efficient way to back up data, they enable fast data recovery, and they serve as a building block for creating new EBS volumes. Additionally, EBS snapshots are cost-effective and are designed to work with AWS workloads.

Importance of EBS Snapshots

EBS snapshots help maintain the integrity, privacy, and security of data in any AWS application. They are essential for disaster recovery, backup, data replication, and data archiving.

Types of EBS Snapshots

There are two types of EBS snapshots: manual and automated. Manual snapshots are initiated by the user, while automated snapshots occur automatically according to a scheduled frequency.

How to Determine Actual Storage Space

Procedures for Measuring EBS Snapshots Storage Space

Several procedures can help you determine the actual storage space for your EBS snapshots. You can use the AWS Management Console, AWS CLI, AWS API, or third-party tools such as CloudCheckr, which can provide resource-level data to help understand snapshot consumption at the volume level.

Tools for Measuring EBS Snapshots Storage Space

AWS CloudWatch provides some monitoring capabilities for EBS snapshot storage consumption, but they are limited to billing-level metrics. CloudyDoors created a small tool that finds the snapshots in any region and the space each snapshot uses to store data. It is based on counting changed blocks: using the AWS CLI, we can get the number of changed blocks between any two snapshots. The script lists all the volumes in a region, compares the latest two snapshots of each volume for block changes, and calculates the associated storage space.
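The core step, counting changed blocks in the output of `aws ebs list-changed-blocks`, can be sketched in isolation. This is a minimal sketch, not the CloudyDoors tool itself: `count_changed_blocks` is a hypothetical helper, and the sample response is hand-written with made-up tokens. It counts `BlockIndex` because every changed-block entry carries one, while `FirstBlockToken` is absent when the block exists only in the second snapshot.

```shell
#!/usr/bin/env bash
# Count changed blocks in list-changed-blocks JSON read from stdin.
# grep -o prints one line per "BlockIndex" occurrence, so the count
# does not depend on how the JSON is laid out.
count_changed_blocks() {
    grep -o '"BlockIndex"' | wc -l | tr -d ' '
}

# Hand-written sample response (illustrative values only).
sample='{
    "ChangedBlocks": [
        {"BlockIndex": 0, "FirstBlockToken": "tokA", "SecondBlockToken": "tokB"},
        {"BlockIndex": 6, "SecondBlockToken": "tokC"}
    ],
    "VolumeSize": 8,
    "BlockSize": 524288
}'

printf '%s\n' "$sample" | count_changed_blocks

# Against a real account the input would come from the CLI instead, e.g.:
#   aws ebs list-changed-blocks --first-snapshot-id <snap-a> \
#       --second-snapshot-id <snap-b> --region <region> | count_changed_blocks
# A single call returns only one page of results; large diffs need the
# NextToken handling that the full script performs.
```

For the sample above the helper reports 2 changed blocks.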

But how do you use it? Let's see…

  • Go to the AWS console and create a new EC2 instance.
  • Create the EBS snapshot: in the Volumes section, identify the EBS volume for which you want to create a snapshot. Select the checkbox next to the volume, click the "Actions" dropdown menu above the volume list, and choose "Create Snapshot".
  • In the Snapshots section, you can then see a list of all EBS snapshots.
  • Run the script below. It lists all the volumes in a region, compares the latest two snapshots of each volume for block changes, and calculates the associated storage space.

#!/bin/bash
############## List all volumes and compare the latest 2 snapshots of each ##########
# REGION == AWS region, read from the prompt below
WORK_DIR=SNAP_INFO
FINAL_FILE=$WORK_DIR/FINAL_FILE
SNAP_LIST=$WORK_DIR/SNAP_LIST
SNAP_OUT=$WORK_DIR/SNAP_OUT
SNAP_FILE_ALL=$WORK_DIR/SNAP_FILE_ALL
VOLUME_LIST=$WORK_DIR/VOLUME_LIST
# -p: do not fail if the directory is left over from an earlier run
mkdir -p $WORK_DIR
# Start with an empty result file
>$WORK_DIR/FINAL_FILE
echo "What is your AWS region : "
read REGION
# Number of snapshots to compare per volume
NUMBER=2
# bc does the decimal arithmetic; this assumes a yum-based system such as Amazon Linux
echo "Installing bc "
sudo yum install bc -y

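####### Create a file listing volumes as InstanceID/VolumeID ####
# VolumeId is read from the volume itself, so unattached volumes keep
# their ID and only show None as the instance
aws ec2 describe-volumes --region $REGION --query "Volumes[*].{InstanceID:Attachments[0].InstanceId,VolumeID:VolumeId}" --output text | sed -e 's/\s\+/\//g' >$VOLUME_LIST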

####### For each volume in the list, collect its attributes and compare its snapshots #####
for VOLUME in `cat $VOLUME_LIST | cut -f 2 -d /`
do
    INSTANCE_ID=`grep "$VOLUME" $VOLUME_LIST | cut -f 1 -d /`
    # Sort by StartTime so that tail keeps the most recent snapshots;
    # each line becomes SnapshotId/StartTime/VolumeSize
    aws ec2 describe-snapshots --region $REGION --filters "Name=volume-id,Values=$VOLUME" --query 'sort_by(Snapshots,&StartTime)[*].{SnapshotId:SnapshotId,StartTime:StartTime,VolumeSize:VolumeSize}' --output text | tail -$NUMBER | sed -e 's/\s\+/\//g' >$SNAP_LIST
    MIN_SNAP_NUMBER=`wc -l $SNAP_LIST | awk '{print $1}'`
    if [ $MIN_SNAP_NUMBER -ne $NUMBER ]
    then
    echo "$INSTANCE_ID:$VOLUME:Less Than $NUMBER SNAP">>$WORK_DIR/FINAL_FILE
    else
          # SNAP1 is the older of the two snapshots, SNAP2 the newer
          SNAP1=`cat $SNAP_LIST | head -1 | cut -f 1 -d /`
          SNAP1_DATE=`cat $SNAP_LIST | head -1 | cut -f 2 -d /`
          VOLUME_SIZE=`cat $SNAP_LIST | head -1 | cut -f 3 -d /`
          SNAP2=`cat $SNAP_LIST | tail -1 | cut -f 1 -d /`
          SNAP2_DATE=`cat $SNAP_LIST | tail -1 | cut -f 2 -d /`

         ########## Compare SNAP1 and SNAP2
         >$SNAP_OUT
        TOTAL_CB=0
        # First page of changed blocks; the parsing below expects JSON output
        aws ebs list-changed-blocks --first-snapshot-id $SNAP2 --second-snapshot-id $SNAP1 --region $REGION --output json >$SNAP_FILE_ALL
        # Strip quotes and any trailing comma from the token value
        NEXT_TOKEN=`grep NextToken $SNAP_FILE_ALL | tail -1 | awk '{print $2}' | sed 's/[",]//g'`
        OLD_TOKEN=ABC
        # Fetch the remaining pages until no new NextToken appears
        until [ -z "$NEXT_TOKEN" ]
        do
                if [ "$OLD_TOKEN" == "$NEXT_TOKEN" ]
                then
                        # The last page added no new token, so all pages are in the file
                        break
                else
                        aws ebs list-changed-blocks --first-snapshot-id $SNAP2 --second-snapshot-id $SNAP1 --next-token "$NEXT_TOKEN" --region $REGION --output json >>$SNAP_FILE_ALL
                        OLD_TOKEN=$NEXT_TOKEN
                        NEXT_TOKEN=`grep NextToken $SNAP_FILE_ALL | tail -1 | awk '{print $2}' | sed 's/[",]//g'`
                fi
        done

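    ####### Calculations #################
    # Every changed block has a BlockIndex entry; FirstBlockToken is absent
    # when a block exists only in the second snapshot
    TOTAL_CB=`grep -c BlockIndex $SNAP_FILE_ALL`
    # Each block is 512 KiB
    SNAP_SIZE_KB=`expr $TOTAL_CB \* 512`
    # KiB to GiB (1 GiB = 1048576 KiB), to match VOLUME_SIZE, which is in GiB
    SNAP_SIZE=`echo "scale=2 ; $SNAP_SIZE_KB / 1048576" | bc`
    PERCENTAGE=`echo "scale=2; $SNAP_SIZE * 100 / $VOLUME_SIZE" | bc`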

    ################# Format the result and append it to the output file ################
    echo "INSTANCE ID - $INSTANCE_ID | VOLUME - $VOLUME | VOLUME_SIZE - $VOLUME_SIZE | SNAPSHOT 1 - $SNAP1 | SNAPSHOT 2 - $SNAP2 | SNAPSHOT 1 DATE - $SNAP1_DATE | SNAPSHOT 2 DATE - $SNAP2_DATE | SNAPSHOT SIZE - $SNAP_SIZE | PERCENTAGE OF SNAPSHOT - $PERCENTAGE" >> $WORK_DIR/FINAL_FILE
    ################## On-screen status ####################
    echo "The result so far is below"
    cat $WORK_DIR/FINAL_FILE
    echo "Continuing with remaining volumes. Please wait..."
    fi
done
################### End of script ####################

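  • The final output, written to SNAP_INFO/FINAL_FILE, gives the actual space usage: for each volume, the size of the blocks that changed between its two snapshots and that size as a percentage of the volume.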
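The arithmetic behind the size and percentage figures can be checked on its own. The sketch below is an illustration with hypothetical helper names (`blocks_to_gib`, `percent_of_volume`) and made-up inputs. It assumes the 512 KiB block size used by the EBS direct APIs and uses awk rather than bc, purely to avoid an extra install.

```shell
#!/usr/bin/env bash
BLOCK_KIB=512   # assumed block size of the EBS direct APIs, in KiB

# Convert a changed-block count to GiB (1 GiB = 1048576 KiB).
blocks_to_gib() {
    LC_ALL=C awk -v n="$1" -v b="$BLOCK_KIB" 'BEGIN { printf "%.2f\n", n * b / 1048576 }'
}

# Express a changed size as a percentage of the volume size (both in GiB).
percent_of_volume() {
    LC_ALL=C awk -v s="$1" -v v="$2" 'BEGIN { printf "%.2f\n", s * 100 / v }'
}

# Hypothetical example: 4096 changed blocks on a 100 GiB volume.
blocks_to_gib 4096          # 4096 * 512 KiB = 2097152 KiB = 2.00 GiB
percent_of_volume 2 100     # 2 GiB of 100 GiB = 2.00 percent
```

With these inputs the two calls print 2.00 and 2.00, which is the kind of pair the script reports as SNAPSHOT SIZE and PERCENTAGE OF SNAPSHOT.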


Conclusion

In conclusion, proper EBS snapshot usage and accurate storage space determination are critical, particularly for AWS applications. The approach shown here is based on counting changed blocks: using the AWS CLI, we list all the volumes in a region, get the number of changed blocks between the latest two snapshots of each volume, and calculate the associated storage space. Knowing these numbers helps you reduce infrastructure costs and make better use of your storage resources.