Python script to delete older objects in S3

Architecture Diagram

Explanation

  • Create an IAM user in the AWS account with programmatic access to Amazon S3.
  • The script requires the boto3 package and Python's built-in datetime module.
  • The Python script keeps the folders for the 5 most recent dates and deletes all older ones. Folder names use the date format YY.MM.DD, and the dates do not have to be consecutive (there can be gaps between days).
  • Test the script with custom folders first (e.g. S3Bucket:example1/example2/23.01.04).
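The selection rule described above can be tried out locally before touching a bucket. The following is a minimal sketch, not the script itself: the helper name `folders_to_delete` and its `keep` parameter are hypothetical, and it assumes folder names are plain YY.MM.DD strings.

```python
import datetime


def folders_to_delete(folder_names, keep=5):
    """Return the YY.MM.DD names that fall outside the `keep` most recent dates."""
    dated = []
    for name in folder_names:
        try:
            parsed = datetime.datetime.strptime(name, '%y.%m.%d')
        except ValueError:
            # Ignore anything that is not a valid YY.MM.DD date
            continue
        dated.append((parsed, name))

    # Oldest first; gaps between days do not matter, only the ordering
    dated.sort()
    cutoff = max(len(dated) - keep, 0)
    return [name for _, name in dated[:cutoff]]


names = ['23.01.04', '23.01.01', '22.12.30', '23.01.10',
         '23.01.07', '22.12.25', '23.01.02', 'notes']
print(folders_to_delete(names))  # ['22.12.25', '22.12.30']
```

Seven of the names are valid dates, so the two oldest are selected for deletion and the non-date entry is ignored.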

Python script that keeps the folders for the last 5 dates and deletes the rest, with explanation

import boto3  # import the Boto3 library for AWS interactions
import datetime  # import the datetime module for date and time manipulation

# Define a function to delete old S3 folders
def delete_old_s3_folders(prefix, bucket_name, aws_access_key, aws_secret_key):
    # Create a new S3 client using the provided AWS access key and secret key
    s3_client = boto3.client('s3', aws_access_key_id=aws_access_key, aws_secret_access_key=aws_secret_key)

    # Get the current date and time
    current_date = datetime.datetime.now()

    # List the immediate "subfolders" under the prefix. S3 has no real folders, so
    # request common prefixes with Delimiter='/'; the prefix itself should end with '/'.
    # A paginator is used because a single call returns at most 1,000 entries.
    paginator = s3_client.get_paginator('list_objects_v2')

    # Create a list to store the dates of the folders
    dates = []
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter='/'):
        for common_prefix in page.get('CommonPrefixes', []):
            # Extract the folder name without the prefix and the trailing slash
            date_str = common_prefix['Prefix'][len(prefix):].rstrip('/')
            try:
                # Keep only folder names in the expected format (YY.MM.DD)
                datetime.datetime.strptime(date_str, '%y.%m.%d')
            except ValueError:
                continue
            # Add the date to the list of dates
            dates.append(date_str)
    # Print out the list of dates for debugging purposes
    print(dates)

    # Sort the dates in ascending order
    sorted_dates = sorted(dates, key=lambda x: datetime.datetime.strptime(x, '%y.%m.%d'))

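    # Delete all folders except for the 5 most recent ones
    for date in sorted_dates[:-5]:
        # Print out the date of the folder to be deleted for debugging purposes
        print(date)
        # A folder is only a shared key prefix, so delete every object stored under it
        folder_prefix = prefix + date + '/'
        paginator = s3_client.get_paginator('list_objects_v2')
        for page in paginator.paginate(Bucket=bucket_name, Prefix=folder_prefix):
            objects = [{'Key': obj['Key']} for obj in page.get('Contents', [])]
            if objects:
                # Each page holds at most 1,000 keys, which is also the delete_objects limit
                s3_client.delete_objects(Bucket=bucket_name, Delete={'Objects': objects})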

# Define the S3 bucket name, prefix, and AWS access and secret keys
bucket_name = '<bucket_name>'
prefix = '<prefix>'
acc_key = ''
secret_key = ''

# Call the function to delete the old S3 folders
delete_old_s3_folders(prefix, bucket_name, acc_key, secret_key)
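Before running the script against real data, it may help to see which folders it would remove. The following is a rough dry-run sketch under a few assumptions: the function name `plan_folder_deletions` and the `keep` parameter are hypothetical, the prefix ends with `/`, and the date folders sit directly under that prefix. It only lists folder prefixes and never deletes anything.

```python
import datetime


def plan_folder_deletions(s3_client, bucket_name, prefix, keep=5):
    """Return the folder prefixes that would be deleted, without deleting them."""
    paginator = s3_client.get_paginator('list_objects_v2')
    dated = []
    # Delimiter='/' makes S3 report the "subfolders" as CommonPrefixes
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter='/'):
        for entry in page.get('CommonPrefixes', []):
            name = entry['Prefix'][len(prefix):].rstrip('/')
            try:
                parsed = datetime.datetime.strptime(name, '%y.%m.%d')
            except ValueError:
                # Skip folders that are not named YY.MM.DD
                continue
            dated.append((parsed, entry['Prefix']))

    # Oldest first, then drop the `keep` most recent from the deletion plan
    dated.sort()
    cutoff = max(len(dated) - keep, 0)
    return [folder for _, folder in dated[:cutoff]]


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   client = boto3.client('s3')
#   print(plan_folder_deletions(client, '<bucket_name>', '<prefix>'))
```

Because the function takes the client as an argument, it can also be exercised with a small stand-in object in tests instead of a real S3 connection.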

MFH IT Solutions (Regd No -LIN : AP-03-46-003-03147775)

Consultation & project support organization.
