How to Bulk Rename Files in AWS S3 Using Python?

In our last post, we discussed how to run Python scripts in AWS. Today, let's see how to bulk rename files in AWS S3 using Python.

Since AWS S3 currently does not offer a rename operation, we can achieve the same result by copying each object to a new key and then deleting the original object.

Prerequisites

Make sure boto3 is installed and the AWS CLI is configured locally (aws configure) before running the script.
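
If you want to verify that your credentials are picked up before touching any objects, a quick check like the one below works. This is a minimal sketch; it only assumes boto3 is installed and the default credential chain is configured.

import boto3

# Ask STS who we are; this fails fast if credentials are missing or invalid.
identity = boto3.client('sts').get_caller_identity()
print(identity['Account'], identity['Arn'])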

How to Bulk Rename Files in AWS S3 Using Python?

First, iterate through all the object keys in the bucket that you need to rename.

# -*- coding: utf-8 -*-
 
import boto3
 
# Here, input the name of your bucket.
BUCKET = 'bucket'
 
s3_resource = boto3.resource('s3')
s3 = boto3.client('s3')
 
# This function returns a list of every key in the given bucket.
def get_all_s3_keys(bucket):
    """Get a list of all keys in an S3 bucket."""
    keys = []
    s3_client = boto3.client('s3')
    kwargs = {'Bucket': bucket}
    while True:
        resp = s3_client.list_objects_v2(**kwargs)
        # An empty bucket (or an exhausted listing) has no 'Contents' key.
        for obj in resp.get('Contents', []):
            keys.append(obj['Key'])
 
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break
 
    return keys
 
# This generator yields only the keys that match the desired prefix and suffix.
def get_matching_s3_keys(bucket, prefix='', suffix=''):
    """
    Generate the keys in an S3 bucket.
    :param bucket: Name of the S3 bucket.
    :param prefix: Only fetch keys that start with this prefix (optional).
    :param suffix: Only fetch keys that end with this suffix (optional).
    """
    kwargs = {'Bucket': bucket, 'Prefix': prefix}
    while True:
        resp = s3.list_objects_v2(**kwargs)
        for obj in resp.get('Contents', []):
            key = obj['Key']
            if key.endswith(suffix):
                yield key
 
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break
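
For example, to list every .csv file under a logs/ prefix (the prefix and suffix here are placeholder values for illustration):

for key in get_matching_s3_keys(BUCKET, prefix='logs/', suffix='.csv'):
    print(key)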

The code above lets us iterate through all the keys. Next, we use the copy() method provided by boto3 to copy each file. copy_object() could also be used, but it raises an error when copying a single file larger than 5 GB, so we choose copy() here, which handles large objects with a multipart copy.

copy_source = {
    'Bucket': 'mybucket',
    'Key': 'old_file_key'
}
# Copy the object to its new key (here within the same bucket).
s3_resource.meta.client.copy(copy_source, 'mybucket', 'new_file_key')

Finally, use the delete() method to remove the original file.

s3_resource.Object('mybucket', 'old_file_key').delete()
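
Putting it all together, a bulk rename is just a loop over the matching keys that copies each object to its new key and deletes the old one. The sketch below moves every object from old-prefix/ to new-prefix/ within the same bucket; OLD_PREFIX, NEW_PREFIX, and the new-key rule are example values you should adapt to your own naming scheme.

OLD_PREFIX = 'old-prefix/'
NEW_PREFIX = 'new-prefix/'

for old_key in get_matching_s3_keys(BUCKET, prefix=OLD_PREFIX):
    # Derive the new key; adjust this rule to your own naming scheme.
    new_key = NEW_PREFIX + old_key[len(OLD_PREFIX):]

    # Copy the object to its new key, then remove the original.
    copy_source = {'Bucket': BUCKET, 'Key': old_key}
    s3_resource.meta.client.copy(copy_source, BUCKET, new_key)
    s3_resource.Object(BUCKET, old_key).delete()

    print('Renamed %s -> %s' % (old_key, new_key))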

By Jaxon Tisdale

