In our last post, we discussed how to run Python scripts in AWS. Today, let's see how to bulk rename files in AWS S3 using Python.
AWS S3 does not currently provide a direct rename operation, so we rename a file by first copying it under the new name and then deleting the original.
Prerequisites
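Make sure the AWS CLI is configured locally (for example by running aws configure) before running the script, so that boto3 can pick up your credentials. The script also needs the boto3 package installed.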
How to Bulk Rename Files in AWS S3 Using Python?
First, iterate over the keys of the files in the bucket that you need to rename.
# -*- coding: utf-8 -*-
import boto3

# Here, input the name of your bucket.
BUCKET = 'bucket'

s3_resource = boto3.resource('s3')
s3 = boto3.client('s3')


# This function returns all keys in the given bucket.
def get_all_s3_keys(bucket):
    """Get a list of all keys in an S3 bucket."""
    keys = []
    s3_client = boto3.client('s3')
    kwargs = {'Bucket': bucket}
    while True:
        resp = s3_client.list_objects_v2(**kwargs)
        # 'Contents' is missing when nothing matches, so default to an empty list.
        for obj in resp.get('Contents', []):
            keys.append(obj['Key'])
        # Each response holds at most 1000 keys; keep going while there is a token.
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break
    return keys


# This generator yields only the keys with the desired prefix and suffix.
def get_matching_s3_keys(bucket, prefix='', suffix=''):
    """
    Generate the keys in an S3 bucket.

    :param bucket: Name of the S3 bucket.
    :param prefix: Only fetch keys that start with this prefix (optional).
    :param suffix: Only fetch keys that end with this suffix (optional).
    """
    kwargs = {'Bucket': bucket, 'Prefix': prefix}
    while True:
        resp = s3.list_objects_v2(**kwargs)
        # 'Contents' is missing when nothing matches, so default to an empty list.
        for obj in resp.get('Contents', []):
            key = obj['Key']
            if key.endswith(suffix):
                yield key
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break
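With the keys listed, each one needs a new name. The post does not say how the new names are chosen, so the following is only a minimal sketch under one assumption: the rename swaps one key prefix for another. The function name plan_renames and the example keys and prefixes are hypothetical and not part of boto3.

```python
def plan_renames(keys, old_prefix, new_prefix):
    """Map each key that starts with old_prefix to the same key under new_prefix.

    Keys that do not start with old_prefix are left out of the plan.
    """
    plan = {}
    for key in keys:
        if key.startswith(old_prefix):
            # Keep everything after the old prefix and attach the new prefix.
            plan[key] = new_prefix + key[len(old_prefix):]
    return plan


# Example with made-up keys; in practice the keys could come from
# get_matching_s3_keys(BUCKET, prefix='logs/2023/').
example = plan_renames(
    ['logs/2023/a.csv', 'logs/2023/b.csv', 'images/c.png'],
    old_prefix='logs/2023/',
    new_prefix='archive/2023/',
)
print(example)
```

Building the full old-to-new mapping before touching the bucket makes it possible to review the plan first, which is worth doing because the delete step cannot be undone on an unversioned bucket.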
The code above lets us iterate through the keys. To copy each file, we use the copy() method provided by boto3. copy_object() can also be used for copying, but it raises an error when a single file is larger than 5 GB, whereas copy() is a managed transfer that switches to a multipart copy when necessary, so we choose copy() here.
copy_source = {
    'Bucket': 'mybucket',
    'Key': 'mykey'
}
s3_resource.meta.client.copy(copy_source, 'newbucket', 'new_file_key')
Finally, use the delete() method to remove the original file.
s3_resource.Object('my_bucket','old_file_key').delete()
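Putting the two steps together, a bulk rename is a loop of copy-then-delete over a mapping of old keys to new keys. The following is a sketch rather than a definitive implementation: the function name rename_objects and the dict input format are hypothetical, and it assumes the rename stays within a single bucket. It takes the client as a parameter and only relies on the copy() and delete_object() client methods.

```python
def rename_objects(client, bucket, renames):
    """Rename objects within one bucket: copy each to its new key,
    then delete the original.

    client  : an S3 client such as boto3.client('s3')
    renames : dict mapping old key -> new key (hypothetical input format)
    Returns the list of old keys that were renamed.
    """
    done = []
    for old_key, new_key in renames.items():
        if old_key == new_key:
            # Nothing to rename, and deleting here would remove the only copy.
            continue
        # Managed copy: uses a multipart copy for large objects when needed.
        client.copy({'Bucket': bucket, 'Key': old_key}, bucket, new_key)
        # Reached only if the copy call returned without raising.
        client.delete_object(Bucket=bucket, Key=old_key)
        done.append(old_key)
    return done
```

Against a real bucket this could be called as rename_objects(boto3.client('s3'), BUCKET, {'old_file_key': 'new_file_key'}). Because the delete runs only after the copy returns, an exception during the copy stops the loop with the original file still in place.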