Content-Type Validation During File Uploads to an AWS S3 Bucket

Uploading files to an AWS S3 bucket is a common requirement in modern applications, but it comes with a crucial responsibility—validating the content type of the uploaded files. Without proper validation, malicious or unwanted file types could slip through, potentially leading to security vulnerabilities or system issues.

In this post, we'll explore how to enforce content-type validation during file uploads to S3. We'll cover techniques to block uploads with unexpected content types, or to send a notification when one arrives.


Why Content-Type Validation Matters

Content-Type validation ensures that files being uploaded match your application's expectations and security policies. Common use cases include:

  1. Preventing malicious uploads: For example, blocking executables or scripts.

  2. Maintaining application integrity: Allowing only image files for profile pictures or text files for logs.

  3. Improved error handling: Early detection of incorrect file types enhances user experience.


How Content-Type Validation Works in AWS S3

You can add custom validation logic to S3 uploads with AWS Lambda, pre-signed URLs, and S3 Event Notifications. Below are strategies for implementing validation:

[Figure: workflow diagram of content-type validation during file uploads to an S3 bucket. (1) The user uploads a file to the S3 bucket. (2) An S3 Event Notification triggers an AWS Lambda function. (3) The Lambda function retrieves the file metadata, including the Content-Type. (4) The function compares the Content-Type against a whitelist of allowed types. (5) If the Content-Type is invalid, the file is deleted and an alert is sent via Amazon SNS. (6) If it is valid, the upload is accepted.]


1. Validating Content-Type Using AWS Lambda

You can create an S3 event trigger for object creation. When a file is uploaded, an AWS Lambda function is triggered to inspect the file's metadata, including the Content-Type. If the content type is invalid, the function can take corrective actions, such as:

  • Deleting the file: Automatically remove invalid files.

  • Sending a notification: Use Amazon SNS or email to alert the user or admin.
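
The object-creation trigger described above can be set up in the console or from code. The following is a minimal sketch of the code route, assuming a boto3 S3 client is passed in; the function names, the Lambda ARN, and the optional key prefix are illustrative and not part of the original setup.

```python
def build_notification_config(lambda_arn, prefix=None):
    """Build the NotificationConfiguration dict for an object-created trigger."""
    lambda_config = {
        "LambdaFunctionArn": lambda_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optional: only fire for keys under this prefix (hypothetical layout).
        lambda_config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [lambda_config]}


def attach_trigger(s3_client, bucket_name, lambda_arn, prefix=None):
    """Apply the configuration; s3_client is expected to be boto3.client('s3')."""
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=build_notification_config(lambda_arn, prefix),
    )
```

Two caveats: this call replaces the bucket's entire existing notification configuration, and the Lambda function also needs a resource-based permission that allows S3 to invoke it.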

Sample Code for Validation:

import boto3
from urllib.parse import unquote_plus

def lambda_handler(event, context):
    s3 = boto3.client('s3')
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    # Object keys arrive URL-encoded in S3 event notifications
    object_key = unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Get object metadata
    response = s3.head_object(Bucket=bucket_name, Key=object_key)
    content_type = response['ContentType']

    # Define allowed content types
    allowed_types = ['image/jpeg', 'image/png', 'application/pdf']

    if content_type not in allowed_types:
        # Delete the file
        s3.delete_object(Bucket=bucket_name, Key=object_key)

        # Log the rejection (print output goes to CloudWatch Logs);
        # publish to an SNS topic here if someone should be alerted
        print(f"Blocked file with invalid content type: {content_type}")
        return {
            'statusCode': 400,
            'body': 'Invalid content type. File removed.'
        }

    return {
        'statusCode': 200,
        'body': 'File uploaded successfully.'
    }
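
The sample above only logs the rejection. A notification step could look like the sketch below, which assumes a boto3 SNS client is passed in; the helper name, the subject line, and the topic ARN are hypothetical and would come from your own setup.

```python
def notify_invalid_upload(sns_client, topic_arn, bucket_name, object_key, content_type):
    """Publish an alert about a rejected object.

    sns_client is expected to be boto3.client('sns'); topic_arn is an
    assumption (for example, read from an environment variable).
    """
    message = (
        f"Rejected upload s3://{bucket_name}/{object_key}: "
        f"content type '{content_type}' is not allowed."
    )
    sns_client.publish(
        TopicArn=topic_arn,
        Subject="Invalid S3 upload blocked",
        Message=message,
    )
    return message
```

Inside the handler, this would be called in the branch that deletes the object, so every removal produces exactly one alert.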

2. Validating Content-Type with Pre-Signed URLs

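Pre-signed uploads let you restrict uploads to specific file types before the object is ever stored. With a pre-signed POST (generate_presigned_post in boto3), you can include policy conditions that S3 checks when the upload arrives, including a condition on the Content-Type field.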

Example:

import boto3

s3 = boto3.client('s3')
bucket_name = 'your-bucket-name'
object_key = 'uploads/test.jpg'

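# Generate a pre-signed POST with a content-type restriction.
# The field goes in both places: Fields adds it to the form data
# returned to the client, Conditions makes S3 enforce it.
response = s3.generate_presigned_post(
    Bucket=bucket_name,
    Key=object_key,
    Fields={"Content-Type": "image/jpeg"},
    Conditions=[
        {"Content-Type": "image/jpeg"}
    ],
    ExpiresIn=3600
)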

print(response)

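With this approach, S3 rejects any upload whose Content-Type form field does not match the specified value (e.g., image/jpeg). Keep in mind that S3 checks only the declared type, not the file's actual bytes.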
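
In practice the server usually decides per request whether to issue the pre-signed data at all. Here is a minimal sketch of that gate, assuming a boto3 S3 client is passed in; the function name and the allowed set are illustrative.

```python
ALLOWED_TYPES = {"image/jpeg", "image/png", "application/pdf"}


def issue_upload_form(s3_client, bucket_name, object_key, content_type, expires_in=3600):
    """Return pre-signed POST data, or raise if the type is not allowed.

    s3_client is expected to be boto3.client('s3').
    """
    if content_type not in ALLOWED_TYPES:
        raise ValueError(f"Content type not allowed: {content_type}")
    return s3_client.generate_presigned_post(
        Bucket=bucket_name,
        Key=object_key,
        # Fields puts the value into the form the client submits;
        # Conditions makes S3 reject a form that changes it.
        Fields={"Content-Type": content_type},
        Conditions=[{"Content-Type": content_type}],
        ExpiresIn=expires_in,
    )
```

The client then submits a multipart form to the returned URL with the returned fields plus the file. A request for a type outside the set never receives upload credentials.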


3. Combining S3 Event Notifications and Content-Type Validation

Using S3 Event Notifications, you can trigger workflows whenever a new file is uploaded. For example:

  1. Set up an S3 bucket event to send notifications to an SNS topic.

  2. Configure an AWS Lambda function to process these notifications and validate content types.

  3. Notify users of invalid uploads or perform remediation actions.
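
When S3 notifications travel through an SNS topic, the Lambda function receives the S3 event wrapped in an SNS envelope. The sketch below shows one way to unwrap it for step 2; the function name is illustrative.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an SNS-delivered S3 notification."""
    objects = []
    for record in event.get("Records", []):
        # SNS delivers the original S3 event as a JSON string in Sns.Message.
        s3_event = json.loads(record["Sns"]["Message"])
        # The s3:TestEvent sent when notifications are configured has no Records.
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Keys arrive URL-encoded (a space becomes '+').
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects
```

Each returned pair can then be passed to the same head_object check used when Lambda is triggered by S3 directly.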


Test Case / POC

I used one S3 bucket and configured a Lambda function so that if anyone uploads a non-TXT file, it sends an SNS notification.

I uploaded a file named dr.pcap, and after the upload I received the notification by email through SNS.
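
The code of the PoC function is not shown here, so the following is only a hypothetical sketch of the kind of check it could perform. It assumes the rule is "extension is .txt and declared type is text/plain"; the helper name and that rule are assumptions.

```python
import os


def is_txt_upload(object_key, content_type):
    """True only when both the extension and the declared type say plain text."""
    extension = os.path.splitext(object_key)[1].lower()
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    return extension == ".txt" and media_type == "text/plain"
```

With this rule, an upload like dr.pcap fails the check whatever Content-Type the client declares, and the function would go on to publish the SNS notification.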

Best Practices for Content-Type Validation

  1. Whitelist allowed types: Always use a whitelist to explicitly define allowed file types.

  2. Verify actual file content: Content-Type headers can be manipulated. Use libraries like python-magic to inspect the file's actual MIME type.

  3. Log invalid attempts: Keep track of invalid upload attempts for auditing and troubleshooting.

  4. User feedback: Provide clear error messages when uploads fail validation.
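
To illustrate practice 2, here is a standard-library sketch that compares a file's leading bytes against known signatures. It covers only the three types used earlier in this post and is far less thorough than python-magic; the names are illustrative.

```python
# Leading "magic" bytes for the three types allowed earlier in this post.
SIGNATURES = {
    "image/jpeg": b"\xff\xd8\xff",
    "image/png": b"\x89PNG\r\n\x1a\n",
    "application/pdf": b"%PDF-",
}


def sniff_content_type(head):
    """Return the MIME type whose signature matches the leading bytes, or None."""
    for mime_type, signature in SIGNATURES.items():
        if head.startswith(signature):
            return mime_type
    return None


def declared_type_matches(head, declared_type):
    """True when the sniffed type equals the declared Content-Type."""
    return sniff_content_type(head) == declared_type
```

In a Lambda function, the leading bytes can be fetched without downloading the whole object by passing Range="bytes=0-7" to get_object and reading the response body.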


Conclusion

Content-type validation is a critical step in securing and maintaining your S3 file uploads. Whether through Lambda functions, pre-signed URLs, or event notifications, AWS provides robust tools to implement this feature effectively. By enforcing these practices, you can ensure that your application remains secure, compliant, and user-friendly.