Troubleshooting S3A Bad Request Errors

The S3A client can see the error message "Bad Request" for many reasons: it is the standard response from Amazon S3 whenever it cannot satisfy a request.

The main issues are covered in the Troubleshooting S3A section of the hadoop-aws module's documentation.

Common Causes of Bad Request Error Messages

Credentials

  • Your credentials are wrong.
  • The credentials were not set properly before the S3A filesystem instance was created. Because each JVM creates a single instance per bucket, the first configuration used to connect to a bucket is the one used thereafter.
  • You've been trying to set the credentials in the URI, but got the URL-escaping wrong. Stop trying to do that, it's a security disaster. Embrace per-bucket configuration.
  • You are trying to use per-bucket configuration for the credentials, but got the bucket name wrong there.
  • You are using session credentials, and the session has expired.
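
A mistyped bucket name in a per-bucket option is easy to miss, because the misnamed option is simply ignored. The sketch below derives the option names that would apply to a given s3a:// URI, assuming the fs.s3a.bucket.BUCKETNAME.* naming used by S3A per-bucket configuration. The class name, helper names and the example-logs bucket are hypothetical, chosen only for illustration.

```java
import java.net.URI;

/** Sketch: derive the per-bucket option names that apply to an s3a:// URI. */
class PerBucketKeys {

    /** Returns the bucket name, taken from the authority part of the URI. */
    static String bucketOf(String s3aUri) {
        URI uri = URI.create(s3aUri);
        if (!"s3a".equals(uri.getScheme()) || uri.getAuthority() == null) {
            throw new IllegalArgumentException("Not an s3a:// URI: " + s3aUri);
        }
        return uri.getAuthority();
    }

    /** Maps a base option such as fs.s3a.access.key to its per-bucket form. */
    static String perBucketKey(String bucket, String baseKey) {
        String prefix = "fs.s3a.";
        if (!baseKey.startsWith(prefix)) {
            throw new IllegalArgumentException("Not an fs.s3a. option: " + baseKey);
        }
        return "fs.s3a.bucket." + bucket + "." + baseKey.substring(prefix.length());
    }

    public static void main(String[] args) {
        // "example-logs" is a made-up bucket name.
        String bucket = bucketOf("s3a://example-logs/2017/01/data.csv");
        String[] baseKeys = {"fs.s3a.access.key", "fs.s3a.secret.key", "fs.s3a.endpoint"};
        for (String base : baseKeys) {
            System.out.println(base + " -> " + perBucketKey(bucket, base));
        }
    }
}
```

Comparing the printed names against the ones actually present in the configuration is a quick way to spot a bucket-name mismatch without ever printing the secret values themselves.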

Endpoints

  • You are trying to use a V4 auth endpoint without declaring the endpoint of that region in the fs.s3a.endpoint option.
  • You are trying to use a V2 auth endpoint but have set up S3 to use an explicit V4 auth endpoint. As V4 endpoints do not redirect to the central endpoint, you must declare the relevant endpoint explicitly.
  • You are trying to use a private S3 service but have forgotten to set the fs.s3a.endpoint; AWS is rejecting your private login.
  • You are trying to talk to a private S3 service but somehow it is talking to an HTTP page rather than an implementation of the S3 REST API.
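
When a request may be reaching a web page (a proxy, a captive portal, a load balancer's status page) instead of an S3 implementation, the response body is usually the giveaway: S3-style errors are XML documents with an Error root element containing a Code element. The following is a rough heuristic sketch, not part of S3A; the class name, the enum and the sample bodies are hypothetical.

```java
import java.util.Locale;

/** Sketch: guess whether a response body came from an S3-style service. */
class ResponseSniffer {

    enum Kind { S3_ERROR_XML, HTML_PAGE, UNKNOWN }

    /** Classifies a response body by inspecting its leading markup. */
    static Kind classify(String body) {
        if (body == null) {
            return Kind.UNKNOWN;
        }
        String text = body.trim();
        String lower = text.toLowerCase(Locale.ROOT);
        // An HTML document suggests a web page answered, not an S3 API.
        if (lower.startsWith("<!doctype html") || lower.startsWith("<html")) {
            return Kind.HTML_PAGE;
        }
        // S3-style errors are XML with <Error> and <Code> elements.
        if (text.contains("<Error>") && text.contains("<Code>")) {
            return Kind.S3_ERROR_XML;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        // Both bodies are invented samples, not captured output.
        String s3Style = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<Error><Code>InvalidRequest</Code><Message>sample</Message></Error>";
        String portal = "<!DOCTYPE html><html><body>Please sign in</body></html>";
        System.out.println(classify(s3Style));
        System.out.println(classify(portal));
    }
}
```

An HTML_PAGE result points at the endpoint or network path rather than at credentials, which narrows down which of the causes above to investigate first.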

Encryption

  • You are trying to use SSE-C with a key that cannot decrypt the remote data.
  • You are trying to work with a bucket which is configured to require encryption, but the client doesn't use it.

Classpath

  • A version of Joda-Time incompatible with the JVM is on the classpath. It must be version 2.9.1 or later.

System

  • The client machine doesn't know when it is. Check the clock and the timezone settings.
  • Your DNS setup is returning the wrong IP address for the endpoint.
  • Your network is a mess.
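
One way to check the clock is to compare it with the Date header of any HTTP response from a trusted server. The sketch below computes that difference from a header value you have already obtained; it makes no network calls. The class name and sample timestamps are hypothetical, and the five-minute tolerance is an assumed placeholder rather than a documented S3 or S3A limit.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Sketch: measure local clock skew against an HTTP Date header value. */
class ClockSkew {

    /** Assumed tolerance; a placeholder, not a documented service limit. */
    static final Duration TOLERANCE = Duration.ofMinutes(5);

    /** Returns local time minus server time; positive means the local clock is ahead. */
    static Duration skew(String httpDateHeader, Instant localNow) {
        Instant serverTime = ZonedDateTime
            .parse(httpDateHeader, DateTimeFormatter.RFC_1123_DATE_TIME)
            .toInstant();
        return Duration.between(serverTime, localNow);
    }

    /** True if the absolute skew does not exceed the assumed tolerance. */
    static boolean withinTolerance(Duration skew) {
        return skew.abs().compareTo(TOLERANCE) <= 0;
    }

    public static void main(String[] args) {
        // Sample header value; in practice, copy it from a real response.
        String header = "Wed, 04 Jan 2017 10:15:30 GMT";
        Duration skew = skew(header, Instant.now());
        System.out.println("Skew in seconds: " + skew.getSeconds());
        System.out.println("Within tolerance: " + withinTolerance(skew));
    }
}
```

Because the comparison is done on instants, it is independent of the displayed timezone. A wrong timezone setting still matters when the clock was set by hand to local wall-clock time, since the machine's idea of UTC is then off by the zone difference.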

As you can see, there is a wide variety of possible causes, spread across credential setup, endpoint configuration, system configuration and other aspects of the S3A client. We are hampered in helping diagnose this by the need to keep those credentials secret.

Logging at lower levels

The AWS SDK and the Apache HTTP components can be configured to log in more detail, as can S3A itself.

log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
log4j.logger.com.amazonaws.request=DEBUG
log4j.logger.org.apache.http=DEBUG
log4j.logger.org.apache.http.wire=ERROR

Be aware that logging HTTP headers may leak sensitive AWS account information, so the output should not be shared.
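
If part of a log must nonetheless be examined by someone else, the most obviously sensitive header values can be masked first. This is a minimal sketch that assumes header lines of the form "Name: value" and masks only the Authorization and x-amz-security-token headers; the class name and the sample line are hypothetical. It reduces exposure but does not make a log safe to publish, since bucket names, account details and signed URLs may appear elsewhere.

```java
import java.util.regex.Pattern;

/** Sketch: mask the values of selected HTTP headers in a log line. */
class LogRedactor {

    // Matches the header name, the colon, and the rest of the line.
    private static final Pattern SENSITIVE_HEADER = Pattern.compile(
        "(?i)\\b(authorization|x-amz-security-token)(\\s*:\\s*).*");

    /** Replaces the value of a sensitive header with a fixed marker. */
    static String redact(String line) {
        return SENSITIVE_HEADER.matcher(line).replaceAll("$1$2[REDACTED]");
    }

    public static void main(String[] args) {
        // Invented sample line, not real log output.
        String line = "http-outgoing-0 >> Authorization: AWS4-HMAC-SHA256 Credential=EXAMPLE";
        System.out.println(redact(line));
    }
}
```

Lines that contain neither header pass through unchanged, so the method can be applied to every line of a log file in turn.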
