When a publishing uses a publish prefix, instead of listing the contents
of the whole bucket under the storage prefix, only list the contents of
the bucket under the storage prefix and publish prefix, and cache it by
publish prefix.
This speeds up publish operations under a prefix.
instead of caching the whole s3 bucket, cache only the pool path. this
requires an additional parameter, and since this is an interface, all
implementations need to follow. might help in other backends too.
closes#1181
This change lets you disable ACL when using S3 by using a configuration
value of `none`. This way we maintain backward compatibility with the
default setting being `private`.
Fixes: #1067
The S3 backend relies on ETag S3 returns being equal to the MD5 of the
object, but it’s not necessarily true. When the value returned clearly
doesn’t look like a valid MD5 hash (length isn’t exactly 32 characters),
attempt to retrieve the MD5 hash possibly stored in the metadata.
We cannot always do this since user-defined metadata isn’t returned by
the ListObjects call, so verifying it for each object is expensive as it
requires one HEAD request per each object.
This commit fixes#923.
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
The S3 backend relies on ETag S3 returns being equal to the MD5 of the
object, but it’s not necessarily true. For that purpose we store the MD5
object in a separate metadata field as well to make sure it isn’t lost.
From https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html:
> The entity tag is a hash of the object. The ETag reflects changes only
> to the contents of an object, not its metadata. The ETag may or may not
> be an MD5 digest of the object data. Whether or not it depends on how
> the object was created and how it is encrypted as described below:
>
> Objects created by the PUT Object, POST Object, or Copy operation,
> or through the AWS Management Console, and are encrypted by SSE-S3 or
> plaintext, have ETags that are an MD5 digest of their object data.
>
> Objects created by the PUT Object, POST Object, or Copy operation,
> or through the AWS Management Console, and are encrypted by SSE-C or
> SSE-KMS, have ETags that are not an MD5 digest of their object data.
>
> If an object is created by either the Multipart Upload or Part Copy
> operation, the ETag is not an MD5 digest, regardless of the method
> of encryption.
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
According to https://tools.ietf.org/html/rfc7231#section-4.3.2 HEAD
must not have response body so the AWS error code NoSuchKey
cannot be received from S3 and we need to fallback to HTTP NotFound
error code.
DELETE requests, both for temporary files and no longer referenced
packages, lacked the configured path prefix and therefor were not
removed if a prefix is configured.
Original PR: #621Fixes: #619
I've added unit-test to Martyn's PR.
Without this fix, if `prefix` is set on S3 publish endpoint,
aptly would incorrectly build path cache and re-upload every object
on publish.
This is related to #506
As a first step, don't pass MD5 explicitly, pass checksum info object,
so that as a next step we can choose which hash to use.
There should be no functional changes so far.
Next step: stop returning explicit paths from public package pool.
Now that there's an official Go AWS SDK from Amazon, use that instead of
goamz. goamz isn't getting much love these days.
Implement support for STS credentials, as in assumed roles and EC2
instance profiles. The configuration is extended to support a session
token, though I'm not sure why anyone would put temporary credentials in
a configuration file. More likely, no credentials will be explicitly
configured at all, and they will be discovered through the standard SDK
mechanisms described at
<https://blogs.aws.amazon.com/security/post/Tx3D6U6WSFGOK2H/A-New-and-Standardized-Way-to-Manage-Credentials-in-the-AWS-SDKs>.
Resolves#342.