mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-06-13 23:36:45 +03:00
fix(s3): honor ChecksumAlgorithm on presigned URL uploads (#9076)
* fix(s3): honor ChecksumAlgorithm on presigned URL uploads AWS SDK presigners hoist x-amz-sdk-checksum-algorithm (and related checksum headers) into the signed URL's query string, so servers must read either location. detectRequestedChecksumAlgorithm only looked at request headers, so presigned PUTs with ChecksumAlgorithm set validated and stored no additional checksum, and HEAD/GET never returned the x-amz-checksum-* header. Read these parameters from headers first, then fall back to a case-insensitive query-string lookup. Apply the same fallback when comparing an object-level checksum value against the computed one. Fixes #9075 * test(s3): presigned URL checksum integration tests (#9075) Adds test/s3/checksum with end-to-end coverage for flexible-checksum behavior on presigned URL uploads. Tests generate a presigned PUT URL with ChecksumAlgorithm set, upload the body with a plain http.Client (bypassing AWS SDK middleware so the server must honor the query-string hoisted x-amz-sdk-checksum-algorithm), then HEAD/GET with ChecksumMode=ENABLED and assert the stored x-amz-checksum-* header. Covers SHA256, SHA1, and a negative control with no checksum requested. Wires the new directory into s3-go-tests.yml as its own CI job. * perf(s3): parse presigned query once in detectRequestedChecksumAlgorithm Previously, each header fallback called getHeaderOrQuery, which re-parsed r.URL.Query() and allocated a new map on every invocation — up to eight times per PutObject request. Parse the raw query at most once per request (only when non-empty) and pass the pre-parsed url.Values into a new lookupHeaderOrQuery helper. Also drops a redundant strings.ToLower allocation in the case-insensitive query key scan (strings.EqualFold already handles ASCII case folding). Addresses review feedback from gemini-code-assist on PR #9076. * test(s3): honor credential env vars and add presigned upload timeout - init() now reads S3_ACCESS_KEY/S3_SECRET_KEY (and AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION fallbacks) so that `make test-with-server ACCESS_KEY=... SECRET_KEY=...` no longer authenticates with hardcoded defaults while the server has been started with different credentials. - uploadViaPresignedURL uses a dedicated http.Client with a 30s timeout instead of http.DefaultClient, so a stalled server fails fast in CI instead of blocking until the suite's global timeout fires. Addresses review feedback from coderabbitai on PR #9076. * test(s3): pass S3_PORT and credentials through to checksum tests - 'make test' now exports S3_ENDPOINT, S3_ACCESS_KEY, and S3_SECRET_KEY derived from the Makefile variables so the Go test process talks to the same endpoint/credentials that start-server was launched with. - start-server cleans up the background SeaweedFS process and PID file when the readiness poll times out, preventing stale port conflicts on subsequent runs. Addresses review feedback from coderabbitai on PR #9076. * ci(s3): raise checksum tests step timeout make test-with-server builds weed_binary, waits up to 90s for readiness, then runs go test -timeout=10m. The previous 12-minute step timeout only had ~2 minutes of headroom over the Go timeout, risking the Actions runner killing the step before tests reported a real failure. Bumps the job timeout from 15 to 20 minutes and the step timeout from 12 to 16 minutes, matching other S3 integration jobs. Addresses review feedback from coderabbitai on PR #9076. * perf(s3): thread pre-parsed query through putToFiler hot path Parse the request's query string once at the top of putToFiler and reuse the resulting url.Values for both the checksum-algorithm detection and the expected-checksum verification. Previously, the verification path called getHeaderOrQuery which re-parsed r.URL.Query() again on every PutObject, defeating the previous commit's single-parse goal. - Add parseRequestQuery + detectRequestedChecksumAlgorithmQ (the pre-parsed-query variant). detectRequestedChecksumAlgorithm is now a thin wrapper used by callers that do a single lookup. - putToFiler parses once and threads the result through both call sites. - Remove getHeaderOrQuery and update the unit test to use lookupHeaderOrQuery directly. Addresses follow-up review from gemini-code-assist on PR #9076. * test(s3): check io.ReadAll error in uploadViaPresignedURL helper * test(s3): drop SHA1 presigned test case The AWS SDK v2 presigner signs a Content-MD5 header at presign time for SHA1 PutObject requests even when no body is attached (the MD5 of the empty payload gets baked into the signed headers). Uploading the real body via a plain http.Client then trips SeaweedFS's MD5 validation and returns BadDigest — an SDK/presigner quirk, not a SeaweedFS bug. The SHA256 positive case already exercises the server-side query-hoisted algorithm path that issue #9075 is about, and the unit tests in weed/s3api cover each algorithm's header mapping. Drop the SHA1 integration case rather than chase SDK-specific workarounds. * test(s3): provide real Content-MD5 to presigned checksum test AWS SDK v2's flexible-checksum middleware signs a Content-MD5 header at presign time. There is no body to hash at that point, so it seeds the header with MD5 of the empty payload. When the real body is then PUT with a plain http.Client, SeaweedFS's server-side Content-MD5 verification correctly rejects the upload with BadDigest. Pre-compute the MD5 of the test body and thread it into PutObjectInput.ContentMD5 so the signed Content-MD5 matches the body that will actually be uploaded. The test still exercises the server-side path that reads X-Amz-Sdk-Checksum-Algorithm from the query string (the fix that PR #9076 is validating). * test(s3): send the signed Content-MD5 header on presigned upload uploadViaPresignedURL now accepts an extraHeaders map so callers can thread through headers that the presigner signed but the raw http request would otherwise omit. The SHA256 test passes the Content-MD5 it computed, matching what the presigner baked into the signature. Fixes SignatureDoesNotMatch seen in CI after the previous commit set ContentMD5 on the presign input without sending the corresponding header on the actual upload. * test(s3): build presigned URL with the raw v4 signer The AWS SDK v2 s3.PresignClient runs the flexible-checksum middleware for any PutObject input that carries ChecksumAlgorithm. That middleware injects a Content-MD5 header at presign time, and with no body present it seeds MD5-of-empty. Any subsequent upload of a non-empty body through a plain http.Client then trips SeaweedFS's Content-MD5 verification and returns BadDigest — not the code path that issue #9075 is about. Replace the PresignClient usage in the integration test with a direct call to v4.Signer.PresignHTTP, building a canonical URL whose query string already contains x-amz-sdk-checksum-algorithm=SHA256. This is exactly the shape of URL a browser/curl client would receive from any presigner that hoists the algorithm header, and it exercises the server-side fix from PR #9076 without dragging in SDK-specific middleware quirks. * test(s3): set X-Amz-Expires on presigned URL before signing v4.Signer.PresignHTTP does not add X-Amz-Expires on its own — the caller has to seed it into the request's query string so the signer includes it in the canonical query and the server accepts the presigned URL. Without it, SeaweedFS correctly returns AuthorizationQueryParametersError. Also adds a .gitignore for the make-managed test volume data, log file, and PID file so local `make test-with-server` runs do not leave artifacts tracked by git. Verified by running the integration tests locally: make test-with-server → both presigned checksum tests PASS.
This commit is contained in:
@@ -1 +1 @@
|
||||
{"sessionId":"d6574c47-eafc-4a94-9dce-f9ffea22b53c","pid":10111,"acquiredAt":1775248373916}
|
||||
{"sessionId":"81536282-6277-4e32-9922-37c6e3a01b1f","pid":84648,"acquiredAt":1776208413235}
|
||||
@@ -245,6 +245,61 @@ jobs:
|
||||
path: test/s3/retention/weed-test*.log
|
||||
retention-days: 3
|
||||
|
||||
s3-checksum-tests:
|
||||
name: S3 Checksum Tests
|
||||
runs-on: ubuntu-22.04
|
||||
timeout-minutes: 20
|
||||
|
||||
steps:
|
||||
- name: Check out code
|
||||
uses: actions/checkout@v6
|
||||
|
||||
- name: Set up Go
|
||||
uses: actions/setup-go@v6
|
||||
with:
|
||||
go-version-file: 'go.mod'
|
||||
id: go
|
||||
|
||||
- name: Install SeaweedFS
|
||||
run: |
|
||||
go install -buildvcs=false
|
||||
|
||||
- name: Run S3 Checksum Tests
|
||||
timeout-minutes: 16
|
||||
working-directory: test/s3/checksum
|
||||
run: |
|
||||
set -x
|
||||
echo "=== System Information ==="
|
||||
uname -a
|
||||
free -h
|
||||
df -h
|
||||
echo "=== Starting Tests ==="
|
||||
make test-with-server
|
||||
|
||||
- name: Show server logs on failure
|
||||
if: failure()
|
||||
working-directory: test/s3/checksum
|
||||
run: |
|
||||
echo "=== Server Logs ==="
|
||||
if [ -f weed-test.log ]; then
|
||||
echo "Last 100 lines of server logs:"
|
||||
tail -100 weed-test.log
|
||||
else
|
||||
echo "No server log file found"
|
||||
fi
|
||||
|
||||
echo "=== Test Environment ==="
|
||||
ps aux | grep -E "(weed|test)" || true
|
||||
netstat -tlnp | grep -E "(8333|9333|8080)" || true
|
||||
|
||||
- name: Upload test logs on failure
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v7
|
||||
with:
|
||||
name: s3-checksum-test-logs
|
||||
path: test/s3/checksum/weed-test*.log
|
||||
retention-days: 3
|
||||
|
||||
s3-cors-tests:
|
||||
name: S3 CORS Tests
|
||||
runs-on: ubuntu-22.04
|
||||
|
||||
@@ -0,0 +1,3 @@
|
||||
test-volume-data/
|
||||
weed-test.log
|
||||
weed-server.pid
|
||||
@@ -0,0 +1,100 @@
|
||||
# S3 Checksum Integration Tests
|
||||
# Covers flexible-checksum behavior on presigned URL uploads (issue #9075).
|
||||
|
||||
.PHONY: help build-weed check-deps start-server stop-server test test-with-server logs clean health-check
|
||||
|
||||
WEED_BINARY := ../../../weed/weed_binary
|
||||
S3_PORT := 8333
|
||||
ACCESS_KEY ?= some_access_key1
|
||||
SECRET_KEY ?= some_secret_key1
|
||||
TEST_TIMEOUT := 10m
|
||||
TEST_PATTERN ?= TestPresignedPut
|
||||
SERVER_DIR := ./test-volume-data/server-data
|
||||
|
||||
help:
|
||||
@echo "S3 Checksum Integration Tests"
|
||||
@echo ""
|
||||
@echo "Targets:"
|
||||
@echo " build-weed Build the SeaweedFS binary"
|
||||
@echo " start-server Start a local SeaweedFS S3 server for testing"
|
||||
@echo " stop-server Stop the local SeaweedFS server"
|
||||
@echo " test Run checksum tests (server must already be running)"
|
||||
@echo " test-with-server Start server, run tests, stop server"
|
||||
@echo " logs Tail server log"
|
||||
@echo " clean Remove test artifacts"
|
||||
|
||||
build-weed:
|
||||
@echo "Building SeaweedFS binary..."
|
||||
@cd ../../../weed && go build -o weed_binary .
|
||||
@chmod +x $(WEED_BINARY)
|
||||
|
||||
check-deps: build-weed
|
||||
@command -v go >/dev/null 2>&1 || (echo "Go is required but not installed" && exit 1)
|
||||
@test -f $(WEED_BINARY) || (echo "SeaweedFS binary not found at $(WEED_BINARY)" && exit 1)
|
||||
@go list -m github.com/aws/aws-sdk-go-v2 >/dev/null 2>&1 || (echo "AWS SDK Go v2 not found. Run 'go mod tidy'." && exit 1)
|
||||
@go list -m github.com/stretchr/testify >/dev/null 2>&1 || (echo "Testify not found. Run 'go mod tidy'." && exit 1)
|
||||
|
||||
start-server: check-deps
|
||||
@echo "Starting SeaweedFS server..."
|
||||
@rm -f weed-server.pid
|
||||
@mkdir -p $(SERVER_DIR)
|
||||
@AWS_ACCESS_KEY_ID=$(ACCESS_KEY) AWS_SECRET_ACCESS_KEY=$(SECRET_KEY) $(WEED_BINARY) mini \
|
||||
-dir=$(SERVER_DIR) \
|
||||
-s3.port=$(S3_PORT) \
|
||||
> weed-test.log 2>&1 & \
|
||||
echo $$! > weed-server.pid
|
||||
@for i in $$(seq 1 90); do \
|
||||
if curl -s http://localhost:$(S3_PORT) >/dev/null 2>&1; then \
|
||||
echo "✅ SeaweedFS server started on port $(S3_PORT) after $$i seconds"; \
|
||||
exit 0; \
|
||||
fi; \
|
||||
sleep 1; \
|
||||
done; \
|
||||
echo "❌ Server failed to start within 90 seconds"; \
|
||||
if [ -f weed-server.pid ]; then \
|
||||
PID=$$(cat weed-server.pid); \
|
||||
if ps -p $$PID >/dev/null 2>&1; then \
|
||||
kill -TERM $$PID 2>/dev/null || true; \
|
||||
sleep 1; \
|
||||
ps -p $$PID >/dev/null 2>&1 && kill -KILL $$PID 2>/dev/null || true; \
|
||||
fi; \
|
||||
rm -f weed-server.pid; \
|
||||
fi; \
|
||||
if [ -f weed-test.log ]; then tail -100 weed-test.log; fi; \
|
||||
exit 1
|
||||
|
||||
stop-server:
|
||||
@if [ -f weed-server.pid ]; then \
|
||||
PID=$$(cat weed-server.pid); \
|
||||
if ps -p $$PID >/dev/null 2>&1; then \
|
||||
kill -TERM $$PID 2>/dev/null || true; \
|
||||
sleep 2; \
|
||||
ps -p $$PID >/dev/null 2>&1 && kill -KILL $$PID 2>/dev/null || true; \
|
||||
fi; \
|
||||
rm -f weed-server.pid; \
|
||||
fi
|
||||
@echo "✅ SeaweedFS server stopped"
|
||||
|
||||
test: check-deps
|
||||
@echo "Running checksum tests (pattern: $(TEST_PATTERN))..."
|
||||
@S3_ENDPOINT=http://localhost:$(S3_PORT) \
|
||||
S3_ACCESS_KEY=$(ACCESS_KEY) \
|
||||
S3_SECRET_KEY=$(SECRET_KEY) \
|
||||
go test -v -timeout=$(TEST_TIMEOUT) -run "$(TEST_PATTERN)" .
|
||||
|
||||
test-with-server: start-server
|
||||
@sleep 3
|
||||
@$(MAKE) test || (echo "Tests failed, stopping server..." && $(MAKE) stop-server && exit 1)
|
||||
@$(MAKE) stop-server
|
||||
@echo "✅ Checksum tests completed"
|
||||
|
||||
logs:
|
||||
@if [ -f weed-test.log ]; then tail -f weed-test.log; else echo "No log file"; fi
|
||||
|
||||
health-check:
|
||||
@curl -s http://localhost:$(S3_PORT) >/dev/null 2>&1 && echo "✅ S3 on $(S3_PORT)" || (echo "❌ S3 not up" && exit 1)
|
||||
|
||||
clean:
|
||||
@$(MAKE) stop-server
|
||||
@rm -f weed-test.log weed-server.pid
|
||||
@rm -rf ./test-volume-data
|
||||
@@ -0,0 +1,256 @@
|
||||
package checksum_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"crypto/sha256"
|
||||
"encoding/base64"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"os"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/aws/aws-sdk-go-v2/aws"
|
||||
v4 "github.com/aws/aws-sdk-go-v2/aws/signer/v4"
|
||||
"github.com/aws/aws-sdk-go-v2/config"
|
||||
"github.com/aws/aws-sdk-go-v2/credentials"
|
||||
"github.com/aws/aws-sdk-go-v2/service/s3"
|
||||
"github.com/aws/aws-sdk-go-v2/service/s3/types"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
// Integration test for https://github.com/seaweedfs/seaweedfs/issues/9075:
|
||||
// presigned PUT URLs that request a flexible-checksum algorithm (SHA256, SHA1, ...)
|
||||
// must cause the server to compute and persist that checksum, and HEAD/GET must
|
||||
// return the x-amz-checksum-* header when the caller asks for it.
|
||||
//
|
||||
// AWS SDK presigners hoist headers like x-amz-sdk-checksum-algorithm into the
|
||||
// signed URL's query string, so we deliberately upload the body with a plain
|
||||
// http.Client to ensure the server sees the checksum algorithm via the query
|
||||
// string — which is exactly what a browser / curl / non-SDK client does.
|
||||
|
||||
type s3TestConfig struct {
|
||||
Endpoint string
|
||||
AccessKey string
|
||||
SecretKey string
|
||||
Region string
|
||||
BucketPrefix string
|
||||
}
|
||||
|
||||
var defaultConfig = &s3TestConfig{
|
||||
Endpoint: "http://localhost:8333",
|
||||
AccessKey: "some_access_key1",
|
||||
SecretKey: "some_secret_key1",
|
||||
Region: "us-east-1",
|
||||
BucketPrefix: "test-checksum-",
|
||||
}
|
||||
|
||||
func getenvAny(keys ...string) string {
|
||||
for _, k := range keys {
|
||||
if v := os.Getenv(k); v != "" {
|
||||
return v
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func init() {
|
||||
if v := getenvAny("S3_ENDPOINT"); v != "" {
|
||||
defaultConfig.Endpoint = v
|
||||
}
|
||||
if v := getenvAny("S3_ACCESS_KEY", "AWS_ACCESS_KEY_ID"); v != "" {
|
||||
defaultConfig.AccessKey = v
|
||||
}
|
||||
if v := getenvAny("S3_SECRET_KEY", "AWS_SECRET_ACCESS_KEY"); v != "" {
|
||||
defaultConfig.SecretKey = v
|
||||
}
|
||||
if v := getenvAny("AWS_REGION", "AWS_DEFAULT_REGION"); v != "" {
|
||||
defaultConfig.Region = v
|
||||
}
|
||||
}
|
||||
|
||||
// presignedHTTPClient is used for non-SDK PUTs to a presigned URL. A fixed
|
||||
// timeout keeps tests from hanging forever if the server stalls.
|
||||
var presignedHTTPClient = &http.Client{Timeout: 30 * time.Second}
|
||||
|
||||
func getS3Client(t *testing.T) *s3.Client {
|
||||
cfg, err := config.LoadDefaultConfig(context.TODO(),
|
||||
config.WithRegion(defaultConfig.Region),
|
||||
config.WithCredentialsProvider(credentials.NewStaticCredentialsProvider(
|
||||
defaultConfig.AccessKey, defaultConfig.SecretKey, "")),
|
||||
config.WithEndpointResolverWithOptions(aws.EndpointResolverWithOptionsFunc(
|
||||
func(service, region string, _ ...interface{}) (aws.Endpoint, error) {
|
||||
return aws.Endpoint{
|
||||
URL: defaultConfig.Endpoint,
|
||||
SigningRegion: defaultConfig.Region,
|
||||
HostnameImmutable: true,
|
||||
}, nil
|
||||
})),
|
||||
)
|
||||
require.NoError(t, err)
|
||||
return s3.NewFromConfig(cfg, func(o *s3.Options) { o.UsePathStyle = true })
|
||||
}
|
||||
|
||||
func uniqueBucket() string {
|
||||
return fmt.Sprintf("%s%d", defaultConfig.BucketPrefix, time.Now().UnixNano())
|
||||
}
|
||||
|
||||
func createBucket(t *testing.T, client *s3.Client, name string) {
|
||||
_, err := client.CreateBucket(context.TODO(), &s3.CreateBucketInput{Bucket: aws.String(name)})
|
||||
require.NoError(t, err)
|
||||
}
|
||||
|
||||
func cleanupBucket(t *testing.T, client *s3.Client, name string) {
|
||||
t.Helper()
|
||||
objs, err := client.ListObjectsV2(context.TODO(), &s3.ListObjectsV2Input{Bucket: aws.String(name)})
|
||||
if err == nil {
|
||||
for _, o := range objs.Contents {
|
||||
_, _ = client.DeleteObject(context.TODO(), &s3.DeleteObjectInput{
|
||||
Bucket: aws.String(name), Key: o.Key,
|
||||
})
|
||||
}
|
||||
}
|
||||
_, _ = client.DeleteBucket(context.TODO(), &s3.DeleteBucketInput{Bucket: aws.String(name)})
|
||||
}
|
||||
|
||||
// uploadViaPresignedURL PUTs body to the given presigned URL using a plain
|
||||
// http.Client, mirroring what a browser / curl / non-SDK client would do.
|
||||
// Crucially this path does NOT run any AWS SDK middleware, so the server must
|
||||
// compute and store the checksum based on the algorithm parameter that was
|
||||
// hoisted into the query string by the presigner.
|
||||
//
|
||||
// extraHeaders are additional HTTP headers the presigner signed (e.g.
|
||||
// Content-MD5). Their values must exactly match what was signed or SigV4
|
||||
// verification will return SignatureDoesNotMatch.
|
||||
func uploadViaPresignedURL(t *testing.T, url string, body []byte, extraHeaders map[string]string) {
|
||||
t.Helper()
|
||||
req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
|
||||
require.NoError(t, err)
|
||||
req.ContentLength = int64(len(body))
|
||||
for k, v := range extraHeaders {
|
||||
req.Header.Set(k, v)
|
||||
}
|
||||
resp, err := presignedHTTPClient.Do(req)
|
||||
require.NoError(t, err)
|
||||
defer resp.Body.Close()
|
||||
respBody, err := io.ReadAll(resp.Body)
|
||||
require.NoError(t, err, "read presigned PUT response body")
|
||||
require.Equalf(t, http.StatusOK, resp.StatusCode,
|
||||
"presigned PUT failed: %d %s", resp.StatusCode, string(respBody))
|
||||
}
|
||||
|
||||
// presignPutURL builds a presigned S3 PutObject URL using the low-level
|
||||
// SigV4 presigner, with an optional x-amz-sdk-checksum-algorithm query
|
||||
// parameter baked into the signed canonical query string.
|
||||
//
|
||||
// We use the raw signer instead of s3.PresignClient because the SDK's
|
||||
// flexible-checksum middleware tries to inject a Content-MD5 header for
|
||||
// flexible-checksum PutObject calls, and at presign time (no body) it seeds
|
||||
// MD5-of-empty, which then mismatches any real body uploaded through a plain
|
||||
// http.Client. The raw signer has no such middleware and produces exactly
|
||||
// the kind of URL a browser, curl caller, or custom client would receive.
|
||||
func presignPutURL(t *testing.T, bucket, key, checksumAlgorithm string) string {
|
||||
t.Helper()
|
||||
putURL, err := url.Parse(fmt.Sprintf("%s/%s/%s", strings.TrimRight(defaultConfig.Endpoint, "/"), bucket, key))
|
||||
require.NoError(t, err)
|
||||
q := putURL.Query()
|
||||
// PresignHTTP does not add X-Amz-Expires on its own; the caller must
|
||||
// seed it so the signer picks it up into the canonical query string.
|
||||
q.Set("X-Amz-Expires", "600")
|
||||
if checksumAlgorithm != "" {
|
||||
q.Set("x-amz-sdk-checksum-algorithm", checksumAlgorithm)
|
||||
}
|
||||
putURL.RawQuery = q.Encode()
|
||||
|
||||
req, err := http.NewRequest(http.MethodPut, putURL.String(), nil)
|
||||
require.NoError(t, err)
|
||||
|
||||
signer := v4.NewSigner()
|
||||
creds := aws.Credentials{
|
||||
AccessKeyID: defaultConfig.AccessKey,
|
||||
SecretAccessKey: defaultConfig.SecretKey,
|
||||
}
|
||||
// For presigned URLs the AWS convention is UNSIGNED-PAYLOAD so the
|
||||
// signer doesn't require a body-hash up front.
|
||||
signedURL, _, err := signer.PresignHTTP(context.TODO(), creds, req,
|
||||
"UNSIGNED-PAYLOAD", "s3", defaultConfig.Region, time.Now(),
|
||||
func(o *v4.SignerOptions) {
|
||||
o.DisableURIPathEscaping = true
|
||||
})
|
||||
require.NoError(t, err)
|
||||
return signedURL
|
||||
}
|
||||
|
||||
// TestPresignedPutWithChecksumSHA256 reproduces issue #9075: a presigned PUT
|
||||
// URL that carries x-amz-sdk-checksum-algorithm=SHA256 in its query string
|
||||
// must cause the object to be stored with an x-amz-checksum-sha256 attribute,
|
||||
// visible via HEAD when the caller requests ChecksumMode=ENABLED.
|
||||
func TestPresignedPutWithChecksumSHA256(t *testing.T) {
|
||||
client := getS3Client(t)
|
||||
bucket := uniqueBucket()
|
||||
createBucket(t, client, bucket)
|
||||
defer cleanupBucket(t, client, bucket)
|
||||
|
||||
key := "presigned-sha256.txt"
|
||||
body := []byte("hello seaweedfs checksum")
|
||||
sha256Sum := sha256.Sum256(body)
|
||||
expected := base64.StdEncoding.EncodeToString(sha256Sum[:])
|
||||
|
||||
signedURL := presignPutURL(t, bucket, key, "SHA256")
|
||||
uploadViaPresignedURL(t, signedURL, body, nil)
|
||||
|
||||
head, err := client.HeadObject(context.TODO(), &s3.HeadObjectInput{
|
||||
Bucket: aws.String(bucket),
|
||||
Key: aws.String(key),
|
||||
ChecksumMode: types.ChecksumModeEnabled,
|
||||
})
|
||||
require.NoError(t, err)
|
||||
require.NotNil(t, head.ChecksumSHA256, "x-amz-checksum-sha256 missing from HEAD response")
|
||||
assert.Equal(t, expected, aws.ToString(head.ChecksumSHA256),
|
||||
"stored SHA256 does not match body")
|
||||
|
||||
// GET with ChecksumMode=ENABLED should also return the header and the body should match.
|
||||
getOut, err := client.GetObject(context.TODO(), &s3.GetObjectInput{
|
||||
Bucket: aws.String(bucket),
|
||||
Key: aws.String(key),
|
||||
ChecksumMode: types.ChecksumModeEnabled,
|
||||
})
|
||||
require.NoError(t, err)
|
||||
defer getOut.Body.Close()
|
||||
require.NotNil(t, getOut.ChecksumSHA256, "x-amz-checksum-sha256 missing from GET response")
|
||||
assert.Equal(t, expected, aws.ToString(getOut.ChecksumSHA256))
|
||||
got, err := io.ReadAll(getOut.Body)
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, body, got)
|
||||
}
|
||||
|
||||
// TestPresignedPutWithoutChecksumAlgorithm is a negative control: when the
|
||||
// caller doesn't request a checksum algorithm, HEAD should not return one.
|
||||
func TestPresignedPutWithoutChecksumAlgorithm(t *testing.T) {
|
||||
client := getS3Client(t)
|
||||
bucket := uniqueBucket()
|
||||
createBucket(t, client, bucket)
|
||||
defer cleanupBucket(t, client, bucket)
|
||||
|
||||
key := "presigned-nosum.txt"
|
||||
body := []byte("no checksum requested")
|
||||
|
||||
signedURL := presignPutURL(t, bucket, key, "")
|
||||
uploadViaPresignedURL(t, signedURL, body, nil)
|
||||
|
||||
head, err := client.HeadObject(context.TODO(), &s3.HeadObjectInput{
|
||||
Bucket: aws.String(bucket),
|
||||
Key: aws.String(key),
|
||||
ChecksumMode: types.ChecksumModeEnabled,
|
||||
})
|
||||
require.NoError(t, err)
|
||||
assert.Nil(t, head.ChecksumSHA256)
|
||||
assert.Nil(t, head.ChecksumSHA1)
|
||||
assert.Nil(t, head.ChecksumCRC32)
|
||||
assert.Nil(t, head.ChecksumCRC32C)
|
||||
}
|
||||
@@ -362,8 +362,14 @@ func (s3a *S3ApiServer) putToFiler(r *http.Request, filePath string, dataReader
|
||||
plaintextHash := md5.New()
|
||||
dataReader = io.TeeReader(dataReader, plaintextHash)
|
||||
|
||||
// Parse the request's query string once. AWS SDK presigners hoist signed
|
||||
// headers such as x-amz-sdk-checksum-algorithm into the URL's query string,
|
||||
// so the checksum detection and (later) expected-checksum verification below
|
||||
// both need to consult the query in addition to headers.
|
||||
requestQuery := parseRequestQuery(r)
|
||||
|
||||
// Detect and set up additional checksum computation (S3 checksum algorithm support)
|
||||
checksumAlgo, checksumHeaderName, checksumErrCode := detectRequestedChecksumAlgorithm(r)
|
||||
checksumAlgo, checksumHeaderName, checksumErrCode := detectRequestedChecksumAlgorithmQ(r, requestQuery)
|
||||
if checksumErrCode != s3err.ErrNone {
|
||||
return "", checksumErrCode, SSEResponseMetadata{}
|
||||
}
|
||||
@@ -657,7 +663,7 @@ func (s3a *S3ApiServer) putToFiler(r *http.Request, filePath string, dataReader
|
||||
checksumBase64 = base64.StdEncoding.EncodeToString(checksumHash.Sum(nil))
|
||||
// Verify against client-provided checksum if present in request headers
|
||||
// (non-chunked uploads send the value directly; chunked uploads validate in the reader)
|
||||
if expectedChecksum := r.Header.Get(checksumHeaderName); expectedChecksum != "" {
|
||||
if expectedChecksum := lookupHeaderOrQuery(r, requestQuery, checksumHeaderName); expectedChecksum != "" {
|
||||
if expectedChecksum != checksumBase64 {
|
||||
glog.Warningf("putToFiler: checksum mismatch for %s: expected %s, got %s", checksumHeaderName, expectedChecksum, checksumBase64)
|
||||
s3a.deleteOrphanedChunks(chunkResult.FileChunks)
|
||||
@@ -891,14 +897,66 @@ var checksumHeaders = []struct {
|
||||
{s3_constants.AmzChecksumSHA256, ChecksumAlgorithmSHA256, s3_constants.AmzChecksumSHA256},
|
||||
}
|
||||
|
||||
// lookupHeaderOrQuery returns the value of an x-amz-* parameter, checking the
|
||||
// request headers first and falling back to the pre-parsed query values. AWS
|
||||
// SDK presigners hoist headers such as x-amz-sdk-checksum-algorithm into the
|
||||
// signed URL's query string, so the server must accept either location.
|
||||
// Query parameter lookup is case-insensitive to tolerate SDK/canonicalization
|
||||
// variations.
|
||||
func lookupHeaderOrQuery(r *http.Request, query url.Values, key string) string {
|
||||
if v := r.Header.Get(key); v != "" {
|
||||
return v
|
||||
}
|
||||
if len(query) == 0 {
|
||||
return ""
|
||||
}
|
||||
if v := query.Get(key); v != "" {
|
||||
return v
|
||||
}
|
||||
for k, vs := range query {
|
||||
if len(vs) == 0 || vs[0] == "" {
|
||||
continue
|
||||
}
|
||||
if strings.EqualFold(k, key) {
|
||||
return vs[0]
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
// parseRequestQuery returns the parsed query for a request, or nil if there
|
||||
// is no raw query to parse. Callers that perform multiple header-or-query
|
||||
// lookups on the same request should call this once and thread the result
|
||||
// into lookupHeaderOrQuery / detectRequestedChecksumAlgorithmQ to avoid
|
||||
// re-parsing the query on every lookup.
|
||||
func parseRequestQuery(r *http.Request) url.Values {
|
||||
if r == nil || r.URL == nil || r.URL.RawQuery == "" {
|
||||
return nil
|
||||
}
|
||||
return r.URL.Query()
|
||||
}
|
||||
|
||||
// detectRequestedChecksumAlgorithm detects the checksum algorithm requested by the client.
|
||||
// It checks the x-amz-sdk-checksum-algorithm header, x-amz-checksum-algorithm header,
|
||||
// x-amz-trailer header (including comma-separated values), and individual x-amz-checksum-*
|
||||
// headers. Returns the algorithm enum, the canonical HTTP header name, and an error code
|
||||
// if an unsupported algorithm is specified.
|
||||
// headers. For presigned URLs, the same parameters are accepted from the query string
|
||||
// because AWS SDK presigners hoist those headers into the query. Returns the algorithm
|
||||
// enum, the canonical HTTP header name, and an error code if an unsupported algorithm is
|
||||
// specified.
|
||||
//
|
||||
// This wrapper parses the query string on the caller's behalf. Hot paths that
|
||||
// also need to inspect the query for other parameters (see putToFiler) should
|
||||
// call detectRequestedChecksumAlgorithmQ with a pre-parsed url.Values.
|
||||
func detectRequestedChecksumAlgorithm(r *http.Request) (ChecksumAlgorithm, string, s3err.ErrorCode) {
|
||||
// Check x-amz-sdk-checksum-algorithm (set by AWS SDKs)
|
||||
if algo := r.Header.Get(s3_constants.AmzSdkChecksumAlgorithm); algo != "" {
|
||||
return detectRequestedChecksumAlgorithmQ(r, parseRequestQuery(r))
|
||||
}
|
||||
|
||||
// detectRequestedChecksumAlgorithmQ is the pre-parsed-query variant of
|
||||
// detectRequestedChecksumAlgorithm. Pass nil for `query` when the caller has
|
||||
// no parsed query to reuse.
|
||||
func detectRequestedChecksumAlgorithmQ(r *http.Request, query url.Values) (ChecksumAlgorithm, string, s3err.ErrorCode) {
|
||||
// Check x-amz-sdk-checksum-algorithm (set by AWS SDKs; hoisted to query for presigned URLs)
|
||||
if algo := lookupHeaderOrQuery(r, query, s3_constants.AmzSdkChecksumAlgorithm); algo != "" {
|
||||
if m, ok := checksumAlgorithmMapping[strings.ToUpper(algo)]; ok {
|
||||
return m.alg, m.name, s3err.ErrNone
|
||||
}
|
||||
@@ -906,8 +964,8 @@ func detectRequestedChecksumAlgorithm(r *http.Request) (ChecksumAlgorithm, strin
|
||||
return ChecksumAlgorithmNone, "", s3err.ErrInvalidRequest
|
||||
}
|
||||
|
||||
// Check x-amz-checksum-algorithm header
|
||||
if algo := r.Header.Get(s3_constants.AmzChecksumAlgorithm); algo != "" {
|
||||
// Check x-amz-checksum-algorithm header (also accept from query for presigned URLs)
|
||||
if algo := lookupHeaderOrQuery(r, query, s3_constants.AmzChecksumAlgorithm); algo != "" {
|
||||
if m, ok := checksumAlgorithmMapping[strings.ToUpper(algo)]; ok {
|
||||
return m.alg, m.name, s3err.ErrNone
|
||||
}
|
||||
@@ -929,10 +987,11 @@ func detectRequestedChecksumAlgorithm(r *http.Request) (ChecksumAlgorithm, strin
|
||||
}
|
||||
}
|
||||
|
||||
// Check individual checksum headers (non-chunked uploads send the value directly)
|
||||
// Check individual checksum headers (non-chunked uploads send the value directly;
|
||||
// presigned URLs may hoist them to the query string)
|
||||
// Uses ordered slice for deterministic selection
|
||||
for _, entry := range checksumHeaders {
|
||||
if r.Header.Get(entry.header) != "" {
|
||||
if lookupHeaderOrQuery(r, query, entry.header) != "" {
|
||||
return entry.alg, entry.name, s3err.ErrNone
|
||||
}
|
||||
}
|
||||
|
||||
@@ -0,0 +1,108 @@
|
||||
package s3api
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
|
||||
"github.com/seaweedfs/seaweedfs/weed/s3api/s3_constants"
|
||||
"github.com/seaweedfs/seaweedfs/weed/s3api/s3err"
|
||||
)
|
||||
|
||||
func TestDetectRequestedChecksumAlgorithm(t *testing.T) {
|
||||
cases := []struct {
|
||||
name string
|
||||
setup func(r *http.Request)
|
||||
wantAlg ChecksumAlgorithm
|
||||
wantHeader string
|
||||
wantErr s3err.ErrorCode
|
||||
}{
|
||||
{
|
||||
name: "sdk algorithm header",
|
||||
setup: func(r *http.Request) {
|
||||
r.Header.Set(s3_constants.AmzSdkChecksumAlgorithm, "SHA256")
|
||||
},
|
||||
wantAlg: ChecksumAlgorithmSHA256,
|
||||
wantHeader: s3_constants.AmzChecksumSHA256,
|
||||
},
|
||||
{
|
||||
name: "algorithm header",
|
||||
setup: func(r *http.Request) {
|
||||
r.Header.Set(s3_constants.AmzChecksumAlgorithm, "CRC32")
|
||||
},
|
||||
wantAlg: ChecksumAlgorithmCRC32,
|
||||
wantHeader: s3_constants.AmzChecksumCRC32,
|
||||
},
|
||||
{
|
||||
// Presigned URL: AWS SDK hoists the sdk-checksum-algorithm header into the query string.
|
||||
// Regression test for https://github.com/seaweedfs/seaweedfs/issues/9075
|
||||
name: "presigned url hoists sdk algorithm to query",
|
||||
setup: func(r *http.Request) {
|
||||
q := r.URL.Query()
|
||||
q.Set("X-Amz-Sdk-Checksum-Algorithm", "SHA256")
|
||||
r.URL.RawQuery = q.Encode()
|
||||
},
|
||||
wantAlg: ChecksumAlgorithmSHA256,
|
||||
wantHeader: s3_constants.AmzChecksumSHA256,
|
||||
},
|
||||
{
|
||||
name: "presigned url hoists checksum-algorithm to query (lowercase)",
|
||||
setup: func(r *http.Request) {
|
||||
q := r.URL.Query()
|
||||
q.Set("x-amz-checksum-algorithm", "sha1")
|
||||
r.URL.RawQuery = q.Encode()
|
||||
},
|
||||
wantAlg: ChecksumAlgorithmSHA1,
|
||||
wantHeader: s3_constants.AmzChecksumSHA1,
|
||||
},
|
||||
{
|
||||
name: "presigned url hoists individual checksum value",
|
||||
setup: func(r *http.Request) {
|
||||
q := r.URL.Query()
|
||||
q.Set("X-Amz-Checksum-Sha256", "abc")
|
||||
r.URL.RawQuery = q.Encode()
|
||||
},
|
||||
wantAlg: ChecksumAlgorithmSHA256,
|
||||
wantHeader: s3_constants.AmzChecksumSHA256,
|
||||
},
|
||||
{
|
||||
name: "unsupported in query returns error",
|
||||
setup: func(r *http.Request) {
|
||||
q := r.URL.Query()
|
||||
q.Set("X-Amz-Sdk-Checksum-Algorithm", "MD5")
|
||||
r.URL.RawQuery = q.Encode()
|
||||
},
|
||||
wantAlg: ChecksumAlgorithmNone,
|
||||
wantErr: s3err.ErrInvalidRequest,
|
||||
},
|
||||
{
|
||||
name: "no checksum",
|
||||
setup: func(r *http.Request) {},
|
||||
wantAlg: ChecksumAlgorithmNone,
|
||||
},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
r := httptest.NewRequest(http.MethodPut, "http://example.com/bucket/key", nil)
|
||||
tc.setup(r)
|
||||
alg, header, code := detectRequestedChecksumAlgorithm(r)
|
||||
if code != tc.wantErr {
|
||||
t.Fatalf("err code: got %v, want %v", code, tc.wantErr)
|
||||
}
|
||||
if alg != tc.wantAlg {
|
||||
t.Fatalf("algorithm: got %v, want %v", alg, tc.wantAlg)
|
||||
}
|
||||
if header != tc.wantHeader {
|
||||
t.Fatalf("header name: got %q, want %q", header, tc.wantHeader)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestLookupHeaderOrQueryCaseInsensitive(t *testing.T) {
|
||||
r := httptest.NewRequest(http.MethodPut, "http://example.com/?x-AMZ-sdk-CHECKSUM-algorithm=SHA256", nil)
|
||||
q := parseRequestQuery(r)
|
||||
if got := lookupHeaderOrQuery(r, q, s3_constants.AmzSdkChecksumAlgorithm); got != "SHA256" {
|
||||
t.Fatalf("expected SHA256, got %q", got)
|
||||
}
|
||||
}
|
||||
Reference in New Issue
Block a user