mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-06-13 23:36:45 +03:00
feat(security): hot-reload HTTPS certs without restart (k8s cert-manager) (#9181)
* feat(security): hot-reload HTTPS certs for master/volume/filer/webdav/admin S3 and filer already use a refreshing pemfile provider for their HTTPS cert, so rotated certificates (e.g. from k8s cert-manager) are picked up without a restart. Master, volume, webdav, and admin, however, passed cert/key paths straight to ServeTLS/ListenAndServeTLS and loaded once at startup — rotating those certs required a pod restart. Add a small helper NewReloadingServerCertificate in weed/security that wraps pemfile.Provider and returns a tls.Config.GetCertificate closure, then wire it into the four remaining HTTPS entry points. httpdown now also calls ServeTLS when TLSConfig carries a GetCertificate/Certificates but CertFile/KeyFile are empty, so volume server can pre-populate TLSConfig. A unit test exercises the rotation path (write cert, rotate on disk, assert the callback returns the new cert) with a short refresh window. * refactor(security): route filer/s3 HTTPS through the shared cert reloader Before: filer.go and s3.go each kept a *certprovider.Provider on the options struct plus a duplicated GetCertificateWithUpdate method. Both were loading pemfile themselves. Behaviorally they already reloaded, but the logic was duplicated two ways and neither path was shared with the newly-added master/volume/webdav/admin wiring. After: both use security.NewReloadingServerCertificate like the other servers. The per-struct certProvider field and GetCertificateWithUpdate method are removed, along with the now-unused certprovider and pemfile imports. Net: -32 lines, one code path for all HTTPS cert reloading. No behavior change — the refresh window, cache, and handshake contract are identical (the helper wraps the same pemfile.NewProvider). * feat(security): hot-reload HTTPS client certs for mount/backup/upload/etc The HTTP client in weed/util/http/client loaded the mTLS client cert once at startup via tls.LoadX509KeyPair. That left every long-lived HTTPS client process (weed mount, backup, filer.copy, filer→volume, s3→filer/volume) unable to pick up a rotated client cert without a restart — even though the same cert-manager setup was already rotating the server side fine. Swap the client cert loader for a tls.Config.GetClientCertificate callback backed by the same refreshing pemfile provider. New TLS handshakes pick up the rotated cert; in-flight pooled connections keep their old cert and drop as normal transport churn happens. To keep this reusable from both server and client TLS code without an import cycle (weed/security already imports weed/util/http/client for LoadHTTPClientFromFile), extract the pemfile wrapper into a new weed/security/certreload subpackage. weed/security keeps its thin NewReloadingServerCertificate wrapper. The existing unit test moves with the implementation. gRPC mTLS was already handled by security.LoadServerTLS / LoadClientTLS; this PR does not change any gRPC paths. MQ broker, MQ agent, Kafka gateway, and FUSE mount control plane are gRPC-only and therefore already rotate. CA bundles (ClientCAs / RootCAs / grpc.ca) are still loaded once — noted as a known limitation in the wiki. * fix(security): address PR review feedback on cert reloader Bots (gemini-code-assist + coderabbit) flagged three real issues and a couple of nits. Addressing them here: 1. KeyMaterial used context.Background(). The grpc pemfile provider's KeyMaterial blocks until material arrives or the context deadline expires; with Background() a slow disk could hang the TLS handshake indefinitely. Switched both the server and client callbacks to use hello.Context() / cri.Context() so a stuck read is bounded by the handshake timeout. 2. Admin server loaded TLS inside the serve goroutine. If the cert was bad, the goroutine returned but startAdminServer kept blocking on <-ctx.Done() with no listener, making the process look healthy with nothing bound. Moved TLS setup to run before the goroutine starts and propagate errors via fmt.Errorf; also captures the provider and defers Close(). 3. HTTP client discarded the certprovider.Provider from NewClientGetCertificate. That leaked the refresh goroutine, and NewHttpClientWithTLS had a worse case where a CA-file failure after provider creation orphaned the provider entirely. Added a certProvider field and a Close() method on HTTPClient, and made the constructors close the provider on subsequent error paths. 4. Server-side paths (master/volume/filer/s3/webdav/admin) now retain the provider. filer and webdav run ServeTLS synchronously, so a plain defer works. master/volume/s3 dispatch goroutines and return while the server keeps running, so they hook Close() into grace.OnInterrupt. 5. Test: certreload_test now tolerates transient read/parse errors during file rotation (writeSelfSigned rewrites cert before key) and reports the last error only if the deadline expires. No user-visible behavior change for the happy path. * test(tls): add end-to-end HTTPS cert rotation integration test Boots a real `weed master` with HTTPS enabled, captures the leaf cert served at TLS handshake time, atomically rewrites the cert/key files on disk (the same rename-in-place pattern kubelet does when it swaps a cert-manager Secret), and asserts that a subsequent TLS handshake observes the rotated leaf — with no process restart, no SIGHUP, no reloader sidecar. Verifies the full path: on-disk change → pemfile refresh tick → provider.KeyMaterial → tls.Config.GetCertificate → server TLS handshake. Runtime is ~1s by exposing the reloader's refresh window as an env var (WEED_TLS_CERT_REFRESH_INTERVAL) and setting it to 500ms for the test. The same env var is user-facing — documented in the wiki — so operators running short-lived certs (Vault, cert-manager with duration: 24h, etc.) can tighten the rotation-pickup window without a rebuild. Defaults to 5h to preserve prior behavior. security.CredRefreshingInterval is kept for API compatibility but now aliases certreload.DefaultRefreshInterval so the same env controls both gRPC mTLS and HTTPS reload. * ci(tls): wire the TLS rotation integration test into GitHub Actions Mirrors the existing vacuum-integration-tests.yml shape: Ubuntu runner, Go 1.25, build weed, run `go test` in test/tls_rotation, upload master logs on failure. 10-minute job timeout; the test itself finishes in about a second because WEED_TLS_CERT_REFRESH_INTERVAL is set to 500ms inside the test. Runs on every push to master and on every PR to master. * fix(tls): address follow-up PR review comments Three new comments on the integration test + volume shutdown path: 1. Test: peekServerCert was swallowing every dial/handshake error, which meant waitForCert's "last err: <nil>" fatal message lost all diagnostic value. Thread errors back through: peekServerCert now returns (*x509.Certificate, error), and waitForCert records the latest error so a CI flake points at the actual cause (master didn't come up, handshake rejected, CA pool mismatch, etc.). 2. Test: set HOME=<tempdir> on the master subprocess. Viper today registers the literal path "$HOME/.seaweedfs" without env expansion, so a developer's ~/.seaweedfs/security.toml is accidentally invisible — the test was relying on that. Pinning HOME is belt-and-braces against a future viper upgrade that does expand env vars. 3. volume.go: startClusterHttpService's provider close was registered via grace.OnInterrupt, which fires on SIGTERM but NOT on the v.shutdownCtx.Done() path used by mini / integration tests. The pemfile refresh goroutine leaked in that shutdown path. Now the helper returns a close func and the caller invokes it on BOTH shutdown paths for parity. Also add MinVersion: TLS 1.2 to the test's tls.Config to quiet the ast-grep static-analysis nit — zero-risk since the pool only trusts our in-memory CA. Test runs clean 3/3.
This commit is contained in:
@@ -0,0 +1,56 @@
|
||||
name: "TLS Rotation Integration Tests"
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [ master ]
|
||||
pull_request:
|
||||
branches: [ master ]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
tls-rotation-tests:
|
||||
name: TLS Rotation Integration Tests
|
||||
runs-on: ubuntu-22.04
|
||||
timeout-minutes: 10
|
||||
steps:
|
||||
- name: Set up Go 1.x
|
||||
uses: actions/setup-go@v6
|
||||
with:
|
||||
go-version: ^1.25
|
||||
id: go
|
||||
|
||||
- name: Check out code into the Go module directory
|
||||
uses: actions/checkout@v6
|
||||
|
||||
- name: Build weed binary
|
||||
run: |
|
||||
cd weed && go build -o weed .
|
||||
|
||||
- name: Run TLS Rotation Integration Tests
|
||||
working-directory: test/tls_rotation
|
||||
run: |
|
||||
go test -v -count=1 -timeout 5m
|
||||
|
||||
- name: Collect server logs on failure
|
||||
if: failure()
|
||||
run: |
|
||||
echo "Collecting master logs from temp directories..."
|
||||
mkdir -p /tmp/tls-rotation-test-logs
|
||||
find /tmp -maxdepth 1 -type d -name "TestMasterHTTPS*" 2>/dev/null | while read dir; do
|
||||
if [ -d "$dir" ]; then
|
||||
echo "Found test directory: $dir"
|
||||
cp -r "$dir" /tmp/tls-rotation-test-logs/ 2>/dev/null || true
|
||||
fi
|
||||
done
|
||||
echo "Collected logs:"
|
||||
find /tmp/tls-rotation-test-logs -type f -name "*.log" 2>/dev/null || echo "No logs found"
|
||||
|
||||
- name: Archive logs
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v7
|
||||
with:
|
||||
name: tls-rotation-test-logs
|
||||
path: /tmp/tls-rotation-test-logs/
|
||||
retention-days: 14
|
||||
@@ -0,0 +1,323 @@
|
||||
// Package tls_rotation exercises HTTPS certificate rotation end-to-end:
|
||||
// start a real `weed master` with an HTTPS listener, capture the leaf
|
||||
// served at handshake time, rewrite the cert/key files on disk, and
|
||||
// assert that a subsequent handshake sees the new leaf — all without
|
||||
// stopping the master process. The test shortens the reloader's refresh
|
||||
// window to ~half a second via WEED_TLS_CERT_REFRESH_INTERVAL so it
|
||||
// completes in seconds rather than hours.
|
||||
package tls_rotation
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/ecdsa"
|
||||
"crypto/elliptic"
|
||||
"crypto/rand"
|
||||
"crypto/tls"
|
||||
"crypto/x509"
|
||||
"crypto/x509/pkix"
|
||||
"encoding/pem"
|
||||
"fmt"
|
||||
"math/big"
|
||||
"net"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"syscall"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
// TestMasterHTTPSCertRotation boots `weed master` with HTTPS, confirms
|
||||
// the initial leaf is served, rotates the cert/key pair on disk, and
|
||||
// asserts the rotated leaf is served on subsequent TLS handshakes.
|
||||
func TestMasterHTTPSCertRotation(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping HTTPS rotation integration test in -short mode")
|
||||
}
|
||||
|
||||
weedBin := findWeedBinary(t)
|
||||
|
||||
dir := t.TempDir()
|
||||
tlsDir := filepath.Join(dir, "tls")
|
||||
if err := os.MkdirAll(tlsDir, 0o755); err != nil {
|
||||
t.Fatalf("mkdir tls: %v", err)
|
||||
}
|
||||
certPath := filepath.Join(tlsDir, "server.crt")
|
||||
keyPath := filepath.Join(tlsDir, "server.key")
|
||||
|
||||
ca, caKey := generateCA(t)
|
||||
leafSerial1 := big.NewInt(10001)
|
||||
leafSerial2 := big.NewInt(10002)
|
||||
|
||||
// Initial leaf on disk.
|
||||
writeLeaf(t, certPath, keyPath, ca, caKey, leafSerial1)
|
||||
|
||||
masterDir := filepath.Join(dir, "master")
|
||||
if err := os.MkdirAll(masterDir, 0o755); err != nil {
|
||||
t.Fatalf("mkdir master: %v", err)
|
||||
}
|
||||
// Empty security.toml so the master doesn't pick up a user's
|
||||
// ~/.seaweedfs/security.toml during the test.
|
||||
if err := os.WriteFile(filepath.Join(masterDir, "security.toml"), []byte("# test\n"), 0o644); err != nil {
|
||||
t.Fatalf("write security.toml: %v", err)
|
||||
}
|
||||
|
||||
// Master auto-derives gRPC port as port+10000 when -port.grpc is
|
||||
// unset, so both must fit in uint16. Pin both explicitly.
|
||||
port, grpcPort := getFreeTCPPort(t), getFreeTCPPort(t)
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 90*time.Second)
|
||||
defer cancel()
|
||||
|
||||
cmd := exec.CommandContext(ctx, weedBin, "master",
|
||||
"-ip", "127.0.0.1",
|
||||
"-port", strconv.Itoa(port),
|
||||
"-port.grpc", strconv.Itoa(grpcPort),
|
||||
"-mdir", masterDir,
|
||||
)
|
||||
cmd.Dir = masterDir
|
||||
cmd.Env = append(os.Environ(),
|
||||
// Isolate HOME so the subprocess cannot pick up a developer's
|
||||
// ~/.seaweedfs/security.toml. Viper's AddConfigPath uses the
|
||||
// literal string "$HOME/.seaweedfs" without env expansion today,
|
||||
// so this is only belt-and-braces — but it insures us against a
|
||||
// future viper upgrade that does expand env vars.
|
||||
"HOME="+dir,
|
||||
"WEED_HTTPS_MASTER_CERT="+certPath,
|
||||
"WEED_HTTPS_MASTER_KEY="+keyPath,
|
||||
// Short refresh window so rotation completes in seconds.
|
||||
"WEED_TLS_CERT_REFRESH_INTERVAL=500ms",
|
||||
)
|
||||
logPath := filepath.Join(masterDir, "master.log")
|
||||
logOut, err := os.Create(logPath)
|
||||
if err != nil {
|
||||
t.Fatalf("create master log: %v", err)
|
||||
}
|
||||
cmd.Stdout = logOut
|
||||
cmd.Stderr = logOut
|
||||
|
||||
if err := cmd.Start(); err != nil {
|
||||
t.Fatalf("start master: %v", err)
|
||||
}
|
||||
t.Cleanup(func() {
|
||||
if cmd.Process != nil {
|
||||
_ = cmd.Process.Signal(syscall.SIGTERM)
|
||||
done := make(chan struct{})
|
||||
go func() { _ = cmd.Wait(); close(done) }()
|
||||
select {
|
||||
case <-done:
|
||||
case <-time.After(10 * time.Second):
|
||||
_ = cmd.Process.Kill()
|
||||
<-done
|
||||
}
|
||||
}
|
||||
_ = logOut.Close()
|
||||
if t.Failed() {
|
||||
if b, readErr := os.ReadFile(logPath); readErr == nil {
|
||||
t.Logf("master.log:\n%s", string(b))
|
||||
}
|
||||
}
|
||||
})
|
||||
|
||||
caPool := x509.NewCertPool()
|
||||
caPool.AddCert(ca)
|
||||
|
||||
addr := fmt.Sprintf("127.0.0.1:%d", port)
|
||||
|
||||
// 1. Wait for the initial leaf to appear. Master takes a few seconds
|
||||
// to open its HTTPS listener.
|
||||
waitForCert(t, addr, caPool, leafSerial1, 30*time.Second, "initial cert")
|
||||
|
||||
// Sanity: same handshake twice still observes the initial leaf.
|
||||
got, err := peekServerCert(addr, caPool)
|
||||
if err != nil || got == nil || got.SerialNumber.Cmp(leafSerial1) != 0 {
|
||||
t.Fatalf("second probe before rotation did not return initial leaf: cert=%v err=%v", got, err)
|
||||
}
|
||||
|
||||
// 2. Rotate on disk. pemfile watches mtime, so each file's write is
|
||||
// an atomic rename (tempfile in the same directory).
|
||||
writeLeaf(t, certPath, keyPath, ca, caKey, leafSerial2)
|
||||
|
||||
// 3. Wait for new leaf to take over. With a 500ms refresh and no
|
||||
// connection pooling (tls.Dial opens a fresh conn each time), this
|
||||
// should take a couple of seconds.
|
||||
waitForCert(t, addr, caPool, leafSerial2, 15*time.Second, "rotated cert")
|
||||
}
|
||||
|
||||
// waitForCert polls until a TLS handshake against addr yields a peer
|
||||
// cert with the expected serial, or fails the test at the deadline.
|
||||
// The last handshake error is surfaced in the fatal message so that a
|
||||
// CI flake makes the root cause obvious (master didn't come up, TLS
|
||||
// handshake rejected, CA pool mismatch, etc.).
|
||||
func waitForCert(t *testing.T, addr string, caPool *x509.CertPool, wantSerial *big.Int, within time.Duration, label string) {
|
||||
t.Helper()
|
||||
deadline := time.Now().Add(within)
|
||||
var lastErr error
|
||||
var lastSerial *big.Int
|
||||
for time.Now().Before(deadline) {
|
||||
cert, err := peekServerCert(addr, caPool)
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
} else if cert != nil {
|
||||
lastSerial = cert.SerialNumber
|
||||
if cert.SerialNumber.Cmp(wantSerial) == 0 {
|
||||
return
|
||||
}
|
||||
}
|
||||
time.Sleep(250 * time.Millisecond)
|
||||
}
|
||||
t.Fatalf("timeout waiting for %s (want serial %s, last seen %v, last err %v)", label, wantSerial, lastSerial, lastErr)
|
||||
}
|
||||
|
||||
// peekServerCert opens a one-shot TLS connection and returns the leaf.
|
||||
// Errors (dial failure, handshake rejection, empty peer chain) are
|
||||
// returned rather than swallowed, so the caller can surface them when
|
||||
// the test times out.
|
||||
func peekServerCert(addr string, caPool *x509.CertPool) (*x509.Certificate, error) {
|
||||
d := &net.Dialer{Timeout: 2 * time.Second}
|
||||
conn, err := tls.DialWithDialer(d, "tcp", addr, &tls.Config{
|
||||
RootCAs: caPool,
|
||||
ServerName: "localhost",
|
||||
MinVersion: tls.VersionTLS12,
|
||||
})
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
defer conn.Close()
|
||||
state := conn.ConnectionState()
|
||||
if len(state.PeerCertificates) == 0 {
|
||||
return nil, fmt.Errorf("handshake returned empty peer chain")
|
||||
}
|
||||
return state.PeerCertificates[0], nil
|
||||
}
|
||||
|
||||
func getFreeTCPPort(t *testing.T) int {
|
||||
t.Helper()
|
||||
ln, err := net.Listen("tcp", "127.0.0.1:0")
|
||||
if err != nil {
|
||||
t.Fatalf("listen ephemeral: %v", err)
|
||||
}
|
||||
port := ln.Addr().(*net.TCPAddr).Port
|
||||
_ = ln.Close()
|
||||
return port
|
||||
}
|
||||
|
||||
func findWeedBinary(t *testing.T) string {
|
||||
t.Helper()
|
||||
candidates := []string{
|
||||
"../../weed/weed",
|
||||
"../weed/weed",
|
||||
"./weed",
|
||||
}
|
||||
for _, c := range candidates {
|
||||
if _, err := os.Stat(c); err == nil {
|
||||
abs, absErr := filepath.Abs(c)
|
||||
if absErr == nil {
|
||||
return abs
|
||||
}
|
||||
return c
|
||||
}
|
||||
}
|
||||
if path, err := exec.LookPath("weed"); err == nil {
|
||||
return path
|
||||
}
|
||||
t.Skip("weed binary not found — build with `cd weed && go build` first")
|
||||
return ""
|
||||
}
|
||||
|
||||
// --- cert fixtures -------------------------------------------------------
|
||||
|
||||
// generateCA returns a self-signed CA cert and its private key.
|
||||
func generateCA(t *testing.T) (*x509.Certificate, *ecdsa.PrivateKey) {
|
||||
t.Helper()
|
||||
key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
|
||||
if err != nil {
|
||||
t.Fatalf("gen CA key: %v", err)
|
||||
}
|
||||
tmpl := &x509.Certificate{
|
||||
SerialNumber: big.NewInt(1),
|
||||
Subject: pkix.Name{CommonName: "tls-rotation-test-CA"},
|
||||
NotBefore: time.Now().Add(-time.Hour),
|
||||
NotAfter: time.Now().Add(24 * time.Hour),
|
||||
IsCA: true,
|
||||
BasicConstraintsValid: true,
|
||||
KeyUsage: x509.KeyUsageCertSign | x509.KeyUsageCRLSign,
|
||||
}
|
||||
der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
|
||||
if err != nil {
|
||||
t.Fatalf("create CA cert: %v", err)
|
||||
}
|
||||
parsed, err := x509.ParseCertificate(der)
|
||||
if err != nil {
|
||||
t.Fatalf("parse CA cert: %v", err)
|
||||
}
|
||||
return parsed, key
|
||||
}
|
||||
|
||||
// writeLeaf signs a new leaf cert with the given serial and writes it
|
||||
// plus its key to the given paths via atomic rename — the pattern
|
||||
// Kubernetes (cert-manager → Secret volume mount) produces in practice.
|
||||
func writeLeaf(t *testing.T, certPath, keyPath string, ca *x509.Certificate, caKey *ecdsa.PrivateKey, serial *big.Int) {
|
||||
t.Helper()
|
||||
leafKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
|
||||
if err != nil {
|
||||
t.Fatalf("gen leaf key: %v", err)
|
||||
}
|
||||
tmpl := &x509.Certificate{
|
||||
SerialNumber: serial,
|
||||
Subject: pkix.Name{
|
||||
CommonName: "localhost",
|
||||
},
|
||||
NotBefore: time.Now().Add(-time.Hour),
|
||||
NotAfter: time.Now().Add(24 * time.Hour),
|
||||
KeyUsage: x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
|
||||
ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth, x509.ExtKeyUsageClientAuth},
|
||||
DNSNames: []string{"localhost"},
|
||||
IPAddresses: []net.IP{net.ParseIP("127.0.0.1"), net.ParseIP("::1")},
|
||||
}
|
||||
der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &leafKey.PublicKey, caKey)
|
||||
if err != nil {
|
||||
t.Fatalf("create leaf cert: %v", err)
|
||||
}
|
||||
|
||||
atomicWritePEM(t, certPath, "CERTIFICATE", der)
|
||||
|
||||
keyDER, err := x509.MarshalECPrivateKey(leafKey)
|
||||
if err != nil {
|
||||
t.Fatalf("marshal leaf key: %v", err)
|
||||
}
|
||||
atomicWritePEM(t, keyPath, "EC PRIVATE KEY", keyDER)
|
||||
}
|
||||
|
||||
// atomicWritePEM writes a PEM file via tempfile-in-same-directory plus
|
||||
// rename, matching what kubelet does when it swaps the ..data symlink
|
||||
// for a renewed Secret. Ensures the reader never sees a truncated file.
|
||||
func atomicWritePEM(t *testing.T, path, blockType string, der []byte) {
|
||||
t.Helper()
|
||||
dir := filepath.Dir(path)
|
||||
tmp, err := os.CreateTemp(dir, ".tls-*")
|
||||
if err != nil {
|
||||
t.Fatalf("create tempfile: %v", err)
|
||||
}
|
||||
ok := false
|
||||
defer func() {
|
||||
if !ok {
|
||||
_ = os.Remove(tmp.Name())
|
||||
}
|
||||
}()
|
||||
if err := pem.Encode(tmp, &pem.Block{Type: blockType, Bytes: der}); err != nil {
|
||||
tmp.Close()
|
||||
t.Fatalf("pem encode: %v", err)
|
||||
}
|
||||
if err := tmp.Close(); err != nil {
|
||||
t.Fatalf("close tempfile: %v", err)
|
||||
}
|
||||
if err := os.Chmod(tmp.Name(), 0o600); err != nil {
|
||||
t.Fatalf("chmod tempfile: %v", err)
|
||||
}
|
||||
if err := os.Rename(tmp.Name(), path); err != nil {
|
||||
t.Fatalf("rename tempfile onto %s: %v", path, err)
|
||||
}
|
||||
ok = true
|
||||
}
|
||||
+45
-30
@@ -4,6 +4,7 @@ import (
|
||||
"bufio"
|
||||
"context"
|
||||
"crypto/rand"
|
||||
"crypto/tls"
|
||||
"fmt"
|
||||
"log"
|
||||
"net"
|
||||
@@ -411,43 +412,57 @@ func startAdminServer(ctx context.Context, options AdminOptions, enableUI bool,
|
||||
Handler: handler,
|
||||
}
|
||||
|
||||
// Decide TLS configuration BEFORE launching the server goroutine, so a
|
||||
// bad cert or a missing key surfaces as a startup error instead of a
|
||||
// silently returned goroutine that leaves startAdminServer blocked on
|
||||
// ctx.Done() with no listener.
|
||||
var (
|
||||
clientCertFile,
|
||||
certFile,
|
||||
keyFile string
|
||||
)
|
||||
useTLS := false
|
||||
useMTLS := false
|
||||
|
||||
if viper.GetString("https.admin.key") != "" {
|
||||
useTLS = true
|
||||
certFile = viper.GetString("https.admin.cert")
|
||||
keyFile = viper.GetString("https.admin.key")
|
||||
}
|
||||
|
||||
if viper.GetString("https.admin.ca") != "" {
|
||||
useMTLS = true
|
||||
clientCertFile = viper.GetString("https.admin.ca")
|
||||
}
|
||||
|
||||
if useMTLS {
|
||||
server.TLSConfig = security.LoadClientTLSHTTP(clientCertFile)
|
||||
}
|
||||
|
||||
if useTLS {
|
||||
getCert, certProvider, certErr := security.NewReloadingServerCertificate(certFile, keyFile)
|
||||
if certErr != nil {
|
||||
return fmt.Errorf("load admin HTTPS certificate: %w", certErr)
|
||||
}
|
||||
defer certProvider.Close()
|
||||
if server.TLSConfig == nil {
|
||||
server.TLSConfig = &tls.Config{}
|
||||
}
|
||||
server.TLSConfig.GetCertificate = getCert
|
||||
}
|
||||
|
||||
// Start server
|
||||
go func() {
|
||||
log.Printf("Starting SeaweedFS Admin Server on port %d", *options.port)
|
||||
|
||||
// start http or https server with security.toml
|
||||
var (
|
||||
clientCertFile,
|
||||
certFile,
|
||||
keyFile string
|
||||
)
|
||||
useTLS := false
|
||||
useMTLS := false
|
||||
|
||||
if viper.GetString("https.admin.key") != "" {
|
||||
useTLS = true
|
||||
certFile = viper.GetString("https.admin.cert")
|
||||
keyFile = viper.GetString("https.admin.key")
|
||||
}
|
||||
|
||||
if viper.GetString("https.admin.ca") != "" {
|
||||
useMTLS = true
|
||||
clientCertFile = viper.GetString("https.admin.ca")
|
||||
}
|
||||
|
||||
if useMTLS {
|
||||
server.TLSConfig = security.LoadClientTLSHTTP(clientCertFile)
|
||||
}
|
||||
|
||||
var serveErr error
|
||||
if useTLS {
|
||||
log.Printf("Starting SeaweedFS Admin Server with TLS on port %d", *options.port)
|
||||
err = server.ListenAndServeTLS(certFile, keyFile)
|
||||
serveErr = server.ListenAndServeTLS("", "")
|
||||
} else {
|
||||
err = server.ListenAndServe()
|
||||
serveErr = server.ListenAndServe()
|
||||
}
|
||||
|
||||
if err != nil && err != http.ErrServerClosed {
|
||||
log.Printf("Failed to start server: %v", err)
|
||||
if serveErr != nil && serveErr != http.ErrServerClosed {
|
||||
log.Printf("Failed to start server: %v", serveErr)
|
||||
}
|
||||
}()
|
||||
|
||||
|
||||
+5
-20
@@ -15,8 +15,6 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/spf13/viper"
|
||||
"google.golang.org/grpc/credentials/tls/certprovider"
|
||||
"google.golang.org/grpc/credentials/tls/certprovider/pemfile"
|
||||
"google.golang.org/grpc/reflection"
|
||||
|
||||
"github.com/seaweedfs/seaweedfs/weed/credential"
|
||||
@@ -82,7 +80,6 @@ type FilerOptions struct {
|
||||
allowedOrigins *string
|
||||
exposeDirectoryData *bool
|
||||
tusBasePath *string
|
||||
certProvider certprovider.Provider
|
||||
s3ConfigFile *string // optional path to static S3 identity config
|
||||
mountPeerRegistryEnable *bool // accept MountRegister/MountList RPCs (peer chunk sharing tier 1)
|
||||
// shutdownCtx, when non-nil, tells startFiler to gracefully shut down its
|
||||
@@ -315,15 +312,6 @@ func runFiler(cmd *Command, args []string) bool {
|
||||
return true
|
||||
}
|
||||
|
||||
// GetCertificateWithUpdate Auto refreshing TSL certificate
|
||||
func (fo *FilerOptions) GetCertificateWithUpdate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
|
||||
certs, err := fo.certProvider.KeyMaterial(context.Background())
|
||||
if certs == nil {
|
||||
return nil, err
|
||||
}
|
||||
return &certs.Certs[0], err
|
||||
}
|
||||
|
||||
func (fo *FilerOptions) startFiler() {
|
||||
|
||||
defaultMux := http.NewServeMux()
|
||||
@@ -500,14 +488,11 @@ func (fo *FilerOptions) startFiler() {
|
||||
caCertFile := viper.GetString("https.filer.ca")
|
||||
disbaleTlsVerifyClientCert := viper.GetBool("https.filer.disable_tls_verify_client_cert")
|
||||
|
||||
pemfileOptions := pemfile.Options{
|
||||
CertFile: certFile,
|
||||
KeyFile: keyFile,
|
||||
RefreshDuration: security.CredRefreshingInterval,
|
||||
}
|
||||
if fo.certProvider, err = pemfile.NewProvider(pemfileOptions); err != nil {
|
||||
glog.Fatalf("pemfile.NewProvider(%v) failed: %v", pemfileOptions, err)
|
||||
getCert, certProvider, err := security.NewReloadingServerCertificate(certFile, keyFile)
|
||||
if err != nil {
|
||||
glog.Fatalf("Filer failed to load HTTPS certificate: %v", err)
|
||||
}
|
||||
defer certProvider.Close()
|
||||
|
||||
caCertPool := x509.NewCertPool()
|
||||
if caCertFile != "" {
|
||||
@@ -524,7 +509,7 @@ func (fo *FilerOptions) startFiler() {
|
||||
}
|
||||
|
||||
tlsConfig := &tls.Config{
|
||||
GetCertificate: fo.GetCertificateWithUpdate,
|
||||
GetCertificate: getCert,
|
||||
ClientAuth: clientAuth,
|
||||
ClientCAs: caCertPool,
|
||||
}
|
||||
|
||||
+14
-1
@@ -328,7 +328,20 @@ func startMaster(masterOption MasterOptions, masterWhiteList []string) {
|
||||
}
|
||||
|
||||
if useTLS {
|
||||
go newHttpServer(r, tlsConfig).ServeTLS(masterListener, certFile, keyFile)
|
||||
getCert, certProvider, err := security.NewReloadingServerCertificate(certFile, keyFile)
|
||||
if err != nil {
|
||||
glog.Fatalf("failed to load master HTTPS certificate: %v", err)
|
||||
}
|
||||
// Master runs ServeTLS in a goroutine and this function then blocks
|
||||
// on shutdownCtx / select{}; tie the pem refresh goroutine to the
|
||||
// existing interrupt hook instead of a local defer that would fire
|
||||
// while the server is still running.
|
||||
grace.OnInterrupt(certProvider.Close)
|
||||
if tlsConfig == nil {
|
||||
tlsConfig = &tls.Config{}
|
||||
}
|
||||
tlsConfig.GetCertificate = getCert
|
||||
go newHttpServer(r, tlsConfig).ServeTLS(masterListener, "", "")
|
||||
} else {
|
||||
go newHttpServer(r, nil).Serve(masterListener)
|
||||
}
|
||||
|
||||
+5
-20
@@ -15,8 +15,6 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/gorilla/mux"
|
||||
"google.golang.org/grpc/credentials/tls/certprovider"
|
||||
"google.golang.org/grpc/credentials/tls/certprovider/pemfile"
|
||||
"google.golang.org/grpc/reflection"
|
||||
|
||||
"github.com/seaweedfs/seaweedfs/weed/glog"
|
||||
@@ -62,7 +60,6 @@ type S3Options struct {
|
||||
localFilerSocket *string
|
||||
dataCenter *string
|
||||
localSocket *string
|
||||
certProvider certprovider.Provider
|
||||
idleTimeout *int
|
||||
concurrentUploadLimitMB *int
|
||||
concurrentFileUploadLimit *int
|
||||
@@ -228,15 +225,6 @@ func runS3(cmd *Command, args []string) bool {
|
||||
|
||||
}
|
||||
|
||||
// GetCertificateWithUpdate Auto refreshing TSL certificate
|
||||
func (s3opt *S3Options) GetCertificateWithUpdate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
|
||||
certs, err := s3opt.certProvider.KeyMaterial(context.Background())
|
||||
if certs == nil {
|
||||
return nil, err
|
||||
}
|
||||
return &certs.Certs[0], err
|
||||
}
|
||||
|
||||
// resolveExternalUrl returns the external URL from the flag or falls back to the S3_EXTERNAL_URL env var.
|
||||
func (s3opt *S3Options) resolveExternalUrl() string {
|
||||
if s3opt.externalUrl != nil && *s3opt.externalUrl != "" {
|
||||
@@ -415,14 +403,11 @@ func (s3opt *S3Options) startS3Server() bool {
|
||||
glog.Fatalf("S3 API Server error: -s3.port.https (%d) cannot be the same as -s3.port (%d)", *s3opt.portHttps, *s3opt.port)
|
||||
}
|
||||
|
||||
pemfileOptions := pemfile.Options{
|
||||
CertFile: *s3opt.tlsCertificate,
|
||||
KeyFile: *s3opt.tlsPrivateKey,
|
||||
RefreshDuration: security.CredRefreshingInterval,
|
||||
}
|
||||
if s3opt.certProvider, err = pemfile.NewProvider(pemfileOptions); err != nil {
|
||||
glog.Fatalf("pemfile.NewProvider(%v) failed: %v", pemfileOptions, err)
|
||||
getCert, certProvider, err := security.NewReloadingServerCertificate(*s3opt.tlsCertificate, *s3opt.tlsPrivateKey)
|
||||
if err != nil {
|
||||
glog.Fatalf("S3 API Server failed to load HTTPS certificate: %v", err)
|
||||
}
|
||||
grace.OnInterrupt(certProvider.Close)
|
||||
|
||||
caCertPool := x509.NewCertPool()
|
||||
if *s3opt.tlsCACertificate != "" {
|
||||
@@ -440,7 +425,7 @@ func (s3opt *S3Options) startS3Server() bool {
|
||||
}
|
||||
|
||||
tlsConfig := &tls.Config{
|
||||
GetCertificate: s3opt.GetCertificateWithUpdate,
|
||||
GetCertificate: getCert,
|
||||
ClientAuth: clientAuth,
|
||||
ClientCAs: caCertPool,
|
||||
}
|
||||
|
||||
+30
-5
@@ -2,6 +2,7 @@ package command
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/tls"
|
||||
"fmt"
|
||||
"net/http"
|
||||
httppprof "net/http/pprof"
|
||||
@@ -312,7 +313,7 @@ func (v VolumeServerOptions) startVolumeServer(volumeFolders, maxVolumeCounts, v
|
||||
}
|
||||
|
||||
// starting the cluster http server
|
||||
clusterHttpServer := v.startClusterHttpService(volumeMux)
|
||||
clusterHttpServer, closeCert := v.startClusterHttpService(volumeMux)
|
||||
|
||||
grace.OnReload(volumeServer.LoadNewVolumes)
|
||||
grace.OnReload(volumeServer.Reload)
|
||||
@@ -329,6 +330,9 @@ func (v VolumeServerOptions) startVolumeServer(volumeFolders, maxVolumeCounts, v
|
||||
}
|
||||
|
||||
shutdown(publicHttpDown, clusterHttpServer, grpcS, volumeServer)
|
||||
if closeCert != nil {
|
||||
closeCert()
|
||||
}
|
||||
stopChan <- true
|
||||
})
|
||||
|
||||
@@ -337,6 +341,9 @@ func (v VolumeServerOptions) startVolumeServer(volumeFolders, maxVolumeCounts, v
|
||||
case <-stopChan:
|
||||
case <-v.shutdownCtx.Done():
|
||||
shutdown(publicHttpDown, clusterHttpServer, grpcS, volumeServer)
|
||||
if closeCert != nil {
|
||||
closeCert()
|
||||
}
|
||||
}
|
||||
} else {
|
||||
select {
|
||||
@@ -442,7 +449,13 @@ func (v VolumeServerOptions) startPublicHttpService(handler http.Handler) httpdo
|
||||
return publicHttpDown
|
||||
}
|
||||
|
||||
func (v VolumeServerOptions) startClusterHttpService(handler http.Handler) httpdown.Server {
|
||||
// startClusterHttpService starts the volume cluster HTTP server and
|
||||
// returns it along with a close func for the cert reloader's refresh
|
||||
// goroutine (nil when HTTPS is disabled). The caller is responsible
|
||||
// for invoking the close func on every shutdown path — both the
|
||||
// SIGTERM/grace.OnInterrupt path and the shutdownCtx path used by
|
||||
// mini/integration tests.
|
||||
func (v VolumeServerOptions) startClusterHttpService(handler http.Handler) (httpdown.Server, func()) {
|
||||
var (
|
||||
certFile, keyFile string
|
||||
)
|
||||
@@ -461,8 +474,7 @@ func (v VolumeServerOptions) startClusterHttpService(handler http.Handler) httpd
|
||||
httpDown := httpdown.HTTP{
|
||||
KillTimeout: time.Minute,
|
||||
StopTimeout: 30 * time.Second,
|
||||
CertFile: certFile,
|
||||
KeyFile: keyFile}
|
||||
}
|
||||
httpS := &http.Server{Handler: handler}
|
||||
|
||||
if viper.GetString("https.volume.ca") != "" {
|
||||
@@ -471,11 +483,24 @@ func (v VolumeServerOptions) startClusterHttpService(handler http.Handler) httpd
|
||||
security.FixTlsConfig(util.GetViper(), httpS.TLSConfig)
|
||||
}
|
||||
|
||||
var closeCert func()
|
||||
if certFile != "" && keyFile != "" {
|
||||
getCert, certProvider, err := security.NewReloadingServerCertificate(certFile, keyFile)
|
||||
if err != nil {
|
||||
glog.Fatalf("Volume server failed to load TLS certificate: %v", err)
|
||||
}
|
||||
closeCert = certProvider.Close
|
||||
if httpS.TLSConfig == nil {
|
||||
httpS.TLSConfig = &tls.Config{}
|
||||
}
|
||||
httpS.TLSConfig.GetCertificate = getCert
|
||||
}
|
||||
|
||||
clusterHttpServer := httpDown.Serve(httpS, listener)
|
||||
go func() {
|
||||
if e := clusterHttpServer.Wait(); e != nil {
|
||||
glog.Fatalf("Volume server fail to serve: %v", e)
|
||||
}
|
||||
}()
|
||||
return clusterHttpServer
|
||||
return clusterHttpServer, closeCert
|
||||
}
|
||||
|
||||
@@ -2,6 +2,7 @@ package command
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/tls"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"os"
|
||||
@@ -150,7 +151,13 @@ func (wo *WebDavOption) startWebDav() bool {
|
||||
|
||||
if *wo.tlsPrivateKey != "" {
|
||||
glog.V(0).Infof("Start Seaweed WebDav Server %s at https %s", version.Version(), listenAddress)
|
||||
if err = httpS.ServeTLS(webDavListener, *wo.tlsCertificate, *wo.tlsPrivateKey); err != nil && err != http.ErrServerClosed {
|
||||
getCert, certProvider, err := security.NewReloadingServerCertificate(*wo.tlsCertificate, *wo.tlsPrivateKey)
|
||||
if err != nil {
|
||||
glog.Fatalf("WebDav Server failed to load TLS certificate: %v", err)
|
||||
}
|
||||
defer certProvider.Close()
|
||||
httpS.TLSConfig = &tls.Config{GetCertificate: getCert}
|
||||
if err = httpS.ServeTLS(webDavListener, "", ""); err != nil && err != http.ErrServerClosed {
|
||||
glog.Fatalf("WebDav Server Fail to serve: %v", err)
|
||||
}
|
||||
} else {
|
||||
|
||||
@@ -0,0 +1,120 @@
|
||||
// Package certreload wraps grpc's pemfile.Provider so both TLS servers
|
||||
// (weed/security) and TLS clients (weed/util/http/client) can share one
|
||||
// reloading cert implementation without an import cycle between them.
|
||||
//
|
||||
// Lives in its own subpackage because weed/security already imports
|
||||
// weed/util/http/client (for LoadHTTPClientFromFile).
|
||||
package certreload
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/tls"
|
||||
"fmt"
|
||||
"os"
|
||||
"time"
|
||||
|
||||
"google.golang.org/grpc/credentials/tls/certprovider"
|
||||
"google.golang.org/grpc/credentials/tls/certprovider/pemfile"
|
||||
)
|
||||
|
||||
// RefreshIntervalEnv names an environment variable that overrides the
|
||||
// refresh cadence. Accepts any time.ParseDuration value (e.g. "30m",
|
||||
// "500ms"). Primarily a hook for integration tests that need rotation
|
||||
// to complete in seconds, but also useful in production when paired
|
||||
// with short-lived certs (e.g. Vault-issued).
|
||||
const RefreshIntervalEnv = "WEED_TLS_CERT_REFRESH_INTERVAL"
|
||||
|
||||
// DefaultRefreshInterval is the cadence at which the pemfile provider
|
||||
// stats cert/key files on disk. It re-parses only when mtime/contents
|
||||
// change, so the hot path (KeyMaterial() on each TLS handshake) stays
|
||||
// cheap.
|
||||
//
|
||||
// 5 hours matches the prior constant used for gRPC mTLS. Resolved once
|
||||
// at process start from RefreshIntervalEnv if set.
|
||||
var DefaultRefreshInterval = resolveRefreshInterval(5 * time.Hour)
|
||||
|
||||
func resolveRefreshInterval(fallback time.Duration) time.Duration {
|
||||
if s := os.Getenv(RefreshIntervalEnv); s != "" {
|
||||
if d, err := time.ParseDuration(s); err == nil && d > 0 {
|
||||
return d
|
||||
}
|
||||
}
|
||||
return fallback
|
||||
}
|
||||
|
||||
// NewServerGetCertificate returns a callback suitable for
|
||||
// tls.Config.GetCertificate. It reloads certFile and keyFile from disk
|
||||
// on each refresh tick so rotated certs (e.g. from k8s cert-manager)
|
||||
// are picked up without a restart. Caller should Close() the returned
|
||||
// provider at shutdown.
|
||||
func NewServerGetCertificate(certFile, keyFile string) (func(*tls.ClientHelloInfo) (*tls.Certificate, error), certprovider.Provider, error) {
|
||||
return newServerGetCertificate(certFile, keyFile, DefaultRefreshInterval)
|
||||
}
|
||||
|
||||
func newServerGetCertificate(certFile, keyFile string, refresh time.Duration) (func(*tls.ClientHelloInfo) (*tls.Certificate, error), certprovider.Provider, error) {
|
||||
provider, err := newProvider(certFile, keyFile, refresh)
|
||||
if err != nil {
|
||||
return nil, nil, err
|
||||
}
|
||||
get := func(hello *tls.ClientHelloInfo) (*tls.Certificate, error) {
|
||||
// KeyMaterial blocks until the pemfile provider has read the files
|
||||
// for the first time. Use the handshake context so a stuck read
|
||||
// bounds the handshake instead of hanging it forever.
|
||||
ctx := context.Background()
|
||||
if hello != nil {
|
||||
ctx = hello.Context()
|
||||
}
|
||||
return current(ctx, provider, certFile)
|
||||
}
|
||||
return get, provider, nil
|
||||
}
|
||||
|
||||
// NewClientGetCertificate returns a callback suitable for
|
||||
// tls.Config.GetClientCertificate. Fires per TLS handshake, so long-lived
|
||||
// HTTPS clients (FUSE mount, backup, filer→volume, etc.) pick up rotated
|
||||
// client mTLS certs as pooled connections recycle.
|
||||
func NewClientGetCertificate(certFile, keyFile string) (func(*tls.CertificateRequestInfo) (*tls.Certificate, error), certprovider.Provider, error) {
|
||||
return newClientGetCertificate(certFile, keyFile, DefaultRefreshInterval)
|
||||
}
|
||||
|
||||
func newClientGetCertificate(certFile, keyFile string, refresh time.Duration) (func(*tls.CertificateRequestInfo) (*tls.Certificate, error), certprovider.Provider, error) {
|
||||
provider, err := newProvider(certFile, keyFile, refresh)
|
||||
if err != nil {
|
||||
return nil, nil, err
|
||||
}
|
||||
get := func(cri *tls.CertificateRequestInfo) (*tls.Certificate, error) {
|
||||
ctx := context.Background()
|
||||
if cri != nil {
|
||||
ctx = cri.Context()
|
||||
}
|
||||
return current(ctx, provider, certFile)
|
||||
}
|
||||
return get, provider, nil
|
||||
}
|
||||
|
||||
func newProvider(certFile, keyFile string, refresh time.Duration) (certprovider.Provider, error) {
|
||||
if certFile == "" || keyFile == "" {
|
||||
return nil, fmt.Errorf("both certFile and keyFile are required")
|
||||
}
|
||||
provider, err := pemfile.NewProvider(pemfile.Options{
|
||||
CertFile: certFile,
|
||||
KeyFile: keyFile,
|
||||
RefreshDuration: refresh,
|
||||
})
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("pemfile.NewProvider: %w", err)
|
||||
}
|
||||
return provider, nil
|
||||
}
|
||||
|
||||
func current(ctx context.Context, provider certprovider.Provider, certFile string) (*tls.Certificate, error) {
|
||||
km, err := provider.KeyMaterial(ctx)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if km == nil || len(km.Certs) == 0 {
|
||||
return nil, fmt.Errorf("no TLS key material available for %s", certFile)
|
||||
}
|
||||
cert := km.Certs[0]
|
||||
return &cert, nil
|
||||
}
|
||||
@@ -0,0 +1,122 @@
|
||||
package certreload
|
||||
|
||||
import (
|
||||
"crypto/ecdsa"
|
||||
"crypto/elliptic"
|
||||
"crypto/rand"
|
||||
"crypto/x509"
|
||||
"crypto/x509/pkix"
|
||||
"encoding/pem"
|
||||
"math/big"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
// TestServerGetCertificatePicksUpNewFile writes an initial cert/key
|
||||
// pair, starts the reloader, then overwrites the files with a new pair
|
||||
// and asserts that the callback eventually returns the new certificate.
|
||||
func TestServerGetCertificatePicksUpNewFile(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
certPath := filepath.Join(dir, "tls.crt")
|
||||
keyPath := filepath.Join(dir, "tls.key")
|
||||
|
||||
if err := writeSelfSigned(certPath, keyPath, "first"); err != nil {
|
||||
t.Fatalf("write first cert: %v", err)
|
||||
}
|
||||
|
||||
// Use a short refresh window so the test doesn't wait 5h.
|
||||
getCert, provider, err := newServerGetCertificate(certPath, keyPath, 100*time.Millisecond)
|
||||
if err != nil {
|
||||
t.Fatalf("newServerGetCertificate: %v", err)
|
||||
}
|
||||
defer provider.Close()
|
||||
|
||||
c1, err := getCert(nil)
|
||||
if err != nil {
|
||||
t.Fatalf("initial getCert: %v", err)
|
||||
}
|
||||
first, err := x509.ParseCertificate(c1.Certificate[0])
|
||||
if err != nil {
|
||||
t.Fatalf("parse first cert: %v", err)
|
||||
}
|
||||
if first.Subject.CommonName != "first" {
|
||||
t.Fatalf("want CN=first, got %q", first.Subject.CommonName)
|
||||
}
|
||||
|
||||
// Rotate the files. pemfile watches mtime; sleep so the new files have
|
||||
// a later mtime even on filesystems with 1s granularity.
|
||||
time.Sleep(1100 * time.Millisecond)
|
||||
if err := writeSelfSigned(certPath, keyPath, "second"); err != nil {
|
||||
t.Fatalf("write second cert: %v", err)
|
||||
}
|
||||
|
||||
// The pemfile poller can briefly observe a mismatched cert/key pair
|
||||
// during rotation (cert written before key). Tolerate transient errors
|
||||
// and only fail at the deadline; keep the last error for diagnostics.
|
||||
var lastErr error
|
||||
deadline := time.Now().Add(5 * time.Second)
|
||||
for time.Now().Before(deadline) {
|
||||
c2, err := getCert(nil)
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
time.Sleep(100 * time.Millisecond)
|
||||
continue
|
||||
}
|
||||
parsed, err := x509.ParseCertificate(c2.Certificate[0])
|
||||
if err != nil {
|
||||
lastErr = err
|
||||
time.Sleep(100 * time.Millisecond)
|
||||
continue
|
||||
}
|
||||
if parsed.Subject.CommonName == "second" {
|
||||
return
|
||||
}
|
||||
time.Sleep(100 * time.Millisecond)
|
||||
}
|
||||
t.Fatalf("reloader did not pick up rotated cert within timeout; last error: %v", lastErr)
|
||||
}
|
||||
|
||||
func writeSelfSigned(certPath, keyPath, commonName string) error {
|
||||
key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
tmpl := &x509.Certificate{
|
||||
SerialNumber: big.NewInt(time.Now().UnixNano()),
|
||||
Subject: pkix.Name{CommonName: commonName},
|
||||
NotBefore: time.Now().Add(-time.Hour),
|
||||
NotAfter: time.Now().Add(time.Hour),
|
||||
KeyUsage: x509.KeyUsageDigitalSignature,
|
||||
ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
|
||||
}
|
||||
der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
certOut, err := os.Create(certPath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if err := pem.Encode(certOut, &pem.Block{Type: "CERTIFICATE", Bytes: der}); err != nil {
|
||||
certOut.Close()
|
||||
return err
|
||||
}
|
||||
if err := certOut.Close(); err != nil {
|
||||
return err
|
||||
}
|
||||
keyDER, err := x509.MarshalECPrivateKey(key)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
keyOut, err := os.Create(keyPath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if err := pem.Encode(keyOut, &pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER}); err != nil {
|
||||
keyOut.Close()
|
||||
return err
|
||||
}
|
||||
return keyOut.Close()
|
||||
}
|
||||
@@ -10,11 +10,11 @@ import (
|
||||
"path/filepath"
|
||||
"slices"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/spf13/viper"
|
||||
|
||||
"github.com/seaweedfs/seaweedfs/weed/glog"
|
||||
"github.com/seaweedfs/seaweedfs/weed/security/certreload"
|
||||
"github.com/seaweedfs/seaweedfs/weed/util"
|
||||
util_http_client "github.com/seaweedfs/seaweedfs/weed/util/http/client"
|
||||
"google.golang.org/grpc"
|
||||
@@ -24,7 +24,11 @@ import (
|
||||
"google.golang.org/grpc/security/advancedtls"
|
||||
)
|
||||
|
||||
const CredRefreshingInterval = time.Duration(5) * time.Hour
|
||||
// CredRefreshingInterval is the refresh cadence for gRPC mTLS certs.
|
||||
// Shares its source of truth with certreload.DefaultRefreshInterval so
|
||||
// a single WEED_TLS_CERT_REFRESH_INTERVAL env var tunes both gRPC and
|
||||
// HTTPS cert reload.
|
||||
var CredRefreshingInterval = certreload.DefaultRefreshInterval
|
||||
|
||||
type Authenticator struct {
|
||||
AllowedWildcardDomain string
|
||||
|
||||
@@ -0,0 +1,20 @@
|
||||
package security
|
||||
|
||||
import (
|
||||
"crypto/tls"
|
||||
|
||||
"github.com/seaweedfs/seaweedfs/weed/security/certreload"
|
||||
"google.golang.org/grpc/credentials/tls/certprovider"
|
||||
)
|
||||
|
||||
// NewReloadingServerCertificate returns a GetCertificate callback for
|
||||
// tls.Config.GetCertificate, backed by a refreshing pem provider so
|
||||
// rotated certs (e.g. from Kubernetes cert-manager) are picked up
|
||||
// without a restart.
|
||||
//
|
||||
// Thin wrapper over certreload; the shared implementation lives in the
|
||||
// subpackage so both server TLS (this package) and HTTP client TLS
|
||||
// (weed/util/http/client) can use it without an import cycle.
|
||||
func NewReloadingServerCertificate(certFile, keyFile string) (func(*tls.ClientHelloInfo) (*tls.Certificate, error), certprovider.Provider, error) {
|
||||
return certreload.NewServerGetCertificate(certFile, keyFile)
|
||||
}
|
||||
@@ -11,6 +11,9 @@ import (
|
||||
"strings"
|
||||
"sync"
|
||||
|
||||
"google.golang.org/grpc/credentials/tls/certprovider"
|
||||
|
||||
"github.com/seaweedfs/seaweedfs/weed/security/certreload"
|
||||
util "github.com/seaweedfs/seaweedfs/weed/util"
|
||||
"github.com/spf13/viper"
|
||||
)
|
||||
@@ -23,6 +26,25 @@ type HTTPClient struct {
|
||||
Client *http.Client
|
||||
Transport *http.Transport
|
||||
expectHttpsScheme bool
|
||||
// certProvider, when non-nil, owns a background refresh goroutine for
|
||||
// the client mTLS cert/key pair. Close() must be called to stop it.
|
||||
certProvider certprovider.Provider
|
||||
}
|
||||
|
||||
// Close stops any background cert refresh goroutine. Safe to call on a
|
||||
// client that was constructed without mTLS. Existing pooled connections
|
||||
// are also closed via CloseIdleConnections.
|
||||
func (httpClient *HTTPClient) Close() {
|
||||
if httpClient == nil {
|
||||
return
|
||||
}
|
||||
if httpClient.certProvider != nil {
|
||||
httpClient.certProvider.Close()
|
||||
httpClient.certProvider = nil
|
||||
}
|
||||
if httpClient.Client != nil {
|
||||
httpClient.Client.CloseIdleConnections()
|
||||
}
|
||||
}
|
||||
|
||||
func (httpClient *HTTPClient) Do(req *http.Request) (*http.Response, error) {
|
||||
@@ -100,7 +122,7 @@ func NewHttpClient(clientName ClientName, opts ...HttpClientOpt) (*HTTPClient, e
|
||||
var tlsConfig *tls.Config = nil
|
||||
|
||||
if httpClient.expectHttpsScheme {
|
||||
clientCertPair, err := getClientCertPair(clientName)
|
||||
certFileName, keyFileName, hasClientCert, err := clientCertPaths(clientName)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
@@ -110,20 +132,24 @@ func NewHttpClient(clientName ClientName, opts ...HttpClientOpt) (*HTTPClient, e
|
||||
return nil, err
|
||||
}
|
||||
|
||||
if clientCertPair != nil || len(clientCaCert) != 0 {
|
||||
if hasClientCert || len(clientCaCert) != 0 {
|
||||
caCertPool, err := createHTTPClientCertPool(clientCaCert, clientCaCertName)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
tlsConfig = &tls.Config{
|
||||
Certificates: []tls.Certificate{},
|
||||
RootCAs: caCertPool,
|
||||
InsecureSkipVerify: false,
|
||||
}
|
||||
|
||||
if clientCertPair != nil {
|
||||
tlsConfig.Certificates = append(tlsConfig.Certificates, *clientCertPair)
|
||||
if hasClientCert {
|
||||
getClientCert, provider, err := certreload.NewClientGetCertificate(certFileName, keyFileName)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error loading client certificate and key: %s", err)
|
||||
}
|
||||
tlsConfig.GetClientCertificate = getClientCert
|
||||
httpClient.certProvider = provider
|
||||
}
|
||||
}
|
||||
|
||||
@@ -175,20 +201,20 @@ func getFileContentFromSecurityConfiguration(clientName ClientName, fileType str
|
||||
return nil, "", nil
|
||||
}
|
||||
|
||||
func getClientCertPair(clientName ClientName) (*tls.Certificate, error) {
|
||||
certFileName := getStringOptionFromSecurityConfiguration(clientName, "cert")
|
||||
keyFileName := getStringOptionFromSecurityConfiguration(clientName, "key")
|
||||
if certFileName == "" && keyFileName == "" {
|
||||
return nil, nil
|
||||
// clientCertPaths reads the https.<clientName>.{cert,key} paths from the
|
||||
// security config, validates they're either both set or both empty, and
|
||||
// returns them along with a hasClientCert flag. Loading is deferred to
|
||||
// certreload so the cert/key pair is picked up from disk on rotation.
|
||||
func clientCertPaths(clientName ClientName) (certFile, keyFile string, hasClientCert bool, err error) {
|
||||
certFile = getStringOptionFromSecurityConfiguration(clientName, "cert")
|
||||
keyFile = getStringOptionFromSecurityConfiguration(clientName, "key")
|
||||
if certFile == "" && keyFile == "" {
|
||||
return "", "", false, nil
|
||||
}
|
||||
if certFileName != "" && keyFileName != "" {
|
||||
clientCert, err := tls.LoadX509KeyPair(certFileName, keyFileName)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error loading client certificate and key: %s", err)
|
||||
}
|
||||
return &clientCert, nil
|
||||
if certFile == "" || keyFile == "" {
|
||||
return "", "", false, fmt.Errorf("https.%s: both cert and key must be set (got cert=%q key=%q)", clientName.LowerCaseString(), certFile, keyFile)
|
||||
}
|
||||
return nil, fmt.Errorf("error loading key pair: key `%s` and certificate `%s`", keyFileName, certFileName)
|
||||
return certFile, keyFile, true, nil
|
||||
}
|
||||
|
||||
func getClientCaCert(clientName ClientName) ([]byte, string, error) {
|
||||
@@ -208,35 +234,44 @@ func NewHttpClientWithTLS(certFile, keyFile, caFile string, insecureSkipVerify b
|
||||
return nil, fmt.Errorf("both cert and key are required for mTLS, got cert=%q key=%q", certFile, keyFile)
|
||||
}
|
||||
|
||||
var clientCert *tls.Certificate
|
||||
var getClientCert func(*tls.CertificateRequestInfo) (*tls.Certificate, error)
|
||||
if certFile != "" && keyFile != "" {
|
||||
cert, err := tls.LoadX509KeyPair(certFile, keyFile)
|
||||
cb, provider, err := certreload.NewClientGetCertificate(certFile, keyFile)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error loading client certificate and key: %s", err)
|
||||
}
|
||||
clientCert = &cert
|
||||
getClientCert = cb
|
||||
httpClient.certProvider = provider
|
||||
}
|
||||
// closeProviderOnError ensures the cert reloader's background refresh
|
||||
// goroutine is shut down if any subsequent step fails before we hand
|
||||
// the client back to the caller.
|
||||
closeProviderOnError := func() {
|
||||
if httpClient.certProvider != nil {
|
||||
httpClient.certProvider.Close()
|
||||
httpClient.certProvider = nil
|
||||
}
|
||||
}
|
||||
|
||||
var caCertPool *x509.CertPool
|
||||
if caFile != "" {
|
||||
caCert, err := os.ReadFile(caFile)
|
||||
if err != nil {
|
||||
closeProviderOnError()
|
||||
return nil, fmt.Errorf("error reading CA cert %s: %s", caFile, err)
|
||||
}
|
||||
caCertPool, err = createHTTPClientCertPool(caCert, caFile)
|
||||
if err != nil {
|
||||
closeProviderOnError()
|
||||
return nil, err
|
||||
}
|
||||
}
|
||||
|
||||
if clientCert != nil || caCertPool != nil || insecureSkipVerify {
|
||||
if getClientCert != nil || caCertPool != nil || insecureSkipVerify {
|
||||
tlsConfig = &tls.Config{
|
||||
Certificates: []tls.Certificate{},
|
||||
RootCAs: caCertPool,
|
||||
InsecureSkipVerify: insecureSkipVerify,
|
||||
}
|
||||
if clientCert != nil {
|
||||
tlsConfig.Certificates = append(tlsConfig.Certificates, *clientCert)
|
||||
GetClientCertificate: getClientCert,
|
||||
RootCAs: caCertPool,
|
||||
InsecureSkipVerify: insecureSkipVerify,
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -289,10 +289,15 @@ func (s *server) manage() {
|
||||
|
||||
func (s *server) serve() {
|
||||
stats.BumpSum(s.stats, "serve", 1)
|
||||
if s.certFile == "" && s.keyFile == "" {
|
||||
s.serveErr <- s.server.Serve(s.listener)
|
||||
} else {
|
||||
switch {
|
||||
case s.certFile != "" || s.keyFile != "":
|
||||
s.serveErr <- s.server.ServeTLS(s.listener, s.certFile, s.keyFile)
|
||||
case s.server.TLSConfig != nil && (len(s.server.TLSConfig.Certificates) > 0 || s.server.TLSConfig.GetCertificate != nil):
|
||||
// TLSConfig carries the certificate source (e.g. a hot-reloading
|
||||
// GetCertificate callback). Pass empty file args so ServeTLS uses it.
|
||||
s.serveErr <- s.server.ServeTLS(s.listener, "", "")
|
||||
default:
|
||||
s.serveErr <- s.server.Serve(s.listener)
|
||||
}
|
||||
close(s.serveDone)
|
||||
close(s.serveErr)
|
||||
|
||||
Reference in New Issue
Block a user