feat(iam): STS web-identity AWS-fidelity polish (Phase 1) (#9318)

* feat(iam): STS web-identity AWS-fidelity polish

- OIDC discovery via /.well-known/openid-configuration; falls back to
  /.well-known/jwks.json when discovery is absent. Reject discovery docs
  whose issuer field does not match the configured issuer to defend
  against issuer substitution.
- ComputeParentUser derives a stable per-identity hash from (sub, iss).
  Surface it as aws:userid in the request context and as the ParentUser
  claim (JSON key "puid") in the session JWT so per-user state survives
  token rotation.
- Per-role MaxSessionDuration (3600..43200) clamps requested
  DurationSeconds before the STS service applies its own caps.
- Tighten RoleSessionName to the AWS contract: 2..64 chars from
  [\w+=,.@-].
- Populate PackedPolicySize in AssumeRole / AssumeRoleWithWebIdentity /
  AssumeRoleWithLDAPIdentity responses as a percentage of the 2048-byte
  inline session policy budget.
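
The RoleSessionName and PackedPolicySize rules above can be sketched as follows. This is a minimal illustration, not the code from this change: validRoleSessionName and packedPolicyPercent are hypothetical names, and rounding the percentage up is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// roleSessionNameRe encodes the contract quoted above: 2..64 characters
// drawn from [\w+=,.@-]. The variable name is hypothetical.
var roleSessionNameRe = regexp.MustCompile(`^[\w+=,.@-]{2,64}$`)

// validRoleSessionName reports whether name satisfies the pattern.
func validRoleSessionName(name string) bool {
	return roleSessionNameRe.MatchString(name)
}

// packedPolicyPercent expresses an inline session policy's size as a
// percentage of the 2048-byte budget. Rounding up is an assumption.
func packedPolicyPercent(policyBytes int) int {
	const budget = 2048
	if policyBytes <= 0 {
		return 0
	}
	return (policyBytes*100 + budget - 1) / budget
}

func main() {
	fmt.Println(validRoleSessionName("alice@example.com")) // allowed characters, valid length
	fmt.Println(validRoleSessionName("a"))                 // too short
	fmt.Println(validRoleSessionName("bad name"))          // space is not in the set
	fmt.Println(packedPolicyPercent(512))                  // a quarter of the budget
}
```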

* fix(iam): leave omitted DurationSeconds nil so STS default applies

capDurationByRole was substituting the role's MaxSessionDuration
when the caller omitted DurationSeconds entirely. AWS returns the
configured default (typically 1 hour) in that case, not the role's
upper bound — a 12h MaxSessionDuration shouldn't silently make every
no-duration assume-role mint a 12h session.

Return nil when requested is nil; let the downstream
calculateSessionDuration in the STS service apply its TokenDuration
default. The role-max upper bound still clamps when the request
arrives with a concrete value above the cap.
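
A minimal sketch of the nil-preserving clamp described above. The name capDuration and its signature are assumptions for illustration; the real capDurationByRole may take different arguments.

```go
package main

import "fmt"

// capDuration is a hypothetical stand-in for capDurationByRole. It returns
// nil when the caller omitted DurationSeconds, so a downstream default can
// apply, and clamps only concrete values that exceed the role maximum.
func capDuration(requested *int64, roleMax int64) *int64 {
	if requested == nil {
		return nil // omitted: do not substitute the role's upper bound
	}
	if roleMax > 0 && *requested > roleMax {
		capped := roleMax
		return &capped
	}
	return requested
}

func main() {
	fmt.Println(capDuration(nil, 43200) == nil) // omitted stays omitted

	tooLong := int64(86400)
	fmt.Println(*capDuration(&tooLong, 43200)) // clamped to the role maximum

	short := int64(900)
	fmt.Println(*capDuration(&short, 43200)) // below the cap, unchanged
}
```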

Addresses gemini high-priority review on PR #9318.

* fix(iam): synchronize OIDCProvider JWKS cache fields

jwksCache, jwksFetchedAt, resolvedJWKSUri, and discoveryFailed are
mutated lazily on the first token-validate call and refreshed
afterwards on TTL expiry. Multiple S3 requests can land here in
parallel, so the writes were racing against subsequent reads on
every other goroutine. resolvedJWKSUri/discoveryFailed inherited
the same un-protected pattern when discovery shipped.

Add sync.RWMutex; getPublicKey takes the read lock for the
common cache-hit path and promotes to the write lock for misses
+ refreshes. fetchJWKSLocked / resolveJWKSUriLocked assume the
write lock is held by the caller; fetchJWKS keeps the
test-friendly entry point that acquires the lock itself.

Addresses gemini high-priority review on PR #9318.

* fix(iam): trim trailing slash + retry discovery after transient failure

Two OIDC discovery edge cases reviewers flagged:

1. Issuer comparison was sensitive to trailing slashes. resolveJWKSUri
   trims them when building the discovery URL, but the doc.Issuer ↔
   p.config.Issuer check did not, so an IDP whose issuer claim drops or
   adds the slash relative to the configured value would be falsely
   rejected. Trim a single trailing slash on each side before comparing.

2. discoveryFailed flipped to true on any error and stayed there for the
   process lifetime. A transient 5xx at startup permanently locked the
   provider into the /.well-known/jwks.json fallback. Reset the flag at
   the top of fetchJWKSLocked when no URI has been cached yet, so each
   JWKS refresh (typically once per TTL = 1h) reattempts discovery.
   Successful discovery remains cached via resolvedJWKSUri so we don't
   pay the discovery RTT on every refresh.

Addresses gemini security-medium + medium reviews on PR #9318.

* fix(iam): require non-empty issuer in OIDC discovery doc

The previous 'doc.Issuer != "" && ...' guard let a discovery document
that omitted the issuer field bypass the issuer-mismatch check
entirely, letting the doc steer fetchJWKS at any URL it provided.
OIDC Discovery 1.0 §3 mandates the issuer field; treat a missing issuer
as a hard failure, the same as a mismatched one. Trailing-slash
equivalence still applies.
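
A minimal sketch of the stricter check, assuming a hypothetical helper named jwksURIFromDiscovery. The issuer and jwks_uri keys are standard discovery metadata; the error messages and overall shape are illustrative only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// discoveryDoc holds the two discovery fields this sketch needs.
type discoveryDoc struct {
	Issuer  string `json:"issuer"`
	JWKSURI string `json:"jwks_uri"`
}

// jwksURIFromDiscovery (hypothetical name) returns the document's jwks_uri
// only when its issuer is present and matches the configured issuer, with a
// single trailing slash ignored on each side.
func jwksURIFromDiscovery(raw []byte, configuredIssuer string) (string, error) {
	var doc discoveryDoc
	if err := json.Unmarshal(raw, &doc); err != nil {
		return "", fmt.Errorf("decode discovery document: %w", err)
	}
	if doc.Issuer == "" {
		return "", errors.New("discovery document omits issuer")
	}
	if strings.TrimSuffix(doc.Issuer, "/") != strings.TrimSuffix(configuredIssuer, "/") {
		return "", fmt.Errorf("issuer mismatch: got %q, want %q", doc.Issuer, configuredIssuer)
	}
	if doc.JWKSURI == "" {
		return "", errors.New("discovery document omits jwks_uri")
	}
	return doc.JWKSURI, nil
}

func main() {
	const configured = "https://idp.example.com"

	ok := []byte(`{"issuer":"https://idp.example.com/","jwks_uri":"https://idp.example.com/keys"}`)
	fmt.Println(jwksURIFromDiscovery(ok, configured))

	missing := []byte(`{"jwks_uri":"https://attacker.example/keys"}`)
	fmt.Println(jwksURIFromDiscovery(missing, configured))

	mismatch := []byte(`{"issuer":"https://attacker.example","jwks_uri":"https://attacker.example/keys"}`)
	fmt.Println(jwksURIFromDiscovery(mismatch, configured))
}
```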

Adds TestDiscoveryRejectsMissingIssuer alongside the existing
TestDiscoveryRejectsIssuerMismatch via a new omitDiscoveryIssuer
toggle on fakeIDP.
Chris Lu
2026-05-04 22:10:49 -07:00
committed by GitHub
parent 2417ba0354
commit d951a8df5a
11 changed files with 698 additions and 39 deletions
@@ -1,6 +1,8 @@
package sts
import (
"crypto/sha256"
"encoding/base64"
"fmt"
"time"
@@ -8,6 +10,20 @@ import (
"github.com/seaweedfs/seaweedfs/weed/glog"
)
// ComputeParentUser returns a stable per-identity hash derived from the OIDC
// (sub, iss) tuple. Only the (sub, iss) pair is guaranteed stable across token
// refreshes per OpenID Connect Core 1.0 §5.7, so any per-user state (audit
// logs, quotas) must key off this value rather than the access-key or session
// id. The hash is base64-rawurl-encoded SHA-256 over "openid:<sub>:<iss>" so
// it stays filesystem-safe and bounded in length for storage in audit paths.
func ComputeParentUser(sub, iss string) string {
	if sub == "" {
		return ""
	}
	h := sha256.Sum256([]byte("openid:" + sub + ":" + iss))
	return base64.RawURLEncoding.EncodeToString(h[:])
}
// defaultCredentialGenerator is a reusable instance for generating temporary credentials
// Reusing a single instance across all calls to ToSessionInfo() reduces allocation overhead
// since this method may be called frequently during signature verification
@@ -45,6 +61,12 @@ type STSSessionClaims struct {
// Session metadata
AssumedAt time.Time `json:"assumed_at"` // when role was assumed
MaxDuration int64 `json:"max_dur,omitempty"` // maximum session duration in seconds
// ParentUser is a stable hash of (sub, iss) for tokens minted from an OIDC
// identity. It survives token rotation since only the (sub, iss) tuple is
// guaranteed stable per OpenID Connect Core 1.0. Empty for non-federated
// session types.
ParentUser string `json:"puid,omitempty"`
}
// NewSTSSessionClaims creates new STS session claims with all required information
@@ -96,6 +118,7 @@ func (c *STSSessionClaims) ToSessionInfo() *SessionInfo {
ExternalUserId: c.ExternalUserId,
ProviderIssuer: c.ProviderIssuer,
RequestContext: c.RequestContext,
ParentUser: c.ParentUser,
// Provide the Subject (sub) from registered claims
Subject: c.Subject,
Credentials: credentials,
@@ -182,3 +205,10 @@ func (c *STSSessionClaims) WithSessionName(sessionName string) *STSSessionClaims
c.SessionName = sessionName
return c
}
// WithParentUser sets the stable per-identity hash for the session. See
// ComputeParentUser for the derivation rule.
func (c *STSSessionClaims) WithParentUser(parentUser string) *STSSessionClaims {
	c.ParentUser = parentUser
	return c
}
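
A standalone usage sketch for ComputeParentUser follows. The function body is copied from the hunk above so the example runs outside the sts package; the subject and issuer values are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// ComputeParentUser is copied from the diff above so this sketch is
// self-contained; in the repository it lives in package sts.
func ComputeParentUser(sub, iss string) string {
	if sub == "" {
		return ""
	}
	h := sha256.Sum256([]byte("openid:" + sub + ":" + iss))
	return base64.RawURLEncoding.EncodeToString(h[:])
}

func main() {
	// Hypothetical claim values taken from two different tokens that were
	// issued to the same user by the same provider.
	first := ComputeParentUser("user-123", "https://idp.example.com")
	second := ComputeParentUser("user-123", "https://idp.example.com")
	other := ComputeParentUser("user-123", "https://other-idp.example.com")

	fmt.Println(first == second) // same (sub, iss) gives the same hash
	fmt.Println(first == other)  // a different issuer gives a different hash
	fmt.Println(len(first))      // 32 hash bytes, unpadded base64url
	fmt.Println(ComputeParentUser("", "https://idp.example.com") == "")
}
```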