Addresses outstanding review concerns and several adjacent issues
surfaced during a follow-up review pass.
Lifecycle / supervision
- Track every per-child Wait goroutine via sync.WaitGroup and unblock
pending sigCh sends through context cancellation so early-return paths
(OnChildSpawn / OnMasterReady error, recovery doCommand error,
ErrOverRecovery) can no longer leak goroutines or stall children.
- Install signal.Notify(SIGTERM, SIGINT) in the master so deploy/
rolling-restart signals enter the shutdown path instead of killing
the master without graceful teardown.
- Replace the unconditional SIGKILL defer with a SIGTERM-then-SIGKILL
sequence gated by a configurable ShutdownGracePeriod (defaults to 5s;
the Windows path stays SIGKILL since Signal(SIGTERM) is unsupported there).
API
- OnChildRecover now returns error so callers can implement recovery
policies (circuit-breaker etc.); panic in any hook is recovered and
surfaced as the returned error, with diagnostic logging.
- Add RecoverInterval (optional crash-loop backoff) and
ShutdownGracePeriod fields with safe zero-value defaults.
- Export ErrCommandProducerNilCmd and ErrCommandProducerNotStarted
sentinel errors so callers can errors.Is them.
- Rename oldPid/newPid to oldPID/newPID per Go initialism convention.
- Logger interface now declares an explicit compile-time compatibility
check with fasthttp.Logger.
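
As a rough illustration of the "panic in any hook is recovered and surfaced as the returned error" item, the sketch below wraps a hook call in a deferred recover. callHook is a hypothetical helper name and the error text is an assumption; the actual wrapper and its logging may differ.

```go
package main

import "fmt"

// callHook runs hook and converts a panic into an ordinary error.
// The helper name and the error format are illustrative assumptions.
func callHook(name string, hook func() error) (err error) {
	if hook == nil {
		return nil
	}
	defer func() {
		if r := recover(); r != nil {
			// Overwrite the named result so the caller sees the panic as an error.
			err = fmt.Errorf("%s panicked: %v", name, r)
		}
	}()
	return hook()
}

func main() {
	err := callHook("OnChildRecover", func() error {
		panic("boom")
	})
	fmt.Println("hook error:", err)
}
```

Returning the panic as an error lets the supervisor treat it like any other hook failure, for example to stop recovering children.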
Resource hygiene
- Master closes both the original tcpListener and the duped fd in
p.files when prefork() returns; previously the duped fd leaked once
per call.
- doCommand wraps every error path with fmt.Errorf and %w so caller-side
diagnostics keep stage context.
- Strip pre-existing FASTHTTP_PREFORK_CHILD entries before appending so
child env never carries duplicate keys.
- Extract magic numbers as package constants
(inheritedListenerFD, masterPollInterval, defaultShutdownGracePeriod,
preforkChildEnvValue).
- Rename the inherited listener fd via os.NewFile so net.FileListener
errors are diagnosable.
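
The env de-duplication item can be sketched as follows. childEnv is a hypothetical helper name; the variable name FASTHTTP_PREFORK_CHILD and the value 1 come from the notes in this message, but the real implementation may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// childEnv returns a copy of parent with any existing
// FASTHTTP_PREFORK_CHILD entries removed and exactly one appended.
func childEnv(parent []string) []string {
	const key = "FASTHTTP_PREFORK_CHILD"
	env := make([]string, 0, len(parent)+1)
	for _, kv := range parent {
		if strings.HasPrefix(kv, key+"=") {
			continue // drop stale or duplicate entries
		}
		env = append(env, kv)
	}
	return append(env, key+"=1")
}

func main() {
	parent := []string{"PATH=/usr/bin", "FASTHTTP_PREFORK_CHILD=0", "FASTHTTP_PREFORK_CHILD=1"}
	fmt.Println(childEnv(parent))
}
```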
Tests
- Migrate to t.Setenv (drop the global setUp/tearDown helpers) — fixes
the env-mutation-vs-parallel race.
- Replace rand.Intn port helper with `:0` + Listener.Addr() to remove
port-collision flakes under -count and parallel runs.
- Collapse the three near-identical Test_ListenAndServe* tests into a
single table-driven subtest that actually asserts the args forwarded
to ServeFunc/ServeTLSFunc/ServeTLSEmbedFunc.
- Add coverage for the previously untested branches:
CommandProducer returning err / nil cmd / unstarted cmd,
initial OnChildSpawn error, OnMasterReady error,
hook panic surfacing, RecoverInterval enforcement.
- noopChildProducer helper kills + waits any spawned child binaries
during cleanup so failed tests no longer leave subprocesses around.
- Move Wait() goroutine before OnChildSpawn so Kill()+Wait() works
correctly if a callback fails and the deferred cleanup runs
- Add Wait() call in deferred cleanup after Kill() to reap children
- Same fix in recovery loop
- Remove shallow callback tests that only tested the Go compiler
- Add Test_Prefork_Lifecycle: runs full prefork with CommandProducer,
verifies callbacks fire in correct order with correct arguments
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- OnChildRecover: signature changed to func(oldPid, newPid int) so
callers can track which process was replaced
- OnChildSpawn: also called for recovered children (a recovered child
is still a spawned child)
- watchMaster: call OnMasterDeath when FindProcess fails (process is
most likely gone)
- CommandProducer: document that FASTHTTP_PREFORK_CHILD=1 must be set
in the child env, and what the default does when nil
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lint fixes:
- Remove unused Reuseport field write in test (govet/unusedwrite)
- Replace fmt.Errorf with errors.New for static errors (perfsprint)
Review feedback (Copilot):
- Validate CommandProducer returns a started command (nil/Process check)
- Clarify ListenAndServeTLS doc: parameter order and internal forwarding
- Use hermetic test binary re-exec instead of external 'go' binary
- Rename misleading test to reflect what it actually asserts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep upstream's (addr, certKey, certFile) signature to avoid breaking
callers. Fix the doc comment to match the actual parameter order instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The three ListenAndServe* methods had identical child setup code
(listen, set ln, watch master). Extract to listenAsChild() for
cleaner code. Also add comment for the magic file descriptor number 3.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On Windows, os.Getppid() returns a static PID that doesn't change when
the parent exits (no reparenting). Use FindProcess+Wait instead, which
correctly detects parent exit. Also document why masterPID comparison
works for Docker containers (master PID 1 case).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Integrate upstream's OnMasterDeath callback (replaces WatchMaster bool),
os.Executable() for child command, and watchMaster as method on Prefork.
Keep our OnChildSpawn, OnMasterReady, OnChildRecover callbacks and
CommandProducer. Update tests accordingly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: detect master process death in prefork children
Prefork child processes had no mechanism to detect if the master process
died unexpectedly. Children would become orphans, get reparented to
PID 1, and keep running silently with no supervision.
Add a watchMaster goroutine that stores the original parent PID at
startup and exits when the parent PID changes, matching the approach
used in gofiber/fiber.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add integration test for watchMaster orphan detection
Verifies that prefork children exit when the master process is killed,
using a two-level subprocess chain (test → master → child) with pipe-based
synchronization to ensure the child has recorded its PPID before the
master is killed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: pass masterPID to watchMaster and clean up tests
Capture PPID before launching the goroutine to eliminate a race between
the PPID snapshot and the ready signal. Align test style with the rest
of the project (t.Parallel, naming, ASCII-only comments).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: make prefork orphan detection configurable via OnMasterDeath callback
Address review feedback: make watchMaster opt-in via an OnMasterDeath
callback field (nil/off by default for backwards compatibility). Users
can set DefaultOnMasterDeath for os.Exit(1) or provide custom cleanup.
Also fixes ticker leak in watchMaster.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* address review feedback: remove DefaultOnMasterDeath, delete tests, fix log message
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
It's better to use an environment variable, as environment variables are
more standard. The way flags are parsed isn't standardized within the Go
ecosystem.
Fixes: https://github.com/valyala/fasthttp/issues/1782
* Run go test on github actions
travis-ci.org has stopped.
See also: https://github.com/curl/curl/issues/7150
Downside: github actions don't support ppc64le
* Run less
* delete .travis.yml
* Remove travis + minor lint fixes
* Make the prefork mode more robust
Under the current prefork mode, the main process exits as soon as one of the prefork child processes fails to complete successfully. Child processes should run independently, and the main process should exit only after all child processes have finished.
* Start over those failed child processes automatically
* Kill all child processes before main process exits
* Remove redundant code
* Add configurable threshold of starting over child processes
* Return an error when RecoverThreshold is exceeded
* Resolved requested changes
* Add logs
* Resolve requested changes
* feat: workflow to validate security using GoSec
* Update security.yml
* Fix gosec problems
These are all either false positives or os.Open operations done on
filenames supplied by the fasthttp user, which we have to assume are safe.
* Just ignore some rules globally
* Fix more warnings
* No more warnings
Co-authored-by: Erik Dubbelboer <erik@dubbelboer.com>