BorisovAI
Bug Fix | borisovai-site | Claude Code

When Root Processes Steal Your Production Ports

I was staring at two 502 errors on my screen—both borisovai.tech and api.borisovai.tech returning Bad Gateway. The reverse proxy was responding, which meant Traefik was alive and well. But the backend services on ports 4001 and 4002 had simply vanished.

I’d been working on a CI/CD fix that afternoon, tweaking PM2 deployment logic to ensure processes ran under the gitlab-runner user instead of root. Clean separation, proper permissions, standard practice. The branch was fix/ci-pm2-selective-delete, sitting in review but not yet merged to master. I figured the issue was unrelated—maybe just a crashed service that needed a restart.

Then I SSH’d into the server and ran pm2 list.

The frontend and strapi processes weren’t there. Not “stopped”—completely absent. But something was holding ports 4001 and 4002. I checked what was listening:

PID 450215

Owner: root.
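
For reference, that kind of check can be sketched roughly like this. It assumes a Linux host with ss (iproute2) and ps available; the helper names pids_from_ss and who_holds are hypothetical and not taken from the original session.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for finding out who holds a port.

# Read `ss -ltnp` output on stdin and print the unique PIDs found in it.
pids_from_ss() {
  grep -o 'pid=[0-9]*' | cut -d= -f2 | sort -u
}

# Print "PORT PID USER" for every process listening on the given TCP ports.
# Run this as root: ss only reveals PIDs of other users' sockets to root.
who_holds() {
  local port pid
  for port in "$@"; do
    for pid in $(ss -ltnpH "sport = :$port" | pids_from_ss); do
      echo "$port $pid $(ps -o user= -p "$pid")"
    done
  done
}

# Usage: who_holds 4001 4002
```

If the user column shows root while pm2 list under the deploy user is empty, the listener most likely belongs to a different PM2 daemon.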

That’s when it clicked. The previous deployment had launched frontend and strapi under the root PM2 daemon. When my CI pipeline tried to deploy the new version under gitlab-runner, it couldn’t bind to those ports—they were already taken by the root-owned processes. The services failed to start, the old processes eventually crashed from some unrelated issue, and now we had a gap: no one owned the ports anymore, but the PM2 config under gitlab-runner was still broken.

I had two options: kill the root PM2 daemon and let the new deployment take over, or patch around it. Going halfway would create conflicts. I chose clean.

First, I deleted frontend and strapi from root’s PM2 daemon. PM2 runs one daemon per user, so the command has to be run as root:

sudo pm2 delete frontend strapi

Then I fixed ownership:

chown -R gitlab-runner:gitlab-runner /var/www/borisovai-site

With the root-owned processes gone, ports 4001 and 4002 were free to bind. I restarted the processes under gitlab-runner, and both services came up with zero restarts and “online” status. borisovai.tech loaded. api.borisovai.tech responded.
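
A quick way to confirm that the backends are accepting connections again is a small port probe. This is only a sketch: it assumes the services listen on 127.0.0.1 behind Traefik (the post does not say which address they bind to), and the function names port_open and check_backends are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical post-deploy check; function names are illustrative.

# Succeed if something accepts TCP connections on HOST:PORT.
# Relies on bash's /dev/tcp redirection, so it needs bash, not plain sh.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Report each given port on 127.0.0.1; return non-zero if any is closed.
check_backends() {
  local port failed=0
  for port in "$@"; do
    if port_open 127.0.0.1 "$port"; then
      echo "port $port: accepting connections"
    else
      echo "port $port: closed"
      failed=1
    fi
  done
  return "$failed"
}

# Usage: check_backends 4001 4002
```

An open port only proves that something is listening. Pairing it with the port-owner check confirms the listener belongs to gitlab-runner and not to a leftover root process.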

The real lesson wasn’t about PM2 or permissions. It was about process isolation. When two deploy mechanisms (manual commands run as root and CI automation) both try to manage the same service, you get subtle races where crashed processes leave ghosts in the port table, and your logs stop telling the true story.

From now on, every deployment goes through the same owner, the same PM2 daemon, the same path. No shortcuts.
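
One way to enforce that rule is a guard at the top of the deploy script that refuses to run as anyone but the expected user. The sketch below is an assumption about how such a guard could look, not the fix from the fix/ci-pm2-selective-delete branch; require_user is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical guard for the top of a deploy script.

# Return 0 only when the script runs as the expected user.
require_user() {
  local expected="$1" actual
  actual="$(id -un)"
  if [ "$actual" != "$expected" ]; then
    echo "refusing to deploy as '$actual'; expected '$expected'" >&2
    return 1
  fi
}

# Usage at the top of a deploy script:
#   require_user gitlab-runner || exit 1
```

Failing loudly here turns a silent permission conflict into an immediate, visible error in the pipeline log.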

The site stayed down for maybe fifteen minutes. Not a disaster, but enough to remind me: permission conflicts don’t raise alarms—they just silently break things 😄

Metadata

Session ID: grouped_borisovai-site_20260418_1955
Branch: fix/ci-pm2-selective-delete
Dev Joke
Docker is like your first love: you will never forget it, but it is not worth going back.
