Systematic Debugging the Overnight OOM
24th January 2026
Tracking down the OOM event
I woke up this morning to find my GNOME session had crashed overnight. Terminal sessions gone, browsers closed, had to log back in. The journalctl output told me why: an Out of Memory event that killed 48 processes at 00:56:33.
My gut reaction was to blame Chrome or some runaway Node process, but this time I decided to actually look into it.
Systematic debugging with Claude
I asked Claude to investigate using sd, my short alias for Jesse Vincent’s excellent systematic-debugging skill, a four-phase debugging framework that goes:
- Root Cause Investigation
- Pattern Analysis
- Hypothesis Testing
- Implementation
That order matters. No guessing allowed.
please scan all claude code transcripts and the relevant system logs and develop three hypothesis, using the sd skill, as to what caused the OOM in the last ~6 hours or so
What got killed
Here’s what journalctl showed:
14 orphaned bd processes stood out.
bd is the golang implementation of beads, a git-backed issue tracker I use with Claude Code. It spawns processes for triage, graph computations, and IPC. When Claude subagents invoke it, apparently these child processes weren’t getting cleaned up.
The system had been running for 13+ days. 14 orphaned beads processes built up over that time.
| Process | Count |
|---|---|
| bd | 14 |
| zsh | 11 |
| http-server | 5 |
| python | 4 |
| claude | 2 |
| zoom | 1 |
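To check a running system for the same pattern before it gets this far, one option is to list processes that have been re-parented to PID 1 straight from `/proc`. This is a minimal sketch, not what the investigation actually ran: `parse_status` and `orphans_by_name` are hypothetical helper names, and it assumes the Linux `/proc/<pid>/status` format (the `Name`, `PPid`, and `VmRSS` fields). On systemd desktops, orphans may be adopted by the per-user `systemd --user` manager instead of PID 1, so `reaper_pids` is left configurable.

```python
import os
from collections import defaultdict


def parse_status(text):
    """Extract (Name, PPid, VmRSS in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    # Kernel threads have no VmRSS line, so default to 0 kB.
    rss_kb = int(fields.get("VmRSS", "0 kB").split()[0])
    return fields.get("Name", "?"), int(fields.get("PPid", "0")), rss_kb


def orphans_by_name(proc_root="/proc", reaper_pids=(1,)):
    """Group processes whose parent is a reaper PID.

    Returns {name: (count, total_rss_kb)}. Assumption: a parent of PID 1
    (or another PID passed in reaper_pids) marks a re-parented process.
    """
    if not os.path.isdir(proc_root):
        return {}
    totals = defaultdict(lambda: [0, 0])
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, ppid, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # process exited (or is unreadable) mid-scan
        if ppid in reaper_pids and rss_kb > 0:
            totals[name][0] += 1
            totals[name][1] += rss_kb
    return {name: (count, rss) for name, (count, rss) in totals.items()}
```

Calling `orphans_by_name()` also returns ordinary daemons, since those legitimately have PID 1 as their parent. The useful signal is a name like bd or http-server showing up with a count that keeps growing between runs.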
Three hypotheses
- Beads process accumulation: Each bd process holds memory for issue caching, graph operations (PageRank, betweenness), and IPC channels. 14 orphans over 13 days of uptime. Most likely cause.
- Claude transcript growth: Found transcript files at 491MB and 348MB. One session had 64 subagent files. Long-running sessions with large contexts might not free memory properly.
- http-server leaks: 5 orphaned http-server processes. Claude spawns these for HTML previews. When sessions crash, they persist.
The pattern underneath
All three point to the same thing: process lifecycle management failure.
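When a terminal session ends, the processes attached to it normally get SIGHUP and terminate; a parent that simply exits sends its children nothing. So if children are detached or nohup’d, they outlive the parent and become orphans, re-parented to PID 1 (or to a per-user reaper such as systemd --user). Without explicit cleanup, they stick around. Memory builds up. Eventually the OOM killer steps in.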
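Explicit cleanup can be as simple as putting each helper in its own process group and signalling that group when the parent is done. The following is a generic sketch of that idea, not how Claude Code or bd manage their children; `managed_child` and `grace_seconds` are hypothetical names, and it assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_child(argv, grace_seconds=5.0):
    """Run argv in its own process group and tear the group down on exit."""
    # start_new_session=True runs setsid() in the child, so the child and
    # anything it spawns share a process group that can be signalled as a unit.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # still running: ask politely, then insist
            try:
                os.killpg(proc.pid, signal.SIGTERM)
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                os.killpg(proc.pid, signal.SIGKILL)
                proc.wait()
            except ProcessLookupError:
                pass  # group already gone


if __name__ == "__main__":
    # Illustrative use: a stand-in for a long-running preview server.
    with managed_child([sys.executable, "-c", "import time; time.sleep(60)"]) as child:
        print("helper running as pid", child.pid)
    print("helper exit status:", child.returncode)
```

A helper that calls setsid() itself, or descendants that outlive an already-exited group leader, would still escape this. Supervision based on cgroups (for example a systemd scope or service unit) is the more robust option.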
The system has 62GB RAM and 80GB swap. Usage was at 42GB, not dangerous by itself. But multiple processes trying to allocate at once can still trigger the OOM killer.
Switching to beads_rust
The investigation led to a related change: migrating from bd to br (beads_rust) from @doodlestein.
please replace bd with br ... in ~/.claude, all CLAUDE.mds, all claude skills, and everywhere else that `bd` is mentioned
The differences in br:
- Never auto-commits to git
- No background daemon processes
- You run `br sync --flush-only` explicitly when you want to sync
The golang bd had automatic background syncing and daemons. The Rust version makes everything explicit. Nothing runs unless you tell it to.
beads_rust: You want a stable, minimal issue tracker that stays out of your way.
beads: You want advanced features like Linear/Jira sync, RPC daemon, and automatic hooks.
Here’s a full comparison:
| | br | beads |
|---|---|---|
| Language | Rust | Go |
| Lines of code | ~20,000 | ~276,000 |
| Git operations | Never (explicit) | Auto-commit, hooks |
| Storage | SQLite + JSONL | SQLite/Dolt |
| Background daemon | No | Yes |
| Hook installation | Manual | Automatic |
| Binary size | ~5-8 MB | ~30+ MB |
| Complexity | Focused | Feature-rich |
What I took away
The systematic debugging skill stopped me from doing what I normally would have done: blame Chrome, kill some processes, call it a day. Instead I got actual evidence pointing to 14 orphaned beads processes that had built up over roughly two weeks.
The fix wasn’t just killing processes. It was figuring out why they accumulated and switching to a tool that handles process lifecycle better.
Full investigation notes: oom-investigation-2026-01-24.md