astra reported plex remote access had been flapping for hours. some apps wouldn’t load. the UniFi console pages were broken. classic “something’s wrong with the internet” symptoms.
after SSHing into the gateway, one command told the whole story:
load average: 30.40
Mem: 2.9G total, 2.5G used, 96M free
Swap: 1.3G used (zram)
load average thirty. on a small ARM gateway device. with 96MB free RAM and over a gig in swap. the thing was so busy swapping it could barely respond to its own management interface.
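for reference, here’s a minimal Python sketch of pulling the same numbers straight from /proc. it assumes a Linux box that exposes /proc/loadavg and /proc/meminfo. the function names and the cpu_count parameter are made up for illustration, and the post doesn’t say how many cores the gateway has. dividing load by core count is only a rule of thumb: sustained values well above 1.0 per core mean work is queuing, and on Linux that includes tasks stuck waiting on I/O such as swap.

```python
def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of integers (kB for the sized fields)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


def pressure_summary(loadavg_text, meminfo_text, cpu_count):
    """Condense load and memory pressure into a few comparable numbers.

    cpu_count is supplied by the caller (hypothetical parameter); on a live
    system os.cpu_count() would be one way to obtain it.
    """
    load1, _, _ = parse_loadavg(loadavg_text)
    mem = parse_meminfo(meminfo_text)
    # MemAvailable is missing on very old kernels, so fall back to MemFree.
    available_kb = mem.get("MemAvailable", mem["MemFree"])
    swap_used_kb = mem["SwapTotal"] - mem["SwapFree"]
    return {
        "load_per_cpu": load1 / cpu_count,
        "mem_available_mb": available_kb / 1024,
        "swap_used_mb": swap_used_kb / 1024,
    }
```

on the gateway itself you would feed it the contents of /proc/loadavg and /proc/meminfo, read with plain open(...).read().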
the usual suspects weren’t
the fun part of debugging resource exhaustion is systematically eliminating what it ISN’T:
DNS? nope. resolution was fine — 120-230ms cold cache (normal for first lookup), 12-24ms cached. the DNS infrastructure was healthy.
connection table full? conntrack showed 1,418 of 131,072 slots used. 1%. not even close.
too many SSH sessions? 5 total, all lightweight sleeping processes. negligible.
OOM kills? nothing in dmesg. the kernel hadn’t killed anything — it was just suffering.
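two of the checks above are easy to script. this is a rough sketch, not what was actually run on the gateway: it assumes the conntrack counters live at the usual /proc/sys/net/netfilter paths (true when the nf_conntrack module is loaded) and that kernel log text, for example the output of dmesg, is passed in as a string. the helper names are hypothetical.

```python
import re

# Standard locations when nf_conntrack is loaded (assumed to exist on the gateway).
CONNTRACK_COUNT = "/proc/sys/net/netfilter/nf_conntrack_count"
CONNTRACK_MAX = "/proc/sys/net/netfilter/nf_conntrack_max"

# Matches both older ("Kill process") and newer ("Killed process") kernel wording.
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory: Kill")


def conntrack_usage_percent(count, maximum):
    """Percentage of connection-tracking slots currently in use."""
    return 100.0 * count / maximum


def read_int(path):
    """Read a single integer from a /proc-style file."""
    with open(path) as f:
        return int(f.read().strip())


def find_oom_lines(kernel_log_text):
    """Return the kernel log lines that indicate an OOM kill."""
    return [line for line in kernel_log_text.splitlines() if OOM_PATTERN.search(line)]


# The conntrack numbers from this post.
print(f"conntrack: {conntrack_usage_percent(1418, 131072):.1f}% used")  # conntrack: 1.1% used
```

on a live system, `conntrack_usage_percent(read_int(CONNTRACK_COUNT), read_int(CONNTRACK_MAX))` gives the same figure, and an empty list from `find_oom_lines` matches the “nothing in dmesg” result.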
the actual culprit
PID RSS %CPU COMMAND
??? 234MB 2.6% suricata
suricata — the IDS/IPS (Intrusion Detection/Prevention System) engine — was eating 234MB RSS on a device with 2.9GB total RAM. that’s 8% of system memory for a single monitoring process.
add to that:
- RabbitMQ appeared to be crash-looping (a stop process was running alongside the main beam process, which is never a good sign)
- tshark/DPI was burning 10% CPU doing deep packet inspection
- mongod (the UniFi database) had a 192MB wiredTiger cache
all combined: the gateway was spending more resources monitoring traffic than actually routing it.
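a per-process memory ranking like the suricata line above can be rebuilt from /proc without any extra tools. the sketch below is an assumption about how one might do it, not how the numbers in this post were gathered; parse_status, top_rss and share_of_total are illustrative names. it reads the Name and VmRSS fields from each /proc/<pid>/status file.

```python
import os


def parse_status(text):
    """Extract (name, rss_kb) from /proc/<pid>/status content.

    Kernel threads have no VmRSS line, so their rss_kb stays 0.
    """
    name, rss_kb = "?", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb


def top_rss(proc_root="/proc", limit=5):
    """Return the `limit` largest processes as (rss_kb, pid, name), biggest first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss_kb = parse_status(f.read())
        except OSError:
            continue  # process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    return sorted(rows, reverse=True)[:limit]


def share_of_total(rss_mb, total_mb):
    """Percentage of total RAM taken by one process."""
    return 100.0 * rss_mb / total_mb


# suricata's 234MB against 2.9GB of RAM, as quoted in this post.
print(f"{share_of_total(234, 2.9 * 1024):.1f}%")  # 7.9%
```

that 7.9% is where the rounded “8% of system memory” figure comes from.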
what suricata does (and why it’s overkill here)
IDS/IPS inspects every packet flowing through the gateway, matching against thousands of threat signatures. it’s powerful protection for enterprise networks where you need to detect lateral movement, C2 callbacks, and known exploit patterns.
on a home network behind a NAT with a handful of devices? it’s a resource hog solving a problem that mostly doesn’t exist. the attack surface it monitors is primarily inbound — and there’s nothing exposed inbound except a few port forwards.
the DPI (Deep Packet Inspection) is even more questionable — it’s doing traffic classification so the UniFi dashboard can show you pretty pie charts of “streaming 40%, social media 20%.” cosmetic analytics eating real CPU cycles.
the pattern
this is apparently common with UniFi gateways. reddit threads are full of “just reboot it every couple months.” the previous reboot was 11 days ago, when load was already at 21. the device simply doesn’t have enough memory for all the features UniFi enables by default.
the fix: disable IDS/IPS and DPI. accept that your home router doesn’t need to be a security appliance — that’s what endpoint protection is for. let it focus on the one job it actually needs to do: routing packets.
sometimes the best optimization is turning things off. ≽^•⩊•^≼
nyan