TL;DR: Scroll down for the simple solution. Read on for the full discovery of how intricately I shot myself in the foot with solid network security practices.
So I had this very niche error with the UniFi access points. Any time I tried to upgrade the access point in the UniFi controller web interface, it would fail silently.
Further investigation showed that upgrading from the console with the upgrade command wrote the following to /var/log/messages on the AP-AC-Pro-Gen2:
Jan 1 01:01:03 hostname user.notice dak: Upgrade Firmware Download:
Jan 1 01:01:03 hostname user.notice dak: error http code: 000
Why was this? Well, I don’t know of any HTTP status code numbered ‘000’, so I ran the file transfer manually in the terminal to get a more verbose error message:
BZ.v3.8.6# curl https://dl.ubnt.com/unifi/firmware/U7PG2/188.8.131.5280/BZ.qca956x.v184.108.40.20680.170915.2223.bin
curl: (60) SSL certificate problem: certificate is not yet valid
If you look at the log output from the upgrade command above, you’ll see a clue as to why curl thinks the certificate is not yet valid: the Jan 1 timestamps. The clock on the access point was sitting at the UNIX epoch, long before the certificate was issued. Why was the time not correct? A ‘ps w’ showed the NTP client running just fine, and the log did not show any NTP errors!
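To make the failure mode concrete, here is a minimal sketch of the validity-window comparison a TLS client performs. The function name and the certificate dates are made up for illustration; they are not the real dates of the dl.ubnt.com certificate.

```python
from datetime import datetime, timezone


def cert_time_status(now, not_before, not_after):
    """Classify a clock reading against a certificate's validity window."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical validity window, chosen only for illustration.
NOT_BEFORE = datetime(2017, 1, 1, tzinfo=timezone.utc)
NOT_AFTER = datetime(2019, 1, 1, tzinfo=timezone.utc)

# A device that never synced its clock: just past the UNIX epoch.
ap_clock = datetime(1970, 1, 1, 1, 1, 3, tzinfo=timezone.utc)

print(cert_time_status(ap_clock, NOT_BEFORE, NOT_AFTER))  # not yet valid
```

Any certificate issued in this century looks like it comes from the future to a device that believes it is 1970, so the download is refused before a single HTTP response is read.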
Running the NTP client manually finally revealed the culprit behind my missing firmware upgrades:
BZ.v3.8.6# ntpd -nd -p dk.pool.ntp.org
ntpd: resolved peer dk.pool.ntp.org to 220.127.116.11
ntpd: sent query to 18.104.22.168
ntpd: timed out waiting for 22.214.171.124, reach 0x00, next query in 1s
OK. So why in the world would ntpd ever time out waiting for a server? Two options: either the server is offline or I’m an idiot. How likely is a public NTP server to be offline? How likely am I to be an idiot?
I run a tight ship. I have strict security policies. The access points’ management network is not the same as the network for the wifi clients. The wifi clients run on a separate VLAN. The management network is locked down tighter than a security consultant’s asshole during audit season. The management network was only allowed to make HTTP(S) requests to public networks, so the NTP queries never made it out.
The solution is to allow NTP queries (UDP port 123) from the UniFi access points to reach the public internet.
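To check that NTP actually gets through, a bare-bones SNTP probe can be run from any machine on the same management network. This is only a sketch under a few assumptions: the machine has Python 3, the helper names are my own, and dk.pool.ntp.org is simply the pool used in the ntpd test above. NTP travels over UDP port 123, so a timeout here points at the firewall.

```python
import socket
import struct

NTP_PORT = 123
# Seconds between the NTP epoch (1900-01-01) and the UNIX epoch (1970-01-01).
NTP_TO_UNIX = 2208988800


def build_request():
    """48-byte SNTP request: LI=0, version=3, mode=3 (client) gives 0x1B."""
    return b"\x1b" + 47 * b"\0"


def parse_transmit_time(packet):
    """Return the server's transmit timestamp as UNIX seconds."""
    if len(packet) < 48:
        raise ValueError("reply too short to be an NTP packet")
    # The seconds part of the transmit timestamp sits at bytes 40..43.
    seconds = struct.unpack("!I", packet[40:44])[0]
    return seconds - NTP_TO_UNIX


def probe(server, timeout=3.0):
    """Send one query; return UNIX time, or None if nothing usable came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), (server, NTP_PORT))
            reply, _ = sock.recvfrom(512)
            return parse_transmit_time(reply)
        except (OSError, ValueError):
            # Timeouts and DNS failures are OSError subclasses.
            return None


if __name__ == "__main__":
    result = probe("dk.pool.ntp.org")
    if result is None:
        print("no NTP reply: UDP/123 is probably still blocked")
    else:
        print("NTP reply received, server time (UNIX seconds):", result)
```

If the probe prints a server time, the access points should be able to set their clocks, and the firmware download should stop tripping over the certificate dates.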
Face, meet hand.