The router resetter seems to have caught an outage this morning, power cycled the router and modem, and gotten things working again – all without my even knowing there had been a problem – just like it’s supposed to!
The ping stats program did detect some outages yesterday, but due to a dumb coding error didn’t actually invoke the resetter and I had to do it by hand. I finally resorted to putting ‘perl -w’ on the first line of the scripts (to turn on warnings) to find it. Guess I should use that more often 🙂
Anyway, the hardware’s in place and working (OK, it should have a housing, but so should all the other PIC nodes), resets are posted on the web site, and it actually does what it’s supposed to do!
Update 5/16/11: The resetter seems to be working exactly as designed. Unfortunately, it’s been resetting multiple times a day recently. So I just bought a $25 TP-Link WR-740N N/G/B wireless router from Microcenter and brought it up with (almost) all the same config as the old one. Seems to work OK, though it doesn’t support port translation on virtual servers, so my ssh and http access to the Pogoplug no longer work. Anyway, let’s see how often this one needs to be reset!
Update 5/17/11: While the new router hasn’t hung and required a power cycle, things weren’t quite as stable as I thought. If it had required a power cycle, it would never have come back up on its own! The only way to get it working after a power cycle was to go into the router’s web interface and reboot it (again). I resolved that (without fully understanding why) by changing the DSL modem setup so it offered a 192.168.0.x DHCP address to the router rather than passing along the 68.74.64.x address the modem got from the ISP. (All the router resets the web site shows today are from working on it.)
Unfortunately, this may keep me from having access to the Pogoplug from the outside: As far as I can tell, with this setup the ‘virtual server’ mechanism in the router just doesn’t work. Even though I changed the http and ssh listen ports to the same ones exposed on the Internet to avoid the lack of port mapping, I don’t seem to be able to connect. More work is needed.
But the new router does seem to stay up much better than the old one.
Update 2/12/14: The router resetter has been just chugging along doing its job for a couple of years now. Once in a while (maybe once/monthish?) it will report a reset or sometimes a couple. Otherwise, it just sits there silently watching. The problem it was really put in place for – walking up to the computer only to find there’s no Internet access – just doesn’t happen any more. Great!
After reading some dumb warning about wifi security, I decided to try to “improve” things that weren’t obviously broken by going from WEP to WPA2-PSK (one router could only do WPA-PSK) and using better passwords. I’d gone with WEP because some device at the time didn’t support WPA, but everything I have now does support it, so it should work.
I changed both routers, and (I think!) every wifi device I have. But the system was flaky, and even the main PC reported the Internet connection bouncing. Checking the router resetter (first from the section at the bottom of the home page, then drilling down to logs on the Pogo), I found it was resetting every ~5 minutes. I figured there was some unfortunate storm of devices making noise before everything got settled down, so I unplugged the cable from the PIC resetter node to the outlet box with the SSR.
Troubleshooting was challenging in part because the !@#@$% GoogleDrive Sync had 135,000 file descriptors open and I couldn’t even get a browser refresh on the router admin page. I could ping by name (like google.com) from a command prompt, but couldn’t do a wget, with some error about bad file descriptor. Killing GoogleDrive Sync fixed all that. Bad Google.
But the ping stats (which drive the router resetter) still weren’t happy. (And it took a while to remind myself that there’s a pingstat.pl script that determines router health, separate from the main 485pollB.pl script.) Looking through that script for the “Router is confused” message I’d found in pingstatstdout, I saw it: part of the router health check was verifying that it could read (though not parse) the router admin page. To fetch that page, the admin username and password were hardcoded into a wget command. And I’d changed the credentials! So every check failed, the router was declared confused, and the resetter power cycled it. I changed the hardcoded creds to the new ones, plugged all the cables back in, and it seems to be stable again (after a total of 34 recorded router resets!).
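One way to keep a credential change from looking like a dead router is to have the health check tell "the router rejected my login" apart from "the router didn't answer at all". What follows is a minimal Python sketch of that idea, not the actual pingstat.pl logic. The function names, the verdict strings, and the use of HTTP Basic auth are all assumptions made for illustration; the real router may authenticate differently.

```python
import base64
import urllib.error
import urllib.request


def classify_admin_fetch(status, error=None):
    """Map the outcome of fetching the admin page to a health verdict.

    The verdict names are hypothetical, chosen for this sketch.
    """
    if error is not None:
        return "unreachable"      # no HTTP response at all
    if status in (401, 403):
        return "bad_credentials"  # router answered but rejected the login
    if 200 <= status < 300:
        return "healthy"          # page could be read
    return "confused"             # answered, but with something unexpected


def should_power_cycle(verdict):
    """Only reset when the router itself looks sick, not when our login is stale."""
    return verdict in ("unreachable", "confused")


def check_router(url, username, password, timeout=5):
    """Fetch the admin page once and classify the result.

    Assumes HTTP Basic auth; credentials are passed in rather than hardcoded.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()  # read the page, but don't parse it
            return classify_admin_fetch(response.status)
    except urllib.error.HTTPError as exc:
        # The router responded with an HTTP error status such as 401.
        return classify_admin_fetch(exc.code)
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, timeout, DNS failure, and so on.
        return classify_admin_fetch(None, error=exc)
```

With this split, a stale password would produce a "bad_credentials" verdict that could be logged or posted to the web site instead of triggering a power cycle every few minutes.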
All in all, it was a great example of how “if it ain’t broke…” came about. But at least now I’ve got better encryption and better credentials on my wifi exposure. (Though I couldn’t bring myself to implement what would have helped most: MAC filtering. The thought of having to go play admin if a visitor wanted to connect a laptop or phone made that just too disruptive.)