This has been annoying me for years, and is finally very much improved.
My home page would parse through several megabytes of ASCII sensor data to find the most recent (typically 5) days’ data and generate the 5 graphs – every time it was called. That guaranteed you were always looking at the latest data, but unfortunately that parsing/processing took on the order of 30 seconds, and it all had to happen before the page could be delivered and painted. There had to be a better way.
There is. Fortunately, all the parsing and graph creation are neatly contained in one php file. (Don’t say it.) Now the process on the Pogoplug here in the house that monitors the sensors and ftps the data up to the web host every 5 minutes also does a wget on that php file (which returns nothing) right after it pushes new data up. The home page no longer calls the graphing php, and now loads in seconds. The graphs could be up to 5 minutes (plus the 30 second php processing time) old, but they should always be the best available. (One could argue I should be kicking off the graphing script with an ssh instead of a wget, but that’s harder.)
Unless, of course, something fails. So just in case, the main page now checks the timestamps of all 5 graph files, and if any is older than 300 seconds, it prints a warning. I can always call the php script manually if I have to.
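The check itself is simple enough to sketch. The page does it in php, but the logic in shell form looks like this (the graph file names here are made-up stand-ins):

```shell
#!/bin/sh
# check_stale FILE...: print a warning for each file that is missing or
# older than MAX_AGE seconds -- the same freshness test the home page
# makes on the graph .pngs before displaying them.

MAX_AGE=300

check_stale () {
    now=$(date +%s)
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "WARNING: $f is missing"
            continue
        fi
        # GNU stat: %Y is the last-modification time in epoch seconds
        age=$((now - $(stat -c %Y "$f")))
        if [ "$age" -gt "$MAX_AGE" ]; then
            echo "WARNING: $f is $age seconds old (limit $MAX_AGE)"
        fi
    done
}

# Hypothetical graph file names:
check_stale temp.png rain.png pressure.png
```

If all the files are fresh it prints nothing, which is exactly what you want on the happy path.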
The bad news is that whenever I call the php graphing script directly, my browser (or wget) reports that the connection was reset by the other end. Maybe the fact that nothing is returned is the problem? I made up another php script that prints a trivial html page in addition to include()ing the graphing php, just like the home page did originally. That gets a reset as well. So I can’t even put a refresh button on the home page to call the graphing script. Oh well. At least the home page loads fast now, and the wget seems to work reliably, so I shouldn’t have to call the graphing script by hand.
Update 3/27/12: Did a little more poking around and have come to a couple of conclusions/observations:
– the web server times out an http request for a php page in about 15 seconds and resets the connection
– the php from such a request will continue running, though there seems to be a 300 sec max run time before the process is killed
I can’t get my old home page to work now. If I include the graphing php script (just like I used to), the connection is reset. I even went back and restored versions from a month ago, with the same results. It looks like the web server is timing out more aggressively than it used to.
Called GoDaddy support to ask if such a timeout had changed. Got a pleasant but not heavily technical guy who had to go ask others a couple of times. He said there had been no such change, but I don’t have a lot of confidence in that answer.
It’s not perfect, but it works, and I’ve spent enough time on it already.
Update 4/5/13: Wow. This round of updating the home page actually got me a working watchdog! When I first put this up a few years ago, I tried to set up a cron job to notify me if something was wrong – like the data file not getting updated. But since they didn’t provide access to crontab, I put a fair amount of effort into a continuously running process to do the job. Early ones failed, and I made self-respawning versions to circumvent the max runtimes, which they seemed to keep reducing just to stop me. I never wanted to duel with the sysadmins – I just wanted some kind of watchdog. Of course they won, and I gave up on my watchdog.
In this round of messing around, I noticed a cronjob manager in their web-based hosting control panel. I tried it, and when it didn’t work, called the help desk. They said they were doing maintenance on the servers so maybe that was why it didn’t work and I should give it a day or 2. Yeah, right. But sure enough, in a couple of days it started working! Lots of testing later, I learned:
– my crontab file is in my root directory – one above html
– editing that file takes effect immediately and gives me complete control over cron jobs
– initial directory is my root – one above html
– */15 syntax in, say, the minutes field fires every 15 minutes – I didn’t know that
– both stdout and stderr are emailed to the user on the MAILTO= line – I didn’t know about stdout
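Put together, those findings are enough to write the crontab by hand. A sketch (the address and script name are hypothetical) might look like:

```
MAILTO=me@example.com
# working directory is my root, one above html
# min   hour dom mon dow  command
*/15    *    *   *   *    perl watchdog.pl
```

Anything the job prints – stdout or stderr – lands in my inbox, so printing is the alert.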
Those tidbits plus a little hacking of my old perl script gave me precisely the watchdog I’ve wanted for years. And by having the watchdog touch a file I was able to put a check on that file’s timestamp in the home page so it can flag when the watchdog fails. Perfect!
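The real watchdog is the old perl script, but its shape is roughly this shell sketch (file names and the 900-second threshold are my stand-ins):

```shell
#!/bin/sh
# Watchdog sketch: run from cron, anything printed to stdout gets
# mailed to the MAILTO address, so printing *is* the alert.

DATA_FILE=${DATA_FILE:-html/sensordata.txt}   # file the Pogoplug ftps up (hypothetical name)
HEARTBEAT=${HEARTBEAT:-watchdog.alive}        # home page checks this file's mtime
MAX_AGE=900                                   # three missed 5-minute updates

watchdog () {
    if [ ! -e "$DATA_FILE" ] ||
       [ $(( $(date +%s) - $(stat -c %Y "$DATA_FILE") )) -gt "$MAX_AGE" ]; then
        echo "sensor data stale: $DATA_FILE not updated in ${MAX_AGE}s"
    fi
    # Prove the watchdog itself ran, so the page can flag a dead watchdog.
    touch "$HEARTBEAT"
}

watchdog
```

Touching the heartbeat file last is what closes the loop: the page watches the watchdog, and the watchdog watches the data.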
I wonder if that crontab tool was there in the control panel – where it would never occur to me to look for it – from the beginning.
Update 4/8/13: This doesn’t really belong here, but there’s no place else for it at the moment, so here goes… As a backup/check on my rain gauge, there’s a nice professionally maintained one by the USGS on Salt Creek in Elmhurst maybe 2 miles away. That provides incremental rainfall in 1/100″ increments every 5 minutes – exactly as mine does. While I put a link to its most recent graph right above my rain graph, sucking the data down and plotting it on the same graph as mine would be very nice.
There seem to be 2 servers with essentially the same info (as far as what I want), though their request syntaxes differ. After some trial and error, here are some command lines that get the data and their results:
wget -O - 'http://waterdata.usgs.gov/il/nwis/uv?cb_00045=on&format=rdb&period=5&site_no=05531300' 2>/dev/null >> raindata

    USGS 05531300 2013-04-08 15:35 CST 0.00 P
    USGS 05531300 2013-04-08 15:40 CST 0.00 P
    USGS 05531300 2013-04-08 15:45 CST 0.00 P

wget -O - "http://waterservices.usgs.gov/nwis/iv/?format=rdb,1.0&sites=USGS:05531300&period=P1D&parameterCd=00045" 2>/dev/null | grep "^USGS" | cut --output-delimiter=, -f3,5 >> raindata2

    2013-04-08 15:35,0.00
    2013-04-08 15:40,0.00
    2013-04-08 15:45,0.00
Running those from my shiny new cron capability, I see that (today) they both seem to make new data available in 20 minute chunks, posted just about 20 minutes in arrears. The chunks are XX:10-25, XX:30-45, XX:50-XX+1:05, and are posted at about XX:45, XX:05, XX:25 respectively. Setting my cron job for “6-59/20 * * * * <cmd>” got each new post today. Making it “7-59/20” might be a little safer.
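As a crontab entry, that 20-minute cadence with the 6-minute offset looks like this (with a hypothetical wrapper script standing in for the full wget command above):

```
# min     hour dom mon dow  command
6-59/20   *    *   *   *    sh fetch_usgs.sh
```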
The way the graphing stuff works, I think I could put this data in a separate file, extend the current .php to parse it into a separate array, and then plot that data on the same plot area as my rain gauge data. More work ahead. (Spoiler, 6/26/19: I finally got USGS rain data displayed alongside my rain collector’s data. Long stories, but it seems to work!)
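The parsing step is straightforward. Here’s a shell/awk sketch of reading raindata2 – the real parse would live in the graphing php and build an array rather than a total, but the field-splitting is the same:

```shell
#!/bin/sh
# parse_rain FILE: raindata2 holds "YYYY-MM-DD HH:MM,inches" lines,
# one per 5-minute interval. Split on the comma and total the rainfall.

parse_rain () {
    awk -F, '
        NF == 2 { total += $2; n++ }           # $1 = timestamp, $2 = inches
        END     { printf "%d intervals, %.2f in total\n", n, total }
    ' "$1"
}
```

Example: `parse_rain raindata2` over the three sample lines above reports 3 intervals and their 0.00-inch total.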
Update 6/1/13: After living with this for a couple of months, I can say it’s great. No more slow page loading!
I put a trap in for stale graphs, and it has fired several times. I think every firing has been because the ISP ping stats graph was stale – and that happens when somebody hides that graph on the monitor page, so its .png doesn’t get updated. It took a couple of tries, but I think I finally got rid of any way for anybody to turn that graph off, and since then I don’t think the stale graph warning has come up.
Anyway, it’s pretty good. I still want to do a round robin database or something, but this is a big improvement.