May 1st, 2009

Black-out again :-? At least I found a way to properly check xfs via Ubuntu 9.04 (which comes with the 3ware kernel module, xfs, and xfstools). xfs_repair did find (and repair) some problems, but none of the affected files appeared to be critical. Also, I changed my mind about what to do with power failures. The current scenario is: the power goes away, everything stays put for 2 min (just in case it was a short outage), the server issues a shutdown to the compute nodes, the server shuts down, and the UPS stays on so that the switch and hubs remain active for a long time (note that the compute nodes will almost certainly never power off completely, which means that they can be restarted through wake-on-lan). Now when the power comes back, there are two possibilities:

  1. The power was gone for so long that the server's UPS shut the server, switch, and hubs off. In that case, when the power is back, the server will power up again automatically.
  2. The server's UPS never really died, which means that when the power is back, the server will not power up automatically. In this case, all that is needed is a wake-on-lan signal to get it up again.
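
For reference, a wake-on-lan signal is just a "magic packet": 6 bytes of 0xFF followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. Below is a minimal Python sketch of that, not necessarily the tool used here. The MAC address is a made-up placeholder, and port 9 and the 255.255.255.255 broadcast address are common conventions rather than anything specific to this setup. It also assumes wake-on-lan is enabled on the target's network card and BIOS.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a wake-on-lan magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain exactly 12 hex digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the 6-byte MAC repeated 16 times (102 bytes total).
    return b"\xff" * 6 + mac_bytes * 16


def send_wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


if __name__ == "__main__":
    # Placeholder MAC; replace with the server's (or a compute node's) real one.
    send_wake_on_lan("00:11:22:33:44:55")
```

The packet has to be sent from a machine on the same network segment that is still powered, which is why keeping the switch and hubs alive on the UPS matters in the second case.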
