
May 21st, 2011

It seems that this temperature-monitoring script does work:

May 21 20:02:11 norma logger: [temp] Alert, current average core temperature is 59 deg C
May 21 20:02:11 norma logger: [temp] Temperature alert, taking everything down now ...
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0001
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0002
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0003
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0004
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0005
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0006
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0007
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0008
May 21 20:02:11 norma logger: [temp] Issued shutdown to n0009
May 21 20:02:11 norma logger: [temp] All nodes taken down. Now the server ...
May 21 20:02:11 norma shutdown[19421]: shutting down for system halt
May 21 20:02:14 norma init: Switching to runlevel: 0
May 21 20:02:17 norma wulfd[3479]: Terminating with exit code 254
May 21 20:02:24 norma xinetd[2834]: Exiting...
May 21 20:03:18 norma kernel: Kernel logging (proc) stopped.
May 21 20:03:18 norma kernel: Kernel log daemon terminating.
May 21 20:03:19 norma exiting on signal 15

and then …


May 21 23:07:58 norma syslogd 1.4.1: restart.
May 21 23:07:58 norma kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 21 23:07:58 norma kernel: Inspecting /boot/System.map-2.6.26.5-2.nsa1
May 21 23:07:59 norma kernel: Loaded 25661 symbols from /boot/System.map-2.6.26.5-2.nsa1.
May 21 23:07:59 norma kernel: Symbols match kernel version 2.6.26.

.....

May 21 23:13:21 norma logger: [WAKE-ME-UP] First reading is 32 deg C. Going to sleep now ...

.....


May 21 23:23:22 norma logger: [WAKE-ME-UP] Second reading is 31 deg C.
May 21 23:23:22 norma logger: [WAKE-ME-UP] Not cold enough ? Better do nothing. Bye.

.....
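The WAKE-ME-UP check logged above can be sketched roughly as follows. This is only a guess at the logic, not the actual script: the function name `should_wake`, the `read_temp_c` callable and the `max_safe_c` threshold are all hypothetical, and the 600-second wait is inferred from the ten minutes between the two log entries.

```python
import time


def should_wake(read_temp_c, max_safe_c, wait_s=600, sleep=time.sleep):
    """Take two readings wait_s seconds apart; wake only if both are cool enough.

    read_temp_c -- hypothetical callable returning the room temperature in deg C
    max_safe_c  -- hypothetical threshold; the real script's value is not known
    """
    first = read_temp_c()
    sleep(wait_s)  # "Going to sleep now ..."
    second = read_temp_c()
    # "Not cold enough ? Better do nothing." corresponds to returning False.
    return first <= max_safe_c and second <= max_safe_c


if __name__ == "__main__":
    readings = iter([32, 31])  # the two values from the log above
    decision = should_wake(lambda: next(readings), max_safe_c=25,
                           sleep=lambda s: None)  # skip the real wait
    print("bring nodes up" if decision else "do nothing")
```

Requiring both readings to pass, rather than only the second, is a conservative choice assumed here; the real script may compare them differently.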

Sounds like a broken A/C unit, but it is too late on Saturday to check. Given that the temperature is going up, I am taking the head node down (remotely) as well … (and in case I don't see you again, good-morning, good-evening and good-night).


Sunday's update: it was a false alarm, in that the A/C did not fail. The temperatures on the second attempt were high because the nodes had failed to shut down properly. I modified the shutdown script and brought everything back up.
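For reference, the alert path seen in Saturday's log (average the core temperatures, then take down the compute nodes first and the server last) might look something like the sketch below. It only reproduces the ordering visible in the log; the function name `plan_shutdown` and the `limit_c` threshold are assumptions, and the real script's trigger value and shutdown mechanism are not shown here.

```python
from statistics import mean


def plan_shutdown(core_temps_c, limit_c, nodes, head="norma"):
    """Return the hosts to shut down, in order, or [] if no alert is needed.

    core_temps_c -- per-core temperature readings in deg C
    limit_c      -- hypothetical alert threshold (the real value is not known)
    nodes        -- compute node names, e.g. n0001 ... n0009
    head         -- the server, taken down last as in the log
    """
    average = mean(core_temps_c)
    if average < limit_c:
        return []
    # Nodes go first; the server follows once all nodes have been told to stop.
    return list(nodes) + [head]


if __name__ == "__main__":
    nodes = [f"n{i:04d}" for i in range(1, 10)]  # n0001 ... n0009
    for host in plan_shutdown([58, 60], limit_c=55, nodes=nodes):
        print(f"[temp] Issued shutdown to {host}")
```

Note that issuing a shutdown is not the same as confirming that a node actually went down, which is what went wrong on Saturday.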

maintenance/may_21st_2011.txt · Last modified: 2011/05/22 12:30 (external edit)