Mar 27th, 2009

… and suddenly four nodes stopped accepting ssh connections, with each attempted connection resulting in yet another process accumulating (which is exactly what happened in the past with n0008). Tried everything I could think of, got nowhere. Finally gave up and risked an /etc/init.d/nfs restart, and voila: all accumulated jobs disappeared and the nodes are back to normal. But are they? What does NFS do when it is restarted with several open file descriptors on it? We will have to wait and see whether anything strange appears in the trajectories during their analysis.

