
NAMD 2.8 with and without CUDA on Q6600 quad-core nodes with NVIDIA GTX460 cards

ApoA1

The ApoA1 benchmark was run as distributed by the NAMD developers. The tests were performed on identical machines, each with an Intel Q6600 CPU and an NVIDIA GTX460 card, connected by a gigabit interconnect. All measurements are in nanoseconds of simulated time per day.

Nodes/Cores/GPUs	With CUDA	Without CUDA	Times faster with CUDA

1 / 4 / 1	1.11	0.22	5.04

2 / 8 / 2	1.79	0.42	4.25

4 / 16 / 4	2.27	0.77	2.94
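The last column, and the way it shrinks as nodes are added, can be reproduced from the two throughput columns. The sketch below is only an illustration: the names `APOA1`, `speedup` and `parallel_efficiency` are made up for this example, and "parallel efficiency" is taken to mean throughput relative to ideal linear scaling from the single-node figure.

```python
# ApoA1 throughput in ns/day, copied from the table above.
# Keys are node counts; each node has 4 cores and 1 GPU.
APOA1 = {
    1: {"cuda": 1.11, "cpu": 0.22},
    2: {"cuda": 1.79, "cpu": 0.42},
    4: {"cuda": 2.27, "cpu": 0.77},
}


def speedup(row):
    """Ratio of CUDA throughput to CPU-only throughput for one table row."""
    return row["cuda"] / row["cpu"]


def parallel_efficiency(table, key, nodes):
    """Throughput on `nodes` nodes relative to ideal linear scaling from one node."""
    return table[nodes][key] / (nodes * table[1][key])


if __name__ == "__main__":
    for nodes, row in APOA1.items():
        print(
            f"{nodes} node(s): speedup {speedup(row):.2f}x, "
            f"efficiency with CUDA {parallel_efficiency(APOA1, 'cuda', nodes):.0%}, "
            f"without CUDA {parallel_efficiency(APOA1, 'cpu', nodes):.0%}"
        )
```

Run on these numbers, the CUDA runs lose efficiency much faster than the CPU-only runs as nodes are added, which is why the speedup column falls. Ratios recomputed from the rounded table entries may differ from the table's last column in the final digit.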

For comparison, the whole quad cluster (8 nodes, 32 cores, no CUDA) produces 1.21 nanoseconds per day, whereas a single i7+GTX295 machine produces 1.73 nanoseconds per day.

Peptide systems

For small systems (~5-6K atoms) you will have to stick to a single node. Because the number of atoms is so small, it makes no difference whether you work with the i7+GTX295 or with a Q6600+GTX460 node: your simulation will plateau at ~26 nanoseconds per day. That is approximately 30% faster than what you would get from two Q6600 nodes without GPUs, or 70% faster than a single i7 box without a GPU.
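The CPU-only throughputs implied by those percentages can be backed out with a line of arithmetic. This is a rough sketch, not measured data: it assumes "N% faster" means the GPU figure equals the baseline times (1 + N/100), and the helper name `baseline_from_gain` is made up for this example.

```python
def baseline_from_gain(rate_ns_per_day, percent_faster):
    """Return the throughput that `rate_ns_per_day` exceeds by `percent_faster` percent."""
    return rate_ns_per_day / (1.0 + percent_faster / 100.0)


GPU_PLATEAU = 26.0  # ns/day, the approximate plateau quoted in the text

# Implied (not measured) CPU-only throughputs, in ns/day.
two_q6600_nodes_cpu = baseline_from_gain(GPU_PLATEAU, 30)  # roughly 20
single_i7_cpu = baseline_from_gain(GPU_PLATEAU, 70)        # roughly 15.3

if __name__ == "__main__":
    print(f"two Q6600 nodes, no GPU: ~{two_q6600_nodes_cpu:.1f} ns/day")
    print(f"single i7, no GPU:       ~{single_i7_cpu:.1f} ns/day")
```

Since both the plateau and the percentages are approximate, the results should be read as ballpark values only.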

For a slightly larger, 12,000-atom system, the numbers (in nanoseconds per day) are:

Nodes/Cores/GPUs	With CUDA	Without CUDA	Times faster with CUDA

1 / 4 / 1	14.7	7.57	1.94

2 / 8 / 2	20.0	12.20	1.64

4 / 16 / 4		9.90

For a system with 26,000 atoms, using a set of 10-12-14 cutoffs and 2-1-2 steps, two nodes (with CUDA) give 8 ns/day.
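NAMD's own log reports benchmark timings in seconds per step, so it can help to convert between that and ns/day. The sketch below assumes a 2 fs integration timestep, which is one reading of the first "2" in "2-1-2 steps" and is not confirmed by the text; the function names are made up for this example.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def seconds_per_step(ns_per_day, timestep_fs):
    """Wall-clock seconds per MD step for a given throughput and timestep."""
    steps_per_day = ns_per_day * FS_PER_NS / timestep_fs
    return SECONDS_PER_DAY / steps_per_day


def ns_per_day(sec_per_step, timestep_fs):
    """Simulated nanoseconds per day for a given wall-clock cost per step."""
    steps_per_day = SECONDS_PER_DAY / sec_per_step
    return steps_per_day * timestep_fs / FS_PER_NS


if __name__ == "__main__":
    # ASSUMPTION: 2 fs timestep; the source only says "2-1-2 steps".
    cost = seconds_per_step(8.0, timestep_fs=2.0)
    print(f"8 ns/day at 2 fs/step is {cost:.4f} s/step")
```

With a different timestep the same ns/day figure corresponds to a proportionally different cost per step, so the timestep has to be known before comparing s/step values between runs.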