A new box arrived with an 8-core AMD FX-8150, an NVIDIA GTX 570 and 4 GB of memory at 1333 MHz. After a few problems with perceus (it rejected the hardware address, which starts with 8C:…), everything else appears to have worked smoothly. Started testing it with NAMD and comparing it with the i7-extreme-GTX295 machine (all measurements are in nanoseconds per day).
Multicore version (Non-CUDA) v.2.8
The multicore NAMD version used for the i7 and Q6600 machines was somewhat older than the one used for the AMD machine. To be on the safe side, subtract ~10-20% from the speedups quoted below.
System | AMD 8-core (ns/day) | i7, eight threads (ns/day) | AMD faster than i7 | Q6600-based quad (ns/day) | AMD faster than Q6600
---|---|---|---|---|---
100K atoms (ApoA1) | 0.40 | 0.33 | 23% | 0.19 | 116%
60K atoms | 2.18 | 1.61 | 35% | 0.95 | 130%
25K atoms | 6.25 | 4.35 | 43% | 2.53 | 147%
6.5K atoms | 24.0 | 15.4 | 55% | 9.8 | 145%
1.6K atoms | 108 | 79.4 | 36% | 48 | 125%
0.9K atoms | 192 | 145 | 32% | 82 | 134%
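
For anyone wanting to recheck the "percent faster" columns, here is a minimal sketch of the arithmetic, assuming the figure is simply the ratio of the two ns/day values minus one. The function name `percent_faster` is made up for illustration. Because the tabulated ns/day values are rounded, recomputed percentages can differ from the quoted ones by a point or two (for example, ApoA1 vs. the i7 recomputes to about 21% rather than 23%).

```python
def percent_faster(new_ns_per_day, old_ns_per_day):
    """Return how much faster `new` is than `old`, in percent."""
    return (new_ns_per_day / old_ns_per_day - 1.0) * 100.0


# (system, AMD 8-core, i7 eight threads, Q6600 quad), all in ns/day,
# copied from the multicore table above.
rows = [
    ("100K (ApoA1)", 0.40, 0.33, 0.19),
    ("60K", 2.18, 1.61, 0.95),
    ("25K", 6.25, 4.35, 2.53),
    ("6.5K", 24.0, 15.4, 9.8),
    ("1.6K", 108, 79.4, 48),
    ("0.9K", 192, 145, 82),
]

for name, amd, i7, q6600 in rows:
    print(
        f"{name:>13}: {percent_faster(amd, i7):4.0f}% vs i7, "
        f"{percent_faster(amd, q6600):4.0f}% vs Q6600"
    )
```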
With CUDA
For the AMD-GTX570 machine and for large problems (~100K atoms), adding a line in the spirit of +devices 0,0,0,0,… significantly improved performance. This was not the case with the i7-GTX295 hardware.
System | AMD 8-core FX-8150 + GTX 570 (ns/day) | i7 extreme + GTX295, four threads (ns/day)
---|---|---
100K atoms (ApoA1) | 2.33 (with '+devices') | 1.92
60K atoms | 5.98 (with '+devices') | 6.28
25K atoms | 13.7 | 12.2
8.1K atoms | 37 | 25
1.6K atoms | 172 | 166
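
To see roughly how much the GPU adds on the FX-8150 box, the CUDA and multicore rates can be divided for the system sizes that appear in both tables. This is only a sketch: it assumes that rows with the same atom count refer to the same test system in both tables, and the helper name `cuda_gain` is invented for illustration.

```python
# ns/day on the FX-8150 machine, copied from the two tables above.
# Assumption: equal atom counts mean the same test system in both tables.
multicore = {"100K": 0.40, "60K": 2.18, "25K": 6.25, "1.6K": 108}
cuda = {"100K": 2.33, "60K": 5.98, "25K": 13.7, "1.6K": 172}


def cuda_gain(size):
    """Ratio of the CUDA rate to the multicore (non-CUDA) rate for one size."""
    return cuda[size] / multicore[size]


for size in multicore:
    print(f"{size:>5} atoms: CUDA runs at {cuda_gain(size):.1f}x the multicore rate")
```

Under that assumption, the gain from the GPU shrinks steadily as the systems get smaller, from nearly 6x for ApoA1 to well under 2x for the 1.6K-atom system.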
Other comparisons
5.5K atoms → 56 ns/day on FX-8150+GTX570 (CUDA) vs. 32 ns/day on Q6600+GTX460 (CUDA)
4.5K atoms → 68 ns/day on FX-8150+GTX570 (CUDA) vs. 25 ns/day on two Q6600 (8 cores, no CUDA) vs. 19 ns/day on one Q6600 (4 cores, no CUDA)