The solute is a pentapeptide. The system comprises 1632 atoms, and the simulation uses full electrostatics (PME) on a 32x32x27 grid (see the included script below).
NAMD script used for these tests:
#
# Input files
#
structure ionized.psf
coordinates heat_out.coor
velocities heat_out.vel
extendedSystem heat_out.xsc
parameters par_all27_prot_na.inp
paraTypeCharmm on
#
# Output files & writing frequency for DCD
# and restart files
#
outputname output/equi_out
binaryoutput off
restartname output/restart
restartfreq 1000
binaryrestart yes
dcdFile output/equi_out.dcd
dcdFreq 200
DCDunitcell on
#
# Frequencies for logs and the xst file
#
outputEnergies 20
outputTiming 200
xstFreq 200
#
# Timestep & friends
#
timestep 2.0
stepsPerCycle 8
nonBondedFreq 2
fullElectFrequency 4
#
# Simulation space partitioning
#
switching on
switchDist 8
cutoff 9
pairlistdist 11
#
# Basic dynamics
#
COMmotion no
dielectric 1.0
exclude scaled1-4
1-4scaling 1.0
rigidbonds all
#
# Particle Mesh Ewald parameters.
#
Pme on
PmeGridsizeX 32 # <===== CHANGE ME
PmeGridsizeY 32 # <===== CHANGE ME
PmeGridsizeZ 27 # <===== CHANGE ME
#
# Periodic boundary things
#
wrapWater on
wrapNearest on
wrapAll on
#
# Langevin dynamics parameters
#
langevin on
langevinDamping 1
langevinTemp 320 # <===== Check me
langevinHydrogen on
langevinPiston on
langevinPistonTarget 1.01325
langevinPistonPeriod 200
langevinPistonDecay 100
langevinPistonTemp 320 # <===== Check me
useGroupPressure yes
firsttimestep 9600 # <===== CHANGE ME
run 100000000 ;# <===== CHANGE ME
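The PmeGridsize values marked CHANGE ME in the script depend on the periodic cell of the system being simulated. The following is a minimal Python sketch of one way to choose them, assuming the usual rule of thumb of a grid spacing of about 1 Angstrom or finer and grid dimensions whose only prime factors are 2, 3 and 5. The function names and the cell dimensions in the example are hypothetical and are not taken from the tests described here.

```python
import math


def is_smooth(n):
    """Return True if n has no prime factors other than 2, 3 and 5."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def next_smooth(n):
    """Smallest integer >= n whose only prime factors are 2, 3 and 5."""
    while not is_smooth(n):
        n += 1
    return n


def pme_grid(cell_dims, max_spacing=1.0):
    """Suggest PME grid sizes for a cell given in Angstrom.

    max_spacing is the assumed upper limit on the grid spacing
    (1.0 Angstrom here, a rule of thumb and not a hard requirement).
    """
    return tuple(next_smooth(math.ceil(d / max_spacing)) for d in cell_dims)


# Hypothetical cell dimensions, for illustration only.
example_cell = (30.5, 31.2, 26.4)
for axis, size in zip("XYZ", pme_grid(example_cell)):
    print(f"PmeGridsize{axis} {size}")
```

The actual cell dimensions should be read from the extended system (.xsc) file of the run in question before the suggested values are pasted into the script.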
The performance table goes as follows:
Configuration          | Days per nsec | nsec per day |
1 core                 | 0.105         | 9.5          |
2 cores - one node     | 0.102         | 9.8          |
2 cores - two nodes    | 0.075         | 13.3         |
4 cores - one node     | 0.073         | 13.7         |
4 cores - two nodes    | 0.066         | 15.1         |
4 cores - four nodes   | 0.064         | 15.6         |
8 cores - two nodes    | 0.061         | 16.4         |
Combinations such as 8 cores on four nodes, or anything with more than 8 cores, do not scale.
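To make the scaling easier to judge, the table can be converted into speedup and parallel efficiency relative to the single-core run. The short Python sketch below does this with the figures copied from the table; the wall time per step additionally assumes the 2 fs timestep from the script, and the function and variable names are purely illustrative.

```python
# (configuration, cores, days per nsec), copied from the performance table
RESULTS = [
    ("1 core", 1, 0.105),
    ("2 cores - one node", 2, 0.102),
    ("2 cores - two nodes", 2, 0.075),
    ("4 cores - one node", 4, 0.073),
    ("4 cores - two nodes", 4, 0.066),
    ("4 cores - four nodes", 4, 0.064),
    ("8 cores - two nodes", 8, 0.061),
]

TIMESTEP_FS = 2.0  # "timestep 2.0" in the NAMD script


def analyse(results, timestep_fs=TIMESTEP_FS):
    """Return (label, cores, speedup, efficiency, seconds per step) rows."""
    baseline = results[0][2]            # days per nsec on one core
    steps_per_ns = 1.0e6 / timestep_fs  # 1 ns = 1e6 fs
    rows = []
    for label, cores, days_per_ns in results:
        speedup = baseline / days_per_ns
        efficiency = speedup / cores
        sec_per_step = days_per_ns * 86400.0 / steps_per_ns
        rows.append((label, cores, speedup, efficiency, sec_per_step))
    return rows


for label, cores, speedup, eff, sec in analyse(RESULTS):
    print(f"{label:22s} speedup {speedup:4.2f}  "
          f"efficiency {eff:5.1%}  {sec * 1000:5.1f} ms/step")
```

With these numbers the best run (8 cores on two nodes) is only about 1.7 times faster than a single core, which corresponds to a parallel efficiency of roughly 22%.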
Given (i) the huge difference between CPU time and wall-clock time and (ii) the fact that the messages exchanged must be very small, these numbers can probably be improved further by adjusting the partitioning parameters (or by trying options like '+strategy USE_HYPERCUBE').