I am currently running into parallelization problems when performing G0W0 calculations. The last part of the log file is as follows:
Code:
<24s> P84-n37: [PARALLEL Response_G_space_and_IO for K(bz) on 1 CPU] Loaded/Total (Percentual):576/576(100%)
<24s> P84-n37: [PARALLEL Response_G_space_and_IO for Q(ibz) on 1 CPU] Loaded/Total (Percentual):576/576(100%)
<24s> P84-n37: [PARALLEL Response_G_space_and_IO for G-vectors on 1 CPU]
<24s> P84-n37: [PARALLEL Response_G_space_and_IO for K-q(ibz) on 1 CPU] Loaded/Total (Percentual):576/576(100%)
<25s> P84-n37: [LA@Response_G_space_and_IO] PARALLEL linear algebra uses a 6x6 SLK grid (36 cpu)
<25s> P84-n37: [PARALLEL Response_G_space_and_IO for K(ibz) on 1 CPU] Loaded/Total (Percentual):576/576(100%)
<25s> P84-n37: [PARALLEL Response_G_space_and_IO for CON bands on 72 CPU] Loaded/Total (Percentual):5/328(2%)
<25s> P84-n37: [PARALLEL Response_G_space_and_IO for VAL bands on 3 CPU] Loaded/Total (Percentual):24/72(33%)
<25s> P84-n37: [PARALLEL distribution for RL vectors(X) on 1 CPU] Loaded/Total (Percentual):540225/540225(100%)
<33s> P84-n37: [MEMORY] Alloc WF%c( 8.984925 [Gb]) TOTAL: 11.47166 [Gb] (traced) 55.00800 [Mb] (memstat)
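As a rough check on the last log line, here is a minimal Python sketch (not part of Yambo; the helper names are made up for illustration) that extracts the traced total and scales it to one node. It assumes, without confirmation from the log, that each of the 36 tasks on a node allocates about the same amount as task P84, so the result is only an upper-level estimate to compare against the node's physical memory.

```python
import re

# Last line of the log above, copied verbatim.
MEMORY_LINE = (
    "<33s> P84-n37: [MEMORY] Alloc WF%c( 8.984925 [Gb]) "
    "TOTAL: 11.47166 [Gb] (traced) 55.00800 [Mb] (memstat)"
)


def traced_total_gb(line: str) -> float:
    """Return the 'TOTAL: ... [Gb] (traced)' value from a [MEMORY] log line."""
    match = re.search(r"TOTAL:\s*([0-9.]+)\s*\[Gb\]\s*\(traced\)", line)
    if match is None:
        raise ValueError("no traced total found in line")
    return float(match.group(1))


def estimate_node_memory_gb(line: str, tasks_per_node: int) -> float:
    """Assumption: every MPI task on the node allocates the same traced total."""
    return traced_total_gb(line) * tasks_per_node


per_task = traced_total_gb(MEMORY_LINE)
# 36 comes from '#SBATCH --tasks-per-node=36' in the job script.
per_node = estimate_node_memory_gb(MEMORY_LINE, tasks_per_node=36)
print(f"per task: {per_task:.2f} Gb, per node (36 tasks): {per_node:.1f} Gb")
```

If the estimate exceeds what a node actually has, fewer tasks per node (with more threads each) would be one thing to try.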
I tried modifying the job script and the parallel settings in the input file, as follows:
Code:
#SBATCH --cpus-per-task=1
#SBATCH --tasks-per-node=36
Code:
DIP_CPU= "1 72 3" # [PARALLEL] CPUs for each role
DIP_ROLEs= "k c v" # [PARALLEL] CPUs roles (k,c,v)
DIP_Threads= 0 # [OPENMP/X] Number of threads for dipoles
X_and_IO_CPU= "1 1 1 72 3" # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v" # [PARALLEL] CPUs roles (q,g,k,c,v)
X_and_IO_nCPU_LinAlg_INV= 216 # [PARALLEL] CPUs for Linear Algebra
X_Threads= 0 # [OPENMP/X] Number of threads for response functions
SE_CPU= " 1 216 1" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
SE_Threads= 0
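
For reference, a small Python sketch (not part of Yambo; `check_layout` and `SETTINGS` are hypothetical names) that checks the settings above for internal consistency: each `*_CPU` string should have one entry per role, and, as I understand the Yambo convention, the product of the entries should equal the number of MPI tasks the job launches. The node count is not shown in the job script excerpt, so the total task count is left as an assumption.

```python
from math import prod

# Values copied from the input above: (CPU string, role string) per runlevel.
SETTINGS = {
    "DIP": ("1 72 3", "k c v"),
    "X_and_IO": ("1 1 1 72 3", "q g k c v"),
    "SE": (" 1 216 1", "q qp b"),
}


def check_layout(cpus: str, roles: str) -> int:
    """Return the product of the per-role CPU counts.

    Raises ValueError if the number of counts and the number of roles differ.
    """
    counts = [int(c) for c in cpus.split()]
    names = roles.split()
    if len(counts) != len(names):
        raise ValueError(f"{len(counts)} counts for {len(names)} roles")
    return prod(counts)


products = {name: check_layout(c, r) for name, (c, r) in SETTINGS.items()}
print(products)
# All three products agree (216); this has to be compared by hand with
# nodes x tasks-per-node of the actual job, which is not shown above.
```

Note that `X_and_IO_nCPU_LinAlg_INV= 216` is a separate setting and is not covered by this check; the log reports a 6x6 SLK grid (36 cpu) for the linear algebra.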
Sincerely,
Jingda Guo