Hi,
I am trying to use idb to debug a simple MPI program (Fortran) that writes the hostname of each rank, to see how it works, but I have a problem.
First, I set IDB_HOME, IDB_PARALLEL_SHELL and MPIEXEC_DEBUG=1.
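A minimal sketch of that setup in a bash-like shell. Only MPIEXEC_DEBUG=1 is the actual setting; the two paths are placeholders for illustration and must be replaced with the values for the local installation:

```shell
# Placeholder values only: the real paths depend on the local installation.
export IDB_HOME="/path/to/idb"            # hypothetical idb installation directory
export IDB_PARALLEL_SHELL="/usr/bin/ssh"  # assumption: ssh is the remote shell idb should use
export MPIEXEC_DEBUG=1                    # the value stated above

# Print the three variables to confirm they are exported before launching mpiexec.hydra.
env | grep -E '^(IDB_HOME|IDB_PARALLEL_SHELL|MPIEXEC_DEBUG)=' | sort
```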
Then I try to start idb:
[cscppm59@imip15 MPI-INTEL]$ mpiexec.hydra -idb -f ./dbg.hosts -n 2 a.out
mpiexec: idb -pid 17765 -mpi2 -parallel mpiexec.hydra
Intel(R) Debugger for applications running on Intel(R) 64, Version 13.0, Build [80.483.23]
Attaching to program: /opt/intel/impi/4.1.3.048/intel64/bin/mpiexec.hydra, process 17765
[New Thread 17765 (LWP 17765)]
Reading symbols from /opt/intel/impi/4.1.3.048/intel64/bin/mpiexec.hydra...done.
__select_nocancel () in /lib64/libc-2.12.so
Continuing.
MPIR_Breakpoint () at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/pm/hydra/tools/debugger/debugger.c:24
No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/pm/hydra/tools/debugger/debugger.c.
(idb)
[0:1] Intel(R) Debugger for applications running on Intel(R) 64, Version 13.0, Build [80.483.23]
%1 [0:1] Attaching to program: /home/users/cscppm59/Prove/MPI-INTEL/a.out, process [17770;17771]
%2 [0:1] [New Thread [17770;17771] (LWP [17770;17771])]
[0:1] Reading symbols from /home/users/cscppm59/Prove/MPI-INTEL/a.out...done.
[0:1] MPIR_WaitForDebugger () at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/debugger/dbginit.c:270
[0:1] No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/debugger/dbginit.c.
(idb)
[0:1] error: cannot return to function main
(idb)
[0:1] Source file not found or not readable, tried...
[0:1] /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/debugger/dbginit.c
[0:1] ./dbginit.c
[0:1] /home/users/cscppm59/Prove/MPI-INTEL/dbginit.c
(idb)
Using the commands 'where' and 'up', I am able to set a breakpoint, and it works fine:
(idb) where
(idb)
%3 [0:1] #0 0x00007f05ab36234b in MPIR_WaitForDebugger () at /tmp/76630222304487307376132b211.25617324.3211.20140124e270dcbeedc.xtmpdir.nnlmpicl_e/mpie.nnlmpibld/dev/src/mpi/debugger/dbginit.c:
%4 [0:1] #1 0x00007f05ab3e1870 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0, provided=0x7fff8b16f25c) at /tmp/76630222304487307376132b211.25617324.3211.20140124e733dcbeedc.xtmpdir.nnlmpicl_e/mpie.nnlmpibld
%5 [0:1] #2 0x00007f05ab3ce290 in PMPI_Init (argc=0x0, argv=0x0) at /tmp/76630222304487307376132b211.25617324.3211.20140124e195dcbeedc.xtmpdir.nnlmpicl_e/mpie.nnlmpibld/dev/src/mpi/init/init.c:
%6 [0:1] #3 0x00007f05ab99531f in mpi_init_ () in /opt/intel/impi/4.1.3.048/intel64/lib/libmpigf.so..4.1
[0:1] #4 0x0000000000401140 in mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:9
(idb) up
(idb)
%7 [0:1] #1 0x00007f63613ad870 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0, provided=0x7fff16c203dc) at /tmp/76630222304487307376132b211.25617324.3211.20140124e733dcbeedc.xtmpdir.nnlmpicl_e/mpie.nnlmpibld
[0:1] No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/initthread.c.
(idb)
(idb)
%8 [0:1] #2 0x00007f05ab3ce290 in PMPI_Init (argc=0x0, argv=0x0) at /tmp/76630222304487307376132b211.25617324.3211.20140124e195dcbeedc.xtmpdir.nnlmpicl_e/mpie.nnlmpibld/dev/src/mpi/init/init.c:
[0:1] No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/init.c.
(idb)
(idb)
%9 [0:1] #3 0x00007f636196131f in mpi_init_ () in /opt/intel/impi/4.1.3.048/intel64/lib/libmpigf.so..4.1
(idb)
(idb)
[0:1] #4 0x0000000000401140 in mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:9
[0:1] 9 call MPI_INIT(ierr)
(idb) b 15
(idb)
[0:1] Breakpoint 1 at 0x401316: file /home/users/cscppm59/Prove/MPI-INTEL/mpi.f, line 15.
(idb) c
(idb)
[0:1] Continuing.
[0:1]
[0:1] Breakpoint 1, mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:15
[0:1] 15 aus=0.87
(idb) where
(idb)
[0:1] #0 0x0000000000401316 in mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:15
(idb) n
(idb)
[0:1] 17 do j=1,1
(idb) p aus
(idb)
[0:1] $1 = 0.870000005
(idb)
This is a simple program with no errors.
If I simply quit idb, the program exits normally:
(idb) q
imip15.ba.imip.cnr.it
imip15.ba.imip.cnr.it
[cscppm59@imip15 MPI-INTEL]$
If I continue instead, I get 'Program exited normally', as I do with gdb, but idb then starts printing the (idb) prompt to the screen endlessly, like this:
(idb) c
(idb)
[0:1] Continuing.
imip15.ba.imip.cnr.it
imip15.ba.imip.cnr.it
[0:1] Program exited normally.
(idb) (idb) (idb) [cscppm59@imip15 MPI-INTEL]$ (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb)
To stop this I must kill the process.
1) Is there a way to avoid this?
2) Is there a way to start an MPI parallel debug session in GUI mode?
Thanks
Pierpaolo