Problem with local MPI

pascal14
Posts: 32
Joined: Thu 02 Jan 2014, 23:22

Problem with local MPI

Post by pascal14 » Tue 26 Jan 2016, 17:01

Hello Thorsten,
I am trying to run openEMS with local MPI on an eight-socket server (Opteron 6176: 48 cores, 8 x 6).
I started from the tutorial example "Horn_antenna.m".
It seems that the following command has no effect, because the calculation time is unchanged with or without MPI:
FDTD = SetupMPI(FDTD, 'SplitN_X', 2, 'SplitN_Y', 2, 'SplitN_Z', 2);

"FDTD simulation size: 204x204x203" (Always, for each MPI process, is the whole volume still unsplit ?? )

See the attached files.
Thank you,
Attachments
openEMS.log
(32.23 KiB) Downloaded 532 times
Horn_Antenna_2.m
(6.77 KiB) Downloaded 531 times

thorsten
Posts: 1468
Joined: Mon 27 Jun 2011, 12:26

Re: Problem with local MPI

Post by thorsten » Tue 26 Jan 2016, 18:42

Hi,

you are certainly not using MPI; otherwise there would be no doubt, since openEMS would say so prominently.

The problem is that you cannot just use it: MPI support has to be compiled into openEMS, and it is not by default.
To make things worse, I haven't looked into it for quite some years, as nobody seems to use it.
It is thus currently not even possible to activate it with the new cmake build system and the recent changes.
But I will use this opportunity to look into it. It may take some time, though, as I don't know how hard it will get ;)
I at least hope you use Linux, because it certainly is not (and never was) available on Windows...
Furthermore, be aware that using MPI is only beneficial if your simulation is reasonably large in the first place.
Usually you get away with much smaller mesh sizes, which is another reason why MPI is hardly ever used or necessary.

regards
Thorsten

thorsten
Posts: 1468
Joined: Mon 27 Jun 2011, 12:26

Re: Problem with local MPI

Post by thorsten » Tue 26 Jan 2016, 18:44

Just for fun, I had a look at your log.
You were definitely running 3 (non-MPI) openEMS instances in parallel, each doing the same work :D
The only result of that is excess waste heat, nothing more ;)

regards
Thorsten

pascal14
Posts: 32
Joined: Thu 02 Jan 2014, 23:22

Re: Problem with local MPI

Post by pascal14 » Tue 26 Jan 2016, 19:21

Yes, I work with Linux (CentOS 7) and I use openEMS v0.33 with Open MPI (or MPICH2).
I don't know how to recompile the openEMS sources with Open MPI...

The modifications in the Matlab file are:
1) FDTD = SetupMPI(FDTD,'SplitN_X',2,'SplitN_Y',2,'SplitN_Z',2);
2) opts='--engine=multithreaded --numThreads=6'
3) Settings.MPI.Binary='/home/cousin/openEMS/bin/openEMS.sh';
4) Settings.MPI.NrProc=8;
5) RunOpenEMS_MPI(Sim_Path, Sim_CSX, opts, Settings)
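Put together, the relevant part of the script reads like this (the comments reflect my understanding of how the split relates to the process count):

Code: Select all

% split the simulation volume into 2x2x2 = 8 sub-domains, one per MPI process
FDTD = SetupMPI(FDTD, 'SplitN_X', 2, 'SplitN_Y', 2, 'SplitN_Z', 2);

% each MPI process additionally runs 6 threads (8 processes x 6 threads = 48 cores)
opts = '--engine=multithreaded --numThreads=6';

% wrapper script that is launched for every MPI rank
Settings.MPI.Binary = '/home/cousin/openEMS/bin/openEMS.sh';
% number of MPI processes; as far as I understand it must match SplitN_X*SplitN_Y*SplitN_Z
Settings.MPI.NrProc = 8;

RunOpenEMS_MPI(Sim_Path, Sim_CSX, opts, Settings);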

Yes, that's true: with these modifications, it seems that 8 non-MPI openEMS processes are launched in parallel (each doing the same thing).

NB: in order to optimise the memory bandwidth on my NUMA server, I have modified the "openEMS.sh" script like this:
hugectl --heap numactl --cpunodebind=$PMI_RANK --membind=$PMI_RANK $openEMS_PATH/openEMS $@
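For completeness, the whole wrapper then looks something like this (the paths are from my install; the LD_LIBRARY_PATH line is my guess at what the stock script does):

Code: Select all

#!/bin/bash
# wrapper started once per rank by mpirun (set as Settings.MPI.Binary)
openEMS_PATH=/home/cousin/openEMS/bin
export LD_LIBRARY_PATH=/home/cousin/openEMS/lib:$LD_LIBRARY_PATH

# pin each rank to "its" NUMA node (CPU and memory) and back the heap with
# huge pages; $PMI_RANK is set by the MPICH/Hydra launcher (Open MPI would
# provide $OMPI_COMM_WORLD_RANK instead)
hugectl --heap numactl --cpunodebind=$PMI_RANK --membind=$PMI_RANK $openEMS_PATH/openEMS $@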

But of course the problem is the same with or without this modification.

openEMS dependencies (note that no MPI library shows up in the list):

$> ldd openEMS

linux-vdso.so.1 => (0x00007fffa8e9b000)
libCSXCAD.so.0 => /home/cousin/openEMS/lib/libCSXCAD.so.0 (0x00007f6c45ac2000)
libfparser.so.4 => /home/cousin/openEMS/lib/libfparser.so.4 (0x00007f6c457e8000)
libtinyxml.so.0 => /lib64/libtinyxml.so.0 (0x00007f6c4559a000)
libz.so.1 => /lib64/libz.so.1 (0x00007f6c45384000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f6c4517f000)
libhdf5_hl.so.8 => /lib64/libhdf5_hl.so.8 (0x00007f6c44f4b000)
libhdf5.so.8 => /lib64/libhdf5.so.8 (0x00007f6c44957000)
libboost_thread-mt.so.1.53.0 => /lib64/libboost_thread-mt.so.1.53.0 (0x00007f6c4473f000)
libboost_system-mt.so.1.53.0 => /lib64/libboost_system-mt.so.1.53.0 (0x00007f6c4453b000)
libboost_date_time-mt.so.1.53.0 => /lib64/libboost_date_time-mt.so.1.53.0 (0x00007f6c4432a000)
libboost_serialization-mt.so.1.53.0 => /lib64/libboost_serialization-mt.so.1.53.0 (0x00007f6c440bd000)
libboost_chrono-mt.so.1.53.0 => /lib64/libboost_chrono-mt.so.1.53.0 (0x00007f6c43eb5000)
libvtkCommonCore.so.1 => /usr/lib64/vtk/libvtkCommonCore.so.1 (0x00007f6c4398b000)
libvtkCommonDataModel.so.1 => /usr/lib64/vtk/libvtkCommonDataModel.so.1 (0x00007f6c4346d000)
libvtkIOLegacy.so.1 => /usr/lib64/vtk/libvtkIOLegacy.so.1 (0x00007f6c431dc000)
libvtkIOXML.so.1 => /usr/lib64/vtk/libvtkIOXML.so.1 (0x00007f6c42ef4000)
libvtkIOGeometry.so.1 => /usr/lib64/vtk/libvtkIOGeometry.so.1 (0x00007f6c42bc2000)
libvtkIOPLY.so.1 => /usr/lib64/vtk/libvtkIOPLY.so.1 (0x00007f6c429aa000)
libvtksys.so.1 => /usr/lib64/vtk/libvtksys.so.1 (0x00007f6c42766000)
libvtkIOCore.so.1 => /usr/lib64/vtk/libvtkIOCore.so.1 (0x00007f6c424f0000)
libvtkIOXMLParser.so.1 => /usr/lib64/vtk/libvtkIOXMLParser.so.1 (0x00007f6c422d8000)
libvtkCommonExecutionModel.so.1 => /usr/lib64/vtk/libvtkCommonExecutionModel.so.1 (0x00007f6c4203a000)
libvtkCommonMisc.so.1 => /usr/lib64/vtk/libvtkCommonMisc.so.1 (0x00007f6c41e25000)
libvtkCommonSystem.so.1 => /usr/lib64/vtk/libvtkCommonSystem.so.1 (0x00007f6c41c12000)
libvtkCommonTransforms.so.1 => /usr/lib64/vtk/libvtkCommonTransforms.so.1 (0x00007f6c419e6000)
libvtkCommonMath.so.1 => /usr/lib64/vtk/libvtkCommonMath.so.1 (0x00007f6c417c4000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f6c414bc000)
libm.so.6 => /lib64/libm.so.6 (0x00007f6c411ba000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f6c40fa3000)
libc.so.6 => /lib64/libc.so.6 (0x00007f6c40be2000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6c409c6000)
libCGAL.so.10 => /lib64/libCGAL.so.10 (0x00007f6c4079d000)
/lib64/ld-linux-x86-64.so.2 (0x00005625520af000)
librt.so.1 => /lib64/librt.so.1 (0x00007f6c40595000)
libjsoncpp.so.0 => /lib64/libjsoncpp.so.0 (0x00007f6c40372000)
libexpat.so.1 => /lib64/libexpat.so.1 (0x00007f6c40147000)
libmpfr.so.4 => /lib64/libmpfr.so.4 (0x00007f6c3feec000)
libgmp.so.10 => /lib64/libgmp.so.10 (0x00007f6c3fc74000)

pascal14
Posts: 32
Joined: Thu 02 Jan 2014, 23:22

Re: Problem with local MPI

Post by pascal14 » Tue 26 Jan 2016, 19:49

To add the MPI engine, is it enough to add "engine_mpi.cpp" and "openems_fdtd_mpi.cpp" to the "CMakeLists.txt" file before recompiling the source code?

thorsten
Posts: 1468
Joined: Mon 27 Jun 2011, 12:26

Re: Problem with local MPI

Post by thorsten » Tue 26 Jan 2016, 21:14

Hi,
To add the MPI engine, is it enough to add "engine_mpi.cpp" and "openems_fdtd_mpi.cpp" to the "CMakeLists.txt" file before recompiling the source code?
The cmake build system is not prepared for this yet. But you can give it a try.
You need to add "operator_mpi.cpp" too.

Furthermore you need a

Code: Select all

ADD_DEFINITIONS(-DMPI_SUPPORT)
find_package(MPI REQUIRED)
set(CMAKE_CXX_COMPILE_FLAGS ${CMAKE_CXX_COMPILE_FLAGS} ${MPI_COMPILE_FLAGS})
set(CMAKE_CXX_LINK_FLAGS ${CMAKE_CXX_LINK_FLAGS} ${MPI_LINK_FLAGS})
INCLUDE_DIRECTORIES( ${MPI_INCLUDE_PATH} )

and add "${MPI_LIBRARIES}" to the target link libraries...

But I'm not sure if this is enough...

Unfortunately, in the latest github sources I changed the main "openems.cpp/h" a lot, and it is no longer compatible with MPI at all. I'm trying to adapt it right now...

regards
Thorsten

pascal14
Posts: 32
Joined: Thu 02 Jan 2014, 23:22

Re: Problem with local MPI

Post by pascal14 » Wed 27 Jan 2016, 11:47

Hi,

Thank you very much for your advice. I modified the CMakeLists.txt as you showed me, but here is the build log message:

Code: Select all

CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:108 (message):
  Could NOT find MPI_CXX (missing: MPI_CXX_LIBRARIES MPI_CXX_INCLUDE_PATH)


Attached is the modified "CMakeLists.txt" file: perhaps I did not add the lines in the right place?

I ran a test on a 65 MCells example (with only multithreading activated) on my 48-core server: the speed did not improve between 24 cores and 48 cores (175 MCells/s).
The only way to improve this figure on this machine would indeed be MPI, which is very effective on a NUMA machine (best scalability).

I hope you will one day manage to adapt your code so that MPI parallelization works.

Thanks again.
Attachments
CMakeLists.txt
(4.88 KiB) Downloaded 531 times

thorsten
Posts: 1468
Joined: Mon 27 Jun 2011, 12:26

Re: Problem with local MPI

Post by thorsten » Wed 27 Jan 2016, 19:22

Hi,

well, that sounds more like you have no openmpi development package installed.
Look for something like libopenmpi-dev.
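On CentOS the package names differ; something like this should do it (untested here, and note that CentOS provides MPI through environment modules):

Code: Select all

sudo yum install openmpi openmpi-devel
# put the MPI compiler wrappers on the PATH before running cmake
module load mpi/openmpi-x86_64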

I made good progress yesterday: I was able to run an MPI session on my machine.
But the speed dropped to one third with MPI vs. just multi-threading ;)
That wasn't surprising, though. On my 8-year-old dual-core CPU, two competing MPI processes are surely not a good idea.
Especially since the simulation was tiny, so the overhead was bigger than anything else...

I will hopefully give this some polish later tonight and upload it.

regards
Thorsten

thorsten
Posts: 1468
Joined: Mon 27 Jun 2011, 12:26

Re: Problem with local MPI

Post by thorsten » Wed 27 Jan 2016, 19:59

Hi,

done, please give it a try!
Please be aware that there are quite a few changes since 0.0.33, not only MPI-related. But I hope everything works as it should. 8-)

Code: Select all

git clone --recursive https://github.com/thliebig/openEMS-Project.git
cd openEMS-Project
./update_openEMS.sh ~/opt/openEMS --with-MPI
Make sure the dev-package of openmpi can be found. openEMS will link against mpich too, but I found that that library only works across multiple hosts over a network.
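You can verify which MPI the build will pick up by checking the compiler wrapper:

Code: Select all

which mpicxx        # the wrapper must be on the PATH for cmake to find MPI
mpicxx --version    # shows the underlying compiler it wraps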

You can check if openEMS was built with MPI support:

Code: Select all

~/opt/openEMS/bin/openEMS
The usage hint should say something like:

Code: Select all

                --engine=MPI                    engine using compressed operator + sse vector extensions + MPI parallel processing
                --engine=multithreaded          engine using compressed operator + sse vector extensions + MPI + multithreading
And when you run openEMS, there should be no more than one output, as only rank 0 is allowed to print anything!
But if mpirun does not work properly, you get multiple independent openEMS instances, all printing at random...
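For a quick manual test outside of Matlab you can also start it directly; something like this (the xml file name is just an example, and the number of processes has to match the split requested in it):

Code: Select all

mpirun -n 8 ~/opt/openEMS/bin/openEMS Horn_Antenna.xml --engine=MPI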

I expect some nice results and speed numbers ;)

regards
Thorsten

pascal14
Posts: 32
Joined: Thu 02 Jan 2014, 23:22

Re: Problem with local MPI

Post by pascal14 » Thu 28 Jan 2016, 10:15

Hi Thorsten,

Thank you very much for your very quick efforts!!
I will try to recompile this new version of openEMS, hoping that it works with MPI on my CentOS 7: the packages are not exactly the same as on Ubuntu or openSUSE, so some of them are hard to find, particularly for AppCSXCAD. It compiles fine, but at runtime I get a message like "Segmentation fault (core dumped)" (with the previous version 0.33).
Has openEMS ever been compiled on Red Hat or CentOS? It might be a good idea for you to try, because these two systems are very often the ones installed by default on big HPC servers...
Anyway, I will keep you informed of the results on my server.
Thanks again!!
