Which hardware for openEMS?


n4es
Posts: 8
Joined: Tue 12 Jun 2018, 21:34

Re: Which hardware for openEMS?

Post by n4es » Thu 12 Jul 2018, 17:47

Hale_812:
Thank you for the information. I have 32 GB of non-ECC DDR3 1600 MHz RAM now on the i7-4790.
It appears that an unlocked i7-8700K or i7-8086K with 3200 MHz DDR4 would nearly double my performance. You mentioned that an unlocked CPU motherboard would use non-ECC RAM.

Regarding buffered/ECC RAM: which do you feel is more important, the buffering, the error correction, or both?
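As a rough sanity check on the "nearly double" estimate, here is a minimal sketch of the theoretical peak memory bandwidth of the two setups. It assumes that "1600 MHz" and "3200 MHz" mean 1600 and 3200 MT/s, that both platforms run in dual-channel mode, and that each channel is 64 bits (8 bytes) wide. The function name is made up for illustration, and real simulation speed will depend on more than peak bandwidth.

```python
def peak_bandwidth_gb_s(transfers_mt_s, channels, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (10^9 bytes per second).

    transfers_mt_s: transfer rate in MT/s (e.g. 1600 for DDR3-1600)
    channels:       number of populated memory channels (assumed 2 here)
    bus_bytes:      channel width in bytes (assumed 8, i.e. 64 bits)
    """
    return transfers_mt_s * 1e6 * bus_bytes * channels / 1e9


# Assumption: both machines run their memory in dual-channel mode.
ddr3 = peak_bandwidth_gb_s(1600, channels=2)
ddr4 = peak_bandwidth_gb_s(3200, channels=2)

print(f"DDR3-1600, dual channel: {ddr3:.1f} GB/s")
print(f"DDR4-3200, dual channel: {ddr4:.1f} GB/s")
print(f"ratio: {ddr4 / ddr3:.2f}")
```

If the simulation is limited by memory bandwidth, this ratio is an upper bound on the gain from the RAM upgrade alone.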

n4es
Posts: 8
Joined: Tue 12 Jun 2018, 21:34

Re: Which hardware for openEMS?

Post by n4es » Sat 21 Jul 2018, 17:42

I ran more simulations with a larger model: a 3-pole cavity band-pass filter with a fine mesh, 3.37 MCells. Both the i5 and the i7 were fastest using 3 cores. The advantage of Linux Mint 18.3 over Windows 8.1 was smaller with the larger model, probably because the Windows background tasks were stealing a lower percentage of the CPU time. The CPU utilization was also higher.

CPU      Clock    RAM            OS       Threads  Speed
i7-4790  3.6 GHz  DDR3 1600 MHz  Win 8.1  3        171.45 MC/s
i7-4790  3.6 GHz  DDR3 1600 MHz  LM 18.3  3        198.46 MC/s  (1.16 times faster than Win 8.1)
i5-3470  3.2 GHz  DDR3 1333 MHz  LM 18.3  3        178.70 MC/s  (1.04 times faster than i7 Win 8.1)
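The ratios quoted above can be reproduced with a few lines. This is only a sketch: the speeds and the 3.37 MCells model size are copied from this post, and the time per timestep assumes that MC/s means millions of cell updates per second.

```python
# Measured speeds in MC/s, copied from the benchmark figures in this post.
results = {
    "i7-4790, Win 8.1": 171.45,
    "i7-4790, LM 18.3": 198.46,
    "i5-3470, LM 18.3": 178.70,
}

# Speed of each run relative to the i7 on Windows 8.1.
baseline = results["i7-4790, Win 8.1"]
speedups = {name: speed / baseline for name, speed in results.items()}

# Model size in MCells; dividing by MC/s gives seconds per timestep
# (assumption: MC/s = millions of cell updates per second).
cells_mc = 3.37
seconds_per_step = {name: cells_mc / speed for name, speed in results.items()}

for name in results:
    print(f"{name}: {speedups[name]:.2f}x, "
          f"{seconds_per_step[name] * 1e3:.1f} ms per timestep")
```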

I am really quite impressed with the speed of openEMS. Trying to understand the factors that affect speed has been very educational.

Thanks!
Bill N4ES

Hale_812
Posts: 171
Joined: Fri 13 May 2016, 02:54

Re: Which hardware for openEMS?

Post by Hale_812 » Tue 24 Jul 2018, 10:19

n4es wrote:
> Regarding buffered/ECC RAM. Which do you feel is more important, the buffering, the error correction, or both?
In modern Intel-based workstations that is usually a combined solution.

Buffering boosts the electrical reliability of the RAM subsystem enormously, and ECC is a failsafe mechanism for unforeseen events, like degradation of BGA soldering, failure of IC-package wiring due to thermal deformation (this was common in Samsung DRAM ICs ten or so years ago), or even exposure to gamma/space radiation, which can happen in some unusual conditions, like in observatories.

In my experience: at the university we had a Dell Xeon workstation with 128 GB, and we did not know that one DRAM module had been failing for over a year until I checked the motherboard BIOS logs for ECC errors.
Another 256 GB workstation was failing hard right after assembly. Checking the logs allowed us to pinpoint the defective RAM slot (we replaced the motherboard).

In addition, with the Xeon's ECC controller you get many other delicious features, like advanced interleaving, scrambling, throttling, and other thermal control features that are not normally accessible or available in unbuffered i7-based systems.
Last edited by Hale_812 on Tue 24 Jul 2018, 10:37, edited 1 time in total.

Hale_812
Posts: 171
Joined: Fri 13 May 2016, 02:54

Re: Which hardware for openEMS?

Post by Hale_812 » Tue 24 Jul 2018, 10:29

>171.45 MC/s
These MC/s figures are not comparable across models; they depend entirely on the problem size (and, of course, on the available cache and bus throughput). I guess you used the same problem for all the benchmarks?
With a smaller model, Windows can probably be slower because of its memory organization, where all unused RAM is constantly added to the system cache. So if no room is reserved for the computation to grow, the system adds overhead for taking and giving back these RAM pages. That is just a guess.

PaulUK
Posts: 76
Joined: Wed 28 Oct 2015, 14:23

Re: Which hardware for openEMS?

Post by PaulUK » Tue 27 Nov 2018, 02:53

Is there a way to specify how many cores to use with openEMS/Octave? Regarding memory, I have quad-channel DDR4 memory, but the computer is only running at half the bandwidth that the CPUs are capable of because I have just two memory modules per CPU in a dual-CPU motherboard. I'm sure it will be faster for simulations when I get a further two DIMMs per CPU, allowing the system to have full memory bandwidth.
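The "half the bandwidth" figure follows directly from the channel count. A minimal sketch, assuming one DIMM per channel and that peak bandwidth scales linearly with the number of populated channels (the function name is made up for illustration):

```python
def populated_fraction(dimms_per_cpu, channels_per_cpu):
    """Fraction of a CPU's memory channels in use.

    Assumes one DIMM per channel; DIMMs beyond the channel count
    add capacity but no extra channels.
    """
    return min(dimms_per_cpu, channels_per_cpu) / channels_per_cpu


now = populated_fraction(dimms_per_cpu=2, channels_per_cpu=4)
later = populated_fraction(dimms_per_cpu=4, channels_per_cpu=4)

print(f"now: {now:.0%} of peak bandwidth per CPU")
print(f"after adding two DIMMs per CPU: {later:.0%}")
```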

Hale_812
Posts: 171
Joined: Fri 13 May 2016, 02:54

Re: Which hardware for openEMS?

Post by Hale_812 » Thu 29 Nov 2018, 02:02

help RunOpenEMS
RunOpenEMS(Sim_Path, Sim_File, <opts, Settings>)
--numThreads=<n>

However, I am not sure it is working. If I remember correctly, when I tried --engine=basic, something went wrong... Well, just try it yourself.

Octave is single-threaded by default, unless external binary tools are involved, like openEMS, which are not part of Octave and can work unreliably on newer versions (with the new frontend). There are some parallel execution commands in Octave, but those are very primitive.

PaulUK
Posts: 76
Joined: Wed 28 Oct 2015, 14:23

Re: Which hardware for openEMS?

Post by PaulUK » Thu 29 Nov 2018, 11:23

Thank you, I'll give it a try.

It would be nice if it could run on GPUs. I'm not an expert at programming, but I did some simple exercises to rewrite loops that run on a CUDA GPU. I've not looked at the source yet but will do so when I get some time.

thorsten
Posts: 1406
Joined: Mon 27 Jun 2011, 12:26

Re: Which hardware for openEMS?

Post by thorsten » Thu 29 Nov 2018, 11:42

> If I remember correctly, when I tried --engine=basic, something went wrong...
Well, that's no surprise. The basic engine does not do multi-threading (MT) at all, so setting the number of threads is pointless.
At some point I'm thinking about removing all the intermediate engines and maybe keeping only the basic and the full (SSE+compression+MT) engine.
The advantage of the basic engine would be that it's a bit easier to implement and test new features, but not by that much. I'm not sure.
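To make the interaction between the two options concrete, here is a small sketch that assembles the option string and refuses the combination described above. The helper `openems_opts` is hypothetical and not part of openEMS; only `--numThreads=<n>` and `--engine=basic` are taken from this thread, and passing the options as one space-separated string is an assumption.

```python
def openems_opts(num_threads=None, engine=None):
    """Build an option string such as '--numThreads=3' (hypothetical helper).

    Rejects a thread count together with the basic engine, because the
    basic engine does not do multi-threading.
    """
    opts = []
    if engine is not None:
        opts.append(f"--engine={engine}")
    if num_threads is not None:
        if num_threads < 1:
            raise ValueError("num_threads must be at least 1")
        if engine == "basic":
            raise ValueError("the basic engine is single-threaded; "
                             "setting --numThreads is pointless")
        opts.append(f"--numThreads={num_threads}")
    # Assumption: options are passed as one space-separated string.
    return " ".join(opts)


print(openems_opts(num_threads=3))
print(openems_opts(engine="basic"))
```

The resulting string would go in the `<opts>` argument of `RunOpenEMS(Sim_Path, Sim_File, <opts, Settings>)`.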
> It would be nice if it could run on GPUs. I'm not an expert at programming but I did some simple exercises to rewrite loops that run on a CUDA GPU. I've not looked at the source yet but will do so when I get some time.
You will be up against a huge challenge. You would pretty much need to write a new FDTD kernel and all its extensions. I see no easy way to port the current implementation to a GPU. Nevertheless, it would of course be cool to at least have a basic FDTD kernel for a GPU to compare with.
The ideal approach would of course be to fine-tune the FDTD kernel for CPUs, as that will give the best speed in the end. The reason is that the FDTD algorithm is ideally suited to a CPU and its extremely fast internal cache. A GPU with its (in comparison) slow memory will never be able to keep up, even with its much higher computing power. FDTD is limited by data bandwidth, not by computing power.
See the chart I reference in [1]: for example, Empire XPU can run its FDTD kernel at ~4000 MCells/s on an older Core i7, while a Tesla K80 GPU "only" reaches about 3000 MCells/s while being much, much more expensive. Two equally priced Quad Xeons run at 21 BCells/s :lol:
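The bandwidth argument can be turned into a back-of-the-envelope ceiling: if every cell update has to move a fixed number of bytes to and from main memory, the speed cannot exceed bandwidth divided by bytes per cell. The sketch below is illustrative only. The 96 bytes per cell (24 single-precision values) is an assumed figure, not a measured property of openEMS, and 25.6 GB/s is the theoretical peak of dual-channel DDR3-1600. Cache reuse and operator compression change the real traffic considerably.

```python
def max_speed_mcells_s(bandwidth_gb_s, bytes_per_cell):
    """Upper bound on FDTD speed in MCells/s for a memory-bound kernel.

    bandwidth_gb_s: sustained memory bandwidth in GB/s (10^9 bytes/s)
    bytes_per_cell: bytes moved per cell per timestep (an assumption)
    """
    return bandwidth_gb_s * 1e9 / bytes_per_cell / 1e6


# ASSUMPTION: 24 float32 values (fields plus update coefficients) are
# read or written per cell and timestep. Illustrative, not measured.
BYTES_PER_CELL = 24 * 4

bound = max_speed_mcells_s(25.6, BYTES_PER_CELL)
print(f"bandwidth-bound ceiling: {bound:.0f} MCells/s")
```

Under these assumptions, doubling the bandwidth doubles the ceiling, whereas extra compute power leaves it unchanged.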

regards
Thorsten

[1] http://empire.de/page236.html&sid=10a3b ... d24b028080

thorsten
Posts: 1406
Joined: Mon 27 Jun 2011, 12:26

Re: Which hardware for openEMS?

Post by thorsten » Thu 29 Nov 2018, 11:56

One more note on this topic: even going the route of advanced CPU (cache) optimization would require a completely new and different approach to the FDTD kernel, too...
The current openEMS FDTD kernel was never developed with maximum speed in mind, but rather with maximum flexibility for implementing and testing cool new features, and with good maintainability. It is university lab code after all...

regards
Thorsten

PaulUK
Posts: 76
Joined: Wed 28 Oct 2015, 14:23

Re: Which hardware for openEMS?

Post by PaulUK » Thu 29 Nov 2018, 12:12

Thanks for the info.

I see. Well, if I can make use of all my available cores, that would be good. I think it will be faster when I'm running my machine with quad-channel memory rather than just dual. From what I understand, I'll have to get the same memory and can't mix different memory capacities, which is a shame as it was very expensive (for a further 4 x 32 GB!). And I'll have to double it.

I shall download your source code to study, just out of interest if nothing else. I read your publication on openEMS and understand it. Although I dabbled before, I'm finally getting to do more useful things and am writing up my own manual as and when I have time. When I have some of my own models/projects working and documented, you can add them to the tutorials if you wish.

My only experience with CUDA is how to run basic loops on a GPU anyway. I'm no expert at CEM, more of a user, but I am interested in learning the math behind the methods.
