WP 3. Scalable parallel pre-processing: mesh generation and mesh splitting (domain decomposition)

Until now, the usual way to scale current codes to billion-element meshes has been to build a preliminary mesh with a non-parallelised pre-processor and then increase the number of elements by refinement within the solver, losing relevant geometrical information in the process. This work package addresses that problem: it will develop parallelisable pre-processors capable of transferring geometrical information to all the elements the solver needs.

In this task we will develop scalable algorithms for the main operations involved in the preprocessing of analysis data; namely: data import, data decimation or enhancement (refinement), data creation (surfaces, lines) or deletion, specification of boundary conditions for the field solvers and specification of the desired element size and shape distribution in space. The main compute-intensive pre-processing tasks include: intersection of many surfaces, error checking for topological consistency, and specification of element size/shape in space for large, complex geometry cases.

Given the size of the problems targeted, we expect that many new data handling techniques and algorithms will have to be developed and implemented.

As a base code, NUMEXAS will employ the GiD pre-processing system developed at CIMNE (www.gidhome.com). Both NTUA and LUH have ready access to and knowledge of this code, which will reduce learning and training times and improve the chances of success of this task.

Emphasis will be placed on identifying commonalities (as described in section B1.1.4 of the DoW) at pre-processing level that are of general interest and applicable to other HPC developments.

Task leader: **QUANTECH**. Partners involved: **CIMNE, LUH-IKM, NTUA, QUANTECH**

- The first level is to subdivide space into regions that will each generate approximately the same number of elements; and
- The second level is to perform the parallel grid generation.

These two tasks will be carried out with different grid generation techniques/codes, making the approach very general.

Given that the number of elements and points decreases with the third power of the element size, a mesh whose elements have side-lengths n times as large as desired will contain only 1/n³ as many elements as the desired (fine) mesh. The idea is to generate, starting from the fine surface mesh, a mesh whose elements are considerably larger than those of the desired grid. Assuming generation time is roughly proportional to the number of elements, a factor of n=10 will lead to a mesh generated in roughly 1/1000th of the time required for the fine mesh; for n=20, the factor is 1/8000. The mesh obtained conforms to the general size distribution required by the user and is completely general. It will also allow us to determine exactly and easily which regions of space need to be gridded (one of the problematic aspects of earlier parallel grid generators).
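The scaling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a uniform element size in three dimensions so that the element count scales with the inverse cube of the side length; the function name `coarse_fraction` and the fine-mesh count of one billion elements are illustrative choices, not part of the work plan.

```python
from fractions import Fraction


def coarse_fraction(n: int) -> Fraction:
    """Fraction of the fine-mesh element count that remains when element
    side-lengths are n times larger (assumes uniform 3D sizing: 1/n**3)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return Fraction(1, n ** 3)


fine_elements = 10 ** 9  # illustrative target size, not a project figure
for n in (10, 20):
    frac = coarse_fraction(n)
    print(f"n={n}: fraction={frac}, coarse elements ~ {int(fine_elements * frac)}")
```

With a real, non-uniform size distribution the ratio would only hold approximately, region by region.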

Task leader: **CIMNE**. Partners involved: **CIMNE, LUH-IKM, NTUA**

Given an initial mesh, the next task is to subdivide this mesh so as to obtain regions in which roughly the same number of elements will be generated. A number of load balancing techniques and codes have been developed over the last two decades. In principle, any of these can be used in order to obtain the subdivision required.
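As an illustration of what such a subdivision involves, the sketch below uses weighted recursive coordinate bisection, one of the simplest load balancing schemes. It is an assumption made for the example, not the technique selected for the project. Each coarse element is represented by a centroid and a weight, where the weight stands for the estimated number of fine elements it will produce; the names `bisect` and `Element` are hypothetical.

```python
from typing import List, Sequence, Tuple

# (centroid, estimated number of fine elements generated inside this coarse element)
Element = Tuple[Tuple[float, float, float], float]


def _extent(elements: Sequence[Element], axis: int) -> float:
    """Spread of the centroids along one coordinate axis."""
    coords = [e[0][axis] for e in elements]
    return max(coords) - min(coords)


def bisect(elements: Sequence[Element], levels: int) -> List[List[Element]]:
    """Split elements into up to 2**levels groups of roughly equal total weight."""
    if levels == 0 or len(elements) < 2:
        return [list(elements)]
    # Cut across the axis along which the centroids spread the most.
    axis = max(range(3), key=lambda a: _extent(elements, a))
    ordered = sorted(elements, key=lambda e: e[0][axis])
    half = sum(w for _, w in ordered) / 2.0
    # Place the cut at the weighted median, keeping both sides non-empty.
    running, cut = 0.0, 1
    for i, (_, w) in enumerate(ordered[:-1], start=1):
        running += w
        cut = i
        if running >= half:
            break
    return bisect(ordered[:cut], levels - 1) + bisect(ordered[cut:], levels - 1)


# Illustrative 4x4 slab of coarse elements; the x < 1 column is to be refined twice as much.
coarse = [((x + 0.5, y + 0.5, 0.5), 2.0 if x < 1 else 1.0)
          for x in range(4) for y in range(4)]
parts = bisect(coarse, levels=2)
loads = [sum(w for _, w in p) for p in parts]
print(loads)
```

Balancing on the estimated fine-element count rather than on the coarse-element count is what makes the regions generate similar numbers of elements later on.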

Once the subdivision of space is obtained, the mesh is generated in parallel. In NUMEXAS we will use an ‘inside-out’ technique, whereby the zones inside the subdivision domains are generated first. The zones bordering the regions, which are left empty after this first volume-filling pass, are then meshed, in parallel, by pairing two domains (or groups of domains) at a time. By using a colouring technique, most of these inter-domain regions can be meshed completely in parallel.
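The text does not specify the colouring technique, so the following is only a sketch of one possible reading: a greedy edge colouring of the domain adjacency graph. Each inter-domain region is identified by the pair of domains it separates, and pairs that receive the same colour share no domain, so they could be meshed concurrently. The function name `colour_interfaces` and the 2x2 domain layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[int, int]  # the two domain ids that border an inter-domain region


def colour_interfaces(pairs: Iterable[Pair]) -> Dict[int, List[Pair]]:
    """Greedily group domain pairs into colours so that no domain appears
    twice within the same colour."""
    used = defaultdict(set)  # domain id -> colours already taken by its interfaces
    rounds: Dict[int, List[Pair]] = defaultdict(list)
    for a, b in pairs:
        colour = 0
        while colour in used[a] or colour in used[b]:
            colour += 1
        used[a].add(colour)
        used[b].add(colour)
        rounds[colour].append((a, b))
    return dict(rounds)


# Hypothetical 2x2 arrangement of domains 0..3 with four shared faces.
interfaces = [(0, 1), (2, 3), (0, 2), (1, 3)]
schedule = colour_interfaces(interfaces)
for colour, group in sorted(schedule.items()):
    print(colour, group)
```

A greedy pass does not guarantee the minimum number of colours, but every colour it produces is a valid set of pairs to process in parallel.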

The remaining regions are then meshed via agglomeration of the smaller regions, which can (again via a suitable colouring technique) be meshed completely in parallel.

Any regions still left are then agglomerated further and meshed, resulting in logarithmic parallelism. However, the volume fraction occupied by these last regions is expected to be very small, and hence the CPU overhead should be negligible.
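To make the term concrete, the sketch below counts the passes needed if groups of domains are merged pairwise in each agglomeration step. Pairwise merging is an assumption made for the example, since the text does not fix the agglomeration factor, and the function name `agglomeration_schedule` is hypothetical.

```python
from typing import List


def agglomeration_schedule(num_domains: int) -> List[int]:
    """Number of domain groups that can work concurrently in each successive
    pass, assuming groups are merged two at a time."""
    if num_domains < 1:
        raise ValueError("num_domains must be positive")
    schedule = []
    groups = num_domains
    while groups > 1:
        groups = (groups + 1) // 2  # pairwise merge; an odd group carries over
        schedule.append(groups)
    return schedule


for domains in (8, 1000):
    passes = agglomeration_schedule(domains)
    print(domains, len(passes), passes)
```

Under this assumption the number of passes grows with the base-2 logarithm of the number of domains, while the number of groups working concurrently halves in each pass.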

Given the number of processors and domains targeted, this task will lead to a whole series of new mesh generation algorithms that were hitherto not required given the limited size of problems currently being run.

Emphasis will be placed on identifying commonalities (as described in section B1.1.4 of the DoW) at mesh generation level that are of general interest and applicable to other HPC developments.

Task leader: **NTUA**. Partners involved: **CIMNE, LUH-IKM, NTUA**

Lead beneficiary: **NTUA**

Lead beneficiary: **CIMNE**

Lead beneficiary: **NTUA**