package command

Syntax:

package style args
  • style = kokkos

  • args = arguments specific to the style

    • kokkos args = keyword value …

      zero or more keyword/value pairs may be appended

      keywords = comm or reduction

      • comm value = threaded or classic
        • threaded = perform pack/unpack using KOKKOS (e.g. on GPU or using OpenMP) (default)
        • classic = perform communication pack/unpack in non-KOKKOS mode
      • reduction = parallel/reduce or atomic
        • parallel/reduce = use parallel reduction for statistics (default)
        • atomic = use atomic reduction for statistics
      • collide/extra = factor
        • factor = increase memory used for collisions by this factor (default)
      • collide/retry = yes or no
        • yes = retry collision algorithm until successful
        • no = do not retry collision algorithm (default)
      • gpu/direct = yes or no - yes = use GPU-direct, i.e. CUDA-aware MPI (default) - no = do not use GPU-direct

Examples:

package kokkos comm classic
package kokkos comm threaded reduction atomic
package kokkos gpu/direct no

Description:

This command invokes package-specific settings for the KOKKOS accelerator package available in SPARTA.

If this command is specified in an input script, it must be near the top of the script, before the simulation box has been created. This is because it specifies settings that the accelerator package used in its initialization, before a simulation is defined.

This command can also be specified from the command-line when launching SPARTA, using the “-pk” command-line switch. The syntax is exactly the same as when used in an input script.

Note that the KOKKOS accelerator package requires the package command to be specified, if the package is to be used in a simulation (SPARTA can be built with the accelerator package without using it in a particular simulation). However, a default version of the command is typically invoked by other accelerator settings. For example, the KOKKOS package requires a “-k on” command-line switch respectively, which invokes a “package kokkos” command with default settings.

Note

A package command for a particular style can be invoked multiple times when a simulation is setup, e.g. by the “-k on”, “-sf”, and “-pk” command-line switches, and by using this command in an input script. Each time it is used all of the style options are set, either to default values or to specified settings. I.e. settings from previous invocations do not persist across multiple invocations.

See the the Accelerating SPARTA section of the manual for more details about using the various accelerator packages for speeding up SPARTA simulations.


The kokkos style invokes settings associated with the use of the KOKKOS package.

All of the settings are optional keyword/value pairs. Each has a default value as listed below.

The reduction keyword sets the type of reduction used to gather statistics. The parallel/reduce option uses a parallel reduction and is typically the preferred method when running on CPUs and Xeon Phis. The atomic option uses thread atomics and is typically faster when running on GPUs.

Chemical reactions can increase the number of particles in the simulation, which requires extra memory storage. It is not possible to resize Kokkos data structures during the collide routine, so two workarounds are provided. The default is to use the collide/extra keyword, which ensures there is extra memory allocated to store new particles. For example, if collide/extra is set to 1.1, then the memory is over-allocated by 10%. If this space is still not sufficient to hold new particles, the code will error out and the simulation must be restarted using a larger value for collide/extra. Alternatively, if the collide/retry option is set to yes, backup copies of the Kokkos data structures are created. If space is exceeded during the collide routine, the Kokkos data structures are restored from backup, their size is increased, and the collide routine is started over from the beginning. This guarantees that the collide routine will eventually succeed without producing an error, but increased memory by a factor of 2 and also has overhead from making a backup copy of the data. If the collide/retry option is set to yes, the collide/extra keyword will be ignored. If reactions are not defined, both these options will be ignored.

The comm keyword determines whether the host or device performs the packing and unpacking of data when communicating per-atom data between processors. The value options are threaded or classic.

The optimal choice for this keyword depends on the hardware used. When running on CPUs or Xeon Phi, the classic option is typically fastest. When using GPUs, the threaded value will typically be optimal. In this case data can stay on the GPU for many timesteps without being moved between the host and GPU. This requires that your MPI is able to access GPU memory directly. Currently that is true for OpenMPI 1.8 (or later versions), Mvapich2 1.9 (or later), and CrayMPI.

The gpu/direct keyword chooses whether GPU-direct will be used. When this keyword is set to on, buffers in GPU memory are passed directly through MPI send/receive calls. This can reduce overhead of first copying the data to the host CPU. However GPU-direct is not supported on all systems, which can lead to segmentation faults and would require using a value of off.


Restrictions:

This command cannot be used after the simulation box is defined by a create_box command.

The kk style of this command can only be invoked if SPARTA was built with the KOKKOS package. See the Making SPARTA section for more info.

Default:

For the KOKKOS package, the option defaults are comm = threaded, reduction = parallel/reduce, collide/extra = 1.1, and collide/retry = no, gpu/direct yes. These settings are made automatically by the required “-k on” command-line switch. You can change them by using the package kokkos command in your input script or via the “-pk kokkos” command-line switch.