Parallel Computing in MCNP

Running MCNP faster with multiple processors

Why Use Parallel MCNP?

Monte Carlo transport is naturally parallel: each particle history is independent of every other, so histories can be distributed across processors with minimal communication. This makes MCNP well suited to parallel computing.
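The independence of histories is what makes the method "embarrassingly parallel". A toy sketch of the pattern in plain Python (not MCNP): each history gets its own random-number stream, workers never communicate mid-run, and tallies are combined only at the end.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def run_history(seed):
    # Toy stand-in for one particle history: an independent RNG stream
    # per history means workers never need to talk to each other.
    rng = random.Random(seed)
    return sum(1 for _ in range(100) if rng.random() < 0.3)

# Split 1000 independent histories across 4 workers;
# tallies are combined only after all histories finish.
with ThreadPoolExecutor(max_workers=4) as pool:
    tallies = list(pool.map(run_history, range(1000)))

mean_tally = sum(tallies) / len(tallies)
```

MCNP distributes histories across processes or threads in the same spirit, which is why adding cores mostly just divides the history workload.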

Benefits

  • Reduce runtime dramatically
  • Run more particles for better statistics
  • Handle larger, more complex models
  • Utilize modern multi-core hardware

Key Concepts

  • Speedup: runtime on one core divided by runtime on N cores
  • Efficiency: speedup divided by N; how well the cores are utilized
  • Scaling: how performance changes as processor count grows
  • Overhead: time spent coordinating rather than transporting particles
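These quantities reduce to simple formulas. A sketch, including Amdahl's law, the standard bound on speedup when some fraction of the work stays serial (the timing numbers here are illustrative, not measurements):

```python
def speedup(t1, tn):
    # Speedup: single-core runtime divided by N-core runtime
    return t1 / tn

def efficiency(t1, tn, n):
    # Efficiency: achieved fraction of ideal linear speedup
    return speedup(t1, tn) / n

def amdahl_limit(serial_fraction, n):
    # Amdahl's law: speedup bound if serial_fraction of the work
    # cannot be parallelized, no matter how many cores are added
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

# Example: 1000 s on one core, 140 s on eight cores
print(speedup(1000, 140))        # ~7.1x
print(efficiency(1000, 140, 8))  # ~89%
print(amdahl_limit(0.01, 8))     # 1% serial work caps 8 cores at ~7.5x
```

Amdahl's law is also why the "diminishing returns" noted later are unavoidable: even a small serial fraction (problem setup, tally collection, I/O) caps the achievable speedup.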

Parallel Methods

MCNP offers three main approaches to parallel computing, each suited for different hardware.

MPI (Message Passing Interface)

Best for clusters and distributed systems. Each process has its own memory.

```bash
# Run with 16 MPI processes (requires the MPI build of the executable)
mpirun -np 16 mcnp6.mpi i=input n=output
```

Note that the `tasks` keyword on the MCNP command line controls the number of OpenMP threads, not MPI processes; MPI parallelism always goes through the MPI launcher.

OpenMP (Shared Memory)

Best for multi-core workstations. Multiple threads share memory.

```bash
# Run with 8 OpenMP threads using MCNP's tasks keyword
mcnp6 i=input n=output tasks 8

# Equivalent, via the environment
export OMP_NUM_THREADS=8
mcnp6 i=input n=output

# Set thread affinity for better performance
export OMP_PROC_BIND=true
export OMP_NUM_THREADS=8
mcnp6 i=input n=output
```

Hybrid (MPI + OpenMP)

Combines both methods. Best for clusters with multi-core nodes.

```bash
# 4 MPI processes with 6 threads each (24 cores total)
export OMP_NUM_THREADS=6
mpirun -np 4 mcnp6.mpi i=input n=output
```

Choosing the Right Method

Single Workstation

  • Use OpenMP (threads) for simplicity
  • Set threads = number of physical cores
  • Avoid hyperthreading for MCNP
  • Ensure adequate memory per thread

HPC Cluster

  • Use hybrid MPI+OpenMP
  • MPI tasks = number of nodes
  • Threads = cores per node
  • Consider network performance
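If the cluster runs Slurm, the rules above translate directly into a batch script. A sketch that generates one from the node layout; the partition, account, and module lines a real site would need are omitted, and the input/output names are placeholders:

```python
def slurm_hybrid(nodes, cores_per_node, inp="input", out="output"):
    # Hybrid rule of thumb from above: one MPI task per node,
    # threads = cores per node.  Adapt headers to your cluster.
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --ntasks-per-node=1",
        f"#SBATCH --cpus-per-task={cores_per_node}",
        f"export OMP_NUM_THREADS={cores_per_node}",
        f"mpirun -np {nodes} mcnp6.mpi i={inp} n={out}",
    ])

print(slurm_hybrid(4, 16))   # 4 nodes x 16 cores = 64 cores total
```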

Performance Tips

Optimal Configuration

  • Start with threads = physical cores for single machines
  • For clusters, use 1 MPI task per node with threads = cores per node
  • Always test different configurations for your specific problem
  • Monitor memory usage: each MPI process holds its own copy of the problem data, including cross-section libraries, so RAM requirements multiply with process count

Common Mistakes to Avoid

  • Over-subscription: More tasks than cores
  • Memory issues: Insufficient RAM per process
  • I/O bottlenecks: Too many output files
  • Poor load balancing: Uneven work distribution
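The first two mistakes can be caught before launching. A minimal pre-flight check; note that `os.cpu_count()` reports logical cores, so on a hyperthreaded machine the physical count is typically half:

```python
import os

def preflight(mpi_tasks, threads_per_task, mem_per_task_gb, total_mem_gb):
    # Catch over-subscription and obvious memory shortfalls before a run.
    problems = []
    cores = os.cpu_count() or 1           # logical cores on this machine
    workers = mpi_tasks * threads_per_task
    if workers > cores:
        problems.append(f"over-subscribed: {workers} workers > {cores} cores")
    if mpi_tasks * mem_per_task_gb > total_mem_gb:
        problems.append("insufficient RAM for all MPI processes")
    return problems

# Example: 4 processes x 6 threads, 8 GB each, on a 64 GB machine
issues = preflight(4, 6, 8, 64)
```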

Testing Performance

Always test your parallel setup with a representative problem before running production calculations.

Simple Scaling Test

```bash
# Test with different thread counts (set before each run)
OMP_NUM_THREADS=1 mcnp6 i=test n=out1    # Baseline
OMP_NUM_THREADS=2 mcnp6 i=test n=out2    # 2 cores
OMP_NUM_THREADS=4 mcnp6 i=test n=out4    # 4 cores
OMP_NUM_THREADS=8 mcnp6 i=test n=out8    # 8 cores

# Compare runtimes and calculate speedup:
#   Speedup    = Time(1 core) / Time(N cores)
#   Efficiency = Speedup / N
```
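The comparison step can be scripted. A sketch with illustrative timings; substitute the wall-clock times reported for your own runs:

```python
# Illustrative wall-clock times in seconds, keyed by thread count
times = {1: 1200.0, 2: 630.0, 4: 340.0, 8: 195.0}

baseline = times[1]
for n in sorted(times):
    s = baseline / times[n]   # Speedup    = Time(1 core) / Time(N cores)
    e = s / n                 # Efficiency = Speedup / N
    print(f"{n:2d} threads: speedup {s:5.2f}x, efficiency {e:6.1%}")
```

A table like this makes the diminishing-returns point obvious at a glance: watch for the thread count where efficiency drops below a threshold you care about (say 70%), and run production jobs at or below it.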

Typical Performance Expectations

  • Near-linear speedup up to 8-16 cores on most problems
  • Diminishing returns beyond 32-64 cores as coordination overhead grows
  • Complex geometries often scale better than simple ones: each history carries more compute relative to the fixed parallel overhead
  • Shielding problems with variance reduction often scale well, though widely varying history lengths can hurt load balancing

Getting Started

  1. Start with OpenMP on your workstation using threads = cores
  2. Test with a small problem to verify setup works
  3. Monitor performance and memory usage
  4. Scale up gradually, testing at each step
  5. Document what works best for your problem types