Talks

Dynamic Non-Uniform Randomization in Asynchronous Linear Solvers

Copper Mountain Conference on Iterative Methods, Copper Mountain, CO (Virtual) (2022) (Slides) (Code for the slides)
Abstract/Overview

One approach to improving the performance of stationary linear solvers is to utilize asynchronous communication between the processors. In order to establish bounds on the convergence rate, randomized variants of asynchronous linear solvers have been studied. This approach has been examined previously for both uniform and non-uniform random selection; however, in those cases the probability of selecting any given component remains fixed throughout the algorithm and does not change dynamically. The idea behind the solvers considered here is for each processor to select the next component to update randomly, using a distribution that more heavily weights components that are somehow more important to the solution. The main contribution is to evaluate the viability of changing the component ranking dynamically, and to investigate the effect of various distributions in selecting from this dynamic list.

This updating procedure is motivated by the Southwell iteration, which selects the component with the largest contribution to the residual at each iteration and can converge in fewer iterations than traditional relaxation schemes; previous work has also shown that Southwell-type iterations can converge faster than uniform random selection. There is a balance between the extra computational time required to intelligently select components and the savings in total iterations. Updating all the residuals every iteration likely introduces too much computational overhead to be of practical use, so techniques are explored for lessening this computational burden.
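As a concrete illustration, here is a minimal serial Python sketch (hypothetical code, not the parallel asynchronous implementation from the talk) contrasting three selection rules for component-wise relaxation: a fixed uniform distribution, a dynamic residual-weighted distribution, and greedy Southwell selection:

```python
import numpy as np

def relaxation_solve(A, b, select, num_iters=1000):
    """Component-wise relaxation for A x = b: at each step, pick one
    component with select(r) and zero the corresponding residual entry."""
    x = np.zeros(len(b))
    r = b - A @ x
    for _ in range(num_iters):
        i = select(r)
        dx = r[i] / A[i, i]
        x[i] += dx
        r -= dx * A[:, i]          # incremental residual update
    return x

rng = np.random.default_rng(0)
# Fixed uniform distribution over components
uniform = lambda r: int(rng.integers(len(r)))
# Dynamic distribution rebuilt from the current residual every step
weighted = lambda r: int(rng.choice(len(r), p=np.abs(r) / np.abs(r).sum()))
# Greedy Southwell: always the largest residual component
southwell = lambda r: int(np.argmax(np.abs(r)))

# Diagonally dominant test problem
n = 20
A = 4 * np.eye(n) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
b = np.ones(n)
x_un = relaxation_solve(A, b, uniform)
x_wt = relaxation_solve(A, b, weighted)
x_sw = relaxation_solve(A, b, southwell)
```

The `weighted` rule captures the spirit of the dynamic approach: the distribution is rebuilt from the current residual at every step, so components contributing most to the residual are chosen more often without paying the full bookkeeping cost of the strict greedy argmax.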

Scalable Methods in Numerical Linear Algebra

Applied Math and Data Analysis Seminar, Naval Surface Warfare Center, Dahlgren Division, Dahlgren, VA (2019)
Abstract/Overview This talk took the opportunity to discuss generalizations of a few different scalability studies. Topics centered on ways to incorporate randomization and asynchronous communication into some of the lower-level mathematical routines. The first studies used a randomized, asynchronous Jacobi method, and preliminary work extending the results to coordinate descent methods was discussed.

Fault Tolerant Methods for Scientific Computing

Office of Naval Research ILIR/IAR Symposium, Office of Naval Research, Arlington, VA (2018)
Abstract/Overview Summary of results developed under this Office of Naval Research grant including studies on simulating faults and creating fault tolerant algorithms.

Convergence and Resilience of the Fine-Grained Parallel Incomplete LU Factorization for Non-Symmetric Problems

High Performance Computing Symposium at the Spring Simulation Multi-Conference 18, Baltimore, MD (2018) (Slides)
Abstract/Overview An investigation into the convergence of the fine-grained parallel algorithm for computing an incomplete LU factorization for non-symmetric and indefinite matrices. The fine-grained parallel incomplete LU factorization is a nonlinear fixed point iteration and convergence has not been extensively studied for problems that are not symmetric positive definite. This work investigates the convergence of the algorithm for these more difficult problems and additionally investigates how the occurrence of a computing fault may impact the convergence of the algorithm for the problems being studied. The results obtained suggest that this class of problems presents challenges for the fine-grained parallel incomplete LU factorization (less than 30% of configurations converge naturally), and that while the occurrence of a fault can cause significant negative effects, the simple algorithmic change advocated here can completely ameliorate the effects of a fault.

Simulation Framework for Asynchronous Iterative Methods

Naval Surface Warfare Center Dahlgren Division, Technical Seminar, Dahlgren, VA (2018)
Abstract/Overview As high-performance computing (HPC) platforms progress towards exascale, computational methods must be revamped to successfully leverage them. In particular, (1) asynchronous methods become of great importance because synchronization becomes prohibitively expensive, and (2) resilience of computations must be achieved, e.g., by using checkpointing selectively, since checkpointing everything may otherwise become prohibitively expensive due to the sheer scale of the computing environment. In this work, a simulation framework for asynchronous iterative methods is proposed and tested on an HPC accelerator (shared-memory) architecture. The design proposed here offers a lightweight alternative to existing computational frameworks to allow for easy experimentation with various relaxation iterative techniques, solution updating schemes, and predicted performance. The simulation framework is implemented in MATLAB® using function handles, which offers a modular and easily extensible design. A case study using the simulation framework is presented to examine the efficacy of different checkpointing schemes for asynchronous relaxation methods.
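The actual framework is written in MATLAB using function handles; a rough Python analogue built from callables (all names here are hypothetical) conveys the modular design, where the relaxation rule, the update schedule, and the checkpointing policy are independently swappable pieces:

```python
import numpy as np

def simulate(A, b, relax, schedule, num_iters=400, checkpoint_every=50):
    """Drive a relaxation method whose pieces are supplied as callables,
    mirroring the function-handle design: `relax` applies one update to
    the chosen components, `schedule` decides which components update."""
    n = len(b)
    x = np.zeros(n)
    checkpoints = []
    for k in range(num_iters):
        x = relax(A, b, x, schedule(k, n))
        if k % checkpoint_every == 0:
            checkpoints.append(x.copy())   # periodic in-memory checkpoint
    return x, checkpoints

def jacobi_relax(A, b, x, block):
    # Jacobi-style update restricted to the chosen components, computed
    # from the current (possibly stale) iterate.
    x_new = x.copy()
    for i in block:
        x_new[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
    return x_new

round_robin = lambda k, n: [k % n]   # one component per "tick"

n = 20
A = 3 * np.eye(n) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
b = np.ones(n)
x, checkpoints = simulate(A, b, jacobi_relax, round_robin)
```

Swapping in a different `schedule` (random selection, blocks per simulated processor) or a different checkpointing rule requires no change to the driver, which is the extensibility argument the talk makes for the function-handle design.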

Fault Tolerant Variants of the Fine-Grained Parallel Incomplete LU Factorization

High Performance Computing Symposium at the Spring Simulation Multi-Conference 17, Virginia Beach, VA (2017) (Slides)
Abstract/Overview Results concerning the convergence of the algorithm with respect to the occurrence of faults, and the impact of any sub-optimality in the produced incomplete factors on Krylov subspace solvers, are given. Numerical tests show that the simple algorithmic changes suggested here can ensure convergence of the fine-grained parallel incomplete factorization, and improve the performance of the resulting factors when used as preconditioners in Krylov subspace solvers if faults do occur.

Fault Tolerance for Fine-Grained Parallel Iterative Methods

Virginia Modeling, Analysis and Simulation Center Capstone Conference, Norfolk, VA (2017)
Abstract/Overview Talk showcasing preliminary efforts to take initial work on Krylov subspace solvers into the so-called “fine-grained” domain. That is, working towards performing any algorithm-based fault tolerance on a small, granular level so that each processing element (e.g., CPU core) can operate independently in an attempt to minimize data transfer and corresponding communication costs.

Fault Tolerant Methods in Scientific Computing

In-House Laboratory Independent Research (ILIR) Presentation, Naval Surface Warfare Center Dahlgren Division, Dahlgren, VA (2017)
Abstract/Overview Summary of results developed under this Office of Naval Research grant including studies on simulating faults and creating fault tolerant algorithms.

Evaluating a perturbation-based soft fault model on preconditioned iterative methods

International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, NV (2016)
Abstract/Overview This talk was linked to the corresponding conference paper and showcased some work towards developing a way to simulate the occurrence of a soft fault without the need to inject a large enough number of bit flips to generate a statistically significant sample.
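The idea can be sketched as follows (a hypothetical Python illustration, not the model from the paper): a transient soft fault is represented as a single random perturbation of controlled relative size, injected into an otherwise fault-free iteration, rather than as individual bit flips that must be sampled many times:

```python
import numpy as np

rng = np.random.default_rng(1)

def perturb(v, magnitude):
    """Perturbation-based soft fault model: add a random direction scaled
    to a chosen fraction of ||v||, instead of flipping individual bits.
    One perturbation of a given size stands in for a whole distribution
    of bit-flip outcomes at that severity."""
    noise = rng.standard_normal(v.shape)
    return v + magnitude * np.linalg.norm(v) * noise / np.linalg.norm(noise)

# Inject a single "fault" into one Jacobi iteration and watch recovery.
n = 30
A = 4 * np.eye(n) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
b = np.ones(n)
D_inv = 1.0 / np.diag(A)
x = np.zeros(n)
for k in range(200):
    x = x + D_inv * (b - A @ x)        # standard Jacobi update
    if k == 50:
        x = perturb(x, magnitude=0.5)  # transient soft fault at iteration 50
residual = np.linalg.norm(b - A @ x)
```

Because the fault severity is an explicit parameter, its effect on convergence can be swept directly instead of estimated from large ensembles of bit-flip injections.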

A Comparison and Analysis of Soft Fault Error Models

Virginia Modeling, Analysis and Simulation Center Capstone Conference, Norfolk, VA (2016)
Abstract/Overview This talk covered a pair of studies designed to judge the impact of soft faults on Krylov subspace solvers. To that end, experiments measured the impact of undetected computational faults, and the results fed into future studies aimed at creating fault tolerant algorithms.