Great Edge-pectations: How Edge and Exascale Found Love

It is no secret that the world’s largest supercomputers are often quite lonely. Until recently, live-streaming sensor data into massive simulations was more exhibit booth demo than reality. Extreme-scale computational models were isolated, admiring from afar the excitement of edge computing, the Internet of Things, smart cities, autonomous cars and intelligent laboratory experiments. From the

Resilient Error-Bounded Lossy Compressor for Data Transfer

Today’s exascale scientific applications and advanced instruments produce vast volumes of data that need to be shared or transferred over networks and devices with relatively low bandwidth (e.g., WAN). Lossy compression is an important strategy for addressing this big-data problem; however, little work has been done to make it resilient against silent errors, which may happen during
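The abstract combines two ideas: a pointwise error bound on the reconstructed data and resilience to silent corruption during transfer. The sketch below is not the paper's compressor; it is a minimal, hypothetical illustration of those two ideas using uniform quantization for the error bound and a CRC32 checksum as the integrity check. All function names and parameters are invented for illustration.

```python
"""Minimal sketch: error-bounded lossy compression guarded by a checksum."""
import zlib
import numpy as np


def compress(data: np.ndarray, abs_err: float):
    """Quantize to multiples of 2*abs_err (max error abs_err), deflate, add CRC32."""
    codes = np.round(data / (2.0 * abs_err)).astype(np.int64)
    payload = zlib.compress(codes.tobytes())
    checksum = zlib.crc32(payload)  # detects silent bit flips in transit
    return payload, checksum


def decompress(payload: bytes, checksum: int, abs_err: float) -> np.ndarray:
    """Verify integrity, then reconstruct values within the error bound."""
    if zlib.crc32(payload) != checksum:
        raise ValueError("silent data corruption detected during transfer")
    codes = np.frombuffer(zlib.decompress(payload), dtype=np.int64)
    return codes.astype(np.float64) * (2.0 * abs_err)


if __name__ == "__main__":
    original = np.random.default_rng(0).random(1_000_000)
    payload, crc = compress(original, abs_err=1e-3)
    restored = decompress(payload, crc, abs_err=1e-3)
    assert np.max(np.abs(original - restored)) <= 1e-3
```

Rounding to the nearest multiple of 2*abs_err bounds the per-value reconstruction error by abs_err; the checksum only detects corruption, it does not correct it, which is why resilience (rather than mere detection) is the harder problem the paper targets.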

KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks

Kronecker-factored Approximate Curvature (K-FAC) has recently been shown to converge faster in deep neural network (DNN) training than stochastic gradient descent (SGD); however, K-FAC’s larger memory footprint hinders its applicability to large models. We present KAISA, a K-FAC-enabled, Adaptable, Improved, and ScAlable second-order optimizer framework that adapts the memory footprint, communication, and computation given specific
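For readers unfamiliar with K-FAC, the following NumPy sketch shows the core idea the abstract builds on: approximating a layer's Fisher information matrix as a Kronecker product of two small factors, so the natural-gradient update needs only two modest matrix inverses. This is not KAISA itself; the variable names, damping value, and learning rate are illustrative assumptions for a single fully connected layer.

```python
"""Minimal sketch of a K-FAC-style preconditioned update for one dense layer."""
import numpy as np

rng = np.random.default_rng(0)
batch, d_in, d_out = 64, 256, 128
damping, lr = 1e-2, 0.1

W = rng.standard_normal((d_out, d_in)) * 0.01   # layer weights
a = rng.standard_normal((batch, d_in))          # layer inputs (activations)
g = rng.standard_normal((batch, d_out))         # grads w.r.t. pre-activations

# Kronecker factors of the layer's Fisher approximation F ~ A (x) G.
A = a.T @ a / batch                              # (d_in, d_in) input covariance
G = g.T @ g / batch                              # (d_out, d_out) gradient covariance

grad_W = g.T @ a / batch                         # ordinary weight gradient

# Storing and inverting these factors is where K-FAC's extra memory and
# compute go; that footprint is what KAISA adapts per model and hardware.
A_inv = np.linalg.inv(A + damping * np.eye(d_in))
G_inv = np.linalg.inv(G + damping * np.eye(d_out))

# Approximate natural-gradient step:
# (A (x) G)^{-1} vec(grad_W) == vec(G^{-1} grad_W A^{-1}).
W -= lr * (G_inv @ grad_W @ A_inv)
```

Because the factors are d_in x d_in and d_out x d_out rather than (d_in*d_out) squared, the preconditioner stays tractable, but it still costs more memory and communication than SGD, which is the tradeoff the framework manages.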