Mixed-Precision Training Efficiency in Compute Constrained ML Systems

Authors

  • Rowan Westerdale

Keywords

Mixed-Precision Training, Gradient Stability, Generalization Robustness

Abstract

Mixed-precision training has become a practical approach for accelerating deep neural network
training in compute-constrained environments, but its effectiveness depends on maintaining gradient
fidelity and stable convergence behavior. By executing forward and backward passes in reduced
precision while retaining master parameters in higher precision, mixed-precision techniques reduce
memory usage and improve arithmetic throughput. However, precision reduction introduces
quantization noise and increases the risk of gradient underflow, making loss scaling and selective
precision control essential. This study evaluates mixed-precision training across multiple neural
architectures, examining gradient stability, convergence trajectories, and generalization performance
relative to full-precision training. The results show that when dynamic loss scaling and controlled
precision retention are applied, mixed-precision models achieve comparable or improved
generalization by converging toward flatter minima, while significantly improving training efficiency.
These findings demonstrate that mixed-precision training is not merely an optimization for hardware
utilization, but a convergence-shaping strategy that influences training dynamics and model
robustness.
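
A minimal sketch of the recipe described above, assuming a PyTorch-style automatic mixed-precision setup: forward and backward passes run in reduced (FP16) precision, the optimizer updates FP32 master parameters, and dynamic loss scaling guards against gradient underflow. The model, optimizer, and data shapes below are placeholders for illustration, not the configuration used in the study.

    import torch
    from torch import nn

    # Placeholder model and optimizer; the parameters act as the FP32 "master" copy.
    model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    # Dynamic loss scaling: the scale grows while steps succeed and shrinks on overflow.
    scaler = torch.cuda.amp.GradScaler()

    def train_step(inputs, targets):
        optimizer.zero_grad(set_to_none=True)
        # Selective precision: autocast runs throughput-bound ops in FP16 and keeps
        # numerically sensitive ops in FP32.
        with torch.cuda.amp.autocast():
            loss = loss_fn(model(inputs), targets)
        # Scale the loss so small gradients do not underflow to zero in FP16.
        scaler.scale(loss).backward()
        # Unscale and apply the update to the FP32 master parameters; the step is
        # skipped and the scale reduced if an overflow is detected.
        scaler.step(optimizer)
        scaler.update()
        return loss.item()

    # Hypothetical usage (requires a CUDA device):
    x = torch.randn(64, 512, device="cuda")
    y = torch.randint(0, 10, (64,), device="cuda")
    print(train_step(x, y))

The scaler's update rule is what makes the scaling "dynamic": successful steps gradually raise the scale to keep small gradients representable, while a detected overflow skips the update and lowers the scale.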

Published

2024-12-03

How to Cite

Rowan Westerdale. (2024). Mixed-Precision Training Efficiency in Compute Constrained ML Systems. Journal of Artificial Intelligence in Fluid Dynamics, 3(2), 28–34. Retrieved from https://theeducationjournals.com/index.php/jaifd/article/view/345
