The quantization of synaptic weights using emerging nonvolatile memory (NVM) devices is a promising way to implement computationally efficient neural networks on resource-constrained hardware. However, the practical implementation of such synaptic weights is hampered by imperfect memory characteristics, specifically the limited number of quantized states available and the large intrinsic device variation and stochasticity involved in writing the synaptic states. This article presents on-chip training and inference of a neural network using a quantized magnetic domain wall (DW)-based synaptic array and CMOS peripheral circuits. The synapse is described by a rigorous model of the magnetic DW device that accounts for stochasticity and process variations. To achieve stable quantized weights, the DW is pinned by means of physical constrictions. Finally, a VGG8 architecture for CIFAR-10 image classification is simulated using the extracted synaptic device characteristics. Performance in terms of accuracy, energy, latency, and area is evaluated while considering the process variations and nonidealities in the DW device as well as the peripheral circuits. The proposed quantized neural network (QNN) architecture achieves efficient on-chip learning with 92.4% training accuracy and 90.4% inference accuracy. Compared with a pure CMOS-based design, it demonstrates an overall improvement in area, energy, and latency of 13.8×, 9.6×, and 3.5×, respectively.
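
The idea of storing a weight in one of a few pinned DW positions, with a stochastic write, can be illustrated with a minimal sketch. It is not the article's device model. The number of states (8), the weight range, the uniform level spacing, the miss probability `p_miss`, and the function names `quantize`, `level_value`, and `stochastic_write` are all hypothetical assumptions chosen for illustration.

```python
import random


def quantize(w, n_states=8, w_min=-1.0, w_max=1.0):
    """Return the index of the nearest of n_states uniformly spaced levels.

    Assumption: levels are uniformly spaced between w_min and w_max, and each
    index stands for one pinning site (constriction) along the device.
    """
    step = (w_max - w_min) / (n_states - 1)
    clipped = min(max(w, w_min), w_max)  # weights outside the range saturate
    return round((clipped - w_min) / step)


def level_value(idx, n_states=8, w_min=-1.0, w_max=1.0):
    """Map a state index back to the weight value it represents."""
    step = (w_max - w_min) / (n_states - 1)
    return w_min + idx * step


def stochastic_write(idx, n_states=8, p_miss=0.1, rng=random):
    """Toy write model (assumed): with probability p_miss the wall settles
    one pinning site away from the target, in a random direction."""
    if rng.random() < p_miss:
        idx += rng.choice((-1, 1))
    return min(max(idx, 0), n_states - 1)  # the wall cannot leave the device


# Usage: quantize a few weights, then apply the noisy write.
rng = random.Random(0)
targets = [quantize(w) for w in (-1.0, -0.2, 0.1, 0.9, 2.0)]
written = [stochastic_write(i, rng=rng) for i in targets]
stored = [level_value(i) for i in written]
```

A single-site miss is the simplest possible error model. A more faithful simulation would draw the final wall position from a device-level distribution that includes process variations, as the article's model does.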