Optoelectronic nonlinear Softmax operator based on diffractive neural networks

Opt Express. 2024 Jul 15;32(15):26458-26469. doi: 10.1364/OE.527843.

Abstract

Softmax, a pervasive nonlinear operation, plays a pivotal role in numerous statistical and deep learning (DL) models such as ChatGPT. Computing it is expensive, especially for large-scale models. Several software and hardware speed-up strategies have been proposed, but they still suffer from low efficiency and poor scalability. Here we propose a photonic-computing solution comprising a massive number of programmable neurons that is capable of executing this operation in an accurate, computation-efficient, robust and scalable manner. Experimental results show that our diffraction-based computing system exhibits salient generalization ability in diverse artificial and real-world tasks (mean square error <10^-5). We further analyze its performance under several realistic restricting factors. Such a flexible system not only contributes to optimizing the Softmax operation mechanism but may also inspire the manufacture of a plug-and-play module for general optoelectronic accelerators.
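
For readers unfamiliar with the operation and the error metric quoted above, the following is a minimal software sketch of Softmax and of a mean-square-error comparison against a reference. It is not the optical implementation described in the paper. The function names, the input vector, and the perturbation standing in for a measured output are all hypothetical, chosen only for illustration.

```python
import math

def softmax(xs):
    """Softmax of a list of floats: exp(x_i) / sum_j exp(x_j).

    Subtracting the maximum first leaves the result unchanged
    mathematically but avoids overflow in exp for large inputs.
    """
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mean_square_error(pred, ref):
    """Average of squared element-wise differences."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)

# Reference result computed in software (illustrative input).
reference = softmax([1.0, 2.0, 3.0])

# Hypothetical "measured" output: the reference plus small made-up errors.
# These numbers are NOT data from the paper.
measured = [p + d for p, d in zip(reference, [1e-3, -2e-3, 1e-3])]

mse = mean_square_error(measured, reference)
print(reference, mse)
```

With these made-up deviations of order 10^-3 per element, the mean square error falls below 10^-5, which gives a rough sense of the per-element accuracy such a threshold implies.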