Search | arXiv e-print repository

doi 10.1364/AOP.484119

Artificial Neural Networks for Photonic Applications: From Algorithms to Implementation

Authors: Pedro Freire, Egor Manuylovich, Jaroslaw E. Prilepsky, Sergei K. Turitsy

Abstract: This tutorial-review on applications of artificial neural networks in photonics targets a broad audience, ranging from optical research and engineering communities to computer science and applied mathematics. We focus here on the research areas at the interface between these disciplines, attempting to find the right balance between technical details specific to each domain and overall clarity. Fir… ▽ More This tutorial-review on applications of artificial neural networks in photonics targets a broad audience, ranging from optical research and engineering communities to computer science and applied mathematics. We focus here on the research areas at the interface between these disciplines, attempting to find the right balance between technical details specific to each domain and overall clarity. First, we briefly recall key properties and peculiarities of some core neural network types, which we believe are the most relevant to photonics, also linking the layer's theoretical design to some photonics hardware realizations. After that, we elucidate the question of how to fine-tune the selected model's design to perform the required task with optimized accuracy. Then, in the review part, we discuss recent developments and progress for several selected applications of neural networks in photonics, including multiple aspects relevant to optical communications, imaging, sensing, and the design of new materials and lasers. In the following section, we put a special emphasis on how to accurately evaluate the complexity of neural networks in the context of the transition from algorithms to hardware implementation. The introduced complexity characteristics are used to analyze the applications of neural networks in optical communications, as a specific, albeit highly important example, comparing those with some benchmark signal processing methods. We combine the description of the well-known model compression strategies used in machine learning, with some novel techniques introduced recently in optical applications of neural networks. It is important to stress that although our focus in this tutorial-review is on photonics, we believe that the methods and techniques presented here can be handy in a much wider range of scientific and engineering applications. △ Less

Submitted 2 August, 2024; originally announced August 2024.

Journal ref: Pedro Freire, Egor Manuylovich, Jaroslaw E. Prilepsky, and Sergei K. Turitsyn, "Artificial neural networks for photonic applications - from algorithms to implementation: tutorial," Adv. Opt. Photon. 15, 739-834 (2023)

arXiv:2402.11777 [pdf, other]

Uncovering Latent Human Wellbeing in Language Model Embeddings

Authors: Pedro Freire, ChengCheng Tan, Adam Gleave, Dan Hendrycks, Scott Emmons

Abstract: Do language models implicitly learn a concept of human wellbeing? We explore this through the ETHICS Utilitarianism task, assessing if scaling enhances pretrained models' representations. Our initial finding reveals that, without any prompt engineering or finetuning, the leading principal component from OpenAI's text-embedding-ada-002 achieves 73.9% accuracy. This closely matches the 74.6% of BERT… ▽ More Do language models implicitly learn a concept of human wellbeing? We explore this through the ETHICS Utilitarianism task, assessing if scaling enhances pretrained models' representations. Our initial finding reveals that, without any prompt engineering or finetuning, the leading principal component from OpenAI's text-embedding-ada-002 achieves 73.9% accuracy. This closely matches the 74.6% of BERT-large finetuned on the entire ETHICS dataset, suggesting pretraining conveys some understanding about human wellbeing. Next, we consider four language model families, observing how Utilitarianism accuracy varies with increased parameters. We find performance is nondecreasing with increased model size when using sufficient numbers of principal components. △ Less

Submitted 18 February, 2024; originally announced February 2024.

Comments: 10 pages, 5 figures, 1 table

ACM Class: I.2.7

arXiv:2307.15217 [pdf, other]

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Authors: Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen , et al. (7 additional authors not shown)

Abstract: Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popularity, there has been relatively little public work systematizing its flaws. In this paper, we (1) survey open problems and fundamental limitations of RLHF and rel… ▽ More Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popularity, there has been relatively little public work systematizing its flaws. In this paper, we (1) survey open problems and fundamental limitations of RLHF and related methods; (2) overview techniques to understand, improve, and complement RLHF in practice; and (3) propose auditing and disclosure standards to improve societal oversight of RLHF systems. Our work emphasizes the limitations of RLHF and highlights the importance of a multi-faceted approach to the development of safer AI systems. △ Less

Submitted 11 September, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.05374 [pdf, other]

Multi-Task Learning to Enhance Generalizability of Neural Network Equalizers in Coherent Optical Systems

Authors: Sasipim Srivallapanondh, Pedro J. Freire, Ashraful Alam, Nelson Costa, Bernhard Spinnler, Antonio Napoli, Egor Sedov, Sergei K. Turitsyn, Jaroslaw E. Prilepsky

Abstract: For the first time, multi-task learning is proposed to improve the flexibility of NN-based equalizers in coherent systems. A "single" NN-based equalizer improves Q-factor by up to 4 dB compared to CDC, without re-training, even with variations in launch power, symbol rate, or transmission distance. For the first time, multi-task learning is proposed to improve the flexibility of NN-based equalizers in coherent systems. A "single" NN-based equalizer improves Q-factor by up to 4 dB compared to CDC, without re-training, even with variations in launch power, symbol rate, or transmission distance. △ Less

Submitted 3 November, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

Comments: 4 pages, European Conference on Optical Communication (ECOC)

arXiv:2305.09495 [pdf, other]

Hardware Realization of Nonlinear Activation Functions for NN-based Optical Equalizers

Authors: Sasipim Srivallapanondh, Pedro J. Freire, Antonio Napoli, Sergei K. Turitsyn, Jaroslaw E. Prilepsky

Abstract: To reduce the complexity of the hardware implementation of neural network-based optical channel equalizers, we demonstrate that the performance of the biLSTM equalizer with approximated activation functions is close to that of the original model. To reduce the complexity of the hardware implementation of neural network-based optical channel equalizers, we demonstrate that the performance of the biLSTM equalizer with approximated activation functions is close to that of the original model. △ Less

Submitted 16 May, 2023; originally announced May 2023.

Comments: 2 pages, 1 figure, 1 table, Conference on Lasers & Electro-Optics 2023

arXiv:2304.04966 [pdf]

Computer Vision-Aided Intelligent Monitoring of Coffee: Towards Sustainable Coffee Production

Authors: Francisco Eron, Muhammad Noman, Raphael Ricon de Oliveira, Deigo de Souza Marques, Rafael Serapilha Durelli, Andre Pimenta Freire, Antonio Chalfun Junior

Abstract: Coffee which is prepared from the grinded roasted seeds of harvested coffee cherries, is one of the most consumed beverage and traded commodity, globally. To manually monitor the coffee field regularly, and inform about plant and soil health, as well as estimate yield and harvesting time, is labor-intensive, time-consuming and error-prone. Some recent studies have developed sensors for estimating… ▽ More Coffee which is prepared from the grinded roasted seeds of harvested coffee cherries, is one of the most consumed beverage and traded commodity, globally. To manually monitor the coffee field regularly, and inform about plant and soil health, as well as estimate yield and harvesting time, is labor-intensive, time-consuming and error-prone. Some recent studies have developed sensors for estimating coffee yield at the time of harvest, however a more inclusive and applicable technology to remotely monitor multiple parameters of the field and estimate coffee yield and quality even at pre-harvest stage, was missing. Following precision agriculture approach, we employed machine learning algorithm YOLO, for image processing of coffee plant. In this study, the latest version of the state-of-the-art algorithm YOLOv7 was trained with 324 annotated images followed by its evaluation with 82 unannotated images as test data. Next, as an innovative approach for annotating the training data, we trained K-means models which led to machine-generated color classes of coffee fruit and could thus characterize the informed objects in the image. Finally, we attempted to develop an AI-based handy mobile application which would not only efficiently predict harvest time, estimate coffee yield and quality, but also inform about plant health. Resultantly, the developed model efficiently analyzed the test data with a mean average precision of 0.89. Strikingly, our innovative semi-supervised method with an mean average precision of 0.77 for multi-class mode surpassed the supervised method with mean average precision of only 0.60, leading to faster and more accurate annotation. The mobile application we designed based on the developed code, was named CoffeApp, which possesses multiple features of analyzing fruit from the image taken by phone camera with in field and can thus track fruit ripening in real time. △ Less

Submitted 11 April, 2023; originally announced April 2023.

arXiv:2212.04703 [pdf, other]

doi 10.1109/JLT.2023.3272011

Implementing Neural Network-Based Equalizers in a Coherent Optical Transmission System Using Field-Programmable Gate Arrays

Authors: Pedro J. Freire, Sasipim Srivallapanondh, Michael Anderson, Bernhard Spinnler, Thomas Bex, Tobias A. Eriksson, Antonio Napoli, Wolfgang Schairer, Nelson Costa, Michaela Blott, Sergei K. Turitsyn, Jaroslaw E. Prilepsky

Abstract: In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to the FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardwa… ▽ More In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to the FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware. The performance in Q-factor is presented for the cases of bidirectional long-short-term memory coupled with convolutional NN (biLSTM + CNN) equalizer, CNN equalizer, and standard 1-StpS digital back-propagation (DBP) for the simulation and experiment propagation of a single channel dual-polarization (SC-DP) 16QAM at 34 GBd along 17x70km of LEAF. The biLSTM+CNN equalizer provides a similar result to DBP and a 1.7 dB Q-factor gain compared with the chromatic dispersion compensation baseline in the experimental dataset. After that, we assess the Q-factor and the impact of hardware utilization when approximating the activation functions of NN using Taylor series, piecewise linear, and look-up table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of hardware implementation to achieve 200G and 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA. △ Less

Submitted 19 February, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

Comments: Invited paper at Journal of Lightwave Technology - IEEE

arXiv:2212.04569 [pdf, other]

Knowledge Distillation Applied to Optical Channel Equalization: Solving the Parallelization Problem of Recurrent Connection

Authors: Sasipim Srivallapanondh, Pedro J. Freire, Bernhard Spinnler, Nelson Costa, Antonio Napoli, Sergei K. Turitsyn, Jaroslaw E. Prilepsky

Abstract: To circumvent the non-parallelizability of recurrent neural network-based equalizers, we propose knowledge distillation to recast the RNN into a parallelizable feedforward structure. The latter shows 38\% latency decrease, while impacting the Q-factor by only 0.5dB. To circumvent the non-parallelizability of recurrent neural network-based equalizers, we propose knowledge distillation to recast the RNN into a parallelizable feedforward structure. The latter shows 38\% latency decrease, while impacting the Q-factor by only 0.5dB. △ Less

Submitted 8 December, 2022; originally announced December 2022.

Comments: Paper Accepted for Oral presentation - OFC 2023 (Optical Fiber Communication Conference)

arXiv:2208.12866 [pdf, other]

doi 10.1109/JLT.2023.3234327

Reducing Computational Complexity of Neural Networks in Optical Channel Equalization: From Concepts to Implementation

Authors: Pedro J. Freire, Antonio Napoli, Diego Arguello Ron, Bernhard Spinnler, Michael Anderson, Wolfgang Schairer, Thomas Bex, Nelson Costa, Sergei K. Turitsyn, Jaroslaw E. Prilepsky

Abstract: In this paper, a new methodology is proposed that allows for the low-complexity development of neural network (NN) based equalizers for the mitigation of impairments in high-speed coherent optical transmission systems. In this work, we provide a comprehensive description and comparison of various deep model compression approaches that have been applied to feed-forward and recurrent NN designs. Add… ▽ More In this paper, a new methodology is proposed that allows for the low-complexity development of neural network (NN) based equalizers for the mitigation of impairments in high-speed coherent optical transmission systems. In this work, we provide a comprehensive description and comparison of various deep model compression approaches that have been applied to feed-forward and recurrent NN designs. Additionally, we evaluate the influence these strategies have on the performance of each NN equalizer. Quantization, weight clustering, pruning, and other cutting-edge strategies for model compression are taken into consideration. In this work, we propose and evaluate a Bayesian optimization-assisted compression, in which the hyperparameters of the compression are chosen to simultaneously reduce complexity and improve performance. In conclusion, the trade-off between the complexity of each compression approach and its performance is evaluated by utilizing both simulated and experimental data in order to complete the analysis. By utilizing optimal compression approaches, we show that it is possible to design an NN-based equalizer that is simpler to implement and has better performance than the conventional digital back-propagation (DBP) equalizer with only one step per span. This is accomplished by reducing the number of multipliers used in the NN equalizer after applying the weighted clustering and pruning algorithms. Furthermore, we demonstrate that an equalizer based on NN can also achieve superior performance while still maintaining the same degree of complexity as the full electronic chromatic dispersion compensation block. We conclude our analysis by highlighting open questions and existing challenges, as well as possible future research directions. △ Less

Submitted 26 November, 2022; v1 submitted 26 August, 2022; originally announced August 2022.

arXiv:2206.12191 [pdf, other]

Computational Complexity Evaluation of Neural Network Applications in Signal Processing

Authors: Pedro Freire, Sasipim Srivallapanondh, Antonio Napoli, Jaroslaw E. Prilepsky, Sergei K. Turitsyn

Abstract: In this paper, we provide a systematic approach for assessing and comparing the computational complexity of neural network layers in digital signal processing. We provide and link four software-to-hardware complexity measures, defining how the different complexity metrics relate to the layers' hyper-parameters. This paper explains how to compute these four metrics for feed-forward and recurrent la… ▽ More In this paper, we provide a systematic approach for assessing and comparing the computational complexity of neural network layers in digital signal processing. We provide and link four software-to-hardware complexity measures, defining how the different complexity metrics relate to the layers' hyper-parameters. This paper explains how to compute these four metrics for feed-forward and recurrent layers, and defines in which case we ought to use a particular metric depending on whether we characterize a more soft- or hardware-oriented application. One of the four metrics, called `the number of additions and bit shifts (NABS)', is newly introduced for heterogeneous quantization. NABS characterizes the impact of not only the bitwidth used in the operation but also the type of quantization used in the arithmetical operations. We intend this work to serve as a baseline for the different levels (purposes) of complexity estimation related to the neural networks' application in real-time digital signal processing, aiming at unifying the computational complexity estimation. △ Less

Submitted 10 March, 2024; v1 submitted 24 June, 2022; originally announced June 2022.

arXiv:2206.12180 [pdf, other]

Towards FPGA Implementation of Neural Network-Based Nonlinearity Mitigation Equalizers in Coherent Optical Transmission Systems

Authors: Pedro J. Freire, Michael Anderson, Bernhard Spinnler, Thomas Bex, Jaroslaw E. Prilepsky, Tobias A. Eriksson, Nelson Costa, Wolfgang Schairer, Michaela Blott, Antonio Napoli, Sergei K. Turitsyn

Abstract: For the first time, recurrent and feedforward neural network-based equalizers for nonlinearity compensation are implemented in an FPGA, with a level of complexity comparable to that of a dispersion equalizer. We demonstrate that the NN-based equalizers can outperform a 1 step-per-span DBP. For the first time, recurrent and feedforward neural network-based equalizers for nonlinearity compensation are implemented in an FPGA, with a level of complexity comparable to that of a dispersion equalizer. We demonstrate that the NN-based equalizers can outperform a 1 step-per-span DBP. △ Less

Submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted Oral in the European Conference on Optical Communication (ECOC) 2022

arXiv:2202.12689 [pdf, other]

Domain Adaptation: the Key Enabler of Neural Network Equalizers in Coherent Optical Systems

Authors: Pedro J. Freire, Bernhard Spinnler, Daniel Abode, Jaroslaw E. Prilepsky, Abdallah A. I. Ali, Nelson Costa, Wolfgang Schairer, Antonio Napoli, Andrew D. Ellis, Sergei K. Turitsyn

Abstract: We introduce the domain adaptation and randomization approach for calibrating neural network-based equalizers for real transmissions, using synthetic data. The approach renders up to 99\% training process reduction, which we demonstrate in three experimental setups. We introduce the domain adaptation and randomization approach for calibrating neural network-based equalizers for real transmissions, using synthetic data. The approach renders up to 99\% training process reduction, which we demonstrate in three experimental setups. △ Less

Submitted 25 February, 2022; originally announced February 2022.

Comments: Paper Accepted at OFC 2022

arXiv:2109.08711 [pdf, ps, other]

Experimental Evaluation of Computational Complexity for Different Neural Network Equalizers in Optical Communications

Authors: Pedro J. Freire, Yevhenii Osadchuk, Antonio Napoli, Bernhard Spinnler, Wolfgang Schairer, Nelson Costa, Jaroslaw E. Prilepsky, Sergei K. Turitsyn

Abstract: Addressing the neural network-based optical channel equalizers, we quantify the trade-off between their performance and complexity by carrying out the comparative analysis of several neural network architectures, presenting the results for TWC and SSMF set-ups. Addressing the neural network-based optical channel equalizers, we quantify the trade-off between their performance and complexity by carrying out the comparative analysis of several neural network architectures, presenting the results for TWC and SSMF set-ups. △ Less

Submitted 17 September, 2021; originally announced September 2021.

Comments: ORAL presentation at the Asia Communications and Photonics Conference (ACP 2021)

arXiv:2012.01365 [pdf, other]

DERAIL: Diagnostic Environments for Reward And Imitation Learning

Authors: Pedro Freire, Adam Gleave, Sam Toyer, Stuart Russell

Abstract: The objective of many real-world tasks is complex and difficult to procedurally specify. This makes it necessary to use reward or imitation learning algorithms to infer a reward or policy directly from human data. Existing benchmarks for these algorithms focus on realism, testing in complex environments. Unfortunately, these benchmarks are slow, unreliable and cannot isolate failures. As a complem… ▽ More The objective of many real-world tasks is complex and difficult to procedurally specify. This makes it necessary to use reward or imitation learning algorithms to infer a reward or policy directly from human data. Existing benchmarks for these algorithms focus on realism, testing in complex environments. Unfortunately, these benchmarks are slow, unreliable and cannot isolate failures. As a complementary approach, we develop a suite of simple diagnostic tasks that test individual facets of algorithm performance in isolation. We evaluate a range of common reward and imitation learning algorithms on our tasks. Our results confirm that algorithm performance is highly sensitive to implementation details. Moreover, in a case-study into a popular preference-based reward learning implementation, we illustrate how the suite can pinpoint design flaws and rapidly evaluate candidate solutions. The environments are available at https://github.com/HumanCompatibleAI/seals . △ Less

Submitted 2 December, 2020; originally announced December 2020.

arXiv:1807.09619 [pdf, other]

Multiple sclerosis lesion enhancement and white matter region estimation using hyperintensities in FLAIR images

Authors: Paulo G. L. Freire, Ricardo J. Ferrari

Abstract: Multiple sclerosis (MS) is a demyelinating disease that affects more than 2 million people worldwide. The most used imaging technique to help in its diagnosis and follow-up is magnetic resonance imaging (MRI). Fluid Attenuated Inversion Recovery (FLAIR) images are usually acquired in the context of MS because lesions often appear hyperintense in this particular image weight, making it easier for p… ▽ More Multiple sclerosis (MS) is a demyelinating disease that affects more than 2 million people worldwide. The most used imaging technique to help in its diagnosis and follow-up is magnetic resonance imaging (MRI). Fluid Attenuated Inversion Recovery (FLAIR) images are usually acquired in the context of MS because lesions often appear hyperintense in this particular image weight, making it easier for physicians to identify them. Though lesions have a bright intensity profile, it may overlap with white matter (WM) and gray matter (GM) tissues, posing difficulties to be accurately segmented. In this sense, we propose a lesion enhancement technique to dim down WM and GM regions and highlight hyperintensities, making them much more distinguishable than other tissues. We applied our technique to the ISBI 2015 MS Lesion Segmentation Challenge and took the average gray level intensity of MS lesions, WM and GM on FLAIR and enhanced images. The lesion intensity profile in FLAIR was on average 25% and 19% brighter than white matter and gray matter, respectively; comparatively, the same profile in our enhanced images was on average 444% and 264% brighter. Such results mean a significant improvement on the intensity distinction among these three clusters, which may come as aid both for experts and automated techniques. Moreover, a byproduct of our proposal is that the enhancement can be used to automatically estimate a mask encompassing WM and MS lesions, which may be useful for brain tissue volume assessment and improve MS lesion segmentation accuracy in future works. △ Less

Submitted 25 July, 2018; originally announced July 2018.

Showing 1–15 of 15 results for author: Freire, P