-
Mashee at SemEval-2024 Task 8: The Impact of Samples Quality on the Performance of In-Context Learning for Machine Text Classification
Authors:
Areeg Fahad Rasheed,
M. Zarkoosh
Abstract:
Within few-shot learning, in-context learning (ICL) has become a potential method for leveraging contextual information to improve model performance on small amounts of data or in resource-constrained environments where training models on large datasets is prohibitive. However, the quality of the selected sample in a few shots severely limits the usefulness of ICL. The primary goal of this paper is to enhance the performance of evaluation metrics for in-context learning by selecting high-quality samples in few-shot learning scenarios. We employ the chi-square test to identify high-quality samples and compare the results with those obtained using low-quality samples. Our findings demonstrate that utilizing high-quality samples leads to improved performance with respect to all evaluated metrics.
Submitted 28 May, 2024;
originally announced June 2024.
-
Diagnostic Digital Twin for Anomaly Detection in Floating Offshore Wind Energy
Authors:
Florian Stadtmann,
Adil Rasheed
Abstract:
The demand for condition-based and predictive maintenance is rising across industries, especially for remote, high-value, and high-risk assets. In this article, the diagnostic digital twin concept is introduced, discussed, and implemented for a floating offshore turbine. A diagnostic digital twin is a virtual representation of an asset that combines real-time data and models to monitor damage, detect anomalies, and diagnose failures, thereby enabling condition-based and predictive maintenance. By applying diagnostic digital twins to offshore assets, unexpected failures can be alleviated, but the implementation can prove challenging. Here, a diagnostic digital twin is implemented for an operational floating offshore wind turbine. The asset is monitored through measurements. Unsupervised learning methods are employed to build a normal operation model, detect anomalies, and provide a fault diagnosis. Warnings and diagnoses are sent through text messages, and a more detailed diagnosis can be accessed in a virtual reality interface. The diagnostic digital twin successfully detected an anomaly with high confidence hours before a failure occurred. The paper concludes by discussing diagnostic digital twins in the broader context of offshore engineering. The presented approach can be generalized to other offshore assets to improve maintenance and increase the lifetime, efficiency, and sustainability of offshore assets.
Submitted 4 June, 2024;
originally announced June 2024.
-
Hal: A Language-General Framework for Analysis of User-Specified Monotone Frameworks [DRAFT]
Authors:
Abdullah Rasheed
Abstract:
Writing dataflow analyzers requires both language and domain-specificity. That is to say, each programming language and each program property requires its own analyzer. To enable a streamlined, user-driven approach to dataflow analyzers, we introduce the theoretical framework for a user-specified dataflow analysis. This framework is constructed in such a way that the user has to specify as little as possible, while the analyzer infers and computes everything else, including interprocedural embellishments. This theoretical framework was also implemented in Java, where users can specify a program property alongside minimal extra information to induce a dataflow analysis. This framework (both theoretical and in implementation) is language-general, meaning that it is independent of syntax and semantics (as all necessary syntactic and semantic information is provided by the user, and this information is provided only once for a given language). In this paper, we introduce basic notions of intraprocedural and interprocedural dataflow analyses, the proposed "Implicit Monotone Framework," and a rigorous framework for partial functions as a property space.
Submitted 19 May, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
-
Exploring Urban Mobility Trends using Cellular Network Data
Authors:
Oluwaleke Yusuf,
Adil Rasheed,
Frank Lindseth
Abstract:
The growth of urban areas intensifies the need for sustainable, efficient transportation infrastructure and mobility systems, driving initiatives to enhance infrastructure and public transport while reducing congestion and emissions. By utilizing real-world mobility data, a data-driven approach can provide crucial insights for planning and decision-making.
This study explores the efficacy of leveraging telecoms data from cellular network signals for studying crowd movement patterns, focusing on Trondheim, Norway. It examines routing reports to understand the spatiotemporal dynamics of various transportation routes and modes.
A data preprocessing and feature engineering framework was developed to process raw routing reports for historical analysis. This enabled the examination of geospatial trends and temporal patterns, including a comparative analysis of various transportation modes, along with public transit usage. Specific routes and areas were analyzed in-depth to compare their mobility patterns with the broader city context.
The study highlights the potential of cellular network data as a resource for shaping urban transportation and mobility systems. By identifying deficiencies and potential improvements, city planners and stakeholders can foster more sustainable and effective transportation solutions.
Submitted 31 March, 2024;
originally announced April 2024.
-
Privacy Re-identification Attacks on Tabular GANs
Authors:
Abdallah Alshantti,
Adil Rasheed,
Frank Westad
Abstract:
Generative models are subject to overfitting and thus may potentially leak sensitive information from the training data. In this work, we investigate the privacy risks that can potentially arise from the use of generative adversarial networks (GANs) for creating tabular synthetic datasets. For this purpose, we analyse the effects of re-identification attacks on synthetic data, i.e., attacks which aim at selecting samples that are predicted to correspond to memorised training samples based on their proximity to the nearest synthetic records. We thus consider multiple settings where different attackers might have different access levels or knowledge of the generative model and predictive, and assess which information is potentially most useful for launching more successful re-identification attacks. In doing so we also consider the situation for which re-identification attacks are formulated as reconstruction attacks, i.e., the situation where an attacker uses evolutionary multi-objective optimisation for perturbing synthetic samples closer to the training space. The results indicate that attackers can indeed pose major privacy risks by selecting synthetic samples that are likely representative of memorised training samples. In addition, we notice that privacy threats considerably increase when the attacker either has knowledge or has black-box access to the generative models. We also find that reconstruction attacks through multi-objective optimisation even increase the risk of identifying confidential samples.
Submitted 31 March, 2024;
originally announced April 2024.
-
Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance
Authors:
Thomas Nakken Larsen,
Eirik Runde Barlaug,
Adil Rasheed
Abstract:
Modern control systems are increasingly turning to machine learning algorithms to augment their performance and adaptability. Within this context, Deep Reinforcement Learning (DRL) has emerged as a promising control framework, particularly in the domain of marine transportation. Its potential for autonomous marine applications lies in its ability to seamlessly combine path-following and collision avoidance with an arbitrary number of obstacles. However, current DRL algorithms require disproportionally large computational resources to find near-optimal policies compared to the posed control problem when the searchable parameter space becomes large. To combat this, our work delves into the application of Variational AutoEncoders (VAEs) to acquire a generalized, low-dimensional latent encoding of a high-fidelity range-finding sensor, which serves as the exteroceptive input to a DRL agent. The agent's performance, encompassing path-following and collision avoidance, is systematically tested and evaluated within a stochastic simulation environment, presenting a comprehensive exploration of our proposed approach in maritime control systems.
Submitted 31 March, 2024;
originally announced April 2024.
-
Computationally and Memory-Efficient Robust Predictive Analytics Using Big Data
Authors:
Daniel Menges,
Adil Rasheed
Abstract:
In the current data-intensive era, big data has become a significant asset for Artificial Intelligence (AI), serving as a foundation for developing data-driven models and providing insight into various unknown fields. This study navigates through the challenges of data uncertainties, storage limitations, and predictive data-driven modeling using big data. We utilize Robust Principal Component Analysis (RPCA) for effective noise reduction and outlier elimination, and Optimal Sensor Placement (OSP) for efficient data compression and storage. The proposed OSP technique enables data compression without substantial information loss while simultaneously reducing storage needs. While RPCA offers an enhanced alternative to traditional Principal Component Analysis (PCA) for high-dimensional data management, the scope of this work extends its utilization, focusing on robust, data-driven modeling applicable to huge data sets in real-time. For that purpose, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, are applied to model and predict data based on a low-dimensional subset obtained from OSP, leading to a crucial acceleration of the training phase. LSTMs are feasible for capturing long-term dependencies in time series data, making them particularly suited for predicting the future states of physical systems on historical data. All the presented algorithms are not only theorized but also simulated and validated using real thermal imaging data mapping a ship's engine.
Submitted 27 March, 2024;
originally announced March 2024.
-
Digital Twin for Wind Energy: Latest updates from the NorthWind project
Authors:
Adil Rasheed,
Florian Stadtmann,
Eivind Fonn,
Mandar Tabib,
Vasileios Tsiolakis,
Balram Panjwani,
Kjetil Andre Johannessen,
Trond Kvamsdal,
Omer San,
John Olav Tande,
Idar Barstad,
Tore Christiansen,
Elling Rishoff,
Lars Frøyd,
Tore Rasmussen
Abstract:
NorthWind, a collaborative research initiative supported by the Research Council of Norway, industry stakeholders, and research partners, aims to advance cutting-edge research and innovation in wind energy. The core mission is to reduce wind power costs and foster sustainable growth, with a key focus on the development of digital twins. A digital twin is a virtual representation of physical assets or processes that uses data and simulators to enable real-time forecasting, optimization, monitoring, control and informed decision-making. Recently, a hierarchical scale ranging from 0 to 5 (0 - Standalone, 1 - Descriptive, 2 - Diagnostic, 3 - Predictive, 4 - Prescriptive, 5 - Autonomous) has been introduced within the NorthWind project to assess the capabilities of digital twins. This paper elaborates on our progress in constructing digital twins for wind farms and their components across various capability levels.
Submitted 26 March, 2024; v1 submitted 21 February, 2024;
originally announced March 2024.
-
Digital Twin of Autonomous Surface Vessels for Safe Maritime Navigation Enabled through Predictive Modeling and Reinforcement Learning
Authors:
Daniel Menges,
Andreas Von Brandis,
Adil Rasheed
Abstract:
Autonomous surface vessels (ASVs) play an increasingly important role in the safety and sustainability of open sea operations. Since most maritime accidents are related to human failure, intelligent algorithms for autonomous collision avoidance and path following can drastically reduce the risk in the maritime sector. A digital twin (DT) is a virtual representation of a real physical system and can enhance the situational awareness (SITAW) of such an ASV to generate optimal decisions. This work builds on an existing DT framework for ASVs and demonstrates foundations for enabling predictive, prescriptive, and autonomous capabilities. In this context, sophisticated target tracking approaches are crucial for estimating and predicting the position and motion of other dynamic objects. The applied tracking method is enabled by real-time automatic identification system (AIS) data and synthetic light detection and ranging (Lidar) measurements. To guarantee safety during autonomous operations, we applied a predictive safety filter, based on the concept of nonlinear model predictive control (NMPC). The approaches are implemented into a DT built with the Unity game engine. As a result, this work demonstrates the potential of a DT capable of making predictions, playing through various what-if scenarios, and providing optimal control decisions according to its enhanced SITAW.
Submitted 30 March, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
Modular Control Architecture for Safe Marine Navigation: Reinforcement Learning and Predictive Safety Filters
Authors:
Aksel Vaaler,
Svein Jostein Husa,
Daniel Menges,
Thomas Nakken Larsen,
Adil Rasheed
Abstract:
Many autonomous systems face safety challenges, requiring robust closed-loop control to handle physical limitations and safety constraints. Real-world systems, like autonomous ships, encounter nonlinear dynamics and environmental disturbances. Reinforcement learning is increasingly used to adapt to complex scenarios, but standard frameworks ensuring safety and stability are lacking. Predictive Safety Filters (PSF) offer a promising solution, ensuring constraint satisfaction in learning-based control without explicit constraint handling. This modular approach allows using arbitrary control policies, with the safety filter optimizing proposed actions to meet physical and safety constraints. We apply this approach to marine navigation, combining RL with PSF on a simulated Cybership II model. The RL agent is trained on path following and collision avoidance, while the PSF monitors and modifies control actions for safety. Results demonstrate the PSF's effectiveness in maintaining safety without hindering the RL agent's learning rate and performance, evaluated against a standard RL agent without PSF.
Submitted 2 April, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
PG-Video-LLaVA: Pixel Grounding Large Video-Language Models
Authors:
Shehan Munasinghe,
Rusiru Thushara,
Muhammad Maaz,
Hanoona Abdul Rasheed,
Salman Khan,
Mubarak Shah,
Fahad Khan
Abstract:
Extending image-based Large Multimodal Models (LMMs) to videos is challenging due to the inherent complexity of video data. The recent approaches extending image-based LMMs to videos either lack the grounding capabilities (e.g., VideoChat, Video-ChatGPT, Video-LLaMA) or do not utilize the audio-signals for better video understanding (e.g., Video-ChatGPT). Addressing these gaps, we propose PG-Video-LLaVA, the first LMM with pixel-level grounding capability, integrating audio cues by transcribing them into text to enrich video-context understanding. Our framework uses an off-the-shelf tracker and a novel grounding module, enabling it to spatially localize objects in videos following user instructions. We evaluate PG-Video-LLaVA using video-based generative and question-answering benchmarks and introduce new benchmarks specifically designed to measure prompt-based object grounding performance in videos. Further, we propose the use of Vicuna over GPT-3.5, as utilized in Video-ChatGPT, for video-based conversation benchmarking, ensuring reproducibility of results, which is a concern with the proprietary nature of GPT-3.5. Our framework builds on the SoTA image-based LLaVA model and extends its advantages to the video domain, delivering promising gains on video-based conversation and grounding tasks. Project Page: https://github.com/mbzuai-oryx/Video-LLaVA
Submitted 13 December, 2023; v1 submitted 22 November, 2023;
originally announced November 2023.
-
Enhancing wind field resolution in complex terrain through a knowledge-driven machine learning approach
Authors:
Jacob Wulff Wold,
Florian Stadtmann,
Adil Rasheed,
Mandar Tabib,
Omer San,
Jan-Tore Horn
Abstract:
Atmospheric flows are governed by a broad variety of spatio-temporal scales, thus making real-time numerical modeling of such turbulent flows in complex terrain at high resolution computationally intractable. In this study, we demonstrate a neural network approach motivated by Enhanced Super-Resolution Generative Adversarial Networks to upscale low-resolution wind fields to generate high-resolution wind fields in an actual wind farm in Bessaker, Norway. The neural network-based model is shown to successfully reconstruct fully resolved 3D velocity fields from a coarser scale while respecting the local terrain, and to easily outperform trilinear interpolation. We also demonstrate that by using an appropriate cost function based on domain knowledge, we can alleviate the use of adversarial training.
Submitted 2 April, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
CasTGAN: Cascaded Generative Adversarial Network for Realistic Tabular Data Synthesis
Authors:
Abdallah Alshantti,
Damiano Varagnolo,
Adil Rasheed,
Aria Rahmati,
Frank Westad
Abstract:
Generative adversarial networks (GANs) have drawn considerable attention in recent years for their proven capability in generating synthetic data which can be utilised for multiple purposes. While GANs have demonstrated tremendous successes in producing synthetic data samples that replicate the dynamics of the original datasets, the validity of the synthetic data and the underlying privacy concerns represent major challenges which are not sufficiently addressed. In this work, we design a cascaded tabular GAN framework (CasTGAN) for generating realistic tabular data with a specific focus on the validity of the output. In this context, validity refers to the dependency between features that can be found in the real data, but is typically misrepresented by traditional generative models. Our key idea entails that employing a cascaded architecture in which a dedicated generator samples each feature, the synthetic output becomes more representative of the real data. Our experimental results demonstrate that our model is capable of generating synthetic tabular data that can be used for fitting machine learning models. In addition, our model captures well the constraints and the correlations between the features of the real data, especially the high dimensional datasets. Furthermore, we evaluate the risk of white-box privacy attacks on our model and subsequently show that applying some perturbations to the auxiliary learners in CasTGAN increases the overall robustness of our model against targeted attacks.
Submitted 22 January, 2024; v1 submitted 1 July, 2023;
originally announced July 2023.
-
Digital Twins in Wind Energy: Emerging Technologies and Industry-Informed Future Directions
Authors:
Florian Stadtmann,
Adil Rasheed,
Trond Kvamsdal,
Kjetil André Johannessen,
Omer San,
Konstanze Kölle,
John Olav Giæver Tande,
Idar Barstad,
Alexis Benhamou,
Thomas Brathaug,
Tore Christiansen,
Anouk-Letizia Firle,
Alexander Fjeldly,
Lars Frøyd,
Alexander Gleim,
Alexander Høiberget,
Catherine Meissner,
Guttorm Nygård,
Jørgen Olsen,
Håvard Paulshus,
Tore Rasmussen,
Elling Rishoff,
Francesco Scibilia,
John Olav Skogås
Abstract:
This article presents a comprehensive overview of the digital twin technology and its capability levels, with a specific focus on its applications in the wind energy industry. It consolidates the definitions of digital twin and its capability levels on a scale from 0-5; 0-standalone, 1-descriptive, 2-diagnostic, 3-predictive, 4-prescriptive, 5-autonomous. It then, from an industrial perspective, identifies the current state of the art and research needs in the wind energy sector. The article proposes approaches to the identified challenges from the perspective of research institutes and offers a set of recommendations for diverse stakeholders to facilitate the acceptance of the technology. The contribution of this article lies in its synthesis of the current state of knowledge and its identification of future research needs and challenges from an industry perspective, ultimately providing a roadmap for future research and development in the field of digital twin and its applications in the wind energy industry.
Submitted 14 October, 2023; v1 submitted 16 April, 2023;
originally announced April 2023.
-
Deep active learning for nonlinear system identification
Authors:
Erlend Torje Berg Lundby,
Adil Rasheed,
Ivar Johan Halvorsen,
Dirk Reinhardt,
Sebastien Gros,
Jan Tommy Gravdahl
Abstract:
The exploding research interest for neural networks in modeling nonlinear dynamical systems is largely explained by the networks' capacity to model complex input-output relations directly from data. However, they typically need vast training data before they can be put to any good use. The data generation process for dynamical systems can be an expensive endeavor both in terms of time and resources. Active learning addresses this shortcoming by acquiring the most informative data, thereby reducing the need to collect enormous datasets. What makes the current work unique is integrating the deep active learning framework into nonlinear system identification. We formulate a general static deep active learning acquisition problem for nonlinear system identification. This is enabled by exploring system dynamics locally in different regions of the input space to obtain a simulated dataset covering the broader input space. This simulated dataset can be used in a static deep active learning acquisition scheme referred to as global explorations. The global exploration acquires a batch of initial states corresponding to the most informative state-action trajectories according to a batch acquisition function. The local exploration solves an optimal control problem, finding the control trajectory that maximizes some measure of information. After a batch of informative initial states is acquired, a new round of local explorations from the initial states in the batch is conducted to obtain a set of corresponding control trajectories that are to be applied on the system dynamics to get data from the system. Information measures used in the acquisition scheme are derived from the predictive variance of an ensemble of neural networks. The novel method outperforms standard data acquisition methods used for system identification of nonlinear dynamical systems in the case study performed on simulated data.
Submitted 24 February, 2023;
originally announced February 2023.
-
Sparse neural networks with skip-connections for identification of aluminum electrolysis cell
Authors:
Erlend Torje Berg Lundby,
Haakon Robinsson,
Adil Rasheed,
Ivar Johan Halvorsen,
Jan Tommy Gravdahl
Abstract:
Neural networks are rapidly gaining interest in nonlinear system identification due to the model's ability to capture complex input-output relations directly from data. However, despite the flexibility of the approach, there are still concerns about the safety of these models in this context, as well as the need for large amounts of potentially expensive data. Aluminum electrolysis is a highly nonlinear production process, and most of the data must be sampled manually, making the sampling process expensive and infrequent. In the case of infrequent measurements of state variables, the accuracy and open-loop stability of the long-term predictions become highly important. Standard neural networks struggle to provide stable long-term predictions with limited training data. In this work, we investigate the effect of combining concatenated skip-connections and the sparsity-promoting $\ell_1$ regularization on the open-loop stability and accuracy of forecasts with short, medium, and long prediction horizons. The case study is conducted on a high-dimensional and nonlinear simulator representing an aluminum electrolysis cell's mass and energy balance. The proposed model structure contains concatenated skip connections from the input layer and all intermittent layers to the output layer, referred to as InputSkip. $\ell_1$ regularized InputSkip is called sparse InputSkip. The results show that sparse InputSkip outperforms dense and sparse standard feedforward neural networks and dense InputSkip regarding open-loop stability and long-term predictive accuracy. The results are significant when models are trained on datasets of all sizes (small, medium, and large training sets) and for all prediction horizons (short, medium, and long prediction horizons).
Submitted 27 April, 2023; v1 submitted 2 January, 2023;
originally announced January 2023.
-
Artificial intelligence-driven digital twin of a modern house demonstrated in virtual reality
Authors:
Elias Mohammed Elfarri,
Adil Rasheed,
Omer San
Abstract:
A digital twin is a powerful tool that can help monitor and optimize physical assets in real-time. Simply put, it is a virtual representation of a physical asset, enabled through data and simulators, that can be used for a variety of purposes such as prediction, monitoring, and decision-making. However, the concept of digital twin can be vague and difficult to understand, which is why a new concept called "capability level" has been introduced. This concept categorizes digital twins based on their capability and defines a scale from zero to five, with each level indicating an increasing level of functionality. These levels are standalone, descriptive, diagnostic, predictive, prescriptive, and autonomous. By understanding the capability level of a digital twin, we can better understand its potential and limitations. To demonstrate the concepts, we use a modern house as an example. The house is equipped with a range of sensors that collect data about its internal state, which can then be used to create digital twins of different capability levels. These digital twins can be visualized in virtual reality, allowing users to interact with and manipulate the virtual environment. The current work not only presents a blueprint for developing digital twins but also suggests future research directions to enhance this technology. Digital twins have the potential to transform the way we monitor and optimize physical assets, and by understanding their capabilities, we can unlock their full potential.
Submitted 27 February, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
A novel corrective-source term approach to modeling unknown physics in aluminum extraction process
Authors:
Haakon Robinson,
Erlend Lundby,
Adil Rasheed,
Jan Tommy Gravdahl
Abstract:
With the ever-increasing availability of data, there has been an explosion of interest in applying modern machine learning methods to fields such as modeling and control. However, despite the flexibility and surprising accuracy of such black-box models, it remains difficult to trust them. Recent efforts to combine data-driven and physics-based approaches aim to develop flexible models that nonetheless generalize well, a paradigm we call Hybrid Analysis and Modeling (HAM). In this work, we investigate the Corrective Source Term Approach (CoSTA), which uses a data-driven model to correct a misspecified physics-based model. This enables us to develop models that make accurate predictions even when the underlying physics of the problem is not well understood. We apply CoSTA to model the Hall-Héroult process in an aluminum electrolysis cell. We demonstrate that the method improves both accuracy and predictive stability, yielding an overall more trustworthy model.
Submitted 10 February, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
-
Sparse deep neural networks for modeling aluminum electrolysis dynamics
Authors:
Erlend Torje Berg Lundby,
Adil Rasheed,
Ivar Johan Halvorsen,
Jan Tommy Gravdahl
Abstract:
Deep neural networks have become very popular in modeling complex nonlinear processes due to their extraordinary ability to fit arbitrary nonlinear functions from data with minimal expert intervention. However, they are almost always overparameterized and challenging to interpret due to their internal complexity. Furthermore, the optimization process to find the learned model parameters can be unstable due to the process getting stuck in local minima. In this work, we demonstrate the value of sparse regularization techniques to significantly reduce the model complexity. We demonstrate this for the case of an aluminium extraction process, which is a highly nonlinear system with many interrelated subprocesses. We trained a densely connected deep neural network to model the process and then compared the effects of sparsity-promoting $\ell_1$ regularization on generalizability, interpretability, and training stability. We found that the regularization significantly reduces model complexity compared to a corresponding dense neural network. We argue that this makes the model more interpretable, and show that training an ensemble of sparse neural networks with different parameter initializations often converges to similar model structures with similar learned input features. Furthermore, the empirical study shows that the resulting sparse models generalize better from small training sets than their dense counterparts.
Submitted 13 January, 2023; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Prospects of federated machine learning in fluid dynamics
Authors:
Omer San,
Suraj Pawar,
Adil Rasheed
Abstract:
Physics-based models have been mainstream in fluid dynamics for developing predictive models. In recent years, machine learning has offered a renaissance to the fluid community due to the rapid developments in data science, processing units, neural-network-based technologies, and sensor adaptations. So far, in many applications in fluid dynamics, machine learning approaches have mostly focused on a standard process that requires centralizing the training data on a designated machine or in a data center. In this letter, we present a federated machine learning approach that enables localized clients to collaboratively learn an aggregated and shared predictive model while keeping all the training data on each edge device. We demonstrate the feasibility and prospects of such a decentralized learning approach by building a deep learning surrogate model for reconstructing spatiotemporal fields. Our results indicate that federated machine learning might be a viable tool for designing highly accurate predictive decentralized digital twins relevant to fluid dynamics.
Submitted 15 August, 2022;
originally announced August 2022.
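The federated setting described in the abstract above (clients train locally, only model parameters are shared and aggregated) can be illustrated with a toy example. The sketch below assumes a sample-size-weighted average of client parameters, commonly called federated averaging; the abstract does not confirm that this is the authors' aggregation rule. The scalar model `y = w * x`, the function names, and the synthetic client data are hypothetical and stand in for the deep learning surrogate mentioned in the abstract.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(w: float, data: List[Sample], lr: float = 0.1, steps: int = 5) -> float:
    """A few gradient-descent steps on one client's own data for the model y = w * x."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(weights: List[float], sizes: List[int]) -> float:
    """Server-side aggregation: average client weights, weighted by sample count."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


def train(clients: List[List[Sample]], rounds: int = 20) -> float:
    w_global = 0.0
    for _ in range(rounds):
        # Each client starts from the shared model; raw data never leaves the client.
        local = [local_update(w_global, data) for data in clients]
        w_global = federated_average(local, [len(d) for d in clients])
    return w_global


# Two clients whose private data both follow y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (3.0, 9.0)],
]
w = train(clients)
```

Only the scalar weights cross the client boundary in this sketch, which is the property the abstract emphasizes; a realistic setup would exchange full network parameter sets instead.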
-
Variational multiscale reinforcement learning for discovering reduced order closure models of nonlinear spatiotemporal transport systems
Authors:
Omer San,
Suraj Pawar,
Adil Rasheed
Abstract:
A central challenge in the computational modeling and simulation of a multitude of science applications is to achieve robust and accurate closures for their coarse-grained representations due to underlying highly nonlinear multiscale interactions. These closure models are common in many nonlinear spatiotemporal systems to account for losses due to reduced order representations, including many transport phenomena in fluids. Previous data-driven closure modeling efforts have mostly focused on supervised learning approaches using high fidelity simulation data. On the other hand, reinforcement learning (RL) is a powerful yet relatively uncharted method in spatiotemporally extended systems. In this study, we put forth a modular dynamic closure modeling and discovery framework to stabilize the Galerkin projection based reduced order models that may arise in many nonlinear spatiotemporal dynamical systems with quadratic nonlinearity. However, a key element in creating a robust RL agent is to introduce a feasible reward function, which can be constructed from any difference metric between the RL model and high fidelity simulation data. First, we introduce a multi-modal RL (MMRL) approach to discover mode-dependent closure policies that utilize the high fidelity data in rewarding our RL agent. We then formulate a variational multiscale RL (VMRL) approach to discover closure models without requiring access to the high fidelity data in designing the reward function. Specifically, our chief innovation is to leverage variational multiscale formalism to quantify the difference between modal interactions in Galerkin systems. Our results in simulating the viscous Burgers equation indicate that the proposed VMRL method leads to robust and accurate closure parameterizations, and it may potentially be used to discover scale-aware closure models for complex dynamical systems.
Submitted 7 July, 2022;
originally announced July 2022.
-
Decentralized digital twins of complex dynamical systems
Authors:
Omer San,
Suraj Pawar,
Adil Rasheed
Abstract:
In this paper, we introduce a decentralized digital twin (DDT) framework for dynamical systems and discuss the prospects of the DDT modeling paradigm in computational science and engineering applications. The DDT approach is built on a federated learning concept, a branch of machine learning that encourages knowledge sharing without sharing the actual data. This approach enables clients to collaboratively learn an aggregated model while keeping all the training data on each client. We demonstrate the feasibility of the DDT framework with various dynamical systems, which are often considered prototypes for modeling complex transport phenomena in spatiotemporally extended systems. Our results indicate that federated machine learning might be a key enabler for designing highly accurate decentralized digital twins in complex nonlinear spatiotemporal systems.
Submitted 7 July, 2022;
originally announced July 2022.
-
Combining physics-based and data-driven techniques for reliable hybrid analysis and modeling using the corrective source term approach
Authors:
Sindre Stenen Blakseth,
Adil Rasheed,
Trond Kvamsdal,
Omer San
Abstract:
Upcoming technologies like digital twins, autonomous systems, and artificial intelligence systems involving safety-critical applications require models which are accurate, interpretable, computationally efficient, and generalizable. Unfortunately, the two most commonly used modeling approaches, physics-based modeling (PBM) and data-driven modeling (DDM), fail to satisfy all these requirements. In the current work, we demonstrate how a hybrid approach combining the best of PBM and DDM can result in models which outperform them both. We do so by combining partial differential equations based on first principles describing partially known physics with a black box DDM, in this case, a deep neural network model compensating for the unknown physics. First, we present a mathematical argument for why this approach should work and then apply the hybrid approach to model a two-dimensional heat diffusion problem with an unknown source term. The result demonstrates the method's superior performance in terms of accuracy and generalizability. Additionally, it is shown how the DDM part can be interpreted within the hybrid framework to make the overall approach reliable.
Submitted 7 June, 2022;
originally announced June 2022.
-
Physics Guided Machine Learning for Variational Multiscale Reduced Order Modeling
Authors:
Shady E. Ahmed,
Omer San,
Adil Rasheed,
Traian Iliescu,
Alessandro Veneziani
Abstract:
We propose a new physics guided machine learning (PGML) paradigm that leverages the variational multiscale (VMS) framework and available data to dramatically increase the accuracy of reduced order models (ROMs) at a modest computational cost. The hierarchical structure of the ROM basis and the VMS framework enable a natural separation of the resolved and unresolved ROM spatial scales. Modern PGML algorithms are used to construct novel models for the interaction among the resolved and unresolved ROM scales. Specifically, the new framework builds ROM operators that are closest to the true interaction terms in the VMS framework. Finally, machine learning is used to reduce the projection error and further increase the ROM accuracy. Our numerical experiments for a two-dimensional vorticity transport problem show that the novel PGML-VMS-ROM paradigm maintains the low computational cost of current ROMs, while significantly increasing the ROM accuracy.
Submitted 24 May, 2022;
originally announced May 2022.
-
Physics guided neural networks for modelling of non-linear dynamics
Authors:
Haakon Robinson,
Suraj Pawar,
Adil Rasheed,
Omer San
Abstract:
The success of the current wave of artificial intelligence can be partly attributed to deep neural networks, which have proven to be very effective in learning complex patterns from large datasets with minimal human intervention. However, it is difficult to train these models on complex dynamical systems from data alone due to their low data efficiency and sensitivity to hyperparameters and initialisation. This work demonstrates that injection of partially known information at an intermediate layer in a DNN can improve model accuracy, reduce model uncertainty, and yield improved convergence during the training. The value of these physics-guided neural networks has been demonstrated by learning the dynamics of a wide variety of nonlinear dynamical systems represented by five well-known equations in nonlinear systems theory: the Lotka-Volterra, Duffing, Van der Pol, Lorenz, and Henon-Heiles systems.
Submitted 13 May, 2022;
originally announced May 2022.
-
Data Processing Framework for Ship Performance Analysis
Authors:
Prateek Gupta,
Young-Rong Kim,
Sverre Steen,
Adil Rasheed
Abstract:
The hydrodynamic performance of a sea-going ship can be analysed using the data obtained from the ship. Such data can be gathered from different sources, like onboard recorded in-service data, AIS data, and noon reports. Each of these sources is known to have its own inherent problems. The current work gives a brief introduction to these data sources as well as the common problems associated with them, along with some examples. In order to resolve most of these problems, a streamlined semi-automatic data processing framework for fast data processing is developed and presented here. The data processing framework can be used to process the data obtained from any of the three sources mentioned above. The framework incorporates processing steps like interpolating weather hindcast (metocean) data to the ship's location in time, deriving additional features, validating data, estimating resistance components, data cleaning, and outlier detection. A brief description of each of the processing steps is provided with examples from existing datasets. The processed data can be further used to analyse the hydrodynamic performance of a ship.
Submitted 2 February, 2022;
originally announced February 2022.
-
Nonlinear proper orthogonal decomposition for convection-dominated flows
Authors:
Shady E. Ahmed,
Omer San,
Adil Rasheed,
Traian Iliescu
Abstract:
Autoencoder techniques find increasingly common use in reduced order modeling as a means to create a latent space. This reduced order representation offers a modular data-driven modeling approach for nonlinear dynamical systems when integrated with a time series predictive model. In this letter, we put forth a nonlinear proper orthogonal decomposition (POD) framework, which is an end-to-end Galerkin-free model combining autoencoders with long short-term memory networks for dynamics. By eliminating the projection error due to the truncation of Galerkin models, a key enabler of the proposed nonintrusive approach is the kinematic construction of a nonlinear mapping between the full-rank expansion of the POD coefficients and the latent space where the dynamics evolve. We test our framework for model reduction of a convection-dominated system, which is generally challenging for reduced order models. Our approach not only improves the accuracy, but also significantly reduces the computational cost of training and testing.
Submitted 5 November, 2021; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Ship Performance Monitoring using Machine-learning
Authors:
Prateek Gupta,
Adil Rasheed,
Sverre Steen
Abstract:
The hydrodynamic performance of a sea-going ship varies over its lifespan due to factors like marine fouling and the condition of the anti-fouling paint system. In order to accurately estimate the power demand and fuel consumption for a planned voyage, it is important to assess the hydrodynamic performance of the ship. The current work uses machine-learning (ML) methods to estimate the hydrodynamic performance of a ship using the onboard recorded in-service data. Three ML methods, NL-PCR, NL-PLSR and probabilistic ANN, are calibrated using the data from two sister ships. The calibrated models are used to extract the varying trend in ship's hydrodynamic performance over time and predict the change in performance through several propeller and hull cleaning events. The predicted change in performance is compared with the corresponding values estimated using the fouling friction coefficient ($ΔC_F$). The ML methods are found to be performing well while modelling the hydrodynamic state variables of the ships with probabilistic ANN model performing the best, but the results from NL-PCR and NL-PLSR are not far behind, indicating that it may be possible to use simple methods to solve such problems with the help of domain knowledge.
Submitted 13 December, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
Deep neural network enabled corrective source term approach to hybrid analysis and modeling
Authors:
Sindre Stenen Blakseth,
Adil Rasheed,
Trond Kvamsdal,
Omer San
Abstract:
In this work, we introduce, justify and demonstrate the Corrective Source Term Approach (CoSTA) -- a novel approach to Hybrid Analysis and Modeling (HAM). The objective of HAM is to combine physics-based modeling (PBM) and data-driven modeling (DDM) to create generalizable, trustworthy, accurate, computationally efficient and self-evolving models. CoSTA achieves this objective by augmenting the governing equation of a PBM model with a corrective source term generated using a deep neural network. In a series of numerical experiments on one-dimensional heat diffusion, CoSTA is found to outperform comparable DDM and PBM models in terms of accuracy -- often reducing predictive errors by several orders of magnitude -- while also generalizing better than pure DDM. Due to its flexible but solid theoretical foundation, CoSTA provides a modular framework for leveraging novel developments within both PBM and DDM. Its theoretical foundation also ensures that CoSTA can be used to model any system governed by (deterministic) partial differential equations. Moreover, CoSTA facilitates interpretation of the DNN-generated source term within the context of PBM, which results in improved explainability of the DNN. These factors make CoSTA a potential door-opener for data-driven techniques to enter high-stakes applications previously reserved for pure PBM.
Submitted 30 November, 2021; v1 submitted 24 May, 2021;
originally announced May 2021.
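The core CoSTA idea stated in the abstract above, adding a corrective source term to the governing equation of a physics-based model, can be shown on a one-dimensional heat equation. The sketch below is an illustration under stated assumptions, not the authors' code: it uses an explicit Euler step with fixed end values, and the deep neural network that generates the source term in the paper is replaced by a hypothetical `residual_source` function that computes the exact correction from a known reference step. All function names, the grid, and the true source profile are invented for the example.

```python
import math
from typing import List


def laplacian(u: List[float], i: int, dx: float) -> float:
    """Second-order central difference at interior node i."""
    return (u[i - 1] - 2.0 * u[i] + u[i + 1]) / dx ** 2


def pbm_step(u: List[float], dt: float, dx: float, sigma: List[float]) -> List[float]:
    """One explicit-Euler step of u_t = u_xx + sigma; the end values are kept fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + dt * (laplacian(u, i, dx) + sigma[i])
    return new


def residual_source(u: List[float], u_next_ref: List[float],
                    dt: float, dx: float) -> List[float]:
    """Source term that makes pbm_step map u onto u_next_ref at interior nodes.

    Stand-in for the learned correction: a network would have to predict
    this quantity without access to u_next_ref.
    """
    sigma = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        sigma[i] = (u_next_ref[i] - u[i]) / dt - laplacian(u, i, dx)
    return sigma


n, dx, dt = 11, 0.1, 0.002            # dt below the explicit stability limit dx**2 / 2
x = [i * dx for i in range(n)]
u0 = [math.sin(math.pi * xi) for xi in x]
true_source = [1.0 + xi for xi in x]  # physics that the misspecified PBM lacks

reference = pbm_step(u0, dt, dx, true_source)    # step with the full physics
uncorrected = pbm_step(u0, dt, dx, [0.0] * n)    # misspecified PBM, no source term
sigma_hat = residual_source(u0, reference, dt, dx)
corrected = pbm_step(u0, dt, dx, sigma_hat)      # PBM plus corrective source term

err_uncorrected = max(abs(a - b) for a, b in zip(uncorrected, reference))
err_corrected = max(abs(a - b) for a, b in zip(corrected, reference))
```

With the exact correction the corrected step matches the reference to rounding error, while the uncorrected step is off by roughly `dt` times the missing source. In practice the quality of the hybrid model depends on how well the learned term approximates this residual.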
-
Self-Supervised Learning for Fine-Grained Visual Categorization
Authors:
Muhammad Maaz,
Hanoona Abdul Rasheed,
Dhanalaxmi Gaddam
Abstract:
Recent research in self-supervised learning (SSL) has shown its capability in learning useful semantic representations from images for classification tasks. Through our work, we study the usefulness of SSL for Fine-Grained Visual Categorization (FGVC). FGVC aims to distinguish objects of visually similar subcategories within a general category. The small inter-class but large intra-class variations within the dataset make it a challenging task. The limited availability of annotated labels for such fine-grained data encourages the use of SSL, where additional supervision can boost learning without the cost of extra annotations. Our baseline achieves $86.36\%$ top-1 classification accuracy on the CUB-200-2011 dataset by utilizing random crop augmentation during training and center crop augmentation during testing. In this work, we explore the usefulness of various pretext tasks, specifically, rotation, pretext invariant representation learning (PIRL), and deconstruction and construction learning (DCL) for FGVC. Rotation as an auxiliary task pushes the model to learn global features and diverts it from focusing on the subtle details. PIRL, which uses jigsaw patches, attempts to focus on discriminative local regions but struggles to accurately localize them. DCL helps in learning local discriminating features and outperforms the baseline by achieving $87.41\%$ top-1 accuracy. The deconstruction learning forces the model to focus on local object parts, while reconstruction learning helps in learning the correlation between the parts. We perform extensive experiments to support our findings. Our code is available at https://github.com/mmaaz60/ssl_for_fgvc.
Submitted 18 May, 2021;
originally announced May 2021.
-
Hybrid analysis and modeling, eclecticism, and multifidelity computing toward digital twin revolution
Authors:
Omer San,
Adil Rasheed,
Trond Kvamsdal
Abstract:
Most modeling approaches fall into one of two categories: physics-based or data-driven. Recently, a third approach which is a combination of these deterministic and statistical models is emerging for scientific applications. To leverage these developments, our aim in this perspective paper is centered around exploring numerous principal concepts to address the challenges of (i) trustworthiness and generalizability in developing data-driven models to shed light on understanding the fundamental trade-offs in their accuracy and efficiency, and (ii) seamless integration of interface learning and multifidelity coupling approaches that transfer and represent information between different entities, particularly when different scales are governed by different physics, each operating on a different level of abstraction. Addressing these challenges could enable the revolution of digital twin technologies for scientific and engineering applications.
Submitted 26 March, 2021;
originally announced March 2021.
-
Geometric Change Detection in Digital Twins using 3D Machine Learning
Authors:
Tiril Sundby,
Julia Maria Graham,
Adil Rasheed,
Mandar Tabib,
Omer San
Abstract:
Digital twins are meant to bridge the gap between real-world physical systems and virtual representations. Both stand-alone and descriptive digital twins incorporate 3D geometric models, which are the physical representations of objects in the digital replica. Digital twin applications are required to rapidly update internal parameters with the evolution of their physical counterpart. Due to an essential need for having high-quality geometric models for accurate physical representations, the storage and bandwidth requirements for storing 3D model information can quickly exceed the available storage and bandwidth capacity. In this work, we demonstrate a novel approach to geometric change detection in the context of a digital twin. We address the issue through a combined solution of Dynamic Mode Decomposition (DMD) for motion detection, YOLOv5 for object detection, and 3D machine learning for pose estimation. DMD is applied for background subtraction, enabling detection of moving foreground objects in real-time. The video frames containing detected motion are extracted and used as input to the change detection network. The object detection algorithm YOLOv5 is applied to extract the bounding boxes of detected objects in the video frames. Furthermore, the rotational pose of each object is estimated in a 3D pose estimation network. A series of convolutional neural networks conducts feature extraction from images and 3D model shapes. Then, the network outputs the estimated Euler angles of the camera orientation with respect to the object in the input image. By only storing data associated with a detected change in pose, we minimize necessary storage and bandwidth requirements while still being able to recreate the 3D scene on demand.
Submitted 15 March, 2021;
originally announced March 2021.
-
District Wise Price Forecasting of Wheat in Pakistan using Deep Learning
Authors:
Ahmed Rasheed,
Muhammad Shahzad Younis,
Farooq Ahmad,
Junaid Qadir,
Muhammad Kashif
Abstract:
Wheat is the main agricultural crop of Pakistan and a staple food of almost every Pakistani household, making it the country's main strategic commodity, whose availability and affordability are the government's main priority. Wheat availability can be strongly affected by multiple factors, including but not limited to production, consumption, financial crises, inflation, and market volatility. The government ensures food security through particular policy and monetary arrangements, which keep up purchase parity for the poor. Such arrangements can be made more effective if a dynamic analysis is carried out to estimate the future yield based on certain current factors. Future planning of commodity pricing becomes achievable by forecasting future prices from current circumstances. This paper presents a wheat price forecasting methodology, which takes the price, weather, production, and consumption trends for wheat over the past few years and analyzes them with Long Short-Term Memory (LSTM) networks, an advanced neural network architecture. The proposed methodology produced significantly better results than other conventional machine learning and statistical time series analysis methods.
Submitted 5 March, 2021;
originally announced March 2021.
-
Use of Transfer Learning and Wavelet Transform for Breast Cancer Detection
Authors:
Ahmed Rasheed,
Muhammad Shahzad Younis,
Junaid Qadir,
Muhammad Bilal
Abstract:
Breast cancer is one of the most common causes of death among women. Mammography is a widely used imaging modality that can detect cancer in its early stages. Deep learning is widely used for the detection of cancerous masses in the images obtained via mammography. The need to improve accuracy remains constant due to the sensitive nature of the datasets, so we introduce segmentation and wavelet transform to enhance the important features in the image scans. Our proposed system aids the radiologist in the screening phase of cancer detection by using a combination of segmentation and wavelet transforms as pre-processing augmentation that leads to transfer learning in neural networks. The proposed system with these pre-processing techniques significantly increases the accuracy of detection on Mini-MIAS.
Submitted 5 March, 2021;
originally announced March 2021.
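The abstract above names wavelet transforms as a pre-processing step but does not say which wavelet or how many levels are used. As a rough illustration only, the following sketch applies a single-level 2D Haar transform with NumPy; the choice of the Haar wavelet, the function name `haar2d`, and the random test image are assumptions, not details from the paper.

```python
import numpy as np

def haar2d(img):
    """Single-level 2D Haar transform of an image with even height and width.

    Returns four half-resolution sub-bands. In this sketch the first letter
    refers to filtering along rows (axis 0) and the second along columns
    (axis 1): 'l' = low-pass (average), 'h' = high-pass (difference).
    """
    s = np.sqrt(2.0)
    low = (img[0::2, :] + img[1::2, :]) / s    # low-pass along axis 0
    high = (img[0::2, :] - img[1::2, :]) / s   # high-pass along axis 0
    ll = (low[:, 0::2] + low[:, 1::2]) / s     # smooth approximation
    lh = (low[:, 0::2] - low[:, 1::2]) / s     # detail along axis 1
    hl = (high[:, 0::2] + high[:, 1::2]) / s   # detail along axis 0
    hh = (high[:, 0::2] - high[:, 1::2]) / s   # diagonal detail
    return ll, lh, hl, hh

# Hypothetical stand-in for a mammogram patch.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
ll, lh, hl, hh = haar2d(image)
```

Because the Haar filters are orthonormal, the four sub-bands together carry the same energy as the input, which is a convenient sanity check before feeding them to a network.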
-
Accelerating Recursive Partition-Based Causal Structure Learning
Authors:
Md. Musfiqur Rahman,
Ayman Rasheed,
Md. Mosaddek Khan,
Mohammad Ali Javidian,
Pooyan Jamshidi,
Md. Mamun-Or-Rashid
Abstract:
Causal structure discovery from observational data is fundamental to the causal understanding of autonomous systems such as medical decision support systems, advertising campaigns and self-driving cars. This is essential to solve well-known causal decision making and prediction problems associated with those real-world applications. Recently, recursive causal discovery algorithms have gained particular attention among the research community due to their ability to provide good results by using Conditional Independence (CI) tests in smaller sub-problems. However, each such algorithm needs a refinement function to remove undesired causal relations from the discovered graphs. Notably, as the problem size increases, the computation cost (i.e., the number of CI tests) of the refinement function makes an algorithm expensive to deploy in practice. This paper proposes a generic causal structure refinement strategy that can locate the undesired relations with a small number of CI tests, thus speeding up the algorithm for large and complex problems. We theoretically prove the correctness of our algorithm. We then empirically evaluate its performance against the state-of-the-art algorithms in terms of solution quality and completion time on synthetic and real datasets.
Submitted 23 February, 2021;
originally announced February 2021.
-
Uncertainty aware and explainable diagnosis of retinal disease
Authors:
Amitojdeep Singh,
Sourya Sengupta,
Mohammed Abdul Rasheed,
Varadharajan Jayakumar,
Vasudevan Lakshminarayanan
Abstract:
Deep learning methods for ophthalmic diagnosis have shown considerable success in tasks like segmentation and classification. However, their widespread application is limited because the models are opaque and vulnerable to making a wrong decision in complicated cases. Explainability methods show the features that a system used to make a prediction, while uncertainty awareness is the ability of a system to highlight when it is not sure about the decision. This is one of the first studies using uncertainty and explanations for informed clinical decision making. We perform uncertainty analysis of a deep learning model for diagnosis of four retinal diseases - age-related macular degeneration (AMD), central serous retinopathy (CSR), diabetic retinopathy (DR), and macular hole (MH) - using images from a publicly available (OCTID) dataset. Monte Carlo (MC) dropout is used at test time to generate a distribution of parameters, and the predictions approximate the predictive posterior of a Bayesian model. A threshold is computed using the distribution, and uncertain cases can be referred to the ophthalmologist, thus avoiding an erroneous diagnosis. The features learned by the model are visualized using a proven attribution method from a previous study. The effects of uncertainty on model performance and the relationship between uncertainty and explainability are discussed in terms of clinical significance. The uncertainty information along with the heatmaps makes the system more trustworthy for use in clinical settings.
Submitted 26 January, 2021;
originally announced January 2021.
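The test-time MC dropout procedure described in the abstract can be sketched in a few lines. This is a minimal illustration on a tiny, randomly initialised two-layer network, not the model from the paper: the network sizes, the dropout rate, the number of samples, and the referral threshold are all hypothetical.

```python
import numpy as np

def mc_dropout_predict(x, W1, b1, W2, b2, n_samples=200, p_drop=0.5, rng=None):
    """Average class probabilities over stochastic passes with dropout left on.

    Returns the mean probability vector and its entropy, used here as a
    simple uncertainty score.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.empty((n_samples, W2.shape[0]))
    for i in range(n_samples):
        h = np.maximum(W1 @ x + b1, 0.0)          # hidden ReLU layer
        mask = rng.random(h.shape) >= p_drop      # fresh dropout mask per pass
        h = h * mask / (1.0 - p_drop)             # inverted-dropout scaling
        logits = W2 @ h + b2
        e = np.exp(logits - logits.max())         # numerically stable softmax
        probs[i] = e / e.sum()
    mean = probs.mean(axis=0)
    entropy = -np.sum(mean * np.log(mean + 1e-12))
    return mean, entropy

# Hypothetical 4-class toy network and input.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), np.zeros(16)
W2, b2 = rng.normal(size=(4, 16)), np.zeros(4)
x = rng.normal(size=8)

mean_probs, entropy = mc_dropout_predict(x, W1, b1, W2, b2, rng=rng)
# Assumed rule: refer the case when entropy exceeds half of its maximum, log(4).
refer_to_clinician = entropy > 0.5 * np.log(4)
```

The abstract states that the threshold is computed from the distribution of predictions; the fixed fraction of the maximum entropy used here is only a placeholder for that step.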
-
Physics guided machine learning using simplified theories
Authors:
Suraj Pawar,
Omer San,
Burak Aksoylu,
Adil Rasheed,
Trond Kvamsdal
Abstract:
Recent applications of machine learning, in particular deep learning, motivate the need to address the generalizability of the statistical inference approaches in physical sciences. In this letter, we introduce a modular physics guided machine learning framework to improve the accuracy of such data-driven predictive engines. The chief idea in our approach is to augment the knowledge of the simplified theories with the underlying learning process. To emphasize their physical importance, our architecture adds certain features at intermediate layers rather than in the input layer. To demonstrate our approach, we select a canonical airfoil aerodynamic problem with the enhancement of the potential flow theory. In our training procedure, we include features obtained by a panel method that can be computed efficiently for an unseen configuration. By addressing the generalizability concerns, our results suggest that the proposed feature enhancement approach can be effectively used in many scientific machine learning applications, especially for systems where we can use a theoretical, empirical, or simplified model to guide the learning module.
Submitted 18 December, 2020;
originally announced December 2020.
-
On the effectiveness of signal decomposition, feature extraction and selection on lung sound classification
Authors:
Andrine Elsetrønning,
Adil Rasheed,
Jon Bekker,
Omer San
Abstract:
Lung sounds refer to the sound generated by air moving through the respiratory system. These sounds, like most biomedical signals, are non-linear and non-stationary. A vital part of using the lung sound for disease detection is discrimination between normal and abnormal lung sounds. In this paper, several approaches for classifying between no-crackle and crackle lung sounds are explored. Decomposition methods such as Empirical Mode Decomposition, Ensemble Empirical Mode Decomposition, and Discrete Wavelet Transform are used along with several feature extraction techniques, like Principal Component Analysis and Autoencoder, to explore how various classifiers perform for the given task. An open-source dataset downloaded from Kaggle, containing chest auscultation of varying quality, is used to determine the results of using the different decomposition and feature extraction combinations. It is found that the best performance is obtained when higher-order statistical and spectral features along with the Mel-frequency cepstral coefficients are fed to the classifier, with the kNN classifier giving the best accuracy. Furthermore, it is also demonstrated that by using a combination of feature selection methods one can significantly reduce the number of input features without adversely affecting the accuracy of the classifiers.
Submitted 21 December, 2020;
originally announced December 2020.
-
Quantitative and Qualitative Evaluation of Explainable Deep Learning Methods for Ophthalmic Diagnosis
Authors:
Amitojdeep Singh,
J. Jothi Balaji,
Mohammed Abdul Rasheed,
Varadharajan Jayakumar,
Rajiv Raman,
Vasudevan Lakshminarayanan
Abstract:
Background: The lack of explanations for the decisions made by algorithms such as deep learning has hampered their acceptance by the clinical community despite highly accurate results on multiple problems. Recently, attribution methods have emerged for explaining deep learning models, and they have been tested on medical imaging problems. The performance of attribution methods is compared on standard machine learning datasets and not on medical images. In this study, we perform a comparative analysis to determine the most suitable explainability method for retinal OCT diagnosis.
Methods: A commonly used deep learning model known as Inception v3 was trained to diagnose 3 retinal diseases - choroidal neovascularization (CNV), diabetic macular edema (DME), and drusen. The explanations from 13 different attribution methods were rated by a panel of 14 clinicians for clinical significance. Feedback was obtained from the clinicians regarding the current and future scope of such methods.
Results: An attribution method based on a Taylor series expansion, called Deep Taylor was rated the highest by clinicians with a median rating of 3.85/5. It was followed by two other attribution methods, Guided backpropagation and SHAP (SHapley Additive exPlanations).
Conclusion: Explanations of deep learning models can make them more transparent for clinical diagnosis. This study compared different explanation methods in the context of retinal OCT diagnosis and found that the best performing method may not be the one considered best for other deep learning tasks. Overall, there was a high degree of acceptance from the clinicians surveyed in the study.
Keywords: explainable AI, deep learning, machine learning, image processing, Optical coherence tomography, retina, Diabetic macular edema, Choroidal Neovascularization, Drusen
Submitted 24 March, 2021; v1 submitted 26 September, 2020;
originally announced September 2020.
-
A nudged hybrid analysis and modeling approach for realtime wake-vortex transport and decay prediction
Authors:
Shady Ahmed,
Suraj Pawar,
Omer San,
Adil Rasheed,
Mandar Tabib
Abstract:
We put forth a long short-term memory (LSTM) nudging framework for the enhancement of reduced order models (ROMs) of fluid flows utilizing noisy measurements for air traffic improvements. Toward emerging applications of digital twins in aviation, the proposed approach allows for constructing a realtime predictive tool for wake-vortex transport and decay systems. We build on the fact that in realistic applications, there are uncertainties in initial and boundary conditions, model parameters, as well as measurements. Moreover, conventional nonlinear ROMs based on Galerkin projection (GROMs) suffer from imperfection and solution instabilities, especially for advection-dominated flows with slow decay in the Kolmogorov width. In the presented LSTM nudging (LSTM-N) approach, we fuse forecasts from a combination of imperfect GROM and uncertain state estimates with sparse Eulerian sensor measurements to provide more reliable predictions in a dynamical data assimilation framework. We illustrate our concept by solving a two-dimensional vorticity transport equation. We investigate the effects of measurement noise and state estimate uncertainty on the performance of LSTM-N. We also demonstrate that it can sufficiently handle different levels of temporal and spatial measurement sparsity, and offers huge potential in developing next-generation digital twin technologies.
Submitted 5 March, 2021; v1 submitted 5 August, 2020;
originally announced August 2020.
-
Interface learning of multiphysics and multiscale systems
Authors:
Shady E. Ahmed,
Omer San,
Kursat Kara,
Rami Younis,
Adil Rasheed
Abstract:
Complex natural or engineered systems comprise multiple characteristic scales, multiple spatiotemporal domains, and even multiple physical closure laws. To address such challenges, we introduce an interface learning paradigm and put forth a data-driven closure approach based on memory embedding to provide physically correct boundary conditions at the interface. To enable interface learning for hyperbolic systems by taking the domain of influence and wave structures into account, we put forth the concept of upwind learning towards a physics-informed domain decomposition. The promise of the proposed approach is shown for a set of canonical illustrative problems. We highlight that high-performance computing environments can benefit from this methodology to reduce communication costs among processing units in emerging machine learning ready heterogeneous platforms toward the exascale era.
Submitted 31 October, 2020; v1 submitted 17 June, 2020;
originally announced June 2020.
-
Deep Reinforcement Learning Controller for 3D Path-following and Collision Avoidance by Autonomous Underwater Vehicles
Authors:
Simen Theie Havenstrøm,
Adil Rasheed,
Omer San
Abstract:
Control theory provides engineers with a multitude of tools to design controllers that manipulate the closed-loop behavior and stability of dynamical systems. These methods rely heavily on insights about the mathematical model governing the physical system. However, in complex systems, such as autonomous underwater vehicles performing the dual objective of path-following and collision avoidance, decision making becomes non-trivial. We propose a solution using state-of-the-art Deep Reinforcement Learning (DRL) techniques to develop autonomous agents capable of achieving this hybrid objective without having a priori knowledge about the goal or the environment. Our results demonstrate the viability of DRL in path-following and avoiding collisions toward achieving human-level decision making in autonomous vehicle systems within extreme obstacle configurations.
Submitted 17 June, 2020;
originally announced June 2020.
-
COLREG-Compliant Collision Avoidance for Unmanned Surface Vehicle using Deep Reinforcement Learning
Authors:
Eivind Meyer,
Amalie Heiberg,
Adil Rasheed,
Omer San
Abstract:
Path Following and Collision Avoidance, be it for unmanned surface vessels or other autonomous vehicles, are two fundamental guidance problems in robotics. For many decades, they have been subject to academic study, leading to a vast number of proposed approaches. However, they have mostly been treated as separate problems, and have typically relied on non-linear first-principles models with parameters that can only be determined experimentally. The rise of Deep Reinforcement Learning (DRL) in recent years suggests an alternative approach: end-to-end learning of the optimal guidance policy from scratch by means of a trial-and-error based approach. In this article, we explore the potential of Proximal Policy Optimization (PPO), a DRL algorithm with demonstrated state-of-the-art performance on Continuous Control tasks, when applied to the dual-objective problem of controlling an underactuated Autonomous Surface Vehicle in a COLREGs compliant manner such that it follows an a priori known desired path while avoiding collisions with other vessels along the way. Based on high-fidelity elevation and AIS tracking data from the Trondheim Fjord, an inlet of the Norwegian sea, we evaluate the trained agent's performance in challenging, dynamic real-world scenarios where the ultimate success of the agent rests upon its ability to navigate non-uniform marine terrain while handling challenging, but realistic vessel encounters.
Submitted 16 June, 2020;
originally announced June 2020.
-
Marine life through You Only Look Once's perspective
Authors:
Herman Stavelin,
Adil Rasheed,
Omer San,
Arne Johan Hestnes
Abstract:
With the growing focus on man-made changes to our planet and the wildlife therein, more and more emphasis is put on sustainable and responsible gathering of resources. In an effort to preserve maritime wildlife, the Norwegian government has decided that it is necessary to create an overview of the presence and abundance of various species of wildlife in the Norwegian fjords and oceans. In this paper we apply and analyze an object detection scheme that detects fish in camera images. The data is sampled from a submerged data station at Fulehuk in Norway. We implement You Only Look Once (YOLO) version 3 and create a dataset consisting of 99,961 images, achieving a mAP of $\sim 0.88$. We also investigate intermediate results within YOLO, gaining insight into how it performs object detection.
Submitted 11 February, 2020;
originally announced March 2020.
-
Proportional integral derivative controller assisted reinforcement learning for path following by autonomous underwater vehicles
Authors:
Simen Theie Havenstrøm,
Camilla Sterud,
Adil Rasheed,
Omer San
Abstract:
Control theory provides engineers with a multitude of tools to design controllers that manipulate the closed-loop behavior and stability of dynamical systems. These methods rely heavily on insights about the mathematical model governing the physical system. However, if a system is highly complex, it might be infeasible to produce a reliable mathematical model of the system. Without a model most of the theoretical tools to develop control laws break down. In these settings, machine learning controllers become attractive: Controllers that can learn and adapt to complex systems, developing control laws where the engineer cannot. This article focuses on utilizing machine learning controllers in practical applications, specifically using deep reinforcement learning in motion control systems for an autonomous underwater vehicle with six degrees-of-freedom. Two methods are considered: end-to-end learning, where the vehicle is left entirely alone to explore the solution space in its search for an optimal policy, and PID assisted learning, where the DRL controller is essentially split into three separate parts, each controlling its own actuator.
Submitted 22 September, 2020; v1 submitted 3 February, 2020;
originally announced February 2020.
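For readers unfamiliar with the classical building block named in the title, the following is a minimal sketch of a discrete PID controller driving a toy first-order plant. It is not the PID-assisted DRL scheme from the paper: the gains, the time step, and the plant model dy/dt = -y + u are assumptions chosen purely for illustration.

```python
class PID:
    """Textbook discrete PID controller (illustrative gains, not from the paper)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        # No derivative action on the very first call (no previous error yet).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps=3000, dt=0.01, setpoint=1.0):
    """Drive the assumed plant dy/dt = -y + u toward the setpoint (forward Euler)."""
    pid = PID(kp=2.0, ki=1.0, kd=0.05, dt=dt)
    y = 0.0
    for _ in range(steps):
        u = pid.step(setpoint, y)
        y += dt * (-y + u)
    return y


final_output = simulate()
```

The integral term is what removes the steady-state error here; with proportional action alone this plant would settle below the setpoint.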
-
Classification of Chest Diseases using Wavelet Transforms and Transfer Learning
Authors:
Ahmed Rasheed,
Muhammad Shahzad Younis,
Muhammad Bilal,
Maha Rasheed
Abstract:
The chest X-ray scan is a modality often used by radiologists to diagnose many chest-related diseases in their initial stages. The proposed system helps radiologists make decisions about the diseases found in the scans more efficiently. Our system combines image processing techniques for feature enhancement with deep learning for classification among diseases. We have used the ChestX-ray14 database to train our deep learning model on the 14 different labeled diseases found in it. The proposed research shows a significant improvement in the results from using wavelet transforms as a pre-processing technique.
Submitted 3 February, 2020;
originally announced February 2020.
-
Taming an autonomous surface vehicle for path following and collision avoidance using deep reinforcement learning
Authors:
Eivind Meyer,
Haakon Robinson,
Adil Rasheed,
Omer San
Abstract:
In this article, we explore the feasibility of applying proximal policy optimization, a state-of-the-art deep reinforcement learning algorithm for continuous control tasks, to the dual-objective problem of controlling an underactuated autonomous surface vehicle to follow an a priori known path while avoiding collisions with non-moving obstacles along the way. The artificially intelligent agent, which is equipped with multiple rangefinder sensors for obstacle detection, is trained and evaluated in a challenging, stochastically generated simulation environment based on the OpenAI Gym Python toolkit. Notably, the agent is provided with real-time insight into its own reward function, allowing it to dynamically adapt its guidance strategy. Depending on its strategy, which ranges from radical path-adherence to radical obstacle avoidance, the trained agent achieves an episodic success rate between 84 and 100%.
Submitted 18 December, 2019;
originally announced December 2019.
-
Dissecting Deep Neural Networks
Authors:
Haakon Robinson,
Adil Rasheed,
Omer San
Abstract:
In exchange for large quantities of data and processing power, deep neural networks have yielded models that provide state-of-the-art prediction capabilities in many fields. However, a lack of strong guarantees on their behaviour has raised concerns over their use in safety-critical applications. A first step to understanding these networks is to develop alternate representations that allow for further analysis. It has been shown that neural networks with piecewise affine activation functions are themselves piecewise affine, with their domains consisting of a vast number of linear regions. So far, the research on this topic has focused on counting the number of linear regions, rather than obtaining explicit piecewise affine representations. This work presents a novel algorithm that can compute the piecewise affine form of any fully connected neural network with rectified linear unit activations.
Submitted 19 January, 2020; v1 submitted 9 October, 2019;
originally announced October 2019.
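The piecewise affine property mentioned in the abstract can be checked numerically. The sketch below does not reproduce the paper's algorithm, which computes the full piecewise affine form; it only extracts the single affine piece that is active at one input point. The network sizes, random weights, and function names are hypothetical.

```python
import numpy as np

def forward(x, layers):
    """Fully connected ReLU network; the last layer is linear."""
    h = x
    for W, b in layers[:-1]:
        h = np.maximum(W @ h + b, 0.0)
    W, b = layers[-1]
    return W @ h + b

def local_affine_form(x, layers):
    """Return (A, c) such that f(z) = A @ z + c on the linear region containing x.

    Each ReLU layer is replaced by a 0/1 diagonal mask fixed by x's activation
    pattern, so the composition collapses to a single affine map.
    """
    A = np.eye(x.shape[0])
    c = np.zeros(x.shape[0])
    h = x
    for W, b in layers[:-1]:
        pre = W @ h + b
        active = (pre > 0).astype(float)   # activation pattern at x
        A = active[:, None] * (W @ A)      # zero the rows of inactive units
        c = active * (W @ c + b)
        h = np.maximum(pre, 0.0)
    W, b = layers[-1]
    return W @ A, W @ c + b

# Hypothetical 3 -> 5 -> 4 -> 2 network with random weights.
rng = np.random.default_rng(0)
sizes = [3, 5, 4, 2]
layers = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(sizes[:-1], sizes[1:])]
x = rng.normal(size=3)
A, c = local_affine_form(x, layers)
```

At the point `x`, `A @ x + c` reproduces `forward(x, layers)`, and the same `(A, c)` remains valid for nearby inputs as long as no hidden unit changes sign.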
-
Adaptive Group-based Zero Knowledge Proof-Authentication Protocol (AGZKP-AP) in Vehicular Ad Hoc Networks
Authors:
Amar A. Rasheed,
Rabi N. Mahapatra,
Felix G. Hamza-Lup
Abstract:
Vehicular Ad Hoc Networks (VANETs) are a particular subclass of mobile ad hoc networks that raise a number of security challenges, notably from the way users authenticate to the network. Authentication technologies based on existing security policies and access control rules in such networks assume full trust in the Roadside Unit (RSU) and authentication servers. The disclosure of authentication parameters enables user traceability over the network. VANETs' trusted entities (e.g. RSU) can utilize such information to track a user's traveling behavior, violating user privacy and anonymity. In this paper, we propose a novel, light-weight, Adaptive Group-based Zero Knowledge Proof-Authentication Protocol (AGZKP-AP) for VANETs. The proposed authentication protocol is capable of offering various levels of user privacy settings based on the type of services available on such networks. Our scheme is based on the Zero-Knowledge-Proof (ZKP) crypto approach with the support of trade-off options. Users have the option to make critical decisions on the level of privacy and the amount of resource usage they prefer, such as short system response time versus the number of private information disclosures. Furthermore, AGZKP-AP incorporates a distributed privilege control and revoking mechanism that renders a user's private information to law enforcement in case of a traffic violation.
Submitted 23 August, 2019;
originally announced August 2019.
-
A non-intrusive reduced order modeling framework for quasi-geostrophic turbulence
Authors:
Sk. Mashfiqur Rahman,
Suraj Pawar,
Omer San,
Adil Rasheed,
Traian Iliescu
Abstract:
In this study, we present a non-intrusive reduced order modeling (ROM) framework for large-scale quasi-stationary systems. The framework proposed herein exploits the time series prediction capability of long short-term memory (LSTM) recurrent neural network such that: (i) in the training phase, the LSTM model is trained on the modal coefficients extracted from the high-resolution data using proper orthogonal decomposition (POD) transform, and (ii) in the testing phase, the trained model predicts the modal coefficients for the total time recursively based on the initial time history. To illustrate the predictive performance of the proposed framework, the mean flow fields and time series response of the field values are reconstructed from the predicted modal coefficients by using an inverse POD transform. As a representative benchmark test case, we consider a two-dimensional quasi-geostrophic (QG) ocean circulation model which, in general, displays an enormous range of fluctuating spatial and temporal scales. We first illustrate that the conventional Galerkin projection based ROM of such systems requires a high number of POD modes to obtain a stable flow physics. In addition, ROM-GP does not seem to capture the intermittent bursts appearing in the dynamics of the first few most energetic modes. However, the proposed non-intrusive ROM framework based on LSTM (ROM-LSTM) yields a stable solution even for a small number of POD modes. We also observe that the ROM-LSTM model is able to capture quasi-periodic intermittent bursts accurately, and yields a stable and accurate mean flow dynamics using the time history of a few previous time states, denoted as the lookback time-window in this paper. Our findings suggest that the proposed ROM framework is capable of predicting noisy nonlinear fluid flows in an extremely efficient way compared to the conventional projection based ROM.
Submitted 23 June, 2019;
originally announced June 2019.
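The POD transform and inverse POD transform referred to in the abstract can be sketched with an SVD of the snapshot matrix. This illustration leaves out the LSTM entirely and uses a synthetic two-mode field rather than quasi-geostrophic data; the function names, grid sizes, and test signal are assumptions.

```python
import numpy as np

def pod_basis(snapshots, r):
    """Leading r POD modes of a snapshot matrix (rows: space, columns: time)."""
    mean = snapshots.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(snapshots - mean, full_matrices=False)
    return mean, U[:, :r], s

def pod_project(fields, mean, modes):
    """Forward POD transform: fields -> modal coefficients."""
    return modes.T @ (fields - mean)

def pod_reconstruct(coeffs, mean, modes):
    """Inverse POD transform: modal coefficients -> fields."""
    return mean + modes @ coeffs

# Synthetic field built from two spatial structures with different time dynamics.
x = np.linspace(0.0, 2.0 * np.pi, 64)[:, None]
t = np.linspace(0.0, 10.0, 100)[None, :]
data = np.sin(x) * np.cos(t) + 0.5 * np.sin(2.0 * x) * np.sin(3.0 * t)

mean, modes, singular_values = pod_basis(data, r=2)
coeffs = pod_project(data, mean, modes)          # shape (2, 100)
recon = pod_reconstruct(coeffs, mean, modes)
rel_error = np.linalg.norm(recon - data) / np.linalg.norm(data)
```

In a ROM-LSTM style workflow as described in the abstract, the rows of `coeffs` would be the time series a recurrent model is trained to predict, and `pod_reconstruct` would map its predictions back to the full field.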