-
SAfEPaTh: A System-Level Approach for Efficient Power and Thermal Estimation of Convolutional Neural Network Accelerator
Authors:
Yukai Chen,
Simei Yang,
Debjyoti Bhattacharjee,
Francky Catthoor,
Arindam Mallik
Abstract:
The design of energy-efficient, high-performance, and reliable Convolutional Neural Network (CNN) accelerators involves significant challenges due to complex power and thermal management issues. This paper introduces SAfEPaTh, a novel system-level approach for accurately estimating power and temperature in tile-based CNN accelerators. By addressing both steady-state and transient-state scenarios, SAfEPaTh effectively captures the dynamic effects of pipeline bubbles in interlayer pipelines, utilizing real CNN workloads for comprehensive evaluation. Unlike traditional methods, it eliminates the need for circuit-level simulations or on-chip measurements. Our methodology leverages TANIA, a cutting-edge hybrid digital-analog tile-based accelerator featuring analog-in-memory computing cores alongside digital cores. Through rigorous simulation results using the ResNet18 model, we demonstrate SAfEPaTh's capability to accurately estimate power and temperature within 500 seconds, encompassing CNN model accelerator mapping exploration and detailed power and thermal estimations. This efficiency and accuracy make SAfEPaTh an invaluable tool for designers, enabling them to optimize performance while adhering to stringent power and thermal constraints. Furthermore, SAfEPaTh's adaptability extends its utility across various CNN models and accelerator architectures, underscoring its broad applicability in the field. This study contributes significantly to the advancement of energy-efficient and reliable CNN accelerator designs, addressing critical challenges in dynamic power and thermal management.
Submitted 24 July, 2024;
originally announced July 2024.
-
Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference
Authors:
Joyjit Kundu,
Wenzhe Guo,
Ali BanaGozar,
Udari De Alwis,
Sourav Sengupta,
Puneet Gupta,
Arindam Mallik
Abstract:
Aligning future system design with the ever-increasing compute needs of large language models (LLMs) is undoubtedly an important problem in today's world. Here, we propose a general performance modeling methodology and workload analysis of distributed LLM training and inference through an analytical framework that accurately considers compute, memory sub-system, network, and various parallelization strategies (model parallel, data parallel, pipeline parallel, and sequence parallel). We validate our performance predictions with published data from literature and relevant industry vendors (e.g., NVIDIA). For distributed training, we investigate the memory footprint of LLMs for different activation re-computation methods, dissect the key factors behind the massive performance gain from A100 to B200 ($\sim$35x speed-up, closely following NVIDIA's scaling trend), and further run a design space exploration at different technology nodes (12 nm to 1 nm) to study the impact of logic, memory, and network scaling on the performance. For inference, we analyze the compute versus memory boundedness of different operations at a matrix-multiply level for different GPU systems and further explore the impact of DRAM memory technology scaling on inference latency. Utilizing our modeling framework, we reveal the evolution of performance bottlenecks for both LLM training and inference with technology scaling, thus providing insights to design future systems for LLM training and inference.
Submitted 19 July, 2024;
originally announced July 2024.
-
A Performance Analysis Modeling Framework for Extended Reality Applications in Edge-Assisted Wireless Networks
Authors:
Anik Mallik,
Jiang Xie,
Zhu Han
Abstract:
Extended reality (XR) is at the center of attention in the research community due to the emergence of augmented, mixed, and virtual reality applications. The performance of such applications must be tightly controlled to meet the requirements of latency, energy consumption, and freshness of data. Therefore, a comprehensive performance analysis model is required to assess the effectiveness of an XR application, but it is challenging to design because the performance metrics depend on several difficult-to-model parameters, such as computing resources and hardware utilization of XR and edge devices, which are controlled by both their operating systems and the application itself. Moreover, the heterogeneity in devices and wireless access networks brings additional challenges in modeling. In this paper, we propose a novel modeling framework for performance analysis of XR applications considering edge-assisted wireless networks and validate the model with experimental data collected from testbeds designed specifically for XR applications. In addition, we present the challenges associated with performance analysis modeling and describe methods to overcome them in detail. Finally, the performance evaluation shows that the proposed analytical model can analyze XR applications' performance with high accuracy compared to the state-of-the-art analytical models.
Submitted 11 May, 2024;
originally announced May 2024.
-
Extremal minimal bipartite matching covered graphs
Authors:
Amit Kumar Mallik,
Ajit A. Diwan,
Nishad Kothari
Abstract:
A connected graph, on four or more vertices, is matching covered if every edge is present in some perfect matching. An ear decomposition theorem (similar to the one for $2$-connected graphs) exists for bipartite matching covered graphs due to Hetyei. From the results and proofs of Lovász and Plummer, that rely on Hetyei's theorem, one may deduce that any minimal bipartite matching covered graph has at least $2(m-n+2)$ vertices of degree two (where minimal means that deleting any edge results in a graph that is not matching covered); such a graph is said to be extremal if it attains the stated lower bound.
In this paper, we provide a complete characterization of the class of extremal minimal bipartite matching covered graphs. In particular, we prove that every such graph $G$ is obtained from two copies of a tree devoid of degree two vertices, say $T$ and $T'$, by adding edges -- each of which joins a leaf of $T$ with the corresponding leaf of $T'$.
Apart from the aforementioned bound, there are four other bounds that appear in, or may be deduced from, the work of Lovász and Plummer. Each of these bounds leads to a notion of extremality. In this paper, we obtain a complete characterization of all of these extremal classes and also establish relationships between them. Two of our characterizations are in the same spirit as the one stated above. For the remaining two extremal classes, we reduce each of them to one of the already characterized extremal classes using standard matching theoretic operations.
Submitted 11 April, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
LORD: Large Models based Opposite Reward Design for Autonomous Driving
Authors:
Xin Ye,
Feng Tao,
Abhirup Mallik,
Burhaneddin Yaman,
Liu Ren
Abstract:
Reinforcement learning (RL) based autonomous driving has emerged as a promising alternative to data-driven imitation learning approaches. However, crafting effective reward functions for RL poses challenges due to the complexity of defining and quantifying good driving behaviors across diverse scenarios. Recently, large pretrained models have gained significant attention as zero-shot reward models for tasks specified with desired linguistic goals. However, the desired linguistic goals for autonomous driving such as "drive safely" are ambiguous and incomprehensible by pretrained models. On the other hand, undesired linguistic goals like "collision" are more concrete and tractable. In this work, we introduce LORD, a novel large models based opposite reward design through undesired linguistic goals to enable the efficient use of large pretrained models as zero-shot reward models. Through extensive experiments, our proposed framework shows its efficiency in leveraging the power of large pretrained models for achieving safe and enhanced autonomous driving. Moreover, the proposed approach shows improved generalization capabilities as it outperforms counterpart methods across diverse and challenging driving scenarios.
Submitted 27 March, 2024;
originally announced March 2024.
-
Unleashing the True Power of Age-of-Information: Service Aggregation in Connected and Autonomous Vehicles
Authors:
Anik Mallik,
Dawei Chen,
Kyungtae Han,
Jiang Xie,
Zhu Han
Abstract:
Connected and autonomous vehicles (CAVs) rely heavily upon time-sensitive information update services to ensure the safety of people and assets, and satisfactory entertainment applications. Therefore, the freshness of information is a crucial performance metric for CAV services. However, information from roadside sensors and nearby vehicles can get delayed in transmission due to the high mobility of vehicles. Our research shows that a CAV's relative distance and speed play an essential role in determining the Age-of-Information (AoI). With an increase in AoI, incremental service aggregation issues are observed with out-of-sequence information updates, which hampers the performance of low-latency applications in CAVs. In this paper, we propose a novel AoI-based service aggregation method for CAVs, which can process the information updates according to their update cycles. First, the AoI for sensors and vehicles is modeled, and a predictive AoI system is designed. Then, to reduce the overall service aggregation time and computational load, intervals are used for periodic AoI prediction, and information sources are clustered based on the AoI value. Finally, the system aggregates services for CAV applications using the predicted AoI. We evaluate the system performance based on data sequencing success rate (DSSR) and overall system latency. Lastly, we compare the performance of our proposed system with three other state-of-the-art methods. The evaluation and comparison results show that our proposed predictive AoI-based service aggregation system maintains satisfactory latency and DSSR for CAV applications and outperforms other existing methods.
Submitted 13 March, 2024;
originally announced March 2024.
-
Experimental study of aerosol deposition in distal lung bronchioles
Authors:
Arnab Kumar Mallik
Abstract:
The deposition of micron-sized particles is important in meteorology and in several engineering applications, such as deposition of dust in gas lines, carbon deposition in engine exhaust, designing effective air-cleaning systems, and estimating the deposition of inhaled drugs or atmospheric pollutants to determine their consequences for human health. Although the existing literature on deposition in straight tubes is quite mature, an experimental study of deposition in micro capillaries over a wide range of Re that models particle dynamics in the lungs is missing. The deposition of atmospheric pollutants and nebulized drugs in the lung depends on various biological factors such as flow properties, lung morphology, breathing patterns, particle properties, deposition mechanism, etc. To complicate matters, each breath manifests flows spanning a wide range of Reynolds numbers in various regions of the lung. In this study, the deposition of nebulized aerosol was experimentally investigated in phantom bronchioles of diameters relevant to the 7th to the 23rd branching generations and over the entire range of Re manifest during one breathing cycle. The aerosol fluid was loaded with boron-doped carbon quantum dots as a fluorophore. An aerosol was generated from this fluid mixture using an ultrasonic nebulizer, producing droplets with a mean diameter of 6.5 $\mu$m. The amount of aerosol deposited on the bronchiole walls was measured using a spectrofluorometer. Finally, a universal bronchiole-scale deposition model is proposed, which can form the building block for lung-scale aerosol deposition prediction.
Submitted 19 February, 2024;
originally announced February 2024.
-
VLP: Vision Language Planning for Autonomous Driving
Authors:
Chenbin Pan,
Burhaneddin Yaman,
Tommaso Nesti,
Abhirup Mallik,
Alessandro G Allievi,
Senem Velipasalar,
Liu Ren
Abstract:
Autonomous driving is a complex and challenging task that aims at safe motion planning through scene understanding and reasoning. While vision-only autonomous driving methods have recently achieved notable performance, through enhanced scene understanding, several key issues, including lack of reasoning, low generalization performance and long-tail scenarios, still need to be addressed. In this paper, we present VLP, a novel Vision-Language-Planning framework that exploits language models to bridge the gap between linguistic understanding and autonomous driving. VLP enhances autonomous driving systems by strengthening both the source memory foundation and the self-driving car's contextual understanding. VLP achieves state-of-the-art end-to-end planning performance on the challenging NuScenes dataset by achieving 35.9\% and 60.5\% reduction in terms of average L2 error and collision rates, respectively, compared to the previous best method. Moreover, VLP shows improved performance in challenging long-tail scenarios and strong generalization capabilities when faced with new urban environments.
Submitted 9 March, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
-
DeepEn2023: Energy Datasets for Edge Artificial Intelligence
Authors:
Xiaolong Tu,
Anik Mallik,
Haoxin Wang,
Jiang Xie
Abstract:
Climate change poses one of the most significant challenges to humanity. As a result of these climatic changes, the frequency of weather, climate, and water-related disasters has multiplied fivefold over the past 50 years, resulting in over 2 million deaths and losses exceeding $3.64 trillion USD. Leveraging AI-powered technologies for sustainable development and combating climate change is a promising avenue. Numerous significant publications are dedicated to using AI to improve renewable energy forecasting, enhance waste management, and monitor environmental changes in real time. However, very few research studies focus on making AI itself environmentally sustainable. This oversight regarding the sustainability of AI within the field might be attributed to a mindset gap and the absence of comprehensive energy datasets. In addition, with the ubiquity of edge AI systems and applications, especially on-device learning, there is a pressing need to measure, analyze, and optimize their environmental sustainability, such as energy efficiency. To this end, in this paper, we propose large-scale energy datasets for edge AI, named DeepEn2023, covering a wide range of kernels, state-of-the-art deep neural network models, and popular edge AI applications. We anticipate that DeepEn2023 will improve transparency in sustainability in on-device deep learning across a range of edge AI systems and applications. For more information, including access to the dataset and code, please visit https://amai-gsu.github.io/DeepEn2023.
Submitted 30 November, 2023;
originally announced December 2023.
-
A Gale-Shapley View of Unique Stable Marriages
Authors:
Kartik Gokhale,
Amit Kumar Mallik,
Ankit Kumar Misra,
Swaprava Nath
Abstract:
Stable marriage of a two-sided market with unit demand is a classic problem that arises in many real-world scenarios. In addition, a unique stable marriage in this market simplifies a host of downstream desiderata. In this paper, we explore a new set of sufficient conditions for unique stable matching (USM) under this setup. Unlike other approaches that also address this question using the structure of preference profiles, we use an algorithmic viewpoint and investigate if this question can be answered using the lens of the deferred acceptance (DA) algorithm (Gale and Shapley, 1962). Our results yield a set of sufficient conditions for USM (viz., MaxProp and MaxRou) and show that these are disjoint from the previously known sufficiency conditions like sequential preference and no crossing. We also provide a characterization of MaxProp that makes it efficiently verifiable, and shows the gap between MaxProp and the entire USM class. These results give a more detailed view of the sub-structures of the USM class.
Submitted 2 August, 2024; v1 submitted 28 October, 2023;
originally announced October 2023.
-
Unveiling Energy Efficiency in Deep Learning: Measurement, Prediction, and Scoring across Edge Devices
Authors:
Xiaolong Tu,
Anik Mallik,
Dawei Chen,
Kyungtae Han,
Onur Altintas,
Haoxin Wang,
Jiang Xie
Abstract:
Today, deep learning optimization is primarily driven by research focused on achieving high inference accuracy and reducing latency. However, the energy efficiency aspect is often overlooked, possibly due to a lack of sustainability mindset in the field and the absence of a holistic energy dataset. In this paper, we conduct a threefold study, including energy measurement, prediction, and efficiency scoring, with an objective to foster transparency in power and energy consumption within deep learning across various edge devices. Firstly, we present a detailed, first-of-its-kind measurement study that uncovers the energy consumption characteristics of on-device deep learning. This study results in the creation of three extensive energy datasets for edge devices, covering a wide range of kernels, state-of-the-art DNN models, and popular AI applications. Secondly, we design and implement the first kernel-level energy predictors for edge devices based on our kernel-level energy dataset. Evaluation results demonstrate the ability of our predictors to provide consistent and accurate energy estimations on unseen DNN models. Lastly, we introduce two scoring metrics, PCS and IECS, developed to convert complex power and energy consumption data of an edge device into an easily understandable manner for edge device end-users. We hope our work can help shift the mindset of both end-users and the research community towards sustainability in edge computing, a principle that drives our research. Find data, code, and more up-to-date information at https://amai-gsu.github.io/DeepEn2023.
Submitted 10 June, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
-
Correlation-driven non-trivial phases in single bi-layer Kagome intermetallics
Authors:
Aabhaas Vineet Mallik,
Adhip Agarwala,
Tanusri Saha-Dasgupta
Abstract:
Bi-layer Kagome compounds provide an exciting playground where the interplay of topology and strong correlations can give rise to exotic phases of matter. Motivated by a recent first-principles calculation on such systems (Phys. Rev. Lett. 125, 026401), reporting stabilization of a Chern metal with a topological nearly-flat band close to the Fermi level, we build minimal models to study the effect of strong electron-electron interactions on such a Chern metal. Using appropriate numerical and analytical techniques, we show that the topologically non-trivial bands present in this system at the Fermi energy can realize fractional Chern insulator states. We further show that if the time-reversal symmetry is restored due to destruction of magnetism by low dimensionality and fluctuation, the system can realize a superconducting phase in the presence of strong local repulsive interactions. Furthermore, we identify an interesting phase transition from the superconducting phase to a correlated metal by tuning nearest-neighbor repulsion. Our study uncovers a rich set of non-trivial phases realizable in this system, and contextualizes the physically meaningful regimes where such phases can be further explored.
Submitted 30 June, 2023;
originally announced June 2023.
-
An FPGA-Based Semi-Automated Traffic Control System Using Verilog HDL
Authors:
Anik Mallik,
Sanjoy Kundu,
Md. Ashikur Rahman
Abstract:
Traffic congestion is a severe problem in heavily populated countries like Bangladesh, where an automated traffic control system needs to be implemented. This paper introduces an FPGA-based semi-automated system that includes a completely new feature, "Safe State", to avoid sudden unwanted collisions. We used sequential encoding, which makes the program much simpler and therefore easy to control and modify. The experimental results showed the automated change in traffic lights according to the specified timing sequences, which allows the maximum possible number of vehicle transitions in different directions to occur simultaneously without any accident.
Submitted 8 March, 2023;
originally announced March 2023.
-
EPAM: A Predictive Energy Model for Mobile AI
Authors:
Anik Mallik,
Haoxin Wang,
Jiang Xie,
Dawei Chen,
Kyungtae Han
Abstract:
Artificial intelligence (AI) has enabled a new paradigm of smart applications -- changing our way of living entirely. Many of these AI-enabled applications have very stringent latency requirements, especially for applications on mobile devices (e.g., smartphones, wearable devices, and vehicles). Hence, smaller and quantized deep neural network (DNN) models are developed for mobile devices, which provide faster and more energy-efficient computation for mobile AI applications. However, how AI models consume energy in a mobile device is still unexplored. Predicting the energy consumption of these models, along with their different applications, such as vision and non-vision, requires a thorough investigation of their behavior using various processing sources. In this paper, we introduce a comprehensive study of mobile AI applications considering different DNN models and processing sources, focusing on computational resource utilization, delay, and energy consumption. We measure the latency, energy consumption, and memory usage of all the models using four processing sources through extensive experiments. We explain the challenges in such investigations and how we propose to overcome them. Our study highlights important insights, such as how mobile AI behaves in different applications (vision and non-vision) using CPU, GPU, and NNAPI. Finally, we propose a novel Gaussian process regression-based general predictive energy model based on DNN structures, computation resources, and processors, which can predict the energy for each complete application cycle irrespective of device configuration and application. This study provides crucial facts and an energy prediction mechanism to the AI research community to help bring energy efficiency to mobile AI applications.
Submitted 2 March, 2023;
originally announced March 2023.
-
Semi-Supervised and Unsupervised Sense Annotation via Translations
Authors:
Bradley Hauer,
Grzegorz Kondrak,
Yixing Luan,
Arnob Mallik,
Lili Mou
Abstract:
Acquisition of multilingual training data continues to be a challenge in word sense disambiguation (WSD). To address this problem, unsupervised approaches have been proposed to automatically generate sense annotations for training supervised WSD systems. We present three new methods for creating sense-annotated corpora which leverage translations, parallel bitexts, lexical resources, as well as contextual and synset embeddings. Our semi-supervised method applies machine translation to transfer existing sense annotations to other languages. Our two unsupervised methods refine sense annotations produced by a knowledge-based WSD system via lexical translations in a parallel corpus. We obtain state-of-the-art results on standard WSD benchmarks.
Submitted 17 September, 2021; v1 submitted 11 June, 2021;
originally announced June 2021.
-
Interplay of Magnetism and Topological Superconductivity in Bilayer Kagome Metals
Authors:
Santu Baidya,
Aabhaas Vineet Mallik,
Subhro Bhattacharjee,
Tanusri Saha-Dasgupta
Abstract:
The binary intermetallic materials, $M_3$Sn$_2$ ($M$ = 3d transition metal) present a new class of strongly correlated systems that naturally allows for the interplay of magnetism and metallicity. Using first principles calculations we confirm that bulk Fe$_3$Sn$_2$ is a ferromagnetic metal, and show that $M$ = Ni and Cu are paramagnetic metals with non-trivial band structures. Focusing on Fe$_3$Sn$_2$ to understand the effect of enhanced correlations in an experimentally relevant atomistically thin single kagome-bilayer, our ab-initio results show that dimensional confinement naturally exposes the flatness of band structure associated with the bilayer kagome geometry in a resultant ferromagnetic Chern metal. We use a multistage minimal modeling of the magnetic bands progressively closer to the Fermi energy. This effectively captures the physics of the Chern metal with a non-zero anomalous Hall response over a material relevant parameter regime along with a possible superconducting instability of the spin-polarised band resulting in a topological superconductor.
Submitted 9 July, 2020; v1 submitted 7 February, 2020;
originally announced February 2020.
-
Tracking the evolution of magmas from heterogeneous mantle sources to eruption
Authors:
Ananya Mallik,
Sarah Lambart,
Emily J. Chin
Abstract:
This contribution reviews the effects of source heterogeneities, melt-rock reactions and intracrustal differentiation on magma chemistry across mid-ocean ridges, intraplate settings and subduction zones using experimental studies and natural data. We compare melting behaviors of pyroxenites and peridotites and their relative contributions to magmas as functions of composition, mantle potential temperatures and lithospheric thickness. We also discuss the fate of chemically distinct melts derived from heterogeneities as they travel through a peridotitic mantle. Using nearly 60,000 natural major element compositions of volcanic rocks, melt inclusions, and crystalline cumulates, we assess broad petrogenetic trends in as large of a global dataset as possible. Consistent with previous studies, major element chemistry of mid-ocean ridge basalts (MORBs) and their cumulates favor a first-order control of intracrustal crystal-liquid segregation, while trace element studies emphasize the role of melt-rock reactions, highlighting the decoupling between the two. Ocean island basalts (OIB) show a larger compositional variability than MORB, partly attributed to large variations of pyroxenite proportions in the mantle source. However, the estimated proportions vary considerably with heterogeneity composition, melting model and thermal structure of the mantle. For arcs, we highlight current views on the role of the downgoing slab into the source of primary arc magmas, and the role of the overriding lithosphere as a magmatic chemical filter and as the repository of voluminous arc cumulates. Our approach of simultaneously looking at a large database of volcanic + deep crustal rocks across diverse tectonic settings underscores the challenge of deciphering the source signal versus intracrustal/lithospheric processes.
Submitted 3 January, 2020;
originally announced January 2020.
-
FQ-Conv: Fully Quantized Convolution for Efficient and Accurate Inference
Authors:
Bram-Ernst Verhoef,
Nathan Laubeuf,
Stefan Cosemans,
Peter Debacker,
Ioannis Papistas,
Arindam Mallik,
Diederik Verkest
Abstract:
Deep neural networks (DNNs) can be made hardware-efficient by reducing the numerical precision of the weights and activations of the network and by improving the network's resilience to noise. However, this gain in efficiency often comes at the cost of significantly reduced accuracy. In this paper, we present a novel approach to quantizing convolutional neural networks. The resulting networks perform all computations in low precision, without requiring higher-precision BN and nonlinearities, while still being highly accurate. To achieve this result, we employ a novel quantization technique that learns to optimally quantize the weights and activations of the network during training. Additionally, to enhance training convergence we use a new training technique, called gradual quantization. We leverage the nonlinear and normalizing behavior of our quantization function to effectively remove the higher-precision nonlinearities and BN from the network. The resulting convolutional layers are fully quantized to low precision, from input to output, ideal for neural network accelerators on the edge. We demonstrate the potential of this approach on different datasets and networks, showing that ternary-weight CNNs with low-precision in- and outputs perform virtually on par with their full-precision equivalents. Finally, we analyze the influence of noise on the weights, activations and convolution outputs (multiply-accumulate, MAC) and propose a strategy to improve network performance under noisy conditions.
Submitted 19 December, 2019;
originally announced December 2019.
-
Stable Multiple Time Step Simulation/Prediction from Lagged Dynamic Network Regression Models
Authors:
Abhirup Mallik,
Zack W. Almquist
Abstract:
Recent developments in computers and automated data collection strategies have greatly increased the interest in statistical modeling of dynamic networks. Many of the statistical models employed for inference on large-scale dynamic networks suffer from limited forward simulation/prediction ability. A major problem with many of the forward simulation procedures is the tendency for the model to become degenerate in only a few time steps, i.e., the simulation/prediction procedure results in either null graphs or complete graphs. Here, we describe an algorithm for simulating a sequence of networks generated from lagged dynamic network regression models DNR(V), a sub-family of TERGMs. We introduce a smoothed estimator for forward prediction based on smoothing of the change statistics obtained for a dynamic network regression model. We focus on the implementation of the algorithm, providing a series of motivating examples with comparisons to dynamic network models from the literature. We find that our algorithm significantly improves multi-step prediction/simulation over standard DNR(V) forecasting. Furthermore, we show that our method performs comparably to existing more complex dynamic network analysis frameworks (SAOM and STERGMs) for small networks over short time periods, and significantly outperforms these approaches over long time intervals and/or large networks.
Submitted 23 July, 2018;
originally announced July 2018.
-
Surprises in the t-J model: Implications for cuprates
Authors:
Aabhaas Vineet Mallik,
Gaurav Kumar Gupta,
Vijay B. Shenoy,
H. R. Krishnamurthy
Abstract:
The t-J model is a paradigmatic model for the study of strongly correlated electron systems. In particular, it has been argued that it is an appropriate model to describe the cuprate high-Tc superconductors. It turns out that a comprehensive understanding of the gamut of physics encoded by the t-J model is still an open problem. In recent years some remarkable experiments on the cuprates, for example, discovery of nodeless superconductivity in underdoped samples (PNAS 109, 18332 (2012)), discovery of s-wave like gap in the pseudogap phase (Phys. Rev. Lett. 111, 107001 (2013)), and observation of polar Kerr effect (PKE) (Phys. Rev. Lett. 112, 047003 (2014)), have thrown up new challenges for this model. Here, we present results demonstrating that, within the slave-particle formulation of the t-J model, the d-wave superconductor is unstable at low doping to its own anti-symmetric phase mode fluctuations when the effect of fluctuations is treated self-consistently. We then show that this instability gives way to a time reversal symmetry broken d+is-SC in the underdoped region which has superfluid stiffness consistent with the Uemura relation, even with a large pair amplitude. We show that our results are consistent with existing experiments on cuprates and suggest that Josephson (SQUID interferometry) experiments can clearly distinguish the d+is-SC from a host of other possibilities alluded to be contributing to the physics of underdoped cuprates. We also comment on other theoretical studies vis-a-vis ours.
Submitted 7 May, 2018;
originally announced May 2018.
-
Directional Metropolis-Hastings
Authors:
Abhirup Mallik,
Galin L. Jones
Abstract:
We propose a new kernel for Metropolis-Hastings, called Directional Metropolis-Hastings (DMH), with a multivariate update in which the proposal kernel has a state-dependent covariance matrix. We use the derivative of the target distribution at the current state to change the orientation of the proposal distribution, therefore producing a more plausible proposal. We study the conditions for geometric ergodicity of our algorithm and provide necessary and sufficient conditions for convergence. We also suggest a scheme for adaptively updating the variance parameter and study the conditions of ergodicity of the adaptive algorithm. We demonstrate the performance of our algorithm in a Bayesian generalized linear model problem.
Submitted 26 October, 2017;
originally announced October 2017.
-
Crucial role of Internal Collective Modes in Underdoped Cuprates
Authors:
Aabhaas V. Mallik,
Umesh K. Yadav,
Amal Medhi,
H. R. Krishnamurthy,
Vijay B. Shenoy
Abstract:
The enigmatic cuprate superconductors have attracted resurgent interest with several recent reports and discussions of competing orders in the underdoped side. Motivated by this, here we address the natural question of fragility of the d-wave superconducting state in underdoped cuprates. Using a combination of theoretical approaches we study t-J like models, and discover an - as yet unexplored - instability that is brought about by an "internal" (anti-symmetric mode) fluctuation of the d-wave state. This new theoretical result is in good agreement with recent STM and ARPES studies of cuprates. We also suggest experimental directions to uncover this physics.
Submitted 7 February, 2017; v1 submitted 31 March, 2016;
originally announced March 2016.
-
M-estimation in multistage sampling procedures
Authors:
Atul Mallik,
Moulinath Banerjee,
George Michailidis
Abstract:
Multi-stage (designed) procedures, obtained by splitting the sampling budget suitably across stages, and designing the sampling at a particular stage based on information about the parameter obtained from previous stages, are often advantageous from the perspective of precise inference. We develop a generic framework for M-estimation in a multistage setting and apply empirical process techniques to develop limit theorems that describe the large sample behavior of the resulting M-estimates. Applications to change-point estimation, inverse isotonic regression, classification and mode estimation are provided: it is typically seen that the multistage procedure accentuates the efficiency of the M-estimates by accelerating the rate of convergence, relative to one-stage procedures. The step-by-step process induces dependence across stages and complicates the analysis in such problems, which we address through careful conditioning arguments.
Submitted 7 January, 2014;
originally announced January 2014.
-
Baseline zone estimation in two dimensions
Authors:
Atul Mallik,
Moulinath Banerjee,
Michael Woodroofe
Abstract:
We consider the problem of estimating the region on which a non-parametric regression function is at its baseline level in two dimensions. The baseline level typically corresponds to the minimum/maximum of the function and estimating such regions or their complements is pertinent to several problems arising in edge estimation, environmental statistics, fMRI and related fields. We assume the baseline region to be convex and estimate it via fitting a `stump' function to approximate $p$-values obtained from tests for deviation of the regression function from its baseline level. The estimates, obtained using an algorithm originally developed for constructing convex contours of a density, are studied in two different sampling settings, one where several responses can be obtained at a number of different covariate-levels (dose-response) and the other involving limited number of response values per covariate (standard regression). The shape of the baseline region and the smoothness of the regression function at its boundary play a critical role in determining the rate of convergence of our estimate: for a regression function which is `p-regular' at the boundary of the convex baseline region, our estimate converges at a rate $N^{2/(4p+3)}$ in the dose-response setting, $N$ being the total budget, and its analogue in the standard regression setting converges at a rate of $N^{1/(2p+2)}$. Extensions to non-convex baseline regions are explored as well.
Submitted 22 December, 2013;
originally announced December 2013.
-
Threshold estimation based on a p-value framework in dose-response and regression settings
Authors:
Atul Mallik,
Bodhisattva Sen,
Moulinath Banerjee,
George Michailidis
Abstract:
We use p-values to identify the threshold level at which a regression function takes off from its baseline value, a problem motivated by applications in toxicological and pharmacological dose-response studies and environmental statistics. We study the problem in two sampling settings: one where multiple responses can be obtained at a number of different covariate-levels and the other the standard regression setting involving limited number of response values at each covariate. Our procedure involves testing the hypothesis that the regression function is at its baseline at each covariate value and then computing the potentially approximate p-value of the test. An estimate of the threshold is obtained by fitting a piecewise constant function with a single jump discontinuity, otherwise known as a stump, to these observed p-values, as they behave in markedly different ways on the two sides of the threshold. The estimate is shown to be consistent and its finite sample properties are studied through simulations. Our approach is computationally simple and extends to the estimation of the baseline value of the regression function, heteroscedastic errors and to time-series. It is illustrated on some real data applications.
Submitted 9 June, 2011;
originally announced June 2011.
-
Laser Plasma Interaction and Non-classical Properties of Radiation Field
Authors:
Aabhaas Vineet Mallik,
Pratyay Ghosh,
Ananda Dasgupta
Abstract:
We show by explicit calculations that non-classical states of the radiation field can be produced by allowing short-term interaction between a coherent state of the radiation field and plasma. In contrast, long-term interaction, which thermalizes the radiation field, can produce non-classical states of the radiation field only at sufficiently small temperatures. A measure of k-th order squeezing, stricter than the one proposed by Zhang et al., is used to check the emergence of squeezing. It is also shown that photons in the considered thermalized field would follow super-Poissonian statistics.
Submitted 14 June, 2011; v1 submitted 24 May, 2011;
originally announced May 2011.
-
A Central Limit Theorem For Linear Random Fields
Authors:
Atul Mallik,
Michael Woodroofe
Abstract:
A Central Limit Theorem is proved for linear random fields when sums are taken over a finite disjoint union of rectangles. The approach does not rely upon the use of the Beveridge-Nelson decomposition, and the conditions needed are similar to those given by Ibragimov for linear processes. When specializing this result to the case when sums are being taken over rectangles, a complete analogue of Ibragimov's result is obtained with a lot of uniformity.
Submitted 13 July, 2010; v1 submitted 8 July, 2010;
originally announced July 2010.
-
Multiscale Modeling of Materials - Concepts and Illustration
Authors:
Aditi Mallik,
Keith Runge,
James W. Dufty,
Hai-Ping Cheng
Abstract:
The approximate representation of a quantum solid as an equivalent composite semi-classical solid is considered for insulating materials. The composite is comprised of point ions moving on a potential energy surface. In the classical bulk domain this potential energy is represented by pair potentials constructed to give the same structure and elastic properties as the underlying quantum solid. In a small local quantum domain the potential is determined from a detailed quantum calculation of the electronic structure. The primary new ingredients are 1) a determination of the pair potential from quantum data for equilibrium and strained structures, 2) development of pseudo-atoms for a realistic treatment of charge densities where bonds have been broken to define the quantum domain, and 3) inclusion of polarization effects on the quantum domain due to its environment. This formal structure is illustrated in detail for a silica nanorod. For each configuration considered, the charge density of the entire solid is calculated quantum mechanically to provide the reference by which to judge the accuracy of the modeling. It is then shown that the quantum rod, the rod constructed from the classical pair potentials, and the composite classical/quantum rod all have the same equilibrium structure and response to elastic strain. The accuracy of the modeling is shown to apply for two quite different quantum chemical methods for the underlying quantum mechanics: transfer Hamiltonian and density functional methods.
Submitted 24 July, 2005;
originally announced July 2005.
-
Constructing A Small Strain Potential for Multi-Scale Modeling
Authors:
Aditi Mallik,
Keith Runge,
Hai-Ping Cheng,
James W. Dufty
Abstract:
For problems relating to fracture, a consistent embedding of a quantum (QM) domain in its classical (CM) environment requires that the classical system should yield the same structure and elastic properties as the QM domain for states near equilibrium. It is proposed that an appropriate classical potential can be constructed using ab initio data on the equilibrium and weakly strained configurations calculated from the quantum description, rather than the more usual approach of fitting to a wide range of empirical data. The scheme is illustrated in detail for a model system, a silica nanorod that has the proper stoichiometric ratio of Si:O as observed in real silica. The potential is chosen to be pairwise additive, with the same pair potential functional form as the familiar phenomenological TTAM potential. Here, the parameters are determined using a genetic algorithm with force data obtained directly from a quantum calculation. The resulting potential gives excellent agreement with properties of the reference quantum calculations both for structure (bond lengths, bond angles) and elasticity (Young's modulus). The proposed method for constructing the classical potential is carried out for two different choices for the quantum mechanical description: a transfer Hamiltonian method (NDDO with coupled cluster parameterization) and density functional theory (with plane wave basis set and PBE exchange correlation functional). The quality of the potentials obtained in both cases is quite good, although the two quantum rods have significant differences.
Submitted 28 January, 2005;
originally announced January 2005.
-
An Application of Transfer Hamiltonian Quantum Mechanics to Multi-Scale Modeling
Authors:
Aditi Mallik,
Carlos E. Taylor,
Keith Runge,
James W. Dufty
Abstract:
In quantum/classical (QM/CM) partitioning methods for multi-scale modeling, one is often forced to introduce uncontrolled phenomenological effects of the environment (CM) in the quantum (QM) domain as ab initio quantum calculations are computationally too intensive to be applied to the whole sample. We propose a method, in which two qualitatively different components of the information about the state of the CM region are incorporated into the QM calculations. First, pseudoatoms constructed to describe the chemistry of the nearest neighbor exchange interactions replace the atoms at the boundary of the CM and the QM regions. Second, the remaining effect of the CM bulk environment due to long-range Coulombic interactions is modeled in terms of dipoles. We have tested this partitioning method in a silica nanorod and a 3-membered silica ring for which ab initio quantum data for the whole system is available to assess the quality of the proposed partitioning method.
Submitted 2 March, 2004;
originally announced March 2004.