-
Predicting Solar Energy Generation with Machine Learning based on AQI and Weather Features
Authors:
Arjun Shah,
Varun Viswanath,
Kashish Gandhi,
Nilesh Madhukar Patil
Abstract:
This paper addresses the pressing need for an accurate solar energy prediction model, which is crucial for efficient grid integration. We explore the influence of the Air Quality Index and weather features on solar energy generation, employing advanced Machine Learning and Deep Learning techniques. Our methodology uses time series modeling and makes novel use of power transform normalization and z…
▽ More
This paper addresses the pressing need for an accurate solar energy prediction model, which is crucial for efficient grid integration. We explore the influence of the Air Quality Index and weather features on solar energy generation, employing advanced Machine Learning and Deep Learning techniques. Our methodology uses time series modeling and makes novel use of power transform normalization and zero-inflated modeling. Various Machine Learning algorithms and Conv2D Long Short-Term Memory model based Deep Learning models are applied to these transformations for precise predictions. Results underscore the effectiveness of our approach, demonstrating enhanced prediction accuracy with Air Quality Index and weather features. We achieved a 0.9691 $R^2$ Score, 0.18 MAE, 0.10 RMSE with Conv2D Long Short-Term Memory model, showcasing the power transform technique's innovation in enhancing time series forecasting for solar energy generation. Such results help our research contribute valuable insights to the synergy between Air Quality Index, weather features, and Deep Learning techniques for solar energy prediction.
△ Less
Submitted 23 August, 2024; v1 submitted 22 August, 2024;
originally announced August 2024.
-
Cross-Language Assessment of Mathematical Capability of ChatGPT
Authors:
Gargi Sathe,
Aneesh Shamraj,
Aditya Surve,
Nahush Patil,
Kumkum Saxena
Abstract:
This paper presents an evaluation of the mathematical capability of ChatGPT across diverse languages like Hindi, Gujarati, and Marathi. ChatGPT, based on GPT-3.5 by OpenAI, has garnered significant attention for its natural language understanding and generation abilities. However, its performance in solving mathematical problems across multiple natural languages remains a comparatively unexplored…
▽ More
This paper presents an evaluation of the mathematical capability of ChatGPT across diverse languages like Hindi, Gujarati, and Marathi. ChatGPT, based on GPT-3.5 by OpenAI, has garnered significant attention for its natural language understanding and generation abilities. However, its performance in solving mathematical problems across multiple natural languages remains a comparatively unexplored area, especially in regional Indian languages. In this paper, we explore those capabilities as well as using chain-of-thought prompting to figure out if it increases the accuracy of responses as much as it does in the English language and provide insights into the current limitations.
△ Less
Submitted 18 May, 2024;
originally announced May 2024.
-
Early Churn Prediction from Large Scale User-Product Interaction Time Series
Authors:
Shamik Bhattacharjee,
Utkarsh Thukral,
Nilesh Patil
Abstract:
User churn, characterized by customers ending their relationship with a business, has profound economic consequences across various Business-to-Customer scenarios. For numerous system-to-user actions, such as promotional discounts and retention campaigns, predicting potential churners stands as a primary objective. In volatile sectors like fantasy sports, unpredictable factors such as internationa…
▽ More
User churn, characterized by customers ending their relationship with a business, has profound economic consequences across various Business-to-Customer scenarios. For numerous system-to-user actions, such as promotional discounts and retention campaigns, predicting potential churners stands as a primary objective. In volatile sectors like fantasy sports, unpredictable factors such as international sports events can influence even regular spending habits. Consequently, while transaction history and user-product interaction are valuable in predicting churn, they demand deep domain knowledge and intricate feature engineering. Additionally, feature development for churn prediction systems can be resource-intensive, particularly in production settings serving 200m+ users, where inference pipelines largely focus on feature engineering. This paper conducts an exhaustive study on predicting user churn using historical data. We aim to create a model forecasting customer churn likelihood, facilitating businesses in comprehending attrition trends and formulating effective retention plans. Our approach treats churn prediction as multivariate time series classification, demonstrating that combining user activity and deep neural networks yields remarkable results for churn prediction in complex business-to-customer contexts.
△ Less
Submitted 25 September, 2023;
originally announced September 2023.
-
TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings
Authors:
Norman P. Jouppi,
George Kurian,
Sheng Li,
Peter Ma,
Rahul Nagarajan,
Lifeng Nai,
Nishant Patil,
Suvinay Subramanian,
Andy Swing,
Brian Towles,
Cliff Young,
Xiang Zhou,
Zongwei Zhou,
David Patterson
Abstract:
In response to innovations in machine learning (ML) models, production workloads changed radically and rapidly. TPU v4 is the fifth Google domain specific architecture (DSA) and its third supercomputer for such ML models. Optical circuit switches (OCSes) dynamically reconfigure its interconnect topology to improve scale, availability, utilization, modularity, deployment, security, power, and perfo…
▽ More
In response to innovations in machine learning (ML) models, production workloads changed radically and rapidly. TPU v4 is the fifth Google domain specific architecture (DSA) and its third supercomputer for such ML models. Optical circuit switches (OCSes) dynamically reconfigure its interconnect topology to improve scale, availability, utilization, modularity, deployment, security, power, and performance; users can pick a twisted 3D torus topology if desired. Much cheaper, lower power, and faster than Infiniband, OCSes and underlying optical components are <5% of system cost and <3% of system power. Each TPU v4 includes SparseCores, dataflow processors that accelerate models that rely on embeddings by 5x-7x yet use only 5% of die area and power. Deployed since 2020, TPU v4 outperforms TPU v3 by 2.1x and improves performance/Watt by 2.7x. The TPU v4 supercomputer is 4x larger at 4096 chips and thus ~10x faster overall, which along with OCS flexibility helps large language models. For similar sized systems, it is ~4.3x-4.5x faster than the Graphcore IPU Bow and is 1.2x-1.7x faster and uses 1.3x-1.9x less power than the Nvidia A100. TPU v4s inside the energy-optimized warehouse scale computers of Google Cloud use ~3x less energy and produce ~20x less CO2e than contemporary DSAs in a typical on-premise data center.
△ Less
Submitted 20 April, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Robust self-healing prediction model for high dimensional data
Authors:
Anirudha Rayasam,
Nagamma Patil
Abstract:
Owing to the advantages of increased accuracy and the potential to detect unseen patterns, provided by data mining techniques they have been widely incorporated for standard classification problems. They have often been used for high precision disease prediction in the medical field, and several hybrid prediction models capable of achieving high accuracies have been proposed. Though this stands tr…
▽ More
Owing to the advantages of increased accuracy and the potential to detect unseen patterns, provided by data mining techniques they have been widely incorporated for standard classification problems. They have often been used for high precision disease prediction in the medical field, and several hybrid prediction models capable of achieving high accuracies have been proposed. Though this stands true most of the previous models fail to efficiently address the recurring issue of bad data quality which plagues most high dimensional data, and especially proves troublesome in the highly sensitive medical data. This work proposes a robust self healing (RSH) hybrid prediction model which functions by using the data in its entirety by removing errors and inconsistencies from it rather than discarding any data. Initial processing involves data preparation followed by cleansing or scrubbing through context-dependent attribute correction, which ensures that there is no significant loss of relevant information before the feature selection and prediction phases. An ensemble of heterogeneous classifiers, subjected to local boosting, is utilized to build the prediction model and genetic algorithm based wrapper feature selection technique wrapped on the respective classifiers is employed to select the corresponding optimal set of features, which warrant higher accuracy. The proposed method is compared with some of the existing high performing models and the results are analyzed.
△ Less
Submitted 4 October, 2022;
originally announced October 2022.
-
Encoder-Decoder Networks for Analyzing Thermal and Power Delivery Networks
Authors:
Vidya A. Chhabria,
Vipul Ahuja,
Ashwath Prabhu,
Nikhil Patil,
Palkesh Jain,
Sachin S. Sapatnekar
Abstract:
Power delivery network (PDN) analysis and thermal analysis are computationally expensive tasks that are essential for successful IC design. Algorithmically, both these analyses have similar computational structure and complexity as they involve the solution to a partial differential equation of the same form. This paper converts these analyses into image-to-image and sequence-to-sequence translati…
▽ More
Power delivery network (PDN) analysis and thermal analysis are computationally expensive tasks that are essential for successful IC design. Algorithmically, both these analyses have similar computational structure and complexity as they involve the solution to a partial differential equation of the same form. This paper converts these analyses into image-to-image and sequence-to-sequence translation tasks, which allows leveraging a class of machine learning models with an encoder-decoder-based generative (EDGe) architecture to address the time-intensive nature of these tasks. For PDN analysis, we propose two networks: (i) IREDGe: a full-chip static and dynamic IR drop predictor and (ii) EMEDGe: electromigration (EM) hotspot classifier based on input power, power grid distribution, and power pad distribution patterns. For thermal analysis, we propose ThermEDGe, a full-chip static and dynamic temperature estimator based on input power distribution patterns for thermal analysis. These networks are transferable across designs synthesized within the same technology and packing solution. The networks predict on-chip IR drop, EM hotspot locations, and temperature in milliseconds with negligibly small errors against commercial tools requiring several hours.
△ Less
Submitted 27 October, 2021;
originally announced October 2021.
-
A Decentralized and Autonomous Model to Administer University Examinations
Authors:
Yogesh N Patil,
Arvind W Kiwelekar,
Laxman D Netak,
Shankar B Deosarkar
Abstract:
Administering standardized examinations is a challenging task, especially for those universities for which colleges affiliated to it are geographically distributed over a wide area. Some of the challenges include maintaining integrity and confidentiality of examination records, preventing mal-practices, issuing unique identification numbers to a large student population and managing assets require…
▽ More
Administering standardized examinations is a challenging task, especially for those universities for which colleges affiliated to it are geographically distributed over a wide area. Some of the challenges include maintaining integrity and confidentiality of examination records, preventing mal-practices, issuing unique identification numbers to a large student population and managing assets required for the smooth conduct of examinations. These challenges aggravate when colleges affiliated to universities demand academic and administrative autonomy by demonstrating best practices consistently over a long period.
In this chapter, we describe a model for decentralized and autonomous examination system to provide the necessary administrative support. The model is based on two emerging technologies of Blockchain Technology and Internet of Things (IoT). We adopt a software architecture approach to describe the model. The prescriptive architecture consists of {\em architectural mappings} which map functional and non-functional requirements to architectural elements of blockchain technology and IoT. In architectural mappings, first, we identify common use-cases in administering standardized examinations. Then we map these use-cases to the core elements of blockchain, i.e. distributed ledgers, cryptography, consensus protocols and smart-contracts and IoT. Such kind of prescriptive architecture guide downstream software engineering processes of implementation and testing
△ Less
Submitted 20 March, 2021;
originally announced April 2021.
-
Thermal and IR Drop Analysis Using Convolutional Encoder-Decoder Networks
Authors:
Vidya A. Chhabria,
Vipul Ahuja,
Ashwath Prabhu,
Nikhil Patil,
Palkesh Jain,
Sachin S. Sapatnekar
Abstract:
Computationally expensive temperature and power grid analyses are required during the design cycle to guide IC design. This paper employs encoder-decoder based generative (EDGe) networks to map these analyses to fast and accurate image-to-image and sequence-to-sequence translation tasks. The network takes a power map as input and outputs the corresponding temperature or IR drop map. We propose two…
▽ More
Computationally expensive temperature and power grid analyses are required during the design cycle to guide IC design. This paper employs encoder-decoder based generative (EDGe) networks to map these analyses to fast and accurate image-to-image and sequence-to-sequence translation tasks. The network takes a power map as input and outputs the corresponding temperature or IR drop map. We propose two networks: (i) ThermEDGe: a static and dynamic full-chip temperature estimator and (ii) IREDGe: a full-chip static IR drop predictor based on input power, power grid distribution, and power pad distribution patterns. The models are design-independent and must be trained just once for a particular technology and packaging solution. ThermEDGe and IREDGe are demonstrated to rapidly predict the on-chip temperature and IR drop contours in milliseconds (in contrast with commercial tools that require several hours or more) and provide an average error of 0.6% and 0.008% respectively.
△ Less
Submitted 18 September, 2020;
originally announced September 2020.
-
In-Datacenter Performance Analysis of a Tensor Processing Unit
Authors:
Norman P. Jouppi,
Cliff Young,
Nishant Patil,
David Patterson,
Gaurav Agrawal,
Raminder Bajwa,
Sarah Bates,
Suresh Bhatia,
Nan Boden,
Al Borchers,
Rick Boyle,
Pierre-luc Cantin,
Clifford Chao,
Chris Clark,
Jeremy Coriell,
Mike Daley,
Matt Dau,
Jeffrey Dean,
Ben Gelb,
Tara Vazir Ghaemmaghami,
Rajendra Gottipati,
William Gulland,
Robert Hagmann,
C. Richard Ho,
Doug Hogberg
, et al. (50 additional authors not shown)
Abstract:
Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOp…
▽ More
Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, ...) that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X - 30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X - 80X higher. Moreover, using the GPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
△ Less
Submitted 16 April, 2017;
originally announced April 2017.
-
A novel image tag completion method based on convolutional neural network
Authors:
Yanyan Geng,
Guohui Zhang,
Weizhi Li,
Yi Gu,
Ru-Ze Liang,
Gaoyuan Liang,
Jingbin Wang,
Yanbin Wu,
Nitin Patil,
Jing-Yan Wang
Abstract:
In the problems of image retrieval and annotation, complete textual tag lists of images play critical roles. However, in real-world applications, the image tags are usually incomplete, thus it is important to learn the complete tags for images. In this paper, we study the problem of image tag complete and proposed a novel method for this problem based on a popular image representation method, conv…
▽ More
In the problems of image retrieval and annotation, complete textual tag lists of images play critical roles. However, in real-world applications, the image tags are usually incomplete, thus it is important to learn the complete tags for images. In this paper, we study the problem of image tag complete and proposed a novel method for this problem based on a popular image representation method, convolutional neural network (CNN). The method estimates the complete tags from the convolutional filtering outputs of images based on a linear predictor. The CNN parameters, linear predictor, and the complete tags are learned jointly by our method. We build a minimization problem to encourage the consistency between the complete tags and the available incomplete tags, reduce the estimation error, and reduce the model complexity. An iterative algorithm is developed to solve the minimization problem. Experiments over benchmark image data sets show its effectiveness.
△ Less
Submitted 3 June, 2017; v1 submitted 1 March, 2017;
originally announced March 2017.
-
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Authors:
Yonghui Wu,
Mike Schuster,
Zhifeng Chen,
Quoc V. Le,
Mohammad Norouzi,
Wolfgang Macherey,
Maxim Krikun,
Yuan Cao,
Qin Gao,
Klaus Macherey,
Jeff Klingner,
Apurva Shah,
Melvin Johnson,
Xiaobing Liu,
Łukasz Kaiser,
Stephan Gouws,
Yoshikiyo Kato,
Taku Kudo,
Hideto Kazawa,
Keith Stevens,
George Kurian,
Nishant Patil,
Wei Wang,
Cliff Young,
Jason Smith
, et al. (6 additional authors not shown)
Abstract:
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NM…
▽ More
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves competitive results to state-of-the-art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.
△ Less
Submitted 8 October, 2016; v1 submitted 26 September, 2016;
originally announced September 2016.