Showing 1–30 of 30 results for author: Ghosh, N

Searching in archive cs.
  1. arXiv:2406.08447  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    The Impact of Initialization on LoRA Finetuning Dynamics

    Authors: Soufiane Hayou, Nikhil Ghosh, Bin Yu

    Abstract: In this paper, we study the role of initialization in Low Rank Adaptation (LoRA) as originally introduced in Hu et al. (2021). Essentially, to start from the pretrained model as initialization for finetuning, one can either initialize B to zero and A to random (default initialization in PEFT package), or vice-versa. In both cases, the product BA is equal to zero at initialization, which makes fine…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: TL;DR: Different initializations lead to completely different finetuning dynamics. One initialization (set A random and B zero) is generally better than the natural opposite initialization. arXiv admin note: text overlap with arXiv:2402.12354

  2. arXiv:2403.04416  [pdf, other]

    cs.NI

    iTRPL: An Intelligent and Trusted RPL Protocol based on Multi-Agent Reinforcement Learning

    Authors: Debasmita Dey, Nirnay Ghosh

    Abstract: Routing Protocol for Low Power and Lossy Networks (RPL) is the de-facto routing standard in IoT networks. It enables nodes to collaborate and autonomously build ad-hoc networks modeled by tree-like destination-oriented direct acyclic graphs (DODAG). Despite its widespread usage in industry and healthcare domains, RPL is susceptible to insider attacks. Although the state-of-the-art RPL ensures that…

    Submitted 7 March, 2024; originally announced March 2024.

  3. arXiv:2402.12354  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    LoRA+: Efficient Low Rank Adaptation of Large Models

    Authors: Soufiane Hayou, Nikhil Ghosh, Bin Yu

    Abstract: In this paper, we show that Low Rank Adaptation (LoRA) as originally introduced in Hu et al. (2021) leads to suboptimal finetuning of models with large width (embedding dimension). This is due to the fact that adapter matrices A and B in LoRA are updated with the same learning rate. Using scaling arguments for large width networks, we demonstrate that using the same learning rate for A and B does…

    Submitted 4 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 27 pages

  4. arXiv:2401.06657  [pdf, other]

    cs.CR cs.NI

    Accelerating Tactile Internet with QUIC: A Security and Privacy Perspective

    Authors: Jayasree Sengupta, Debasmita Dey, Simone Ferlin, Nirnay Ghosh, Vaibhav Bajpai

    Abstract: The Tactile Internet paradigm is set to revolutionize human society by enabling skill-set delivery and haptic communication over ultra-reliable, low-latency networks. The emerging sixth-generation (6G) mobile communication systems are envisioned to underpin this Tactile Internet ecosystem at the network edge by providing ubiquitous global connectivity. However, apart from a multitude of opportunit…

    Submitted 31 January, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: 7 pages, 3 figures, 1 table

  5. arXiv:2311.14646  [pdf, other]

    cs.LG stat.ML

    More is Better in Modern Machine Learning: when Infinite Overparameterization is Optimal and Overfitting is Obligatory

    Authors: James B. Simon, Dhruva Karkada, Nikhil Ghosh, Mikhail Belkin

    Abstract: In our era of enormous neural networks, empirical progress has been driven by the philosophy that more is better. Recent deep learning practice has found repeatedly that larger model size, more data, and more computation (resulting in lower training loss) improves performance. In this paper, we give theoretical backing to these empirical observations by showing that these three properties hold in…

    Submitted 15 May, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: Appeared in ICLR 2024

  6. arXiv:2310.15202  [pdf, ps, other]

    q-bio.GN cs.AI cs.LG

    Predicting Transcription Factor Binding Sites using Transformer based Capsule Network

    Authors: Nimisha Ghosh, Daniele Santoni, Indrajit Saha, Giovanni Felici

    Abstract: Prediction of binding sites for transcription factors is important to understand how they regulate gene expression and how this regulation can be modulated for therapeutic purposes. Although in the past few years there are significant works addressing this issue, there is still space for improvement. In this regard, a transformer based capsule network viz. DNABERT-Cap is proposed in this work to p…

    Submitted 28 December, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

  7. arXiv:2308.03215  [pdf, other]

    stat.ML cs.LG

    The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning

    Authors: Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu

    Abstract: In this work, we investigate the dynamics of stochastic gradient descent (SGD) when training a single-neuron autoencoder with linear or ReLU activation on orthogonal data. We show that for this non-convex problem, randomly initialized SGD with a constant step size successfully finds a global minimum for any batch size choice. However, the particular global minimum found depends upon the batch size…

    Submitted 6 August, 2023; originally announced August 2023.

  8. arXiv:2302.00003  [pdf, other]

    cs.LG cs.CL

    The Power of External Memory in Increasing Predictive Model Capacity

    Authors: Cenk Baykal, Dylan J Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

    Abstract: One way of introducing sparsity into deep networks is by attaching an external table of parameters that is sparsely looked up at different layers of the network. By storing the bulk of the parameters in the external table, one can increase the capacity of the model without necessarily increasing the inference time. Two crucial questions in this setting are then: what is the lookup function for acc…

    Submitted 30 January, 2023; originally announced February 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2301.13310

  9. arXiv:2301.13310  [pdf, other]

    cs.LG cs.CL

    Alternating Updates for Efficient Transformers

    Authors: Cenk Baykal, Dylan Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

    Abstract: It has been well established that increasing scale in deep transformer networks leads to improved quality and performance. However, this increase in scale often comes with prohibitive increases in compute cost and inference latency. We introduce Alternating Updates (AltUp), a simple-to-implement method to increase a model's capacity without the computational burden. AltUp enables the widening of t…

    Submitted 3 October, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  10. arXiv:2207.11621  [pdf, other]

    stat.ML cs.LG

    A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors

    Authors: Nikhil Ghosh, Mikhail Belkin

    Abstract: In this work we establish an algorithm and distribution independent non-asymptotic trade-off between the model size, excess test loss, and training loss of linear predictors. Specifically, we show that models that perform well on the test data (have low excess loss) are either "classical" -- have training loss close to the noise level, or are "modern" -- have a much larger number of parameters com…

    Submitted 18 April, 2023; v1 submitted 23 July, 2022; originally announced July 2022.

    Comments: Further polished writing

  11. arXiv:2202.09931  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Deconstructing Distributions: A Pointwise Framework of Learning

    Authors: Gal Kaplun, Nikhil Ghosh, Saurabh Garg, Boaz Barak, Preetum Nakkiran

    Abstract: In machine learning, we traditionally evaluate the performance of a single model, averaged over a collection of test inputs. In this work, we propose a new approach: we measure the performance of a collection of models when evaluated on a $\textit{single input point}$. Specifically, we study a point's $\textit{profile}$: the relationship between models' average performance on the test distribution…

    Submitted 7 June, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

    Comments: GK and NG contributed equally. v2: Added Figures 4, 5

  12. arXiv:2202.07448  [pdf, other]

    cs.CY cs.NI

    Towards a Unified Pandemic Management Architecture: Survey, Challenges and Future Directions

    Authors: Satyaki Roy, Nirnay Ghosh, Nitish Uplavikar, Preetam Ghosh

    Abstract: The pandemic caused by SARS-CoV-2 has left an unprecedented impact on health, economy and society worldwide. Emerging strains are making pandemic management increasingly challenging. There is an urge to collect epidemiological, clinical, and physiological data to make an informed decision on mitigation measures. Advances in the Internet of Things (IoT) and edge computing provide solutions for pand…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: 30 pages and 10 figures

  13. Reuse-Aware Cache Partitioning Framework for Data-Sharing Multicore Systems

    Authors: Soma N. Ghosh, Vineet Sahula, Lava Bhargava

    Abstract: Multi-core processors improve performance, but they can create unpredictability owing to shared resources such as caches interfering. Cache partitioning is used to alleviate the Worst-Case Execution Time (WCET) estimation by isolating the shared cache across each thread to reduce interference. It does, however, prohibit data from being transferred between parallel threads running on different core…

    Submitted 17 January, 2022; originally announced January 2022.

    Comments: 2 pages. 7th IEEE International Symposium on Smart Electronic Systems (iSES) 2021

    ACM Class: C.1.4; D.4.2

  14. arXiv:2111.07167  [pdf, other]

    stat.ML cs.LG math.ST

    The Three Stages of Learning Dynamics in High-Dimensional Kernel Methods

    Authors: Nikhil Ghosh, Song Mei, Bin Yu

    Abstract: To understand how deep learning works, it is crucial to understand the training dynamics of neural networks. Several interesting hypotheses about these dynamics have been made based on empirically observed phenomena, but there exists a limited theoretical understanding of when and why such phenomena occur. In this paper, we consider the training dynamics of gradient flow on kernel least-squares…

    Submitted 13 November, 2021; originally announced November 2021.

  15. arXiv:2007.06201  [pdf, other]

    cs.CR cs.AR

    The Blockchain Based Auditor on Secret key Life Cycle in Reconfigurable Platform

    Authors: Rourab Paul, Nimisha Ghosh, Amlan Chakrabarti, Prasant Mahapatra

    Abstract: The growing sophistication of cyber attacks, vulnerabilities in high computing systems and increasing dependency on cryptography to protect our digital data make it more important to keep secret keys safe and secure. Few major issues on secret keys like incorrect use of keys, inappropriate storage of keys, inadequate protection of keys, insecure movement of keys, lack of audit logging, insider thr…

    Submitted 13 July, 2020; originally announced July 2020.

    Comments: Manuscript

  16. arXiv:2004.11726  [pdf, other]

    cs.CV

    A Two-Stage Multiple Instance Learning Framework for the Detection of Breast Cancer in Mammograms

    Authors: Sarath Chandra K, Arunava Chakravarty, Nirmalya Ghosh, Tandra Sarkar, Ramanathan Sethuraman, Debdoot Sheet

    Abstract: Mammograms are commonly employed in the large scale screening of breast cancer which is primarily characterized by the presence of malignant masses. However, automated image-level detection of malignancy is a challenging task given the small size of the mass regions and difficulty in discriminating between malignant, benign mass and healthy dense fibro-glandular tissue. To address these issues, we…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: accepted in EMBC 2020, 4 pg+1 pg Supplementary

  17. arXiv:2004.11721  [pdf, other]

    cs.CV cs.LG eess.IV

    Learning Decision Ensemble using a Graph Neural Network for Comorbidity Aware Chest Radiograph Screening

    Authors: Arunava Chakravarty, Tandra Sarkar, Nirmalya Ghosh, Ramanathan Sethuraman, Debdoot Sheet

    Abstract: Chest radiographs are primarily employed for the screening of cardio, thoracic and pulmonary conditions. Machine learning based automated solutions are being developed to reduce the burden of routine screening on Radiologists, allowing them to focus on critical cases. While recent efforts demonstrate the use of ensemble of deep convolutional neural networks(CNN), they do not take disease comorbidi…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: accepted in EMBC 2020, 4pg+2pg Supplementary Material

  18. arXiv:2004.11693  [pdf, other]

    cs.CV cs.LG

    A Systematic Search over Deep Convolutional Neural Network Architectures for Screening Chest Radiographs

    Authors: Arka Mitra, Arunava Chakravarty, Nirmalya Ghosh, Tandra Sarkar, Ramanathan Sethuraman, Debdoot Sheet

    Abstract: Chest radiographs are primarily employed for the screening of pulmonary and cardio-/thoracic conditions. Being undertaken at primary healthcare centers, they require the presence of an on-premise reporting Radiologist, which is a challenge in low and middle income countries. This has inspired the development of machine learning based automation of the screening process. While recent efforts demons…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: accepted in EMBC 2020, 4 pages+2 page Appendix

  19. arXiv:2002.03639  [pdf, ps, other]

    cs.AI cs.IT eess.SP

    iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis

    Authors: Nimisha Ghosh, Sayantan Saha, Rourab Paul

    Abstract: Data gathered from multiple sensors can be effectively fused for accurate monitoring of many engineering applications. In the last few years, one of the most sought after applications for multi sensor fusion has been fault diagnosis. Dempster-Shafer Theory of Evidence along with Dempsters Combination Rule is a very popular method for multi sensor fusion which can be successfully applied to fault d…

    Submitted 10 February, 2020; originally announced February 2020.

  20. arXiv:1910.12379  [pdf, other]

    cs.LG stat.ML

    Landmark Ordinal Embedding

    Authors: Nikhil Ghosh, Yuxin Chen, Yisong Yue

    Abstract: In this paper, we aim to learn a low-dimensional Euclidean representation from a set of constraints of the form "item j is closer to item i than item k". Existing approaches for this "ordinal embedding" problem require expensive optimization procedures, which cannot scale to handle increasingly larger datasets. To address this issue, we propose a landmark-based strategy, which we call Landmark Ord…

    Submitted 27 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019

  21. arXiv:1908.11538  [pdf, other]

    cs.CR cs.LG cs.NI

    IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain

    Authors: Rourab Paul, Nimisha Ghosh, Suman Sau, Amlan Chakrabarti, Prasant Mahapatra

    Abstract: Standard security protocols like SSL, TLS, IPSec etc. have high memory and processor consumption which makes all these security protocols unsuitable for resource constrained platforms such as Internet of Things (IoT). Blockchain (BC) finds its efficient application in IoT platform to preserve the five basic cryptographic primitives, such as confidentiality, authenticity, integrity, availability an…

    Submitted 9 September, 2019; v1 submitted 30 August, 2019; originally announced August 2019.

    Comments: Manuscript

  22. arXiv:1908.01176  [pdf, other]

    eess.IV cs.LG stat.ML

    Adversarially Trained Convolutional Neural Networks for Semantic Segmentation of Ischaemic Stroke Lesion using Multisequence Magnetic Resonance Imaging

    Authors: Rachana Sathish, Ronnie Rajan, Anusha Vupputuri, Nirmalya Ghosh, Debdoot Sheet

    Abstract: Ischaemic stroke is a medical condition caused by occlusion of blood supply to the brain tissue thus forming a lesion. A lesion is zoned into a core associated with irreversible necrosis typically located at the center of the lesion, while reversible hypoxic changes in the outer regions of the lesion are termed as the penumbra. Early estimation of core and penumbra in ischaemic stroke is crucial f…

    Submitted 3 August, 2019; originally announced August 2019.

  23. arXiv:1906.09769  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

    Authors: Nimisha Ghosh, Rourab Paul, Satyabrata Maity, Krishanu Maity, Sayantan Saha

    Abstract: Fault detection in sensor nodes is a pertinent issue that has been an important area of research for a very long time. But it is not explored much as yet in the context of Internet of Things. Internet of Things work with a massive amount of data so the responsibility for guaranteeing the accuracy of the data also lies with it. Moreover, a lot of important and critical decisions are made based on t…

    Submitted 24 June, 2019; originally announced June 2019.

  24. PS-Sim: A Framework for Scalable Simulation of Participatory Sensing Data

    Authors: Rajesh P Barnwal, Nirnay Ghosh, Soumya K Ghosh, Sajal K Das

    Abstract: Emergence of smartphone and the participatory sensing (PS) paradigm have paved the way for a new variant of pervasive computing. In PS, human user performs sensing tasks and generates notifications, typically in lieu of incentives. These notifications are real-time, large-volume, and multi-modal, which are eventually fused by the PS platform to generate a summary. One major limitation with PS is t…

    Submitted 29 August, 2018; originally announced August 2018.

    Comments: Published and Appeared in Proceedings of IEEE International Conference on Smart Computing (SMARTCOMP-2018)

  25. arXiv:1805.06909  [pdf, other]

    cs.CV

    Fully Convolutional Model for Variable Bit Length and Lossy High Density Compression of Mammograms

    Authors: Aupendu Kar, Sri Phani Krishna Karri, Nirmalya Ghosh, Ramanathan Sethuraman, Debdoot Sheet

    Abstract: Early works on medical image compression date to the 1980's with the impetus on deployment of teleradiology systems for high-resolution digital X-ray detectors. Commercially deployed systems during the period could compress 4,096 x 4,096 sized images at 12 bpp to 2 bpp using lossless arithmetic coding, and over the years JPEG and JPEG2000 were imbibed reaching upto 0.1 bpp. Inspired by the reprise…

    Submitted 17 May, 2018; originally announced May 2018.

    Comments: 4 pages, 3 figures, To appear in Workshop on Learned Image Compression, CVPR 2018

  26. arXiv:1709.03583  [pdf, other]

    cs.CY

    Quality of Information in Mobile Crowdsensing: Survey and Research Challenges

    Authors: Francesco Restuccia, Nirnay Ghosh, Shameek Bhattacharjee, Sajal Das, Tommaso Melodia

    Abstract: Smartphones have become the most pervasive devices in people's lives, and are clearly transforming the way we live and perceive technology. Today's smartphones benefit from almost ubiquitous Internet connectivity and come equipped with a plethora of inexpensive yet powerful embedded sensors, such as accelerometer, gyroscope, microphone, and camera. This unique combination has enabled revolutionary…

    Submitted 6 September, 2017; originally announced September 2017.

    Comments: To appear in ACM Transactions on Sensor Networks (TOSN)

  27. arXiv:1505.06219  [pdf]

    cs.CV

    A comparative study between proposed Hyper Kurtosis based Modified Duo-Histogram Equalization (HKMDHE) and Contrast Limited Adaptive Histogram Equalization (CLAHE) for Contrast Enhancement Purpose of Low Contrast Human Brain CT scan images

    Authors: Sabyasachi Mukhopadhyay, Soham Mandal, Sawon Pratiher, Satyasaran Changdar, Ritwik Burman, Nirmalya Ghosh, Prasanta K. Panigrahi

    Abstract: In this paper, a comparative study between proposed hyper kurtosis based modified duo-histogram equalization (HKMDHE) algorithm and contrast limited adaptive histogram enhancement (CLAHE) has been presented for the implementation of contrast enhancement and brightness preservation of low contrast human brain CT scan images. In HKMDHE algorithm, contrast enhancement is done on the hyper-kurtosis ba…

    Submitted 6 April, 2015; originally announced May 2015.

  28. arXiv:1505.00192  [pdf]

    cs.CV

    Application of S-Transform on Hyper kurtosis based Modified Duo Histogram Equalized DIC images for Pre-cancer Detection

    Authors: Sabyasachi Mukhopadhyay, Soham Mandal, Sawon Pratiher, Ritwik Barman, M. Venkatesh, Nirmalya Ghosh, Prasanta K. Panigrahi

    Abstract: Our proposed hyper kurtosis based histogram equalized DIC images enhances the contrast by preserving the brightness. The evolution and development of precancerous activity among tissues are studied through S-transform (ST). The significant variations of amplitude spectra can be observed due to increased medium roughness from normal tissue were observed in time-frequency domain. The randomness and…

    Submitted 30 April, 2015; originally announced May 2015.

  29. arXiv:1503.06323  [pdf]

    cs.CV

    Wavelet based approach for tissue fractal parameter measurement: Pre cancer detection

    Authors: Sabyasachi Mukhopadhyay, Nandan K. Das, Soham Mandal, Sawon Pratiher, Asish Mitra, Asima Pradhan, Nirmalya Ghosh, Prasanta K. Panigrahi

    Abstract: In this paper, we have carried out the detail studies of pre-cancer by wavelet coherency and multifractal based detrended fluctuation analysis (MFDFA) on differential interference contrast (DIC) images of stromal region among different grades of pre-cancer tissues. Discrete wavelet transform (DWT) through Daubechies basis has been performed for identifying fluctuations over polynomial trends for c…

    Submitted 21 March, 2015; originally announced March 2015.

  30. arXiv:1503.03913  [pdf]

    cs.CV

    Diagnosing Heterogeneous Dynamics for CT Scan Images of Human Brain in Wavelet and MFDFA domain

    Authors: Sabyasachi Mukhopadhyay, Soham Mandal, Nandan K Das, Subhadip Dey, Asish Mitra, Nirmalya Ghosh, Prasanta K Panigrahi

    Abstract: CT scan images of human brain of a particular patient in different cross sections are taken, on which wavelet transform and multi-fractal analysis are applied. The vertical and horizontal unfolding of images are done before analyzing these images. A systematic investigation of de-noised CT scan images of human brain in different cross-sections are carried out through wavelet normalized energy and…

    Submitted 12 March, 2015; originally announced March 2015.