Showing 1–50 of 103 results for author: Pandey, R

  1. arXiv:2407.05467 [pdf, other]

    cs.DC cs.AI

    The infrastructure powering IBM's Gen AI model development

    Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba, et al. (121 additional authors not shown)

    Abstract: AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering effi…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Corresponding Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla

  2. arXiv:2407.01544 [pdf, other]

    cs.NI cs.AI

    Decentralized Multi-Party Multi-Network AI for Global Deployment of 6G Wireless Systems

    Authors: Merim Dzaferagic, Marco Ruffini, Nina Slamnik-Krijestorac, Joao F. Santos, Johann Marquez-Barja, Christos Tranoris, Spyros Denazis, Thomas Kyriakakis, Panagiotis Karafotis, Luiz DaSilva, Shashi Raj Pandey, Junya Shiraishi, Petar Popovski, Soren Kejser Jensen, Christian Thomsen, Torben Bach Pedersen, Holger Claussen, Jinfeng Du, Gil Zussman, Tingjun Chen, Yiran Chen, Seshu Tirupathi, Ivan Seskar, Daniel Kilper

    Abstract: Multiple visions of 6G networks elicit Artificial Intelligence (AI) as a central, native element. When 6G systems are deployed at a large scale, end-to-end AI-based solutions will necessarily have to encompass both the radio and the fiber-optical domain. This paper introduces the Decentralized Multi-Party, Multi-Network AI (DMMAI) framework for integrating AI into 6G networks deployed at scale. DM…

    Submitted 15 April, 2024; originally announced July 2024.

  3. arXiv:2406.17910 [pdf]

    cs.SE cs.AI

    Transforming Software Development: Evaluating the Efficiency and Challenges of GitHub Copilot in Real-World Projects

    Authors: Ruchika Pandey, Prabhat Singh, Raymond Wei, Shaila Shankar

    Abstract: Generative AI technologies promise to transform the product development lifecycle. This study evaluates the efficiency gains, areas for improvement, and emerging challenges of using GitHub Copilot, an AI-powered coding assistant. We identified 15 software development tasks and assessed Copilot's benefits through real-world projects on large proprietary code bases. Our findings indicate significant…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 13 pages, 8 figures

  4. arXiv:2406.11356 [pdf, other]

    cs.CR cs.NI

    DIDChain: Advancing Supply Chain Data Management with Decentralized Identifiers and Blockchain

    Authors: Patrick Herbke, Sid Lamichhane, Kaustabh Barman, Sanjeet Raj Pandey, Axel Küpper, Andreas Abraham, Markus Sabadello

    Abstract: Supply chain data management faces challenges in traceability, transparency, and trust. These issues stem from data silos and communication barriers. This research introduces DIDChain, a framework leveraging blockchain technology, Decentralized Identifiers, and the InterPlanetary File System. DIDChain improves supply chain data management. To address privacy concerns, DIDChain employs a hybrid blo…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to be published at the 18th IEEE International Conference on Service-Oriented System Engineering 2024

  5. arXiv:2405.16684 [pdf, other]

    cs.CL cs.LG

    gzip Predicts Data-dependent Scaling Laws

    Authors: Rohan Pandey

    Abstract: Past work has established scaling laws that predict the performance of a neural language model (LM) as a function of its parameter count and the number of tokens it's trained on, enabling optimal allocation of a fixed compute budget. Are these scaling laws agnostic to training data as some prior work suggests? We generate training datasets of varying complexities by modulating the syntactic proper…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 9 pages, 9 figures

  6. arXiv:2404.12816 [pdf, other]

    cs.NI

    Coexistence of Push Wireless Access with Pull Communication for Content-based Wake-up Radios

    Authors: Junya Shiraishi, Sara Cavallero, Shashi Raj Pandey, Fabio Saggese, Petar Popovski

    Abstract: This paper considers energy-efficient connectivity for Internet of Things (IoT) devices in a coexistence scenario between two distinctive communication models: pull- and push-based. In pull-based, the base station (BS) decides when to retrieve a specific type of data from the IoT devices, while in push-based, the IoT device decides when and which data to transmit. To this end, this paper advocates…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Paper submitted to Globecom 2024. Copyright may be transferred without further notice

  7. arXiv:2404.01786 [pdf]

    cs.CL

    Generative AI-Based Text Generation Methods Using Pre-Trained GPT-2 Model

    Authors: Rohit Pandey, Hetvi Waghela, Sneha Rakshit, Aparna Rangari, Anjali Singh, Rahul Kumar, Ratnadeep Ghosal, Jaydip Sen

    Abstract: This work delved into the realm of automatic text generation, exploring a variety of techniques ranging from traditional deterministic approaches to more modern stochastic methods. Through analysis of greedy search, beam search, top-k sampling, top-p sampling, contrastive searching, and locally typical searching, this work has provided valuable insights into the strengths, weaknesses, and potentia…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: This report pertains to the Capstone Project done by Group 5 of the Fall batch of 2023 students at Praxis Tech School, Kolkata, India. The report consists of 57 pages and includes 17 figures and 8 tables. This is the preprint which will be submitted to IEEE CONIT 2024 for review

  8. arXiv:2404.01543 [pdf, other]

    cs.CV cs.GR

    Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes

    Authors: Ziqian Bai, Feitong Tan, Sean Fanello, Rohit Pandey, Mingsong Dou, Shichen Liu, Ping Tan, Yinda Zhang

    Abstract: 3D head avatars built with neural implicit volumetric representations have achieved unprecedented levels of photorealism. However, the computational cost of these methods remains a significant barrier to their widespread adoption, particularly in real-time applications such as virtual reality and teleconferencing. While attempts have been made to develop fast neural rendering approaches for static…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: In CVPR2024. Project page: https://augmentedperception.github.io/monoavatar-plus

  9. arXiv:2402.11909 [pdf, other]

    cs.CV

    One2Avatar: Generative Implicit Head Avatar For Few-shot User Adaptation

    Authors: Zhixuan Yu, Ziqian Bai, Abhimitra Meka, Feitong Tan, Qiangeng Xu, Rohit Pandey, Sean Fanello, Hyun Soo Park, Yinda Zhang

    Abstract: Traditional methods for constructing high-quality, personalized head avatars from monocular videos demand extensive face captures and training time, posing a significant challenge for scalability. This paper introduces a novel approach to create high quality head avatar utilizing only a single or a few images per user. We learn a generative model for 3D animatable photo-realistic head avatar from…

    Submitted 19 February, 2024; originally announced February 2024.

  10. arXiv:2402.10051 [pdf, other]

    cs.AI cs.CL

    SwissNYF: Tool Grounded LLM Agents for Black Box Setting

    Authors: Somnath Sendhil Kumar, Dhruv Jain, Eshaan Agarwal, Raunak Pandey

    Abstract: While Large Language Models (LLMs) have demonstrated enhanced capabilities in function-calling, these advancements primarily rely on accessing the functions' responses. This methodology is practical for simpler APIs but faces scalability issues with irreversible APIs that significantly impact the system, such as a database deletion API. Similarly, processes requiring extensive time for each API ca…

    Submitted 15 February, 2024; originally announced February 2024.

  11. arXiv:2312.15024 [pdf, other]

    cs.IT

    Coded Caching for Hierarchical Two-Layer Networks with Coded Placement

    Authors: Rajlaxmi Pandey, Charul Rajput, B. Sundar Rajan

    Abstract: We consider two layered hierarchical coded caching problem introduced in Hierarchical coded caching, IEEE Trans. Inf. Theory, 2016, in which a server is connected to $K_1$ mirrors, and each mirror is connected to $K_2$ users. The mirrors and the users are equipped with the cache of size $M_1$ and $M_2$, respectively. We propose a hierarchical coded caching scheme with coded placements that perform…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 16 pages and 5 figures

  12. Greedy Shapley Client Selection for Communication-Efficient Federated Learning

    Authors: Pranava Singhal, Shashi Raj Pandey, Petar Popovski

    Abstract: The standard client selection algorithms for Federated Learning (FL) are often unbiased and involve uniform random sampling of clients. This has been proven sub-optimal for fast convergence under practical settings characterized by significant heterogeneity in data distribution, computing, and communication resources across clients. For applications having timing constraints due to limited communi…

    Submitted 7 February, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Networking Letters

  13. arXiv:2312.04875 [pdf, other]

    cs.CV

    MVDD: Multi-View Depth Diffusion Models

    Authors: Zhen Wang, Qiangeng Xu, Feitong Tan, Menglei Chai, Shichen Liu, Rohit Pandey, Sean Fanello, Achuta Kadambi, Yinda Zhang

    Abstract: Denoising diffusion models have demonstrated outstanding results in 2D image generation, yet it remains a challenge to replicate its success in 3D shape generation. In this paper, we propose leveraging multi-view depth, which represents complex 3D shapes in a 2D data format that is easy to denoise. We pair this representation with a diffusion model, MVDD, that is capable of generating high-quality…

    Submitted 19 December, 2023; v1 submitted 8 December, 2023; originally announced December 2023.

  14. arXiv:2312.03763 [pdf, other]

    cs.CV cs.GR cs.LG

    Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing

    Authors: Yushi Lan, Feitong Tan, Di Qiu, Qiangeng Xu, Kyle Genova, Zeng Huang, Sean Fanello, Rohit Pandey, Thomas Funkhouser, Chen Change Loy, Yinda Zhang

    Abstract: We present a novel framework for generating photorealistic 3D human head and subsequently manipulating and reposing them with remarkable flexibility. The proposed approach leverages an implicit function representation of 3D human heads, employing 3D Gaussians anchored on a parametric face model. To enhance representational capabilities and encode spatial information, we embed a lightweight tri-pla…

    Submitted 19 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: project webpage: https://nirvanalan.github.io/projects/gaussian3diff/

  15. arXiv:2312.02611 [pdf, other]

    cs.LG cs.CR cs.GT

    Privacy-Aware Data Acquisition under Data Similarity in Regression Markets

    Authors: Shashi Raj Pandey, Pierre Pinson, Petar Popovski

    Abstract: Data markets facilitate decentralized data exchange for applications such as prediction, learning, or inference. The design of these markets is challenged by varying privacy preferences as well as data similarity among data owners. Related works have often overlooked how data similarity impacts pricing and data value through statistical information leakage. We demonstrate that data similarity and…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: Submitted to IEEE Transactions on Neural Networks and Learning Systems (submission version)

  16. arXiv:2311.08053 [pdf, other]

    cs.LG

    Batch Selection and Communication for Active Learning with Edge Labeling

    Authors: Victor Croisfelt, Shashi Raj Pandey, Osvaldo Simeone, Petar Popovski

    Abstract: Conventional retransmission (ARQ) protocols are designed with the goal of ensuring the correct reception of all the individual transmitter's packets at the receiver. When the transmitter is a learner communicating with a teacher, this goal is at odds with the actual aim of the learner, which is that of eliciting the most relevant label information from the teacher. Taking an active learning perspe…

    Submitted 22 May, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 6 pages, 4 figures, conference version, accepted in IEEE ICC 2024, Workshop on Task-Oriented and Generative Communications For 6G

  17. arXiv:2311.04788 [pdf, other]

    cs.NI eess.SP

    TinyAirNet: TinyML Model Transmission for Energy-efficient Image Retrieval from IoT Devices

    Authors: Junya Shiraishi, Mathias Thorsager, Shashi Raj Pandey, Petar Popovski

    Abstract: This letter introduces an energy-efficient pull-based data collection framework for Internet of Things (IoT) devices that use Tiny Machine Learning (TinyML) to interpret data queries. A TinyML model is transmitted from the edge server to the IoT devices. The devices employ the model to facilitate the subsequent semantic queries. This reduces the transmission of irrelevant data, but receiving the M…

    Submitted 17 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: 5 pages, 3 figures, Submitted for possible publication

  18. arXiv:2311.00991 [pdf, other]

    cs.CV

    IR-UWB Radar-based Situational Awareness System for Smartphone-Distracted Pedestrians

    Authors: Jamsheed Manja Ppallan, Ruchi Pandey, Yellappa Damam, Vijay Narayan Tiwari, Karthikeyan Arunachalam, Antariksha Ray

    Abstract: With the widespread adoption of smartphones, ensuring pedestrian safety on roads has become a critical concern due to smartphone distraction. This paper proposes a novel and real-time assistance system called UWB-assisted Safe Walk (UASW) for obstacle detection and warns users about real-time situations. The proposed method leverages Impulse Radio Ultra-Wideband (IR-UWB) radar embedded in the smar…

    Submitted 2 November, 2023; originally announced November 2023.

  19. arXiv:2308.14179 [pdf, other]

    cs.CL cs.AI cs.CV

    Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

    Authors: Vedant Palit, Rohan Pandey, Aryaman Arora, Paul Pu Liang

    Abstract: Mechanistic interpretability seeks to understand the neural mechanisms that enable specific behaviors in Large Language Models (LLMs) by leveraging causality-based methods. While these approaches have identified neural circuits that copy spans of text, capture factual knowledge, and more, they remain unusable for multimodal models since adapting these tools to the vision-language domain requires c…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: Final version for 5th Workshop on Closing the Loop Between Vision and Language (CLVL) @ ICCV 2023. 4 pages, 5 figures

  20. arXiv:2308.07265 [pdf, other]

    eess.AS cs.SD

    Localization of DOA trajectories -- Beyond the grid

    Authors: Ruchi Pandey, Santosh Nannuru

    Abstract: The direction of arrival (DOA) estimation algorithms are crucial in localizing acoustic sources. Traditional localization methods rely on block-level processing to extract the directional information from multiple measurements processed together. However, these methods assume that DOA remains constant throughout the block, which may not be true in practical scenarios. Also, the performance of loca…

    Submitted 14 August, 2023; originally announced August 2023.

  21. arXiv:2308.03580 [pdf, other]

    cs.CV cs.AI

    Revealing the Underlying Patterns: Investigating Dataset Similarity, Performance, and Generalization

    Authors: Akshit Achara, Ram Krishna Pandey

    Abstract: Supervised deep learning models require significant amount of labeled data to achieve an acceptable performance on a specific task. However, when tested on unseen data, the models may not perform well. Therefore, the models need to be trained with additional and varying labeled data to improve the generalization. In this work, our goal is to understand the models, their performance and generalizat…

    Submitted 29 December, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  22. arXiv:2308.01887 [pdf, other]

    cs.CL

    Athena 2.0: Discourse and User Modeling in Open Domain Dialogue

    Authors: Omkar Patil, Lena Reed, Kevin K. Bowden, Juraj Juraska, Wen Cui, Vrindavan Harrison, Rishi Rajasekaran, Angela Ramirez, Cecilia Li, Eduardo Zamora, Phillip Lee, Jeshwanth Bheemanpally, Rohan Pandey, Adwait Ratnaparkhi, Marilyn Walker

    Abstract: Conversational agents are consistently growing in popularity and many people interact with them every day. While many conversational agents act as personal assistants, they can have many different goals. Some are task-oriented, such as providing customer support for a bank or making a reservation. Others are designed to be empathetic and to form emotional connections with the user. The Alexa Prize…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: Alexa Prize Proceedings, 2021. Socialbot Grand Challenge 4

  23. arXiv:2306.04539 [pdf, other]

    cs.LG cs.CL cs.CV cs.IT stat.ML

    Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

    Authors: Paul Pu Liang, Chun Kai Ling, Yun Cheng, Alex Obolenskiy, Yudong Liu, Rohan Pandey, Alex Wilf, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: how modalities combine to provide new task-relevant information that was not present in either alone. We study this challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data and naturally co-occurri…

    Submitted 13 June, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: ICLR 2024, Code available at: https://github.com/pliang279/PID

  24. arXiv:2305.19088 [pdf, other]

    cs.CV

    TrueDeep: A systematic approach of crack detection with less data

    Authors: Ram Krishna Pandey, Akshit Achara

    Abstract: Supervised and semi-supervised semantic segmentation algorithms require significant amount of annotated data to achieve a good performance. In many situations, the data is either not available or the annotation is expensive. The objective of this work is to show that by incorporating domain knowledge along with deep learning architectures, we can achieve similar performance with less data. We have…

    Submitted 30 May, 2023; originally announced May 2023.

  25. arXiv:2305.16328 [pdf, other]

    cs.CL cs.LG

    Semantic Composition in Visually Grounded Language Models

    Authors: Rohan Pandey

    Abstract: What is sentence meaning and its ideal representation? Much of the expressive power of human language derives from semantic composition, the mind's ability to represent meaning hierarchically & relationally over constituents. At the same time, much sentential meaning is outside the text and requires grounding in sensory, motor, and experiential modalities to be adequately learned. Although large l…

    Submitted 14 May, 2023; originally announced May 2023.

    Comments: Carnegie Mellon University Senior Thesis. arXiv admin note: substantial text overlap with arXiv:2212.10549

  26. arXiv:2305.11633 [pdf, other]

    cs.DC cs.LG

    Goal-Oriented Communications in Federated Learning via Feedback on Risk-Averse Participation

    Authors: Shashi Raj Pandey, Van Phuc Bui, Petar Popovski

    Abstract: We treat the problem of client selection in a Federated Learning (FL) setup, where the learning objective and the local incentives of the participants are used to formulate a goal-oriented communication problem. Specifically, we incorporate the risk-averse nature of participants and obtain a communication-efficient on-device performance, while relying on feedback from the Parameter Server (\texttt…

    Submitted 19 May, 2023; originally announced May 2023.

    Journal ref: PIMRC 2023, WS NAISC

  27. arXiv:2305.04745 [pdf, other]

    cs.CV cs.GR

    Controllable Light Diffusion for Portraits

    Authors: David Futschik, Kelvin Ritland, James Vecore, Sean Fanello, Sergio Orts-Escolano, Brian Curless, Daniel Sýkora, Rohit Pandey

    Abstract: We introduce light diffusion, a novel method to improve lighting in portraits, softening harsh shadows and specular highlights while preserving overall scene illumination. Inspired by professional photographers' diffusers and scrims, our method softens lighting given only a single portrait photo. Previous portrait relighting approaches focus on changing the entire lighting environment, removing sh…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: CVPR 2023

    ACM Class: I.4.3

  28. arXiv:2304.01436 [pdf, other]

    cs.CV cs.GR

    Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos

    Authors: Ziqian Bai, Feitong Tan, Zeng Huang, Kripasindhu Sarkar, Danhang Tang, Di Qiu, Abhimitra Meka, Ruofei Du, Mingsong Dou, Sergio Orts-Escolano, Rohit Pandey, Ping Tan, Thabo Beeler, Sean Fanello, Yinda Zhang

    Abstract: We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild. The learnt avatar is driven by a parametric face model to achieve user-controlled facial expressions and head poses. Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field to achieve fine-grained control and photorealism. To reduc…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: In CVPR2023. Project page: https://augmentedperception.github.io/monoavatar/

  29. PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers

    Authors: Rahul Pandey, Roger Ren, Qi Luo, Jing Liu, Ariya Rastrow, Ankur Gandhe, Denis Filimonov, Grant Strimel, Andreas Stolcke, Ivan Bulyko

    Abstract: End-to-End (E2E) automatic speech recognition (ASR) systems used in voice assistants often have difficulties recognizing infrequent words personalized to the user, such as names and places. Rare words often have non-trivial pronunciations, and in such cases, human knowledge in the form of a pronunciation lexicon can be useful. We propose a PROnunCiation-aware conTextual adaptER (PROCTER) that dyna…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: To appear in Proc. IEEE ICASSP

    Journal ref: Proc. IEEE ICASSP, June 2023

  30. arXiv:2302.01672 [pdf, other]

    eess.SP cs.NI

    The Role of Game Networking in the Fusion of Physical and Digital Worlds through 6G Wireless Networks

    Authors: Van-Phuc Bui, Shashi Raj Pandey, Andreas Casparsen, Federico Chiariotti, Petar Popovski

    Abstract: The sixth generation (6G) of wireless technology is seen as one of the enablers of real-time fusion of the physical and digital realms, as in Digital Twin, eXtended reality, or the Metaverse. This would allow people to interact, work, and entertain themselves in an immersive social network of online 3D virtual environments. From the viewpoint of communication and networking, this will represent an…

    Submitted 28 August, 2024; v1 submitted 3 February, 2023; originally announced February 2023.

  31. arXiv:2301.08998 [pdf, other]

    cs.CL

    Syntax-guided Neural Module Distillation to Probe Compositionality in Sentence Embeddings

    Authors: Rohan Pandey

    Abstract: Past work probing compositionality in sentence embedding models faces issues determining the causal impact of implicit syntax representations. Given a sentence, we construct a neural module net based on its syntax parse and train it end-to-end to approximate the sentence's embedding generated by a transformer model. The distillability of a transformer to a Syntactic NeurAl Module Net (SynNaMoN) th…

    Submitted 8 February, 2023; v1 submitted 21 January, 2023; originally announced January 2023.

    Comments: EACL 2023 (camera-ready)

  32. arXiv:2212.10549 [pdf, other]

    cs.CL cs.CV cs.LG

    Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment

    Authors: Rohan Pandey, Rulin Shao, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

    Abstract: Despite recent progress towards scaling up multimodal vision-language models, these models are still known to struggle on compositional generalization benchmarks such as Winoground. We find that a critical component lacking from current vision-language models is relation-level alignment: the ability to match directional semantic relations in text (e.g., "mug in grass") with spatial relationships i…

    Submitted 4 July, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  33. arXiv:2212.03470 [pdf, other]

    eess.AS cs.SD

    Improving trajectory localization accuracy via direction-of-arrival derivative estimation

    Authors: Ruchi Pandey, Shreyas Jaiswal, Huy Phan, Santosh Nannuru

    Abstract: Sound source localization is crucial in acoustic sensing and monitoring-related applications. In this paper, we do a comprehensive analysis of improvement in sound source localization by combining the direction of arrivals (DOAs) with their derivatives which quantify the changes in the positions of sources over time. This study uses the SALSA-Lite feature with a convolutional recurrent neural netw…

    Submitted 10 December, 2022; v1 submitted 7 December, 2022; originally announced December 2022.

  34. arXiv:2209.09775 [pdf, other]

    cs.LG cs.DC cs.GT cs.NI

    FedToken: Tokenized Incentives for Data Contribution in Federated Learning

    Authors: Shashi Raj Pandey, Lam Duc Nguyen, Petar Popovski

    Abstract: Incentives that compensate for the involved costs in the decentralized training of a Federated Learning (FL) model act as a key stimulus for clients' long-term participation. However, it is challenging to convince clients for quality participation in FL due to the absence of: (i) full information on the client's data quality and properties; (ii) the value of client's data contributions; and (iii)…

    Submitted 3 November, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

    Comments: Accepted at Workshop on Federated Learning: Recent Advances and New Challenges, in Conjunction with NeurIPS 2022 (FL-NeurIPS'22). 9 Pages, 5 Figures

  35. arXiv:2209.04648 [pdf, other]

    cs.CV cs.AI

    CoreDeep: Improving Crack Detection Algorithms Using Width Stochasticity

    Authors: Ram Krishna Pandey, Akshit Achara

    Abstract: Automatically detecting or segmenting cracks in images can help in reducing the cost of maintenance or operations. Detecting, measuring and quantifying cracks for distress analysis in challenging background scenarios is a difficult task as there is no clear boundary that separates cracks from the background. Developed algorithms should handle the inherent challenges associated with data. Some of t…

    Submitted 29 December, 2023; v1 submitted 10 September, 2022; originally announced September 2022.

  36. arXiv:2206.07785 [pdf, other]

    cs.NI cs.DC cs.GT cs.LG eess.SY

    Strategic Coalition for Data Pricing in IoT Data Markets

    Authors: Shashi Raj Pandey, Pierre Pinson, Petar Popovski

    Abstract: This paper considers a market for trading Internet of Things (IoT) data that is used to train machine learning models. The data, either raw or processed, is supplied to the market platform through a network and the price of such data is controlled based on the value it brings to the machine learning model. We explore the correlation property of data in a game-theoretical setting to eventually deri…

    Submitted 29 August, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: 15 pages. 12 figures. This paper has been accepted for publication in IEEE Internet of Things Journal. Copyright may change without notice

  37. Semi-Private Computation of Data Similarity with Applications to Data Valuation and Pricing

    Authors: René Bødker Christensen, Shashi Raj Pandey, Petar Popovski

    Abstract: Consider two data providers that want to contribute data to a certain learning model. Recent works have shown that the value of the data of one of the providers is dependent on the similarity with the data owned by the other provider. It would thus be beneficial if the two providers can calculate the similarity of their data, while keeping the actual data private. In this work, we devise multipart…

    Submitted 11 April, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: 11 pages

    MSC Class: 94A60; 68P27

    Journal ref: IEEE Transactions on Information Forensics and Security (2023). Vol 18, pp. 1978-1988

  38. arXiv:2204.01542 [pdf, other]

    cs.LG cs.AI

    CDKT-FL: Cross-Device Knowledge Transfer using Proxy Dataset in Federated Learning

    Authors: Huy Q. Le, Minh N. H. Nguyen, Shashi Raj Pandey, Chaoning Zhang, Choong Seon Hong

    Abstract: In a practical setting, how to enable robust Federated Learning (FL) systems, both in terms of generalization and personalization abilities, is one important research question. It is a challenging issue due to the consequences of non-i.i.d. properties of client's data, often referred to as statistical heterogeneity, and small local data samples from the various data distributions. Therefore, to de…

    Submitted 8 June, 2024; v1 submitted 4 April, 2022; originally announced April 2022.

    Comments: Accepted to Engineering Applications of Artificial Intelligence (EAAI)

  39. A Contribution-based Device Selection Scheme in Federated Learning

    Authors: Shashi Raj Pandey, Lam D. Nguyen, Petar Popovski

    Abstract: In a Federated Learning (FL) setup, a number of devices contribute to the training of a common model. We present a method for selecting the devices that provide updates in order to achieve improved generalization, fast convergence, and better device-level performance. We formulate a min-max optimization problem and decompose it into a primal-dual setup, where the duality gap is used to quantify th…

    Submitted 7 June, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: This work has been accepted for publication in IEEE Communications Letters

  40. arXiv:2201.04873 [pdf, other]

    cs.CV

    VoLux-GAN: A Generative Model for 3D Face Synthesis with HDRI Relighting

    Authors: Feitong Tan, Sean Fanello, Abhimitra Meka, Sergio Orts-Escolano, Danhang Tang, Rohit Pandey, Jonathan Taylor, Ping Tan, Yinda Zhang

    Abstract: We propose VoLux-GAN, a generative framework to synthesize 3D-aware faces with convincing relighting. Our main contribution is a volumetric HDRI relighting method that can efficiently accumulate albedo, diffuse and specular lighting contributions along each 3D ray for any desired HDR environmental map. Additionally, we show the importance of supervising the image decomposition process using multip…

    Submitted 13 January, 2022; originally announced January 2022.

  41. arXiv:2112.02870  [pdf, other]

    cs.LG cs.DC

    A Marketplace for Trading AI Models based on Blockchain and Incentives for IoT Data

    Authors: Lam Duc Nguyen, Shashi Raj Pandey, Soret Beatriz, Arne Broering, Petar Popovski

    Abstract: As Machine Learning (ML) models are becoming increasingly complex, one of the central challenges is their deployment at scale, such that companies and organizations can create value through Artificial Intelligence (AI). An emerging paradigm in ML is a federated approach where the learning model is delivered to a group of heterogeneous agents partially, allowing agents to train the model locally wi…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 14 pages, 9 figures, submitted for publication

  42. arXiv:2111.05811  [pdf, other]

    cs.IT cs.NI

    Internet of Things (IoT) Connectivity in 6G: An Interplay of Time, Space, Intelligence, and Value

    Authors: Petar Popovski, Federico Chiariotti, Victor Croisfelt, Anders E. Kalør, Israel Leyva-Mayorga, Letizia Marchegiani, Shashi Raj Pandey, Beatriz Soret

    Abstract: Internet of Things (IoT) connectivity has a prominent presence in the 5G wireless communication systems. As these systems are being deployed, there is a surge of research efforts and visions towards 6G wireless systems. In order to position the evolution of IoT within the 6G systems, this paper first takes a critical view on the way IoT connectivity is supported within 5G. Following that, the wire…

    Submitted 11 November, 2021; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: Submitted for publication

  43. arXiv:2111.02519  [pdf, other]

    cs.CL

    Athena 2.0: Contextualized Dialogue Management for an Alexa Prize SocialBot

    Authors: Juraj Juraska, Kevin K. Bowden, Lena Reed, Vrindavan Harrison, Wen Cui, Omkar Patil, Rishi Rajasekaran, Angela Ramirez, Cecilia Li, Eduardo Zamora, Phillip Lee, Jeshwanth Bheemanpally, Rohan Pandey, Adwait Ratnaparkhi, Marilyn Walker

    Abstract: Athena 2.0 is an Alexa Prize SocialBot that has been a finalist in the last two Alexa Prize Grand Challenges. One reason for Athena's success is its novel dialogue management strategy, which allows it to dynamically construct dialogues and responses from component modules, leading to novel conversations with every interaction. Here we describe Athena's system design and performance in the Alexa Pr…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: Accepted to EMNLP 2021 System Demonstrations

  44. arXiv:2108.10316  [pdf, ps, other]

    cs.IT

    A Generalization of the ASR Search Algorithm to 2-Generator Quasi-Twisted Codes

    Authors: Dev Akre, Nuh Aydin, Matthew J. Harrington, Saurav R. Pandey

    Abstract: One of the main goals of coding theory is to construct codes with best possible parameters and properties. A special class of codes called quasi-twisted (QT) codes is well-known to produce codes with good parameters. Most of the work on QT codes has been over the 1-generator case. In this work, we focus on 2-generator QT codes and generalize the ASR algorithm that has been very effective to produc…

    Submitted 20 August, 2021; originally announced August 2021.

  45. arXiv:2108.06752  [pdf, ps, other]

    cs.IT

    New Binary and Ternary Quasi-Cyclic Codes with Good Properties

    Authors: Dev Akre, Nuh Aydin, Matthew J. Harrington, Saurav R. Pandey

    Abstract: One of the most important and challenging problems in coding theory is to construct codes with best possible parameters and properties. The class of quasi-cyclic (QC) codes is known to be fertile to produce such codes. Focusing on QC codes over the binary field, we have found 113 binary QC codes that are new among the class of QC codes using an implementation of a fast cyclic partitioning algorith…

    Submitted 15 August, 2021; originally announced August 2021.

    MSC Class: 94B05; 94B15; 94B65

  46. arXiv:2108.04208  [pdf, other]

    cs.CY cs.HC

    Experiences with the Introduction of AI-based Tools for Moderation Automation of Voice-based Participatory Media Forums

    Authors: Aman Khullar, Paramita Panjal, Rachit Pandey, Abhishek Burnwal, Prashit Raj, Ankit Akash Jha, Priyadarshi Hitesh, R Jayanth Reddy, Himanshu, Aaditeshwar Seth

    Abstract: Voice-based discussion forums, where users can record audio messages that are then published for other users to listen to and comment on, are often moderated to ensure that the published audios are of good quality, relevant, and adhere to editorial guidelines of the forum. There is room for the introduction of AI-based tools in the moderation process, such as to identify and filter out blank or noisy au…

    Submitted 9 August, 2021; originally announced August 2021.

    Journal ref: The 3rd KDD Workshop on Data Science for Social Good, 2021

  47. arXiv:2103.15573  [pdf, other]

    cs.CV

    HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences

    Authors: Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Rohit Pandey, Cem Keskin, Ruofei Du, Deqing Sun, Sofien Bouaziz, Sean Fanello, Ping Tan, Yinda Zhang

    Abstract: In this paper, we address the problem of building dense correspondences between human images under arbitrary camera viewpoints and body poses. Prior art either assumes small motion between frames or relies on local descriptors, which cannot handle large motion or visually ambiguous body parts, e.g., left vs. right hand. In contrast, we propose a deep learning framework that maps each pixel to a fe…

    Submitted 29 March, 2021; originally announced March 2021.

  48. arXiv:2103.13293  [pdf, other]

    cs.NI eess.SY

    Energy-aware Resource Management for Federated Learning in Multi-access Edge Computing Systems

    Authors: Chit Wutyee Zaw, Shashi Raj Pandey, Kitae Kim, Choong Seon Hong

    Abstract: In Federated Learning (FL), a global statistical model is developed by encouraging mobile users to perform the model training on their local data and aggregating the output local model parameters in an iterative manner. However, due to limited energy and computation capability at the mobile devices, the performance of the model training is always at stake to meet the objective of local energy mini…

    Submitted 11 January, 2021; originally announced March 2021.

  49. arXiv:2103.06049  [pdf]

    cs.SD cs.RO eess.AS

    Search Disaster Victims using Sound Source Localization

    Authors: Abhish Khanal, Deepak Chand, Prakash Chaudhary, Subash Timilsina, Sanjeeb Prasad Panday, Aman Shakya, Rom Kant Pandey

    Abstract: Sound Source Localization (SSL) is used to estimate the position of sound sources. Various methods have been used for detecting sound and its localization. This paper presents a system for stationary sound source localization by a cubical microphone array consisting of eight microphones placed on four vertical adjacent faces, which is mounted on a three-wheel omni-directional drive for the inspection…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: 9 pages, 17 figures, 17th ISCRAM Conference Blacksburg, VA, USA

    Journal ref: Iscram 2020 1022-1030

  50. Edge-assisted Democratized Learning Towards Federated Analytics

    Authors: Shashi Raj Pandey, Minh N. H. Nguyen, Tri Nguyen Dang, Nguyen H. Tran, Kyi Thar, Zhu Han, Choong Seon Hong

    Abstract: A recent take towards Federated Analytics (FA), which allows analytical insights of distributed datasets, reuses the Federated Learning (FL) infrastructure to evaluate the summary of model performances across the training devices. However, the current realization of FL adopts single server-multiple client architecture with limited scope for FA, which often results in learning models with poor gene…

    Submitted 31 May, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: Accepted for publication in IEEE Internet of Things Journal