Search | arXiv e-print repository

Exploring Linguistic Similarity and Zero-Shot Learning for Multilingual Translation of Dravidian Languages

Authors: Danish Ebadulla, Rahul Raman, S. Natarajan, Hridhay Kiran Shetty, Ashish Harish Shenoy

Abstract: Current research in zero-shot translation is plagued by several issues such as high compute requirements, increased training time and off target translations. Proposed remedies often come at the cost of additional data or compute requirements. Pivot based neural machine translation is preferred over a single-encoder model for most settings despite the increased training and evaluation time. In thi… ▽ More Current research in zero-shot translation is plagued by several issues such as high compute requirements, increased training time and off target translations. Proposed remedies often come at the cost of additional data or compute requirements. Pivot based neural machine translation is preferred over a single-encoder model for most settings despite the increased training and evaluation time. In this work, we overcome the shortcomings of zero-shot translation by taking advantage of transliteration and linguistic similarity. We build a single encoder-decoder neural machine translation system for Dravidian-Dravidian multilingual translation and perform zero-shot translation. We compare the data vs zero-shot accuracy tradeoff and evaluate the performance of our vanilla method against the current state of the art pivot based method. We also test the theory that morphologically rich languages require large vocabularies by restricting the vocabulary using an optimal transport based technique. Our model manages to achieves scores within 3 BLEU of large-scale pivot-based models when it is trained on 50\% of the language directions. △ Less

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2303.04923 [pdf, other]

BOSS: Bones, Organs and Skin Shape Model

Authors: Karthik Shetty, Annette Birkhold, Srikrishna Jaganathan, Norbert Strobel, Bernhard Egger, Markus Kowarschik, Andreas Maier

Abstract: Objective: A digital twin of a patient can be a valuable tool for enhancing clinical tasks such as workflow automation, patient-specific X-ray dose optimization, markerless tracking, positioning, and navigation assistance in image-guided interventions. However, it is crucial that the patient's surface and internal organs are of high quality for any pose and shape estimates. At present, the majorit… ▽ More Objective: A digital twin of a patient can be a valuable tool for enhancing clinical tasks such as workflow automation, patient-specific X-ray dose optimization, markerless tracking, positioning, and navigation assistance in image-guided interventions. However, it is crucial that the patient's surface and internal organs are of high quality for any pose and shape estimates. At present, the majority of statistical shape models (SSMs) are restricted to a small number of organs or bones or do not adequately represent the general population. Method: To address this, we propose a deformable human shape and pose model that combines skin, internal organs, and bones, learned from CT images. By modeling the statistical variations in a pose-normalized space using probabilistic PCA while also preserving joint kinematics, our approach offers a holistic representation of the body that can benefit various medical applications. Results: We assessed our model's performance on a registered dataset, utilizing the unified shape space, and noted an average error of 3.6 mm for bones and 8.8 mm for organs. To further verify our findings, we conducted additional tests on publicly available datasets with multi-part segmentations, which confirmed the effectiveness of our model. Conclusion: This works shows that anatomically parameterized statistical shape models can be created accurately and in a computationally efficient manner. Significance: The proposed approach enables the construction of shape models that can be directly applied to various medical applications, including biomechanics and reconstruction. △ Less

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2211.11734 [pdf, other]

PLIKS: A Pseudo-Linear Inverse Kinematic Solver for 3D Human Body Estimation

Authors: Karthik Shetty, Annette Birkhold, Srikrishna Jaganathan, Norbert Strobel, Markus Kowarschik, Andreas Maier, Bernhard Egger

Abstract: We introduce PLIKS (Pseudo-Linear Inverse Kinematic Solver) for reconstruction of a 3D mesh of the human body from a single 2D image. Current techniques directly regress the shape, pose, and translation of a parametric model from an input image through a non-linear mapping with minimal flexibility to any external influences. We approach the task as a model-in-the-loop optimization problem. PLIKS i… ▽ More We introduce PLIKS (Pseudo-Linear Inverse Kinematic Solver) for reconstruction of a 3D mesh of the human body from a single 2D image. Current techniques directly regress the shape, pose, and translation of a parametric model from an input image through a non-linear mapping with minimal flexibility to any external influences. We approach the task as a model-in-the-loop optimization problem. PLIKS is built on a linearized formulation of the parametric SMPL model. Using PLIKS, we can analytically reconstruct the human model via 2D pixel-aligned vertices. This enables us with the flexibility to use accurate camera calibration information when available. PLIKS offers an easy way to introduce additional constraints such as shape and translation. We present quantitative evaluations which confirm that PLIKS achieves more accurate reconstruction with greater than 10% improvement compared to other state-of-the-art methods with respect to the standard 3D human pose and shape benchmarks while also obtaining a reconstruction error improvement of 12.9 mm on the newer AGORA dataset. △ Less

Submitted 27 March, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: CVPR2023

arXiv:2210.07611 [pdf, other]

Self-Supervised 2D/3D Registration for X-Ray to CT Image Fusion

Authors: Srikrishna Jaganathan, Maximilian Kukla, Jian Wang, Karthik Shetty, Andreas Maier

Abstract: Deep Learning-based 2D/3D registration enables fast, robust, and accurate X-ray to CT image fusion when large annotated paired datasets are available for training. However, the need for paired CT volume and X-ray images with ground truth registration limits the applicability in interventional scenarios. An alternative is to use simulated X-ray projections from CT volumes, thus removing the need fo… ▽ More Deep Learning-based 2D/3D registration enables fast, robust, and accurate X-ray to CT image fusion when large annotated paired datasets are available for training. However, the need for paired CT volume and X-ray images with ground truth registration limits the applicability in interventional scenarios. An alternative is to use simulated X-ray projections from CT volumes, thus removing the need for paired annotated datasets. Deep Neural Networks trained exclusively on simulated X-ray projections can perform significantly worse on real X-ray images due to the domain gap. We propose a self-supervised 2D/3D registration framework combining simulated training with unsupervised feature and pixel space domain adaptation to overcome the domain gap and eliminate the need for paired annotated datasets. Our framework achieves a registration accuracy of 1.83$\pm$1.16 mm with a high success ratio of 90.1% on real X-ray images showing a 23.9% increase in success ratio compared to reference annotation-free algorithms. △ Less

Submitted 14 October, 2022; originally announced October 2022.

Comments: Accepted at WACV 2023

arXiv:2207.11345 [pdf, ps, other]

doi 10.21437/Interspeech.2022-10816

Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities

Authors: Pranav Dheram, Murugesan Ramakrishnan, Anirudh Raju, I-Fan Chen, Brian King, Katherine Powell, Melissa Saboowala, Karan Shetty, Andreas Stolcke

Abstract: As for other forms of AI, speech recognition has recently been examined with respect to performance disparities across different user cohorts. One approach to achieve fairness in speech recognition is to (1) identify speaker cohorts that suffer from subpar performance and (2) apply fairness mitigation measures targeting the cohorts discovered. In this paper, we report on initial findings with both… ▽ More As for other forms of AI, speech recognition has recently been examined with respect to performance disparities across different user cohorts. One approach to achieve fairness in speech recognition is to (1) identify speaker cohorts that suffer from subpar performance and (2) apply fairness mitigation measures targeting the cohorts discovered. In this paper, we report on initial findings with both discovery and mitigation of performance disparities using data from a product-scale AI assistant speech recognition system. We compare cohort discovery based on geographic and demographic information to a more scalable method that groups speakers without human labels, using speaker embedding technology. For fairness mitigation, we find that oversampling of underrepresented cohorts, as well as modeling speaker cohort membership by additional input variables, reduces the gap between top- and bottom-performing cohorts, without deteriorating overall recognition accuracy. △ Less

Submitted 22 July, 2022; originally announced July 2022.

Comments: Proc. Interspeech 2022

Journal ref: Proc. Interspeech, Sept. 2022, pp. 1268-1272

arXiv:2108.05891 [pdf, other]

Page-level Optimization of e-Commerce Item Recommendations

Authors: Chieh Lo, Hongliang Yu, Xin Yin, Krutika Shetty, Changchen He, Kathy Hu, Justin Platz, Adam Ilardi, Sriganesh Madhvanath

Abstract: The item details page (IDP) is a web page on an e-commerce website that provides information on a specific product or item listing. Just below the details of the item on this page, the buyer can usually find recommendations for other relevant items. These are typically in the form of a series of modules or carousels, with each module containing a set of recommended items. The selection and orderin… ▽ More The item details page (IDP) is a web page on an e-commerce website that provides information on a specific product or item listing. Just below the details of the item on this page, the buyer can usually find recommendations for other relevant items. These are typically in the form of a series of modules or carousels, with each module containing a set of recommended items. The selection and ordering of these item recommendation modules are intended to increase discover-ability of relevant items and encourage greater user engagement, while simultaneously showcasing diversity of inventory and satisfying other business objectives. Item recommendation modules on the IDP are often curated and statically configured for all customers, ignoring opportunities for personalization. In this paper, we present a scalable end-to-end production system to optimize the personalized selection and ordering of item recommendation modules on the IDP in real-time by utilizing deep neural networks. Through extensive offline experimentation and online A/B testing, we show that our proposed system achieves significantly higher click-through and conversion rates compared to other existing methods. In our online A/B test, our framework improved click-through rate by 2.48% and purchase-through rate by 7.34% over a static configuration. △ Less

Submitted 12 August, 2021; originally announced August 2021.

Comments: Accepted by RecSys 2021

arXiv:2108.00963 [pdf, other]

Scope-Bounded Reachability in Valence Systems

Authors: Aneesh K. Shetty, S. Krishna, Georg Zetzsche

Abstract: Multi-pushdown systems are a standard model for concurrent recursive programs, but they have an undecidable reachability problem. Therefore, there have been several proposals to underapproximate their sets of runs so that reachability in this underapproximation becomes decidable. One such underapproximation that covers a relatively high portion of runs is scope boundedness. In such a run, after ea… ▽ More Multi-pushdown systems are a standard model for concurrent recursive programs, but they have an undecidable reachability problem. Therefore, there have been several proposals to underapproximate their sets of runs so that reachability in this underapproximation becomes decidable. One such underapproximation that covers a relatively high portion of runs is scope boundedness. In such a run, after each push to stack $i$, the corresponding pop operation must come within a bounded number of visits to stack $i$. In this work, we generalize this approach to a large class of infinite-state systems. For this, we consider the model of valence systems, which consist of a finite-state control and an infinite-state storage mechanism that is specified by a finite undirected graph. This framework captures pushdowns, vector addition systems, integer vector addition systems, and combinations thereof. For this framework, we propose a notion of scope boundedness that coincides with the classical notion when the storage mechanism happens to be a multi-pushdown. We show that with this notion, reachability can be decided in PSPACE for every storage mechanism in the framework. Moreover, we describe the full complexity landscape of this problem across all storage mechanisms, both in the case of (i) the scope bound being given as input and (ii) for fixed scope bounds. Finally, we provide an almost complete description of the complexity landscape if even a description of the storage mechanism is part of the input. △ Less

Submitted 2 August, 2021; originally announced August 2021.

Comments: To appear in proceedings of CONCUR 2021

arXiv:2107.10004 [pdf, other]

doi 10.1007/978-3-030-87202-1_37

Deep Iterative 2D/3D Registration

Authors: Srikrishna Jaganathan, Jian Wang, Anja Borsdorf, Karthik Shetty, Andreas Maier

Abstract: Deep Learning-based 2D/3D registration methods are highly robust but often lack the necessary registration accuracy for clinical application. A refinement step using the classical optimization-based 2D/3D registration method applied in combination with Deep Learning-based techniques can provide the required accuracy. However, it also increases the runtime. In this work, we propose a novel Deep Lea… ▽ More Deep Learning-based 2D/3D registration methods are highly robust but often lack the necessary registration accuracy for clinical application. A refinement step using the classical optimization-based 2D/3D registration method applied in combination with Deep Learning-based techniques can provide the required accuracy. However, it also increases the runtime. In this work, we propose a novel Deep Learning driven 2D/3D registration framework that can be used end-to-end for iterative registration tasks without relying on any further refinement step. We accomplish this by learning the update step of the 2D/3D registration framework using Point-to-Plane Correspondences. The update step is learned using iterative residual refinement-based optical flow estimation, in combination with the Point-to-Plane correspondence solver embedded as a known operator. Our proposed method achieves an average runtime of around 8s, a mean re-projection distance error of 0.60 $\pm$ 0.40 mm with a success ratio of 97 percent and a capture range of 60 mm. The combination of high registration accuracy, high robustness, and fast runtime makes our solution ideal for clinical applications. △ Less

Submitted 21 July, 2021; originally announced July 2021.

Comments: 10 pages,2 figures, Accepted at MICCAI 2021

arXiv:2104.11324 [pdf, other]

doi 10.1145/3492321.3519553

Isolating Functions at the Hardware Limit with Virtines

Authors: Nicholas C. Wanninger, Joshua J. Bowden, Kirtankumar Shetty, Ayush Garg, Kyle C. Hale

Abstract: An important class of applications, including programs that leverage third-party libraries, programs that use user-defined functions in databases, and serverless applications, benefit from isolating the execution of untrusted code at the granularity of individual functions or function invocations. However, existing isolation mechanisms were not designed for this use case; rather, they have been ad… ▽ More An important class of applications, including programs that leverage third-party libraries, programs that use user-defined functions in databases, and serverless applications, benefit from isolating the execution of untrusted code at the granularity of individual functions or function invocations. However, existing isolation mechanisms were not designed for this use case; rather, they have been adapted to it. We introduce \textit{virtines}, a new abstraction designed specifically for function granularity isolation, and describe how we build virtines from the ground up by pushing hardware virtualization to its limits. Virtines give developers fine-grained control in deciding which functions should run in isolated environments, and which should not. The virtine abstraction is a general one, and we demonstrate a prototype that adds extensions to the C language. We present a detailed analysis of the overheads of running individual functions in isolated VMs, and guided by those findings, we present Wasp, an embeddable hypervisor that allows programmers to easily use virtines. We describe several representative scenarios that employ individual function isolation, and demonstrate that virtines can be applied in these scenarios with only a few lines of changes to existing codebases and with acceptable slowdowns. △ Less

Submitted 16 March, 2022; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: In Proceedings of the 17th European Conference on Computer Systems (EuroSys '22), April 5-8, 2022, Rennes, France

ACM Class: D.4.6

arXiv:2102.02912 [pdf, other]

Deep Learning compatible Differentiable X-ray Projections for Inverse Rendering

Authors: Karthik Shetty, Annette Birkhold, Norbert Strobel, Bernhard Egger, Srikrishna Jaganathan, Markus Kowarschik, Andreas Maier

Abstract: Many minimally invasive interventional procedures still rely on 2D fluoroscopic imaging. Generating a patient-specific 3D model from these X-ray projection data would allow to improve the procedural workflow, e.g. by providing assistance functions such as automatic positioning. To accomplish this, two things are required. First, a statistical human shape model of the human anatomy and second, a di… ▽ More Many minimally invasive interventional procedures still rely on 2D fluoroscopic imaging. Generating a patient-specific 3D model from these X-ray projection data would allow to improve the procedural workflow, e.g. by providing assistance functions such as automatic positioning. To accomplish this, two things are required. First, a statistical human shape model of the human anatomy and second, a differentiable X-ray renderer. In this work, we propose a differentiable renderer by deriving the distance travelled by a ray inside mesh structures to generate a distance map. To demonstrate its functioning, we use it for simulating X-ray images from human shape models. Then we show its application by solving the inverse problem, namely reconstructing 3D models from real 2D fluoroscopy images of the pelvis, which is an ideal anatomical structure for patient registration. This is accomplished by an iterative optimization strategy using gradient descent. With the majority of the pelvis being in the fluoroscopic field of view, we achieve a mean Hausdorff distance of 30 mm between the reconstructed model and the ground truth segmentation. △ Less

Submitted 4 February, 2021; originally announced February 2021.

Comments: 7 pages, 3 figures, Accepted for Bildverarbeitung für die Medizin 2021

arXiv:1208.4321 [pdf, other]

Formal Verification of Safety Properties for Ownership Authentication Transfer Protocol

Authors: Swaraj Bhat, Pradeep B. H, Keerthi S. Shetty, Sanjay Singh

Abstract: In ubiquitous computing devices, users tend to store some valuable information in their device. Even though the device can be borrowed by the other user temporarily, it is not safe for any user to borrow or lend the device as it may cause private data of the user to be public. To safeguard the user data and also to preserve user privacy we propose and model the technique of ownership authenticatio… ▽ More In ubiquitous computing devices, users tend to store some valuable information in their device. Even though the device can be borrowed by the other user temporarily, it is not safe for any user to borrow or lend the device as it may cause private data of the user to be public. To safeguard the user data and also to preserve user privacy we propose and model the technique of ownership authentication transfer. The user who is willing to sell the device has to transfer the ownership of the device under sale. Once the device is sold and the ownership has been transferred, the old owner will not be able to use that device at any cost. Either of the users will not be able to use the device if the process of ownership has not been carried out properly. This also takes care of the scenario when the device has been stolen or lost, avoiding the impersonation attack. The aim of this paper is to model basic process of proposed ownership authentication transfer protocol and check its safety properties by representing it using CSP and model checking approach. For model checking we have used a symbolic model checker tool called NuSMV. The safety properties of ownership transfer protocol has been modeled in terms of CTL specification and it is observed that the system satisfies all the protocol constraint and is safe to be deployed. △ Less

Submitted 21 August, 2012; originally announced August 2012.

Comments: 16 pages, 7 figures,Submitted to ADCOM 2012

arXiv:1111.1894 [pdf]

doi 10.5121/iju.2011.2404

Cloud Based Application Development for Accessing Restaurant Information on Mobile Device using LBS

Authors: Keerthi S. Shetty, Sanjay Singh

Abstract: Over the past couple of years, the extent of the services provided on the mobile devices has increased rapidly. A special class of service among them is the Location Based Service(LBS) which depends on the geographical position of the user to provide services to the end users. However, a mobile device is still resource constrained, and some applications usually demand more resources than a mobile… ▽ More Over the past couple of years, the extent of the services provided on the mobile devices has increased rapidly. A special class of service among them is the Location Based Service(LBS) which depends on the geographical position of the user to provide services to the end users. However, a mobile device is still resource constrained, and some applications usually demand more resources than a mobile device can a ord. To alleviate this, a mobile device should get resources from an external source. One of such sources is cloud computing platforms. We can predict that the mobile area will take on a boom with the advent of this new concept. The aim of this paper is to exchange messages between user and location service provider in mobile device accessing the cloud by minimizing cost, data storage and processing power. Our main goal is to provide dynamic location-based service and increase the information retrieve accuracy especially on the limited mobile screen by accessing cloud application. In this paper we present location based restaurant information retrieval system and we have developed our application in Android. △ Less

Submitted 8 November, 2011; originally announced November 2011.

Comments: 11 pages, 10 figures

Journal ref: International Journal of UbiComp (IJU), vol.2, no.4, 2011, pp.37-49

Showing 1–12 of 12 results for author: Shetty, K