-
Engineering an Efficient Approximate DNF-Counter
Authors:
Mate Soos,
Uddalok Sarkar,
Divesh Aggarwal,
Sourav Chakraborty,
Kuldeep S. Meel,
Maciej Obremski
Abstract:
Model counting is a fundamental problem in many practical applications, including query evaluation in probabilistic databases and failure-probability estimation of networks. In this work, we focus on a variant of this problem where the underlying formula is expressed in the Disjunctive Normal Form (DNF), also known as #DNF. This problem has been shown to be #P-complete, making it often intractable to solve exactly. Much research has therefore focused on obtaining approximate solutions, particularly in the form of $(\varepsilon, δ)$ approximations.
The primary contribution of this paper is a new approach, called pepin, an approximate #DNF counter that significantly outperforms prior state-of-the-art approaches. Our work is based on the recent breakthrough in the context of the union of sets in the streaming model. We demonstrate the effectiveness of our approach through extensive experiments and show that it provides an affirmative answer to the challenge of efficiently computing #DNF.
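To make the counting task concrete, the sketch below pairs a brute-force #DNF counter with the classical Karp-Luby Monte Carlo estimator. It is not the pepin algorithm described in the abstract. The function names, the signed-integer clause encoding (literal `3` means variable 3 is true, `-3` means it is false), and the sample count are assumptions chosen for illustration, and each clause is assumed to mention a variable at most once.

```python
import random
from itertools import product


def satisfies(bits, clause):
    """True if the assignment `bits` makes every literal of the clause true."""
    return all(bits[abs(lit) - 1] == (lit > 0) for lit in clause)


def exact_dnf_count(clauses, n):
    """Brute-force #DNF: count assignments over n variables satisfying any clause."""
    return sum(
        1
        for bits in product([False, True], repeat=n)
        if any(satisfies(bits, c) for c in clauses)
    )


def karp_luby_estimate(clauses, n, samples, rng):
    """Classical Karp-Luby estimator (illustrative, not the paper's method)."""
    # A clause with k distinct literals is satisfied by 2^(n - k) assignments.
    sizes = [2 ** (n - len(c)) for c in clauses]
    total = sum(sizes)
    hits = 0
    for _ in range(samples):
        # Pick a clause with probability proportional to its solution count.
        i = rng.choices(range(len(clauses)), weights=sizes)[0]
        # Draw a uniform assignment, then force the chosen clause's literals.
        bits = [rng.random() < 0.5 for _ in range(n)]
        for lit in clauses[i]:
            bits[abs(lit) - 1] = lit > 0
        # Count the sample only if i is the first clause this assignment satisfies,
        # so every satisfying assignment is counted exactly once in expectation.
        first = next(j for j, c in enumerate(clauses) if satisfies(bits, c))
        if first == i:
            hits += 1
    return total * hits / samples


if __name__ == "__main__":
    formula = [[1, 2], [-1, 3], [2, 3, 4]]
    print(exact_dnf_count(formula, 6))
    print(karp_luby_estimate(formula, 6, 20000, random.Random(0)))
```

The exact counter enumerates all 2^n assignments and is only usable for tiny formulas, which is the gap that (ε, δ) approximation schemes are meant to close.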
Submitted 29 July, 2024;
originally announced July 2024.
-
"I understand why I got this grade": Automatic Short Answer Grading with Feedback
Authors:
Dishank Aggarwal,
Pushpak Bhattacharyya,
Bhaskaran Raman
Abstract:
The demand for efficient and accurate assessment methods has intensified as education systems transition to digital platforms. Providing feedback is essential in educational settings and goes beyond simply conveying marks as it justifies the assigned marks. In this context, we present a significant advancement in automated grading by introducing Engineering Short Answer Feedback (EngSAF) -- a dataset of 5.8k student answers accompanied by reference answers and questions for the Automatic Short Answer Grading (ASAG) task. The EngSAF dataset is meticulously curated to cover a diverse range of subjects, questions, and answer patterns from multiple engineering domains. We leverage state-of-the-art large language models' (LLMs) generative capabilities with our Label-Aware Synthetic Feedback Generation (LASFG) strategy to include feedback in our dataset. This paper underscores the importance of enhanced feedback in practical educational settings, outlines dataset annotation and feedback generation processes, conducts a thorough EngSAF analysis, and provides different LLMs-based zero-shot and finetuned baselines for future comparison. Additionally, we demonstrate the efficiency and effectiveness of the ASAG system through its deployment in a real-world end-semester exam at the Indian Institute of Technology Bombay (IITB), showcasing its practical viability and potential for broader implementation in educational institutions.
Submitted 30 June, 2024;
originally announced July 2024.
-
Polynomial Time Algorithms for Integer Programming and Unbounded Subset Sum in the Total Regime
Authors:
Divesh Aggarwal,
Antoine Joux,
Miklos Santha,
Karol Węgrzycki
Abstract:
The Unbounded Subset Sum (USS) problem is an NP-hard computational problem where the goal is to decide whether there exist non-negative integers $x_1, \ldots, x_n$ such that $x_1 a_1 + \ldots + x_n a_n = b$, where $a_1 < \cdots < a_n < b$ are distinct positive integers with $\text{gcd}(a_1, \ldots, a_n)$ dividing $b$. The problem can be solved in pseudopolynomial time, while specialized cases, such as when $b$ exceeds the Frobenius number of $a_1, \ldots, a_n$, simplify to a total problem where a solution always exists.
This paper explores the concept of totality in USS. The challenge in this setting is to actually find a solution, even though we know its existence is guaranteed. We focus on the instances of USS where solutions are guaranteed for large $b$. We show that when $b$ is slightly greater than the Frobenius number, we can find the solution to USS in polynomial time.
We then show how our results extend to Integer Programming with Equalities (ILPE), highlighting conditions under which ILPE becomes total. We investigate the diagonal Frobenius number, which is the appropriate generalization of the Frobenius number to this context. In this setting, we give a polynomial-time algorithm to find a solution of ILPE. The bound obtained from our algorithmic procedure for finding a solution almost matches the recent existential bound of Bach, Eisenbrand, Rothvoss, and Weismantel (2024).
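As a point of reference for the pseudopolynomial baseline mentioned above, here is a minimal dynamic-programming sketch for USS that runs in O(n·b) time. It is the textbook approach, not the polynomial-time algorithm of the paper, and the function name `unbounded_subset_sum` is an assumption made for illustration.

```python
def unbounded_subset_sum(a, b):
    """Return non-negative integers x with sum(x[i] * a[i]) == b, or None.

    Textbook O(len(a) * b) dynamic program; `a` holds positive integers.
    """
    reachable = [False] * (b + 1)
    # last_item[t] = index of one item that can complete the total t.
    last_item = [-1] * (b + 1)
    reachable[0] = True
    for t in range(1, b + 1):
        for i, ai in enumerate(a):
            if ai <= t and reachable[t - ai]:
                reachable[t] = True
                last_item[t] = i
                break
    if not reachable[b]:
        return None
    # Walk back from b to 0, counting how often each item is used.
    x = [0] * len(a)
    t = b
    while t > 0:
        i = last_item[t]
        x[i] += 1
        t -= a[i]
    return x


if __name__ == "__main__":
    # For a = [3, 5] the Frobenius number is 3*5 - 3 - 5 = 7.
    print(unbounded_subset_sum([3, 5], 7))
    print(unbounded_subset_sum([3, 5], 8))
```

With `a = [3, 5]`, the target 7 (the Frobenius number) has no solution, while every larger target does, which is the "total" regime the abstract refers to.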
Submitted 11 July, 2024; v1 submitted 7 July, 2024;
originally announced July 2024.
-
Improving Self Consistency in LLMs through Probabilistic Tokenization
Authors:
Ashutosh Sathe,
Divyanshu Aggarwal,
Sunayana Sitaram
Abstract:
Prior research has demonstrated noticeable performance gains through the use of probabilistic tokenizations, an approach that involves employing multiple tokenizations of the same input string during the training phase of a language model. Despite these promising findings, modern large language models (LLMs) have yet to be trained using probabilistic tokenizations. Interestingly, while the tokenizers of these contemporary LLMs have the capability to generate multiple tokenizations, this property remains underutilized.
In this work, we propose a novel method to leverage the multiple tokenization capabilities of modern LLM tokenizers, aiming to enhance the self-consistency of LLMs in reasoning tasks. Our experiments indicate that when utilizing probabilistic tokenizations, LLMs generate logically diverse reasoning paths, moving beyond mere surface-level linguistic diversity. We carefully study probabilistic tokenization and offer insights to explain the self-consistency improvements it brings through extensive experimentation on 5 LLM families and 4 reasoning benchmarks.
Submitted 4 July, 2024;
originally announced July 2024.
-
Sketch-Plan-Generalize: Continual Few-Shot Learning of Inductively Generalizable Spatial Concepts
Authors:
Namasivayam Kalithasan,
Sachit Sachdeva,
Himanshu Gaurav Singh,
Vishal Bindal,
Arnav Tuli,
Gurarmaan Singh Panjeta,
Divyanshu Aggarwal,
Rohan Paul,
Parag Singla
Abstract:
Our goal is to enable embodied agents to learn inductively generalizable spatial concepts, e.g., learning staircase as an inductive composition of towers of increasing height. Given a human demonstration, we seek a learning architecture that infers a succinct program representation that explains the observed instance. Additionally, the approach should generalize inductively to novel structures of different sizes or complex structures expressed as a hierarchical composition of previously learned concepts. Existing approaches that use code generation capabilities of pre-trained large (visual) language models, as well as purely neural models, show poor generalization to a-priori unseen complex concepts. Our key insight is to factor inductive concept learning as (i) Sketch: detecting and inferring a coarse signature of a new concept, (ii) Plan: performing MCTS search over grounded action sequences, and (iii) Generalize: abstracting out grounded plans as inductive programs. Our pipeline facilitates generalization and modular reuse, enabling continual concept learning. Our approach combines the benefits of the code generation ability of large language models (LLMs) along with grounded neural representations, resulting in neuro-symbolic programs that show stronger inductive generalization on the task of constructing complex structures in relation to LLM-only and neural-only approaches. Furthermore, we demonstrate reasoning and planning capabilities with learned concepts for embodied instruction following.
Submitted 29 May, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
Self-evolving Autoencoder Embedded Q-Network
Authors:
J. Senthilnath,
Bangjian Zhou,
Zhen Wei Ng,
Deeksha Aggarwal,
Rajdeep Dutta,
Ji Wei Yoon,
Aye Phyu Phyu Aung,
Keyu Wu,
Min Wu,
Xiaoli Li
Abstract:
In the realm of sequential decision-making tasks, the exploration capability of a reinforcement learning (RL) agent is paramount for achieving high rewards through interactions with the environment. To enhance this crucial ability, we propose SAQN, a novel approach wherein a self-evolving autoencoder (SA) is embedded with a Q-Network (QN). In SAQN, the self-evolving autoencoder architecture adapts and evolves as the agent explores the environment. This evolution enables the autoencoder to capture a diverse range of raw observations and represent them effectively in its latent space. By leveraging the disentangled states extracted from the encoder generated latent space, the QN is trained to determine optimal actions that improve rewards. During the evolution of the autoencoder architecture, a bias-variance regulatory strategy is employed to elicit the optimal response from the RL agent. This strategy involves two key components: (i) fostering the growth of nodes to retain previously acquired knowledge, ensuring a rich representation of the environment, and (ii) pruning the least contributing nodes to maintain a more manageable and tractable latent space. Extensive experimental evaluations conducted on three distinct benchmark environments and a real-world molecular environment demonstrate that the proposed SAQN significantly outperforms state-of-the-art counterparts. The results highlight the effectiveness of the self-evolving autoencoder and its collaboration with the Q-Network in tackling sequential decision-making tasks.
Submitted 18 February, 2024;
originally announced February 2024.
-
Self-Correcting Self-Consuming Loops for Generative Model Training
Authors:
Nate Gillman,
Michael Freeman,
Daksh Aggarwal,
Chia-Hong Hsu,
Calvin Luo,
Yonglong Tian,
Chen Sun
Abstract:
As synthetic data becomes higher quality and proliferates on the internet, machine learning models are increasingly trained on a mix of human- and machine-generated data. Despite the successful stories of using synthetic data for representation learning, using synthetic data for generative model training creates "self-consuming loops" which may lead to training instability or even collapse, unless certain conditions are met. Our paper aims to stabilize self-consuming generative model training. Our theoretical results demonstrate that by introducing an idealized correction function, which maps a data point to be more likely under the true data distribution, self-consuming loops can be made exponentially more stable. We then propose self-correction functions, which rely on expert knowledge (e.g. the laws of physics programmed in a simulator), and aim to approximate the idealized corrector automatically and at scale. We empirically validate the effectiveness of self-correcting self-consuming loops on the challenging human motion synthesis task, and observe that it successfully avoids model collapse, even when the ratio of synthetic data to real data is as high as 100%.
Submitted 10 June, 2024; v1 submitted 10 February, 2024;
originally announced February 2024.
-
MAPLE: Multilingual Evaluation of Parameter Efficient Finetuning of Large Language Models
Authors:
Divyanshu Aggarwal,
Ashutosh Sathe,
Ishaan Watts,
Sunayana Sitaram
Abstract:
Parameter Efficient Finetuning (PEFT) has emerged as a viable solution for improving the performance of Large Language Models (LLMs) without requiring massive resources and compute. Prior work on multilingual evaluation has shown that there is a large gap between the performance of LLMs on English and other languages. Further, there is also a large gap between the performance of smaller open-source models and larger LLMs. Finetuning can be an effective way to bridge this gap and make language models more equitable. In this work, we finetune the Llama-2-7B and Mistral-7B models on two synthetic multilingual instruction tuning datasets to determine the effect of finetuning on model performance on six downstream tasks covering forty languages in all. Additionally, we experiment with various parameters, such as rank for low-rank adaptation and values of quantisation, to determine their effects on downstream performance, and find that higher rank and higher quantisation values benefit low-resource languages. We find that PEFT of smaller open-source models sometimes bridges the gap between the performance of these models and the larger ones; however, English performance can take a hit. We also find that finetuning sometimes improves performance on low-resource languages, while degrading performance on high-resource languages.
Submitted 22 July, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
Recursive lattice reduction -- A framework for finding short lattice vectors
Authors:
Divesh Aggarwal,
Thomas Espitau,
Spencer Peters,
Noah Stephens-Davidowitz
Abstract:
We propose a new framework called recursive lattice reduction for finding short non-zero vectors in a lattice or for finding dense sublattices of a lattice. At a high level, the framework works by recursively searching for dense sublattices of dense sublattices (or their duals). Eventually, the procedure encounters a recursive call on a lattice $\mathcal{L}$ with relatively low rank $k$, at which point we simply use a known algorithm to find a short non-zero vector in $\mathcal{L}$. We view our framework as complementary to basis reduction algorithms, which similarly work to reduce an $n$-dimensional lattice problem with some approximation factor $γ$ to an exact lattice problem in dimension $k < n$, with a tradeoff between $γ$, $n$, and $k$. Our framework provides an alternative and arguably simpler perspective, which in particular can be described without explicitly referencing any specific basis of the lattice, Gram-Schmidt vectors, or even projection (though implementations of algorithms in this framework will likely make use of such things). We present a number of specific instantiations of our framework. Our main concrete result is a reduction that matches the tradeoff between $γ$, $n$, and $k$ achieved by the best-known basis reduction algorithms (in terms of the Hermite factor, up to low-order terms) across all parameter regimes. In fact, this reduction also can be used to find dense sublattices with any rank $\ell$ satisfying $\min\{\ell,n-\ell\} \leq n-k+1$, using only an oracle for SVP (or even just Hermite SVP) in $k$ dimensions, which is itself a novel result (as far as the authors know). We also show a very simple reduction that achieves the same tradeoff in quasipolynomial time. Finally, we present an automated approach for searching for algorithms in this framework that (provably) achieve better approximations with fewer oracle calls.
Submitted 25 November, 2023;
originally announced November 2023.
-
MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks
Authors:
Sanchit Ahuja,
Divyanshu Aggarwal,
Varun Gumma,
Ishaan Watts,
Ashutosh Sathe,
Millicent Ochieng,
Rishav Hada,
Prachi Jain,
Maxamed Axmed,
Kalika Bali,
Sunayana Sitaram
Abstract:
There has been a surge in LLM evaluation research to understand LLM capabilities and limitations. However, much of this research has been confined to English, leaving LLM building and evaluation for non-English languages relatively unexplored. Several new LLMs have been introduced recently, necessitating their evaluation on non-English languages. This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3.5-Turbo, GPT-4, PaLM2, Gemini-Pro, Mistral, Llama2, and Gemma) by comparing them on the same set of multilingual datasets. Our benchmark comprises 22 datasets covering 83 languages, including low-resource African languages. We also include two multimodal datasets in the benchmark and compare the performance of LLaVA models, GPT-4-Vision and Gemini-Pro-Vision. Our experiments show that larger models such as GPT-4, Gemini-Pro and PaLM2 outperform smaller models on various tasks, notably on low-resource languages, with GPT-4 outperforming PaLM2 and Gemini-Pro on more datasets. We also perform a study on data contamination and find that several models are likely to be contaminated with multilingual evaluation benchmarks, necessitating approaches to detect and handle contamination while assessing the multilingual performance of LLMs.
Submitted 2 April, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Photon absorption in twisted bilayer graphene
Authors:
Disha Arora,
Deepanshu Aggarwal,
Sankalpa Ghosh,
Rohit Narula
Abstract:
We investigate one- and two-photon absorption in twisted bilayer graphene (TBLG) by examining the effects of tuning the twist angle $θ$ and the excitation energy $E_l$ on their corresponding absorption coefficients $α_{i=1,2}$. We find that $α_1$ shows distinct peaks as a function of $E_l$ which correspond to the van Hove singularities (vHS) of TBLG. Compared with single-layer (SLG) and AB bilayer graphene (BLG), $α_1$ is substantially enhanced by $\sim 2$ and $\sim 1$ orders of magnitude, respectively, in the visible range. On the other hand, $α_2$ exhibits a remarkable increase of $\sim 11$ and $\sim 9$ orders of magnitude. Interestingly, as $θ$ increases, the resonant features exhibited by $α_{i=1,2}$ vs. $E_l$ shift progressively from the infrared to the visible. On doping TBLG, both $α_1$ and $α_2$ remain essentially unchanged vs. $E_l$ but with a minor red-shift in their resonant peaks. Additionally, we explore various polarization configurations for two-photon absorption (TPA) and determine the conditions under which $α_2$ becomes extremal.
Submitted 13 May, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Machine learning Sasakian and $G_2$ topology on contact Calabi-Yau $7$-manifolds
Authors:
Daattavya Aggarwal,
Yang-Hui He,
Elli Heyes,
Edward Hirst,
Henrique N. Sá Earp,
Tomás S. R. Silva
Abstract:
We propose a machine learning approach to study topological quantities related to the Sasakian and $G_2$-geometries of contact Calabi-Yau $7$-manifolds. Specifically, we compute datasets for certain Sasakian Hodge numbers and for the Crowley-Nordström invariant of the natural $G_2$-structure of the $7$-dimensional link of a weighted projective Calabi-Yau $3$-fold hypersurface singularity, for 7549 of the 7555 possible $\mathbb{P}^4(\textbf{w})$ projective spaces. These topological quantities are then machine learnt with high performance scores. In particular, the Sasakian Hodge numbers are learnt from the $\mathbb{P}^4(\textbf{w})$ weights alone using both neural networks and a symbolic regressor, which achieve $R^2$ scores of 0.969 and 0.993, respectively. Additionally, properties of the respective Gröbner bases are well-learnt, leading to a vast improvement in computation speeds which may be of independent interest. The data generation and analysis further led to novel conjectures.
Submitted 23 February, 2024; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Moiré fractals in twisted graphene layers
Authors:
Deepanshu Aggarwal,
Rohit Narula,
Sankalpa Ghosh
Abstract:
Twisted bilayer graphene (TBLG) subject to a sequence of commensurate external periodic potentials reveals the formation of moiré fractals (MF) that share striking similarities with the central place theory (CPT) of economic geography, thus uncovering a remarkable connection between twistronics and the geometry of economic zones. MFs arise from the self-similarity of the emergent hierarchy of Brillouin zones (BZ), forming a nested subband structure within the bandwidth of the original moiré bands. We derive the fractal generators (FG) for TBLG under these external potentials and explore their impact on the hierarchy of the BZ edges and the wavefunctions at the Dirac point. By examining realistic super-moiré structures (SMS) and demonstrating their equivalence to MFs with periodic perturbations under specific conditions, we establish MFs as a general description for such systems. Furthermore, we uncover parallels between the modification of the BZ hierarchy and magnetic BZ formation in Hofstadter's butterfly (HB), allowing us to construct an incommensurability measure for MFs vs. twist angle. The resulting bandstructure hierarchy bolsters correlation effects, pushing more bands within the same energy window for both commensurate and incommensurate TBLG.
Submitted 16 February, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Enumeration of splitting subsets of endofunctions on finite sets
Authors:
Divya Aggarwal
Abstract:
Let $d$ and $n$ be positive integers such that $d|n$. Let $[n]=\{1,2,\ldots,n\}$ and $T$ be an endofunction on $[n]$. A subset $W$ of $[n]$ of cardinality $n/d$ is said to be $d$-splitting if $W \cup TW \cup \cdots \cup T^{d-1}W =[n]$. Let $σ(d;T)$ denote the number of $d$-splitting subsets. If $σ(2;T)>0$, then we show that $σ(2;T)=g_T(-1)$, where $g_T(t)$ is the generating function for the number of $T$-invariant subsets of $[n]$. It is interesting to note that substituting a root of unity into a polynomial with integer coefficients has an enumerative meaning. More generally, let $g_T(t_1,\ldots,t_d)$ be the generating function for the number of $d$-flags of $T$-invariant subsets. We prove for certain endofunctions $T$, if $σ(d;T)>0$, then $σ(d;T)=g_T(ζ,ζ^2,\ldots,ζ^d)$, where $ζ$ is a primitive $d^{th}$ root of unity.
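The definition of a $d$-splitting subset can be checked by brute force on small examples. The sketch below counts $σ(d;T)$ directly and, for comparison, evaluates a generating function of $T$-invariant subsets. The function names are hypothetical, and the code assumes that "$T$-invariant" means $T(S) \subseteq S$ and that $g_T(t)$ weights each invariant subset $S$ by $t^{|S|}$; the abstract does not spell out either convention.

```python
from itertools import combinations


def count_splitting_subsets(T, d):
    """Count subsets W of size n/d with W ∪ T(W) ∪ ... ∪ T^(d-1)(W) = [n].

    T is a dict mapping each element of {1, ..., n} to its image.
    """
    n = len(T)
    if n % d != 0:
        raise ValueError("d must divide n")
    full = set(T)
    count = 0
    for W in combinations(sorted(T), n // d):
        covered = set(W)
        image = set(W)
        for _ in range(d - 1):
            image = {T[x] for x in image}  # apply T once more
            covered |= image
        if covered == full:
            count += 1
    return count


def invariant_gf_at(T, t):
    """Sum of t^|S| over subsets S with T(S) inside S (assumed reading of g_T)."""
    elems = sorted(T)
    total = 0
    for k in range(len(elems) + 1):
        for S in combinations(elems, k):
            s = set(S)
            if all(T[x] in s for x in s):
                total += t ** k
    return total


if __name__ == "__main__":
    cycle = {1: 2, 2: 3, 3: 4, 4: 1}  # a 4-cycle on [4]
    print(count_splitting_subsets(cycle, 2), invariant_gf_at(cycle, -1))
```

For the 4-cycle, the only 2-splitting subsets are {1, 3} and {2, 4}, and the only invariant subsets are the empty set and [4], so both quantities equal 2 under the stated assumptions.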
Submitted 7 June, 2023;
originally announced June 2023.
-
Evaluating Inter-Bilingual Semantic Parsing for Indian Languages
Authors:
Divyanshu Aggarwal,
Vivek Gupta,
Anoop Kunchukuttan
Abstract:
Despite significant progress in Natural Language Generation for Indian languages (IndicNLP), there is a lack of datasets around complex structured tasks such as semantic parsing. One reason for this imminent gap is the complexity of the logical form, which makes English to multilingual translation difficult. The process involves alignment of logical forms, intents and slots with translated unstructured utterance. To address this, we propose an Inter-bilingual Seq2seq Semantic parsing dataset IE-SEMPARSE for 11 distinct Indian languages. We highlight the proposed task's practicality, and evaluate existing multilingual seq2seq models across several train-test strategies. Our experiment reveals a high correlation across performance of original multilingual semantic parsing datasets (such as mTOP, multilingual TOP and multiATIS++) and our proposed IE-SEMPARSE suite.
Submitted 5 June, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
A Selberg Trace Formula for $\text{GL}_{3}(\mathbb{F}_p)\backslash \text{GL}_{3}(\mathbb{F}_q)/K$
Authors:
Daksh Aggarwal,
Asghar Ghorbanpour,
Masoud Khalkhali,
Jiyuan Lu,
Balázs Németh,
C Shijia Yu
Abstract:
In this paper, we prove a discrete analog of the Selberg Trace Formula for the group $\text{GL}_{3}(\mathbb{F}_q).$ By considering a cubic extension of the finite field $\mathbb{F}_q$, we define an analog of the upper half space and an action of $\text{GL}_{3}(\mathbb{F}_q)$ on it. To compute the orbital sums we explicitly identify the double coset spaces and fundamental domains in our upper half space. To understand the spectral side of the trace formula we decompose the induced representation $ρ= \text{Ind}_Γ^{G} 1$ for $G= \text{GL}_{3}(\mathbb{F}_q) $ and $ Γ= \text{GL}_{3}(\mathbb{F}_p).$
Submitted 5 January, 2023; v1 submitted 4 January, 2023;
originally announced January 2023.
-
Interference effects in polarization controlled Rayleigh scattering in twisted bilayer graphene
Authors:
Disha Arora,
Deepanshu Aggarwal,
Sankalpa Ghosh,
Rohit Narula
Abstract:
We calculate the polarization-controlled Rayleigh scattering response of twisted bilayer graphene (tBLG) based on the continuum electronic band model developed by Bistritzer and MacDonald while considering its refinements which address the effects of structural corrugation, doping-dependent Hartree interactions and particle-hole asymmetry. The dominant wave vectors for the Rayleigh scattering process emanate from various regions of the Moiré Brillouin zone (MBZ) in contrast to single-layer graphene (SLG) and AB-stacked bilayer graphene (AB-BLG), where the dominant contributions always stem from the vicinity of the $\mathbf{K}$ point for optical laser energies and below. Compared to SLG, the integrated Rayleigh intensity is strongly enhanced for small twist angles (e.g., at a twist angle $θ = 1.2^{\circ}$, the integrated Rayleigh intensity at laser energy $E_l = 2$ eV is enhanced by a factor of $\sim 100$ for the case of parallel polarization). For the case of cross-polarization, however, it exhibits a markedly complex behavior suggestive of strong interference effects mediated by the optical matrix elements. We find that at small twist angles, e.g., $θ = 1.05^{\circ}$, the corrugation effects strongly enhance the ratio $\mathbf{R}_A = \frac{\text{integrated Rayleigh intensity for parallel polarization}}{\text{integrated Rayleigh intensity for cross-polarization}}$ by $\sim 1300$ times vis-à-vis SLG or AB-BLG.
△ Less
Submitted 18 May, 2023; v1 submitted 30 December, 2022;
originally announced December 2022.
-
A primer on twistronics: A massless Dirac fermion's journey to moiré patterns and flat bands in twisted bilayer graphene
Authors:
Deepanshu Aggarwal,
Rohit Narula,
Sankalpa Ghosh
Abstract:
The recent discovery of superconductivity in magic-angle twisted bilayer graphene has sparked a renewed interest in the strongly-correlated physics of $sp^2$ carbons, in stark contrast to preliminary investigations which were dominated by the one-body physics of the massless Dirac fermions. We thus provide a self-contained, theoretical perspective of the journey of graphene from its single-particle physics-dominated regime to the strongly-correlated physics of the flat bands. Beginning from the origin of the Dirac points in condensed matter systems, we discuss the effect of the superlattice on the Fermi velocity and Van Hove singularities in graphene and how it leads naturally to investigations of the moiré pattern in van der Waals heterostructures exemplified by graphene-hexagonal boron-nitride and twisted bilayer graphene. Subsequently, we illuminate the origin of flat bands in twisted bilayer graphene at the magic angles by elaborating on a broad range of prominent theoretical works in a pedagogical way while linking them to available experimental support, where appropriate. We conclude by providing a list of topics in the study of the electronic properties of twisted bilayer graphene not covered by this review but may readily be approached with the help of this primer.
Submitted 8 February, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Lattice Problems Beyond Polynomial Time
Authors:
Divesh Aggarwal,
Huck Bennett,
Zvika Brakerski,
Alexander Golovnev,
Rajendra Kumar,
Zeyong Li,
Spencer Peters,
Noah Stephens-Davidowitz,
Vinod Vaikuntanathan
Abstract:
We study the complexity of lattice problems in a world where algorithms, reductions, and protocols can run in superpolynomial time, revisiting four foundational results: two worst-case to average-case reductions and two protocols. We also show a novel protocol.
1. We prove that secret-key cryptography exists if $\widetilde{O}(\sqrt{n})$-approximate SVP is hard for $2^{\varepsilon n}$-time algorithms. I.e., we extend to our setting (Micciancio and Regev's improved version of) Ajtai's celebrated polynomial-time worst-case to average-case reduction from $\widetilde{O}(n)$-approximate SVP to SIS.
2. We prove that public-key cryptography exists if $\widetilde{O}(n)$-approximate SVP is hard for $2^{\varepsilon n}$-time algorithms. This extends to our setting Regev's celebrated polynomial-time worst-case to average-case reduction from $\widetilde{O}(n^{1.5})$-approximate SVP to LWE. In fact, Regev's reduction is quantum, but ours is classical, generalizing Peikert's polynomial-time classical reduction from $\widetilde{O}(n^2)$-approximate SVP.
3. We show a $2^{\varepsilon n}$-time coAM protocol for $O(1)$-approximate CVP, generalizing the celebrated polynomial-time protocol for $O(\sqrt{n/\log n})$-CVP due to Goldreich and Goldwasser. These results show complexity-theoretic barriers to extending the recent line of fine-grained hardness results for CVP and SVP to larger approximation factors. (This result also extends to arbitrary norms.)
4. We show a $2^{\varepsilon n}$-time co-non-deterministic protocol for $O(\sqrt{\log n})$-approximate SVP, generalizing the (also celebrated!) polynomial-time protocol for $O(\sqrt{n})$-CVP due to Aharonov and Regev.
5. We give a novel coMA protocol for $O(1)$-approximate CVP with a $2^{\varepsilon n}$-time verifier.
All of the results described above are special cases of more general theorems that achieve time-approximation factor tradeoffs.
Submitted 21 November, 2022;
originally announced November 2022.
-
A Review of Deep Learning Techniques for Protein Function Prediction
Authors:
Divyanshu Aggarwal,
Yasha Hasija
Abstract:
Deep learning and big data have shown tremendous success in bioinformatics and computational biology in recent years; artificial intelligence methods have also contributed significantly to the task of protein function classification. This review paper analyzes recent developments in approaches for predicting protein function using deep learning. We explain the importance of determining protein function and why automating this task is crucial. Then, after reviewing the widely used deep learning techniques for this task, we highlight the emergence of modern state-of-the-art (SOTA) deep learning models that have achieved groundbreaking results in computer vision, natural language processing and multi-modal learning in the last few years. We hope that this review will provide a broad view of the current role and advances of deep learning in the biological sciences, especially in protein function prediction, and encourage new researchers to contribute to this area.
Submitted 27 October, 2022;
originally announced November 2022.
-
Why we couldn't prove SETH hardness of the Closest Vector Problem for even norms!
Authors:
Divesh Aggarwal,
Rajendra Kumar
Abstract:
Recent work [BGS17,ABGS19] has shown SETH hardness of CVP in the $\ell_p$ norm for any $p$ that is not an even integer. This result was shown by giving a Karp reduction from $k$-SAT on $n$ variables to CVP on a lattice of rank $n$. In this work, we show a barrier towards proving a similar result for CVP in the $\ell_p$ norm where $p$ is an even integer. We show that for any $c>0$, if for every $k > 0$, there exists an efficient reduction that maps a $k$-SAT instance on $n$ variables to a CVP instance for a lattice of rank at most $n^{c}$ in the Euclidean norm, then $\mathsf{coNP} \subset \mathsf{NP/Poly}$. We prove a similar result for CVP for all even norms under a mild additional promise that the ratio of the distance of the target from the lattice and the shortest non-zero vector in the lattice is bounded by $exp(n^{O(1)})$.
Furthermore, we show that for any $c> 0$, and any even integer $p$, if for every $k > 0$, there exists an efficient reduction that maps a $k$-SAT instance on $n$ variables to a $SVP_p$ instance for a lattice of rank at most $n^{c}$, then $\mathsf{coNP} \subset \mathsf{NP/Poly}$. The result for SVP does not require any additional promise.
While prior results have indicated that lattice problems in the $\ell_2$ norm (Euclidean norm) are easier than lattice problems in other norms, this is the first result that shows a separation between these problems.
We achieve this by using a result by Dell and van Melkebeek [JACM, 2014] on the impossibility of the existence of a reduction that compresses an arbitrary $k$-SAT instance into a string of length $\mathcal{O}(n^{k-ε})$ for any $ε>0$. In addition to CVP, we also show that the same result holds for the Subset-Sum problem using similar techniques.
Submitted 25 November, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
IndicXNLI: Evaluating Multilingual Inference for Indian Languages
Authors:
Divyanshu Aggarwal,
Vivek Gupta,
Anoop Kunchukuttan
Abstract:
While Indic NLP has made rapid advances recently in terms of the availability of corpora and pre-trained models, benchmark datasets on standard NLU tasks are limited. To this end, we introduce IndicXNLI, an NLI dataset for 11 Indic languages. It has been created by high-quality machine translation of the original English XNLI dataset and our analysis attests to the quality of IndicXNLI. By finetuning different pre-trained LMs on this IndicXNLI, we analyze various cross-lingual transfer techniques with respect to the impact of the choice of language models, languages, multi-linguality, mix-language input, etc. These experiments provide us with useful insights into the behaviour of pre-trained models for a diverse set of languages.
Submitted 19 April, 2022;
originally announced April 2022.
-
Nöther Currents, Black Hole Entropy Universality and CFT Duality in Conformal Weyl Gravity
Authors:
Daksh Aggarwal,
Dominic Chang,
Quentin Dancewicz Helmers,
Nesibe Sivrioglu,
L. R. Ram-Mohan,
Leo Rodriguez,
Shanshan Rodriguez,
Raid Suleiman
Abstract:
In this paper we study black hole entropy universality within the Conformal Weyl gravity paradigm. We do this by first computing the entropy of specific vacuum and non-vacuum solutions, previously unexplored in Conformal Weyl gravity via both the Nöther current method and Wald's entropy formula. For the vacuum case, we explore the near horizon near extremal Kerr metric, which is also a vacuum solution to Conformal Weyl gravity and not previously studied in this setting. For the non-vacuum case we couple the conformal Weyl gravity field equations to a near horizon (linear) $U(1)$ gauge potential and analyze the respective found solutions. We highlight the non-universality of black hole entropy between our studied black hole solutions of varying symmetries. However despite non-universality, the respective black hole entropies are in congruence with Wald's entropy formula for the specific gravity theory. Finally and despite non-universality, we comment on the construction of a near horizon CFT dual to one of our unique non-vacuum solutions. Due to the non-universality, we must introduce a parameter (similarly to entropy calculations in LQG) which we also call $γ$ and relating to the Weyl anomaly coefficient. The construction follows an $AdS_2/CFT_1$ correspondence in the near horizon, which enables the computation of the full asymptotic symmetry group of the chosen non-vacuum conformal Weyl black hole and its near horizon quantum CFT dual. We conclude with a discussion and outlook for future work.
Submitted 23 June, 2022; v1 submitted 4 April, 2022;
originally announced April 2022.
-
Quantum secure non-malleable codes in the split-state model
Authors:
Divesh Aggarwal,
Naresh Goud Boddu,
Rahul Jain
Abstract:
Non-malleable-codes introduced by Dziembowski, Pietrzak and Wichs [DPW18] encode a classical message $S$ in a manner such that tampering the codeword results in the decoder either outputting the original message $S$ or a message that is unrelated/independent of $S$. Providing such non-malleable security for various tampering function families has received significant attention in recent years. We consider the well-studied (2-part) split-state model, in which the message $S$ is encoded into two parts $X$ and $Y$, and the adversary is allowed to arbitrarily tamper with each $X$ and $Y$ individually. We consider the security of non-malleable-codes in the split-state model when the adversary is allowed to make use of arbitrary entanglement to tamper the parts $X$ and $Y$. We construct explicit quantum secure non-malleable-codes in the split-state model. Our construction of quantum secure non-malleable-codes is based on the recent construction of quantum secure $2$-source non-malleable-extractors by Boddu, Jain and Kapshikar [BJK21].
Submitted 8 June, 2023; v1 submitted 27 February, 2022;
originally announced February 2022.
-
Extractors: Low Entropy Requirements Colliding With Non-Malleability
Authors:
Divesh Aggarwal,
Eldon Chung,
Maciej Obremski
Abstract:
The known constructions of negligible error (non-malleable) two-source extractors can be broadly classified in three categories:
(1) Constructions where one source has min-entropy rate about $1/2$, the other source can have small min-entropy rate, but the extractor doesn't guarantee non-malleability.
(2) Constructions where one source is uniform, and the other can have small min-entropy rate, and the extractor guarantees non-malleability when the uniform source is tampered.
(3) Constructions where both sources have entropy rate very close to $1$ and the extractor guarantees non-malleability against the tampering of both sources.
We introduce a new notion of collision resistant extractors and in using it we obtain a strong two source non-malleable extractor where we require the first source to have $0.8$ entropy rate and the other source can have min-entropy polylogarithmic in the length of the source.
We show how the above extractor can be applied to obtain a non-malleable extractor with output rate $\frac 1 2$, which is optimal. We also show how, by using our extractor and extending the known protocol, one can obtain a privacy amplification secure against memory tampering where the size of the secret output is almost optimal.
Submitted 9 June, 2023; v1 submitted 7 November, 2021;
originally announced November 2021.
-
A conjectural asymptotic formula for multiplicative chaos in number theory
Authors:
Daksh Aggarwal,
Unique Subedi,
William Verreault,
Asif Zaman,
Chenghui Zheng
Abstract:
We investigate a special sequence of random variables $A(N)$ defined by an exponential power series with independent standard complex Gaussians $(X(k))_{k \geq 1}$. Introduced by Hughes, Keating, and O'Connell in the study of random matrix theory, this sequence relates to Gaussian multiplicative chaos (in particular "holomorphic multiplicative chaos'' per Najnudel, Paquette, and Simm) and random multiplicative functions. Soundararajan and Zaman recently determined the order of $\mathbb{E}[|A(N)|]$. By constructing an algorithm to calculate $A(N)$ in $O(N^2 \log N)$ steps, we produce computational evidence that their result can likely be strengthened to an asymptotic result with a numerical estimate for the asymptotic constant. We also obtain similar conclusions when $A(N)$ is defined using standard real Gaussians or uniform $\pm 1$ random variables. However, our evidence suggests that the asymptotic constants do not possess a natural product structure.
Submitted 25 August, 2021;
originally announced August 2021.
-
Sums of random multiplicative functions over function fields with few irreducible factors
Authors:
Daksh Aggarwal,
Unique Subedi,
William Verreault,
Asif Zaman,
Chenghui Zheng
Abstract:
We establish a normal approximation for the limiting distribution of partial sums of random Rademacher multiplicative functions over function fields, provided the number of irreducible factors of the polynomials is small enough. This parallels work of Harper for random Rademacher multiplicative functions over the integers.
Submitted 28 January, 2022; v1 submitted 18 August, 2021;
originally announced August 2021.
-
Generation of 1 Gb full entropy random numbers with the enhanced-NRBG method
Authors:
Deepika Aggarwal,
Karthick Balaji R,
Rohit Ghatikar,
Sruthi Chennuri,
Anindita Banerjee
Abstract:
Random numbers have significant applications in fundamental science, high-level scientific research, cryptography, and several other areas where there is a pressing need for high-quality random numbers. We present an experimental demonstration of a non-deterministic random bit generator from a quantum entropy source and a deterministic random bit generator mechanism to provide high quality random numbers providing a throughput of 1 Gb. Quantum entropy is realized by a series of quantum chips based on radioactive isotope Americium-241. The extracted raw random numbers are further post-processed to generate a high-entropy seed for the hash based deterministic random bit generator. We discuss the implementation of randomness extraction algorithm and Hash-DRBG algorithm in detail. The random numbers pass all randomness measures provided in ENT and NIST test suites.
Submitted 9 August, 2021;
originally announced August 2021.
-
Quantum Measurement Adversary
Authors:
Divesh Aggarwal,
Naresh Goud Boddu,
Rahul Jain,
Maciej Obremski
Abstract:
Multi-source-extractors are functions that extract uniform randomness from multiple (weak) sources of randomness. Quantum multi-source-extractors were considered by Kasher and Kempe (for the quantum-independent-adversary and the quantum-bounded-storage-adversary), Chung, Li and Wu (for the general-entangled-adversary) and Arnon-Friedman, Portmann and Scholz (for the quantum-Markov-adversary). One of the main objectives of this work is to unify all the existing quantum multi-source adversary models. We propose two new models of adversaries: 1) the quantum-measurement-adversary (qm-adv), which generates side-information using entanglement and on post-measurement and 2) the quantum-communication-adversary (qc-adv), which generates side-information using entanglement and communication between multiple sources. We show that, 1. qm-adv is the strongest adversary among all the known adversaries, in the sense that the side-information of all other adversaries can be generated by qm-adv. 2. The (generalized) inner-product function (in fact a general class of two-wise independent functions) continues to work as a good extractor against qm-adv with matching parameters as that of Chor and Goldreich. 3. A non-malleable-extractor proposed by Li (against classical-adversaries) continues to be secure against quantum side-information. This result implies a non-malleable-extractor result of Aggarwal, Chung, Lin and Vidick with uniform seed. We strengthen their result via a completely different proof to make the non-malleable-extractor of Li secure against quantum side-information even when the seed is not uniform. 4. A modification (working with weak sources instead of uniform sources) of the Dodis and Wichs protocol for privacy-amplification is secure against active quantum adversaries. This strengthens on a recent result due to Aggarwal, Chung, Lin and Vidick which uses uniform sources.
Submitted 6 June, 2023; v1 submitted 4 June, 2021;
originally announced June 2021.
-
Polynomial Matrices, Splitting Subspaces and Krylov Subspaces over Finite Fields
Authors:
Divya Aggarwal,
Samrith Ram
Abstract:
Let $T$ be a linear operator on an $\mathbb{F}_q$-vector space $V$ of dimension $n$. For any divisor $m$ of $n$, an $m$-dimensional subspace $W$ of $V$ is $T$-splitting if
$$ V =W\oplus TW\oplus \cdots \oplus T^{d-1}W, $$ where $d=n/m$. Let $σ(m,d;T)$ denote the number of $m$-dimensional $T$-splitting subspaces. Determining $σ(m,d;T)$ for an arbitrary operator $T$ is an open problem. This problem is closely related to another open problem on Krylov spaces. We discuss this connection and give explicit formulae for $σ(m,d;T)$ in the case where the invariant factors of $T$ satisfy certain degree conditions. A connection with another enumeration problem on polynomial matrices is also discussed.
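The definition of a $T$-splitting subspace can be checked by brute force in the smallest case. The sketch below is an illustration only, not the authors' method: it assumes $q$ prime, $n = 2$, $m = 1$ (so $d = 2$), where a line $W = \langle v \rangle$ is $T$-splitting exactly when $v$ and $Tv$ are linearly independent. The function names `mat_vec` and `count_splitting_lines` are hypothetical.

```python
from itertools import product


def mat_vec(T, v, q):
    """Multiply the matrix T by the column vector v over F_q (q prime)."""
    return tuple(sum(T[i][j] * v[j] for j in range(len(v))) % q
                 for i in range(len(T)))


def count_splitting_lines(T, q):
    """Count 1-dimensional T-splitting subspaces of F_q^2 (m = 1, d = 2).

    A line W = <v> is T-splitting iff F_q^2 = W + TW as a direct sum,
    i.e. iff v and Tv are linearly independent (nonzero 2x2 determinant).
    """
    seen = set()
    count = 0
    for v in product(range(q), repeat=2):
        if v == (0, 0):
            continue
        # Normalise v so that each line is visited exactly once:
        # scale by the inverse of its first nonzero coordinate.
        lead = next(x for x in v if x)
        inv = pow(lead, -1, q)  # modular inverse, Python 3.8+
        rep = tuple((inv * x) % q for x in v)
        if rep in seen:
            continue
        seen.add(rep)
        w = mat_vec(T, rep, q)
        det = (rep[0] * w[1] - rep[1] * w[0]) % q
        if det != 0:
            count += 1
    return count


# Companion matrix of x^2 + x + 1, which is irreducible over F_2,
# so T has no eigenvectors and every line is splitting.
companion = [[0, 1], [1, 1]]
identity = [[1, 0], [0, 1]]
print(count_splitting_lines(companion, 2))
print(count_splitting_lines(identity, 2))
```

For the identity operator no line is splitting, since $Tv = v$; for an operator with irreducible characteristic polynomial every line is, which gives all three lines of $\mathbb{F}_2^2$ in the example above.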
Submitted 31 May, 2021;
originally announced May 2021.
-
Dimension-Preserving Reductions Between SVP and CVP in Different $p$-Norms
Authors:
Divesh Aggarwal,
Yanlin Chen,
Rajendra Kumar,
Zeyong Li,
Noah Stephens-Davidowitz
Abstract:
$ \newcommand{\SVP}{\textsf{SVP}} \newcommand{\CVP}{\textsf{CVP}} \newcommand{\eps}{\varepsilon} $We show a number of reductions between the Shortest Vector Problem and the Closest Vector Problem over lattices in different $\ell_p$ norms ($\SVP_p$ and $\CVP_p$ respectively). Specifically, we present the following $2^{\eps m}$-time reductions for $1 \leq p \leq q \leq \infty$, which all increase the rank $n$ and dimension $m$ of the input lattice by at most one:
$\bullet$ a reduction from $\widetilde{O}(1/\eps^{1/p})γ$-approximate $\SVP_q$ to $γ$-approximate $\SVP_p$;
$\bullet$ a reduction from $\widetilde{O}(1/\eps^{1/p}) γ$-approximate $\CVP_p$ to $γ$-approximate $\CVP_q$; and
$\bullet$ a reduction from $\widetilde{O}(1/\eps^{1+1/p})$-$\CVP_q$ to $(1+\eps)$-unique $\SVP_p$ (which in turn trivially reduces to $(1+\eps)$-approximate $\SVP_p$).
The last reduction is interesting even in the case $p = q$. In particular, this special case subsumes much prior work adapting $2^{O(m)}$-time $\SVP_p$ algorithms to solve $O(1)$-approximate $\CVP_p$. In the (important) special case when $p = q$, $1 \leq p \leq 2$, and the $\SVP_p$ oracle is exact, we show a stronger reduction, from $O(1/\eps^{1/p})\text{-}\CVP_p$ to (exact) $\SVP_p$ in $2^{\eps m}$ time. For example, taking $\eps = \log m/m$ and $p = 2$ gives a slight improvement over Kannan's celebrated polynomial-time reduction from $\sqrt{m}\text{-}\CVP_2$ to $\SVP_2$. We also note that the last two reductions can be combined to give a reduction from approximate-$\CVP_p$ to $\SVP_q$ for any $p$ and $q$, regardless of whether $p \leq q$ or $p > q$.
Our techniques combine those from the recent breakthrough work of Eisenbrand and Venzin (which showed how to adapt the current fastest known algorithm for these problems in the $\ell_2$ norm to all $\ell_p$ norms) together with sparsification-based techniques.
Submitted 13 April, 2021;
originally announced April 2021.
-
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning & HPC Workloads
Authors:
Evangelos Georganas,
Dhiraj Kalamkar,
Sasikanth Avancha,
Menachem Adelman,
Deepti Aggarwal,
Cristina Anderson,
Alexander Breuer,
Jeremy Bruestle,
Narendra Chaudhary,
Abhisek Kundu,
Denise Kutnick,
Frank Laub,
Vasimuddin Md,
Sanchit Misra,
Ramanarayan Mohanty,
Hans Pabst,
Brian Retford,
Barukh Ziv,
Alexander Heinecke
Abstract:
During the past decade, novel Deep Learning (DL) algorithms, workloads and hardware have been developed to tackle a wide range of problems. Despite the advances in workload and hardware ecosystems, the programming methodology of DL systems is stagnant. DL workloads leverage either highly-optimized, yet platform-specific and inflexible kernels from DL libraries, or in the case of novel operators, reference implementations are built via DL framework primitives with underwhelming performance. This work introduces the Tensor Processing Primitives (TPP), a programming abstraction striving for efficient, portable implementation of DL workloads with high-productivity. TPPs define a compact, yet versatile set of 2D-tensor operators (or a virtual Tensor ISA), which subsequently can be utilized as building-blocks to construct complex operators on high-dimensional tensors. The TPP specification is platform-agnostic, thus code expressed via TPPs is portable, whereas the TPP implementation is highly-optimized and platform-specific. We demonstrate the efficacy and viability of our approach using standalone kernels and end-to-end DL & HPC workloads expressed entirely via TPPs that outperform state-of-the-art implementations on multiple platforms.
Submitted 30 November, 2021; v1 submitted 12 April, 2021;
originally announced April 2021.
-
FedFace: Collaborative Learning of Face Recognition Model
Authors:
Divyansh Aggarwal,
Jiayu Zhou,
Anil K. Jain
Abstract:
DNN-based face recognition models require large centrally aggregated face datasets for training. However, due to the growing data privacy concerns and legal restrictions, accessing and sharing face datasets has become exceedingly difficult. We propose FedFace, a federated learning (FL) framework for collaborative learning of face recognition models in a privacy-aware manner. FedFace utilizes the face images available on multiple clients to learn an accurate and generalizable face recognition model where the face images stored at each client are neither shared with other clients nor the central host and each client is a mobile device containing face images pertaining to only the owner of the device (one identity per client). Our experiments show the effectiveness of FedFace in enhancing the verification performance of pre-trained face recognition system on standard face verification benchmarks namely LFW, IJB-A, and IJB-C.
Submitted 24 June, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
Splitting Subspaces of Linear Operators over Finite Fields
Authors:
Divya Aggarwal,
Samrith Ram
Abstract:
Let $V$ be a vector space of dimension $N$ over the finite field $\mathbb{F}_q$ and $T$ be a linear operator on $V$. Given an integer $m$ that divides $N$, an $m$-dimensional subspace $W$ of $V$ is $T$-splitting if $V=W\oplus TW\oplus \cdots \oplus T^{d-1}W$ where $d=N/m$. Let $σ(m,d;T)$ denote the number of $m$-dimensional $T$-splitting subspaces. Determining $σ(m,d;T)$ for an arbitrary operator $T$ is an open problem. We prove that $σ(m,d;T)$ depends only on the similarity class type of $T$ and give an explicit formula in the special case where $T$ is cyclic and nilpotent. Denote by $σ_q(m,d;τ)$ the number of $m$-dimensional splitting subspaces for a linear operator of similarity class type $τ$ over an $\mathbb{F}_q$-vector space of dimension $md$. For fixed values of $m,d$ and $τ$, we show that $σ_q(m,d;τ)$ is a polynomial in $q$.
Submitted 21 January, 2021; v1 submitted 15 December, 2020;
originally announced December 2020.
-
Lifting 2D StyleGAN for 3D-Aware Face Generation
Authors:
Yichun Shi,
Divyansh Aggarwal,
Anil K. Jain
Abstract:
We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation. Our model is "3D-aware" in the sense that it is able to (1) disentangle the latent space of StyleGAN2 into texture, shape, viewpoint, and lighting and (2) generate 3D components for rendering synthetic images. Unlike most previous methods, our method is completely self-supervised, i.e., it requires neither manual annotation nor a 3DMM model for training. Instead, it learns to generate images as well as their 3D components by distilling the prior knowledge in StyleGAN2 with a differentiable renderer. The proposed model is able to output both the 3D shape and texture, allowing explicit pose and lighting control over generated images. Qualitative and quantitative results show the superiority of our approach over existing methods on 3D-controllable GANs in content controllability while generating realistic high-quality images.
Submitted 18 April, 2021; v1 submitted 26 November, 2020;
originally announced November 2020.
-
Unpredictable and Uniform RNG based on time of arrival using InGaAs Detectors
Authors:
Anindita Banerjee,
Deepika Aggarwal,
Ankush Sharma,
Ganesh Yadav
Abstract:
Quantum random number generators are becoming mandatory in a demanding technology world of high-performing learning algorithms and security guidelines. Our implementation, based on principles of quantum mechanics, enables us to achieve the required randomness. We have generated high-quality quantum random numbers from a weak coherent source at telecommunication wavelength. The entropy is based on the time of arrival of quantum states within a predefined time interval. The detection of photons by the InGaAs single-photon detectors and high-precision time measurement of 5 ps enable us to generate 16 random bits per arrival time, which is the highest reported to date. We have presented the theoretical analysis and experimental verification of the random number generation methodology. The method eliminates the need to apply any randomness extractor, thereby leveraging the principles of quantum physics to generate random numbers. The average output data rate is 2.4 Mbps. The raw quantum random numbers are compared with the NIST-prescribed Blum-Blum-Shub pseudo random number generator and an in-house built hardware random number generator from FPGA, on the ENT and NIST Platform.
Submitted 16 July, 2021; v1 submitted 24 October, 2020;
originally announced October 2020.
-
A $2^{n/2}$-Time Algorithm for $\sqrt{n}$-SVP and $\sqrt{n}$-Hermite SVP, and an Improved Time-Approximation Tradeoff for (H)SVP
Authors:
Divesh Aggarwal,
Zeyong Li,
Noah Stephens-Davidowitz
Abstract:
We show a $2^{n/2+o(n)}$-time algorithm that finds a (non-zero) vector in a lattice $\mathcal{L} \subset \mathbb{R}^n$ with norm at most $\tilde{O}(\sqrt{n})\cdot \min\{λ_1(\mathcal{L}), \det(\mathcal{L})^{1/n}\}$, where $λ_1(\mathcal{L})$ is the length of a shortest non-zero lattice vector and $\det(\mathcal{L})$ is the lattice determinant. Minkowski showed that $λ_1(\mathcal{L}) \leq \sqrt{n} \det(\mathcal{L})^{1/n}$ and that there exist lattices with $λ_1(\mathcal{L}) \geq Ω(\sqrt{n}) \cdot \det(\mathcal{L})^{1/n}$, so that our algorithm finds vectors that are as short as possible relative to the determinant (up to a polylogarithmic factor).
The main technical contribution behind this result is a new analysis of (a simpler variant of) an algorithm from arXiv:1412.7994, which was only previously known to solve less useful problems. To achieve this, we rely crucially on the "reverse Minkowski theorem" (conjectured by Dadush arXiv:1606.06913 and proven by arXiv:1611.05979), which can be thought of as a partial converse to the fact that $λ_1(\mathcal{L}) \leq \sqrt{n} \det(\mathcal{L})^{1/n}$.
Previously, the fastest known algorithm for finding such a vector was the $2^{.802n + o(n)}$-time algorithm due to [Liu, Wang, Xu, and Zheng, 2011], which actually found a non-zero lattice vector with length $O(1) \cdot λ_1(\mathcal{L})$. Though we do not show how to find lattice vectors with this length in time $2^{n/2+o(n)}$, we do show that our algorithm suffices for the most important application of such algorithms: basis reduction. In particular, we show a modified version of Gama and Nguyen's slide-reduction algorithm [Gama and Nguyen, STOC 2008], which can be combined with the algorithm above to improve the time-length tradeoff for shortest-vector algorithms in nearly all regimes, including the regimes relevant to cryptography.
Submitted 18 July, 2020;
originally announced July 2020.
-
A Note on the Concrete Hardness of the Shortest Independent Vectors Problem in Lattices
Authors:
Divesh Aggarwal,
Eldon Chung
Abstract:
Blömer and Seifert showed that $\mathsf{SIVP}_2$ is NP-hard to approximate by giving a reduction from $\mathsf{CVP}_2$ to $\mathsf{SIVP}_2$ for constant approximation factors as long as the $\mathsf{CVP}$ instance has a certain property. In order to formally define this requirement on the $\mathsf{CVP}$ instance, we introduce a new computational problem called the Gap Closest Vector Problem with Bounded Minima. We adapt the proof of Blömer and Seifert to show a reduction from the Gap Closest Vector Problem with Bounded Minima to $\mathsf{SIVP}$ for any $\ell_p$ norm for some constant approximation factor greater than $1$.
In a recent result, Bennett, Golovnev and Stephens-Davidowitz showed that under Gap-ETH, there is no $2^{o(n)}$-time algorithm for approximating $\mathsf{CVP}_p$ up to some constant factor $γ\geq 1$ for any $1 \leq p \leq \infty$. We observe that the reduction in their paper can be viewed as a reduction from $\mathsf{Gap3SAT}$ to the Gap Closest Vector Problem with Bounded Minima. This, together with the above mentioned reduction, implies that, under Gap-ETH, there is no $2^{o(n)}$-time algorithm for approximating $\mathsf{SIVP}_p$ up to some constant factor $γ\geq 1$ for any $1 \leq p \leq \infty$.
Submitted 31 October, 2020; v1 submitted 24 May, 2020;
originally announced May 2020.
-
A Novel Column Generation Heuristic for Airline Crew Pairing Optimization with Large-scale Complex Flight Networks
Authors:
Divyam Aggarwal,
Dhish Kumar Saxena,
Saaju Pualose,
Thomas Bäck,
Michael Emmerich
Abstract:
Crew Pairing Optimization (CPO) is critical for an airline's business viability, given that the crew operating cost is second only to the fuel cost. CPO aims at generating a set of flight sequences (crew pairings) to cover all scheduled flights, at minimum cost, while satisfying several legality constraints. The state-of-the-art heavily relies on relaxing the underlying Integer Programming Problem into a Linear Programming Problem, which in turn is solved through the Column Generation (CG) technique. However, with the alarmingly expanding airlines' operations, CPO is marred by the curse of dimensionality, rendering the exact CG-implementations obsolete, and necessitating the heuristic-based CG-implementations. Yet, in the literature, the highly prevalent large-scale complex flight networks involving multiple crew bases and/or hub-and-spoke sub-networks remain largely uninvestigated. This paper proposes a novel CG heuristic, which has enabled the in-house development of an Airline Crew Pairing Optimizer (AirCROP). The efficacy of the heuristic/AirCROP has been tested on real-world, large-scale, complex network instances with over 4,200 flights, 15 crew bases, and multiple hub-and-spoke sub-networks (resulting in billion-plus possible pairings). Notably, this paper has a dedicated focus on the proposed CG heuristic (not the entire AirCROP framework) based on balancing random exploration of pairings; exploitation of domain knowledge (on optimal solution features); and utilization of the past computational & search effort through archiving. Though this paper has an airline context, the proposed CG heuristic may find wider applications across different domains, by serving as a template on how to utilize domain knowledge to better tackle combinatorial optimization problems.
Submitted 2 July, 2021; v1 submitted 18 May, 2020;
originally announced May 2020.
-
On Learning Combinatorial Patterns to Assist Large-Scale Airline Crew Pairing Optimization
Authors:
Divyam Aggarwal,
Yash Kumar Singh,
Dhish Kumar Saxena
Abstract:
Airline Crew Pairing Optimization (CPO) aims at generating a set of legal flight sequences (crew pairings), to cover an airline's flight schedule, at minimum cost. It is usually performed using Column Generation (CG), a mathematical programming technique for guided search-space exploration. CG exploits the interdependencies between the current and the preceding CG-iteration for generating new variables (pairings) during the optimization-search. However, with the unprecedented scale and complexity of the emergent flight networks, it has become imperative to learn higher-order interdependencies among the flight-connection graphs, and utilize those to enhance the efficacy of the CPO. In a first of its kind, and in what marks a significant departure from the state-of-the-art, this paper proposes a novel adaptation of the Variational Graph Auto-Encoder for learning plausible combinatorial patterns among the flight-connection data obtained through the search-space exploration by an Airline Crew Pairing Optimizer, AirCROP (developed by the authors and validated by the research consortium's industrial sponsor, GE Aviation). The resulting flight-connection predictions are combined on-the-fly using a novel heuristic to generate new pairings for the optimizer. The utility of the proposed approach is demonstrated on large-scale (over 4200 flights), real-world, complex flight-networks of US-based airlines, characterized by multiple hub-and-spoke subnetworks and several crew bases.
Submitted 2 May, 2020; v1 submitted 28 April, 2020;
originally announced April 2020.
-
Child Face Age-Progression via Deep Feature Aging
Authors:
Debayan Deb,
Divyansh Aggarwal,
Anil K. Jain
Abstract:
Given a gallery of face images of missing children, state-of-the-art face recognition systems fall short in identifying a child (probe) recovered at a later age. We propose a feature aging module that can age-progress deep face features output by a face matcher. In addition, the feature aging module guides age-progression in the image space such that synthesized aged faces can be utilized to enhance longitudinal face recognition performance of any face matcher without requiring any explicit training. For time lapses larger than 10 years (the missing child is found after 10 or more years), the proposed age-progression module improves the closed-set identification accuracy of FaceNet from 16.53% to 21.44% and CosFace from 60.72% to 66.12% on a child celebrity dataset, namely ITWCC. The proposed method also outperforms state-of-the-art approaches with a rank-1 identification rate of 95.91%, compared to 94.91%, on a public aging dataset, FG-NET, and 99.58%, compared to 99.50%, on CACD-VS. These results suggest that aging face features enhances the ability to identify young children who are possible victims of child trafficking or abduction.
Submitted 17 March, 2020;
originally announced March 2020.
-
On Initializing Airline Crew Pairing Optimization for Large-scale Complex Flight Networks
Authors:
Divyam Aggarwal,
Dhish Kumar Saxena,
Thomas Bäck,
Michael Emmerich
Abstract:
Crew pairing optimization (CPO) is critically important for any airline, since its crew operating costs are second-largest, next to the fuel-cost. CPO aims at generating a set of flight sequences (crew pairings) covering a flight-schedule, at minimum-cost, while satisfying several legality constraints. For large-scale complex flight networks, billion-plus legal pairings (variables) are possible, rendering their offline enumeration intractable and an exhaustive search for their minimum-cost full flight-coverage subset impractical. Even generating an initial feasible solution (IFS: a manageable set of legal pairings covering all flights), which could be subsequently optimized, is a difficult (NP-complete) problem. Though the authors have developed a crew pairing optimizer (AirCROP) as part of a larger project, this paper focuses specifically on IFS-generation through a novel heuristic based on a divide-and-cover strategy and Integer Programming. For real-world large and complex flight network datasets (including over 3200 flights and 15 crew bases) provided by GE Aviation, the proposed heuristic shows up to a ten-fold speed improvement over another state-of-the-art approach. Unprecedentedly, this paper presents an empirical investigation of the impact of IFS-cost on the final (optimized) solution-cost, revealing that too low an IFS-cost does not necessarily imply faster convergence for AirCROP or even lower cost for the optimized solution.
Submitted 15 March, 2020;
originally announced March 2020.
-
Airline Crew Pairing Optimization Framework for Large Networks with Multiple Crew Bases and Hub-and-Spoke Subnetworks
Authors:
Divyam Aggarwal,
Dhish Kumar Saxena,
Thomas Bäck,
Michael Emmerich
Abstract:
Crew Pairing Optimization (CPO) aims at generating a set of flight sequences (crew pairings), covering all flights in an airline's flight schedule, at minimum cost, while satisfying several legality constraints. CPO is critically important for airlines' business viability, considering that the crew operating cost is their second-largest expense. It poses an NP-hard combinatorial optimization problem, to tackle which, the state-of-the-art relies on relaxing the underlying Integer Programming Problem (IPP) into a Linear Programming Problem (LPP), solving the latter through the Column Generation (CG) technique, and integerization of the resulting LPP solution. However, with the growing scale and complexity of the flight networks (those with a large number of flights, multiple crew bases and/or multiple hub-and-spoke subnetworks), the utility of the conventional CG-practices has become questionable. This paper proposes an Airline Crew Pairing Optimization Framework, AirCROP, whose constitutive modules include the Legal Crew Pairing Generator, Initial Feasible Solution Generator, and an Optimization Engine built on heuristic-based CG-implementation. In this paper, besides the design of AirCROP's modules, insights into important questions related to how these modules interact, which the literature is otherwise silent on, have been shared. These relate to the sensitivity of AirCROP's performance towards: sources of variability over multiple runs for a given problem, initialization method, and termination parameters for LPP-solutioning and IPP-solutioning. The efficacy of AirCROP has been demonstrated on real-world large-scale and complex flight networks (with over 4200 flights, 15 crew bases, and billion-plus pairings). It is hoped that with the emergence of such complex flight networks, this paper shall serve as an important milestone for affiliated research and applications.
Submitted 18 November, 2020; v1 submitted 9 March, 2020;
originally announced March 2020.
-
Real-World Airline Crew Pairing Optimization: Customized Genetic Algorithm versus Column Generation Method
Authors:
Divyam Aggarwal,
Dhish Kumar Saxena,
Thomas Back,
Michael Emmerich
Abstract:
The airline crew pairing optimization problem (CPOP) aims to find a set of flight sequences (crew pairings) that cover all flights in an airline's highly constrained flight schedule at minimum cost. Since crew cost is second only to the fuel cost, CPOP solutioning is critically important for an airline. However, CPOP is NP-hard, and tackling it is quite challenging. The literature suggests that when the CPOP's scale and complexity are reasonably limited, and an enumeration of all crew pairings is possible, metaheuristics are used, predominantly Genetic Algorithms (GAs). Otherwise, Column Generation (CG) based Mixed Integer Programming techniques are used. Notably, as per the literature, a maximum of 45,000 crew pairings have been tackled by GAs. In a significant departure, this paper considers over 800 flights of a US-based large airline (with a monthly network of over 33,000 flights), and tests the efficacy of GAs by enumerating all 400,000+ crew pairings a priori. To this end, this paper proposes a domain-knowledge-driven customized-GA. The utility of incorporating domain-knowledge in GA operations, particularly initialization and crossover, is highlighted through suitable experiments. Finally, the proposed GA's performance is compared with a CG-based approach (developed in-house by the authors). Though the latter is found to perform better in terms of the solution's cost-quality and run time, it is hoped that this paper will help in better understanding the strengths and limitations of domain-knowledge-driven customizations in GAs, for solving combinatorial optimization problems, including CPOPs.
Submitted 27 May, 2023; v1 submitted 8 March, 2020;
originally announced March 2020.
-
Improved Classical and Quantum Algorithms for the Shortest Vector Problem via Bounded Distance Decoding
Authors:
Divesh Aggarwal,
Yanlin Chen,
Rajendra Kumar,
Yixin Shen
Abstract:
The most important computational problem on lattices is the Shortest Vector Problem (SVP). In this paper, we present new algorithms that improve the state-of-the-art for provable classical/quantum algorithms for SVP. We present the following results. $\bullet$ A new algorithm for SVP that provides a smooth tradeoff between time complexity and memory requirement. For any positive integer $4\leq q\leq \sqrt{n}$, our algorithm takes $q^{13n+o(n)}$ time and requires $poly(n)\cdot q^{16n/q^2}$ memory. This tradeoff, which ranges from enumeration ($q=\sqrt{n}$) to sieving ($q$ constant), is a consequence of a new time-memory tradeoff for Discrete Gaussian sampling above the smoothing parameter.
$\bullet$ A quantum algorithm for SVP that runs in time $2^{0.950n+o(n)}$ and requires $2^{0.5n+o(n)}$ classical memory and poly(n) qubits. In the Quantum Random Access Memory (QRAM) model this algorithm takes only $2^{0.835n+o(n)}$ time and requires a QRAM of size $2^{0.293n+o(n)}$, poly(n) qubits and $2^{0.5n}$ classical space. This improves over the previously fastest classical (which is also the fastest quantum) algorithm due to [ADRS15] that has a time and space complexity of $2^{n+o(n)}$.
$\bullet$ A classical algorithm for SVP that runs in $2^{1.669n+o(n)}$ time and $2^{0.5n+o(n)}$ space. This improves over an algorithm of [CCL18] that has the same space complexity.
The time complexities of our classical and quantum algorithms are obtained using a known upper bound on a quantity related to the lattice kissing number, which is $2^{0.402n}$. We conjecture that for most lattices this quantity is $2^{o(n)}$. Assuming that this is the case, our classical algorithm runs in time $2^{1.292n+o(n)}$, our quantum algorithm runs in time $2^{0.750n+o(n)}$ and our quantum algorithm in the QRAM model runs in time $2^{0.667n+o(n)}$.
Submitted 10 May, 2022; v1 submitted 18 February, 2020;
originally announced February 2020.
-
Finding Missing Children: Aging Deep Face Features
Authors:
Debayan Deb,
Divyansh Aggarwal,
Anil K. Jain
Abstract:
Given a gallery of face images of missing children, state-of-the-art face recognition systems fall short in identifying a child (probe) recovered at a later age. We propose an age-progression module that can age-progress deep face features output by any commodity face matcher. For time lapses larger than 10 years (the missing child is found after 10 or more years), the proposed age-progression module improves the closed-set identification accuracy of FaceNet from 40% to 49.56% and CosFace from 56.88% to 61.25% on a child celebrity dataset, namely ITWCC. The proposed method also outperforms state-of-the-art approaches with a rank-1 identification rate from 94.91% to 95.91% on a public aging dataset, FG-NET, and from 99.50% to 99.58% on CACD-VS. These results suggest that aging face features enhances the ability to identify young children who are possible victims of child trafficking or abduction.
Submitted 18 November, 2019; v1 submitted 18 November, 2019;
originally announced November 2019.
-
Generalized Boolean Functions and Quantum Circuits on IBM-Q
Authors:
Sugata Gangopadhyay,
Vishvendra Singh Poonia,
Daattavya Aggarwal,
Rhea Parekh
Abstract:
We explicitly derive a connection between quantum circuits utilising IBM's quantum gate set and multivariate quadratic polynomials over integers modulo 8. We demonstrate that the action of a quantum circuit over input qubits can be written as a generalized Walsh-Hadamard transform. Here, we derive the polynomials corresponding to implementations of the Swap gate and Toffoli gate using the IBM-Q gate set.
Submitted 15 November, 2019;
originally announced November 2019.
-
Fine-grained hardness of CVP(P) -- Everything that we can prove (and nothing else)
Authors:
Divesh Aggarwal,
Huck Bennett,
Alexander Golovnev,
Noah Stephens-Davidowitz
Abstract:
We show a number of fine-grained hardness results for the Closest Vector Problem in the $\ell_p$ norm ($\mathrm{CVP}_p$), and its approximate and non-uniform variants. First, we show that $\mathrm{CVP}_p$ cannot be solved in $2^{(1-\varepsilon)n}$ time for all $p \notin 2\mathbb{Z}$ and $\varepsilon > 0$, assuming the Strong Exponential Time Hypothesis (SETH). Second, we extend this by showing that there is no $2^{(1-\varepsilon)n}$-time algorithm for approximating $\mathrm{CVP}_p$ to within a constant factor $γ$ for such $p$ assuming a "gap" version of SETH, with an explicit relationship between $γ$, $p$, and the arity $k = k(\varepsilon)$ of the underlying hard CSP. Third, we show the same hardness result for (exact) $\mathrm{CVP}_p$ with preprocessing (assuming non-uniform SETH).
For exact "plain" $\mathrm{CVP}_p$, the same hardness result was shown in [Bennett, Golovnev, and Stephens-Davidowitz FOCS 2017] for all but finitely many $p \notin 2\mathbb{Z}$, where the set of exceptions depended on $\varepsilon$ and was not explicit. For the approximate and preprocessing problems, only very weak bounds were known prior to this work.
We also show that the restriction to $p \notin 2\mathbb{Z}$ is in some sense inherent. In particular, we show that no "natural" reduction can rule out even a $2^{3n/4}$-time algorithm for $\mathrm{CVP}_2$ under SETH. For this, we prove that the possible sets of closest lattice vectors to a target in the $\ell_2$ norm have quite rigid structure, which essentially prevents them from being as expressive as $3$-CNFs.
We prove these results using techniques from many different fields, including complex analysis, functional analysis, additive combinatorics, and discrete Fourier analysis. E.g., along the way, we give a new (and tighter) proof of Szemerédi's cube lemma for the boolean cube.
Submitted 7 August, 2021; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Slide Reduction, Revisited---Filling the Gaps in SVP Approximation
Authors:
Divesh Aggarwal,
Jianwei Li,
Phong Q. Nguyen,
Noah Stephens-Davidowitz
Abstract:
We show how to generalize Gama and Nguyen's slide reduction algorithm [STOC '08] for solving the approximate Shortest Vector Problem over lattices (SVP). As a result, we show the fastest provably correct algorithm for $δ$-approximate SVP for all approximation factors $n^{1/2+\varepsilon} \leq δ\leq n^{O(1)}$. This is the range of approximation factors most relevant for cryptography.
Submitted 10 August, 2019;
originally announced August 2019.
-
An improved constant in Banaszczyk's transference theorem
Authors:
Divesh Aggarwal,
Noah Stephens-Davidowitz
Abstract:
We show that \[ μ(\mathcal{L}) λ_1(\mathcal{L}^*) < \big( 0.1275 + o(1) \big) \cdot n \; , \] where $μ(\mathcal{L})$ is the covering radius of an $n$-dimensional lattice $\mathcal{L} \subset \mathbb{R}^n$ and $λ_1(\mathcal{L}^*)$ is the length of the shortest non-zero vector in the dual lattice $\mathcal{L}^*$. This improves on Banaszczyk's celebrated transference theorem (Math. Annal., 1993) by about 20%.
Our proof follows Banaszczyk exactly, except in one step, where we replace a Fourier-analytic bound on the discrete Gaussian mass with a slightly stronger bound based on packing. The packing-based bound that we use was already proven by Aggarwal, Dadush, Regev, and Stephens-Davidowitz (STOC, 2015) in a very different context. Our contribution is therefore simply the observation that this implies a better transference theorem.
Submitted 21 July, 2019;
originally announced July 2019.