-
SIMD-Aware Homomorphic Compression and Application to Private Database Query
Authors:
Jung Hee Cheon,
Keewoo Lee,
Jai Hyun Park,
Yongdong Yeo
Abstract:
In a private database query scheme (PDQ), a server maintains a database, and users send queries to retrieve records of interest from the server while keeping their queries private. A crucial step in PDQ protocols based on homomorphic encryption is homomorphic compression, which compresses encrypted sparse vectors consisting of query results. In this work, we propose a new homomorphic compression scheme with PDQ as its main application. Unlike existing approaches, our scheme (i) can be efficiently implemented by fully exploiting homomorphic SIMD technique and (ii) enjoys both asymptotically optimal compression rate and asymptotically good decompression complexity. Experimental results show that our approach is 4.7x to 33.2x faster than the previous best results.
Submitted 30 August, 2024;
originally announced August 2024.
-
ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading
Authors:
Zhiyuan Yang,
Bo Zhang,
Yufei Shi,
Ningze Zhong,
Johnathan Loh,
Huihui Fang,
Yanwu Xu,
Si Yong Yeo
Abstract:
Glaucoma is one of the leading causes of vision impairment. Digital imaging techniques, such as color fundus photography (CFP) and optical coherence tomography (OCT), provide quantitative and noninvasive methods for glaucoma diagnosis. Recently, in the field of computer-aided glaucoma diagnosis, multi-modality methods that integrate the CFP and OCT modalities have achieved greater diagnostic accuracy compared to single-modality methods. However, it remains challenging to extract reliable features due to the high similarity of medical images and the unbalanced multi-modal data distribution. Moreover, existing methods overlook the uncertainty estimation of different modalities, leading to unreliable predictions. To address these challenges, we propose a novel framework, namely ETSCL, which consists of a contrastive feature extraction stage and a decision-level fusion stage. Specifically, the supervised contrastive loss is employed to enhance the discriminative power in the feature extraction process, resulting in more effective features. In addition, we utilize the Frangi vesselness algorithm as a preprocessing step to incorporate vessel information to assist in the prediction. In the decision-level fusion stage, an evidence theory-based multi-modality classifier is employed to combine multi-source information with uncertainty estimation. Extensive experiments demonstrate that our method achieves state-of-the-art performance. The code is available at https://github.com/master-Shix/ETSCL.
Submitted 19 July, 2024;
originally announced July 2024.
-
Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation
Authors:
Mykhailo Uss,
Ruslan Yermolenko,
Olena Kolodiazhna,
Oleksii Shashko,
Ivan Safonov,
Volodymyr Savin,
Yoonjae Yeo,
Seowon Ji,
Jaeyun Jeong
Abstract:
Quantization is widely used to increase deep neural networks' (DNN) memory, computation, and power efficiency. Various techniques, such as post-training quantization and quantization-aware training, have been proposed to improve quantization quality. We introduce a novel approach for DNN quantization that uses a redundant representation of DNN's output. We represent the target quantity as a point on a 2D parametric curve. The DNN model is modified to predict 2D points that are mapped back to the target quantity at a post-processing stage. We demonstrate that this mapping can reduce quantization error. For the low-order parametric Hilbert curve, Depth-From-Stereo task, and two models represented by U-Net architecture and vision transformer, we achieved a quantization error reduction by about 5 times for the INT8 model at both CPU and DSP delegates. This gain comes with a minimal inference time increase (less than 7%). Our approach can be applied to other tasks, including segmentation, object detection, and key-points prediction.
Submitted 22 May, 2024;
originally announced May 2024.
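The idea in the abstract above, representing a scalar target as a point on a 2D Hilbert curve and mapping predicted points back to the scalar, can be illustrated with a discrete Hilbert curve. This is only a sketch and not the authors' implementation: it uses the standard index-to-coordinate conversion on a 2^order by 2^order grid, and the function names (`d2xy`, `xy2d`, `encode`, `decode`), the grid order, and the [0, 1) target range are assumptions made for illustration.

```python
def _rot(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def d2xy(order, d):
    """Map a curve index d in [0, 4**order) to grid coordinates (x, y)."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rot(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def xy2d(order, x, y):
    """Inverse of d2xy: map grid coordinates back to the curve index."""
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rot(n, x, y, rx, ry)
        s //= 2
    return d


def encode(value, order):
    """Hypothetical encoder: scalar in [0, 1) -> 2D point on the curve."""
    cells = 4 ** order
    idx = min(int(value * cells), cells - 1)
    return d2xy(order, idx)


def decode(x, y, order):
    """Hypothetical decoder: 2D grid point -> scalar at the cell centre."""
    cells = 4 ** order
    return (xy2d(order, x, y) + 0.5) / cells
```

Because consecutive curve indices always land on neighbouring grid cells, nearby scalar values map to nearby 2D points; the post-processing step in this sketch is simply `decode`.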
-
HyperCLOVA X Technical Report
Authors:
Kang Min Yoo,
Jaegeun Han,
Sookyo In,
Heewon Jeon,
Jisu Jeong,
Jaewook Kang,
Hyunwook Kim,
Kyung-Min Kim,
Munhyong Kim,
Sungju Kim,
Donghyun Kwak,
Hanock Kwak,
Se Jung Kwon,
Bado Lee,
Dongsoo Lee,
Gichang Lee,
Jooho Lee,
Baeseong Park,
Seongjin Shin,
Joonsang Yu,
Seolki Baek,
Sumin Byeon,
Eungsup Cho,
Dooseok Choe,
Jeesung Han
, et al. (371 additional authors not shown)
Abstract:
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
Submitted 13 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Help Me Reflect: Leveraging Self-Reflection Interface Nudges to Enhance Deliberativeness on Online Deliberation Platforms
Authors:
Shun Yi Yeo,
Gionnieve Lim,
Jie Gao,
Weiyu Zhang,
Simon Tangi Perrault
Abstract:
The deliberative potential of online platforms has been widely examined. However, little is known about how various interface-based reflection nudges impact the quality of deliberation. This paper presents two user studies with 12 and 120 participants, respectively, to investigate the impacts of different reflective nudges on the quality of deliberation. In the first study, we examined five distinct reflective nudges: persona, temporal prompts, analogies and metaphors, cultural prompts and storytelling. Persona, temporal prompts, and storytelling emerged as the preferred nudges for implementation on online deliberation platforms. In the second study, we assess the impacts of these preferred reflectors more thoroughly. Results revealed a significant positive impact of these reflectors on deliberative quality. Specifically, persona promotes a deliberative environment for balanced and opinionated viewpoints while temporal prompts promote more individualised viewpoints. Our findings suggest that the choice of reflectors can significantly influence the dynamics and shape the nature of online discussions.
Submitted 19 January, 2024;
originally announced January 2024.
-
Revisiting Cephalometric Landmark Detection from the view of Human Pose Estimation with Lightweight Super-Resolution Head
Authors:
Qian Wu,
Si Yong Yeo,
Yufei Chen,
Jun Liu
Abstract:
Accurate localization of cephalometric landmarks holds great importance in the fields of orthodontics and orthognathics due to its potential for automating key point labeling. In the context of landmark detection, particularly in cephalometrics, it has been observed that existing methods often lack standardized pipelines and well-designed bias reduction processes, which significantly impact their performance. In this paper, we revisit a related task, human pose estimation (HPE), which shares numerous similarities with cephalometric landmark detection (CLD), and emphasize the potential for transferring techniques from the former field to benefit the latter. Motivated by this insight, we have developed a robust and adaptable benchmark based on the well-established HPE codebase known as MMPose. This benchmark can serve as a dependable baseline for achieving exceptional CLD performance. Furthermore, we introduce an upscaling design within the framework to further enhance performance. This enhancement involves the incorporation of a lightweight and efficient super-resolution module, which generates heatmap predictions on high-resolution features and leads to further performance refinement, benefiting from its ability to reduce quantization bias. In the MICCAI CLDetection2023 challenge, our method achieves 1st place ranking on three metrics and 3rd place on the remaining one. The code for our method is available at https://github.com/5k5000/CLdetection2023.
Submitted 29 September, 2023;
originally announced September 2023.
-
SINC: Self-Supervised In-Context Learning for Vision-Language Tasks
Authors:
Yi-Syuan Chen,
Yun-Zhu Song,
Cheng Yu Yeo,
Bei Liu,
Jianlong Fu,
Hong-Han Shuai
Abstract:
Large Pre-trained Transformers exhibit an intriguing capacity for in-context learning. Without gradient updates, these models can rapidly construct new predictors from demonstrations presented in the inputs. Recent works promote this ability in the vision-language domain by incorporating visual information into large language models that can already make in-context predictions. However, these methods could inherit issues in the language domain, such as template sensitivity and hallucination. Also, the scale of these language models raises a significant demand for computations, making learning and operating these models resource-intensive. To this end, we raise a question: "How can we enable in-context learning without relying on the intrinsic in-context ability of large language models?" To answer it, we propose a succinct and general framework, Self-supervised IN-Context learning (SINC), that introduces a meta-model to learn on self-supervised prompts consisting of tailored demonstrations. The learned models can be transferred to downstream tasks for making in-context predictions on-the-fly. Extensive experiments show that SINC outperforms gradient-based methods in various vision-language tasks under few-shot settings. Furthermore, the designs of SINC help us investigate the benefits of in-context learning across different tasks, and the analysis further reveals the essential components for the emergence of in-context learning in the vision-language domain.
Submitted 19 August, 2023; v1 submitted 15 July, 2023;
originally announced July 2023.
-
Star-specific Key-homomorphic PRFs from Learning with Linear Regression
Authors:
Vipin Singh Sehrawat,
Foo Yee Yeo,
Dmitriy Vassilyev
Abstract:
We introduce a novel method to derandomize the learning with errors (LWE) problem by generating deterministic yet sufficiently independent LWE instances that are constructed by using linear regression models, which are generated via (wireless) communication errors. We also introduce star-specific key-homomorphic (SSKH) pseudorandom functions (PRFs), which are defined by the respective sets of parties that construct them. We use our derandomized variant of LWE to construct a SSKH PRF family. The sets of parties constructing SSKH PRFs are arranged as star graphs with possibly shared vertices, i.e., the pairs of sets may have non-empty intersections. We reduce the security of our SSKH PRF family to the hardness of LWE. To establish the maximum number of SSKH PRFs that can be constructed -- by a set of parties -- in the presence of passive/active and external/internal adversaries, we prove several bounds on the size of maximally cover-free at most $t$-intersecting $k$-uniform family of sets $\mathcal{H}$, where the three properties are defined as: (i) $k$-uniform: $\forall A \in \mathcal{H}: |A| = k$, (ii) at most $t$-intersecting: $\forall A, B \in \mathcal{H}, B \neq A: |A \cap B| \leq t$, (iii) maximally cover-free: $\forall A \in \mathcal{H}: A \not\subseteq \bigcup\limits_{\substack{B \in \mathcal{H} \\ B \neq A}} B$. For the same purpose, we define and compute the mutual information between different linear regression hypotheses that are generated from overlapping training datasets.
Submitted 28 July, 2023; v1 submitted 2 May, 2022;
originally announced May 2022.
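The three set-family properties defined in the abstract above ($k$-uniform, at most $t$-intersecting, maximally cover-free) can be checked directly on small examples. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the family is assumed to be a list of distinct Python sets.

```python
from itertools import combinations


def is_k_uniform(family, k):
    """(i) Every set in the family has exactly k elements."""
    return all(len(a) == k for a in family)


def is_at_most_t_intersecting(family, t):
    """(ii) Any two distinct sets share at most t elements."""
    return all(len(a & b) <= t for a, b in combinations(family, 2))


def is_cover_free(family):
    """(iii) No set is contained in the union of all the other sets."""
    for i, a in enumerate(family):
        others = set().union(*(b for j, b in enumerate(family) if j != i))
        if a <= others:
            return False
    return True


# Example family: three 3-element sets arranged in a cycle.
H = [{1, 2, 3}, {3, 4, 5}, {5, 6, 1}]
print(is_k_uniform(H, 3), is_at_most_t_intersecting(H, 1), is_cover_free(H))
```

In this example each set keeps one private element (2, 4, and 6 respectively), which is what makes the family cover-free; removing those private elements, as in `[{1, 2}, {2, 3}, {1, 3}]`, breaks property (iii).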
-
Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding
Authors:
Xun Long Ng,
Kian Eng Ong,
Qichen Zheng,
Yun Ni,
Si Yong Yeo,
Jun Liu
Abstract:
Understanding animals' behaviors is significant for a wide range of applications. However, existing animal behavior datasets have limitations in multiple aspects, including limited numbers of animal classes, data samples and provided tasks, and also limited variations in environmental conditions and viewpoints. To address these limitations, we create a large and diverse dataset, Animal Kingdom, that provides multiple annotated tasks to enable a more thorough understanding of natural animal behaviors. The wild animal footage used in our dataset records different times of the day in an extensive range of environments containing variations in backgrounds, viewpoints, illumination and weather conditions. More specifically, our dataset contains 50 hours of annotated videos to localize relevant animal behavior segments in long videos for the video grounding task, 30K video sequences for the fine-grained multi-label action recognition task, and 33K frames for the pose estimation task, which correspond to a diverse range of animals with 850 species across 6 major animal classes. Such a challenging and comprehensive dataset should help the community develop, adapt, and evaluate various types of advanced methods for animal behavior analysis. Moreover, we propose a Collaborative Action Recognition (CARe) model that learns general and specific features for action recognition with unseen new animals. This method achieves promising performance in our experiments. Our dataset can be found at https://sutdcv.github.io/Animal-Kingdom.
Submitted 3 June, 2022; v1 submitted 17 April, 2022;
originally announced April 2022.
-
Image Generation with Self Pixel-wise Normalization
Authors:
Yoon-Jae Yeo,
Min-Cheol Sagong,
Seung Park,
Sung-Jea Ko,
Yong-Goo Shin
Abstract:
Region-adaptive normalization (RAN) methods have been widely used in the generative adversarial network (GAN)-based image-to-image translation technique. However, since these approaches need a mask image to infer the pixel-wise affine transformation parameters, they cannot be applied to the general image generation models having no paired mask images. To resolve this problem, this paper presents a novel normalization method, called self pixel-wise normalization (SPN), which effectively boosts the generative performance by performing the pixel-adaptive affine transformation without the mask image. In our method, the transforming parameters are derived from a self-latent mask that divides the feature map into the foreground and background regions. The visualization of the self-latent masks shows that SPN effectively captures a single object to be generated as the foreground. Since the proposed method produces the self-latent mask without external data, it is easily applicable in the existing generative models. Extensive experiments on various datasets reveal that the proposed method significantly improves the performance of image generation technique in terms of Frechet inception distance (FID) and Inception score (IS).
Submitted 25 January, 2022;
originally announced January 2022.
-
Function-private Conditional Disclosure of Secrets and Multi-evaluation Threshold Distributed Point Functions
Authors:
Nolan Miranda,
Foo Yee Yeo,
Vipin Singh Sehrawat
Abstract:
Conditional disclosure of secrets (CDS) allows multiple parties to reveal a secret to a third party if and only if some pre-decided condition is satisfied. In this work, we bolster the privacy guarantees of CDS by introducing function-private CDS wherein the pre-decided condition is never revealed to the third party. We also derive a function secret sharing scheme from our function-private CDS solution. The second problem that we consider concerns threshold distributed point functions, which allow one to split a point function such that at least a threshold number of shares are required to evaluate it at any given input. We consider a setting wherein a point function is split among a set of parties such that multiple evaluations do not leak non-negligible information about it. Finally, we present a provably optimal procedure to perform threshold function secret sharing of any polynomial in a finite field.
Submitted 8 October, 2021;
originally announced October 2021.
-
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Authors:
Boseop Kim,
HyoungSeok Kim,
Sang-Woo Lee,
Gichang Lee,
Donghyun Kwak,
Dong Hyeon Jeon,
Sunghyun Park,
Sungju Kim,
Seonhoon Kim,
Dongpil Seo,
Heungsub Lee,
Minyoung Jeong,
Sungjae Lee,
Minsub Kim,
Suk Hyun Ko,
Seokhun Kim,
Taeyong Park,
Jinuk Kim,
Soyoung Kang,
Na-Hyeon Ryu,
Kang Min Yoo,
Minsuk Chang,
Soobin Suh,
Sookyo In,
Jinseong Park
, et al. (12 additional authors not shown)
Abstract:
GPT-3 shows remarkable in-context learning ability of large-scale language models (LMs) trained on hundreds of billion scale data. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performances of different sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a Korean variant of 82B GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performances on various downstream tasks in Korean. Also, we show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. Then we discuss the possibility of materializing the No Code AI paradigm by providing AI prototyping capabilities to non-experts of ML by introducing HyperCLOVA studio, an interactive prompt engineering interface. Lastly, we demonstrate the potential of our methods with three successful in-house applications.
Submitted 28 November, 2021; v1 submitted 9 September, 2021;
originally announced September 2021.
-
Content-aware Directed Propagation Network with Pixel Adaptive Kernel Attention
Authors:
Min-Cheol Sagong,
Yoon-Jae Yeo,
Seung-Won Jung,
Sung-Jea Ko
Abstract:
Convolutional neural networks (CNNs) have not only become widespread but also achieved noticeable results on numerous applications including image classification, restoration, and generation. Although the weight-sharing property of convolutions makes them widely adopted in various tasks, its content-agnostic characteristic can also be considered a major drawback. To solve this problem, in this paper, we propose a novel operation, called pixel adaptive kernel attention (PAKA). PAKA provides directivity to the filter weights by multiplying spatially varying attention from learnable features. The proposed method infers pixel-adaptive attention maps along the channel and spatial directions separately to address the decomposed model with fewer parameters. Our method is trainable in an end-to-end manner and applicable to any CNN-based models. In addition, we propose an improved information aggregation module with PAKA, called the hierarchical PAKA module (HPM). We demonstrate the superiority of our HPM by presenting state-of-the-art performance on semantic segmentation compared to the conventional information aggregation modules. We validate the proposed method through additional ablation studies and by visualizing the effect of PAKA providing directivity to the weights of convolutions. We also show the generalizability of the proposed method by applying it to multi-modal tasks, especially color-guided depth map super-resolution.
Submitted 13 September, 2022; v1 submitted 27 July, 2021;
originally announced July 2021.
-
PConv: Simple yet Effective Convolutional Layer for Generative Adversarial Network
Authors:
Seung Park,
Yoon-Jae Yeo,
Yong-Goo Shin
Abstract:
This paper presents a novel convolutional layer, called perturbed convolution (PConv), which focuses on achieving two goals simultaneously: improving the generative adversarial network (GAN) performance and alleviating the memorization problem in which the discriminator memorizes all images from a given dataset as training progresses. In PConv, perturbed features are generated by randomly disturbing an input tensor before performing the convolution operation. This approach is simple but surprisingly effective. First, to produce a similar output even with the perturbed tensor, each layer in the discriminator should learn robust features having a small local Lipschitz value. Second, since the input tensor is randomly perturbed during the training procedure like the dropout in neural networks, the memorization problem could be alleviated. To show the generalization ability of the proposed method, we conducted extensive experiments with various loss functions and datasets including CIFAR-10, CelebA, CelebA-HQ, LSUN, and tiny-ImageNet. The quantitative evaluations demonstrate that PConv effectively boosts the performance of GAN and conditional GAN in terms of Frechet inception distance (FID).
Submitted 7 November, 2021; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Extremal Set Theory and LWE Based Access Structure Hiding Verifiable Secret Sharing with Malicious-Majority and Free Verification
Authors:
Vipin Singh Sehrawat,
Foo Yee Yeo,
Yvo Desmedt
Abstract:
Secret sharing allows distributing a secret among several parties such that only authorized subsets, specified by an access structure, can reconstruct the secret. Sehrawat and Desmedt (COCOON 2020) introduced hidden access structures, that remain secret until some authorized subset of parties collaborate. However, their scheme assumes semi-honest parties and supports only restricted access structures. We address these shortcomings by constructing an access structure hiding verifiable secret sharing scheme that supports all monotone access structures. It is the first secret sharing scheme to support cheater identification and share verifiability in malicious-majority settings. The verification procedure of our scheme incurs no communication overhead. As the building blocks of our scheme, we introduce and construct: (i) a set-system with $> \exp\left(c\frac{2(\log h)^2}{(\log\log h)}\right)+2\exp\left(c\frac{(\log h)^2}{(\log\log h)}\right)$ subsets of a set of $h$ elements. Our set-system, $\mathcal{H}$, is defined over $\mathbb{Z}_m$, where $m$ is a non-prime-power. The size of each set in $\mathcal{H}$ is divisible by $m$ but the sizes of their pairwise intersections are not, unless one set is a subset of another, (ii) a new variant of the learning with errors (LWE) problem, called PRIM-LWE, wherein the secret matrix is sampled such that its determinant is a generator of $\mathbb{Z}_q^*$, where $q$ is the LWE modulus. The security of our scheme relies on the hardness of the LWE problem, and its share size is $$(1+ o(1)) \dfrac{2^{\ell}}{\sqrt{\pi\ell/2}}(2 q^{\varrho + 0.5} + \sqrt{q} + \mathrm{\Theta}(h)),$$ where $\varrho \leq 1$ is a constant and $\ell$ is the total number of parties. We also provide directions for future work to reduce the share size to
\[\leq \dfrac{1}{3} \left( (1+ o(1)) \dfrac{2^{\ell}}{\sqrt{\pi\ell/2}}(2 q^{\varrho + 0.5} + 2\sqrt{q}) \right).\]
Submitted 13 September, 2021; v1 submitted 30 November, 2020;
originally announced November 2020.
-
Learning by Semantic Similarity Makes Abstractive Summarization Better
Authors:
Wonjin Yoon,
Yoon Sun Yeo,
Minbyul Jeong,
Bong-Jun Yi,
Jaewoo Kang
Abstract:
By harnessing pre-trained language models, summarization models have made rapid progress recently. However, the models are mainly assessed by automatic evaluation metrics such as ROUGE. Although ROUGE is known for having a positive correlation with human evaluation scores, it has been criticized for its vulnerability and the gap between actual qualities. In this paper, we compare the generated summaries from a recent LM, BART, and the reference summaries from a benchmark dataset, CNN/DM, using a crowd-sourced human evaluation metric. Interestingly, model-generated summaries receive higher scores relative to reference summaries. Stemming from our experimental results, we first argue the intrinsic characteristics of the CNN/DM dataset, the progress of pre-trained language models, and their ability to generalize on the training data. Finally, we share our insights into the model-generated summaries and present our thoughts on learning methods for abstractive summarization.
Submitted 2 June, 2021; v1 submitted 18 February, 2020;
originally announced February 2020.
-
Simple yet Effective Way for Improving the Performance of GAN
Authors:
Yong-Goo Shin,
Yoon-Jae Yeo,
Sung-Jea Ko
Abstract:
In adversarial learning, the discriminator often fails to guide the generator successfully since it distinguishes between real and generated images using silly or non-robust features. To alleviate this problem, this brief presents a simple but effective way to improve the performance of generative adversarial networks (GANs) without imposing training overhead or modifying the network architectures of existing methods. The proposed method employs a novel cascading rejection (CR) module for the discriminator, which extracts multiple non-overlapped features in an iterative manner using the vector rejection operation. Since the extracted diverse features prevent the discriminator from concentrating on non-meaningful features, the discriminator can guide the generator effectively to produce images that are more similar to the real images. In addition, since the proposed CR module requires only a few simple vector operations, it can be readily applied to existing frameworks with marginal training overhead. Quantitative evaluations on various datasets including CIFAR-10, CelebA, CelebA-HQ, LSUN, and tiny-ImageNet confirm that the proposed method significantly improves the performance of GAN and conditional GAN in terms of Fréchet inception distance (FID), which indicates the diversity and visual appearance of the generated images.
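As a reading aid, the vector rejection operation named in this abstract can be sketched in a few lines. This shows only the standard linear-algebra definition (the component of a vector orthogonal to another vector); how the CR module applies it inside the discriminator is not described in the abstract, and the helper names `dot` and `reject` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def reject(a, b):
    """Vector rejection of a from b: a minus its projection onto b.

    The result is orthogonal to b. Assumes b is not the zero vector.
    """
    scale = dot(a, b) / dot(b, b)
    return [x - scale * y for x, y in zip(a, b)]


a = [3.0, 4.0]
b = [1.0, 0.0]
r = reject(a, b)   # the part of a that b does not explain
print(r, dot(r, b))
```

Because the rejected vector is orthogonal to `b`, repeating the operation against successive directions yields components that do not overlap, which is the general property the abstract appeals to.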
Submitted 19 January, 2021; v1 submitted 19 November, 2019;
originally announced November 2019.
-
Towards Debiasing Fact Verification Models
Authors:
Tal Schuster,
Darsh J Shah,
Yun Jie Serene Yeo,
Daniel Filizzola,
Enrico Santus,
Regina Barzilay
Abstract:
Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this might not necessarily be the case. Claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models significantly drops when evaluated on this test set. Therefore, we introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models.
Submitted 30 August, 2019; v1 submitted 14 August, 2019;
originally announced August 2019.
-
cGANs with Conditional Convolution Layer
Authors:
Min-Cheol Sagong,
Yong-Goo Shin,
Yoon-Jae Yeo,
Seung Park,
Sung-Jea Ko
Abstract:
Conditional generative adversarial networks (cGANs) have been widely researched to generate class conditional images using a single generator. However, in conventional cGAN techniques, it is still challenging for the generator to learn condition-specific features, since a standard convolutional layer with the same weights is used regardless of the condition. In this paper, we propose a novel convolution layer, called the conditional convolution layer, which directly generates different feature maps by employing weights that are adjusted depending on the conditions. More specifically, in each conditional convolution layer, the weights are conditioned in a simple but effective way through filter-wise scaling and channel-wise shifting operations. In contrast to the conventional methods, the proposed method with a single generator can effectively handle condition-specific characteristics. The experimental results on CIFAR, LSUN and ImageNet datasets show that the generator with the proposed conditional convolution layer achieves a higher quality of conditional image generation than that with the standard convolution layer.
Submitted 8 April, 2020; v1 submitted 3 June, 2019;
originally announced June 2019.
-
PEPSI++: Fast and Lightweight Network for Image Inpainting
Authors:
Yong-Goo Shin,
Min-Cheol Sagong,
Yoon-Jae Yeo,
Seung-Wook Kim,
Sung-Jea Ko
Abstract:
Among the various generative adversarial network (GAN)-based image inpainting methods, a coarse-to-fine network with a contextual attention module (CAM) has shown remarkable performance. However, owing to two stacked generative networks, the coarse-to-fine network needs numerous computational resources such as convolution operations and network parameters, which result in low speed. To address this problem, we propose a novel network architecture called PEPSI: parallel extended-decoder path for semantic inpainting network, which aims at reducing the hardware costs and improving the inpainting performance. PEPSI consists of a single shared encoding network and parallel decoding networks called coarse and inpainting paths. The coarse path produces a preliminary inpainting result to train the encoding network for the prediction of features for the CAM. Simultaneously, the inpainting path generates higher inpainting quality using the refined features reconstructed via the CAM. In addition, we propose Diet-PEPSI that significantly reduces the network parameters while maintaining the performance. In Diet-PEPSI, to capture the global contextual information with low hardware costs, we propose novel rate-adaptive dilated convolutional layers, which employ the common weights but produce dynamic features depending on the given dilation rates. Extensive experiments comparing the performance with state-of-the-art image inpainting methods demonstrate that both PEPSI and Diet-PEPSI improve the qualitative scores, i.e. the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), as well as significantly reduce hardware costs such as computational time and the number of network parameters.
Submitted 6 March, 2020; v1 submitted 22 May, 2019;
originally announced May 2019.
-
Unsupervised Deep Contrast Enhancement with Power Constraint for OLED Displays
Authors:
Yong-Goo Shin,
Seung Park,
Yoon-Jae Yeo,
Min-Jae Yoo,
Sung-Jea Ko
Abstract:
Various power-constrained contrast enhancement (PCCE) techniques have been applied to organic light emitting diode (OLED) displays to reduce the power demands of the display while preserving the image quality. In this paper, we propose a new deep learning-based PCCE scheme that constrains the power consumption of OLED displays while enhancing the contrast of the displayed image. In the proposed method, the power consumption is constrained by simply reducing the brightness by a certain ratio, whereas the perceived visual quality is preserved as much as possible by enhancing the contrast of the image using a convolutional neural network (CNN). Furthermore, our CNN can learn the PCCE technique without a reference image by unsupervised learning. Experimental results show that the proposed method is superior to conventional ones in terms of image quality assessment metrics such as the visual saliency-induced index (VSI) and the measure of enhancement (EME).
Submitted 9 December, 2019; v1 submitted 14 May, 2019;
originally announced May 2019.
-
Real-Time Dense Mapping for Self-driving Vehicles using Fisheye Cameras
Authors:
Zhaopeng Cui,
Lionel Heng,
Ye Chuan Yeo,
Andreas Geiger,
Marc Pollefeys,
Torsten Sattler
Abstract:
We present a real-time dense geometric mapping algorithm for large-scale environments. Unlike existing methods which use pinhole cameras, our implementation is based on fisheye cameras, which have a larger field of view and benefit other tasks including Visual-Inertial Odometry, localization, and object detection around vehicles. Our algorithm runs on in-vehicle PCs at approximately 15 Hz, enabling vision-only 3D scene perception for self-driving vehicles. For each synchronized set of images captured by multiple cameras, we first compute a depth map for a reference camera using plane-sweeping stereo. To maintain both accuracy and efficiency, while accounting for the fact that fisheye images have a rather low resolution, we recover the depths using multiple image resolutions. We adopt the fast object detection framework YOLOv3 to remove potentially dynamic objects. At the end of the pipeline, we fuse the fisheye depth images into the truncated signed distance function (TSDF) volume to obtain a 3D map. We evaluate our method on large-scale urban datasets, and results show that our method works well even in complex environments.
Submitted 18 April, 2019; v1 submitted 17 September, 2018;
originally announced September 2018.
-
Project AutoVision: Localization and 3D Scene Perception for an Autonomous Vehicle with a Multi-Camera System
Authors:
Lionel Heng,
Benjamin Choi,
Zhaopeng Cui,
Marcel Geppert,
Sixing Hu,
Benson Kuan,
Peidong Liu,
Rang Nguyen,
Ye Chuan Yeo,
Andreas Geiger,
Gim Hee Lee,
Marc Pollefeys,
Torsten Sattler
Abstract:
Project AutoVision aims to develop localization and 3D scene perception capabilities for a self-driving vehicle. Such capabilities will enable autonomous navigation in urban and rural environments, in day and night, and with cameras as the only exteroceptive sensors. The sensor suite employs many cameras for both 360-degree coverage and accurate multi-view stereo; the use of low-cost cameras keeps the cost of this sensor suite to a minimum. In addition, the project seeks to extend the operating envelope to include GNSS-less conditions which are typical for environments with tall buildings, foliage, and tunnels. Emphasis is placed on leveraging multi-view geometry and deep learning to enable the vehicle to localize and perceive in 3D space. This paper presents an overview of the project, and describes the sensor suite and current progress in the areas of calibration, localization, and perception.
Submitted 4 March, 2019; v1 submitted 14 September, 2018;
originally announced September 2018.
-
A Novel Multi-task Deep Learning Model for Skin Lesion Segmentation and Classification
Authors:
Xulei Yang,
Zeng Zeng,
Si Yong Yeo,
Colin Tan,
Hong Liang Tey,
Yi Su
Abstract:
In this study, a multi-task deep neural network is proposed for skin lesion analysis. The proposed multi-task learning model solves different tasks (e.g., lesion segmentation and two independent binary lesion classifications) at the same time by exploiting commonalities and differences across tasks. This results in improved learning efficiency and potential prediction accuracy for the task-specific models, when compared to training the individual models separately. The proposed multi-task deep learning model is trained and evaluated on the dermoscopic image sets from the International Skin Imaging Collaboration (ISIC) 2017 Challenge - Skin Lesion Analysis towards Melanoma Detection, which consist of 2000 training samples and 150 evaluation samples. The experimental results show that the proposed multi-task deep learning model achieves promising performance on skin lesion segmentation and classification. The average value of the Jaccard index for lesion segmentation is 0.724, while the average values of the area under the receiver operating characteristic curve (AUC) on the two individual lesion classifications are 0.880 and 0.972, respectively.
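For readers unfamiliar with the segmentation metric reported in this abstract, the Jaccard index of two binary masks is the size of their intersection divided by the size of their union. The sketch below uses flattened 0/1 masks and a hypothetical helper name `jaccard_index`; returning 1.0 when both masks are empty is an assumption made here, not something the abstract specifies, and the challenge's exact averaging procedure is not reproduced.

```python
def jaccard_index(pred, target):
    """Jaccard index (intersection over union) of two flattened binary masks.

    Assumption: two empty masks are treated as a perfect match (1.0).
    """
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return intersection / union if union else 1.0


pred = [1, 1, 0, 0, 1]     # predicted lesion pixels
target = [1, 0, 0, 1, 1]   # ground-truth lesion pixels
print(jaccard_index(pred, target))  # 2 shared pixels out of 4 in the union
```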
Submitted 2 March, 2017;
originally announced March 2017.
-
ODYS: A Massively-Parallel Search Engine Using a DB-IR Tightly-Integrated Parallel DBMS
Authors:
Kyu-Young Whang,
Tae-Seob Yun,
Yeon-Mi Yeo,
Il-Yeol Song,
Hyuk-Yoon Kwon,
In-Joong Kim
Abstract:
Recently, parallel search engines have been implemented based on scalable distributed file systems such as Google File System. However, we claim that building a massively-parallel search engine using a parallel DBMS can be an attractive alternative since it supports a higher-level (i.e., SQL-level) interface than that of a distributed file system for easy and less error-prone application development while providing scalability. In this paper, we propose a new approach of building a massively-parallel search engine using a DB-IR tightly-integrated parallel DBMS and demonstrate its commercial-level scalability and performance. In addition, we present a hybrid (i.e., analytic and experimental) performance model for the parallel search engine. We have built a five-node parallel search engine according to the proposed architecture using a DB-IR tightly-integrated DBMS. Through extensive experiments, we show the correctness of the model by comparing the projected output with the experimental results of the five-node engine. Our model demonstrates that ODYS is capable of handling 1 billion queries per day (81 queries/sec) for 30 billion web pages by using only 43,472 nodes with an average query response time of 211 ms, which is equivalent to or better than those of commercial search engines. We also show that, by using twice as many (86,944) nodes, ODYS can provide an average query response time of 162 ms, which is significantly lower than those of commercial search engines.
Submitted 21 August, 2012;
originally announced August 2012.
-
Swarm-NG: a CUDA Library for Parallel n-body Integrations with focus on Simulations of Planetary Systems
Authors:
Saleh Dindar,
Eric B. Ford,
Mario Juric,
Young In Yeo,
Jianwei Gao,
Aaron C. Boley,
Benjamin Nelson,
Jorg Peters
Abstract:
We present Swarm-NG, a C++ library for the efficient direct integration of many n-body systems using highly parallel Graphics Processing Units (GPUs), such as NVIDIA's Tesla T10 and M2070 GPUs. While previous studies have demonstrated the benefit of GPUs for n-body simulations with thousands to millions of bodies, Swarm-NG focuses on many few-body systems, e.g., thousands of systems with 3...15 bodies each, as is typical for the study of planetary systems. Swarm-NG parallelizes the simulation, including both the numerical integration of the equations of motion and the evaluation of forces, using NVIDIA's "Compute Unified Device Architecture" (CUDA) on the GPU. Swarm-NG includes optimized implementations of 4th order time-symmetrized Hermite integration and mixed variable symplectic integration, as well as several sample codes for other algorithms to illustrate how non-CUDA-savvy users may themselves introduce customized integrators into the Swarm-NG framework. To optimize performance, we analyze the effect of GPU-specific parameters on performance under double precision.
Applications of Swarm-NG include studying the late stages of planet formation, testing the stability of planetary systems, and evaluating the goodness-of-fit between many planetary system models and observations of extrasolar planet host stars (e.g., radial velocity, astrometry, transit timing). While Swarm-NG focuses on the parallel integration of many planetary systems, the underlying integrators could be applied to a wide variety of problems that require repeatedly integrating a set of ordinary differential equations many times using different initial conditions and/or parameter values.
Submitted 24 September, 2012; v1 submitted 6 August, 2012;
originally announced August 2012.