Search | arXiv e-print repository

TabularFM: An Open Framework For Tabular Foundational Models

Authors: Quan M. Tran, Suong N. Hoang, Lam M. Nguyen, Dzung Phan, Hoang Thanh Lam

Abstract: Foundational models (FMs), pretrained on extensive datasets using self-supervised techniques, are capable of learning generalized patterns from large amounts of data. This reduces the need for extensive labeled datasets for each new task, saving both time and resources by leveraging the broad knowledge base established during pretraining. Most research on FMs has primarily focused on unstructured… ▽ More Foundational models (FMs), pretrained on extensive datasets using self-supervised techniques, are capable of learning generalized patterns from large amounts of data. This reduces the need for extensive labeled datasets for each new task, saving both time and resources by leveraging the broad knowledge base established during pretraining. Most research on FMs has primarily focused on unstructured data, such as text and images, or semi-structured data, like time-series. However, there has been limited attention to structured data, such as tabular data, which, despite its prevalence, remains under-studied due to a lack of clean datasets and insufficient research on the transferability of FMs for various tabular data tasks. In response to this gap, we introduce a framework called TabularFM, which incorporates state-of-the-art methods for developing FMs specifically for tabular data. This includes variations of neural architectures such as GANs, VAEs, and Transformers. We have curated a million of tabular datasets and released cleaned versions to facilitate the development of tabular FMs. We pretrained FMs on this curated data, benchmarked various learning methods on these datasets, and released the pretrained models along with leaderboards for future comparative studies. Our fully open-sourced system provides a comprehensive analysis of the transferability of tabular FMs. By releasing these datasets, pretrained models, and leaderboards, we aim to enhance the validity and usability of tabular FMs in the near future. △ Less

Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.02245 [pdf, other]

Description Boosting for Zero-Shot Entity and Relation Classification

Authors: Gabriele Picco, Leopold Fuchs, Marcos Martínez Galindo, Alberto Purpura, Vanessa López, Hoang Thanh Lam

Abstract: Zero-shot entity and relation classification models leverage available external information of unseen classes -- e.g., textual descriptions -- to annotate input text data. Thanks to the minimum data requirement, Zero-Shot Learning (ZSL) methods have high value in practice, especially in applications where labeled data is scarce. Even though recent research in ZSL has demonstrated significant resul… ▽ More Zero-shot entity and relation classification models leverage available external information of unseen classes -- e.g., textual descriptions -- to annotate input text data. Thanks to the minimum data requirement, Zero-Shot Learning (ZSL) methods have high value in practice, especially in applications where labeled data is scarce. Even though recent research in ZSL has demonstrated significant results, our analysis reveals that those methods are sensitive to provided textual descriptions of entities (or relations). Even a minor modification of descriptions can lead to a change in the decision boundary between entity (or relation) classes. In this paper, we formally define the problem of identifying effective descriptions for zero shot inference. We propose a strategy for generating variations of an initial description, a heuristic for ranking them and an ensemble method capable of boosting the predictions of zero-shot models through description enhancement. Empirical results on four different entity and relation classification datasets show that our proposed method outperform existing approaches and achieve new SOTA results on these datasets under the ZSL settings. The source code of the proposed solutions and the evaluation framework are open-sourced. △ Less

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2307.13497 [pdf, other]

Zshot: An Open-source Framework for Zero-Shot Named Entity Recognition and Relation Extraction

Authors: Gabriele Picco, Marcos Martínez Galindo, Alberto Purpura, Leopold Fuchs, Vanessa López, Hoang Thanh Lam

Abstract: The Zero-Shot Learning (ZSL) task pertains to the identification of entities or relations in texts that were not seen during training. ZSL has emerged as a critical research area due to the scarcity of labeled data in specific domains, and its applications have grown significantly in recent years. With the advent of large pretrained language models, several novel methods have been proposed, result… ▽ More The Zero-Shot Learning (ZSL) task pertains to the identification of entities or relations in texts that were not seen during training. ZSL has emerged as a critical research area due to the scarcity of labeled data in specific domains, and its applications have grown significantly in recent years. With the advent of large pretrained language models, several novel methods have been proposed, resulting in substantial improvements in ZSL performance. There is a growing demand, both in the research community and industry, for a comprehensive ZSL framework that facilitates the development and accessibility of the latest methods and pretrained models.In this study, we propose a novel ZSL framework called Zshot that aims to address the aforementioned challenges. Our primary objective is to provide a platform that allows researchers to compare different state-of-the-art ZSL methods with standard benchmark datasets. Additionally, we have designed our framework to support the industry with readily available APIs for production under the standard SpaCy NLP pipeline. Our API is extendible and evaluable, moreover, we include numerous enhancements such as boosting the accuracy with pipeline ensembling and visualization utilities available as a SpaCy extension. △ Less

Submitted 25 July, 2023; originally announced July 2023.

Comments: Accepted at ACL 2023

Journal ref: Association for Computational Linguistics. 3 (2023) 357-368

arXiv:2306.12802 [pdf, other]

Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery

Authors: Hoang Thanh Lam, Marco Luca Sbodio, Marcos Martínez Galindo, Mykhaylo Zayats, Raúl Fernández-Díaz, Víctor Valls, Gabriele Picco, Cesar Berrospi Ramis, Vanessa López

Abstract: Recent research on predicting the binding affinity between drug molecules and proteins use representations learned, through unsupervised learning techniques, from large databases of molecule SMILES and protein sequences. While these representations have significantly enhanced the predictions, they are usually based on a limited set of modalities, and they do not exploit available knowledge about e… ▽ More Recent research on predicting the binding affinity between drug molecules and proteins use representations learned, through unsupervised learning techniques, from large databases of molecule SMILES and protein sequences. While these representations have significantly enhanced the predictions, they are usually based on a limited set of modalities, and they do not exploit available knowledge about existing relations among molecules and proteins. In this study, we demonstrate that by incorporating knowledge graphs from diverse sources and modalities into the sequences or SMILES representation, we can further enrich the representation and achieve state-of-the-art results for drug-target binding affinity prediction in the established Therapeutic Data Commons (TDC) benchmarks. We release a set of multimodal knowledge graphs, integrating data from seven public data sources, and containing over 30 million triples. Our intention is to foster additional research to explore how multimodal knowledge enhanced protein/molecule embeddings can improve prediction tasks, including prediction of binding affinity. We also release some pretrained models learned from our multimodal knowledge graphs, along with source code for running standard benchmark tasks for prediction of biding affinity. △ Less

Submitted 19 October, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

arXiv:2302.03525 [pdf, other]

Multi-Task Deep Recommender Systems: A Survey

Authors: Yuhao Wang, Ha Tsz Lam, Yi Wong, Ziru Liu, Xiangyu Zhao, Yichao Wang, Bo Chen, Huifeng Guo, Ruiming Tang

Abstract: Multi-task learning (MTL) aims at learning related tasks in a unified model to achieve mutual improvement among tasks considering their shared knowledge. It is an important topic in recommendation due to the demand for multi-task prediction considering performance and efficiency. Although MTL has been well studied and developed, there is still a lack of systematic review in the recommendation comm… ▽ More Multi-task learning (MTL) aims at learning related tasks in a unified model to achieve mutual improvement among tasks considering their shared knowledge. It is an important topic in recommendation due to the demand for multi-task prediction considering performance and efficiency. Although MTL has been well studied and developed, there is still a lack of systematic review in the recommendation community. To fill the gap, we provide a comprehensive review of existing multi-task deep recommender systems (MTDRS) in this survey. To be specific, the problem definition of MTDRS is first given, and it is compared with other related areas. Next, the development of MTDRS is depicted and the taxonomy is introduced from the task relation and methodology aspects. Specifically, the task relation is categorized into parallel, cascaded, and auxiliary with main, while the methodology is grouped into parameter sharing, optimization, and training mechanism. The survey concludes by summarizing the application and public datasets of MTDRS and highlighting the challenges and future directions of the field. △ Less

Submitted 8 February, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

arXiv:2202.03558 [pdf, other]

Attacking c-MARL More Effectively: A Data Driven Approach

Authors: Nhan H. Pham, Lam M. Nguyen, Jie Chen, Hoang Thanh Lam, Subhro Das, Tsui-Wei Weng

Abstract: In recent years, a proliferation of methods were developed for cooperative multi-agent reinforcement learning (c-MARL). However, the robustness of c-MARL agents against adversarial attacks has been rarely explored. In this paper, we propose to evaluate the robustness of c-MARL agents via a model-based approach, named c-MBA. Our proposed formulation can craft much stronger adversarial state perturb… ▽ More In recent years, a proliferation of methods were developed for cooperative multi-agent reinforcement learning (c-MARL). However, the robustness of c-MARL agents against adversarial attacks has been rarely explored. In this paper, we propose to evaluate the robustness of c-MARL agents via a model-based approach, named c-MBA. Our proposed formulation can craft much stronger adversarial state perturbations of c-MARL agents to lower total team rewards than existing model-free approaches. In addition, we propose the first victim-agent selection strategy and the first data-driven approach to define targeted failure states where each of them allows us to develop even stronger adversarial attack without the expert knowledge to the underlying environment. Our numerical experiments on two representative MARL benchmarks illustrate the advantage of our approach over other baselines: our model-based attack consistently outperforms other baselines in all tested environments. △ Less

Submitted 10 September, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

arXiv:2110.09131 [pdf, other]

Ensembling Graph Predictions for AMR Parsing

Authors: Hoang Thanh Lam, Gabriele Picco, Yufang Hou, Young-Suk Lee, Lam M. Nguyen, Dzung T. Phan, Vanessa López, Ramon Fernandez Astudillo

Abstract: In many machine learning tasks, models are trained to predict structure data such as graphs. For example, in natural language processing, it is very common to parse texts into dependency trees or abstract meaning representation (AMR) graphs. On the other hand, ensemble methods combine predictions from multiple models to create a new one that is more robust and accurate than individual predictions.… ▽ More In many machine learning tasks, models are trained to predict structure data such as graphs. For example, in natural language processing, it is very common to parse texts into dependency trees or abstract meaning representation (AMR) graphs. On the other hand, ensemble methods combine predictions from multiple models to create a new one that is more robust and accurate than individual predictions. In the literature, there are many ensembling techniques proposed for classification or regression problems, however, ensemble graph prediction has not been studied thoroughly. In this work, we formalize this problem as mining the largest graph that is the most supported by a collection of graph predictions. As the problem is NP-Hard, we propose an efficient heuristic algorithm to approximate the optimal solution. To validate our approach, we carried out experiments in AMR parsing problems. The experimental results demonstrate that the proposed approach can combine the strength of state-of-the-art AMR parsers to create new predictions that are more accurate than any individual models in five standard benchmark datasets. △ Less

Submitted 24 January, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

Comments: Published at NeurIPS 2021

arXiv:2109.08460 [pdf, other]

Neural Unification for Logic Reasoning over Natural Language

Authors: Gabriele Picco, Hoang Thanh Lam, Marco Luca Sbodio, Vanessa Lopez Garcia

Abstract: Automated Theorem Proving (ATP) deals with the development of computer programs being able to show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules). There exists several successful ATPs where conjectures and axioms are formally provided (e.g. formalised as First Order Logic formulas). Recent approaches, such as (Clark et al., 2020), have proposed trans… ▽ More Automated Theorem Proving (ATP) deals with the development of computer programs being able to show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules). There exists several successful ATPs where conjectures and axioms are formally provided (e.g. formalised as First Order Logic formulas). Recent approaches, such as (Clark et al., 2020), have proposed transformer-based architectures for deriving conjectures given axioms expressed in natural language (English). The conjecture is verified through a binary text classifier, where the transformers model is trained to predict the truth value of a conjecture given the axioms. The RuleTaker approach of (Clark et al., 2020) achieves appealing results both in terms of accuracy and in the ability to generalize, showing that when the model is trained with deep enough queries (at least 3 inference steps), the transformers are able to correctly answer the majority of queries (97.6%) that require up to 5 inference steps. In this work we propose a new architecture, namely the Neural Unifier, and a relative training procedure, which achieves state-of-the-art results in term of generalisation, showing that mimicking a well-known inference procedure, the backward chaining, it is possible to answer deep queries even when the model is trained only on shallow ones. The approach is demonstrated in experiments using a diverse set of benchmark data. △ Less

Submitted 17 September, 2021; originally announced September 2021.

Comments: Accepted at EMNLP2021 Findings

arXiv:1806.05886 [pdf, other]

Automated Image Data Preprocessing with Deep Reinforcement Learning

Authors: Tran Ngoc Minh, Mathieu Sinn, Hoang Thanh Lam, Martin Wistuba

Abstract: Data preparation, i.e. the process of transforming raw data into a format that can be used for training effective machine learning models, is a tedious and time-consuming task. For image data, preprocessing typically involves a sequence of basic transformations such as cropping, filtering, rotating or flipping images. Currently, data scientists decide manually based on their experience which trans… ▽ More Data preparation, i.e. the process of transforming raw data into a format that can be used for training effective machine learning models, is a tedious and time-consuming task. For image data, preprocessing typically involves a sequence of basic transformations such as cropping, filtering, rotating or flipping images. Currently, data scientists decide manually based on their experience which transformations to apply in which particular order to a given image data set. Besides constituting a bottleneck in real-world data science projects, manual image data preprocessing may yield suboptimal results as data scientists need to rely on intuition or trial-and-error approaches when exploring the space of possible image transformations and thus might not be able to discover the most effective ones. To mitigate the inefficiency and potential ineffectiveness of manual data preprocessing, this paper proposes a deep reinforcement learning framework to automatically discover the optimal data preprocessing steps for training an image classifier. The framework takes as input sets of labeled images and predefined preprocessing transformations. It jointly learns the classifier and the optimal preprocessing transformations for individual images. Experimental results show that the proposed approach not only improves the accuracy of image classifiers, but also makes them substantially more robust to noisy inputs at test time. △ Less

Submitted 29 April, 2021; v1 submitted 15 June, 2018; originally announced June 2018.

arXiv:1802.03628 [pdf, other]

Learning Correlation Space for Time Series

Authors: Han Qiu, Hoang Thanh Lam, Francesco Fusco, Mathieu Sinn

Abstract: We propose an approximation algorithm for efficient correlation search in time series data. In our method, we use Fourier transform and neural network to embed time series into a low-dimensional Euclidean space. The given space is learned such that time series correlation can be effectively approximated from Euclidean distance between corresponding embedded vectors. Therefore, search for correlate… ▽ More We propose an approximation algorithm for efficient correlation search in time series data. In our method, we use Fourier transform and neural network to embed time series into a low-dimensional Euclidean space. The given space is learned such that time series correlation can be effectively approximated from Euclidean distance between corresponding embedded vectors. Therefore, search for correlated time series can be done using an index in the embedding space for efficient nearest neighbor search. Our theoretical analysis illustrates that our method's accuracy can be guaranteed under certain regularity conditions. We further conduct experiments on real-world datasets and the results show that our method indeed outperforms the baseline solution. In particular, for approximation of correlation, our method reduces the approximation loss by a half in most test cases compared to the baseline solution. For top-$k$ highest correlation search, our method improves the precision from 5\% to 20\% while the query time is similar to the baseline approach query time. △ Less

Submitted 15 May, 2018; v1 submitted 10 February, 2018; originally announced February 2018.

arXiv:1801.05372 [pdf, other]

Neural Feature Learning From Relational Database

Authors: Hoang Thanh Lam, Tran Ngoc Minh, Mathieu Sinn, Beat Buesser, Martin Wistuba

Abstract: Feature engineering is one of the most important but most tedious tasks in data science. This work studies automation of feature learning from relational database. We first prove theoretically that finding the optimal features from relational data for predictive tasks is NP-hard. We propose an efficient rule-based approach based on heuristics and a deep neural network to automatically learn approp… ▽ More Feature engineering is one of the most important but most tedious tasks in data science. This work studies automation of feature learning from relational database. We first prove theoretically that finding the optimal features from relational data for predictive tasks is NP-hard. We propose an efficient rule-based approach based on heuristics and a deep neural network to automatically learn appropriate features from relational data. We benchmark our approaches in ensembles in past Kaggle competitions. Our new approach wins late medals and beats the state-of-the-art solutions with significant margins. To the best of our knowledge, this is the first time an automated data science system could win medals in Kaggle competitions with complex relational database. △ Less

Submitted 15 June, 2019; v1 submitted 16 January, 2018; originally announced January 2018.

arXiv:1706.00327 [pdf, other]

One button machine for automating feature engineering in relational databases

Authors: Hoang Thanh Lam, Johann-Michael Thiebaut, Mathieu Sinn, Bei Chen, Tiep Mai, Oznur Alkan

Abstract: Feature engineering is one of the most important and time consuming tasks in predictive analytics projects. It involves understanding domain knowledge and data exploration to discover relevant hand-crafted features from raw data. In this paper, we introduce a system called One Button Machine, or OneBM for short, which automates feature discovery in relational databases. OneBM automatically perform… ▽ More Feature engineering is one of the most important and time consuming tasks in predictive analytics projects. It involves understanding domain knowledge and data exploration to discover relevant hand-crafted features from raw data. In this paper, we introduce a system called One Button Machine, or OneBM for short, which automates feature discovery in relational databases. OneBM automatically performs a key activity of data scientists, namely, joining of database tables and applying advanced data transformations to extract useful features from data. We validated OneBM in Kaggle competitions in which OneBM achieved performance as good as top 16% to 24% data scientists in three Kaggle competitions. More importantly, OneBM outperformed the state-of-the-art system in a Kaggle competition in terms of prediction accuracy and ranking on Kaggle leaderboard. The results show that OneBM can be useful for both data scientists and non-experts. It helps data scientists reduce data exploration time allowing them to try and error many ideas in short time. On the other hand, it enables non-experts, who are not familiar with data science, to quickly extract value from their data with a little effort, time and cost. △ Less

Submitted 1 June, 2017; originally announced June 2017.

arXiv:1509.05257 [pdf, other]

(Blue) Taxi Destination and Trip Time Prediction from Partial Trajectories

Authors: Hoang Thanh Lam, Ernesto Diaz-Aviles, Alessandra Pascale, Yiannis Gkoufas, Bei Chen

Abstract: Real-time estimation of destination and travel time for taxis is of great importance for existing electronic dispatch systems. We present an approach based on trip matching and ensemble learning, in which we leverage the patterns observed in a dataset of roughly 1.7 million taxi journeys to predict the corresponding final destination and travel time for ongoing taxi trips, as a solution for the EC… ▽ More Real-time estimation of destination and travel time for taxis is of great importance for existing electronic dispatch systems. We present an approach based on trip matching and ensemble learning, in which we leverage the patterns observed in a dataset of roughly 1.7 million taxi journeys to predict the corresponding final destination and travel time for ongoing taxi trips, as a solution for the ECML/PKDD Discovery Challenge 2015 competition. The results of our empirical evaluation show that our approach is effective and very robust, which led our team -- BlueTaxi -- to the 3rd and 7th position of the final rankings for the trip time and destination prediction tasks, respectively. Given the fact that the final rankings were computed using a very small test set (with only 320 trips) we believe that our approach is one of the most robust solutions for the challenge based on the consistency of our good results across the test sets. △ Less

Submitted 17 September, 2015; originally announced September 2015.

Comments: ECML/PKDD Discovery Challenge 2015

ACM Class: I.2.6; I.5.2

arXiv:1412.7990 [pdf, other]

doi 10.1145/2668067.2668072

Predicting User Engagement in Twitter with Collaborative Ranking

Authors: Ernesto Diaz-Aviles, Hoang Thanh Lam, Fabio Pinelli, Stefano Braghin, Yiannis Gkoufas, Michele Berlingerio, Francesco Calabrese

Abstract: Collaborative Filtering (CF) is a core component of popular web-based services such as Amazon, YouTube, Netflix, and Twitter. Most applications use CF to recommend a small set of items to the user. For instance, YouTube presents to a user a list of top-n videos she would likely watch next based on her rating and viewing history. Current methods of CF evaluation have been focused on assessing the q… ▽ More Collaborative Filtering (CF) is a core component of popular web-based services such as Amazon, YouTube, Netflix, and Twitter. Most applications use CF to recommend a small set of items to the user. For instance, YouTube presents to a user a list of top-n videos she would likely watch next based on her rating and viewing history. Current methods of CF evaluation have been focused on assessing the quality of a predicted rating or the ranking performance for top-n recommended items. However, restricting the recommender system evaluation to these two aspects is rather limiting and neglects other dimensions that could better characterize a well-perceived recommendation. In this paper, instead of optimizing rating or top-n recommendation, we focus on the task of predicting which items generate the highest user engagement. In particular, we use Twitter as our testbed and cast the problem as a Collaborative Ranking task where the rich features extracted from the metadata of the tweets help to complement the transaction information limited to user ids, item ids, ratings and timestamps. We learn a scoring function that directly optimizes the user engagement in terms of nDCG@10 on the predicted ranking. Experiments conducted on an extended version of the MovieTweetings dataset, released as part of the RecSys Challenge 2014, show the effectiveness of our approach. △ Less

Submitted 26 December, 2014; originally announced December 2014.

Comments: RecSysChallenge'14 at RecSys 2014, October 10, 2014, Foster City, CA, USA

ACM Class: H.3.3; I.2.6

Journal ref: In Proceedings of the 2014 Recommender Systems Challenge (RecSysChallenge'14). ACM, New York, NY, USA, , Pages 41 , 6 pages

arXiv:1412.4218 [pdf]

Optimization of Reliability of Network of Given Connectivity using Genetic Algorithm

Authors: Ho Tat Lam, Kwok Yip Szeto

Abstract: Reliability is one of the important measures of how well the system meets its design objective, and mathematically is the probability that a system will perform satisfactorily for at least a given period of time. When the system is described by a connected network of N components (nodes) and their L connection (links), the reliability of the system becomes a difficult network design problem which… ▽ More Reliability is one of the important measures of how well the system meets its design objective, and mathematically is the probability that a system will perform satisfactorily for at least a given period of time. When the system is described by a connected network of N components (nodes) and their L connection (links), the reliability of the system becomes a difficult network design problem which solutions are of great practical interest in science and engineering. This paper discusses the numerical method of finding the most reliable network for a given N and L using genetic algorithm. For a given topology of the network, the reliability is numerically computed using adjacency matrix. For a search in the space of all possible topologies of the connected network with N nodes and L links, genetic operators such as mutation and crossover are applied to the adjacency matrix through a string representation. In the context of graphs, the mutation of strings in genetic algorithm corresponds to the rewiring of graphs, while crossover corresponds to the interchange of the sub-graphs. For small networks where the most reliable network can be found by exhaustive search, genetic algorithm is very efficient. For larger networks, our results not only demonstrate the efficiency of our algorithm, but also suggest that the most reliable network will have high symmetry. △ Less

Submitted 13 December, 2014; originally announced December 2014.

Comments: 9 pages, 10 figures, 3 tables

arXiv:1403.4165 [pdf, ps, other]

Heisenberg Groups as Platform for the AAG key-exchange protocol

Authors: Delaram Kahrobaei, Ha T. Lam

Abstract: Garber, Kahrobaei, and Lam studied polycyclic groups generated by number field as platform for the AAG key-exchange protocol. In this paper, we discuss the use of a different kind of polycyclic groups, Heisenberg groups, as a platform group for AAG by submitting Heisenberg groups to one of AAG's major attacks, the length-based attack. Garber, Kahrobaei, and Lam studied polycyclic groups generated by number field as platform for the AAG key-exchange protocol. In this paper, we discuss the use of a different kind of polycyclic groups, Heisenberg groups, as a platform group for AAG by submitting Heisenberg groups to one of AAG's major attacks, the length-based attack. △ Less

Submitted 17 March, 2014; originally announced March 2014.

Comments: arXiv admin note: text overlap with arXiv:1305.0548

arXiv:1305.0548 [pdf, ps, other]

Length-based attacks in polycyclic groups

Authors: David Garber, Delaram Kahrobaei, Ha T. Lam

Abstract: After the Anshel-Anshel-Goldfeld (AAG) key-exchange protocol was introduced in 1999, it was implemented and studied with braid groups and with the Thompson group as its underlying platforms. The length-based attack, introduced by Hughes and Tannenbaum, has been used to extensively study AAG with the braid group as the underlying platform. Meanwhile, a new platform, using polycyclic groups, was pro… ▽ More After the Anshel-Anshel-Goldfeld (AAG) key-exchange protocol was introduced in 1999, it was implemented and studied with braid groups and with the Thompson group as its underlying platforms. The length-based attack, introduced by Hughes and Tannenbaum, has been used to extensively study AAG with the braid group as the underlying platform. Meanwhile, a new platform, using polycyclic groups, was proposed by Eick and Kahrobaei. In this paper, we show that with a high enough Hirsch length, the polycyclic group as an underlying platform for AAG is resistant to the length-based attack. In particular, polycyclic groups could provide a secure platform for any cryptosystem based on conjugacy search problem such as non-commutative Diffie-Hellman, ElGamal and Cramer-Shoup key exchange protocols. △ Less

Submitted 22 November, 2014; v1 submitted 2 May, 2013; originally announced May 2013.

Comments: J. Math. Crypt. 2014

Showing 1–17 of 17 results for author: Lam, H T