-
Federated XGBoost on Sample-Wise Non-IID Data
Authors:
Katelinh Jones,
Yuya Jeremy Ong,
Yi Zhou,
Nathalie Baracaldo
Abstract:
Federated Learning (FL) is a paradigm for jointly training machine learning algorithms in a decentralized manner which allows for parties to communicate with an aggregator to create and train a model, without exposing the underlying raw data distribution of the local parties involved in the training process. Most research in FL has been focused on Neural Network-based approaches, however Tree-Base…
▽ More
Federated Learning (FL) is a paradigm for jointly training machine learning algorithms in a decentralized manner which allows for parties to communicate with an aggregator to create and train a model, without exposing the underlying raw data distribution of the local parties involved in the training process. Most research in FL has been focused on Neural Network-based approaches, however Tree-Based methods, such as XGBoost, have been underexplored in Federated Learning due to the challenges in overcoming the iterative and additive characteristics of the algorithm. Decision tree-based models, in particular XGBoost, can handle non-IID data, which is significant for algorithms used in Federated Learning frameworks since the underlying characteristics of the data are decentralized and have risks of being non-IID by nature. In this paper, we focus on investigating the effects of how Federated XGBoost is impacted by non-IID distributions by performing experiments on various sample size-based data skew scenarios and how these models perform under various non-IID scenarios. We conduct a set of extensive experiments across multiple different datasets and different data skew partitions. Our experimental results demonstrate that despite the various partition ratios, the performance of the models stayed consistent and performed close to or equally well against models that were trained in a centralized manner.
△ Less
Submitted 3 September, 2022;
originally announced September 2022.
-
Towards a New Science of Disinformation
Authors:
Claudio S. Pinhanez,
German H. Flores,
Marisa A. Vasconcelos,
Mu Qiao,
Nick Linck,
Rogério de Paula,
Yuya J. Ong
Abstract:
How can we best address the dangerous impact that deep learning-generated fake audios, photographs, and videos (a.k.a. deepfakes) may have in personal and societal life? We foresee that the availability of cheap deepfake technology will create a second wave of disinformation where people will receive specific, personalized disinformation through different channels, making the current approaches to…
▽ More
How can we best address the dangerous impact that deep learning-generated fake audios, photographs, and videos (a.k.a. deepfakes) may have in personal and societal life? We foresee that the availability of cheap deepfake technology will create a second wave of disinformation where people will receive specific, personalized disinformation through different channels, making the current approaches to fight disinformation obsolete. We argue that fake media has to be seen as an upcoming cybersecurity problem, and we have to shift from combating its spread to a prevention and cure framework where users have available ways to verify, challenge, and argue against the veracity of each piece of media they are exposed to. To create the technologies behind this framework, we propose that a new Science of Disinformation is needed, one which creates a theoretical framework both for the processes of communication and consumption of false content. Key scientific and technological challenges facing this research agenda are listed and discussed in the light of state-of-art technologies for fake media generation and detection, argument finding and construction, and how to effectively engage users in the prevention and cure processes.
△ Less
Submitted 17 March, 2022;
originally announced April 2022.
-
SimPO: Simultaneous Prediction and Optimization
Authors:
Bing Zhang,
Yuya Jeremy Ong,
Taiga Nakamura
Abstract:
Many machine learning (ML) models are integrated within the context of a larger system as part of a key component for decision making processes. Concretely, predictive models are often employed in estimating the parameters for the input values that are utilized for optimization models as isolated processes. Traditionally, the predictive models are built first, then the model outputs are used to ge…
▽ More
Many machine learning (ML) models are integrated within the context of a larger system as part of a key component for decision making processes. Concretely, predictive models are often employed in estimating the parameters for the input values that are utilized for optimization models as isolated processes. Traditionally, the predictive models are built first, then the model outputs are used to generate decision values separately. However, it is often the case that the prediction values that are trained independently of the optimization process produce sub-optimal solutions. In this paper, we propose a formulation for the Simultaneous Prediction and Optimization (SimPO) framework. This framework introduces the use of a joint weighted loss of a decision-driven predictive ML model and an optimization objective function, which is optimized end-to-end directly through gradient-based methods.
△ Less
Submitted 31 March, 2022;
originally announced April 2022.
-
Predicting Loss Risks for B2B Tendering Processes
Authors:
Eelaaf Zahid,
Yuya Jeremy Ong,
Aly Megahed,
Taiga Nakamura
Abstract:
Sellers and executives who maintain a bidding pipeline of sales engagements with multiple clients for many opportunities significantly benefit from data-driven insight into the health of each of their bids. There are many predictive models that offer likelihood insights and win prediction modeling for these opportunities. Currently, these win prediction models are in the form of binary classificat…
▽ More
Sellers and executives who maintain a bidding pipeline of sales engagements with multiple clients for many opportunities significantly benefit from data-driven insight into the health of each of their bids. There are many predictive models that offer likelihood insights and win prediction modeling for these opportunities. Currently, these win prediction models are in the form of binary classification and only make a prediction for the likelihood of a win or loss. The binary formulation is unable to offer any insight as to why a particular deal might be predicted as a loss. This paper offers a multi-class classification model to predict win probability, with the three loss classes offering specific reasons as to why a loss is predicted, including no bid, customer did not pursue, and lost to competition. These classes offer an indicator of how that opportunity might be handled given the nature of the prediction. Besides offering baseline results on the multi-class classification, this paper also offers results on the model after class imbalance handling, with the results achieving a high accuracy of 85% and an average AUC score of 0.94.
△ Less
Submitted 14 September, 2021;
originally announced September 2021.
-
Adaptive Histogram-Based Gradient Boosted Trees for Federated Learning
Authors:
Yuya Jeremy Ong,
Yi Zhou,
Nathalie Baracaldo,
Heiko Ludwig
Abstract:
Federated Learning (FL) is an approach to collaboratively train a model across multiple parties without sharing data between parties or an aggregator. It is used both in the consumer domain to protect personal data as well as in enterprise settings, where dealing with data domicile regulation and the pragmatics of data silos are the main drivers. While gradient boosted tree implementations such as…
▽ More
Federated Learning (FL) is an approach to collaboratively train a model across multiple parties without sharing data between parties or an aggregator. It is used both in the consumer domain to protect personal data as well as in enterprise settings, where dealing with data domicile regulation and the pragmatics of data silos are the main drivers. While gradient boosted tree implementations such as XGBoost have been very successful for many use cases, its federated learning adaptations tend to be very slow due to using cryptographic and privacy methods and have not experienced widespread use. We propose the Party-Adaptive XGBoost (PAX) for federated learning, a novel implementation of gradient boosting which utilizes a party adaptive histogram aggregation method, without the need for data encryption. It constructs a surrogate representation of the data distribution for finding splits of the decision tree. Our experimental results demonstrate strong model performance, especially on non-IID distributions, and significantly faster training run-time across different data sets than existing federated implementations. This approach makes the use of gradient boosted trees practical in enterprise federated learning.
△ Less
Submitted 11 December, 2020;
originally announced December 2020.
-
Temporal Tensor Transformation Network for Multivariate Time Series Prediction
Authors:
Yuya Jeremy Ong,
Mu Qiao,
Divyesh Jadav
Abstract:
Multivariate time series prediction has applications in a wide variety of domains and is considered to be a very challenging task, especially when the variables have correlations and exhibit complex temporal patterns, such as seasonality and trend. Many existing methods suffer from strong statistical assumptions, numerical issues with high dimensionality, manual feature engineering efforts, and sc…
▽ More
Multivariate time series prediction has applications in a wide variety of domains and is considered to be a very challenging task, especially when the variables have correlations and exhibit complex temporal patterns, such as seasonality and trend. Many existing methods suffer from strong statistical assumptions, numerical issues with high dimensionality, manual feature engineering efforts, and scalability. In this work, we present a novel deep learning architecture, known as Temporal Tensor Transformation Network, which transforms the original multivariate time series into a higher order of tensor through the proposed Temporal-Slicing Stack Transformation. This yields a new representation of the original multivariate time series, which enables the convolution kernel to extract complex and non-linear features as well as variable interactional signals from a relatively large temporal region. Experimental results show that Temporal Tensor Transformation Network outperforms several state-of-the-art methods on window-based predictions across various tasks. The proposed architecture also demonstrates robust prediction performance through an extensive sensitivity analysis.
△ Less
Submitted 4 January, 2020;
originally announced January 2020.