-
Enhancing Sequential Music Recommendation with Negative Feedback-informed Contrastive Learning
Authors:
Pavan Seshadri,
Shahrzad Shashaani,
Peter Knees
Abstract:
Modern music streaming services are heavily based on recommendation engines to serve content to users. Sequential recommendation -- continuously providing new items within a single session in a contextually coherent manner -- has been an emerging topic in current literature. User feedback -- a positive or negative response to the item presented -- is used to drive content recommendations by learning user preferences. We extend this idea to session-based recommendation to provide context-coherent music recommendations by modelling negative user feedback, i.e., skips, in the loss function. We propose a sequence-aware contrastive sub-task to structure item embeddings in session-based music recommendation, such that true next-positive items (ignoring skipped items) are structured closer in the session embedding space, while skipped tracks are structured farther away from all items in the session. This directly affects item rankings using a K-nearest-neighbors search for next-item recommendations, while also promoting the rank of the true next item. Experiments incorporating this task into SoTA methods for sequential item recommendation show consistent performance gains in terms of next-item hit rate, item ranking, and skip down-ranking on three music recommendation datasets, strongly benefiting from the increasing presence of user feedback.
Submitted 11 September, 2024;
originally announced September 2024.
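The idea in the abstract above, pulling the true next-positive item toward the session while pushing skipped tracks away, can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch, not the loss used in the paper: the function names, the cosine similarity, the temperature value, and the use of a single session vector (rather than all items in the session) are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def skip_aware_contrastive_loss(session, next_positive, skipped, temperature=0.1):
    """Hypothetical InfoNCE-style loss: the next-positive item is the
    positive example and skipped tracks act as negatives."""
    pos = math.exp(cosine(session, next_positive) / temperature)
    neg = sum(math.exp(cosine(session, s) / temperature) for s in skipped)
    return -math.log(pos / (pos + neg))

session = [1.0, 0.0]  # toy session embedding (assumed)
# Next-positive item near the session, skipped item far away: low loss.
good = skip_aware_contrastive_loss(session, [0.9, 0.1], [[-1.0, 0.2]])
# Roles swapped: the skipped item sits near the session, so the loss is high.
bad = skip_aware_contrastive_loss(session, [-1.0, 0.2], [[0.9, 0.1]])
```

Minimizing such a loss moves skipped items away from the session representation, which is what would lower their rank in a nearest-neighbor lookup.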
-
Complexity of Zeroth- and First-order Stochastic Trust-Region Algorithms
Authors:
Yunsoo Ha,
Sara Shashaani,
Raghu Pasupathy
Abstract:
Model update (MU) and candidate evaluation (CE) are classical steps incorporated inside many stochastic trust-region (TR) algorithms. The sampling effort exerted within these steps, often decided with the aim of controlling model error, largely determines a stochastic TR algorithm's sample complexity. Given that MU and CE are amenable to variance reduction, we investigate the effect of incorporating common random numbers (CRN) within MU and CE on complexity. Using ASTRO and ASTRO-DF as prototype first-order and zeroth-order families of algorithms, we demonstrate that CRN's effectiveness leads to a range of complexities depending on sample-path regularity and the oracle order. For instance, we find that in first-order oracle settings with smooth sample paths, CRN's effect is pronounced -- ASTRO with CRN achieves $\tilde{O}(ε^{-2})$ a.s. sample complexity compared to $\tilde{O}(ε^{-6})$ a.s. in the generic no-CRN setting. By contrast, CRN's effect is muted when the sample paths are not Lipschitz, with the sample complexity improving from $\tilde{O}(ε^{-6})$ a.s. to $\tilde{O}(ε^{-5})$ and $\tilde{O}(ε^{-4})$ a.s. in the zeroth- and first-order settings, respectively. Since our results imply that improvements in complexity are largely inherited from generic aspects of variance reduction, e.g., finite-differencing for zeroth-order settings and sample-path smoothness for first-order settings within MU, we anticipate similar trends in other contexts.
Submitted 30 May, 2024;
originally announced May 2024.
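The variance-reduction effect of common random numbers (CRN) mentioned in the abstract above can be seen in a toy finite-difference gradient estimate. This is a minimal sketch under assumed choices, not ASTRO or ASTRO-DF: the sample-path function `(x - xi)**2`, the step size `h`, and all function names are hypothetical.

```python
import random
import statistics

def sample_path(x, xi):
    """Toy smooth sample path F(x, xi); an assumption for illustration."""
    return (x - xi) ** 2

def fd_gradient(x, h, rng, use_crn):
    """Forward finite-difference gradient estimate from two oracle calls."""
    xi_forward = rng.gauss(0.0, 1.0)
    # With CRN both evaluations share one random draw; otherwise draw again.
    xi_base = xi_forward if use_crn else rng.gauss(0.0, 1.0)
    return (sample_path(x + h, xi_forward) - sample_path(x, xi_base)) / h

def estimator_variance(use_crn, n=2000, x=1.0, h=0.1, seed=0):
    """Sample variance of n independent finite-difference estimates."""
    rng = random.Random(seed)
    draws = [fd_gradient(x, h, rng, use_crn) for _ in range(n)]
    return statistics.variance(draws)

var_crn = estimator_variance(use_crn=True)
var_indep = estimator_variance(use_crn=False)
```

For this toy path the CRN difference simplifies to `2 * (x - xi) + h`, so its variance stays bounded as `h` shrinks, whereas the independent-draw version has variance that grows like `1 / h**2`.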
-
COVID-19 Detection Based on Blood Test Parameters using Various Artificial Intelligence Methods
Authors:
Kavian Khanjani,
Seyed Rasoul Hosseini,
Hamid Taheri,
Shahrzad Shashaani,
Mohammad Teshnehlab
Abstract:
In 2019, the world faced a new challenge: COVID-19, a disease caused by the novel coronavirus SARS-CoV-2. The virus rapidly spread across the globe, leading to a high rate of mortality, which prompted health organizations to take measures to control its transmission. Early disease detection is crucial in the treatment process, and computer-based automatic detection systems have been developed to aid in this effort. These systems often rely on artificial intelligence (AI) approaches such as machine learning, neural networks, fuzzy systems, and deep learning to classify diseases. This study aimed to differentiate COVID-19 patients from others using self-categorizing classifiers and various AI methods. It used two datasets: blood test samples and radiography images. The best results for the blood test samples obtained from San Raphael Hospital, which include two classes of individuals, those with COVID-19 and those with non-COVID diseases, were achieved with the Ensemble method (a combination of a neural network and two machine learning methods). The results showed that this approach to COVID-19 diagnosis is cost-effective and provides results in less time than other methods. The proposed model achieved an accuracy of 94.09% on the dataset used. Secondly, the radiographic images were divided into four classes: normal, viral pneumonia, ground glass opacity, and COVID-19 infection. These were used for segmentation and classification. The lung lobes were extracted from the images and then categorized into specific classes. We achieved an accuracy of 91.1% on the image dataset. Generally, this study highlights the potential of AI in detecting and managing COVID-19 and underscores the importance of continued research and development in this field.
Submitted 6 August, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Building Trees for Probabilistic Prediction via Scoring Rules
Authors:
Sara Shashaani,
Ozge Surer,
Matthew Plumlee,
Seth Guikema
Abstract:
Decision trees built with data remain in widespread use for nonparametric prediction. Predicting probability distributions is preferred over point predictions when uncertainty plays a prominent role in analysis and decision-making. We study modifying a tree to produce nonparametric predictive distributions. We find the standard method for building trees may not result in good predictive distributions and propose changing the splitting criteria for trees to one based on proper scoring rules. Analysis of both simulated data and several real datasets demonstrates that using these new splitting criteria results in trees with improved predictive properties considering the entire predictive distribution.
Submitted 16 February, 2024;
originally announced February 2024.
-
Simulation Model Calibration with Dynamic Stratification and Adaptive Sampling
Authors:
Pranav Jain,
Sara Shashaani,
Eunshin Byon
Abstract:
Calibrating simulation models that take large quantities of multi-dimensional data as input is a hard simulation optimization problem. Existing adaptive sampling strategies offer a methodological solution. However, they may not sufficiently reduce the computational cost for estimation and solution algorithm's progress within a limited budget due to extreme noise levels and heteroskedasticity of system responses. We propose integrating stratification with adaptive sampling for the purpose of efficiency in optimization. Stratification can exploit local dependence in the simulation inputs and outputs. Yet, the state-of-the-art does not provide a full capability to adaptively stratify the data as different solution alternatives are evaluated. We devise two procedures for data-driven calibration problems that involve a large dataset with multiple covariates to calibrate models within a fixed overall simulation budget. The first approach dynamically stratifies the input data using binary trees, while the second approach uses closed-form solutions based on linearity assumptions between the objective function and concomitant variables. We find that dynamical adjustment of stratification structure accelerates optimization and reduces run-to-run variability in generated solutions. Our case study for calibrating a wind power simulation model, widely used in the wind industry, using the proposed stratified adaptive sampling, shows better-calibrated parameters under a limited budget.
Submitted 16 July, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Two-Stage Estimation and Variance Modeling for Latency-Constrained Variational Quantum Algorithms
Authors:
Yunsoo Ha,
Sara Shashaani,
Matt Menickelly
Abstract:
The Quantum Approximate Optimization Algorithm (QAOA) has enjoyed increasing attention in noisy intermediate-scale quantum computing due to its application to combinatorial optimization problems. Because combinatorial optimization problems are NP-hard, QAOA could serve as a potential demonstration of quantum advantage in the future. As a hybrid quantum-classical algorithm, the classical component of QAOA resembles a simulation optimization problem, in which the simulation outcomes are attainable only through the quantum computer. The simulation that derives from QAOA exhibits two unique features that can have a substantial impact on the optimization process: (i) the variance of the stochastic objective values typically decreases in proportion to the optimality gap, and (ii) querying samples from a quantum computer introduces an additional latency overhead. In this paper, we introduce a novel stochastic trust-region method, derived from a derivative-free adaptive sampling trust-region optimization (ASTRO-DF) method, intended to efficiently solve the classical optimization problem in QAOA, by explicitly taking into account the two mentioned characteristics. The key idea behind the proposed algorithm involves constructing two separate local models in each iteration: a model of the objective function, and a model of the variance of the objective function. Exploiting the variance model allows us to both restrict the number of communications with the quantum computer, and also helps navigate the nonconvex objective landscapes typical in the QAOA optimization problems. We numerically demonstrate the superiority of our proposed algorithm using the SimOpt library and Qiskit, when we consider a metric of computational burden that explicitly accounts for communication costs.
Submitted 16 January, 2024;
originally announced January 2024.
-
Uncertainty Quantification using Simulation Output: Batching as an Inferential Device
Authors:
Yongseok Jeon,
Yi Chu,
Raghu Pasupathy,
Sara Shashaani
Abstract:
We present batching as an omnibus device for uncertainty quantification using simulation output. We consider the classical context of a simulationist performing uncertainty quantification on an estimator $θ_n$ (of an unknown fixed quantity $θ$) using only the output data $(Y_1,Y_2,\ldots,Y_n)$ gathered from a simulation. By uncertainty quantification, we mean approximating the sampling distribution of the error $θ_n-θ$ toward: (A) estimating an assessment functional $ψ$, e.g., bias, variance, or quantile; or (B) constructing a $(1-α)$-confidence region on $θ$. We argue that batching is a remarkably simple and effective device for this purpose, and is especially suited for handling dependent output data such as what one frequently encounters in simulation contexts. We demonstrate that if the number of batches and the extent of their overlap are chosen appropriately, batching retains bootstrap's attractive theoretical properties of strong consistency and higher-order accuracy. For constructing confidence regions, we characterize two limiting distributions associated with a Studentized statistic. Our extensive numerical experience confirms theoretical insight, especially about the effects of batch size and batch overlap.
Submitted 26 August, 2024; v1 submitted 7 November, 2023;
originally announced November 2023.
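As a rough illustration of why batching suits dependent simulation output, the sketch below compares a naive standard error with one computed from non-overlapping batch means. It is not the estimator studied in the paper, which also treats overlapping batches: the AR(1) output process, the batch count, and the function names are assumptions.

```python
import math
import random
import statistics

def ar1_output(n, phi=0.9, seed=1):
    """Positively autocorrelated toy output Y_1, ..., Y_n (AR(1) process)."""
    rng = random.Random(seed)
    y, out = 0.0, []
    for _ in range(n):
        y = phi * y + rng.gauss(0.0, 1.0)
        out.append(y)
    return out

def batch_means_se(data, num_batches):
    """Standard error of the sample mean from non-overlapping batch means."""
    size = len(data) // num_batches
    means = [statistics.fmean(data[i * size:(i + 1) * size])
             for i in range(num_batches)]
    return math.sqrt(statistics.variance(means) / num_batches)

data = ar1_output(20000)
# Naive estimate treats the observations as independent.
naive_se = statistics.stdev(data) / math.sqrt(len(data))
# Batch means absorb the within-batch dependence.
batch_se = batch_means_se(data, num_batches=20)
```

With positive autocorrelation the naive figure understates the uncertainty in the sample mean, while the batch-means figure accounts for dependence between nearby observations.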
-
Iteration Complexity and Finite-Time Efficiency of Adaptive Sampling Trust-Region Methods for Stochastic Derivative-Free Optimization
Authors:
Yunsoo Ha,
Sara Shashaani
Abstract:
Adaptive sampling with interpolation-based trust regions or ASTRO-DF is a successful algorithm for stochastic derivative-free optimization with an easy-to-understand-and-implement concept that guarantees almost sure convergence to a first-order critical point. To reduce its dependence on the problem dimension, we present local models with diagonal Hessians constructed on interpolation points based on a coordinate basis. We also leverage the interpolation points in a direct search manner whenever possible to boost ASTRO-DF's performance in a finite time. We prove that the algorithm has a canonical iteration complexity of $\mathcal{O}(ε^{-2})$ almost surely, which is the first guarantee of its kind without placing assumptions on the quality of function estimates or model quality or independence between them. Numerical experimentation reveals the computational advantage of ASTRO-DF with coordinate direct search due to saving and better steps in the early iterations of the search.
Submitted 16 January, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Network Intrusion Detection with Limited Labeled Data Using Self-supervision
Authors:
S. Lotfi,
M. Modirrousta,
S. Shashaani,
M. Aliyari Shoorehdeli
Abstract:
With the increasing dependence of daily life on computer networks, the security of these networks becomes ever more important. Different intrusion attacks on networks have been designed, and attackers are working on improving them. Thus, the ability to detect intrusion with a limited amount of labeled data is desirable to provide networks with a higher level of security. In this paper we design an intrusion detection system based on a deep neural network. The proposed system is based on self-supervised contrastive learning, where a huge amount of unlabeled data can be used to generate informative representations suitable for various downstream tasks with a limited amount of labeled data. Using different experiments, we show that the proposed system achieves an accuracy of 94.05% on the UNSW-NB15 dataset, an improvement of 4.22% over the previous method based on self-supervised learning. Our simulations also show impressive results when the size of the labeled training data is limited. The performance of the resulting Encoder Block trained on the UNSW-NB15 dataset has also been tested on other datasets for representation extraction, showing competitive results in downstream tasks.
Submitted 31 March, 2023; v1 submitted 1 September, 2022;
originally announced September 2022.
-
Robust Output Analysis with Monte-Carlo Methodology
Authors:
Kimia Vahdat,
Sara Shashaani
Abstract:
In predictive modeling with simulation or machine learning, it is critical to accurately assess the quality of estimated values through output analysis. In recent decades output analysis has become enriched with methods that quantify the impact of input data uncertainty in the model outputs to increase robustness. However, most developments are applicable assuming that the input data adheres to a parametric family of distributions. We propose a unified output analysis framework for simulation and machine learning outputs through the lens of Monte Carlo sampling. This framework provides nonparametric quantification of the variance and bias induced in the outputs with higher-order accuracy. Our new bias-corrected estimation from the model outputs leverages the extension of fast iterative bootstrap sampling and higher-order influence functions. For the scalability of the proposed estimation methods, we devise budget-optimal rules and leverage control variates for variance reduction. Our theoretical and numerical results demonstrate a clear advantage in building more robust confidence intervals from the model outputs with higher coverage probability.
Submitted 25 October, 2023; v1 submitted 27 July, 2022;
originally announced July 2022.
-
ASTRO-DF: A Class of Adaptive Sampling Trust-Region Algorithms for Derivative-Free Stochastic Optimization
Authors:
Sara Shashaani,
Fatemeh Hashemi,
Raghu Pasupathy
Abstract:
We consider unconstrained optimization problems where only "stochastic" estimates of the objective function are observable as replicates from a Monte Carlo oracle. The Monte Carlo oracle is assumed to provide no direct observations of the function gradient. We present ASTRO-DF, a class of derivative-free trust-region algorithms, where a stochastic local interpolation model is constructed, optimized, and updated iteratively. Function estimation and model construction within ASTRO-DF are adaptive in the sense that the extent of Monte Carlo sampling is determined by continuously monitoring and balancing metrics of sampling error (or variance) and structural error (or model bias) within ASTRO-DF. Such balancing of errors is designed to ensure that Monte Carlo effort within ASTRO-DF is sensitive to algorithm trajectory, sampling more whenever an iterate is inferred to be close to a critical point and less when far away. We demonstrate the almost-sure convergence of ASTRO-DF's iterates to a first-order critical point when using linear or quadratic stochastic interpolation models. The question of using more complicated models, e.g., regression or stochastic kriging, in combination with adaptive sampling is worth further investigation and will benefit from the methods of proof presented here. We speculate that ASTRO-DF's iterates achieve the canonical Monte Carlo convergence rate, although a proof remains elusive.
Submitted 20 October, 2016;
originally announced October 2016.