
Showing 1–25 of 25 results for author: Gittens, A

Searching in archive cs.
  1. arXiv:2403.04224  [pdf, other]

    cs.CL cs.AI cs.LG

    Aligners: Decoupling LLMs and Alignment

    Authors: Lilian Ngweta, Mayank Agarwal, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

    Abstract: Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications. Alignment is challenging, costly, and needs to be repeated for every LLM and alignment criterion. We propose to decouple LLMs and alignment by training aligner models that can be used to align any LLM for a given criteria on an as-needed basis, thus also reducing the pot…

    Submitted 16 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: Tiny Papers at the International Conference on Learning Representations (ICLR) 2024

  2. arXiv:2308.15027  [pdf, ps, other]

    cs.IR cs.CL

    Improving Neural Ranking Models with Traditional IR Methods

    Authors: Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener

    Abstract: Neural ranking methods based on large transformer models have recently gained significant attention in the information retrieval community, and have been adopted by major commercial solutions. Nevertheless, they are computationally expensive to create, and require a great deal of labeled data for specialized corpora. In this paper, we explore a low resource alternative which is a bag-of-embedding…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: Short paper, 4 pages

  3. arXiv:2308.03891  [pdf, other]

    cs.CL

    A Cross-Domain Evaluation of Approaches for Causal Knowledge Extraction

    Authors: Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener

    Abstract: Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation. Although this task is important for language understanding and knowledge discovery, recent works in this domain have largely focused on binary classification of a text segment as causal or non-causal. In this regard, we perform a thorough analysis of three sequence tagging…

    Submitted 7 August, 2023; originally announced August 2023.

  4. arXiv:2305.20043  [pdf, ps, other]

    cs.LG stat.ML

    Deception by Omission: Using Adversarial Missingness to Poison Causal Structure Learning

    Authors: Deniz Koyuncu, Alex Gittens, Bülent Yener, Moti Yung

    Abstract: Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed. Prior work has demonstrated that adversarial perturbations of completely observed training data may be used to force the learning of inaccurate structural causal models (SCMs). However, when the data can be audited for correctness (e.g., it is c…

    Submitted 31 May, 2023; originally announced May 2023.

  5. arXiv:2305.07486  [pdf, ps, other]

    cs.LG cs.DS

    Reduced Label Complexity For Tight $\ell_2$ Regression

    Authors: Alex Gittens, Malik Magdon-Ismail

    Abstract: Given data ${\rm X}\in\mathbb{R}^{n\times d}$ and labels $\mathbf{y}\in\mathbb{R}^{n}$ the goal is to find $\mathbf{w}\in\mathbb{R}^d$ to minimize $\Vert{\rm X}\mathbf{w}-\mathbf{y}\Vert^2$. We give a polynomial algorithm that, \emph{oblivious to $\mathbf{y}$}, throws out $n/(d+\sqrt{n})$ data points and is a $(1+d/n)$-approximation to optimal in expectation. The motivation is tight approximation wit…

    Submitted 12 May, 2023; originally announced May 2023.

  6. arXiv:2304.10642  [pdf, other]

    cs.CL

    Word Sense Induction with Knowledge Distillation from BERT

    Authors: Anik Saha, Alex Gittens, Bulent Yener

    Abstract: Pre-trained contextual language models are ubiquitously employed for language understanding tasks, but are unsuitable for resource-constrained systems. Noncontextual word embeddings are an efficient alternative in these settings. Such methods typically use one vector to encode multiple different meanings of a word, and incur errors due to polysemy. This paper proposes a two-stage method to distill…

    Submitted 20 April, 2023; originally announced April 2023.

  7. arXiv:2302.09795  [pdf, other]

    cs.LG cs.CV stat.ML

    Simple Disentanglement of Style and Content in Visual Representations

    Authors: Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

    Abstract: Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model t…

    Submitted 31 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: International Conference on Machine Learning (ICML) 2023

  8. arXiv:2107.03806  [pdf, other]

    cs.LG cs.CR

    Output Randomization: A Novel Defense for both White-box and Black-box Adversarial Models

    Authors: Daniel Park, Haidar Khan, Azer Khan, Alex Gittens, Bülent Yener

    Abstract: Adversarial examples pose a threat to deep neural network models in a variety of scenarios, from settings where the adversary has complete knowledge of the model in a "white box" setting to the opposite in a "black box" setting. In this paper, we explore the use of output randomization as a defense against attacks in both the black box and white box models and propose two defenses. In the firs…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: This is a substantially changed version of an earlier preprint (arXiv:1905.09871)

  9. arXiv:2106.04447  [pdf, other]

    cs.CL

    Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation

    Authors: Gabriel Orlanski, Alex Gittens

    Abstract: Answering a programming question using only its title is difficult as salient contextual information is omitted. Based on this observation, we present a corpus of over 40,000 StackOverflow question texts to be used in conjunction with their corresponding intents from the CoNaLa dataset (Yin et al., 2018). Using both the intent and question body, we use BART to establish a baseline BLEU score of 34…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: To be published in ACL-IJCNLP NLP4Prog workshop. (The First Workshop on Natural Language Processing for Programming)

  10. arXiv:2104.13504  [pdf, other]

    cs.LG stat.ML

    Learning Fair Canonical Polyadical Decompositions using a Kernel Independence Criterion

    Authors: Kevin Kim, Alex Gittens

    Abstract: This work proposes to learn fair low-rank tensor decompositions by regularizing the Canonical Polyadic Decomposition factorization with the kernel Hilbert-Schmidt independence criterion (KHSIC). It is shown, theoretically and empirically, that a small KHSIC between a latent factor and the sensitive features guarantees approximate statistical parity. The proposed algorithm surpasses the state-of-th…

    Submitted 27 April, 2021; originally announced April 2021.

  11. arXiv:2104.08026  [pdf, other]

    cs.LG math.NA

    NoisyCUR: An algorithm for two-cost budgeted matrix completion

    Authors: Dong Hu, Alex Gittens, Malik Magdon-Ismail

    Abstract: Matrix completion is a ubiquitous tool in machine learning and data analysis. Most work in this area has focused on the number of observations necessary to obtain an accurate low-rank approximation. In practice, however, the cost of observations is an important limiting factor, and experimentalists may have on hand multiple modes of observation with differing noise-vs-cost trade-offs. This paper c…

    Submitted 16 April, 2021; originally announced April 2021.

  12. arXiv:2102.05571  [pdf, other]

    cs.CR cs.AI cs.IR cs.LG

    TINKER: A framework for Open source Cyberthreat Intelligence

    Authors: Nidhi Rastogi, Sharmishtha Dutta, Mohammed J. Zaki, Alex Gittens, Charu Aggarwal

    Abstract: Threat intelligence on malware attacks and campaigns is increasingly being shared with other security experts for a cost or for free. Other security analysts use this intelligence to inform them of indicators of compromise, attack techniques, and preventative actions. Security analysts prepare threat analysis reports after investigating an attack, an emerging cyber threat, or a recently discovered…

    Submitted 19 January, 2023; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: 9 pages

  13. MALOnt: An Ontology for Malware Threat Intelligence

    Authors: Nidhi Rastogi, Sharmishtha Dutta, Mohammed J. Zaki, Alex Gittens, Charu Aggarwal

    Abstract: Malware threat intelligence uncovers deep information about malware, threat actors, and their tactics, Indicators of Compromise (IoC), and vulnerabilities in different platforms from scattered threat sources. This collective information can guide decision making in cyber defense applications utilized by security operation centers (SoCs). In this paper, we introduce an open-source malware ontology -…

    Submitted 19 June, 2020; originally announced June 2020.

  14. arXiv:1909.12580  [pdf, other]

    cs.LG stat.ML

    Fast Fixed Dimension L2-Subspace Embeddings of Arbitrary Accuracy, With Application to L1 and L2 Tasks

    Authors: Malik Magdon-Ismail, Alex Gittens

    Abstract: We give a fast oblivious L2-embedding of $A\in \mathbb{R}^{n \times d}$ to $B\in \mathbb{R}^{r \times d}$ satisfying $(1-\varepsilon)\|A x\|_2^2 \le \|B x\|_2^2 \le (1+\varepsilon) \|Ax\|_2^2.$ Our embedding dimension $r$ equals $d$, a constant independent of the distortion $\varepsilon$. We use as a black-box any L2-embedding $\Pi^T A$ and inherit its runtime and accuracy, effectively decoupling the dimension…

    Submitted 27 September, 2019; originally announced September 2019.

  15. arXiv:1806.01270  [pdf, other]

    cs.DC cs.DB physics.data-an stat.CO

    Alchemist: An Apache Spark <=> MPI Interface

    Authors: Alex Gittens, Kai Rothauge, Shusen Wang, Michael W. Mahoney, Jey Kottalam, Lisa Gerhardt, Prabhat, Michael Ringenburg, Kristyn Maschhoff

    Abstract: The Apache Spark framework for distributed computation is popular in the data analytics community due to its ease of use, but its MapReduce-style programming model can incur significant overheads when performing computations that do not map directly onto this model. One way to mitigate these costs is to off-load computations onto MPI codes. In recent work, we introduced Alchemist, a system for the…

    Submitted 3 June, 2018; originally announced June 2018.

    Comments: Accepted for publication in Concurrency and Computation: Practice and Experience, Special Issue on the Cray User Group 2018. arXiv admin note: text overlap with arXiv:1805.11800

  16. arXiv:1805.11800  [pdf, other]

    cs.DC cs.DB physics.data-an stat.CO

    Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist

    Authors: Alex Gittens, Kai Rothauge, Shusen Wang, Michael W. Mahoney, Lisa Gerhardt, Prabhat, Jey Kottalam, Michael Ringenburg, Kristyn Maschhoff

    Abstract: Apache Spark is a popular system aimed at the analysis of large data sets, but recent studies have shown that certain computations---in particular, many linear algebra computations that are the basis for solving common machine learning problems---are significantly slower in Spark than when done using libraries written in a high-performance computing framework such as the Message-Passing Interface…

    Submitted 30 May, 2018; originally announced May 2018.

    Comments: Accepted for publication in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 2018

  17. arXiv:1706.02803  [pdf, other]

    cs.LG stat.ML

    Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds

    Authors: Shusen Wang, Alex Gittens, Michael W. Mahoney

    Abstract: Kernel $k$-means clustering can correctly identify and extract a far more varied collection of cluster structures than the linear $k$-means clustering algorithm. However, kernel $k$-means clustering is computationally expensive when the non-linear feature map is high-dimensional and there are many input points. Kernel approximation, e.g., the Nyström method, has been applied in previous works to a…

    Submitted 10 February, 2019; v1 submitted 8 June, 2017; originally announced June 2017.

    Journal ref: Journal of Machine Learning Research 20 (2019) 1-49

  18. arXiv:1702.04837  [pdf, other]

    stat.ML cs.LG math.NA

    Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging

    Authors: Shusen Wang, Alex Gittens, Michael W. Mahoney

    Abstract: We address the statistical and optimization impacts of the classical sketch and Hessian sketch used to approximately solve the Matrix Ridge Regression (MRR) problem. Prior research has quantified the effects of classical sketch on the strictly simpler least squares regression (LSR) problem. We establish that classical sketch has a similar effect upon the optimization properties of MRR as it does o…

    Submitted 5 May, 2018; v1 submitted 15 February, 2017; originally announced February 2017.

    Comments: To appear in Journal of Machine Learning Research, 2018. A short version has appeared in International Conference on Machine Learning (ICML), 2017

    Journal ref: Journal of Machine Learning Research, 19, pp1-50, 2018

  19. arXiv:1607.01335  [pdf, other]

    cs.DC

    Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

    Authors: Alex Gittens, Aditya Devarakonda, Evan Racah, Michael Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jiyan Yang, James Demmel, Jim Harrell, Venkat Krishnamurthy, Michael W. Mahoney, Prabhat

    Abstract: We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity…

    Submitted 20 September, 2016; v1 submitted 5 July, 2016; originally announced July 2016.

    ACM Class: G.1.3; C.2.4

  20. arXiv:1504.01697  [pdf, ps, other]

    cs.LG stat.ML

    Tensor machines for learning target-specific polynomial features

    Authors: Jiyan Yang, Alex Gittens

    Abstract: Recent years have demonstrated that using random feature maps can significantly decrease the training and testing times of kernel-based algorithms without significantly lowering their accuracy. Regrettably, because random features are target-agnostic, typically thousands of such features are necessary to achieve acceptable accuracies. In this work, we consider the problem of learning a small numbe…

    Submitted 7 April, 2015; originally announced April 2015.

    Comments: 19 pages, 4 color figures, 2 tables. Submitted to ECML 2015

  21. arXiv:1404.0466  [pdf, other]

    cs.LG math.NA

    piCholesky: Polynomial Interpolation of Multiple Cholesky Factors for Efficient Approximate Cross-Validation

    Authors: Da Kuang, Alex Gittens, Raffay Hamid

    Abstract: The dominant cost in solving least-squares problems using Newton's method is often that of factorizing the Hessian matrix over multiple values of the regularization parameter ($\lambda$). We propose an efficient way to interpolate the Cholesky factors of the Hessian matrix computed over a small set of $\lambda$ values. This approximation enables us to optimally minimize the hold-out error while incurring only…

    Submitted 10 June, 2015; v1 submitted 2 April, 2014; originally announced April 2014.

  22. arXiv:1312.4626  [pdf, other]

    stat.ML cs.LG

    Compact Random Feature Maps

    Authors: Raffay Hamid, Ying Xiao, Alex Gittens, Dennis DeCoste

    Abstract: Kernel approximation using randomized feature maps has recently gained a lot of interest. In this work, we identify that previous approaches for polynomial kernel approximation create maps that are rank deficient, and therefore do not utilize the capacity of the projected feature space effectively. To address this challenge, we propose compact random feature maps (CRAFTMaps) to approximate polynom…

    Submitted 16 December, 2013; originally announced December 2013.

    Comments: 9 pages

  23. arXiv:1311.2854  [pdf, other]

    cs.LG math.NA

    Spectral Clustering via the Power Method -- Provably

    Authors: Christos Boutsidis, Alex Gittens, Prabhanjan Kambadur

    Abstract: Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix corresponding to the graph representing the data to be clustered. One…

    Submitted 12 May, 2015; v1 submitted 12 November, 2013; originally announced November 2013.

    Comments: ICML 2015, to appear

  24. arXiv:1303.1849  [pdf, other]

    cs.LG cs.DS math.NA

    Revisiting the Nystrom Method for Improved Large-Scale Machine Learning

    Authors: Alex Gittens, Michael W. Mahoney

    Abstract: We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results…

    Submitted 3 June, 2013; v1 submitted 7 March, 2013; originally announced March 2013.

    Comments: 60 pages, 15 color figures; updated proof of Frobenius norm bounds, added comparison to projection-based low-rank approximations, and an analysis of the power method applied to SPSD sketches

  25. arXiv:1204.0062  [pdf, other]

    cs.DS math.NA

    Improved matrix algorithms via the Subsampled Randomized Hadamard Transform

    Authors: Christos Boutsidis, Alex Gittens

    Abstract: Several recent randomized linear algebra algorithms rely upon fast dimension reduction methods. A popular choice is the Subsampled Randomized Hadamard Transform (SRHT). In this article, we address the efficacy, in the Frobenius and spectral norms, of an SRHT-based low-rank matrix approximation technique introduced by Woolfe, Liberty, Rokhlin, and Tygert. We establish a slightly better Frobenius no…

    Submitted 21 June, 2013; v1 submitted 30 March, 2012; originally announced April 2012.

    Comments: to appear in SIAM Journal on Matrix Analysis and Applications