
Showing 1–15 of 15 results for author: Haddock, J

  1. arXiv:2407.02626

    cs.DB

    The text2term tool to map free-text descriptions of biomedical terms to ontologies

    Authors: Rafael S. Gonçalves, Jason Payne, Amelia Tan, Carmen Benitez, Jamie Haddock, Robert Gentleman

    Abstract: There is an ongoing need for scalable tools to aid researchers in both retrospective and prospective standardization of discrete entity types -- such as disease names, cell types or chemicals -- that are used in metadata associated with biomedical data. When metadata are not well-structured or precise, the associated data are harder to find and are often burdensome to reuse, analyze or integrate w…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2303.00058

    cs.LG stat.ML

    Neural Nonnegative Matrix Factorization for Hierarchical Multilayer Topic Modeling

    Authors: Tyler Will, Runyu Zhang, Eli Sadovnik, Mengdi Gao, Joshua Vendrow, Jamie Haddock, Denali Molitor, Deanna Needell

    Abstract: We introduce a new method based on nonnegative matrix factorization, Neural NMF, for detecting latent hierarchical structure in data. Datasets with hierarchical structure arise in a wide variety of fields, such as document classification, image processing, and bioinformatics. Neural NMF recursively applies NMF in layers to discover overarching topics encompassing the lower-level features. We deriv…

    Submitted 28 February, 2023; originally announced March 2023.

  3. arXiv:2209.02415

    cs.CV cs.AI

    Automatic Infectious Disease Classification Analysis with Concept Discovery

    Authors: Elena Sizikova, Joshua Vendrow, Xu Cao, Rachel Grotheer, Jamie Haddock, Lara Kassab, Alona Kryshchenko, Thomas Merkh, R. W. M. A. Madushani, Kenny Moise, Annie Ulichney, Huy V. Vo, Chuntian Wang, Megan Coffee, Kathryn Leonard, Deanna Needell

    Abstract: Automatic infectious disease classification from images can facilitate needed medical diagnoses. Such an approach can identify diseases, like tuberculosis, which remain under-diagnosed due to resource constraints and also novel and emerging diseases, like monkeypox, which clinicians have little experience or acumen in diagnosing. Avoiding missed or delayed diagnoses would prevent further transmiss…

    Submitted 14 November, 2022; v1 submitted 28 August, 2022; originally announced September 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 13 pages

  4. arXiv:2207.05112

    cs.LG

    An Interpretable Joint Nonnegative Matrix Factorization-Based Point Cloud Distance Measure

    Authors: Hannah Friedman, Amani R. Maina-Kilaas, Julianna Schalkwyk, Hina Ahmed, Jamie Haddock

    Abstract: In this paper, we propose a new method for determining shared features of and measuring the distance between data sets or point clouds. Our approach uses the joint factorization of two data matrices $X_1,X_2$ into non-negative matrices $X_1 = AS_1, X_2 = AS_2$ to derive a similarity measure that determines how well the shared basis $A$ approximates $X_1, X_2$. We also propose a point cloud distanc…

    Submitted 27 November, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

  5. arXiv:2204.13586

    cs.SI cs.LG math.PR physics.data-an physics.soc-ph

    Nonbacktracking spectral clustering of nonuniform hypergraphs

    Authors: Philip Chodrow, Nicole Eikmeier, Jamie Haddock

    Abstract: Spectral methods offer a tractable, global framework for clustering in graphs via eigenvector computations on graph matrices. Hypergraph data, in which entities interact on edges of arbitrary size, poses challenges for matrix representations and therefore for spectral clustering. We study spectral clustering for nonuniform hypergraphs based on the hypergraph nonbacktracking operator. After reviewi…

    Submitted 3 September, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: Main text: 26 pages, 6 figures. Appendix and references: 23 pages, 4 figures

    MSC Class: 05C50; 05C65; 15A18; 62H30; 62R07; 91D30

  6. arXiv:2203.03551

    cs.IR cs.LG math.NA

    Semi-supervised Nonnegative Matrix Factorization for Document Classification

    Authors: Jamie Haddock, Lara Kassab, Sixian Li, Alona Kryshchenko, Rachel Grotheer, Elena Sizikova, Chuntian Wang, Thomas Merkh, RWMA Madushani, Miju Ahn, Deanna Needell, Kathryn Leonard

    Abstract: We propose new semi-supervised nonnegative matrix factorization (SSNMF) models for document classification and provide motivation for these models as maximum likelihood estimators. The proposed SSNMF models simultaneously provide both a topic model and a model for classification, thereby offering highly interpretable classification results. We derive training methods using multiplicative updates f…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2010.07956

  7. arXiv:2110.14609

    math.OC cs.DC cs.MA eess.SY

    Paving the Way for Consensus: Convergence of Block Gossip Algorithms

    Authors: Jamie Haddock, Benjamin Jarman, Chen Yap

    Abstract: Gossip protocols are popular methods for average consensus problems in distributed computing. We prove new convergence guarantees for a variety of such protocols, including path, clique, and synchronous pairwise gossip. These arise by exploiting the connection between these protocols and the block randomized Kaczmarz method for solving linear systems. Moreover, we extend existing convergence resul…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: 21 pages, 19 figures

  8. arXiv:2109.14820

    cs.LG stat.ML

    A Generalized Hierarchical Nonnegative Tensor Decomposition

    Authors: Joshua Vendrow, Jamie Haddock, Deanna Needell

    Abstract: Nonnegative matrix factorization (NMF) has found many applications including topic modeling and document analysis. Hierarchical NMF (HNMF) variants are able to learn topics at various levels of granularity and illustrate their hierarchical relationship. Recently, nonnegative tensor factorization (NTF) methods have been applied in a similar fashion in order to handle data sets with complex, multi-m…

    Submitted 15 February, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: 6 pages, 2 figures, 3 tables

  9. arXiv:2010.11365

    cs.LG

    On a Guided Nonnegative Matrix Factorization

    Authors: Joshua Vendrow, Jamie Haddock, Elizaveta Rebrova, Deanna Needell

    Abstract: Fully unsupervised topic models have found fantastic success in document clustering and classification. However, these models often suffer from the tendency to learn less-than-meaningful or even redundant topics when the data is biased towards a set of features. For this reason, we propose an approach based upon the nonnegative matrix factorization (NMF) model, deemed Guided NMF, that inc…

    Submitted 5 February, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: 6 pages, 6 tables

  10. arXiv:2010.10635

    math.NA cs.LG math.OC

    On Application of Block Kaczmarz Methods in Matrix Factorization

    Authors: Edwin Chau, Jamie Haddock

    Abstract: Matrix factorization techniques compute low-rank product approximations of high dimensional data matrices and as a result, are often employed in recommender systems and collaborative filtering applications. However, many algorithms for this task utilize an exact least-squares solver whose computation is time consuming and memory-expensive. In this paper we discuss and test a block Kaczmarz solver…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: 13 pages, 5 figures

  11. arXiv:2010.07956

    cs.LG math.OC

    Semi-supervised NMF Models for Topic Modeling in Learning Tasks

    Authors: Jamie Haddock, Lara Kassab, Sixian Li, Alona Kryshchenko, Rachel Grotheer, Elena Sizikova, Chuntian Wang, Thomas Merkh, R. W. M. A. Madushani, Miju Ahn, Deanna Needell, Kathryn Leonard

    Abstract: We propose several new models for semi-supervised nonnegative matrix factorization (SSNMF) and provide motivation for SSNMF models as maximum likelihood estimators given specific distributions of uncertainty. We present multiplicative updates training methods for each new model, and demonstrate the application of these models to classification, although they are flexible to other supervised learni…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 4 figures, 12 tables

  12. arXiv:2009.09087

    cs.CY cs.LG stat.ML

    Feature Selection on Lyme Disease Patient Survey Data

    Authors: Joshua Vendrow, Jamie Haddock, Deanna Needell, Lorraine Johnson

    Abstract: Lyme disease is a rapidly growing illness that remains poorly understood within the medical community. Critical questions about when and why patients respond to treatment or stay ill, what kinds of treatments are effective, and even how to properly diagnose the disease remain largely unanswered. We investigate these questions by applying machine learning techniques to a large scale Lyme disease pa…

    Submitted 24 August, 2020; originally announced September 2020.

    Comments: 9 pages, 8 figures, 6 tables

  13. arXiv:2001.00631

    cs.LG stat.ML

    On Large-Scale Dynamic Topic Modeling with Nonnegative CP Tensor Decomposition

    Authors: Miju Ahn, Nicole Eikmeier, Jamie Haddock, Lara Kassab, Alona Kryshchenko, Kathryn Leonard, Deanna Needell, R. W. M. A. Madushani, Elena Sizikova, Chuntian Wang

    Abstract: There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employs the method of nonnegative matrix factorization (NMF), where slices of t…

    Submitted 14 October, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

    Comments: 23 pages, 29 figures, submitted to Women in Data Science and Mathematics (WiSDM) Workshop Proceedings, "Advances in Data Science", AWM-Springer series

  14. arXiv:1905.13404

    cs.LG math.OC stat.ML

    Data-driven Algorithm Selection and Parameter Tuning: Two Case studies in Optimization and Signal Processing

    Authors: Jesus A. De Loera, Jamie Haddock, Anna Ma, Deanna Needell

    Abstract: Machine learning algorithms typically rely on optimization subroutines and are well-known to provide very effective outcomes for many types of problems. Here, we flip the reliance and ask the reverse question: can machine learning algorithms lead to more effective outcomes for optimization problems? Our goal is to train machine learning methods to automatically improve the performance of optimizat…

    Submitted 26 July, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

  15. arXiv:1710.02608

    math.OC cs.DS math.MG

    The Minimum Euclidean-Norm Point on a Convex Polytope: Wolfe's Combinatorial Algorithm is Exponential

    Authors: Jesus De Loera, Jamie Haddock, Luis Rademacher

    Abstract: The complexity of Philip Wolfe's method for the minimum Euclidean-norm point problem over a convex polytope has remained unknown since he proposed the method in 1974. The method is important because it is used as a subroutine for one of the most practical algorithms for submodular function minimization. We present the first example that Wolfe's method takes exponential time. Additionally, we impro…

    Submitted 3 November, 2017; v1 submitted 6 October, 2017; originally announced October 2017.

    MSC Class: 90C20; 90C27; 90C60