Showing 1–33 of 33 results for author: Bertozzi, A L

  1. arXiv:2406.13781  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    A Primal-Dual Framework for Transformers and Neural Networks

    Authors: Tan M. Nguyen, Tam Nguyen, Nhat Ho, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Self-attention is key to the remarkable success of transformers in sequence modeling tasks including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics and experience. To provide a principled framework for constructing attention layers in transformers, we show that the self-attention corresp…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2023, 26 pages, 4 figures, 14 tables

  2. arXiv:2311.14740  [pdf, other]

    cs.CL

    AutoKG: Efficient Automated Knowledge Graph Generation for Language Models

    Authors: Bohan Chen, Andrea L. Bertozzi

    Abstract: Traditional methods of linking large language models (LLMs) to knowledge bases via the semantic similarity search often fall short of capturing complex relational dynamics. To address these limitations, we introduce AutoKG, a lightweight and efficient approach for automated knowledge graph (KG) construction. For a given knowledge base consisting of text blocks, AutoKG first extracts keywords using…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 10 pages, accepted by IEEE BigData 2023 as a workshop paper in GTA3

  3. arXiv:2311.11163  [pdf, other]

    cs.SI stat.AP stat.CO

    Hate speech and hate crimes: a data-driven study of evolving discourse around marginalized groups

    Authors: Malvina Bozhidarova, Jonathn Chang, Aaishah Ale-rasool, Yuxiang Liu, Chongyao Ma, Andrea L. Bertozzi, P. Jeffrey Brantingham, Junyuan Lin, Sanjukta Krishnagopal

    Abstract: This study explores the dynamic relationship between online discourse, as observed in tweets, and physical hate crimes, focusing on marginalized groups. Leveraging natural language processing techniques, including keyword extraction and topic modeling, we analyze the evolution of online discourse after events affecting these groups. Examining sentiment and polarizing tweets, we establish correlati…

    Submitted 18 November, 2023; originally announced November 2023.

  4. arXiv:2307.10495  [pdf, other]

    cs.LG cs.CV eess.SP

    Novel Batch Active Learning Approach and Its Application to Synthetic Aperture Radar Datasets

    Authors: James Chapman, Bohan Chen, Zheng Tan, Jeff Calder, Kevin Miller, Andrea L. Bertozzi

    Abstract: Active learning improves the performance of machine learning methods by judiciously selecting a limited number of unlabeled data points to query for labels, with the aim of maximally improving the underlying classifier's performance. Recent gains have been made using sequential active learning for synthetic aperture radar (SAR) data arXiv:2204.00005. In each iteration, sequential active learning s…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 16 pages, 7 figures, Preprint

    ACM Class: I.2.6; I.2.10; I.4.0; I.4.9

    Journal ref: Proc. SPIE. Algorithms for Synthetic Aperture Radar Imagery XXX (Vol. 12520, pp. 96-111). 13 June 2023

  5. Active Learning of Non-semantic Speech Tasks with Pretrained Models

    Authors: Harlin Lee, Aaqib Saeed, Andrea L. Bertozzi

    Abstract: Pretraining neural networks with massive unlabeled datasets has become popular as it equips the deep models with a better prior to solve downstream tasks. However, this approach generally assumes that the downstream tasks have access to annotated data of sufficient size. In this work, we propose ALOE, a novel system for improving the data- and label-efficiency of non-semantic speech tasks with act…

    Submitted 25 February, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: Accepted at: ICASSP'23, Code: https://github.com/HarlinLee/ALOE

  6. arXiv:2204.08621  [pdf, other]

    math.NA cs.LG

    Proximal Implicit ODE Solvers for Accelerating Learning Neural ODEs

    Authors: Justin Baker, Hedi Xia, Yiwei Wang, Elena Cherkaev, Akil Narayan, Long Chen, Jack Xin, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

    Abstract: Learning neural ODEs often requires solving very stiff ODE systems, primarily using explicit adaptive step size ODE solvers. These solvers are computationally expensive, requiring the use of tiny step sizes for numerical stability and accuracy guarantees. This paper considers learning neural ODEs using implicit ODE solvers of different orders leveraging proximal operators. The proximal implicit so…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 20 pages, 7 figures

    MSC Class: 68T07; 65L04 ACM Class: I.2

  7. arXiv:2204.00005  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV math.NA

    Graph-based Active Learning for Semi-supervised Classification of SAR Data

    Authors: Kevin Miller, John Mauro, Jason Setiadi, Xoaquin Baca, Zhan Shi, Jeff Calder, Andrea L. Bertozzi

    Abstract: We present a novel method for classification of Synthetic Aperture Radar (SAR) data by combining ideas from graph-based learning and neural network methods within an active learning framework. Graph-based methods in machine learning are based on a similarity graph constructed from the data. When the data consists of raw images composed of scenes, extraneous information can make the classification…

    Submitted 30 March, 2022; originally announced April 2022.

    MSC Class: 68R10; 68T07; 68T05 ACM Class: I.2.6; I.2.10; I.4.0; I.4.9

  8. arXiv:2112.15486  [pdf, other]

    cs.NI cs.DC cs.LG math.NA

    Efficient and Reliable Overlay Networks for Decentralized Federated Learning

    Authors: Yifan Hua, Kevin Miller, Andrea L. Bertozzi, Chen Qian, Bao Wang

    Abstract: We propose near-optimal overlay networks based on $d$-regular expander graphs to accelerate decentralized federated learning (DFL) and improve its generalization. In DFL a massive number of clients are connected by an overlay network, and they solve machine learning problems collaboratively without sharing raw data. Our overlay network design integrates spectral graph theory and the theoretical co…

    Submitted 12 December, 2021; originally announced December 2021.

    Comments: 25 pages, 8 figures

    MSC Class: 65B99; 68T01; 68T09; 68W15

  9. arXiv:2110.07739  [pdf, other]

    stat.ML cs.LG

    Model-Change Active Learning in Graph-Based Semi-Supervised Learning

    Authors: Kevin Miller, Andrea L. Bertozzi

    Abstract: Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier. A challenge is to identify which points to label to best improve performance while limiting the number of new labels. "Model-change" active learning quantifies the resulting change incurred in the classifier by introducing the additional…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: Submitted to SIAM Journal on Mathematics of Data Science (SIMODS)

  10. arXiv:2110.04932  [pdf, other]

    cs.SI cs.CL

    An Analysis of COVID-19 Knowledge Graph Construction and Applications

    Authors: Dominic Flocco, Bryce Palmer-Toy, Ruixiao Wang, Hongyu Zhu, Rishi Sonthalia, Junyuan Lin, Andrea L. Bertozzi, P. Jeffrey Brantingham

    Abstract: The construction and application of knowledge graphs have seen a rapid increase across many disciplines in recent years. Additionally, the problem of uncovering relationships between developments in the COVID-19 pandemic and social media behavior is of great interest to researchers hoping to curb the spread of the disease. In this paper we present a knowledge graph constructed from COVID-19 relate…

    Submitted 10 October, 2021; originally announced October 2021.

  11. arXiv:2110.04840  [pdf, other]

    cs.LG cs.AI math.DS math.NA

    Heavy Ball Neural Ordinary Differential Equations

    Authors: Hedi Xia, Vai Suliafu, Hangjie Ji, Tan M. Nguyen, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

    Abstract: We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference. HBNODEs have two properties that imply practical advantages over NODEs: (i) The adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers,…

    Submitted 10 October, 2021; originally announced October 2021.

    Comments: 23 pages, 9 figures, Accepted for publication at Advances in Neural Information Processing Systems (NeurIPS) 2021

    MSC Class: 68T07 ACM Class: I.2

  12. arXiv:2107.01713  [pdf, other]

    cs.SI math.DS nlin.AO physics.soc-ph q-bio.PE

    A Multilayer Network Model of the Coevolution of the Spread of a Disease and Competing Opinions

    Authors: Kaiyan Peng, Zheng Lu, Vanessa Lin, Michael R. Lindstrom, Christian Parkinson, Chuntian Wang, Andrea L. Bertozzi, Mason A. Porter

    Abstract: During the COVID-19 pandemic, conflicting opinions on physical distancing swept across social media, affecting both human behavior and the spread of COVID-19. Inspired by such phenomena, we construct a two-layer multiplex network for the coupled spread of a disease and conflicting opinions. We model each process as a contagion. On one layer, we consider the concurrent evolution of two opinions --…

    Submitted 4 July, 2021; originally announced July 2021.

    MSC Class: 91D30; 92D30; 37N25

  13. arXiv:2105.10650  [pdf]

    physics.med-ph cs.CV eess.IV

    Post-Radiotherapy PET Image Outcome Prediction by Deep Learning under Biological Model Guidance: A Feasibility Study of Oropharyngeal Cancer Application

    Authors: Hangjie Ji, Kyle Lafata, Yvonne Mowery, David Brizel, Andrea L. Bertozzi, Fang-Fang Yin, Chunhao Wang

    Abstract: This paper develops a method of biologically guided deep learning for post-radiation FDG-PET image outcome prediction based on pre-radiation images and radiotherapy dose information. Based on the classic reaction-diffusion mechanism, a novel biological model was proposed using a partial differential equation that incorporates spatial radiation dose distribution as a patient-specific treatment info…

    Submitted 22 May, 2021; originally announced May 2021.

    Comments: 26 pages, 5 figures

  14. Posterior Consistency of Semi-Supervised Regression on Graphs

    Authors: Andrea L. Bertozzi, Bamdad Hosseini, Hao Li, Kevin Miller, Andrew M. Stuart

    Abstract: Graph-based semi-supervised regression (SSR) is the problem of estimating the value of a function on a weighted graph from its values (labels) on a small subset of the vertices. This paper is concerned with the consistency of SSR in the context of classification, in the setting where the labels have small noise and the underlying graph weighting is consistent with well-clustered nodes. We present…

    Submitted 24 March, 2021; v1 submitted 24 July, 2020; originally announced July 2020.

  15. arXiv:2007.11126  [pdf, other]

    stat.ML cs.LG

    Efficient Graph-Based Active Learning with Probit Likelihood via Gaussian Approximations

    Authors: Kevin Miller, Hao Li, Andrea L. Bertozzi

    Abstract: We present a novel adaptation of active learning to graph-based semi-supervised learning (SSL) under non-Gaussian Bayesian models. We present an approximation of non-Gaussian distributions to adapt previously Gaussian-based acquisition functions to these more general cases. We develop an efficient rank-one update for applying "look-ahead" based methods as well as model retraining. We also introduc…

    Submitted 21 July, 2020; originally announced July 2020.

    Comments: Accepted in ICML Workshop on Real World Experiment Design and Active Learning 2020

  16. arXiv:2006.06919  [pdf, other]

    cs.LG math.DS stat.ML

    MomentumRNN: Integrating Momentum into Recurrent Neural Networks

    Authors: Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

    Abstract: Designing deep neural networks is an art that often involves an expensive search over candidate architectures. To overcome this for recurrent neural nets (RNNs), we establish a connection between the hidden state dynamics in an RNN and gradient descent (GD). We then integrate momentum into this framework and propose a new family of RNNs, called {\em MomentumRNNs}. We theoretically prove and numeri…

    Submitted 11 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 21 pages, 11 figures, Accepted for publication at Advances in Neural Information Processing Systems (NeurIPS) 2020

    MSC Class: 68T07 ACM Class: I.2

    Journal ref: Advances in Neural Information Processing Systems (NeurIPS) 2020

  17. arXiv:2003.00631  [pdf, other]

    cs.LG cs.AI stat.ML

    Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets

    Authors: Thu Dinh, Bao Wang, Andrea L. Bertozzi, Stanley J. Osher

    Abstract: Deep neural nets (DNNs) compression is crucial for adaptation to mobile devices. Though many successful algorithms exist to compress naturally trained DNNs, developing efficient and stable compression algorithms for robustly trained DNNs remains widely open. In this paper, we focus on a co-design of efficient DNN compression algorithms and sparse neural architectures for robust and accurate deep l…

    Submitted 1 March, 2020; originally announced March 2020.

    Comments: 16 pages, 7 figures

    MSC Class: 68T01

  18. arXiv:2002.10583  [pdf, other]

    cs.LG cs.NE stat.ML

    Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent

    Authors: Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up the convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimizatio…

    Submitted 26 April, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: 35 pages, 16 figures, 18 tables

  19. arXiv:1902.05113  [pdf, other]

    cs.LG math.OC stat.ML

    A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting

    Authors: Zhijian Li, Xiyang Luo, Bao Wang, Andrea L. Bertozzi, Jack Xin

    Abstract: We study epidemic forecasting on real-world health data by a graph-structured recurrent neural network (GSRNN). We achieve state-of-the-art forecasting accuracy on the benchmark CDC dataset. To improve model efficiency, we sparsify the network weights via transformed-$\ell_1$ penalty and maintain prediction accuracy at the same level with 70% of the network weights being zero.

    Submitted 13 February, 2019; originally announced February 2019.

  20. arXiv:1811.06321  [pdf, other]

    cs.SI eess.SP nlin.AO physics.soc-ph stat.ML

    Multivariate Spatiotemporal Hawkes Processes and Network Reconstruction

    Authors: Baichuan Yuan, Hao Li, Andrea L. Bertozzi, P. Jeffrey Brantingham, Mason A. Porter

    Abstract: There is often latent network structure in spatial and temporal data and the tools of network analysis can yield fascinating insights into such data. In this paper, we develop a nonparametric method for network reconstruction from spatiotemporal data sets using multivariate Hawkes processes. In contrast to prior work on network reconstruction with point-process models, which has often focused on e…

    Submitted 15 November, 2018; originally announced November 2018.

  21. arXiv:1809.08516  [pdf, other]

    cs.LG math.NA stat.ML

    Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

    Authors: Bao Wang, Alex T. Lin, Wei Zhu, Penghang Yin, Andrea L. Bertozzi, Stanley J. Osher

    Abstract: We improve the robustness of Deep Neural Net (DNN) to adversarial attacks by using an interpolating function as the output activation. This data-dependent activation remarkably improves both the generalization and robustness of DNN. In the CIFAR10 benchmark, we raise the robust accuracy of the adversarially trained ResNet20 from $\sim 46\%$ to $\sim 69\%$ under the state-of-the-art Iterative Fast…

    Submitted 29 April, 2020; v1 submitted 22 September, 2018; originally announced September 2018.

    Comments: 17 pages, 6 figures

    MSC Class: 68Pxx

    Journal ref: Inverse Problems and Imaging, 2020

  22. arXiv:1806.02485  [pdf, other]

    cs.SI cond-mat.stat-mech math.ST nlin.AO stat.ML

    Stochastic Block Models are a Discrete Surface Tension

    Authors: Zachary M. Boyd, Mason A. Porter, Andrea L. Bertozzi

    Abstract: Networks, which represent agents and interactions between them, arise in myriad applications throughout the sciences, engineering, and even the humanities. To understand large-scale structure in a network, a common task is to cluster a network's nodes into sets called "communities", such that there are dense connections within communities but sparse connections between them. A popular and statisti…

    Submitted 24 March, 2019; v1 submitted 6 June, 2018; originally announced June 2018.

    Comments: to appear in Journal of Nonlinear Science

    MSC Class: 65K10; 49M20; 35Q56; 62H30; 91C20; 91D30; 94C15

  23. arXiv:1804.00684  [pdf, other]

    cs.LG math.NA stat.ML

    Graph-Based Deep Modeling and Real Time Forecasting of Sparse Spatio-Temporal Data

    Authors: Bao Wang, Xiyang Luo, Fangbo Zhang, Baichuan Yuan, Andrea L. Bertozzi, P. Jeffrey Brantingham

    Abstract: We present a generic framework for spatio-temporal (ST) data modeling, analysis, and forecasting, with a special focus on data that is sparse in both space and time. Our multi-scaled framework is a seamless coupling of two major components: a self-exciting point process that models the macroscale statistical behaviors of the ST data and a graph structured recurrent neural network (GSRNN) to discov…

    Submitted 2 April, 2018; originally announced April 2018.

    Comments: 9 pages, 19 figures

    MSC Class: 65-06

  24. arXiv:1711.08833  [pdf, other]

    cs.LG math.NA stat.ML

    Deep Learning for Real-Time Crime Forecasting and its Ternarization

    Authors: Bao Wang, Penghang Yin, Andrea L. Bertozzi, P. Jeffrey Brantingham, Stanley J. Osher, Jack Xin

    Abstract: Real-time crime forecasting is important. However, accurate prediction of when and where the next crime will happen is difficult. No known physical model provides a reasonable approximation to such a complex system. Historical crime data are sparse in both space and time and the signal of interests is weak. In this work, we first present a proper representation of crime data. We then adapt the spa…

    Submitted 23 November, 2017; originally announced November 2017.

    Comments: 14 pages, 7 figures

    MSC Class: 62-07

  25. arXiv:1707.03340  [pdf, other]

    math.NA cs.LG stat.ML

    Deep Learning for Real Time Crime Forecasting

    Authors: Bao Wang, Duo Zhang, Duanhao Zhang, P. Jeffrey Brantingham, Andrea L. Bertozzi

    Abstract: Accurate real time crime prediction is a fundamental issue for public safety, but remains a challenging problem for the scientific community. Crime occurrences depend on many complex factors. Compared to many predictable events, crime is sparse. At different spatio-temporal scales, crime distributions display dramatically different patterns. These distributions are of very low regularity in both s…

    Submitted 9 July, 2017; originally announced July 2017.

    Comments: 4 pages, 6 figures, NOLTA, 2017

    MSC Class: 68T05

  26. arXiv:1704.02955  [pdf, other]

    cs.DB

    Unsupervised record matching with noisy and incomplete data

    Authors: Yves van Gennip, Blake Hunter, Anna Ma, Daniel Moyer, Ryan de Vera, Andrea L. Bertozzi

    Abstract: We consider the problem of duplicate detection in noisy and incomplete data: given a large data set in which each record has multiple entries (attributes), detect which distinct records refer to the same real world entity. This task is complicated by noise (such as misspellings) and missing data, which can lead to records being different, despite referring to the same entity. Our method consists o…

    Submitted 30 April, 2018; v1 submitted 10 April, 2017; originally announced April 2017.

    Comments: 24 pages, 17 figures; this second version has various significant updates compared to version 1 as a result of the peer review process prior to journal publication; we thank the reviewers for their comments

  27. arXiv:1703.08816  [pdf, other]

    cs.LG stat.ML

    Uncertainty quantification in graph-based classification of high dimensional data

    Authors: Andrea L. Bertozzi, Xiyang Luo, Andrew M. Stuart, Konstantinos C. Zygalakis

    Abstract: Classification of high dimensional data finds wide-ranging applications. In many of these applications equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of, a variety of Bayesian models for the task of binary classification; via the posterior distr…

    Submitted 8 February, 2018; v1 submitted 26 March, 2017; originally announced March 2017.

    Comments: 33 pages, 14 figures

  28. Crime Topic Modeling

    Authors: Da Kuang, P. Jeffrey Brantingham, Andrea L. Bertozzi

    Abstract: The classification of crime into discrete categories entails a massive loss of information. Crimes emerge out of a complex mix of behaviors and situations, yet most of these details cannot be captured by singular crime type labels. This information loss impacts our ability to not only understand the causes of crime, but also how to develop optimal crime prevention strategies. We apply machine lear…

    Submitted 6 August, 2018; v1 submitted 5 January, 2017; originally announced January 2017.

    Comments: 47 pages, 4 tables, 7 figures

    Journal ref: Kuang, D., Brantingham, P. J., & Bertozzi, A. L. (2017). Crime topic modeling. Crime Science, 6(1), 12

  29. Unsupervised Classification in Hyperspectral Imagery with Nonlocal Total Variation and Primal-Dual Hybrid Gradient Algorithm

    Authors: Wei Zhu, Victoria Chayes, Alexandre Tiard, Stephanie Sanchez, Devin Dahlberg, Andrea L. Bertozzi, Stanley Osher, Dominique Zosso, Da Kuang

    Abstract: In this paper, a graph-based nonlocal total variation method (NLTV) is proposed for unsupervised classification of hyperspectral images (HSI). The variational problem is solved by the primal-dual hybrid gradient (PDHG) algorithm. By squaring the labeling function and using a stable simplex clustering routine, an unsupervised clustering method with random initialization can be implemented. The effe…

    Submitted 13 February, 2017; v1 submitted 27 April, 2016; originally announced April 2016.

  30. arXiv:1510.08106  [pdf, other]

    physics.soc-ph cs.SI

    Growth and Containment of a Hierarchical Criminal Network

    Authors: Charles Z. Marshak, M. Puck Rombach, Andrea L. Bertozzi, Maria R. D'Orsogna

    Abstract: We model the hierarchical evolution of an organized criminal network via antagonistic recruitment and pursuit processes. Within the recruitment phase, a criminal kingpin enlists new members into the network, who in turn seek out other affiliates. New recruits are linked to established criminals according to a probability distribution that depends on the current network structure. At the same time,…

    Submitted 15 January, 2016; v1 submitted 27 October, 2015; originally announced October 2015.

    Comments: 16 pages, 11 Figures; New title; Updated figures with color scheme better suited for colorblind readers and for gray scale printing

  31. arXiv:1304.4679  [pdf, other]

    cs.SI math.OC physics.soc-ph

    A Method Based on Total Variation for Network Modularity Optimization using the MBO Scheme

    Authors: Huiyi Hu, Thomas Laurent, Mason A. Porter, Andrea L. Bertozzi

    Abstract: The study of network structure is pervasive in sociology, biology, computer science, and many other disciplines. One of the most important areas of network science is the algorithmic detection of cohesive groups of nodes called "communities". One popular approach to find communities is to maximize a quality function known as {\em modularity} to achieve some sort of optimal clustering of nodes. In…

    Submitted 17 April, 2013; originally announced April 2013.

    Comments: 23 pages

    MSC Class: 62H30; 91C20; 91D30; 94C15

  32. arXiv:1211.7180  [pdf, other]

    cs.SI cs.CV physics.data-an physics.soc-ph

    Multislice Modularity Optimization in Community Detection and Image Segmentation

    Authors: Huiyi Hu, Yves van Gennip, Blake Hunter, Mason A. Porter, Andrea L. Bertozzi

    Abstract: Because networks can be used to represent many complex systems, they have attracted considerable attention in physics, computer science, sociology, and many other disciplines. One of the most important areas of network science is the algorithmic detection of cohesive groups (i.e., "communities") of nodes. In this paper, we algorithmically detect communities in social networks and image data by opt…

    Submitted 30 November, 2012; originally announced November 2012.

    Comments: 3 pages, 2 figures, to appear in IEEE International Conference on Data Mining PhD forum conference proceedings

  33. arXiv:1206.4969  [pdf, other]

    stat.AP cs.SI physics.soc-ph

    Community detection using spectral clustering on sparse geosocial data

    Authors: Yves van Gennip, Blake Hunter, Raymond Ahn, Peter Elliott, Kyle Luh, Megan Halvorson, Shannon Reid, Matt Valasik, James Wo, George E. Tita, Andrea L. Bertozzi, P. Jeffrey Brantingham

    Abstract: In this article we identify social communities among gang members in the Hollenbeck policing district in Los Angeles, based on sparse observations of a combination of social interactions and geographic locations of the individuals. This information, coming from LAPD Field Interview cards, is used to construct a similarity graph for the individuals. We use spectral clustering to identify clusters i…

    Submitted 8 November, 2012; v1 submitted 21 June, 2012; originally announced June 2012.

    Comments: 22 pages, 6 figures (with subfigures)

    MSC Class: 62H30; 91C20; 91D30; 94C15