Showing 1–38 of 38 results for author: Murphy, T B

  1. arXiv:2409.01874  [pdf, other]

    stat.ME stat.AP

    Partial membership models for soft clustering of multivariate football player performance data

    Authors: Emiliano Seri, Roberto Rocci, Thomas Brendan Murphy

    Abstract: The standard mixture modelling framework has been widely used to study heterogeneous populations, by modelling them as being composed of a finite number of homogeneous sub-populations. However, the standard mixture model assumes that each data point belongs to one and only one mixture component, or cluster, but when data points have fractional membership in multiple clusters this assumption is unr…

    Submitted 3 September, 2024; originally announced September 2024.

  2. arXiv:2404.05566  [pdf, other]

    stat.AP

    Hausdorff Distance-Based Record Linkage for Improved Matching of Households and Individuals in Different Databases

    Authors: Thais Pacheco Menezes, Thomas Brendan Murphy, Michael Fop

    Abstract: Matching households and individuals across different databases poses challenges due to the lack of unique identifiers, typographical errors, and changes in attributes over time. Record linkage tools play a crucial role in overcoming these difficulties. This paper presents a multi-step record linkage procedure that incorporates household information to enhance the entity-matching process across mul…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 20 pages, 5 figures, 10 tables

  3. arXiv:2310.11095  [pdf]

    stat.OT

    Interview with Adrian Raftery

    Authors: Leontine Alkema, Thomas Brendan Murphy, Adrian E. Raftery

    Abstract: Professor Adrian E. Raftery is the Boeing International Professor of Statistics and Sociology, and an adjunct professor of Atmospheric Sciences, at the University of Washington in Seattle. He was born in Dublin, Ireland, and obtained a B.A. in Mathematics and an M.Sc. in Statistics and Operations Research at Trinity College Dublin. He obtained a doctorate in mathematical statistics from the Univer…

    Submitted 31 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 17 pages, 8 figures

  4. arXiv:2211.01938  [pdf, other]

    stat.ME

    A family of mixture models for beta valued DNA methylation data

    Authors: Koyel Majumdar, Romina Silva, Antoinette Sabrina Perry, Ronald William Watson, Andrea Rau, Florence Jaffrezic, Thomas Brendan Murphy, Isobel Claire Gormley

    Abstract: As hypermethylation of promoter cytosine-guanine dinucleotide (CpG) islands has been shown to silence tumour suppressor genes, identifying differentially methylated CpG sites between different samples can assist in understanding disease. Differentially methylated CpG sites (DMCs) can be identified using moderated t-tests or nonparametric tests, but this typically requires the use of data transform…

    Submitted 18 March, 2024; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: 27 pages, 4 figures

  5. Adaptive data collection for intra-individual studies affected by adherence

    Authors: Greta Monacelli, Lili Zhang, Winfried Schlee, Berthold Langguth, Tomás E. Ward, Thomas B. Murphy

    Abstract: Recently the use of mobile technologies in Ecological Momentary Assessments (EMA) and Interventions (EMI) has made it easier to collect data suitable for intra-individual variability studies in the medical field. Nevertheless, especially when self-reports are used during the data collection process, there are difficulties in balancing data quality and the burden placed on the subjects. In this pap…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: 12 pages, 4 figures

  6. arXiv:2106.04705  [pdf, other]

    physics.soc-ph q-bio.PE

    Calibrating COVID-19 SEIR models with time-varying effective contact rates

    Authors: James P. Gleeson, Thomas Brendan Murphy, Joseph D. O'Brien, Nial Friel, Norma Bargary, David J. P. O'Sullivan

    Abstract: We describe the population-based SEIR (susceptible, exposed, infected, removed) model developed by the Irish Epidemiological Modelling Advisory Group (IEMAG), which advises the Irish government on COVID-19 responses. The model assumes a time-varying effective contact rate (equivalently, a time-varying reproduction number) to model the effect of non-pharmaceutical interventions. A crucial technical…

    Submitted 8 June, 2021; originally announced June 2021.

    Journal ref: Phil. Trans. R. Soc. A 380: 20210120. (2021)

  7. Identifying Brexit voting patterns in the British House of Commons: an analysis based on Bayesian mixture models with flexible concomitant covariate effects

    Authors: Marco Berrettini, Giuliano Galimberti, Saverio Ranciati, Thomas Brendan Murphy

    Abstract: Brexit and its implications are an ongoing topic of interest since the Brexit referendum in 2016. In 2019 the House of Commons held a number of "indicative" and "meaningful" votes as part of the Brexit approval process. The voting behaviour of members of the parliament in these votes is investigated to gain insight into the Brexit approval process. In particular, a mixture model with concomitant c…

    Submitted 12 February, 2024; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: This is the Authors' original version of an article accepted for publication in the Journal of the Royal Statistical Society - Series C published by Oxford University Press

  8. arXiv:2102.01982  [pdf, other]

    stat.ME stat.CO stat.ML

    Unobserved classes and extra variables in high-dimensional discriminant analysis

    Authors: Michael Fop, Pierre-Alexandre Mattei, Charles Bouveyron, Thomas Brendan Murphy

    Abstract: In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was collected. In this situation, the classifier built in the learning phase needs to adapt to handle po…

    Submitted 3 February, 2021; originally announced February 2021.

    Comments: 29 pages, 29 figures

  9. arXiv:2101.12499  [pdf, other]

    stat.ME stat.AP

    Parsimonious Bayesian Factor Analysis for modelling latent structures in spectroscopy data

    Authors: Alessandro Casa, Tom F. O'Callaghan, Thomas Brendan Murphy

    Abstract: In recent years animal diet has been receiving increased attention, in particular examining the impact of pasture-based feeding strategies on the quality of milk and dairy products, in line with the increased prevalence of grass-fed dairy products appearing on market shelves. To date, there are limited testing methods available for the verification of grass-fed dairy therefore these products are s…

    Submitted 29 January, 2021; originally announced January 2021.

    Comments: 23 pages, 6 figures

  10. arXiv:2010.10415  [pdf, other]

    stat.AP

    Robust variable selection in the framework of classification with label noise and outliers: applications to spectroscopic data in agri-food

    Authors: Andrea Cappozzo, Ludovic Duponchel, Francesca Greselin, Thomas Brendan Murphy

    Abstract: Classification of high-dimensional spectroscopic data is a common task in analytical chemistry. Well-established procedures like support vector machines (SVMs) and partial least squares discriminant analysis (PLS-DA) are the most common methods for tackling this supervised learning problem. Nonetheless, interpretation of these models remains sometimes difficult, and solutions based on feature sele…

    Submitted 28 January, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

  11. arXiv:2007.14810  [pdf, other]

    stat.AP

    Robust variable selection for model-based learning in presence of adulteration

    Authors: Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

    Abstract: The problem of identifying the most discriminating features when performing supervised learning has been extensively investigated. In particular, several methods for variable selection in model-based classification have been proposed. Surprisingly, the impact of outliers and wrongly labeled units on the determination of relevant predictors has received far less attention, with almost no dedicated…

    Submitted 15 December, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

  12. arXiv:2006.02954  [pdf, other]

    stat.ML cs.LG stat.ME

    Handling missing data in model-based clustering

    Authors: Alessio Serafini, Thomas Brendan Murphy, Luca Scrucca

    Abstract: Gaussian Mixture models (GMMs) are a powerful tool for clustering, classification and density estimation when clustering structures are embedded in the data. The presence of missing values can largely impact the GMMs estimation process, thus handling missing data turns out to be a crucial point in clustering, classification and density estimation. Several techniques have been developed to impute t…

    Submitted 4 June, 2020; originally announced June 2020.

  13. Anomaly and Novelty detection for robust semi-supervised learning

    Authors: Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

    Abstract: Three important issues are often encountered in Supervised and Semi-Supervised Classification: class-memberships are unreliable for some training units (label noise), a proportion of observations might depart from the main structure of the data (outliers) and new groups in the test set may have not been encountered earlier in the learning phase (unobserved classes). The present work introduces a r…

    Submitted 29 May, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

  14. arXiv:1908.07963  [pdf, ps, other]

    stat.ME stat.AP

    Clustering Longitudinal Life-Course Sequences Using Mixtures of Exponential-Distance Models

    Authors: Keefe Murphy, Thomas Brendan Murphy, Raffaella Piccarreta, Isobel Claire Gormley

    Abstract: Sequence analysis is an increasingly popular approach for analysing life courses represented by ordered collections of activities experienced by subjects over time. Here, we analyse a survey data set containing information on the career trajectories of a cohort of Northern Irish youths tracked between the ages of 16 and 22. We propose a novel, model-based clustering approach suited to the analysis…

    Submitted 21 December, 2021; v1 submitted 21 August, 2019; originally announced August 2019.

    Comments: Published in Journal of the Royal Statistical Society: Series A (Statistics in Society)

    Journal ref: Journal of the Royal Statistical Society: Series A (Statistics in Society), 184(4): 1414-1451 (2021)

  15. A robust approach to model-based classification based on trimming and constraints

    Authors: Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

    Abstract: In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations, namely outliers and data with incorrect labels, can strongly undermine the classifier performance, especially if the training size is small. The present work introd…

    Submitted 5 August, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

  16. arXiv:1904.04699  [pdf, other]

    stat.AP

    Bivariate Gamma Mixture of Experts Models for Joint Insurance Claims Modeling

    Authors: Sen Hu, T Brendan Murphy, Adrian O'Hagan

    Abstract: In general insurance, risks from different categories are often modeled independently and their sum is regarded as the total risk the insurer takes on in exchange for a premium. The dependence from multiple risks is generally neglected even when correlation could exist, for example a single car accident may result in claims from multiple risk categories. It is desirable to take the covariance of d…

    Submitted 9 April, 2019; originally announced April 2019.

  17. arXiv:1808.05185  [pdf, other]

    stat.ME

    Model-based clustering for random hypergraphs

    Authors: Tin Lok James Ng, Thomas Brendan Murphy

    Abstract: A probabilistic model for random hypergraphs is introduced to represent unary, binary and higher order interactions among objects in real-world problems. This model is an extension of the Latent Class Analysis model, which captures clustering structures among objects. An EM (expectation maximization) algorithm with MM (minorization maximization) steps is developed to perform parameter estimation w…

    Submitted 15 August, 2018; originally announced August 2018.

    Comments: 27 pages, 6 figures

  18. arXiv:1807.06063  [pdf, other]

    stat.AP

    Modeling the social media relationships of Irish politicians using a generalized latent space stochastic blockmodel

    Authors: Tin Lok James Ng, Thomas Brendan Murphy, Ted Westling, Tyler H. McCormick, Bailey K. Fosdick

    Abstract: Dáil Éireann is the principal chamber of the Irish parliament. The 31st Dáil was in session from March 11th, 2011 to February 6th, 2016. Many of the members of the Dáil were active on social media and many were Twitter users who followed other members of the Dáil. The pattern of following amongst these politicians provides ins…

    Submitted 13 December, 2020; v1 submitted 16 July, 2018; originally announced July 2018.

    Comments: 31 pages, 9 figures

    MSC Class: 62Pxx

  19. Modelling heterogeneity in Latent Space Models for Multidimensional Networks

    Authors: Silvia D'Angelo, Marco Alfò, Thomas Brendan Murphy

    Abstract: Multidimensional network data can have different levels of complexity, as nodes may be characterized by heterogeneous individual-specific features, which may vary across the networks. This paper introduces a class of models for multidimensional network data, where different levels of heterogeneity within and between networks can be considered. The proposed framework is developed in the family of l…

    Submitted 7 June, 2019; v1 submitted 10 July, 2018; originally announced July 2018.

    Journal ref: Stat. Neerl. 74(3): 324-341 (August 2020)

  20. Latent Space Modeling of Multidimensional Networks with Application to the Exchange of Votes in Eurovision Song Contest

    Authors: Silvia D'Angelo, Thomas Brendan Murphy, Marco Alfò

    Abstract: The Eurovision Song Contest is a popular TV singing competition held annually among country members of the European Broadcasting Union. In this competition, each member can be both contestant and jury, as it can participate with a song and/or vote for other countries' tunes. Throughout the years, the voting system has repeatedly been accused of being biased by the presence of tactical voting, acco…

    Submitted 13 March, 2018; originally announced March 2018.

    Journal ref: Ann. Appl. Stat. 13(2): 900-930 (June 2019)

  21. arXiv:1711.07748  [pdf, other]

    stat.ME stat.CO

    Model-based Clustering with Sparse Covariance Matrices

    Authors: Michael Fop, Thomas Brendan Murphy, Luca Scrucca

    Abstract: Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, since the number of model parameters scales quadratically with the number of variables, these models can be easily over-parameterized. For this reason, parsimonious models have been developed via covariance matrix decompositions or assuming local independence. However, these remedies do not…

    Submitted 23 September, 2018; v1 submitted 21 November, 2017; originally announced November 2017.

  22. Gaussian Parsimonious Clustering Models with Covariates and a Noise Component

    Authors: Keefe Murphy, Thomas Brendan Murphy

    Abstract: We consider model-based clustering methods for continuous, correlated data that account for external information available in the presence of mixed-type fixed covariates by proposing the MoEClust suite of models. These models allow different subsets of covariates to influence the component weights and/or component densities by modelling the parameters of the mixture as functions of the covariates.…

    Submitted 13 July, 2021; v1 submitted 15 November, 2017; originally announced November 2017.

    Comments: Published in Advances in Data Analysis and Classification

    Journal ref: Advances in Data Analysis and Classification, 14(2): 293-325 (2020)

  23. arXiv:1710.03704  [pdf, other]

    stat.AP

    Motor Insurance Accidental Damage Claims Modeling with Factor Collapsing and Bayesian Model Averaging

    Authors: Sen Hu, Adrian O'Hagan, Thomas Brendan Murphy

    Abstract: Accidental damage is a typical component of motor insurance claim. Modeling of this nature generally involves analysis of past claim history and different characteristics of the insured objects and the policyholders. Generalized linear models (GLMs) have become the industry's standard approach for pricing and modeling risks of this nature. However, the GLM approach utilizes a single "best" model o…

    Submitted 10 October, 2017; originally announced October 2017.

  24. arXiv:1707.00306  [pdf, other]

    stat.ME stat.AP stat.ML

    Variable Selection Methods for Model-based Clustering

    Authors: Michael Fop, Thomas Brendan Murphy

    Abstract: Model-based clustering is a popular approach for clustering multivariate data which has seen applications in numerous fields. Nowadays, high-dimensional data are more and more common and the model-based clustering approach has adapted to deal with the increasing dimensionality. In particular, the development of variable selection techniques has received a lot of attention and research effort in re…

    Submitted 4 June, 2018; v1 submitted 2 July, 2017; originally announced July 2017.

    Journal ref: Statistics Surveys, 12 (2018) 1-48

  25. arXiv:1608.07618  [pdf, other]

    stat.ME stat.AP

    Multiresolution network models

    Authors: Bailey K. Fosdick, Tyler H. McCormick, Thomas Brendan Murphy, Tin Lok James Ng, Ted Westling

    Abstract: Many existing statistical and machine learning tools for social network analysis focus on a single level of analysis. Methods designed for clustering optimize a global partition of the graph, whereas projection based approaches (e.g. the latent space model in the statistics literature) represent in rich detail the roles of individuals. Many pertinent questions in sociology and economics, however,…

    Submitted 5 July, 2018; v1 submitted 26 August, 2016; originally announced August 2016.

  26. Exponential Family Mixed Membership Models for Soft Clustering of Multivariate Data

    Authors: Arthur White, Thomas Brendan Murphy

    Abstract: For several years, model-based clustering methods have successfully tackled many of the challenges presented by data-analysts. However, as the scope of data analysis has evolved, some problems may be beyond the standard mixture model framework. One such problem is when observations in a dataset come from overlapping clusters, whereby different clusters will possess similar parameters for multiple…

    Submitted 10 August, 2016; originally announced August 2016.

    Journal ref: White, A. & Murphy, T.B. Adv Data Anal Classif (2016) 10: 521

  27. arXiv:1512.03350  [pdf, other]

    stat.ME stat.AP stat.CO

    Variable Selection for Latent Class Analysis with Application to Low Back Pain Diagnosis

    Authors: Michael Fop, Keith Smart, Thomas Brendan Murphy

    Abstract: The identification of most relevant clinical criteria related to low back pain disorders may aid the evaluation of the nature of pain suffered in a way that usefully informs patient assessment and treatment. Data concerning low back pain can be of categorical nature, in the form of a check-list in which each item denotes presence or absence of a clinical condition. Latent class analysis is a model…

    Submitted 5 February, 2018; v1 submitted 10 December, 2015; originally announced December 2015.

    Comments: Published in The Annals of Applied Statistics by the Institute of Mathematical Statistics

    MSC Class: 62H30

    Journal ref: The Annals of Applied Statistics 2017, Vol. 11, No. 4, 2085-2115

  28. arXiv:1510.00551  [pdf, ps, other]

    stat.CO stat.ME

    Investigation of Parameter Uncertainty in Clustering Using a Gaussian Mixture Model Via Jackknife, Bootstrap and Weighted Likelihood Bootstrap

    Authors: Adrian O'Hagan, Thomas Brendan Murphy, Luca Scrucca, Isobel Claire Gormley

    Abstract: Mixture models are a popular tool in model-based clustering. Such a model is often fitted by a procedure that maximizes the likelihood, such as the EM algorithm. At convergence, the maximum likelihood parameter estimates are typically reported, but in most cases little emphasis is placed on the variability associated with these estimates. In part this may be due to the fact that standard errors ar…

    Submitted 22 July, 2019; v1 submitted 2 October, 2015; originally announced October 2015.

  29. arXiv:1506.09035  [pdf, other]

    stat.CO

    Bayesian model averaging in model-based clustering and density estimation

    Authors: Niamh Russell, Thomas Brendan Murphy, Adrian E Raftery

    Abstract: We propose Bayesian model averaging (BMA) as a method for postprocessing the results of model-based clustering. Given a number of competing models, appropriate model summaries are averaged, using the posterior model probabilities, instead of being taken from a single "best" model. We demonstrate the use of BMA in model-based clustering for a number of datasets. We show that BMA provides a useful s…

    Submitted 30 June, 2015; originally announced June 2015.

  30. arXiv:1404.0221  [pdf, other]

    stat.CO stat.ME

    Mixed-Membership of Experts Stochastic Blockmodel

    Authors: Arthur White, Thomas Brendan Murphy

    Abstract: Social network analysis is the study of how links between a set of actors are formed. Typically, it is believed that links are formed in a structured manner, which may be due to, for example, political or material incentives, and which often may not be directly observable. The stochastic blockmodel represents this structure using latent groups which exhibit different connective properties, so that…

    Submitted 1 April, 2014; originally announced April 2014.

    Comments: 32 pages, 8 figures

    Journal ref: Network Science, 4, pp 48-80 (2016)

  31. Bayesian variable selection for latent class analysis using a collapsed Gibbs sampler

    Authors: Arthur White, Jason Wyse, Thomas Brendan Murphy

    Abstract: Latent class analysis is used to perform model based clustering for multivariate categorical responses. Selection of the variables most relevant for clustering is an important task which can affect the quality of clustering considerably. This work considers a Bayesian approach for selecting the number of clusters and the best clustering variables. The main idea is to reformulate the problem of gro…

    Submitted 30 April, 2015; v1 submitted 27 February, 2014; originally announced February 2014.

    Comments: (to appear in Statistics and Computing)

    Journal ref: Statistics and Computing January 2016, Volume 26, Issue 1, pp 511-527

  32. arXiv:1308.3740  [pdf, other]

    stat.AP cs.LG stat.ML

    Standardizing Interestingness Measures for Association Rules

    Authors: Mateen Shaikh, Paul D. McNicholas, M. Luiza Antonie, T. Brendan Murphy

    Abstract: Interestingness measures provide information that can be used to prune or select association rules. A given value of an interestingness measure is often interpreted relative to the overall range of the values that the interestingness measure can take. However, properties of individual association rules restrict the values an interestingness measure can achieve. An interesting measure can be standa…

    Submitted 16 August, 2013; originally announced August 2013.

  33. arXiv:1301.3759  [pdf, other]

    stat.ME stat.AP

    Joint Modelling of Multiple Network Views

    Authors: Isabella Gollini, Thomas Brendan Murphy

    Abstract: Latent space models (LSM) for network data were introduced by Hoff et al. (2002) under the basic assumption that each node of the network has an unknown position in a D-dimensional Euclidean latent space: generally the smaller the distance between two nodes in the latent space, the greater their probability of being connected. In this paper we propose a variational Bayes approach to estimate the i…

    Submitted 25 September, 2014; v1 submitted 16 January, 2013; originally announced January 2013.

    Comments: Main paper and Supplementary material: 37 (27 + 10) pages, 20 (16 + 4) figures

  34. arXiv:1301.2167  [pdf, other]

    stat.ME stat.CO

    Mixture of Latent Trait Analyzers for Model-Based Clustering of Categorical Data

    Authors: Isabella Gollini, Thomas Brendan Murphy

    Abstract: Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary data and/or categorical data, but due to an assumed local independence structure there may not be a corresponden…

    Submitted 19 February, 2013; v1 submitted 10 January, 2013; originally announced January 2013.

    Comments: Accepted to appear in Statistics and Computing; Main paper and supplementary material

  35. arXiv:1211.5037  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Bayesian nonparametric Plackett-Luce models for the analysis of preferences for college degree programmes

    Authors: François Caron, Yee Whye Teh, Thomas Brendan Murphy

    Abstract: In this paper we propose a Bayesian nonparametric model for clustering partial ranking data. We start by developing a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a completely random measure. We characterise the posterior dist…

    Submitted 1 August, 2014; v1 submitted 21 November, 2012; originally announced November 2012.

    Comments: Published at http://dx.doi.org/10.1214/14-AOAS717 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS717

    Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 2, 1145-1181

  36. arXiv:1203.3083  [pdf, other]

    stat.CO

    Clustering in networks with the collapsed Stochastic Block Model

    Authors: Aaron F. McDaid, Thomas Brendan Murphy, Nial Friel, Neil J Hurley

    Abstract: An efficient MCMC algorithm is presented to cluster the nodes of a network such that nodes with similar role in the network are clustered together. This is known as block-modelling or block-clustering. The model is the stochastic blockmodel (SBM) with block parameters integrated out. The resulting marginal distribution defines a posterior over the number of clusters and cluster memberships. Sampli…

    Submitted 8 November, 2012; v1 submitted 14 March, 2012; originally announced March 2012.

    Comments: A later version, called "Improved Bayesian inference for the Stochastic Block Model with application to large networks" has been accepted by 'Computational Statistics and Data Analysis'. Publication date to be confirmed

  37. arXiv:0910.2585  [pdf, ps, other]

    stat.ME stat.AP

    Variable selection and updating in model-based discriminant analysis for high dimensional data with food authenticity applications

    Authors: Thomas Brendan Murphy, Nema Dean, Adrian E. Raftery

    Abstract: Food authenticity studies are concerned with determining if food samples have been correctly labeled or not. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervise…

    Submitted 7 October, 2010; v1 submitted 14 October, 2009; originally announced October 2009.

    Comments: Published at http://dx.doi.org/10.1214/09-AOAS279 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS279

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 1, 396-421

  38. arXiv:0901.4203  [pdf, ps, other]

    stat.AP stat.CO

    A mixture of experts model for rank data with applications in election studies

    Authors: Isobel Claire Gormley, Thomas Brendan Murphy

    Abstract: A voting bloc is defined to be a group of voters who have similar voting preferences. The cleavage of the Irish electorate into voting blocs is of interest. Irish elections employ a "single transferable vote" electoral system; under this system voters rank some or all of the electoral candidates in order of preference. These rank votes provide a rich source of preference information from which…

    Submitted 27 January, 2009; originally announced January 2009.

    Comments: Published at http://dx.doi.org/10.1214/08-AOAS178 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS178

    Journal ref: Annals of Applied Statistics 2008, Vol. 2, No. 4, 1452-1477