Showing 1–11 of 11 results for author: Reid, M D

Searching in archive cs.
  1. arXiv:1606.03203  [pdf, other]

    stat.ML cs.LG

    Causal Bandits: Learning Good Interventions via Causal Inference

    Authors: Finnian Lattimore, Tor Lattimore, Mark D. Reid

    Abstract: We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-arm bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is…

    Submitted 10 June, 2016; originally announced June 2016.

  2. arXiv:1602.02852  [pdf, other]

    stat.ML cs.LG

    Compliance-Aware Bandits

    Authors: Nicolás Della Penna, Mark D. Reid, David Balduzzi

    Abstract: Motivated by clinical trials, we study bandits with observable non-compliance. At each step, the learner chooses an arm, after which, instead of observing only the reward, it also observes the action that took place. We show that such non-compliance can be helpful or hurtful to the learner in general. Unfortunately, naively incorporating compliance information into bandit algorithms loses guarantees on s…

    Submitted 8 February, 2016; originally announced February 2016.

  3. arXiv:1507.02592  [pdf, other]

    cs.LG stat.ML

    Fast rates in statistical and online learning

    Authors: Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson

    Abstract: The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning: a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most…

    Submitted 1 September, 2015; v1 submitted 9 July, 2015; originally announced July 2015.

    Comments: 69 pages, 3 figures

    Journal ref: Journal of Machine Learning Research 16(54):1793-1861, 2015

  4. arXiv:1410.0413  [pdf, other]

    cs.GT cs.AI math.OC

    Risk Dynamics in Trade Networks

    Authors: Rafael M. Frongillo, Mark D. Reid

    Abstract: We introduce a new framework to model interactions among agents which seek to trade to minimize their risk with respect to some future outcome. We quantify this risk using the concept of risk measures from finance, and introduce a class of trade dynamics which allow agents to trade contracts contingent upon the future outcome. We then show that these trade dynamics exactly correspond to a variant…

    Submitted 9 October, 2014; v1 submitted 1 October, 2014; originally announced October 2014.

  5. arXiv:1406.6130  [pdf, other]

    cs.LG

    Generalized Mixability via Entropic Duality

    Authors: Mark D. Reid, Rafael M. Frongillo, Robert C. Williamson, Nishant Mehta

    Abstract: Mixability is a property of a loss which characterizes when fast convergence is possible in the game of prediction with expert advice. We show that a key property of mixability generalizes, and the exp and log operations present in the usual theory are not as special as one might have thought. In doing this we introduce a more general notion of $Φ$-mixability where $Φ$ is a general entropy (i.e., a…

    Submitted 23 June, 2014; originally announced June 2014.

    Comments: 20 pages, 1 figure. Supersedes the work in arXiv:1403.2433 [cs.LG]

  6. arXiv:1403.2433  [pdf, ps, other]

    cs.LG stat.ML

    Generalised Mixability, Constant Regret, and Bayesian Updating

    Authors: Mark D. Reid, Rafael M. Frongillo, Robert C. Williamson

    Abstract: Mixability of a loss is known to characterise when constant regret bounds are achievable in games of prediction with expert advice through the use of Vovk's aggregating algorithm. We provide a new interpretation of mixability via convex analysis that highlights the role of the Kullback-Leibler divergence in its definition. This naturally generalises to what we call $Φ$-mixability where the Bregman…

    Submitted 10 March, 2014; originally announced March 2014.

    Comments: 12 pages

  7. arXiv:1204.3511  [pdf, ps, other]

    cs.SI cs.GT

    Crowd & Prejudice: An Impossibility Theorem for Crowd Labelling without a Gold Standard

    Authors: Nicolás Della Penna, Mark D. Reid

    Abstract: A common use of crowdsourcing is to obtain labels for a dataset. Several algorithms have been proposed to identify uninformative members of the crowd so that their labels can be disregarded and the cost of paying them avoided. One common motivation of these algorithms is to try to do without any initial set of trusted labeled data. We analyse this class of algorithms as mechanisms in a game-theo…

    Submitted 16 April, 2012; originally announced April 2012.

    Comments: Presented at Collective Intelligence conference, 2012 (arXiv:1204.2991)

    Report number: CollectiveIntelligence/2012/33

  8. arXiv:1112.0076  [pdf, other]

    q-fin.TR cs.GT stat.ML

    Bandit Market Makers

    Authors: Nicolas Della Penna, Mark D. Reid

    Abstract: We introduce a modular framework for market making. It combines cost-function based automated market makers with bandit algorithms. We obtain worst-case profit guarantees relative to the best in hindsight within a class of natural "overround" cost functions. This combination allows us to have distribution-free guarantees on the regret of profits while preserving the bounded worst-case losses and…

    Submitted 1 August, 2013; v1 submitted 30 November, 2011; originally announced December 2011.

    Comments: A previous version of this work appeared in the NIPS 2011 Workshop on Computational Social Science and the Wisdom of the Crowds

  9. arXiv:1110.3907  [pdf, ps, other]

    stat.ML cs.AI cs.CV

    AOSO-LogitBoost: Adaptive One-Vs-One LogitBoost for Multi-Class Problem

    Authors: Peng Sun, Mark D. Reid, Jie Zhou

    Abstract: This paper presents an improvement to model learning when using multi-class LogitBoost for classification. Motivated by the statistical view, LogitBoost can be seen as additive tree regression. Two important factors in this setting are: 1) coupled classifier output due to a sum-to-zero constraint, and 2) the dense Hessian matrices that arise when computing tree node split gain and node value fitti…

    Submitted 4 July, 2012; v1 submitted 18 October, 2011; originally announced October 2011.

    Comments: 8-page camera-ready version for ICML 2012

  10. arXiv:1009.3346  [pdf, other]

    cs.LG

    Conditional Random Fields and Support Vector Machines: A Hybrid Approach

    Authors: Qinfeng Shi, Mark D. Reid, Tiberio Caetano

    Abstract: We propose a novel hybrid loss for multiclass and structured prediction problems that is a convex combination of log loss for Conditional Random Fields (CRFs) and a multiclass hinge loss for Support Vector Machines (SVMs). We provide a sufficient condition for when the hybrid loss is Fisher consistent for classification. This condition depends on a measure of dominance between labels - specificall…

    Submitted 17 September, 2010; originally announced September 2010.

    Comments: 16 pages, 3 figures

  11. arXiv:0906.1244  [pdf, other]

    cs.IT

    Generalised Pinsker Inequalities

    Authors: Mark D. Reid, Robert C. Williamson

    Abstract: We generalise the classical Pinsker inequality, which relates variational divergence to Kullback-Leibler divergence, in two ways: we consider arbitrary f-divergences in place of KL divergence, and we assume knowledge of a sequence of values of generalised variational divergences. We then develop a best possible inequality for this doubly generalised situation. Specialising our result to the classi…

    Submitted 5 June, 2009; originally announced June 2009.

    Comments: 21 pages, 3 figures, accepted to COLT 2009