Search | arXiv e-print repository

Variational Search Distributions

Authors: Daniel M. Steinberg, Rafael Oliveira, Cheng Soon Ong, Edwin V. Bonilla

Abstract: We develop variational search distributions (VSD), a method for finding discrete, combinatorial designs of a rare desired class in a batch sequential manner with a fixed experimental budget. We formalize the requirements and desiderata for this problem and formulate a solution via variational inference that fulfills these. In particular, VSD uses off-the-shelf gradient-based optimization routines, and can take advantage of scalable predictive models. We show that VSD can outperform existing baseline methods on a set of real sequence-design problems in various biological systems.

Submitted 9 September, 2024; originally announced September 2024.

Comments: 16 pages, 5 figures, Appendix material included

ACM Class: G.3; G.2.1; I.2.6

arXiv:2402.03884 [pdf, other]

doi 10.1063/5.0186615

Investigation of the Nonlinear Optical Frequency Conversion in Ultrathin Franckeite Heterostructures

Authors: Alisson R. Cadore, Alexandre S. M. V. Ore, David Steinberg, Juan D. Zapata, Eunézio A. T. de Souza, Dario A. Bahamon, Christiano J. S. de Matos

Abstract: Layered franckeite is a natural superlattice composed of two alternating layers of different compositions, SnS$_2$- and PbS-like. This creates incommensurability between the two species along the planes of the layers, resulting in spontaneous symmetry-break periodic ripples in the \textit{a}-axis orientation. Nevertheless, natural franckeite heterostructure has shown potential for optoelectronic applications mostly because it is a semiconductor with 0.7 eV bandgap, air-stable, and can be easily exfoliated down to ultrathin thicknesses. Here, we demonstrate that few-layer franckeite shows a highly anisotropic nonlinear optical response due to its lattice structure, which allows for the identification of the ripple axis. Moreover, we find that the highly anisotropic third-harmonic emission strongly varies with material thickness. These features are further corroborated by a theoretical nonlinear susceptibility model and the nonlinear transfer matrix method. Overall, our findings help to understand this material and propose a characterization method that could be used in other layered materials and heterostructures to assign their characteristic axes.

Submitted 6 February, 2024; originally announced February 2024.

Comments: 27 pages, 5 figures. The following article has been accepted by the Journal of Applied Physics. After it is published, it will be found by DOI: 10.1063/5.0186615

arXiv:2308.10194 [pdf, other]

Federated Statistical Analysis: Non-parametric Testing and Quantile Estimation

Authors: Ori Becher, Mira Marcus-Kalish, David M. Steinberg

Abstract: The age of big data has fueled expectations for accelerating learning. The availability of large data sets enables researchers to achieve more powerful statistical analyses and enhances the reliability of conclusions, which can be based on a broad collection of subjects. Often such data sets can be assembled only with access to diverse sources; for example, medical research that combines data from multiple centers in a federated analysis. However, these hopes must be balanced against data privacy concerns, which hinder sharing raw data among centers. Consequently, federated analyses typically resort to sharing data summaries from each center. The limitation to summaries carries the risk that it will impair the efficiency of statistical analysis procedures. In this work we take a close look at the effects of federated analysis on two very basic problems, nonparametric comparison of two groups and quantile estimation to describe the corresponding distributions. We also propose a specific privacy-preserving data release policy for federated analysis with the $K$-anonymity criterion, which has been adopted by the Medical Informatics Platform of the European Human Brain Project. Our results show that, for our tasks, there is only a modest loss of statistical efficiency.

Submitted 20 August, 2023; originally announced August 2023.

arXiv:2307.01443 [pdf]

doi 10.1126/science.adg3781

Emissions and Energy Impacts of the Inflation Reduction Act

Authors: John Bistline, Geoffrey Blanford, Maxwell Brown, Dallas Burtraw, Maya Domeshek, Jamil Farbes, Allen Fawcett, Anne Hamilton, Jesse Jenkins, Ryan Jones, Ben King, Hannah Kolus, John Larsen, Amanda Levin, Megan Mahajan, Cara Marcy, Erin Mayfield, James McFarland, Haewon McJeon, Robbie Orvis, Neha Patankar, Kevin Rennert, Christopher Roney, Nicholas Roy, Greg Schivley, et al. (7 additional authors not shown)

Abstract: If goals set under the Paris Agreement are met, the world may hold warming well below 2 °C; however, parties are not on track to deliver these commitments, increasing focus on policy implementation to close the gap between ambition and action. Recently, the US government passed its most prominent piece of climate legislation to date, the Inflation Reduction Act of 2022 (IRA), designed to invest in a wide range of programs that, among other provisions, incentivize clean energy and carbon management, encourage electrification and efficiency measures, reduce methane emissions, promote domestic supply chains, and address environmental justice concerns. IRA's scope and complexity make modeling important to understand impacts on emissions and energy systems. We leverage results from nine independent, state-of-the-art models to examine potential implications of key IRA provisions, showing economy-wide emissions reductions between 43-48% below 2005 by 2035.

Submitted 3 July, 2023; originally announced July 2023.

Journal ref: Science, 380(6652): 1324-1327 (2023)

arXiv:2305.18435 [pdf, other]

Statistically Efficient Bayesian Sequential Experiment Design via Reinforcement Learning with Cross-Entropy Estimators

Authors: Tom Blau, Iadine Chades, Amir Dezfouli, Daniel Steinberg, Edwin V. Bonilla

Abstract: Reinforcement learning can learn amortised design policies for designing sequences of experiments. However, current amortised methods rely on estimators of expected information gain (EIG) that require an exponential number of samples on the magnitude of the EIG to achieve an unbiased estimation. We propose the use of an alternative estimator based on the cross-entropy of the joint model distribution and a flexible proposal distribution. This proposal distribution approximates the true posterior of the model parameters given the experimental history and the design policy. Our method overcomes the exponential-sample complexity of previous approaches and provides more accurate estimates of high EIG values. More importantly, it allows learning of superior design policies, and is compatible with continuous and discrete design spaces, non-differentiable likelihoods and even implicit probabilistic models.

Submitted 4 February, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

arXiv:2304.01490 [pdf, other]

The Economic Effect of Gaining a New Qualification Later in Life

Authors: Finn Lattimore, Daniel M. Steinberg, Anna Zhu

Abstract: Pursuing educational qualifications later in life is an increasingly common phenomenon within OECD countries since technological change and automation continue to drive the evolution of skills needed in many professions. We focus on the causal impacts to economic returns of degrees completed later in life, where motivations and capabilities to acquire additional education may be distinct from education in early years. We find that completing an additional degree leads to more than \$3000 (AUD, 2019) extra income per year compared to those who do not complete additional study. For outcomes, treatment and controls we use the extremely rich and nationally representative longitudinal data from the Household Income and Labour Dynamics Australia survey (HILDA). To take full advantage of the complexity and richness of this data we use a Machine Learning (ML) based methodology for causal effect estimation. We are also able to use ML to discover sources of heterogeneity in the effects of gaining additional qualifications. For example, those younger than 45 years of age when obtaining additional qualifications tend to reap more benefits (as much as \$50 per week more) than others.

Submitted 21 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

Comments: 80 pages, 16 figures

arXiv:2211.03714 [pdf, other]

Deviations in Representations Induced by Adversarial Attacks

Authors: Daniel Steinberg, Paul Munro

Abstract: Deep learning has been a popular topic and has achieved success in many areas. It has drawn the attention of researchers and machine learning practitioners alike, with developed models deployed to a variety of settings. Along with its achievements, research has shown that deep learning models are vulnerable to adversarial attacks. This finding brought about a new direction in research, whereby algorithms were developed to attack and defend vulnerable networks. Our interest is in understanding how these attacks effect change on the intermediate representations of deep learning models. We present a method for measuring and analyzing the deviations in representations induced by adversarial attacks, progressively across a selected set of layers. Experiments are conducted using an assortment of attack algorithms, on the CIFAR-10 dataset, with plots created to visualize the impact of adversarial attacks across different layers in a network.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2111.07035 [pdf, other]

Measuring the Contribution of Multiple Model Representations in Detecting Adversarial Instances

Authors: Daniel Steinberg, Paul Munro

Abstract: Deep learning models have been used for a wide variety of tasks. They are prevalent in computer vision, natural language processing, speech recognition, and other areas. While these models have worked well under many scenarios, it has been shown that they are vulnerable to adversarial attacks. This has led to a proliferation of research into ways that such attacks could be identified and/or defended against. Our goal is to explore the contribution that can be attributed to using multiple underlying models for the purpose of adversarial instance detection. Our paper describes two approaches that incorporate representations from multiple models for detecting adversarial examples. We devise controlled experiments for measuring the detection impact of incrementally utilizing additional models. For many of the scenarios we consider, the results show that performance increases with the number of underlying models used for extracting representations.

Submitted 11 February, 2022; v1 submitted 12 November, 2021; originally announced November 2021.

Comments: Correction: replaced "model-wise" with "unit-wise" in the first sentence of Section 3.2

arXiv:2105.14116 [pdf, other]

Visualizing Representations of Adversarially Perturbed Inputs

Authors: Daniel Steinberg, Paul Munro

Abstract: It has been shown that deep learning models are vulnerable to adversarial attacks. We seek to further understand the consequence of such attacks on the intermediate activations of neural networks. We present an evaluation metric, POP-N, which scores the effectiveness of projecting data to N dimensions under the context of visualizing representations of adversarially perturbed inputs. We conduct experiments on CIFAR-10 to compare the POP-2 score of several dimensionality reduction algorithms across various adversarial attacks. Finally, we utilize the 2D data corresponding to high POP-2 scores to generate example visualizations.

Submitted 28 May, 2021; originally announced May 2021.

arXiv:2009.13921 [pdf, other]

Efficient Study Design with Multiple Measurement Instruments

Authors: Michal Bitan, Malka Gorfine, Laura Rosen, David M. Steinberg

Abstract: Outcomes from studies assessing exposure often use multiple measurements. In previous work, using a model first proposed by Buonoccorsi (1991), we showed that combining direct (e.g. biomarkers) and indirect (e.g. self-report) measurements provides a more accurate picture of true exposure than estimates obtained when using a single type of measurement. In this article, we propose a valuable tool for efficient design of studies that include both direct and indirect measurements of a relevant outcome. Based on data from a pilot or preliminary study, the tool, which is available online as a shiny app \citep{shinyR}, can be used to compute: (1) the sample size required for a statistical power analysis, while optimizing the percent of participants who should provide direct measures of exposure (biomarkers) in addition to the indirect (self-report) measures provided by all participants; (2) the ideal number of replicates; and (3) the allocation of resources to intervention and control arms. In addition we show how to examine the sensitivity of results to underlying assumptions. We illustrate our analysis using studies of tobacco smoke exposure and nutrition. In these examples, a near-optimal allocation of the resources can be found even if the assumptions are not precise.

Submitted 29 September, 2020; originally announced September 2020.

arXiv:2002.06200 [pdf, other]

Fast Fair Regression via Efficient Approximations of Mutual Information

Authors: Daniel Steinberg, Alistair Reid, Simon O'Callaghan, Finnian Lattimore, Lachlan McCalman, Tiberio Caetano

Abstract: Most work in algorithmic fairness to date has focused on discrete outcomes, such as deciding whether to grant someone a loan or not. In these classification settings, group fairness criteria such as independence, separation and sufficiency can be measured directly by comparing rates of outcomes between subpopulations. Many important problems, however, require the prediction of a real-valued outcome, such as a risk score or insurance premium. In such regression settings, measuring group fairness criteria is computationally challenging, as it requires estimating information-theoretic divergences between conditional probability density functions. This paper introduces fast approximations of the independence, separation and sufficiency group fairness criteria for regression models from their (conditional) mutual information definitions, and uses such approximations as regularisers to enforce fairness within a regularised risk minimisation framework. Experiments on real-world datasets indicate that, in spite of its superior computational efficiency, our algorithm still displays state-of-the-art accuracy/fairness tradeoffs.

Submitted 14 February, 2020; originally announced February 2020.

Comments: arXiv admin note: text overlap with arXiv:2001.06089

arXiv:2001.06089 [pdf, other]

Fairness Measures for Regression via Probabilistic Classification

Authors: Daniel Steinberg, Alistair Reid, Simon O'Callaghan

Abstract: Algorithmic fairness involves expressing notions such as equity, or reasonable treatment, as quantifiable measures that a machine learning algorithm can optimise. Most work in the literature to date has focused on classification problems where the prediction is categorical, such as accepting or rejecting a loan application. This is in part because classification fairness measures are easily computed by comparing the rates of outcomes, leading to behaviours such as ensuring that the same fraction of eligible men are selected as eligible women. But such measures are computationally difficult to generalise to the continuous regression setting for problems such as pricing, or allocating payments. The difficulty arises from estimating conditional densities (such as the probability density that a system will over-charge by a certain amount). For the regression setting we introduce tractable approximations of the independence, separation and sufficiency criteria by observing that they factorise as ratios of different conditional probabilities of the protected attributes. We introduce and train machine learning classifiers, distinct from the predictor, as a mechanism to estimate these probabilities from the data. This naturally leads to model agnostic, tractable approximations of the criteria, which we explore experimentally.

Submitted 4 March, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

Comments: Accepted to the 2nd Ethics of Data Science Conference 2020 (March, Sydney, Australia)

arXiv:1906.11380 [pdf, ps, other]

Some extension algebras for standard modules over KLR algebras of type $A$

Authors: Doeke Buursma, Alexander Kleshchev, David J. Steinberg

Abstract: Khovanov-Lauda-Rouquier algebras $R_θ$ of finite Lie type are affine quasihereditary with standard modules $Δ(π)$ labeled by Kostant partitions of $θ$. Let $Δ$ be the direct sum of all standard modules. It is known that the Yoneda algebra $\mathcal{E}_θ:=\operatorname{Ext}_{R_θ}^*(Δ, Δ)$ carries a structure of an $A_\infty$-algebra which can be used to reconstruct the category of standardly filtered $R_θ$-modules. In this paper, we explicitly describe $\mathcal{E}_θ$ in two special cases: (1) when $θ$ is a positive root in type $\mathtt{A}$, and (2) when $θ$ is of Lie type $\mathtt{A_2}$. In these cases, $\mathcal{E}_θ$ turns out to be torsion free and intrinsically formal. We provide an example to show that the $A_\infty$-algebra $\mathcal{E}_θ$ is non-formal in general.

Submitted 26 June, 2019; originally announced June 2019.

MSC Class: 16G99; 16E05; 17B37

arXiv:1906.11376 [pdf, ps, other]

Resolutions of standard modules over KLR algebras of type $A$

Authors: Doeke Buursma, Alexander Kleshchev, David J. Steinberg

Abstract: Khovanov-Lauda-Rouquier algebras $R_θ$ of finite Lie type are affine quasihereditary with standard modules $Δ(π)$ labeled by Kostant partitions $π$ of $θ$. In type $A$, we construct explicit projective resolutions of standard modules $Δ(π)$.

Submitted 26 June, 2019; originally announced June 2019.

MSC Class: 16E05; 16G99; 17B37

arXiv:1906.05959 [pdf]

Early Detection of Long Term Evaluation Criteria in Online Controlled Experiments

Authors: Yoni Schamroth, Liron Gat Kahlon, Boris Rabinovich, David Steinberg

Abstract: A common dilemma encountered by many upon implementing an optimization method or experiment, whether it be a reinforcement learning algorithm, or A/B testing, is deciding on what metric to optimize for. Very often short-term metrics, which are easier to measure, are chosen over long-term metrics, which have undesirable time considerations and often a more complex calculation. In this paper, we argue the importance of choosing a metric that focuses on long-term effects. With this comes the need to measure significant differences between groups relatively early. We present here an efficient methodology for early detection of lifetime differences between groups based on bootstrap hypothesis testing of the lifetime forecast of the response. We present an application of this method in the domain of online advertising and we argue that this approach not only allows one to focus on the ultimate metric of importance but also provides a means of accelerating the testing period.

Submitted 13 June, 2019; originally announced June 2019.

arXiv:1901.11053 [pdf, other]

Software solutions for form-based collection of data and the semantic enrichment of form data

Authors: Markus D. Steinberg

Abstract: Data collection is an important part of many citizen science projects as well as other fields of research, particularly in life sciences. Mobile applications with form-based surveys are increasingly used to support this, due to the large number of mobile devices and their growing number of built-in sensors. Since the composition of form-based surveys from scratch can be a tedious task, multiple tools have been published that can help with their design and distribution as well as the data collection via mobile devices and the data storage. Some even support simple data analysis. With this increasing number of software options project leaders will often face the question, which tool is most suitable for their current use case. With that in mind, this project pursues two main objectives: 1. To present an overview of a selection of survey design tools and their capabilities in order to provide a clear foundation for such a decision. 2. To examine if any tool provides the capability to collect and export data in a way that can easily be used and interpreted by other applications or persons. This aspect includes the supply of metadata about the data collection process and the data itself, information about the meaning of the data as well as an export format that can easily be processed.

Submitted 30 January, 2019; originally announced January 2019.

arXiv:1807.04883 [pdf]

Probabilistic Re-aggregation Algorithm [First Draft]

Authors: Alistair Reid, Xinyue Wang, Simon O'Callaghan, Daniel Steinberg, Lachlan McCalman

Abstract: Spatial data about individuals or businesses is often aggregated over polygonal regions to preserve privacy, provide useful insight and support decision making. Given a particular aggregation of data (say into local government areas), the re-aggregation problem is to estimate how that same data would aggregate over a different set of polygonal regions (say electorates) without having access to the original unit records. Data61 is developing new re-aggregation algorithms that both estimate confidence intervals of their predictions and utilize additional related datasets when available to improve accuracy. The algorithms are an improvement over the current re-aggregation procedure in use by the ABS, which is manually applied by the data user, less accurate in validation experiments and provides a single best guess answer. The algorithms are deployed in an accessible web service that automatically learns a model and applies it to user-data. This report formulates the re-aggregation problem, describes Data61's new algorithms, and presents preliminary validation experiments.

Submitted 12 July, 2018; originally announced July 2018.

arXiv:1704.04760 [pdf]

In-Datacenter Performance Analysis of a Tensor Processing Unit

Authors: Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, et al. (50 additional authors not shown)

Abstract: Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, ...) that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X - 30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X - 80X higher. Moreover, using the GPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.

Submitted 16 April, 2017; originally announced April 2017.

Comments: 17 pages, 11 figures, 8 tables. To appear at the 44th International Symposium on Computer Architecture (ISCA), Toronto, Canada, June 24-28, 2017

arXiv:1505.04222 [pdf, ps, other]

doi 10.1112/S0010437X16008204

Homomorphisms between standard modules over finite type KLR algebras

Authors: Alexander S. Kleshchev, David J. Steinberg

Abstract: Khovanov-Lauda-Rouquier algebras of finite Lie type come with families of standard modules, which under the Khovanov-Lauda-Rouquier categorification correspond to PBW-bases of the positive part of the corresponding quantized enveloping algebra. We show that there are no non-zero homomorphisms between distinct standard modules and all non-zero endomorphisms of a standard module are injective. We obtain applications to extensions between standard modules and modular representation theory of KLR algebras.

Submitted 15 May, 2015; originally announced May 2015.

MSC Class: 16G99

Journal ref: Compositio Math. 153 (2017) 621-646

arXiv:1501.00083 [pdf, other]

Variable Selection in Bayesian Semiparametric Regression Models

Authors: Ofir Harari, David M. Steinberg

Abstract: In this paper we extend existing Bayesian methods for variable selection in Gaussian process regression, to select both the regression terms and the active covariates in the spatial correlation structure. We then use the estimated posterior probabilities to choose between relatively few modes through cross-validation, and consequently improve prediction.

Submitted 31 December, 2014; originally announced January 2015.

arXiv:1208.0884 [pdf, ps, other]

Curve-counting invariants for crepant resolutions

Authors: Jim Bryan, David Steinberg

Abstract: We construct curve counting invariants for a Calabi-Yau threefold $Y$ equipped with a dominant birational morphism $π:Y \to X$. Our invariants generalize the stable pair invariants of Pandharipande and Thomas which occur for the case when $π:Y\to Y$ is the identity. Our main result is a PT/DT-type formula relating the partition function of our invariants to the Donaldson-Thomas partition function in the case when $Y$ is a crepant resolution of $X$, the coarse space of a Calabi-Yau orbifold $\mathcal{X}$ satisfying the hard Lefschetz condition. In this case, our partition function is equal to the Pandharipande-Thomas partition function of the orbifold $\mathcal{X}$. Our methods include defining a new notion of stability for sheaves which depends on the morphism $π$. Our notion generalizes slope stability which is recovered in the case where $π$ is the identity on $Y$. Our proof is a generalization of Bridgeland's proof of the PT/DT correspondence via the Hall algebra and Joyce's integration map.

Submitted 1 July, 2014; v1 submitted 4 August, 2012; originally announced August 2012.

Comments: In this version, Jim Bryan has been added as an author and the required boundedness result for our stability condition has been added. arXiv admin note: text overlap with arXiv:1002.4374 by other authors

MSC Class: 14N35

arXiv:0905.1845 [pdf, ps, other]

doi 10.1111/j.1365-2966.2009.15067.x

Search for corannulene (C20H10) in the Red Rectangle

Authors: P. Pilleri, D. Herberth, T. F. Giesen, M. Gerin, C. Joblin, G. Mulas, G. Malloci, J. U. Grabow, S. Brunken, L. Surin, B. D. Steinberg, K. R. Curtis, L. T. Scott

Abstract: Polycyclic Aromatic Hydrocarbons (PAHs) are widely accepted as the carriers of the Aromatic Infrared Bands (AIBs), but an unambiguous identification of any specific interstellar PAH is still missing. For polar PAHs, pure rotational transitions can be used as fingerprints for identification. Combining dedicated experiments, detailed simulations and observations, we explored the mm domain to search for specific rotational transitions of corannulene (C20H10). We performed high-resolution spectroscopic measurements and a simulation of the emission spectrum of UV-excited C20H10 in the environment of the Red Rectangle, calculating its synthetic rotational spectrum. Based on these results, we conducted a first observational campaign at the IRAM 30m telescope towards this source to search for several high-J rotational transitions of (C20H10). The laboratory detection of the J = 112 <- 111 transition of corannulene showed that no centrifugal splitting is present up to this line. Observations with the IRAM 30m telescope towards the Red Rectangle do not show any corannulene emission at any of the observed frequencies, down to a rms noise level of Tmb = 8 mK for the J =135 -> 134 transition at 137.615 GHz. Comparing the noise level with the synthetic spectrum, we are able to estimate an upper limit to the fraction of carbon locked in corannulene of about 1.0x10(-5) relative to the total abundance of carbon in PAHs. The sensitivity achieved shows that radio spectroscopy can be a powerful tool to search for polar PAHs.
We compare this upper limit with models for the PAH size distribution, emphasising that small PAHs are much less abundant than predicted. We show that this cannot be explained by destruction but is more likely related to the chemistry of their formation in the environment of the Red Rectangle.

Submitted 13 May, 2009; v1 submitted 12 May, 2009; originally announced May 2009.

Comments: 8 pages, 7 figures, 2 tables, accepted for publication in MNRAS

arXiv:cs/0405012 [pdf]

Is Neural Network a Reliable Forecaster on Earth? A MARS Query!

Authors: Ajith Abraham, Dan Steinberg

Abstract: Long-term rainfall prediction is a challenging task especially in the modern world where we are facing the major environmental problem of global warming. In general, climate and rainfall are highly non-linear phenomena in nature exhibiting what is known as the butterfly effect. While some regions of the world are noticing a systematic decrease in annual rainfall, others notice increases in flooding and severe storms. The global nature of this phenomenon is very complicated and requires sophisticated computer modeling and simulation to predict accurately. In this paper, we report a performance analysis for Multivariate Adaptive Regression Splines (MARS) and artificial neural networks for one month ahead prediction of rainfall. To evaluate the prediction efficiency, we made use of 87 years of rainfall data in Kerala state, the southern part of the Indian peninsula situated at latitude-longitude pairs (8°29'N - 76°57'E). We used an artificial neural network trained using the scaled conjugate gradient algorithm. The neural network and MARS were trained with 40 years of rainfall data. For performance evaluation, network predicted outputs were compared with the actual rainfall data. Simulation results reveal that MARS is a good forecasting tool and performed better than the considered neural network.

Submitted 4 May, 2004; originally announced May 2004.

ACM Class: I.2.0

Journal ref: Bio-Inspired Applications of Connectionism, Lecture Notes in Computer Science. Volume. 2085, Springer Verlag Germany, Jose Mira and Alberto Prieto (Eds.), ISBN 3540422374, Spain, pp.679-686, 2001

Showing 1–23 of 23 results for author: Steinberg, D