
Showing 1–32 of 32 results for author: Tishby, N

Searching in archive cs.
  1. arXiv:2301.00005  [pdf, other]

    cs.AI physics.app-ph

    Intrinsic Motivation in Dynamical Control Systems

    Authors: Stas Tiomkin, Ilya Nemenman, Daniel Polani, Naftali Tishby

    Abstract: Biological systems often choose actions without an explicit reward signal, a phenomenon known as intrinsic motivation. The computational principles underlying this behavior remain poorly understood. In this study, we investigate an information-theoretic approach to intrinsic motivation, based on maximizing an agent's empowerment (the mutual information between its past actions and future states).…

    Submitted 29 December, 2022; originally announced January 2023.

  2. arXiv:2106.08956  [pdf, ps, other]

    cs.LG math.DS nlin.CD

    Detecting chaos in lineage-trees: A deep learning approach

    Authors: Hagai Rappeport, Irit Levin Reisman, Naftali Tishby, Nathalie Q. Balaban

    Abstract: Many complex phenomena, from weather systems to heartbeat rhythm patterns, are effectively modeled as low-dimensional dynamical systems. Such systems may behave chaotically under certain conditions, and so the ability to detect chaos based on empirical measurement is an important step in characterizing and predicting these processes. Classifying a system as chaotic usually requires estimating its…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: 12 pages, 7 figures

  3. Critical Slowing Down Near Topological Transitions in Rate-Distortion Problems

    Authors: Shlomi Agmon, Etam Benger, Or Ordentlich, Naftali Tishby

    Abstract: In rate-distortion (RD) problems one seeks reduced representations of a source that meet a target distortion constraint. Such optimal representations undergo topological transitions at some critical rate values, when their cardinality or dimensionality change. We study the convergence time of the Arimoto-Blahut alternating projection algorithms, used to solve such problems, near those critical poi…

    Submitted 9 May, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: 10 pages, 2 figures, ISIT 2021 submission

    Journal ref: 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 2021, pp. 2625-2630

  4. arXiv:2006.04641  [pdf, other]

    cs.IT cs.LG

    The Dual Information Bottleneck

    Authors: Zoe Piran, Ravid Shwartz-Ziv, Naftali Tishby

    Abstract: The Information Bottleneck (IB) framework is a general characterization of optimal representations obtained using a principled approach for balancing accuracy and complexity. Here we present a new framework, the Dual Information Bottleneck (dualIB), which resolves some of the known drawbacks of the IB. We provide a theoretical analysis of the dualIB framework; (i) solving for the structure of its…

    Submitted 8 June, 2020; originally announced June 2020.

  5. arXiv:1905.04562  [pdf, other]

    cs.CL

    Semantic categories of artifacts and animals reflect efficient coding

    Authors: Noga Zaslavsky, Terry Regier, Naftali Tishby, Charles Kemp

    Abstract: It has been argued that semantic categories across languages reflect pressure for efficient communication. Recently, this idea has been cast in terms of a general information-theoretic principle of efficiency, the Information Bottleneck (IB) principle, and it has been shown that this principle accounts for the emergence and evolution of named color categories across languages, including soft struc…

    Submitted 11 May, 2019; originally announced May 2019.

    Comments: To appear in the proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci 2019)

  6. arXiv:1810.13259  [pdf, other]

    cs.LG stat.ML

    Non-linear Canonical Correlation Analysis: A Compressed Representation Approach

    Authors: Amichai Painsky, Meir Feder, Naftali Tishby

    Abstract: Canonical Correlation Analysis (CCA) is a linear representation learning method that seeks maximally correlated variables in multi-view data. Non-linear CCA extends this notion to a broader family of transformations, which are more powerful in many real-world applications. Given the joint probability, the Alternating Conditional Expectation (ACE) algorithm provides an optimal solution to the non-l…

    Submitted 10 February, 2020; v1 submitted 31 October, 2018; originally announced October 2018.

  7. arXiv:1808.03353  [pdf, other]

    cs.CL

    Efficient human-like semantic representations via the Information Bottleneck principle

    Authors: Noga Zaslavsky, Charles Kemp, Terry Regier, Naftali Tishby

    Abstract: Maintaining efficient semantic representations of the environment is a major challenge both for humans and for machines. While human languages represent useful solutions to this problem, it is not yet clear what computational principle could give rise to similar solutions in machines. In this work we propose an answer to this open question. We suggest that languages compress percepts into words by…

    Submitted 9 August, 2018; originally announced August 2018.

    Journal ref: Cognitively Informed Artificial Intelligence Workshop at NIPS 2017

  8. arXiv:1805.06165  [pdf, other]

    cs.CL

    Color naming reflects both perceptual structure and communicative need

    Authors: Noga Zaslavsky, Charles Kemp, Naftali Tishby, Terry Regier

    Abstract: Gibson et al. (2017) argued that color naming is shaped by patterns of communicative need. In support of this claim, they showed that color naming systems across languages support more precise communication about warm colors than cool colors, and that the objects we talk about tend to be warm-colored rather than cool-colored. Here, we present new analyses that alter this picture. We show that grea…

    Submitted 2 August, 2018; v1 submitted 16 May, 2018; originally announced May 2018.

    Journal ref: Proceedings of the 40th Annual Conference of the Cognitive Science Society (pp. 1250 - 1255), 2018

  9. arXiv:1712.03524  [pdf, ps, other]

    cs.LG

    A General Memory-Bounded Learning Algorithm

    Authors: Michal Moshkovitz, Naftali Tishby

    Abstract: Designing bounded-memory algorithms is becoming increasingly important nowadays. Previous works studying bounded-memory algorithms focused on proving impossibility results, while the design of bounded-memory algorithms was left relatively unexplored. To remedy this situation, in this work we design a general bounded-memory learning algorithm, when the underlying distribution is known. The core ide…

    Submitted 11 October, 2019; v1 submitted 10 December, 2017; originally announced December 2017.

  10. arXiv:1711.02421  [pdf, other]

    cs.LG stat.ML

    Gaussian Lower Bound for the Information Bottleneck Limit

    Authors: Amichai Painsky, Naftali Tishby

    Abstract: The Information Bottleneck (IB) is a conceptual method for extracting the most compact, yet informative, representation of a set of variables, with respect to the target. It generalizes the notion of minimal sufficient statistics from classical parametric statistics to a broader information-theoretic sense. The IB curve defines the optimal trade-off between representation complexity and its predic…

    Submitted 7 November, 2017; originally announced November 2017.

  11. arXiv:1703.00810  [pdf, other]

    cs.LG

    Opening the Black Box of Deep Neural Networks via Information

    Authors: Ravid Shwartz-Ziv, Naftali Tishby

    Abstract: Despite their great success, there is still no comprehensive theoretical understanding of learning with Deep Neural Networks (DNNs) or their inner organization. Previous work proposed to analyze DNNs in the \textit{Information Plane}; i.e., the plane of the Mutual Information values that each layer preserves on the input and output variables. They suggested that the goal of the network is to optim…

    Submitted 29 April, 2017; v1 submitted 2 March, 2017; originally announced March 2017.

    Comments: 19 pages, 8 figures

  12. arXiv:1703.00729  [pdf, other]

    cs.LG

    Mixing Complexity and its Applications to Neural Networks

    Authors: Michal Moshkovitz, Naftali Tishby

    Abstract: We suggest analyzing neural networks through the prism of space constraints. We observe that most training algorithms applied in practice use bounded memory, which enables us to use a new notion introduced in the study of space-time tradeoffs that we call mixing complexity. This notion was devised in order to measure the (in)ability to learn using a bounded-memory algorithm. In this paper we descr…

    Submitted 2 March, 2017; originally announced March 2017.

  13. arXiv:1701.04984  [pdf, other]

    cs.IT

    Control Capacity of Partially Observable Dynamic Systems in Continuous Time

    Authors: Stas Tiomkin, Daniel Polani, Naftali Tishby

    Abstract: Stochastic dynamic control systems relate in a probabilistic fashion the space of control signals to the space of corresponding future states. Consequently, stochastic dynamic systems can be interpreted as an information channel between the control space and the state space. In this work we study this control-to-state information capacity of stochastic dynamic systems in continuous-time, when t…

    Submitted 18 January, 2017; originally announced January 2017.

    Comments: 11 pages, 7 figures

    MSC Class: 93C41

  14. arXiv:1609.05524  [pdf, other]

    cs.LG stat.ML

    Principled Option Learning in Markov Decision Processes

    Authors: Roy Fox, Michal Moshkovitz, Naftali Tishby

    Abstract: It is well known that options can make planning more efficient, among their many benefits. Thus far, algorithms for autonomously discovering a set of useful options were heuristic. Naturally, a principled way of finding a set of useful options may be more promising and insightful. In this paper we suggest a mathematical characterization of good sets of options using tools from information theory.…

    Submitted 30 March, 2017; v1 submitted 18 September, 2016; originally announced September 2016.

    Journal ref: 13th European Workshop on Reinforcement Learning (EWRL 2016)

  15. arXiv:1606.01947  [pdf, other]

    eess.SY cs.IT

    Minimum-Information LQG Control - Part II: Retentive Controllers

    Authors: Roy Fox, Naftali Tishby

    Abstract: Retentive (memory-utilizing) sensing-acting agents may operate under limitations on the communication between their sensing, memory and acting components, requiring them to trade off the external cost that they incur with the capacity of their communication channels. In this paper we formulate this problem as a sequential rate-distortion problem of minimizing the rate of information required for t…

    Submitted 30 March, 2017; v1 submitted 6 June, 2016; originally announced June 2016.

    Journal ref: 55th IEEE Conference on Decision and Control (CDC 2016)

  16. arXiv:1606.01946  [pdf, other]

    eess.SY cs.IT

    Minimum-Information LQG Control - Part I: Memoryless Controllers

    Authors: Roy Fox, Naftali Tishby

    Abstract: With the increased demand for power efficiency in feedback-control systems, communication is becoming a limiting factor, raising the need to trade off the external cost that they incur with the capacity of the controller's communication channels. With a proper design of the channels, this translates into a sequential rate-distortion problem, where we minimize the rate of information required for t…

    Submitted 30 March, 2017; v1 submitted 6 June, 2016; originally announced June 2016.

    Journal ref: 55th IEEE Conference on Decision and Control (CDC 2016)

  17. arXiv:1604.05129  [pdf, other]

    q-bio.NC cs.AI stat.ML

    Memory shapes time perception and intertemporal choices

    Authors: Pedro A. Ortega, Naftali Tishby

    Abstract: There is a consensus that human and non-human subjects experience temporal distortions in many stages of their perceptual and decision-making systems. Similarly, intertemporal choice research has shown that decision-makers undervalue future outcomes relative to immediate ones. Here we combine techniques from information theory and artificial intelligence to show how both temporal distortions and i…

    Submitted 29 May, 2016; v1 submitted 18 April, 2016; originally announced April 2016.

    Comments: 24 pages, 4 figures, 2 tables. Submitted

  18. arXiv:1512.08575  [pdf, other]

    cs.LG cs.IT

    Optimal Selective Attention in Reactive Agents

    Authors: Roy Fox, Naftali Tishby

    Abstract: In POMDPs, information about the hidden state, delivered through observations, is both valuable to the agent, allowing it to base its actions on better informed internal states, and a "curse", exploding the size and diversity of the internal state space. One attempt to deal with this is to focus on reactive policies, that only base their actions on the most recent observation. However, even reacti…

    Submitted 28 December, 2015; originally announced December 2015.

  19. arXiv:1512.08562  [pdf, other]

    cs.LG cs.IT

    Taming the Noise in Reinforcement Learning via Soft Updates

    Authors: Roy Fox, Ari Pakman, Naftali Tishby

    Abstract: Model-free reinforcement learning algorithms, such as Q-learning, perform poorly in the early stages of learning in noisy environments, because much effort is spent unlearning biased estimates of the state-action value function. The bias results from selecting, among several noisy estimates, the apparent optimum, which may actually be suboptimal. We propose G-learning, a new off-policy learning al…

    Submitted 30 March, 2017; v1 submitted 28 December, 2015; originally announced December 2015.

    Journal ref: 32nd Conference on Uncertainty in Artificial Intelligence (UAI 2016)

  20. arXiv:1512.06789  [pdf, other]

    stat.ML cs.AI eess.SY math.OC

    Information-Theoretic Bounded Rationality

    Authors: Pedro A. Ortega, Daniel A. Braun, Justin Dyer, Kee-Eung Kim, Naftali Tishby

    Abstract: Bounded rationality, that is, decision-making and planning under resource limitations, is widely regarded as an important open problem in artificial intelligence, reinforcement learning, computational neuroscience and economics. This paper offers a consolidated presentation of a theory of bounded rationality based on information-theoretic ideas. We provide a conceptual justification for using the…

    Submitted 21 December, 2015; originally announced December 2015.

    Comments: 47 pages, 19 figures

  21. arXiv:1503.02406  [pdf, other]

    cs.LG

    Deep Learning and the Information Bottleneck Principle

    Authors: Naftali Tishby, Noga Zaslavsky

    Abstract: Deep Neural Networks (DNNs) are analyzed via the theoretical framework of the information bottleneck (IB) principle. We first show that any DNN can be quantified by the mutual information between the layers and the input and output variables. Using this representation we can calculate the optimal information theoretic limits of the DNN and obtain finite sample generalization bounds. The advantage…

    Submitted 9 March, 2015; originally announced March 2015.

    Comments: 5 pages, 2 figures, Invited paper to ITW 2015; 2015 IEEE Information Theory Workshop (ITW) (IEEE ITW 2015)

  22. arXiv:1301.2270  [pdf]

    cs.LG cs.AI stat.ML

    Multivariate Information Bottleneck

    Authors: Nir Friedman, Ori Mosenzon, Noam Slonim, Naftali Tishby

    Abstract: The Information bottleneck method is an unsupervised non-parametric data organization technique. Given a joint distribution P(A,B), this method constructs a new variable T that extracts partitions, or clusters, over the values of A that are informative about B. The information bottleneck has already been applied to document classification, gene expression, neural code, and spectral analysis. In th…

    Submitted 10 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

    Report number: UAI-P-2001-PG-152-161

  23. arXiv:1212.2483  [pdf]

    cs.LG stat.ML

    Sufficient Dimensionality Reduction with Irrelevant Statistics

    Authors: Amir Globerson, Gal Chechik, Naftali Tishby

    Abstract: The problem of finding a reduced dimensionality representation of categorical variables while preserving their most relevant characteristics is fundamental for the analysis of complex data. Specifically, given a co-occurrence matrix of two variables, one often seeks a compact representation of one variable which preserves information about the other variable. We have recently intro…

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-281-288

  24. arXiv:1207.4110  [pdf]

    cs.LG stat.ML

    The Minimum Information Principle for Discriminative Learning

    Authors: Amir Globerson, Naftali Tishby

    Abstract: Exponential models of distributions are widely used in machine learning for classification and modelling. It is well known that they can be interpreted as maximum entropy models under empirical expectation constraints. In this work, we argue that for classification tasks, mutual information is a more suitable information theoretic measure to be optimized. We show how the principle of minimum mut…

    Submitted 11 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

    Report number: UAI-P-2004-PG-193-200

  25. arXiv:1206.6405  [pdf]

    cs.LG cs.AI stat.ML

    Bounded Planning in Passive POMDPs

    Authors: Roy Fox, Naftali Tishby

    Abstract: In Passive POMDPs actions do not affect the world state, but still incur costs. When the agent is bounded by information-processing constraints, it can only keep an approximation of the belief. We present a variational principle for the problem of maintaining the information which is most useful for minimizing the cost, and introduce an efficient and simple algorithm for finding an optimum.

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  26. arXiv:1204.1276  [pdf, ps, other]

    stat.ML cs.LG

    Distribution-Dependent Sample Complexity of Large Margin Learning

    Authors: Sivan Sabato, Nathan Srebro, Naftali Tishby

    Abstract: We obtain a tight distribution-specific characterization of the sample complexity of large-margin classification with L2 regularization: We introduce the margin-adapted dimension, which is a simple function of the second order statistics of the data distribution, and show distribution-specific upper and lower bounds on the sample complexity, both governed by the margin-adapted dimension of the dat…

    Submitted 18 September, 2013; v1 submitted 5 April, 2012; originally announced April 2012.

    Comments: arXiv admin note: text overlap with arXiv:1011.5053

    Journal ref: S. Sabato, N. Srebro and N. Tishby, "Distribution-Dependent Sample Complexity of Large Margin Learning", Journal of Machine Learning Research, 14(Jul):2119-2149, 2013

  27. arXiv:1107.2021  [pdf, ps, other]

    cs.LG stat.ML

    Multi-Instance Learning with Any Hypothesis Class

    Authors: Sivan Sabato, Naftali Tishby

    Abstract: In the supervised learning setting termed Multiple-Instance Learning (MIL), the examples are bags of instances, and the bag label is a function of the labels of its instances. Typically, this function is the Boolean OR. The learner observes a sample of bags and the bag labels, but not the instance labels that determine the bag labels. The learner is then required to emit a classification rule for…

    Submitted 13 August, 2012; v1 submitted 11 July, 2011; originally announced July 2011.

    Comments: Fixed typos and added some explanations

    Journal ref: Journal of Machine Learning Research 13(Oct):2999-3039, 2012

  28. arXiv:1011.5053  [pdf, ps, other]

    cs.LG math.PR math.ST stat.ML

    Tight Sample Complexity of Large-Margin Learning

    Authors: Sivan Sabato, Nathan Srebro, Naftali Tishby

    Abstract: We obtain a tight distribution-specific characterization of the sample complexity of large-margin classification with L_2 regularization: We introduce the γ-adapted-dimension, which is a simple function of the spectrum of a distribution's covariance matrix, and show distribution-specific upper and lower bounds on the sample complexity, both governed by the γ-adapted-dimension of the source distrib…

    Submitted 5 April, 2012; v1 submitted 23 November, 2010; originally announced November 2010.

    Comments: Appearing in Neural Information Processing Systems (NIPS) 2010; This is the full version, including appendix with proofs; Also with some corrections

    Journal ref: Advances in Neural Information Processing Systems 23 (NIPS), 2038-2046, 2010

  29. arXiv:physics/0007070  [pdf, ps, other]

    physics.data-an cond-mat.dis-nn cond-mat.other cs.LG nlin.AO q-bio.OT

    Predictability, complexity and learning

    Authors: William Bialek, Ilya Nemenman, Naftali Tishby

    Abstract: We define {\em predictive information} $I_{\rm pred} (T)$ as the mutual information between the past and the future of a time series. Three qualitatively different behaviors are found in the limit of large observation times $T$: $I_{\rm pred} (T)$ can remain finite, grow logarithmically, or grow as a fractional power law. If the time series allows us to learn a model with a finite number of para…

    Submitted 23 January, 2001; v1 submitted 19 July, 2000; originally announced July 2000.

    Comments: 53 pages, 3 figures, 98 references, LaTeX2e

    Journal ref: Neural Computation 13, 2409-2463 (2001)

  30. arXiv:physics/0004057  [pdf, ps, other]

    physics.data-an cond-mat.dis-nn cs.LG nlin.AO

    The information bottleneck method

    Authors: Naftali Tishby, Fernando C. Pereira, William Bialek

    Abstract: We define the relevant information in a signal $x\in X$ as being the information that this signal provides about another signal $y\in Y$. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. Understanding the signal $x$ requires more than just predicting $y$, it also requires spec…

    Submitted 24 April, 2000; originally announced April 2000.

  31. Beyond Word N-Grams

    Authors: Fernando C. N. Pereira, Yoram Singer, Naftali Tishby

    Abstract: We describe, analyze, and evaluate experimentally a new probabilistic model for word-sequence prediction in natural language based on prediction suffix trees (PSTs). By using efficient data structures, we extend the notion of PST to unbounded vocabularies. We also show how to use a Bayesian approach based on recursive priors over all possible PSTs to efficiently maintain tree mixtures. These mix…

    Submitted 13 July, 1996; originally announced July 1996.

    Comments: 15 pages, one PostScript figure, uses psfig.sty and fullname.sty. Revised version of a paper in the Proceedings of the Third Workshop on Very Large Corpora, MIT, 1995

  32. Distributional Clustering of English Words

    Authors: Fernando Pereira, Naftali Tishby, Lillian Lee

    Abstract: We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest distortion sets of clusters. As the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical ``soft'' clustering of the data. Clusters are used as the bas…

    Submitted 22 August, 1994; originally announced August 1994.

    Comments: 8 pages, appeared in the proceedings of ACL-93, Columbus, Ohio