Showing 1–50 of 62 results for author: Markov, I

Searching in archive cs.
  1. arXiv:2407.10994

    cs.CL cs.AI cs.HC cs.LG

    Panza: A Personalized Text Writing Assistant via Data Playback and Local Fine-Tuning

    Authors: Armand Nicolicioiu, Eugenia Iofinova, Eldar Kurtic, Mahdi Nikdan, Andrei Panferov, Ilia Markov, Nir Shavit, Dan Alistarh

    Abstract: The availability of powerful open-source large language models (LLMs) opens exciting use-cases, such as automated personal assistants that adapt to the user's unique data and demands. Two key desiderata for such assistants are personalization, in the sense that the assistant should reflect the user's own style, and privacy, in the sense that users may prefer to always store their personal data locall…

    Submitted 24 June, 2024; originally announced July 2024.

    Comments: Panza is available at https://github.com/IST-DASLab/PanzaMail

  2. arXiv:2405.15756

    cs.LG cs.AI

    Sparse Expansion and Neuronal Disentanglement

    Authors: Shashata Sawmya, Linghao Kong, Ilia Markov, Dan Alistarh, Nir Shavit

    Abstract: We show how to improve the inference efficiency of an LLM by expanding it into a mixture of sparse experts, where each expert is a copy of the original weights, one-shot pruned for a specific cluster of input values. We call this approach $\textit{Sparse Expansion}$. We show that, for models such as Llama 2 70B, as we increase the number of sparse experts, Sparse Expansion outperforms all other on…

    Submitted 24 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures

  3. arXiv:2405.13754

    cs.CL

    Grounding Toxicity in Real-World Events across Languages

    Authors: Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen

    Abstract: Social media conversations frequently suffer from toxicity, creating significant issues for users, moderators, and entire communities. Events in the real world, like elections or conflicts, can initiate and escalate toxic behavior online. Our study investigates how real-world events influence the origin and spread of toxicity in online discussions across various languages and regions. We gathered…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Paper accepted at the 29th International Conference on Natural Language & Information Systems (NLDB 2024)

  4. arXiv:2404.18865

    cs.CL

    Truth-value judgment in language models: belief directions are context sensitive

    Authors: Stefan F. Schouten, Peter Bloem, Ilia Markov, Piek Vossen

    Abstract: Recent work has demonstrated that the latent spaces of large language models (LLMs) contain directions predictive of the truth of sentences. Multiple methods recover such directions and build probes that are described as getting at a model's "knowledge" or "beliefs". We investigate this phenomenon, looking closely at the impact of context on the probes. Our experiments establish where in the LLM t…

    Submitted 29 April, 2024; originally announced April 2024.

  5. arXiv:2404.18810

    cs.CL

    Unknown Script: Impact of Script on Cross-Lingual Transfer

    Authors: Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen

    Abstract: Cross-lingual transfer has become an effective way of transferring knowledge between languages. In this paper, we explore an often overlooked aspect in this domain: the influence of the source language of a language model on language transfer performance. We consider a case where the target language and its script are not part of the pre-trained model. We conduct a series of experiments on monolin…

    Submitted 7 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: Paper accepted to NAACL Student Research Workshop (SRW) 2024

  6. arXiv:2404.18726

    cs.CL

    The Constant in HATE: Analyzing Toxicity in Reddit across Topics and Languages

    Authors: Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen

    Abstract: Toxic language remains an ongoing challenge on social media platforms, presenting significant issues for users and communities. This paper provides a cross-topic and cross-lingual analysis of toxicity in Reddit conversations. We collect 1.5 million comment threads from 481 communities in six languages: English, German, Spanish, Turkish, Arabic, and Dutch, covering 80 topics such as Culture, Politic…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted to TRAC 2024

  7. arXiv:2311.05787

    cs.LG

    Towards stable real-world equation discovery with assessing differentiating quality influence

    Authors: Mikhail Masliaev, Ilya Markov, Alexander Hvatov

    Abstract: This paper explores the critical role of differentiation approaches for data-driven differential equation discovery. Accurate derivatives of the input data are essential for reliable algorithmic operation, particularly in real-world scenarios where measurement quality is inevitably compromised. We propose alternatives to the commonly used finite differences-based method, notorious for its instabil…

    Submitted 9 November, 2023; originally announced November 2023.

  8. arXiv:2310.14657

    cs.CL cs.AI

    Reasoning about Ambiguous Definite Descriptions

    Authors: Stefan F. Schouten, Peter Bloem, Ilia Markov, Piek Vossen

    Abstract: Natural language reasoning plays an increasingly important role in improving language models' ability to solve complex language understanding tasks. An interesting use case for reasoning is the resolution of context-dependent ambiguity. But no resources exist to evaluate how well Large Language Models can use explicit reasoning to resolve ambiguity in language. We propose to use ambiguous definite…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  9. arXiv:2310.09259

    cs.LG

    QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models

    Authors: Saleh Ashkboos, Ilia Markov, Elias Frantar, Tingxuan Zhong, Xincheng Wang, Jie Ren, Torsten Hoefler, Dan Alistarh

    Abstract: Large Language Models (LLMs) from the GPT family have become extremely popular, leading to a race towards reducing their inference costs to allow for efficient local computation. Yet, the vast majority of existing work focuses on weight-only quantization, which can reduce runtime costs in the memory-bound one-token-at-a-time generative setting, but does not address them in compute-bound scenarios,…

    Submitted 2 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: 16 pages

  10. arXiv:2306.09642

    cs.CL cs.LG

    Cross-Domain Toxic Spans Detection

    Authors: Stefan F. Schouten, Baran Barbarestani, Wondimagegnhue Tufa, Piek Vossen, Ilia Markov

    Abstract: Given the dynamic nature of toxic language use, automated methods for detecting toxic spans are likely to encounter distributional shift. To explore this phenomenon, we evaluate three approaches for detecting toxic spans under cross-domain conditions: lexicon-based, rationale extraction, and fine-tuned language models. Our findings indicate that a simple method using off-the-shelf lexicons perform…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: NLDB 2023

  11. arXiv:2306.09633

    cs.LG cs.AI cs.AR cs.CY

    The False Dawn: Reevaluating Google's Reinforcement Learning for Chip Macro Placement

    Authors: Igor L. Markov

    Abstract: Reinforcement learning (RL) for physical design of silicon chips in a Google 2021 Nature paper stirred controversy due to poorly documented claims that raised eyebrows and drew critical media coverage. The paper withheld critical methodology steps and most inputs needed to reproduce results. Our meta-analysis shows how two separate evaluations filled in the gaps and demonstrated that Google RL lag…

    Submitted 5 July, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 15 pages, 1 figure, 4 tables, 83 references

  12. arXiv:2303.16531

    cs.CV

    RusTitW: Russian Language Text Dataset for Visual Text in-the-Wild Recognition

    Authors: Igor Markov, Sergey Nesteruk, Andrey Kuznetsov, Denis Dimitrov

    Abstract: Information surrounds people in modern life. Text is a very efficient type of information that people have used for communication for centuries. However, automated text-in-the-wild recognition remains a challenging problem. The major limitation for a DL system is the lack of training data. For competitive performance, the training set must contain many samples that replicate the real-world cases. While…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: 5 pages, 6 figures, 2 tables

  13. arXiv:2303.11580

    cs.LG

    Efficient Multi-stage Inference on Tabular Data

    Authors: Daniel S Johnson, Igor L Markov

    Abstract: Many ML applications and products train on medium amounts of input data but get bottlenecked in real-time inference. When implementing ML systems, conventional wisdom favors segregating ML code into services queried by product code via Remote Procedure Call (RPC) APIs. This approach clarifies the overall software architecture and simplifies product code by abstracting away ML internals. However, t…

    Submitted 21 July, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

  14. arXiv:2302.14139

    cs.LG cs.AI cs.SE

    Scalable End-to-End ML Platforms: from AutoML to Self-serve

    Authors: Igor L. Markov, Pavlos A. Apostolopoulos, Mia R. Garrard, Tanya Qie, Yin Huang, Tanvi Gupta, Anika Li, Cesar Cardoso, George Han, Ryan Maghsoudian, Norm Zhou

    Abstract: ML platforms help enable intelligent data-driven applications and maintain them with limited engineering effort. Upon sufficiently broad adoption, such platforms reach economies of scale that bring greater component reuse while improving efficiency of system development and maintenance. For an end-to-end ML platform with broad adoption, scaling relies on pervasive ML automation and system integrat…

    Submitted 3 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 10 pages, 1 figure, 2 tables

  15. arXiv:2302.12360

    cs.LG cs.AI

    Practical Knowledge Distillation: Using DNNs to Beat DNNs

    Authors: Chung-Wei Lee, Pavlos Athanasios Apostolopulos, Igor L. Markov

    Abstract: For tabular data sets, we explore data and model distillation, as well as data denoising. These techniques improve both gradient-boosting models and a specialized DNN architecture. While gradient boosting is known to outperform DNNs on tabular data, we close the gap for datasets with 100K+ rows and give DNNs an advantage on small data sets. We extend these results with input-data distillation and…

    Submitted 1 March, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: 11 pages, 1 figure, 17 tables

  16. arXiv:2302.02390

    cs.LG

    Quantized Distributed Training of Large Models with Convergence Guarantees

    Authors: Ilia Markov, Adrian Vladu, Qi Guo, Dan Alistarh

    Abstract: Communication-reduction techniques are a popular way to improve scalability in data-parallel training of deep neural networks (DNNs). The recent emergence of large language models such as GPT has created the need for new approaches to exploit data-parallelism. Among these, fully-sharded data parallel (FSDP) training is highly popular, yet it still encounters scalability bottlenecks. One reason is…

    Submitted 5 February, 2023; originally announced February 2023.

  17. arXiv:2301.07233

    quant-ph cs.ET

    Enhancing quantum computer performance via symmetrization

    Authors: Andrii Maksymov, Jason Nguyen, Yunseong Nam, Igor Markov

    Abstract: Large quantum computers promise to solve some critical problems not solvable otherwise. However, modern quantum technologies suffer various imperfections such as control errors and qubit decoherence, inhibiting their potential utility. The overheads of quantum error correction are too great for near-term quantum computers, whereas error-mitigation strategies that address specific device imperfecti…

    Submitted 17 January, 2023; originally announced January 2023.

  18. arXiv:2210.17357

    cs.LG cs.DC

    L-GreCo: Layerwise-Adaptive Gradient Compression for Efficient and Accurate Deep Learning

    Authors: Mohammadreza Alimohammadi, Ilia Markov, Elias Frantar, Dan Alistarh

    Abstract: Data-parallel distributed training of deep neural networks (DNN) has gained very widespread adoption, but can still experience communication bottlenecks. To address this issue, entire families of compression mechanisms have been developed, including quantization, sparsification, and low-rank approximation, some of which are seeing significant practical adoption. Despite this progress, almost all k…

    Submitted 9 June, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

  19. arXiv:2210.12526

    cs.CR cs.LG

    Federated Calibration and Evaluation of Binary Classifiers

    Authors: Graham Cormode, Igor Markov

    Abstract: We address two major obstacles to practical use of supervised classifiers on distributed private data. Whether a classifier was trained by a federation of cooperating clients or trained centrally out of distribution, (1) the output scores must be calibrated, and (2) performance metrics must be evaluated -- all without assembling labels in one place. In particular, we show how to perform calibratio…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: 24 pages

  20. arXiv:2202.09483

    cs.CL cs.SI

    Data-Driven Mitigation of Adversarial Text Perturbation

    Authors: Rasika Bhalerao, Mohammad Al-Rubaie, Anand Bhaskar, Igor Markov

    Abstract: Social networks have become an indispensable part of our lives, with billions of people producing ever-increasing amounts of text. At such scales, content policies and their enforcement become paramount. To automate moderation, questionable content is detected by Natural Language Processing (NLP) classifiers. However, high-performance classifiers are hampered by misspellings and adversarial text p…

    Submitted 18 February, 2022; originally announced February 2022.

  21. arXiv:2111.12795

    cs.HC cs.AI cs.LG

    Picasso: Model-free Feature Visualization

    Authors: Binh Vu, Igor Markov

    Abstract: Today, Machine Learning (ML) applications can have access to tens of thousands of features. With such feature sets, efficiently browsing and curating subsets of most relevant features is a challenge. In this paper, we present a novel approach to visualize up to several thousands of features in a single image. The image not only shows information on individual features, but also expresses feature i…

    Submitted 24 November, 2021; originally announced November 2021.

  22. CGX: Adaptive System Support for Communication-Efficient Deep Learning

    Authors: Ilia Markov, Hamidreza Ramezanikebrya, Dan Alistarh

    Abstract: The ability to scale out training workloads has been one of the key performance enablers of deep learning. The main scaling approach is data-parallel GPU-based training, which has been boosted by hardware and software support for highly efficient point-to-point communication, and in particular via hardware bandwidth overprovisioning. Overprovisioning comes at a cost: there is an order of magnitude…

    Submitted 29 December, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Journal ref: Middleware 2022

  23. arXiv:2110.07554

    cs.LG cs.AI cs.SE

    Looper: An end-to-end ML platform for product decisions

    Authors: Igor L. Markov, Hanson Wang, Nitya Kasturi, Shaun Singh, Sze Wai Yuen, Mia Garrard, Sarah Tran, Yin Huang, Zehui Wang, Igor Glotov, Tanvi Gupta, Boshuang Huang, Peng Chen, Xiaowen Xie, Michael Belkin, Sal Uryasev, Sam Howie, Eytan Bakshy, Norm Zhou

    Abstract: Modern software systems and products increasingly rely on machine learning models to make data-driven decisions based on interactions with users, infrastructure and other systems. For broader adoption, this practice must (i) accommodate product engineers without ML backgrounds, (ii) support finegrain product-metric evaluation and (iii) optimize for product goals. To address shortcomings of prior p…

    Submitted 21 June, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 11 pages + references, 7 figures; to appear in KDD 2022

  24. arXiv:2109.11577

    cs.LG

    Text Ranking and Classification using Data Compression

    Authors: Nitya Kasturi, Igor L. Markov

    Abstract: A well-known but rarely used approach to text categorization uses conditional entropy estimates computed using data compression tools. Text affinity scores derived from compressed sizes can be used for classification and ranking tasks, but their success depends on the compression tools used. We use the Zstandard compressor and strengthen these ideas in several ways, calling the resulting language-…

    Submitted 7 December, 2021; v1 submitted 23 September, 2021; originally announced September 2021.

    Journal ref: ICBINB workshop at NeurIPS 2021

  25. Mixture-Based Correction for Position and Trust Bias in Counterfactual Learning to Rank

    Authors: Ali Vardasbi, Maarten de Rijke, Ilya Markov

    Abstract: In counterfactual learning to rank (CLTR) user interactions are used as a source of supervision. Since user interactions come with bias, an important focus of research in this field lies in developing methods to correct for the bias of interactions. Inverse propensity scoring (IPS) is a popular method suitable for correcting position bias. Affine correction (AC) is a generalization of IPS that cor…

    Submitted 19 August, 2021; originally announced August 2021.

    Comments: CIKM 2021

  26. arXiv:2108.03708

    quant-ph cs.ET

    Detecting Qubit-coupling Faults in Ion-trap Quantum Computers

    Authors: Andrii O. Maksymov, Jason Nguyen, Vandiver Chaplin, Yunseong Nam, Igor L. Markov

    Abstract: Ion-trap quantum computers offer a large number of possible qubit couplings, each of which requires individual calibration and can be misconfigured. To enhance the duty cycle of an ion trap, we develop a strategy that diagnoses individual miscalibrated couplings using only log-many tests. This strategy is validated on a commercial ion-trap quantum computer, where we illustrate the process of debug…

    Submitted 12 December, 2021; v1 submitted 8 August, 2021; originally announced August 2021.

    Journal ref: HPCA 2022

  27. arXiv:2108.01521

    cs.CR cs.DS

    Bit-efficient Numerical Aggregation and Stronger Privacy for Trust in Federated Analytics

    Authors: Graham Cormode, Igor L. Markov

    Abstract: Private data generated by edge devices -- from smart phones to automotive electronics -- are highly informative when aggregated but can be damaging when mishandled. A variety of solutions are being explored but have not yet won the public's trust and full backing of mobile platforms. In this work, we propose numerical aggregation protocols that empirically improve upon prior art, while providing c…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: 15 pages

  28. arXiv:2104.13818   

    cs.LG math.OC stat.ML

    NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

    Authors: Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy

    Abstract: As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-parallel SGD is QSGD (Alistarh et al., 2017), which quantizes and encodes gradients to reduce communication costs. The baseline variant of QSGD prov…

    Submitted 1 May, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

    Comments: This entry is redundant and was created in error. See arXiv:1908.06077 for the latest version

  29. arXiv:2102.09507

    cs.CL cs.LG cs.SI

    Regular Expressions for Fast-response COVID-19 Text Classification

    Authors: Igor L. Markov, Jacqueline Liu, Adam Vagner

    Abstract: Text classifiers are at the core of many NLP applications and use a variety of algorithmic approaches and software. This paper introduces infrastructure and methodologies for text classifiers based on large-scale regular expressions. In particular, we describe how Facebook determines if a given piece of text - anything from a hashtag to a post - belongs to a narrow topic such as COVID-19. To fully…

    Submitted 21 June, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: 10 pages, 7 tables

  30. arXiv:2102.08465

    cs.SI cs.DL cs.IR cs.LG

    Prioritizing Original News on Facebook

    Authors: Xiuyan Ni, Shujian Bu, Igor L. Markov

    Abstract: This work outlines how we prioritize original news, a critical indicator of news quality. By examining the landscape and life-cycle of news posts on our social media platform, we identify challenges of building and deploying an originality score. We pursue an approach based on normalized PageRank values and three-step clustering, and refresh the score on an hourly basis to capture the dynamics of…

    Submitted 14 March, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: 9 pages, 8 figures, 6 tables, 2 algorithm pseudocodes

    Journal ref: CIKM 2021

  31. arXiv:2102.05612

    cs.LG cs.HC cs.SE

    Personalization for Web-based Services using Offline Reinforcement Learning

    Authors: Pavlos Athanasios Apostolopoulos, Zehui Wang, Hanson Wang, Chad Zhou, Kittipat Virochsiri, Norm Zhou, Igor L. Markov

    Abstract: Large-scale Web-based services present opportunities for improving UI policies based on observed user interactions. We address challenges of learning such policies through model-free offline Reinforcement Learning (RL) with off-policy training. Deployed in a production system for user authentication in a major social network, it significantly improves long-term objectives. We articulate practical…

    Submitted 10 February, 2021; originally announced February 2021.

    Comments: 9 pages, 8 figures, 3 tables

    Journal ref: 2nd Offline Reinforcement Learning Workshop at NeurIPS 2021

  32. arXiv:2010.12460

    cs.LG stat.ML

    Adaptive Gradient Quantization for Data-Parallel SGD

    Authors: Fartash Faghri, Iman Tabrizian, Ilia Markov, Dan Alistarh, Daniel Roy, Ali Ramezani-Kebrya

    Abstract: Many communication-efficient variants of SGD use gradient quantization schemes. These schemes are often heuristic and fixed over the course of training. We empirically observe that the statistics of gradients of deep models change during the training. Motivated by this observation, we introduce two adaptive quantization schemes, ALQ and AMQ. In both schemes, processors update their compression sch…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted at the conference on Neural Information Processing Systems (NeurIPS 2020)

  33. arXiv:2008.00216

    quant-ph cs.AR cs.DC cs.ET physics.comp-ph

    Faster Schrödinger-style simulation of quantum circuits

    Authors: Aneeqa Fatima, Igor L. Markov

    Abstract: Recent demonstrations of superconducting quantum computers by Google and IBM and trapped-ion computers from IonQ fueled new research in quantum algorithms, compilation into quantum circuits, and empirical algorithmics. While online access to quantum hardware remains too limited to meet the demand, simulating quantum circuits on conventional computers satisfies many needs. We advance Schrödinger-st…

    Submitted 24 November, 2020; v1 submitted 1 August, 2020; originally announced August 2020.

    Comments: 14 pages, 15 figures, 4 tables. Version 2: Additional optimizations; improved simulation runtimes; profiling data; comparisons with the latest IBM QISKit simulator; dispelled apparent limitations of techniques. Version 3: Ablation experiments and images for the code snippets

    Journal ref: HPCA 2021

  34. Cascade Model-based Propensity Estimation for Counterfactual Learning to Rank

    Authors: Ali Vardasbi, Maarten de Rijke, Ilya Markov

    Abstract: Unbiased CLTR requires click propensities to compensate for the difference between user clicks and true relevance of search results via IPS. Current propensity estimation methods assume that user click behavior follows the PBM and estimate click propensities based on this assumption. However, in reality, user clicks often follow the CM, where users scan search results from top to bottom and where…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: 4 pages, 2 figures, 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '20)

  35. arXiv:2005.01588

    cs.CY

    Workshops on Extreme Scale Design Automation (ESDA) Challenges and Opportunities for 2025 and Beyond

    Authors: R. Iris Bahar, Alex K. Jones, Srinivas Katkoori, Patrick H. Madden, Diana Marculescu, Igor L. Markov

    Abstract: Integrated circuits and electronic systems, as well as design technologies, are evolving at a great rate -- both quantitatively and qualitatively. Major developments include new interconnects and switching devices with atomic-scale uncertainty, the depth and scale of on-chip integration, electronic system-level integration, the increasing significance of software, as well as more effective means o…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: A Computing Community Consortium (CCC) workshop report, 32 pages

    Report number: ccc2014report_1

  36. arXiv:2002.00467

    cs.IR cs.LG

    Safe Exploration for Optimizing Contextual Bandits

    Authors: Rolf Jagerman, Ilya Markov, Maarten de Rijke

    Abstract: Contextual bandit problems are a natural fit for many information retrieval tasks, such as learning to rank, text classification, recommendation, etc. However, existing learning methods for contextual bandit problems have one of two drawbacks: they either do not explore the space of all possible document rankings (i.e., actions) and, thus, may miss the optimal ranking, or they present suboptimal r…

    Submitted 2 February, 2020; originally announced February 2020.

    Comments: 23 pages, 3 figures

  37. arXiv:2001.05918

    cs.LG stat.ML

    Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent

    Authors: Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh

    Abstract: Machine learning has made tremendous progress in recent years, with models matching or even surpassing humans on a series of specialized tasks. One key element behind the progress of machine learning in recent years has been the ability to train machine learning models in large-scale distributed shared-memory and message-passing environments. Many of these models are trained employing variants of…

    Submitted 28 June, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

  38. arXiv:1908.06077

    cs.LG stat.ML

    NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

    Authors: Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy

    Abstract: As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-parallel SGD is QSGD (Alistarh et al., 2017), which quantizes and encodes gradients to reduce communication costs. The baseline variant of QSGD prov…

    Submitted 3 May, 2021; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: 42 pages, 21 figures. To appear in the Journal of Machine Learning Research (JMLR)

  39. ViTOR: Learning to Rank Webpages Based on Visual Features

    Authors: Bram van den Akker, Ilya Markov, Maarten de Rijke

    Abstract: The visual appearance of a webpage carries valuable information about its quality and can be used to improve the performance of learning to rank (LTR). We introduce the Visual learning TO Rank (ViTOR) model that integrates state-of-the-art visual features extraction methods by (i) transfer learning from a pre-trained image classification model, and (ii) synthetic saliency heat maps generated from…

    Submitted 7 March, 2019; originally announced March 2019.

    Comments: In Proceedings of the 2019 World Wide Web Conference (WWW 2019), May 2019, San Francisco

    ACM Class: H.3.3; I.2

  40. arXiv:1812.04910

    cs.IR

    Online Learning to Rank with List-level Feedback for Image Filtering

    Authors: Chang Li, Artem Grotov, Ilya Markov, Maarten de Rijke

    Abstract: Online learning to rank (OLTR) via implicit feedback has been extensively studied for document retrieval in cases where the feedback is available at the level of individual items. To learn from item-level feedback, the current algorithms require certain assumptions about user behavior. In this paper, we study a more general setup: OLTR with list-level feedback, where the feedback is provided only…

    Submitted 9 January, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

  41. arXiv:1812.04412

    cs.IR cs.LG

    MergeDTS: A Method for Effective Large-Scale Online Ranker Evaluation

    Authors: Chang Li, Ilya Markov, Maarten de Rijke, Masrour Zoghi

    Abstract: Online ranker evaluation is one of the key challenges in information retrieval. While the preferences of rankers can be inferred by interleaving methods, the problem of how to effectively choose the ranker pair that generates the interleaved list without degrading the user experience too much is still challenging. On the one hand, if two rankers have not been compared enough, the inferred preferen…

    Submitted 9 August, 2020; v1 submitted 11 December, 2018; originally announced December 2018.

    Comments: Accepted at TOIS

  42. arXiv:1807.10749

    quant-ph cs.DC cs.ET

    Quantum Supremacy Is Both Closer and Farther than It Appears

    Authors: Igor L. Markov, Aneeqa Fatima, Sergei V. Isakov, Sergio Boixo

    Abstract: As quantum computers improve in the number of qubits and fidelity, the question of when they surpass state-of-the-art classical computation for a well-defined computational task is attracting much attention. The leading candidate task for this milestone entails sampling from the output distribution defined by a random quantum circuit. We develop a massively-parallel simulation tool Rollright that…

    Submitted 26 September, 2018; v1 submitted 27 July, 2018; originally announced July 2018.

    Comments: 32 pages, 3 figures, 1. A new section on how to simulate sampling. 2. New comparisons with simulators developed by other groups. Edited for clarity

    Journal ref: DAC 2020

  43. arXiv:1806.05819  [pdf, other]

    cs.LG stat.ML

    BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback

    Authors: Chang Li, Branislav Kveton, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvari, Masrour Zoghi

    Abstract: In this paper, we study the problem of safe online learning to re-rank, where user feedback is used to improve the quality of displayed lists. Learning to rank has traditionally been studied in two settings. In the offline setting, rankers are typically learned from relevance labels created by judges. This approach has generally become standard in industrial applications of ranking, such as search…

    Submitted 29 June, 2019; v1 submitted 15 June, 2018; originally announced June 2018.

  44. arXiv:1805.03411  [pdf, other]

    cs.IR

    A Click Sequence Model for Web Search

    Authors: Alexey Borisov, Martijn Wardenaar, Ilya Markov, Maarten de Rijke

    Abstract: Getting a better understanding of user behavior is important for advancing information retrieval systems. Existing work focuses on modeling and predicting single interaction events, such as clicks. In this paper, we focus for the first time on modeling and predicting sequences of interaction events, in particular sequences of clicks. We formulate the problem of click sequence prediction and…

    Submitted 9 May, 2018; originally announced May 2018.

  45. arXiv:1712.03554  [pdf, other]

    cs.DS quant-ph

    Simulation of Quantum Circuits via Stabilizer Frames

    Authors: Héctor J. García, Igor L. Markov

    Abstract: Generic quantum-circuit simulation appears intractable for conventional computers and may be unnecessary because useful quantum circuits exhibit significant structure that can be exploited during simulation. For example, Gottesman and Knill identified an important subclass, called stabilizer circuits, which can be simulated efficiently using group-theory techniques and insights from quantum physic…

    Submitted 10 December, 2017; originally announced December 2017.

    Comments: 15 pages, 18 figures, 3 tables

    Journal ref: IEEE Transactions on Computers, vol. 64, no. 8, 2015

  46. arXiv:1711.07848  [pdf, other]

    quant-ph cs.CG cs.ET

    On the Geometry of Stabilizer States

    Authors: Héctor J. García, Igor L. Markov, Andrew W. Cross

    Abstract: Large-scale quantum computation is likely to require massive quantum error correction (QEC). QEC codes and circuits are described via the stabilizer formalism, which represents stabilizer states by keeping track of the operators that preserve them. Such states are obtained by stabilizer circuits (consisting of CNOT, Hadamard and Phase gates) and can be represented compactly on conventional compute…

    Submitted 20 November, 2017; originally announced November 2017.

    Comments: 38 pages, 10 figures, 2 Appendices. arXiv admin note: substantial text overlap with arXiv:1210.6646

    Journal ref: Quantum Information and Computation (QIC), vol. 14, no. 7-8, pp. 683-720, 2014

  47. arXiv:1709.05298  [pdf, other]

    cs.HC cs.IR

    Conversational Exploratory Search via Interactive Storytelling

    Authors: Svitlana Vakulenko, Ilya Markov, Maarten de Rijke

    Abstract: Conversational interfaces are likely to become a more efficient, intuitive and engaging way for human-computer interaction than today's text or touch-based interfaces. Current research efforts concerning conversational interfaces focus primarily on question answering functionality, thereby neglecting support for search activities beyond targeted information lookup. Users engage in exploratory search…

    Submitted 15 September, 2017; originally announced September 2017.

    Comments: Accepted at ICTIR'17 Workshop on Search-Oriented Conversational AI (SCAI 2017)

  48. arXiv:1412.0650  [pdf, ps, other]

    cs.ET cs.NE

    A review of "Mem-computing NP-complete problems in polynomial time using polynomial resources" (arXiv:1411.4798)

    Authors: Igor L. Markov

    Abstract: The reviewed paper describes an analog device that empirically solves small instances of the NP-complete Subset Sum Problem (SSP). The authors claim that this device can solve the SSP in polynomial time using polynomial space, in principle, and observe no exponential scaling in resource requirements. We point out that (a) the properties ascribed by the authors to their device are insufficient to s…

    Submitted 22 April, 2015; v1 submitted 29 November, 2014; originally announced December 2014.

  49. arXiv:1408.3821  [pdf, other]

    cs.ET quant-ph

    Limits on Fundamental Limits to Computation

    Authors: Igor L. Markov

    Abstract: An indispensable part of our lives, computing has also become essential to industries and governments. Steady improvements in computer hardware have been supported by periodic doubling of transistor densities in integrated circuits over the last fifty years. Such Moore scaling now requires increasingly heroic efforts, stimulating research in alternative hardware and stirring controversy. To help e…

    Submitted 8 January, 2015; v1 submitted 17 August, 2014; originally announced August 2014.

    Comments: 15 pages, 4 figures, 1 table

    Journal ref: Nature 512, 147-154 (14 August 2014)

  50. arXiv:1304.7516  [pdf, ps, other]

    cs.ET quant-ph

    Quantum Circuits for GCD Computation with $O(n \log n)$ Depth and O(n) Ancillae

    Authors: Mehdi Saeedi, Igor L. Markov

    Abstract: GCD computations and variants of the Euclidean algorithm enjoy broad uses in both classical and quantum algorithms. In this paper, we propose quantum circuits for GCD computation with $O(n \log n)$ depth with $O(n)$ ancillae. Prior circuit construction needs $O(n^2)$ running time with $O(n)$ ancillae. The proposed construction is based on the binary GCD algorithm and it benefits from log-depth circuit…

    Submitted 28 April, 2013; originally announced April 2013.

    Comments: 5 pages, 6 figures, 1 table