Showing 1–17 of 17 results for author: Baykal, C

  1. arXiv:2307.08397  [pdf, other]

    cs.CV

    CLIP-Guided StyleGAN Inversion for Text-Driven Real Image Editing

    Authors: Ahmet Canberk Baykal, Abdul Basit Anees, Duygu Ceylan, Erkut Erdem, Aykut Erdem, Deniz Yuret

    Abstract: Researchers have recently begun exploring the use of StyleGAN-based models for real image editing. One particularly interesting application is using natural language descriptions to guide the editing process. Existing approaches for editing images using language either resort to instance-level latent code optimization or map predefined text prompts to some editing directions in the latent space. H…

    Submitted 18 July, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted for publication in ACM Transactions on Graphics

  2. arXiv:2302.03806  [pdf, other]

    cs.LG

    SLaM: Student-Label Mixing for Distillation with Unlabeled Examples

    Authors: Vasilis Kontonis, Fotis Iliopoulos, Khoa Trinh, Cenk Baykal, Gaurav Menghani, Erik Vee

    Abstract: Knowledge distillation with unlabeled examples is a powerful training paradigm for generating compact and lightweight student models in applications where the amount of labeled data is limited but one has access to a large pool of unlabeled data. In this setting, a large teacher model generates "soft" pseudo-labels for the unlabeled dataset which are then used for training the student model. Des…

    Submitted 8 June, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  3. arXiv:2302.00003  [pdf, other]

    cs.LG cs.CL

    The Power of External Memory in Increasing Predictive Model Capacity

    Authors: Cenk Baykal, Dylan J Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

    Abstract: One way of introducing sparsity into deep networks is by attaching an external table of parameters that is sparsely looked up at different layers of the network. By storing the bulk of the parameters in the external table, one can increase the capacity of the model without necessarily increasing the inference time. Two crucial questions in this setting are then: what is the lookup function for acc…

    Submitted 30 January, 2023; originally announced February 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2301.13310

  4. arXiv:2301.13310  [pdf, other]

    cs.LG cs.CL

    Alternating Updates for Efficient Transformers

    Authors: Cenk Baykal, Dylan Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

    Abstract: It has been well established that increasing scale in deep transformer networks leads to improved quality and performance. However, this increase in scale often comes with prohibitive increases in compute cost and inference latency. We introduce Alternating Updates (AltUp), a simple-to-implement method to increase a model's capacity without the computational burden. AltUp enables the widening of t…

    Submitted 3 October, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  5. arXiv:2210.06711  [pdf, other]

    cs.LG cs.AI

    Weighted Distillation with Unlabeled Examples

    Authors: Fotis Iliopoulos, Vasilis Kontonis, Cenk Baykal, Gaurav Menghani, Khoa Trinh, Erik Vee

    Abstract: Distillation with unlabeled examples is a popular and powerful method for training deep neural networks in settings where the amount of labeled data is limited: A large "teacher" neural network is trained on the labeled data available, and then it is used to generate labels on an unlabeled dataset (typically much larger in size). These labels are then utilized to train the smaller "student" mo…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: To appear in NeurIPS 2022

  6. arXiv:2210.01213  [pdf, other]

    cs.LG cs.AI

    Robust Active Distillation

    Authors: Cenk Baykal, Khoa Trinh, Fotis Iliopoulos, Gaurav Menghani, Erik Vee

    Abstract: Distilling knowledge from a large teacher model to a lightweight one is a widely successful approach for generating compact, powerful models in the semi-supervised learning setting where a limited amount of labeled data is available. In large-scale applications, however, the teacher tends to provide a large number of incorrect soft-labels that impair student performance. The sheer size of the tea…

    Submitted 4 February, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  7. arXiv:2208.04461  [pdf, other]

    cs.LG stat.ML

    A Theoretical View on Sparsely Activated Networks

    Authors: Cenk Baykal, Nishanth Dikkala, Rina Panigrahy, Cyrus Rashtchian, Xin Wang

    Abstract: Deep and wide neural networks successfully fit very complex functions today, but dense models are starting to be prohibitively expensive for inference. To mitigate this, one promising direction is networks that activate a sparse subgraph of the network. The subgraph is chosen by a data-dependent routing function, enforcing a fixed mapping of inputs to subnetworks (e.g., the Mixture of Experts (MoE…

    Submitted 8 August, 2022; originally announced August 2022.

    Comments: 18 pages, 7 figures

  8. arXiv:2202.03621  [pdf, other]

    cs.LG

    Bandit Sampling for Multiplex Networks

    Authors: Cenk Baykal, Vamsi K. Potluru, Sameena Shah, Manuela M. Veloso

    Abstract: Graph neural networks have gained prominence due to their excellent performance in many classification and prediction tasks. In particular, they are used for node classification and link prediction, which have a wide range of applications in social networks, biomedical data sets, and financial transaction graphs. Most of the existing work focuses primarily on the monoplex setting where we have acce…

    Submitted 7 February, 2022; originally announced February 2022.

  9. arXiv:2106.03033  [pdf, other]

    cs.LG cs.SI

    Graph Belief Propagation Networks

    Authors: Junteng Jia, Cenk Baykal, Vamsi K. Potluru, Austin R. Benson

    Abstract: With the widespread availability of complex relational data, semi-supervised node classification in graphs has become a central machine learning problem. Graph neural networks are a recent class of easy-to-train and accurate methods for this problem that map the features in the neighborhood of a node to its label, but they ignore label correlation during inference and their predictions are diffic…

    Submitted 6 June, 2021; originally announced June 2021.

  10. arXiv:2104.02822  [pdf, other]

    cs.LG

    Low-Regret Active learning

    Authors: Cenk Baykal, Lucas Liebenwein, Dan Feldman, Daniela Rus

    Abstract: We develop an online learning algorithm for identifying unlabeled data points that are most informative for training (i.e., active learning). By formulating the active learning problem as the prediction with sleeping experts problem, we provide a regret minimization framework for identifying relevant data with respect to any given definition of informativeness. Motivated by the successes of ensemb…

    Submitted 22 February, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

  11. arXiv:2103.03014  [pdf, other]

    cs.LG cs.AI cs.CV

    Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy

    Authors: Lucas Liebenwein, Cenk Baykal, Brandon Carter, David Gifford, Daniela Rus

    Abstract: Neural network pruning is a popular technique used to reduce the inference costs of modern, potentially overparameterized, networks. Starting from a pre-trained network, the process is as follows: remove redundant parameters, retrain, and repeat while maintaining the same test accuracy. The result is a model that is a fraction of the size of the original with comparable predictive performance (tes…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: Published in MLSys 2021

  12. arXiv:2002.06469  [pdf, other]

    cs.LG stat.ML

    On Coresets for Support Vector Machines

    Authors: Murad Tukan, Cenk Baykal, Dan Feldman, Daniela Rus

    Abstract: We present an efficient coreset construction algorithm for large-scale Support Vector Machine (SVM) training in Big Data and streaming applications. A coreset is a small, representative subset of the original data points such that models trained on the coreset are provably competitive with those trained on the original data set. Since the size of the coreset is generally much smaller than the or…

    Submitted 15 February, 2020; originally announced February 2020.

  13. arXiv:1911.07412  [pdf, other]

    cs.LG stat.ML

    Provable Filter Pruning for Efficient Neural Networks

    Authors: Lucas Liebenwein, Cenk Baykal, Harry Lang, Dan Feldman, Daniela Rus

    Abstract: We present a provable, sampling-based approach for generating compact Convolutional Neural Networks (CNNs) by identifying and removing redundant filters from an over-parameterized network. Our algorithm uses a small batch of input data points to assign a saliency score to each filter and constructs an importance sampling distribution where filters that highly affect the output are sampled with cor…

    Submitted 23 March, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

  14. arXiv:1910.05422  [pdf, other]

    cs.LG cs.DS stat.ML

    SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks

    Authors: Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus

    Abstract: We introduce a pruning algorithm that provably sparsifies the parameters of a trained model in a way that approximately preserves the model's predictive accuracy. Our algorithm uses a small batch of input points to construct a data-informed importance sampling distribution over the network's parameters, and adaptively mixes a sampling-based and deterministic pruning procedure to discard redundant…

    Submitted 14 March, 2021; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: First two authors contributed equally

  15. arXiv:1804.05345  [pdf, other]

    cs.LG cs.DS stat.ML

    Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds

    Authors: Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus

    Abstract: We present an efficient coresets-based neural network compression algorithm that sparsifies the parameters of a trained fully-connected neural network in a manner that provably approximates the network's output. Our approach is based on an importance sampling scheme that judiciously defines a sampling distribution over the neural network parameters, and as a result, retains parameters of high impo…

    Submitted 17 May, 2019; v1 submitted 15 April, 2018; originally announced April 2018.

    Comments: First two authors contributed equally

  16. arXiv:1708.03835  [pdf, other]

    cs.DS cs.LG

    Training Support Vector Machines using Coresets

    Authors: Cenk Baykal, Lucas Liebenwein, Wilko Schwarting

    Abstract: We present a novel coreset construction algorithm for solving classification tasks using Support Vector Machines (SVMs) in a computationally efficient manner. A coreset is a weighted subset of the original data points that provably approximates the original set. We show that coresets of size polylogarithmic in $n$ and polynomial in $d$ exist for a set of $n$ input points with $d$ features and pres…

    Submitted 9 November, 2017; v1 submitted 12 August, 2017; originally announced August 2017.

  17. arXiv:1707.02386  [pdf, other]

    cs.NI eess.SY

    Detection of AQM on Paths using Machine Learning Methods

    Authors: Cenk Baykal, Wilko Schwarting, Alex Wallar

    Abstract: In this paper, we address the problem of determining whether a bottleneck router on a given network path is using an AQM or a drop-tail scheme. We assume that we are given a source-to-sink path of interest (along which a bottleneck router exists) and data regarding the Round-Trip Times (RTT) and Congestion Window (CWND) sizes with respect to this flow. We develop a reliable classification algorith…

    Submitted 7 July, 2017; originally announced July 2017.