Showing 1–17 of 17 results for author: Torrellas, J

  1. arXiv:2408.11988  [pdf, other]

    cs.DC

    Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication

    Authors: Isuru Ranawaka, Md Taufique Hussain, Charles Block, Gerasimos Gerogiannis, Josep Torrellas, Ariful Azad

    Abstract: We consider a sparse matrix-matrix multiplication (SpGEMM) setting where one matrix is square and the other is tall and skinny. This special variant, called TS-SpGEMM, has important applications in multi-source breadth-first search, influence maximization, sparse graph embedding, and algebraic multigrid solvers. Unfortunately, popular distributed algorithms like sparse SUMMA deliver suboptimal per…

    Submitted 21 August, 2024; originally announced August 2024.

  2. arXiv:2408.00741  [pdf, other]

    cs.AI cs.AR cs.DC

    DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency

    Authors: Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Josep Torrellas, Esha Choukse

    Abstract: The rapid evolution and widespread adoption of generative large language models (LLMs) have made them a pivotal workload in various applications. Today, LLM inference clusters receive a large number of queries with strict Service Level Objectives (SLOs). To achieve the desired performance, these models execute on power-hungry GPUs causing the inference clusters to consume large amount of energy an…

    Submitted 1 August, 2024; originally announced August 2024.

  3. Last-Level Cache Side-Channel Attacks Are Feasible in the Modern Public Cloud (Extended Version)

    Authors: Zirui Neil Zhao, Adam Morrison, Christopher W. Fletcher, Josep Torrellas

    Abstract: Last-level cache side-channel attacks have been mostly demonstrated in highly-controlled, quiescent local environments. Hence, it is unclear whether such attacks are feasible in a production cloud environment. In the cloud, side channels are flooded with noise from activities of other tenants and, in Function-as-a-Service (FaaS) workloads, the attacker has a very limited time window to mount the a…

    Submitted 20 May, 2024; originally announced May 2024.

    Journal ref: 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024), Volume 2, pages 582-600, La Jolla, CA, USA, May 2024

  4. arXiv:2403.20306  [pdf, other]

    cs.AI cs.AR cs.DC

    Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference

    Authors: Jovan Stojkovic, Esha Choukse, Chaojie Zhang, Inigo Goiri, Josep Torrellas

    Abstract: With the ubiquitous use of modern large language models (LLMs) across industries, the inference serving for these models is ever expanding. Given the high compute and memory requirements of modern LLMs, more and more top-of-the-line GPUs are being deployed to serve these models. Energy availability has come to the forefront as the biggest challenge for data center expansion to serve these models.…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 6 pages, 15 figures

    ACM Class: C.0; I.2

  5. arXiv:2306.15155  [pdf, other]

    cs.LG cs.PF

    SENSEi: Input-Sensitive Compilation for Accelerating GNNs

    Authors: Damitha Lenadora, Vimarsh Sathia, Gerasimos Gerogiannis, Serif Yesil, Josep Torrellas, Charith Mendis

    Abstract: Over the years, many frameworks and optimization techniques have been proposed to accelerate graph neural networks (GNNs). Compared to the optimizations explored in these systems, we observe that different matrix re-associations of GNN computations lead to novel input-sensitive performance behavior. We leverage this observation to propose SENSEi, a system that exposes different sparse and dense ma…

    Submitted 8 March, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

  6. arXiv:2302.01474  [pdf, other]

    cs.CR cs.AR cs.LG

    Defensive ML: Defending Architectural Side-channels with Adversarial Obfuscation

    Authors: Hyoungwook Nam, Raghavendra Pradyumna Pothukuchi, Bo Li, Nam Sung Kim, Josep Torrellas

    Abstract: Side-channel attacks that use machine learning (ML) for signal analysis have become prominent threats to computer security, as ML models easily find patterns in signals. To address this problem, this paper explores using Adversarial Machine Learning (AML) methods as a defense at the computer architecture layer to obfuscate side channels. We call this approach Defensive ML, and the generator to obf…

    Submitted 14 October, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: Preprint. Under review

  7. arXiv:2205.06444  [pdf, other]

    cs.PL

    UniHeap: Managing Persistent Objects Across Managed Runtimes for Non-Volatile Memory

    Authors: Daixuan Li, Benjamin Reidys, Jinghan Sun, Thomas Shull, Josep Torrellas, Jian Huang

    Abstract: Byte-addressable, non-volatile memory (NVM) is emerging as a promising technology. To facilitate its wide adoption, employing NVM in managed runtimes like JVM has proven to be an effective approach (i.e., managed NVM). However, such an approach is runtime specific, which lacks a generic abstraction across different managed languages. Similar to the well-known filesystem primitives that allow diver…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: A 2 page extended abstract for NVMW 2022

  8. arXiv:2112.10632  [pdf, other]

    cs.AR

    A Method for Hiding the Increased Non-Volatile Cache Read Latency

    Authors: Apostolos Kokolis, Namrata Mantri, Shrikanth Ganapathy, Josep Torrellas, John Kalamatianos

    Abstract: The increased memory demands of workloads is putting high pressure on Last Level Caches (LLCs). Unfortunately, there is limited opportunity to increase the capacity of LLCs due to the area and power requirements of the underlying SRAM technology. Interestingly, emerging Non-Volatile Memory (NVM) technologies promise a feasible alternative to SRAM for LLCs due to their higher area density. However,…

    Submitted 20 December, 2021; originally announced December 2021.

    Comments: 14 pages, 15 figures

  9. arXiv:2007.11818  [pdf, other]

    cs.AR cs.CR

    Speculative Interference Attacks: Breaking Invisible Speculation Schemes

    Authors: Mohammad Behnia, Prateek Sahu, Riccardo Paccagnella, Jiyong Yu, Zirui Zhao, Xiang Zou, Thomas Unterluggauer, Josep Torrellas, Carlos Rozas, Adam Morrison, Frank Mckeen, Fangfei Liu, Ron Gabor, Christopher W. Fletcher, Abhishek Basak, Alaa Alameldeen

    Abstract: Recent security vulnerabilities that target speculative execution (e.g., Spectre) present a significant challenge for processor design. The highly publicized vulnerability uses speculative execution to learn victim secrets by changing cache state. As a result, recent computer architecture research has focused on invisible speculation mechanisms that attempt to block changes in cache state due to s…

    Submitted 23 April, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: Updated CR Version

  10. arXiv:1911.10175  [pdf, other]

    cs.LG cs.DC stat.ML

    SparseTrain: Leveraging Dynamic Sparsity in Training DNNs on General-Purpose SIMD Processors

    Authors: Zhangxiaowen Gong, Houxiang Ji, Christopher Fletcher, Christopher Hughes, Josep Torrellas

    Abstract: Our community has greatly improved the efficiency of deep learning applications, including by exploiting sparsity in inputs. Most of that work, though, is for inference, where weight sparsity is known statically, and/or for specialized hardware. We propose a scheme to leverage dynamic sparsity during training. In particular, we exploit zeros introduced by the ReLU activation function to both featu…

    Submitted 22 November, 2019; originally announced November 2019.

  11. Maya: Falsifying Power Sidechannels with Dynamic Control

    Authors: Raghavendra Pradyumna Pothukuchi, Sweta Yamini Pothukuchi, Petros Voulgaris, Alexander Schwing, Josep Torrellas

    Abstract: The security of computers is at risk because of information leaking through physical outputs such as power, temperature, or electromagnetic (EM) emissions. Attackers can use advanced signal measurement and analysis to recover sensitive data from these sidechannels. To address this problem, this paper presents Maya, a simple and effective solution against power side-channels. The idea is to re-shap…

    Submitted 18 August, 2019; v1 submitted 22 July, 2019; originally announced July 2019.

    Journal ref: 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021, pp. 888-901

  12. Opportunistic Beamforming in Wireless Network-on-Chip

    Authors: S. Abadal, A. Marruedo, A. Franques, H. Taghvaee, A. Cabellos-Aparicio, J. Zhou, J. Torrellas, E. Alarcón

    Abstract: Wireless Network-on-Chip (WNoC) has emerged as a promising alternative to conventional interconnect fabrics at the chip scale. Since WNoCs may imply the close integration of antennas, one of the salient challenges in this scenario is the management of coupling and interferences. This paper, instead of combating coupling, aims to take advantage of close integration to create arrays within a WNoC. T…

    Submitted 12 June, 2019; originally announced June 2019.

    Comments: Presented at IEEE ISCAS 2019, Sapporo, Japan

  13. arXiv:1901.04291  [pdf, other]

    cs.NI

    Engineer the Channel and Adapt to it: Enabling Wireless Intra-Chip Communication

    Authors: Xavier Timoneda, Sergi Abadal, Antonio Franques, Dionysios Manessis, Jin Zhou, Josep Torrellas, Eduard Alarcón, Albert Cabellos-Aparicio

    Abstract: Ubiquitous multicore processors nowadays rely on an integrated packet-switched network for cores to exchange and share data. The performance of these intra-chip networks is a key determinant of the processor speed and, at high core counts, becomes an important bottleneck due to scalability issues. To address this, several works propose the use of mm-wave wireless interconnects for intra-chip commu…

    Submitted 12 February, 2020; v1 submitted 23 December, 2018; originally announced January 2019.

    Comments: 12 pages, 10 figures. IEEE Transactions on Communications Journal, 2020

  14. arXiv:1808.04761  [pdf, other]

    cs.DC cs.CR cs.LG

    Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures

    Authors: Mengjia Yan, Christopher Fletcher, Josep Torrellas

    Abstract: Deep Neural Networks (DNNs) are fast becoming ubiquitous for their ability to attain good accuracy in various machine learning tasks. A DNN's architecture (i.e., its hyper-parameters) broadly determines the DNN's accuracy and performance, and is often confidential. Attacking a DNN in the cloud to obtain its architecture can potentially provide major commercial value. Further, attaining a DNN's arc…

    Submitted 14 August, 2018; originally announced August 2018.

  15. Millimeter-Wave Propagation within a Computer Chip Package

    Authors: X. Timoneda, S. Abadal, A. Cabellos-Aparicio, D. Manessis, J. Zhou, A. Franques, J. Torrellas, E. Alarcón

    Abstract: Wireless Network-on-Chip (WNoC) appears as a promising alternative to conventional interconnect fabrics for chip-scale communications. The WNoC paradigm has been extensively analyzed from the physical, network and architecture perspectives assuming mmWave band operation. However, there has not been a comprehensive study at this band for realistic chip packages and, thus, the characteristics of suc…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: Presented at the 2018 International Symposium on Circuits & Systems (ISCAS)

  16. arXiv:1806.06294  [pdf, ps, other]

    cs.DC cs.NI

    Medium Access Control in Wireless Network-on-Chip: A Context Analysis

    Authors: Sergi Abadal, Albert Mestres, Josep Torrellas, Eduard Alarcón, Albert Cabellos-Aparicio

    Abstract: Wireless on-chip communication is a promising candidate to address the performance and efficiency issues that arise when scaling current Network-on-Chip (NoC) techniques to manycore processors. A Wireless Network-on-Chip (WNoC) can serve global and broadcast traffic with ultra-low latency even in thousand-core chips, thus acting as a natural complement of conventional and throughput-oriented wirel…

    Submitted 16 June, 2018; originally announced June 2018.

    Comments: To appear in IEEE Communications Magazine

  17. arXiv:1609.06756  [pdf]

    cs.CY

    21st Century Computer Architecture

    Authors: Mark D. Hill, Sarita Adve, Luis Ceze, Mary Jane Irwin, David Kaeli, Margaret Martonosi, Josep Torrellas, Thomas F. Wenisch, David Wood, Katherine Yelick

    Abstract: Because most technology and computer architecture innovations were (intentionally) invisible to higher layers, application and other software developers could reap the benefits of this progress without engaging in it. Higher performance has both made more computationally demanding applications feasible (e.g., virtual assistants, computer vision) and made less demanding applications easier to devel…

    Submitted 21 September, 2016; originally announced September 2016.

    Comments: A Computing Community Consortium (CCC) white paper, 16 pages