Showing 1–18 of 18 results for author: Zhang, I

Searching in archive cs.
  1. arXiv:2408.10914  [pdf, other]

    cs.CL

    To Code, or Not To Code? Exploring Impact of Code in Pre-training

    Authors: Viraat Aryabumi, Yixuan Su, Raymond Ma, Adrien Morisot, Ivan Zhang, Acyr Locatelli, Marzieh Fadaee, Ahmet Üstün, Sara Hooker

    Abstract: Including code in the pre-training data mixture, even for models not specifically designed for code, has become a common practice in LLMs pre-training. While there has been anecdotal consensus among practitioners that code data plays a vital role in general LLMs' performance, there is only limited work analyzing the precise impact of code on non-code tasks. In this work, we systematically investig…

    Submitted 20 August, 2024; originally announced August 2024.

  2. arXiv:2406.16900  [pdf, other]

    eess.IV cs.CV cs.LG

    Utilizing Weak-to-Strong Consistency for Semi-Supervised Glomeruli Segmentation

    Authors: Irina Zhang, Jim Denholm, Azam Hamidinekoo, Oskar Ålund, Christopher Bagnall, Joana Palés Huix, Michal Sulikowski, Ortensia Vito, Arthur Lewis, Robert Unwin, Magnus Soderberg, Nikolay Burlutskiy, Talha Qaiser

    Abstract: Accurate segmentation of glomerulus instances attains high clinical significance in the automated analysis of renal biopsies to aid in diagnosing and monitoring kidney disease. Analyzing real-world histopathology images often encompasses inter-observer variability and requires a labor-intensive process of data annotation. Therefore, conventional supervised learning approaches generally achieve sub…

    Submitted 30 May, 2024; originally announced June 2024.

    Comments: accepted to MIDL'24

  3. arXiv:2403.14770  [pdf, other]

    cs.AR

    Beehive: A Flexible Network Stack for Direct-Attached Accelerators

    Authors: Katie Lim, Matthew Giordano, Theano Stavrinos, Pratyush Patel, Jacob Nelson, Irene Zhang, Baris Kasikci, Tom Anderson

    Abstract: Direct-attached accelerators, where application accelerators are directly connected to the datacenter network via a hardware network stack, offer substantial benefits in terms of reduced latency, CPU overhead, and energy use. However, a key challenge is that modern datacenter network stacks are complex, with interleaved protocol layers, network management functions, and virtualization support. To…

    Submitted 30 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  4. arXiv:2312.09118  [pdf, other]

    cs.NI cs.CR cs.DC

    LayerZero

    Authors: Ryan Zarick, Bryan Pellegrino, Isaac Zhang, Thomas Kim, Caleb Banister

    Abstract: In this paper, we present the first intrinsically secure and semantically universal omnichain interoperability protocol: LayerZero. Utilizing an immutable endpoint, append-only verification modules, and fully-configurable verification infrastructure, LayerZero provides the security, configurability, and extensibility necessary to achieve omnichain interoperability. LayerZero enforces strict applic…

    Submitted 23 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

  5. arXiv:2311.08639  [pdf, other]

    cs.NI

    ColorTrace: Fungible token coloring and attribution

    Authors: Ryan Zarick, Bryan Pellegrino, Isaac Zhang, Thomas Kim, Caleb Banister

    Abstract: We formally define the fungible token coloring problem of attributing (coloring) fungible tokens to originating entities (minters), and present, to our knowledge, the first practical onchain algorithm to solve it. Tracking attribution of colored tokens losslessly using existing approaches such as the Colored Coins protocol is computationally intractable due to the per-wallet storage requirements g…

    Submitted 14 November, 2023; originally announced November 2023.

  6. arXiv:2311.08041  [pdf, other]

    cs.NI cs.DS

    ColorFloat: Constant space token coloring

    Authors: Ryan Zarick, Bryan Pellegrino, Isaac Zhang, Thomas Kim, Caleb Banister

    Abstract: We present ColorFloat, a family of O(1) space complexity algorithms that solve the problem of attributing (coloring) fungible tokens to the entity that minted them (minter). Tagging fungible tokens with metadata is not a new problem and was first formalized in the Colored Coins protocol. In certain contexts, practical solutions to this challenge have been implemented and deployed such as NFT. We d…

    Submitted 14 November, 2023; originally announced November 2023.

  7. arXiv:2310.03121  [pdf]

    physics.chem-ph cs.LG

    OpenMM 8: Molecular Dynamics Simulation with Machine Learning Potentials

    Authors: Peter Eastman, Raimondas Galvelis, Raúl P. Peláez, Charlles R. A. Abreu, Stephen E. Farr, Emilio Gallicchio, Anton Gorenko, Michael M. Henry, Frank Hu, Jing Huang, Andreas Krämer, Julien Michel, Joshua A. Mitchell, Vijay S. Pande, João PGLM Rodrigues, Jaime Rodriguez-Guerra, Andrew C. Simmonett, Sukrit Singh, Jason Swails, Philip Turner, Yuanqing Wang, Ivy Zhang, John D. Chodera, Gianni De Fabritiis, Thomas E. Markland

    Abstract: Machine learning plays an important and growing role in molecular simulation. The newest version of the OpenMM molecular dynamics toolkit introduces new features to support the use of machine learning potentials. Arbitrary PyTorch models can be added to a simulation and used to compute forces and energy. A higher-level interface allows users to easily model their molecules of interest with general…

    Submitted 29 November, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: 16 pages, 5 figures

    ACM Class: J.2; J.3

  8. arXiv:2304.04488  [pdf, other]

    cs.DC

    Hybrid Computing for Interactive Datacenter Applications

    Authors: Pratyush Patel, Katie Lim, Kushal Jhunjhunwalla, Ashlie Martinez, Max Demoulin, Jacob Nelson, Irene Zhang, Thomas Anderson

    Abstract: Field-Programmable Gate Arrays (FPGAs) are more energy efficient and cost effective than CPUs for a wide variety of datacenter applications. Yet, for latency-sensitive and bursty workloads, this advantage can be difficult to harness due to high FPGA spin-up costs. We propose that a hybrid FPGA and CPU computing framework can harness the energy efficiency benefits of FPGAs for such workloads at rea…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: 13 pages

  9. arXiv:2201.02120  [pdf, other]

    cs.DC cs.CY cs.LG cs.NI

    Treehouse: A Case For Carbon-Aware Datacenter Software

    Authors: Thomas Anderson, Adam Belay, Mosharaf Chowdhury, Asaf Cidon, Irene Zhang

    Abstract: The end of Dennard scaling and the slowing of Moore's Law has put the energy use of datacenters on an unsustainable path. Datacenters are already a significant fraction of worldwide electricity use, with application demand scaling at a rapid rate. We argue that substantial reductions in the carbon intensity of datacenter computing are possible with a software-centric approach: by making energy and…

    Submitted 6 January, 2022; originally announced January 2022.

  10. arXiv:2108.07790  [pdf, other]

    cs.CL cs.LG

    Mitigating harm in language models with conditional-likelihood filtration

    Authors: Helen Ngo, Cooper Raterink, João G. M. Araújo, Ivan Zhang, Carol Chen, Adrien Morisot, Nicholas Frosst

    Abstract: Language models trained on large-scale unfiltered datasets curated from the open web acquire systemic biases, prejudices, and harmful views from their training data. We present a methodology for programmatically identifying and removing harmful text from web-scale datasets. A pretrained language model is used to calculate the log-likelihood of researcher-written trigger phrases conditioned on a sp…

    Submitted 27 November, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

  11. arXiv:2010.01196  [pdf, other]

    physics.comp-ph cs.AI

    End-to-End Differentiable Molecular Mechanics Force Field Construction

    Authors: Yuanqing Wang, Josh Fass, Benjamin Kaminow, John E. Herr, Dominic Rufa, Ivy Zhang, Iván Pulido, Mike Henry, John D. Chodera

    Abstract: Molecular mechanics (MM) potentials have long been a workhorse of computational chemistry. Leveraging accuracy and speed, these functional forms find use in a wide variety of applications in biomolecular modeling and drug discovery, from rapid virtual screening to detailed free energy calculations. Traditionally, MM potentials have relied on human-curated, inflexible, and poorly extensible discret…

    Submitted 18 April, 2022; v1 submitted 2 October, 2020; originally announced October 2020.

  12. arXiv:2008.06536  [pdf, other]

    cs.CR cs.OS

    Making Distributed Mobile Applications SAFE: Enforcing User Privacy Policies on Untrusted Applications with Secure Application Flow Enforcement

    Authors: Adriana Szekeres, Irene Zhang, Katelin Bailey, Isaac Ackerman, Haichen Shen, Franziska Roesner, Dan R. K. Ports, Arvind Krishnamurthy, Henry M. Levy

    Abstract: Today's mobile devices sense, collect, and store huge amounts of personal information, which users share with family and friends through a wide range of applications. Once users give applications access to their data, they must implicitly trust that the apps correctly maintain data privacy. As we know from both experience and all-too-frequent press articles, that trust is often misplaced. While us…

    Submitted 14 August, 2020; originally announced August 2020.

  13. A coarse-to-fine framework for unsupervised multi-contrast MR image deformable registration with dual consistency constraint

    Authors: Weijian Huang, Hao Yang, Xinfeng Liu, Cheng Li, Ian Zhang, Rongpin Wang, Hairong Zheng, Shanshan Wang

    Abstract: Multi-contrast magnetic resonance (MR) image registration is useful in the clinic to achieve fast and accurate imaging-based disease diagnosis and treatment planning. Nevertheless, the efficiency and performance of the existing registration algorithms can still be improved. In this paper, we propose a novel unsupervised learning-based framework to achieve accurate and efficient multi-contrast MR i…

    Submitted 16 February, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

    Journal ref: IEEE Transactions on Medical Imaging (2021)

  14. Talek: Private Group Messaging with Hidden Access Patterns

    Authors: Raymond Cheng, William Scott, Elisaweta Masserova, Irene Zhang, Vipul Goyal, Thomas Anderson, Arvind Krishnamurthy, Bryan Parno

    Abstract: Talek is a private group messaging system that sends messages through potentially untrustworthy servers, while hiding both data content and the communication patterns among its users. Talek explores a new point in the design space of private messaging; it guarantees access sequence indistinguishability, which is among the strongest guarantees in the space, while assuming an anytrust threat model,…

    Submitted 15 December, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

  15. arXiv:1910.03655  [pdf, other]

    cs.CL cs.AI cs.LG

    Executing Instructions in Situated Collaborative Interactions

    Authors: Alane Suhr, Claudia Yan, Charlotte Schluger, Stanley Yu, Hadi Khader, Marwa Mouallem, Iris Zhang, Yoav Artzi

    Abstract: We study a collaborative scenario where a user not only instructs a system to complete tasks, but also acts alongside it. This allows the user to adapt to the system abilities by changing their language or deciding to simply accomplish some tasks themselves, and requires the system to effectively recover from errors as the user strategically assigns it new goals. We build a game environment to stu…

    Submitted 22 November, 2022; v1 submitted 8 October, 2019; originally announced October 2019.

    Comments: EMNLP 2019 long paper

  16. arXiv:1905.13678  [pdf, other]

    cs.LG stat.ML

    Learning Sparse Networks Using Targeted Dropout

    Authors: Aidan N. Gomez, Ivan Zhang, Siddhartha Rao Kamalakara, Divyam Madaan, Kevin Swersky, Yarin Gal, Geoffrey E. Hinton

    Abstract: Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away connections or hidden units. But standard training does not necessarily encourage nets to be amenable to pruning. We introduce targeted dropout, a method for traini…

    Submitted 9 September, 2019; v1 submitted 31 May, 2019; originally announced May 2019.

  17. arXiv:1811.00491  [pdf, other]

    cs.CL cs.CV

    A Corpus for Reasoning About Natural Language Grounded in Photographs

    Authors: Alane Suhr, Stephanie Zhou, Ally Zhang, Iris Zhang, Huajun Bai, Yoav Artzi

    Abstract: We introduce a new dataset for joint reasoning about natural language and images, with a focus on semantic diversity, compositionality, and visual reasoning challenges. The data contains 107,292 examples of English sentences paired with web photographs. The task is to determine whether a natural language caption is true about a pair of photographs. We crowdsource the data using sets of visually ri…

    Submitted 21 July, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: ACL 2019 Long Paper

  18. arXiv:1801.04883  [pdf, other]

    cs.LG

    Unsupervised Cipher Cracking Using Discrete GANs

    Authors: Aidan N. Gomez, Sicong Huang, Ivan Zhang, Bryan M. Li, Muhammad Osama, Lukasz Kaiser

    Abstract: This work details CipherGAN, an architecture inspired by CycleGAN used for inferring the underlying cipher mapping given banks of unpaired ciphertext and plaintext. We demonstrate that CipherGAN is capable of cracking language data enciphered using shift and Vigenere ciphers to a high degree of fidelity and for vocabularies much larger than previously achieved. We present how CycleGAN can be made…

    Submitted 15 January, 2018; originally announced January 2018.