Showing 1–2 of 2 results for author: Odisho, A Y

Searching in archive cs.
  1. arXiv:2407.00886 [pdf, other]

    cs.AI cs.CL cs.LG

    Mechanistic Interpretation through Contextual Decomposition in Transformers

    Authors: Aliyah R. Hsu, Yeshwanth Cherapanamjeri, Anobel Y. Odisho, Peter R. Carroll, Bin Yu

    Abstract: Transformers exhibit impressive capabilities but are often regarded as black boxes due to challenges in understanding the complex nonlinear relationships between features. Interpreting machine learning models is of paramount importance to mitigate risks, and mechanistic interpretability is in particular of current interest as it opens up a window for guiding manual modifications and reverse-engine…

    Submitted 30 June, 2024; originally announced July 2024.

  2. arXiv:2305.17588 [pdf, other]

    cs.CL cs.AI cs.LG

    Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making

    Authors: Aliyah R. Hsu, Yeshwanth Cherapanamjeri, Briton Park, Tristan Naumann, Anobel Y. Odisho, Bin Yu

    Abstract: Pre-trained transformers are often fine-tuned to aid clinical decision-making using limited clinical notes. Model interpretability is crucial, especially in high-stakes domains like medicine, to establish trust and ensure safety, which requires human engagement. We introduce SUFO, a systematic framework that enhances interpretability of fine-tuned transformer feature spaces. SUFO utilizes a range…

    Submitted 26 February, 2024; v1 submitted 27 May, 2023; originally announced May 2023.