Showing 1–1 of 1 results for author: Woolverton, K

  1. arXiv:2302.00805  [pdf, other]

    cs.AI

    Conditioning Predictive Models: Risks and Strategies

    Authors: Evan Hubinger, Adam Jermyn, Johannes Treutlein, Rubi Hudson, Kate Woolverton

    Abstract: Our intention is to provide a definitive reference on what it would take to safely make use of generative/predictive models in the absence of a solution to the Eliciting Latent Knowledge problem. Furthermore, we believe that large language models can be understood as such predictive models of the world, and that such a conceptualization raises significant opportunities for their safe yet powerful…

    Submitted 6 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.