Showing 1–2 of 2 results for author: Dhankhar, N

  1. arXiv:2407.09276

    cs.CL cs.LG

    H2O-Danube3 Technical Report

    Authors: Pascal Pfeiffer, Philipp Singer, Yauhen Babakhin, Gabor Fodor, Nischay Dhankhar, Sri Satish Ambati

    Abstract: We present H2O-Danube3, a series of small language models consisting of H2O-Danube3-4B, trained on 6T tokens and H2O-Danube3-500M, trained on 4T tokens. Our models are pre-trained on high quality Web data consisting of primarily English tokens in three stages with different data mixes before final supervised tuning for chat version. The models exhibit highly competitive metrics across a multitude…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2401.16818

    cs.CL cs.LG

    H2O-Danube-1.8B Technical Report

    Authors: Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati

    Abstract: We present H2O-Danube, a series of small 1.8B language models consisting of H2O-Danube-1.8B, trained on 1T tokens, and the incremental improved H2O-Danube2-1.8B trained on an additional 2T tokens. Our models exhibit highly competitive metrics across a multitude of benchmarks and, as of the time of this writing, H2O-Danube2-1.8B achieves the top ranking on Open LLM Leaderboard for all models below…

    Submitted 15 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.