Showing 1–6 of 6 results for author: Mauch, M

  1. arXiv:2401.12068

    cs.SD cs.LG eess.AS

    Resource-constrained stereo singing voice cancellation

    Authors: Clara Borrelli, James Rae, Dogac Basaran, Matt McVicar, Mehrez Souden, Matthias Mauch

    Abstract: We study the problem of stereo singing voice cancellation, a subtask of music source separation, whose goal is to estimate an instrumental background from a stereo mix. We explore how to achieve performance similar to large state-of-the-art source separation networks starting from a small, efficient model for real-time speech separation. Such a model is useful when memory and compute are limited a…

    Submitted 22 January, 2024; originally announced January 2024.

  2. arXiv:2208.12724

    cs.IR

    Multi-objective Hyper-parameter Optimization of Behavioral Song Embeddings

    Authors: Massimo Quadrana, Antoine Larreche-Mouly, Matthias Mauch

    Abstract: Song embeddings are a key component of most music recommendation engines. In this work, we study the hyper-parameter optimization of behavioral song embeddings based on Word2Vec on a selection of downstream tasks, namely next-song recommendation, false neighbor rejection, and artist and genre clustering. We present new optimization objectives and metrics to monitor the effects of hyper-parameter o…

    Submitted 26 August, 2022; originally announced August 2022.

    Comments: 9 pages, 4 figures. Accepted as a paper at ISMIR 2022

  3. arXiv:2112.11436

    cs.CL

    Lyric document embeddings for music tagging

    Authors: Matt McVicar, Bruno Di Giorgi, Baris Dundar, Matthias Mauch

    Abstract: We present an empirical study on embedding the lyrics of a song into a fixed-dimensional feature for the purpose of music tagging. Five methods of computing token-level and four methods of computing document-level representations are trained on an industrial-scale dataset of tens of millions of songs. We compare simple averaging of pretrained embeddings to modern recurrent and attention-based neur…

    Submitted 29 November, 2021; originally announced December 2021.

  4. arXiv:2102.02282

    cs.SD cs.LG cs.MM eess.AS

    Downbeat Tracking with Tempo-Invariant Convolutional Neural Networks

    Authors: Bruno Di Giorgi, Matthias Mauch, Mark Levy

    Abstract: The human ability to track musical downbeats is robust to changes in tempo, and it extends to tempi never previously encountered. We propose a deterministic time-warping operation that enables this skill in a convolutional neural network (CNN) by allowing the network to learn rhythmic patterns independently of tempo. Unlike conventional deep learning approaches, which learn rhythmic patterns at th…

    Submitted 3 February, 2021; originally announced February 2021.

    Comments: 7 pages, 5 figures, Proceedings of the 21st International Society for Music Information Retrieval Conference, ISMIR 2020

    Journal ref: Proceedings of the 21st International Society for Music Information Retrieval Conference (2020) 216-222

  5. arXiv:1502.05417

    physics.soc-ph cs.SD

    The Evolution of Popular Music: USA 1960-2010

    Authors: Matthias Mauch, Robert M. MacCallum, Mark Levy, Armand M. Leroi

    Abstract: In modern societies, cultural change seems ceaseless. The flux of fashion is especially obvious for popular music. While much has been written about the origin and evolution of pop, most claims about its history are anecdotal rather than scientific in nature. To rectify this we investigate the US Billboard Hot 100 between 1960 and 2010. Using Music Information Retrieval (MIR) and text-mining tools…

    Submitted 17 February, 2015; originally announced February 2015.

    Comments: MS: 13 pages, 6 figures; SI: 15 pages, 7 figures

    Journal ref: R. Soc. open sci. 2015 2 150081

  6. Sequential Complexity as a Descriptor for Musical Similarity

    Authors: Peter Foster, Matthias Mauch, Simon Dixon

    Abstract: We propose string compressibility as a descriptor of temporal structure in audio, for the purpose of determining musical similarity. Our descriptors are based on computing track-wise compression rates of quantised audio features, using multiple temporal resolutions and quantisation granularities. To verify that our descriptors capture musically relevant information, we incorporate our descriptors…

    Submitted 28 September, 2014; v1 submitted 27 February, 2014; originally announced February 2014.

    Comments: 13 pages, 9 figures, 8 tables. Accepted version

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22 no. 12, pp. 1965-1977, 2014