Search | arXiv e-print repository

The algorithmic nature of song-sequencing: statistical regularities in music albums

Authors: Pedro Neto, Martin Hartmann, Geoff Luck, Petri Toiviainen

Abstract: Based on a review of anecdotal beliefs, we explored patterns of track-sequencing within professional music albums. We found that songs with high levels of valence, energy and loudness are more likely to be positioned at the beginning of each album. We also found that transitions between consecutive tracks tend to alternate between increases and decreases of valence and energy. These findings were used to build a system which automates the process of album-sequencing. Our results and hypothesis have both practical and theoretical applications. Practically, sequencing regularities can be used to inform playlist generation systems. Theoretically, we show weak to moderate support for the idea that music is perceived in both global and local contexts.

Submitted 8 August, 2024; originally announced August 2024.

arXiv:2401.00209 [pdf]

AI and Tempo Estimation: A Review

Authors: Geoff Luck

Abstract: The author's goal in this paper is to explore how artificial intelligence (AI) has been utilised to inform our understanding of and ability to estimate at scale a critical aspect of musical creativity - musical tempo. The central importance of tempo to musical creativity can be seen in how it is used to express specific emotions (Eerola and Vuoskoski 2013), suggest particular musical styles (Li and Chan 2011), influence perception of expression (Webster and Weir 2005) and mediate the urge to move one's body in time to the music (Burger et al. 2014). Traditional tempo estimation methods typically detect signal periodicities that reflect the underlying rhythmic structure of the music, often using some form of autocorrelation of the amplitude envelope (Lartillot and Toiviainen 2007). Recently, AI-based methods utilising convolutional or recurrent neural networks (CNNs, RNNs) on spectral representations of the audio signal have enjoyed significant improvements in accuracy (Aarabi and Peeters 2022). Common AI-based techniques include those based on probability (e.g., Bayesian approaches, hidden Markov models (HMM)), classification and statistical learning (e.g., support vector machines (SVM)), and artificial neural networks (ANNs) (e.g., self-organising maps (SOMs), CNNs, RNNs, deep learning (DL)). The aim here is to provide an overview of some of the more common AI-based tempo estimation algorithms and to shine a light on notable benefits and potential drawbacks of each. Limitations of AI in this field in general are also considered, as is the capacity for such methods to account for idiosyncrasies inherent in tempo perception, i.e., how well AI-based approaches are able to think and act like humans.

Submitted 30 December, 2023; originally announced January 2024.

Comments: 9 pages

arXiv:2312.12399 [pdf]

Virtual Reality-Assisted Physiotherapy for Visuospatial Neglect Rehabilitation: A Proof-of-Concept Study

Authors: Andrew Danso, Patti Nijhuis, Alessandro Ansani, Martin Hartmann, Gulnara Minkkinen, Geoff Luck, Joshua S. Bamford, Sarah Faber, Kat Agres, Solange Glasser, Teppo Särkämö, Rebekah Rousi, Marc R. Thompson

Abstract: This study explores a VR-based intervention for Visuospatial neglect (VSN), a post-stroke condition. It aims to develop a VR task utilizing interactive visual-audio cues to improve sensory-motor training and assess its impact on VSN patients' engagement and performance. Collaboratively designed with physiotherapists, the VR task uses directional and auditory stimuli to alert and direct patients, tested over 12 sessions with two individuals. Results show a consistent decrease in task completion variability and positive patient feedback, highlighting the VR task's potential for enhancing engagement and suggesting its feasibility in rehabilitation. The study underlines the significance of collaborative design in healthcare technology and advocates for further research with a larger sample size to confirm the benefits of VR in VSN treatment, as well as its applicability to other multimodal disorders.

Submitted 19 December, 2023; originally announced December 2023.

Comments: 29 pages, 8 figures, 5 tables

arXiv:2310.13291 [pdf, other]

Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks

Authors: Ruixiang Tang, Gord Lueck, Rodolfo Quispe, Huseyin A Inan, Janardhan Kulkarni, Xia Hu

Abstract: Large language models have revolutionized the field of NLP by achieving state-of-the-art performance on various tasks. However, there is a concern that these models may disclose information in the training data. In this study, we focus on the summarization task and investigate the membership inference (MI) attack: given a sample and black-box access to a model's API, it is possible to determine if the sample was part of the training data. We exploit text similarity and the model's resistance to document modifications as potential MI signals and evaluate their effectiveness on widely used datasets. Our results demonstrate that summarization models are at risk of exposing data membership, even in cases where the reference summary is not available. Furthermore, we discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2304.04918 [pdf, other]

doi 10.1145/3543873.3584621

Explicit and Implicit Semantic Ranking Framework

Authors: Xiaofeng Zhu, Thomas Lin, Vishal Anand, Matthew Calderwood, Eric Clausen-Brown, Gord Lueck, Wen-wai Yim, Cheng Wu

Abstract: The core challenge in numerous real-world applications is to match an inquiry to the best document from a mutable and finite set of candidates. Existing industry solutions, especially latency-constrained services, often rely on similarity algorithms that sacrifice quality for speed. In this paper we introduce a generic semantic learning-to-rank framework, Self-training Semantic Cross-attention Ranking (sRank). This transformer-based framework uses linear pairwise loss with mutable training batch sizes and achieves quality gains and high efficiency, and has been applied effectively to show gains on two industry tasks at Microsoft over real-world large-scale data sets: Smart Reply (SR) and Ambient Clinical Intelligence (ACI). In Smart Reply, sRank assists live customers with technical support by selecting the best reply from predefined solutions based on consumer and support agent messages. It achieves 11.7% gain in offline top-one accuracy on the SR task over the previous system, and has enabled 38.7% time reduction in composing messages in telemetry recorded since its general release in January 2021. In the ACI task, sRank selects relevant historical physician templates that serve as guidance for a text summarization model to generate higher quality medical notes. It achieves 35.5% top-one accuracy gain, along with 46% relative ROUGE-L gain in generated medical notes.

Submitted 10 April, 2023; originally announced April 2023.

Journal ref: Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion), April 30-May 4, 2023, Austin, TX, USA

arXiv:2006.10174 [pdf, other]

MIMICS: A Large-Scale Data Collection for Search Clarification

Authors: Hamed Zamani, Gord Lueck, Everest Chen, Rodolfo Quispe, Flint Luu, Nick Craswell

Abstract: Search clarification has recently attracted much attention due to its applications in search engines. It has also been recognized as a major component in conversational information seeking systems. Despite its importance, the research community still feels the lack of a large-scale data for studying different aspects of search clarification. In this paper, we introduce MIMICS, a collection of search clarification datasets for real web search queries sampled from the Bing query logs. Each clarification in MIMICS is generated by a Bing production algorithm and consists of a clarifying question and up to five candidate answers. MIMICS contains three datasets: (1) MIMICS-Click includes over 400k unique queries, their associated clarification panes, and the corresponding aggregated user interaction signals (i.e., clicks). (2) MIMICS-ClickExplore is an exploration data that includes aggregated user interaction signals for over 60k unique queries, each with multiple clarification panes. (3) MIMICS-Manual includes over 2k unique real search queries. Each query-clarification pair in this dataset has been manually labeled by at least three trained annotators. It contains graded quality labels for the clarifying question, the candidate answer set, and the landing result page for each candidate answer. MIMICS is publicly available for research purposes, thus enables researchers to study a number of tasks related to search clarification, including clarification generation and selection, user engagement prediction for clarification, click models for clarification, and analyzing user interactions with search clarification.

Submitted 17 June, 2020; originally announced June 2020.

arXiv:2006.00166 [pdf, other]

Analyzing and Learning from User Interactions for Search Clarification

Authors: Hamed Zamani, Bhaskar Mitra, Everest Chen, Gord Lueck, Fernando Diaz, Paul N. Bennett, Nick Craswell, Susan T. Dumais

Abstract: Asking clarifying questions in response to search queries has been recognized as a useful technique for revealing the underlying intent of the query. Clarification has applications in retrieval systems with different interfaces, from the traditional web search interfaces to the limited bandwidth interfaces as in speech-only and small screen devices. Generation and evaluation of clarifying questions have been recently studied in the literature. However, user interaction with clarifying questions is relatively unexplored. In this paper, we conduct a comprehensive study by analyzing large-scale user interactions with clarifying questions in a major web search engine. In more detail, we analyze the user engagements received by clarifying questions based on different properties of search queries, clarifying questions, and their candidate answers. We further study click bias in the data, and show that even though reading clarifying questions and candidate answers does not take significant efforts, there still exist some position and presentation biases in the data. We also propose a model for learning representation for clarifying questions based on the user interaction data as implicit feedback. The model is used for re-ranking a number of automatically generated clarifying questions for a given query. Evaluation on both click data and human labeled data demonstrates the high quality of the proposed method.

Submitted 29 May, 2020; originally announced June 2020.

Comments: To appear in the Proceedings of SIGIR 2020

Showing 1–7 of 7 results for author: Luck, G