Event-Triggered Ephemeral Group Communication and Coordination over Sound for Smart Consumer Devices

Sensors (Basel). 2019 Apr 20;19(8):1883. doi: 10.3390/s19081883.

Abstract

Voice-based interfaces have become one of the most popular device capabilities and are now regarded as a flagship user experience of smart consumer devices. However, the lack of a common coordination mechanism can degrade the user experience, especially when a user interacts with multiple voice-enabled devices located close together. For example, a hotword or wake-up utterance such as "hi Bixby" or "ok Google" frequently triggers redundant responses from several nearby smartphones. Motivated by this problem of uncoordinated reactions among voice-enabled devices in a multi-device environment, we discuss the notion of an ephemeral group of consumer devices, in which the member devices and the transient lifetime are implicitly determined by an external event (e.g., hotword detection) without any provisioned group structure. Specifically, we concentrate on the time-constrained leader election process in such an ephemeral group. (i) We first present a sound-based multiple-device communication framework, namely tailtag, that leverages the isomorphic capability of consumer devices for both processing hotword events and transmitting data over sound. This confines both tasks to the same room area and enables a spontaneous leader election process in an unstructured group upon a hotword event. (ii) To improve the success rate of the leader election under a given time constraint, we then develop an adaptive messaging scheme tailored to sound-based data communication, which inherently has a low data rate. Our adaptive scheme uses an application-specific score that each member device calculates individually for each event detection, and it employs score-based scheduling, by which messages with a high score are scheduled first so that unnecessary message transmissions can be suppressed during the election process. (iii) Through experiments, we also demonstrate that, when a hotword is detected by multiple smartphones in a room, the framework with the adaptive messaging scheme enables them to achieve a coordinated response within the given latency bound, with a non-consensus probability of no more than 2%.
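
The score-based scheduling idea in (ii) can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the tailtag implementation: the abstract does not specify how the score is computed or how messages are timed, so the `Device` type, the linear `backoff_ms` mapping, the 500 ms window, and the 100 ms message duration are all hypothetical. The sketch assumes a single shared acoustic channel on which every device hears every completed message, and a device cancels its own message if it has already heard an equal or higher score by the time its timer fires.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    score: float  # hypothetical application-specific score in [0, 1]


def backoff_ms(score, window_ms=500.0):
    """Assumed mapping: a higher score gives a shorter wait before sending."""
    return (1.0 - score) * window_ms


def elect_leader(devices, window_ms=500.0, msg_ms=100.0):
    """Simulate one election; return (leader, senders, finish_time_ms)."""
    if not devices:
        raise ValueError("at least one device is required")
    # Process devices in the order their timers fire (highest score first).
    order = sorted(devices, key=lambda d: (-d.score, d.name))
    heard = []            # (end_time_ms, score) of messages sent so far
    senders = []
    channel_free_at = 0.0
    for d in order:
        timer = backoff_ms(d.score, window_ms)
        # Suppress if an equal or better score was fully received in time.
        if any(end <= timer and s >= d.score for end, s in heard):
            continue
        # Otherwise send, waiting for the channel if it is still busy.
        start = max(timer, channel_free_at)
        channel_free_at = start + msg_ms
        heard.append((channel_free_at, d.score))
        senders.append(d.name)
    # The top-scoring device always sends first, so it becomes the leader.
    return order[0].name, senders, channel_free_at


if __name__ == "__main__":
    phones = [Device("A", 0.9), Device("B", 0.85), Device("C", 0.3)]
    print(elect_leader(phones))
```

In this toy run, device C stays silent because it hears A's message before its own timer expires, while B, whose score is close to A's, still transmits. Under these assumptions the election uses two messages instead of three, which is the kind of saving that matters on a low-data-rate sound channel.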

Keywords: coordinated react; data communication over sound; ephemeral group; hotword; smart consumer device; voice-based interface.