Actuation without production bias

James Kirby^a and Morgan Sonderegger^b
^a Institute for Phonetics and Speech Processing, Ludwig-Maximilians-Universität München
^b Department of Linguistics, McGill University
^a [email protected], ^b [email protected]
(Preprint: June 2024)
Abstract

Phonetic production bias is the external force most commonly invoked in computational models of sound change, despite the fact that it is not responsible for all, or even most, sound changes. Furthermore, the existence of production bias alone cannot account for how changes do or do not propagate throughout a speech community. While many other factors have been invoked by (socio)phoneticians, including but not limited to contact (between subpopulations) and differences in social evaluation (of variants, groups, or individuals), these are not typically modeled in computational simulations of sound change. In this paper, we consider whether production biases have a unique dynamics in terms of how they impact the population-level spread of change in a setting where agents learn from multiple teachers. We show that, while the dynamics conditioned by production bias are not unique, it is not the case that all perturbing forces have the same dynamics: in particular, if social weight is a function of individual teachers and the correlation between a teacher’s social weight and the extent to which they realize a production bias is weak, change is unlikely to propagate. Nevertheless, it remains the case that changes initiated from different sources may display a similar dynamics. A more nuanced understanding of how population structure interacts with individual biases can thus provide a (partial) solution to the ‘non-phonologization problem’.

Keywords: Propagation, actuation, population dynamics, dynamical systems, sound change

1 Introduction

How and why do speech sounds change? An outstanding goal of much work on sound change has been to understand what kinds of sound change are and aren’t possible, and why a given change takes place in certain languages when the necessary pre-conditions are present, but not in others – the celebrated actuation problem of Weinreich et al. (1968). If we think of a sound change as characterizing a change in the norm of a speech community (a definition that would certainly not be shared by all researchers), the actuation problem is separable into at least three distinct sub-problems:

  1. What conditions the generation of an innovative variant in the speech of a single speaker?
  2. Why do certain individuals adopt the innovative variants of others?
  3. How do variants propagate in a speech community?

We refer to these as the problems of initiation, transmission and propagation, respectively (cf. Bermúdez-Otero 2020, Lindblom et al. 1995, Milroy & Milroy 1985, Janda 2003, Stevens & Harrington 2014, Croft 2000, Hall-Lew et al. 2021b).

Implicit in this type of bi- or tripartite[1] division is the idea that change at the population level originates in phonetic variation at the individual level, an assumption made explicit by researchers such as Ohala (1981, 184): “…the initiation of such sound changes is accomplished by the phonetic mechanism just described; their spread, however, is done by social means, e.g., borrowing, imitation, etc.”

[1] We find it useful to explicitly separate problems 2 and 3 in order to emphasize the distinction between the transmission of a variant between two individuals (sub-problem 2) and the propagation of a variant throughout a speech community to the extent that it becomes the dominant variant in that community (sub-problem 3). In the limiting case where the speech community in question consists of just two individuals, then these will functionally be the same, but we think it is still useful to draw a distinction between transmission/learning at the individual level and propagation/spread at the population level.

Again, while not all researchers agree that changes at the individual level are relevant, observable, or even theoretically coherent (see e.g., discussions in Weinreich et al. 1968, Hall-Lew et al. 2021b, and Garrett 2015), a great deal of work has implicitly or explicitly assumed some division of labor along these lines. It is thus worth exploring whether a model that assumes these components is in principle capable of generating patterns of change that correspond to the observed typology.

1.1 Initiation of change: phonetic biases

A common assumption made in many models is that while on some level sound change involves language learners/users selecting from what Ohala memorably called the ‘pool of synchronic variation’ (Ohala 1989), the variants in the pool are determined by universal aspects of speech production and perception, or phonetic bias factors (Ohala 1993, Blevins 2004, 2006, Garrett & Johnson 2013, and many others). Phonetic bias factors are asymmetries in patterns of phonetic realization which make certain sound changes more likely than others. Phonetic biases were traditionally understood to be production biases – physiological constraints on speech production – but acoustic-perceptual biases may also play a role (for reviews see Garrett & Johnson 2013, Ohala 1997, Blevins 2015). In this chapter, we use the term production bias to refer specifically to incremental and asymmetric phonetic biases with an articulatory basis. For example, vowel undershoot is the result of a production bias affecting the speed with which the articulators involved can achieve a particular spatial target. This is to be distinguished from vowel reduction, the outcome of phonologizing undershoot. What is important for present purposes is that production bias refers to the constraints giving rise to asymmetric, but non-phonologized variability.

Importantly, the mere existence of phonetic biases does not entail that they will lead to change at the population level. For as long as the existence of external forces impinging on speech patterns has been noted, so too has the observation that they don’t make change inevitable: indeed, the equilibrium state of language is arguably stability, not change (Milroy 1992, Janda 2003, Kiparsky 2015). At a minimum, a bias needs to result in a change in the speech behavior of at least a single individual, the classic Ohalaian ‘mini-sound change’. The idea that sound changes can take hold in the speech of individuals has led to work investigating why certain individuals (the innovators of Milroy & Milroy 1985) may be more poised to adopt bias variants than others. Proposals include individual differences in perception and/or cognitive processing style (Yu et al. 2013, Yu & Zellou 2019), production (Dediu & Moisik 2019), and/or learning experience (Bermúdez-Otero 2020). For more discussion of the role of individual differences in sound change, see the papers in Hall-Lew et al. (2021b). The critical point for the present discussion is simply that while biases exist, they do not necessarily entail change, even at the level of the individual.

1.2 Propagation of change: non-phonetic bias

Much as not all biases result in innovations in the speech of individual speakers, not all innovations propagate in a population. Why not? Here a wide range of proposals have been advanced. Lindblom et al. (1995) suggest that both articulatory ease and perceptual confusability may contribute to making some changes more or less likely to take hold. Other researchers propose that bias factors may be kept in check by a countervailing structural force, such as categoricity bias (Kirby & Sonderegger 2013, 2015) or contrast maintenance (Sóskuthy 2015). Kirby & Sonderegger (2015), building on the tradition of modeling population-level language change of Niyogi et al. (Niyogi & Berwick 1996, Niyogi 2006, Sonderegger & Niyogi 2010, 2013), show how population dynamics can be a source of stability: a change present in some individuals rooted in the same bias factor can spread or not spread through the population, depending on the learning algorithm assumed at the level of individuals and on population structure (e.g., how connected agents are), with population-level dynamics differing crucially depending on whether each agent learns from one or multiple teachers (Niyogi 2006, Niyogi & Berwick 2009, Smith 2009, Kirby & Sonderegger 2013, 2015).

There is, however, something of a disconnect between computational models of sound change propagation and theoretical and empirical work on change in the population. As noted above, much work on sound change from a phonetic perspective has emphasized the role of phonetic bias factors, and these have been implemented in models such as Pierrehumbert (2001) or Sóskuthy (2015). Yet, the broader literature on language variation and change often refers to other forces driving spread of a variant through a population: contact between groups with and without the variant, and social meaning attached to the variant or those who use it. As the above quote from Ohala suggests, the forces which give rise to the initial conditions for change at the level of the individual may be distinct from those which drive transmission and propagation at the population level. To understand the extent to which phonetic biases actually shape sound change typology, then, it is necessary to study them within the context of a population as well as within individuals, and to consider the full spectrum of forces which cause or inhibit the spread of a variant.

Perhaps most prosaically, frequency of interaction between and/or accommodation to individuals or groups of individuals impacts both transmission and propagation of variants (Trudgill 1986). Indeed, the many decades of research in the Labovian tradition argue that propagation of change in a speech community involves orderly shifts in the frequency of competing variants along demographic dimensions (Labov 2001). All else being equal, speakers are more likely to imitate the speech of groups (however defined, e.g., speakers of a different dialect, a certain gender, or a locally-defined category such as ‘burnout’ in Eckert 2000) they are more often in contact with, with group contact potentially magnifying existing phonetic bias-driven asymmetries (Harrington et al. 2018). Yet frequency of contact cannot be considered in a social vacuum. Work in the variationist sociolinguistic tradition (including Eckert 2000, Labov 2001, 2007, Eckert & Labov 2017, etc. and associated formal/computational work such as Burnett 2019 and Kauhanen 2020) emphasizes the centrality of social evaluation in determining who imitates whom and under what circumstances. An important idea in this line of work is that not only innovative groups, but also innovative speakers are more likely to be imitated (Labov 1990, 2001, Harrington & Schiel 2017, Harrington et al. 2018). Approaches stemming from accommodation theory, including ‘identity-projection’ models and the notion of ‘style design’ (for a review, see Auer & Hinskens 2005), further emphasize that imitation is not automatic, but conditional on the social relationship between speaker and hearer. From the sociolinguistic perspective, then, understanding the dynamics of sound change requires understanding not just the source of a novel variant (which may indeed have come about due to some phonetic bias), but population-level variability in how variants are evaluated.

Other work raises the possibility that the likelihood of propagation could involve evaluation of specific variants themselves. For example, Baker et al. (2011) argue that adoption proceeds by imitation of variation, but that awareness of variation is limited to extreme targets. If learners only adopt variants they are aware of, and since extreme variants will by definition be rare, this helps explain why sound change itself is rare. Garrett & Johnson (2013) consider the possibility that whether or not a given token (exemplar) is retained (stored) may be socially motivated. This could potentially mean one of two things: that certain learners are more prone to retain innovative variants, potentially for reasons outlined above, or that a given variant has some socioindexical value affecting the probability with which it will be stored or retained, independent of the speaker who utters it (or the hearer’s relationship with/evaluation of that speaker). The idea that particular variants may be more salient to listeners is consistent with the sociolinguistic proposal that a variant must somehow be ‘marked’ before it can take on valence (social meaning) (Hall-Lew et al. 2021a, Silverstein 2003).

The foregoing review, while far from comprehensive, highlights the range of external forces that may be involved in the actuation of sound change at the population level above and beyond phonetic biases. That is to say, while change might take place at the population level due to the enhancement of an asymmetric phonetic bias, we must also consider other situations characterized by asymmetries like frequency of interaction, social evaluation, or individual differences. This leads naturally to asking: what do the dynamics of actuation look like when invoking forces besides phonetic bias? There may well be substantive differences in the evolution of a variant which are dependent on our assumptions about how that variant is propagated: due to positive evaluation of a group, of particular individuals, or of the variant itself. Because of the non-trivial mapping between individual learning/usage and population-level dynamics, computational modeling provides an attractive approach to understanding the potential contributions of different factors.

1.3 Computational approaches to modeling propagation

Empirically evaluating the time course of sound changes usually requires access to phonetic data gathered over long timespans (whether from panel studies or using apparent-time data), such as the Philadelphia Neighborhood Corpus (Labov et al. 2013), the Atlas of North American English (Labov et al. 2006), the Queen’s Christmas broadcasts (Harrington et al. 2000), or the Sounds of the City corpus (Stuart-Smith et al. 2017, Sonderegger et al. 2020, Sóskuthy & Stuart-Smith 2020; see also Cox, Palethorpe, and Penney, this volume). Such datasets enable rich inferences about the dynamics of how changes unfold over time in a population setting. Computational modeling is a complementary approach, providing not only a way to study changes for which the necessary empirical data are lacking, but also a (relatively) quick way to test hypotheses and make empirical predictions. Moreover, a well-specified computational model provides a baseline against which to judge alternative explanations, and the implementation of a model – the process of translating theoretical concepts into concrete parameters – can prove invaluable in terms of evaluating and adjudicating between possible explanations, and generally helps to avoid the “hazards of unaided reasoning”.[2] This is the case not only because intuitions based on purely verbal descriptions can often be misleading, but also because in complex settings like language change, it is far from obvious how different factors will interact: iterating weak forces over time generally leads to surprises.

[2] “With purely verbal arguments about evolutionary processes, it is too often the case that our conclusions do not follow from our assumptions. Unaided reasoning about the mass effects of many small forces operating over many generations has proven to be hazardous. Formalizing our arguments helps us understand which stories are possible explanations.” (McElreath & Boyd 2007, 6)

A good example of this is Baker (2008), a computational exploration of the classical Neogrammarian hypothesis that sound change is fundamentally the accumulation of error. In his simulations, agents are connected in a social network based on relative prestige and interact only with other agents in their subnetwork. Tokens produced by each agent are subject to a small amount of random noise. Baker shows that such a model is unable to produce the typical s-shaped curve familiar from empirical studies of sound change, in which transition between periods of stability is initially slow but then proceeds quite rapidly. Baker’s model also underscores an important point: because most variation does not result in change (what Kiparsky 2015 calls the ‘non-phonologization problem’), any adequate model of sound (indeed language) change must be able to model stability, as well as change. As Baker aptly demonstrates, simple-minded implementations of phonetic bias are insufficient in this regard, since the introduction of the bias in such models inevitably leads to its adoption (cf. Pierrehumbert 2001 and Wedel 2004, who show something similar in terms of how unconstrained memorization of new exemplars leads to increasingly diffuse phonetic category distributions).

Much like actuation may be initiated by forces other than production bias, stability may come about due to different mechanisms in different models. Several lines of modeling work have found that to meet the stability goal, it is necessary to include some kind of force promoting contrast maintenance, to keep separate phonetic categories stable, alongside an external force, such as a production bias, which induces change (Pierrehumbert 2001, Wedel 2006, Kirby 2013, Kirby & Sonderegger 2015, Sóskuthy 2015). Other work shows how individual- or group-level differences in evaluation can promote stability. For example, Garrett & Johnson (2013) demonstrate how both stability and change can come about in a model where a listener or listener group disregards biased variants. These simulations are instructive in that they show how different results may arise from the same starting conditions under different assumptions about the nature of bias.

1.3.1 Two modeling traditions: interactive-phonetic and population dynamics

Every computational model must contain some simplifications, and different models articulate different components in more or less detail, depending on their goals. We focus here on two strands of the computational modeling of sound change literature which make different choices in these regards, but which share a focus on two central goals: (1) replicating the dynamics of change observed in real-world cases and in population settings, while (2) allowing for both stability and change in the face of bias factors.

The first literature can be exemplified by the ‘interactive-phonetic’ agent-based model of Harrington and colleagues (hereafter IP-ABM: Harrington & Schiel 2017, Harrington et al. 2018, Stevens et al. 2019, Stevens & Harrington 2022, Gubian et al. 2023). While there are important differences between the various versions of this model, all implementations share some basic properties. First, they all consider interactions between a population of agents, with realistic initial conditions, seeded with data taken from phonetic studies of real speakers. Second, the representation of individuals in the model is fairly complex, including increasingly sophisticated and well-articulated representations of phonetic and phonological categories and rules for how they are updated and change as agents interact. Third, such models assume finite population sizes and (usually) exclusively ‘horizontal’ transmission in which all agents are part of a single generation that interacts with itself. The evolutionary dynamics of such models are inherently stochastic, meaning that typical behavior is obtained by running a simulation with a given set of inputs and parameterization many times. Finally, there is a common force which drives change in all of these models: the interaction between two distinct subpopulations, one of which shows the influence of a phonetic bias factor.

Simulations in these papers explore how different initial configurations can lead to stability vs. change, and the results are then compared to the trajectories of sound changes in progress, some of them known historical changes. For example, Stevens & Harrington (2022) consider the case of /s/-retraction, which is apparent in <str-> words in Australian English but not in Italian. In their IP-ABM simulations, Italian-initialized agents did not show evidence of /s/-retraction while English-initialized agents did. This demonstrates that both stability and change are possible within the IP-ABM framework (see also Jochim and Kleber, this volume). At least in its current implementation, however, stability and change arise in the IP-ABM model not due to different values of model parameters, but as a function of the input data and initial conditions (how agents are initialized). Thus, while the internal representations of agents are complex, the exploration of dynamics is simplified.

Another strand of this literature assumes a much simpler representation of each agent, and focuses on the study of the population-level dynamics as model parameters are varied. This includes approaches such as that of Sóskuthy (2015) as well as our own prior work (Kirby & Sonderegger 2013, 2015), described in more detail below, which builds upon earlier explorations of both sound change (Sonderegger & Niyogi 2010, 2013) as well as syntactic change (Niyogi & Berwick 1996, 2009, Niyogi 2006). These studies assume much simpler models of category representation and learning than the IP-ABM model, but emphasize the impact of varying a range of model parameters, such as phonetic bias, categoricity bias, population structure, and variant frequency. Of central interest is mapping out the space of stable and unstable states as model parameters are varied (what Sóskuthy calls the ‘adaptive landscape’), which determine what path the population-level sound system will follow for some given initial state.

A central claim of this latter literature is that (at least part of) the solution to the actuation problem lies in the population-level dynamics: sound systems (characterized as a population-level distribution of how a particular word or sound is pronounced) will always evolve towards stable states; thus, stability will (correctly) emerge as the default. A sound system only changes when a change in some model parameter makes the population’s current state unstable. As in any complex adaptive system, such (phase) transitions tend to be non-linear: as a model parameter is changed (e.g., the degree of phonetic bias relative to categoricity bias) past a critical value, the population’s current state suddenly becomes unstable (giving rise to the celebrated S-curve of linguistic change; see e.g., Blythe & Croft 2012). In this general approach, the goal of the modeler is therefore to map out how changing model parameters changes what the stable states are.

If the approach typified by the IP-ABM model has as its goal to find model configurations which replicate known cases of change (but which do not erroneously predict change where none has occurred), models of the second type are assessed by their ability to meet three more abstract goals. We seek models with a general structure that we can show allows for (a) the possibility of stability in the face of bias; (b) the possibility of change in the face of bias; and (c) a nonlinear transition from stable variation to change as a function of system parameters.

1.4 The present study

In previous work, we have described a modeling framework (outlined in §2 below) in which categoricity bias can enforce stability even in the face of production bias, and in which production bias can induce change even in the presence of categoricity bias. Such a model is adequate in the sense that it allows for both stability and change. However, as discussed in §1, production biases are not the only forces which can perturb a population from equilibrium. This leads us to pose the following research questions:

  1. Does the use of production bias as a perturbing force have a unique dynamics, or can a nonlinear transition from one stable state to another emerge when a different force is employed?
  2. If it can, will any kind of external force produce the same dynamics at the population level, or do different forces have different dynamics?

Note that in this setting, ‘dynamics’ means the deterministic dynamics of well-mixed and infinitely large populations which define a landscape of possible trajectories of change, rather than the properties of specific temporal trajectories in simulations corresponding to a real-world instance of a particular sound change (as is typically the focus in agent-based simulations such as the IP-ABM).

We consider models containing two external forces beyond phonetic production bias: (1) contact between subpopulations with different stable speech patterns, and (2) the weight agents give to tokens depending on their source (based either on the speaker’s group membership or individual identity) or the extent to which a token is phonetically innovative. We assess each model based on whether both stability and change are possible as model parameters (= the strength of the external force) are varied. In this paper, we focus on the particular example of the propagation of a phonologized coarticulatory variant, but the general framework is applicable to many types of change involving a wide range of non-phonetic production biases.

The broader question addressed by this exercise is: can we safely assume that any proposed force driving change could lead to change, iterated over time in a population? This fundamental assumption is implicit in a great deal of theorizing about sound change, but remains untested. The answer is not obvious, because (as we will see) even in simple models where individual agents are anything but sophisticated, unintuitive outcomes may still result at the population level (see also Jochim and Kleber, this volume). If the answer turns out to be ‘no’, this would mean that a thorough understanding of population dynamics, in addition to their source, will be necessary to get to grips with how particular changes actuate.

2 Modeling framework

We begin by reviewing the simulation framework introduced in Kirby & Sonderegger (2013, 2015), which forms the basis for the simulations we present here.[3] Our notation follows Kirby & Sonderegger (2015), which can be consulted for more detail on the general framework. In that work, as well as in the simulations discussed in the present paper, we use the phenomenon of West Germanic primary umlaut, a textbook example of the phonologization of coarticulation, as an illustrative example (Table 1). In pre-Old High German (OHG), short low /a/ was fronted and raised to /e/ when a high front vowel or glide occurred in the following syllable, as in *[gasti] > /gesti/ (modern German Gäste). At some stage, the conditioning vowel was weakened, but presumably not before the raising of the stem vowel was firmly established. We focus therefore on the state of affairs which gave rise to the shift from proto-West Germanic to pre-OHG, i.e., the establishment of an [e]-like allophone of /a/ in the context of a following /i/.

[3] Code for simulations reported in this paper can be found at https://osf.io/b7sgq/. Code for simulations reported in Kirby & Sonderegger (2015) can be found at https://github.com/kirbyj/evomod/tree/master/ks.

Table 1: Primary umlaut in West Germanic (after Iverson & Salmons 1996, 71).

WGmc      Pre-OHG   OHG    Mod. German
*gasti    gesti     gest   Gäste
*lambir   lembir    lemb   Lämme
*fasti    festi     fest   fest


In Kirby & Sonderegger (2015), we began by making as few assumptions as possible and built up an increasingly complex model until we had an adequate account of actuation, defined as a model which generates the following regimes:

  (a) the stability of limited coarticulation in the population, as in pre-Old High German;
  (b) the stability of full coarticulation in the population, as in Old High German; and
  (c) sudden and nonlinear change from stable limited to stable full coarticulation, as model parameters are varied.

We refer to (a)-(c) as our modeling goals.

2.1 Linguistic setting

Agents in our model are represented by a simple lexicon consisting of three vowel categories {V1, V2, V12}, where V12 represents V1 in the coarticulation-inducing context of V2 (e.g., /a   i/, as opposed to plain /a/ or /i/).[4] Vowels are represented as first formant (F1) values. For simplicity, the F1 distributions of V1 and V2 are assumed to be normal ($V_1 \sim N(\mu_a, \sigma_a^2)$, $V_2 \sim N(\mu_i, \sigma_i^2)$), known to all learners, the same for all learners, and stable over time.

[4] We employ this notation, rather than using e.g. /e/, to underscore the fact that phonemically we are modeling a state where there are two categories: that is, /a   i/ is a (stable) variant of /a/, rather than a separate phoneme.

The F1 distribution of V12 is normal, with fixed variance as for V1 and a mean we denote by $c$:

$$V_{12} \sim N(c, \sigma_a^2) \tag{1}$$

We will sometimes refer to V12 (or equivalently, $c$, which determines the distribution of V12) as the contextual variant. The value of $c$ lies between the means of V1 and V2, and $\mu_a - c$ is the extent to which /a/ (V1) is coarticulated in the context of /i/ (V2).

In addition, we assume that productions of $V_{12}$ may be subject to phonetic production bias, represented by a quantity $\lambda$ which describes the tendency of a speaker to over- or undershoot articulatory targets.[5] Thus, the actual productions of an agent with contextual variant $c$ follow the distribution:

$$F1 \sim N(c - \lambda, \sigma_a^2) \tag{2}$$

(Note that the bias adjusts the mean of V12 by negative $\lambda$, because V2 has lower F1 than V1.) This scenario is illustrated in Fig. 1.

[5] In Kirby & Sonderegger (2015), the production bias is allowed to vary normally across tokens. For simplicity we assume a single fixed production bias here.

Figure 1: Lexicon and learning parameters. $\mu_a$ and $\mu_i$ are the means of normally-distributed vowel categories V1 (/a/) and V2 (/i/). $c$ is the mean of the normally-distributed contextual variant, V12. $\lambda$ represents the strength of the bias favouring coarticulated variants.
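The production model in Equation 2 can be sketched in a few lines of Python. This is only an illustration, not the code used for the simulations reported here: the category means are the values quoted in the caption of Fig. 2, while the standard deviation, the bias strength, the starting value of c, and the function name `produce_tokens` are hypothetical choices made for the example.

```python
import random
import statistics

MU_I = 530.0    # mean F1 of V2 (/i/) in Hz, as quoted in the Fig. 2 caption
MU_A = 730.0    # mean F1 of V1 (/a/) in Hz, as quoted in the Fig. 2 caption
SIGMA_A = 50.0  # hypothetical standard deviation; not given in the text


def produce_tokens(c, lam, n, rng, sigma=SIGMA_A):
    """Draw n F1 values for V12 from N(c - lam, sigma^2), as in Equation 2."""
    return [rng.gauss(c - lam, sigma) for _ in range(n)]


rng = random.Random(1)  # fixed seed so the sketch is reproducible
# Hypothetical agent: contextual variant c = 700 Hz, production bias of 10 Hz.
tokens = produce_tokens(c=700.0, lam=10.0, n=10000, rng=rng)
sample_mean = statistics.fmean(tokens)
```

With many tokens, `sample_mean` lies close to c − λ rather than to c, which is the sense in which the bias shifts productions towards V2.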

2.2 Learning and evolution

We assume that agents are divided into discrete generations, each containing a very large number of agents.[6] Learners in generation $t+1$ receive $n$ examples of $V_{12}$ drawn from generation $t$. Each learner’s task is to infer $c$. We assume that learners apply a learning algorithm which is ‘rational’, in the sense that they assume each token in their learning data to be generated according to Equation 1, and estimate the most probable value of $c$. That is, each learner’s task is simply to infer how much /a/ is produced like /i/ in the context of /i/ (/a   i/).

[6] In the limit of infinitely-large populations, the evolution of the distribution of $c$ becomes deterministic, and can be analyzed as a dynamical system (see e.g., Niyogi 2006, chapter 5).
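To make the learner's estimate concrete, the following derivation is a sketch under one added assumption that is not stated above: the learner has no prior over c (equivalently, a flat prior), so that 'most probable' reduces to maximum likelihood. The token labels x_1, ..., x_n and the teacher values c_1, ..., c_n are notation introduced here for illustration only.

```latex
% Learner's model (Equation 1): each token x_j ~ N(c, \sigma_a^2).
\hat{c}
  = \arg\max_{c} \prod_{j=1}^{n}
    \frac{1}{\sqrt{2\pi\sigma_a^2}}
    \exp\!\left(-\frac{(x_j - c)^2}{2\sigma_a^2}\right)
  = \frac{1}{n}\sum_{j=1}^{n} x_j .

% Actual productions (Equation 2): token x_j comes from a teacher whose
% contextual variant is c_j, so E[x_j | c_j] = c_j - \lambda. Hence
\mathbb{E}\left[\hat{c} \mid c_1,\dots,c_n\right]
  = \frac{1}{n}\sum_{j=1}^{n} c_j - \lambda .
```

Under these assumptions the learner's estimate sits, on average, λ below the mean of the teachers who supplied the tokens, because the learner's model (Equation 1) does not correct for the bias in Equation 2.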

In Kirby & Sonderegger (2015), we allowed this basic model to vary in two ways. The first was changing the learning algorithm agents applied, by adding a categoricity bias: the degree to which agents preferred /a   i/ tokens to conform to the fixed distributions of /a/ and /i/. This was implemented as a prior over the distribution of $c$. We explored two cases: a simple Gaussian prior where $c$ was biased to be near $\mu_a$ (corresponding to a preference/expectation for /a   i/ to be realized as /a/), and a more complex polynomial prior representing a bias for $c$ to be a value around either $\mu_a$ or $\mu_i$ (corresponding to either no coarticulation, or full coarticulation). The strength of this ‘complex prior’ is controlled by a parameter $a$, as shown in Fig. 2. All simulations in the current paper use this prior to capture categoricity bias.

Figure 2: Prior distribution over $c$, for values between the means of V2 and V1 ($\mu_i = 530$, $\mu_a = 730$). The parameter $a$ controls the strength of the categoricity bias, with values nearer to 0 corresponding to a greater preference for values of $c$ near either endpoint.

The second issue we addressed was that of population structure: whether agents were assumed to learn from single teachers, a frequent assumption in computational modeling of language change (often called ‘iterated learning’: Smith 2009, Griffiths & Kalish 2007, Kirby et al. 2007, S.), or multiple teachers, as illustrated in Fig. 3. Because we observed significant differences between the dynamics in single- vs. multiple-teacher settings (cf. Niyogi & Berwick 2009), and because only the results from the multiple-teacher settings were consistent with the extant sociolinguistic evidence, we restrict our attention to the multiple-teacher setting in the present paper. [Footnote 7: Kirby & Sonderegger (2015) also considers the case where each agent learns from two (randomly chosen) teachers, which we do not consider here because the dynamics were similar to the multiple-teacher case.]

Figure 3: Two types of population structure considered in models in Kirby & Sonderegger (2013, 2015): (a) Single-teacher scenario, where each learner in generation $t+1$ receives all her data from a single teacher in generation $t$. (b) Multiple-teacher scenario. Each data point comes from a random teacher, each chosen uniformly at random (with replacement) from teachers in generation $t$.
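The difference between the two population structures comes down to how the source of each training token is chosen. The sketch below is illustrative only: the function `draw_training_tokens` and its argument names are hypothetical, and $M = 500$ is read here as the number of agents per generation. Production follows Equation 2.

```python
import random

def draw_training_tokens(teacher_cs, n, lam, sigma_a, rng, multiple_teachers=True):
    """Return n F1 tokens (Hz) for one learner in generation t+1.

    teacher_cs holds the c values of the agents in generation t.
    """
    if multiple_teachers:
        # (b) every token comes from a teacher drawn uniformly, with replacement
        sources = [rng.choice(teacher_cs) for _ in range(n)]
    else:
        # (a) one teacher is drawn once and supplies all n tokens
        sources = [rng.choice(teacher_cs)] * n
    # Each teacher produces according to Equation 2: N(c - lam, sigma_a**2)
    return [rng.gauss(c - lam, sigma_a) for c in sources]

rng = random.Random(1)
generation_t = [rng.gauss(720.0, 10.0) for _ in range(500)]  # assumed M = 500 agents
data = draw_training_tokens(generation_t, n=100, lam=2.0, sigma_a=50.0, rng=rng)
print(len(data))
```

In the multiple-teacher case each learner's data average over the whole previous generation, so idiosyncrasies of individual teachers tend to wash out; in the single-teacher case they are inherited directly.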

Considering the ensemble of all agents in generation $t$, the state of the population at $t$ can be characterized by a probability distribution describing how likely different values of $c$ are. Formally, this is the distribution of a random variable, which we write as $C^t$. Similarly, the values of $c$ learned by agents in generation $t+1$ can be characterized by $C^{t+1}$, whose probability distribution describes how likely different values of $c$ now are. For simplicity, we assume that the number of agents per generation is very large. The evolution of the distribution of $c$ is then deterministic, making its behavior more easily analyzed as a dynamical system. This and several other aspects of our modeling framework, such as the assumption that generations are discrete, are shared with the broader literature on dynamical systems models of language change (e.g., Niyogi & Berwick 1996, Niyogi 2006, Yang 2000; see §1.3.1).
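As a worked baseline for this dynamical-systems view, consider the simplest possible learner: one with no categoricity prior, who estimates $c$ as the plain mean of its $n$ tokens. This learner is an assumption made purely for illustration (the simulations in this paper use the complex prior of Fig. 2), as are the symbols $m_j$, $\varepsilon_j$ and $\hat{c}$. In the multiple-teacher setting, with production following Equation 2:

```latex
% Token j comes from a teacher m_j drawn at random from generation t
y_j = c_{m_j} - \lambda + \varepsilon_j, \qquad
c_{m_j} \sim C^t, \quad \varepsilon_j \sim N(0, \sigma_a^2)

% Learner's estimate (assumed here: sample mean, no prior)
\hat{c} = \frac{1}{n} \sum_{j=1}^{n} y_j

% Resulting generation-to-generation update of the population distribution
E[C^{t+1}] = E[C^t] - \lambda, \qquad
\mathrm{Var}[C^{t+1}] = \frac{\mathrm{Var}[C^t] + \sigma_a^2}{n}
```

Under these assumptions the population mean moves toward /i/ by $\lambda$ Hz in every generation, so any $\lambda > 0$ would eventually produce change. Stability in the presence of production bias therefore requires some counteracting force on the learner's side, such as a categoricity bias.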

2.3 Assessment

In this framework, given a choice of learning algorithm, population structure, and degree of phonetic bias, we may characterize the evolution of the distribution of $c$ and determine the extent to which it satisfies our modeling goals. When learners are assumed to have a categoricity bias like that in Fig. 2, analytic solutions become intractable, and we must proceed by simulation to determine how the distribution of $c$ evolves. For practical reasons, in this chapter we present simulations over a limited number of generations (between 50 and 500), but it is important to keep in mind that the stable distribution of $c$ remains a fundamentally deterministic, rather than stochastic, function of the parameter settings. As a result, the number of generations our simulations require to converge to a stable state is not directly comparable to simulation time in stochastic agent-based simulations such as Harrington & Schiel (2017), Stevens et al. (2019), or Jochim and Kleber (this volume). Our goal is less to characterize the speed, slope, or trajectory of change than to determine the parameter regimes in which change occurs at all, as well as which change will occur in a particular regime.

Fig. 4 shows an example for the setting where all agents have a minimally coarticulated variant in generation 1 ($c$ is normally distributed with mean $\mu_a - 10$), and there is some production bias ($\lambda = 2$) along with weak categoricity bias ($a = 0.02$). The results can be visualized in two ways. The left panel shows the evolution of the entire distribution of $c$, while the right panel summarizes this distribution by showing just the evolution of its mean (representing the ‘average pronunciation’ of V12). From either representation, it is clear that $P(C^t)$ quickly converges to having most mass around 530 Hz (= $\mu_i$, the F1 of V2), meaning that the population converges on a state of full coarticulation. [Footnote 8: Settings of other parameters for this simulation and those shown in Fig. 5: $\mu_a = 730$, $\mu_i = 530$, $\sigma_a = \sigma_i = 50$, $M = 500$, $n = 100$.]

Figure 4: Example of the distribution of $c$ in the population at time $t$ (probability density function $P(C^t)$). The population starts with minimal coarticulation at $t = 0$ ($c \sim N(720, 10^2)$) and ends with full coarticulation by $t = 100$. All agents have weak categoricity bias ($a = 0.02$) and strong production bias ($\lambda = 2$). Left panel: shading is proportional to $P(C^t)$. Right panel: solid line and shading show the mean $\pm$ 2 standard deviations of $P(C^t)$.

In Kirby & Sonderegger (2013, 2015), it was shown that only a model with both production and categoricity biases could achieve all three modeling goals (stable limited coarticulation, stable full coarticulation, and change from one to the other). When the degree of categoricity bias is held constant, increasing $\lambda$ past a critical value causes a rapid transition from stable limited coarticulation (low $\lambda$) to stable full coarticulation (high $\lambda$). The dynamics of this case are characterized by a trade-off between the strength of categoricity bias and production bias, with a sudden change from limited to full coarticulation. This ‘adaptive landscape’ is illustrated in Fig. 5, which shows the degree of coarticulation the population will end up with, as $a$ and $\lambda$ are varied, for the same starting state (limited coarticulation), in a concrete example. To generate this figure, we ran a simulation like the one shown in Fig. 4 using a range of values of $a$ and $\lambda$, and recorded where the average degree of coarticulation stabilizes after a large number of generations ($t = 2500$). In Fig. 4, this stable state is near $\mu_i$ (= 530 Hz).

Before moving on to our extensions, we emphasize two further aspects of this earlier work. First, in models with both categoricity bias and production bias, phonologization is not inevitable (cf. Baker 2008): the mere presence of a production bias ($\lambda > 0$) does not entail that a novel variant will inevitably become the dominant variant in the speech community. Second, the dynamics of phonologization in this model are non-linear: a very small change in the degree of bias can result in phonologization (sudden population-level change) or not (population-level stability), simply depending on the state of the system (e.g., the amount of production vs. categoricity bias). In particular, there is a clear bifurcation: once $\lambda$ is large enough relative to $a$, the stable state will be one of full coarticulation. It is thus not necessary to postulate any additional mechanism to explain both phonologization and ‘non-phonologization’.

Figure 5: Mean value of $c$ in the population in its stable state, starting from a population of agents with a minimally coarticulated /a   i/ variant ($c \sim N(\mu_a - 10, 10^2)$) with production bias $\lambda$ and categoricity bias $a$ (smaller $a$ = stronger bias). The final mean changes non-linearly as $\lambda$ and $a$ are varied. Red and dark blue correspond to full/no coarticulation of $V_{12}$, respectively.

3 Extensions: contact and social weight

In this section, we extend this framework to consider two additional types of external forces beyond production bias: contact between subpopulations (§3.1) and social weight at the level of variants, individual speakers, and groups (§3.2). [Footnote 9: What we are referring to as social weight encompasses the more traditional concept of ‘prestige’, but as discussed in Salmons 2021 (Ch. 7), this term is typically not well-defined, so we explicitly avoid it here.] We assess each model based on whether both stability and change are possible as model parameters are varied. Our first extension considers a case conceptually similar to that explored by Harrington & Schiel (2017), Stevens et al. (2019), and Gubian et al. (2023), inter alia, in which a bias variant is firmly established in the speech of one group but not another. In the context of our modeling framework, the question we ask here is: are both stability and change possible when heterogeneous groups interact?

3.1 Model 1: Subpopulations in contact

In this setting we consider a population consisting of two groups. Group A is characterized by a contextual variant exhibiting little or no coarticulation, while for Group B, this same variant is extremely coarticulated (Fig. 6). Let $aProb$ be the probability that a Group B agent learns from (equivalently, imitates/stores an exemplar from) a Group A agent, and let $bProb$ be the probability that a Group A agent learns from a Group B agent. In this setting, there is no production bias ($\lambda = 0$), and the $a$ parameter controlling the strength of the categoricity bias was set at 0.01, encoding a strong dispreference for intermediate variants (see Fig. 2). In other respects, the simulation procedure was the same as outlined in §2.

Figure 6: Lexical distribution of V1, V2, and $c$ at the outset of the subpopulation mixing simulations. Distribution (a) (dashed line) is the distribution of $c$ for group A; distribution (b) (dotted line) is the distribution for group B.
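The role of the two contact probabilities can be made concrete with a short sketch of how one learner's training data might be assembled in Model 1. This is an illustration under stated assumptions, not the simulation code itself: the function `draw_contact_tokens`, the group sizes, and the starting means of 720 Hz and 540 Hz are hypothetical choices.

```python
import random

def draw_contact_tokens(learner_group, group_a_cs, group_b_cs,
                        a_prob, b_prob, n, sigma_a, rng):
    """Return n F1 tokens (Hz) for one learner in Model 1 (no production bias)."""
    tokens = []
    for _ in range(n):
        if learner_group == "A":
            # A Group A learner takes a token from Group B with probability b_prob
            pool = group_b_cs if rng.random() < b_prob else group_a_cs
        else:
            # A Group B learner takes a token from Group A with probability a_prob
            pool = group_a_cs if rng.random() < a_prob else group_b_cs
        teacher_c = rng.choice(pool)
        tokens.append(rng.gauss(teacher_c, sigma_a))  # lambda = 0 in this model
    return tokens

rng = random.Random(2)
group_a = [rng.gauss(720.0, 10.0) for _ in range(500)]  # assumed: little coarticulation
group_b = [rng.gauss(540.0, 10.0) for _ in range(500)]  # assumed: heavy coarticulation
tokens = draw_contact_tokens("A", group_a, group_b,
                             a_prob=0.05, b_prob=0.05, n=100, sigma_a=50.0, rng=rng)
print(round(sum(tokens) / len(tokens), 1))
```

With $bProb = 0.05$, roughly five of a Group A learner's hundred tokens come from Group B, so the expected mean of the training data is pulled only about 9 Hz toward the Group B value. Whether such a small pull accumulates over generations depends on the categoricity bias, which this sketch leaves out.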

A selection of the results of these simulations after 50 simulation epochs is given in Fig. 7, which shows the evolution of $c$ for each group. The main result is that the starting mean F1 value $c$ characterizing each group’s stable state (Fig. 6) appears to be stable even when there is some degree of interaction between them (upper left panels). However, obtaining (and storing/imitating) just 5% of training examples from a different group can be enough to induce the entire population to converge to one or the other group’s mean. There may also exist regimes in which both groups converge on an intermediate distribution, i.e., a more extreme version of the $aProb = bProb = 0.06$ panel shown in Fig. 7.

In regimes in which there is a sizable asymmetry between $aProb$ and $bProb$, the direction of convergence appears to be predictable. The panels in the lower right quadrant of Fig. 7 in which $aProb = bProb$, particularly the case where $aProb = bProb = 0.1$, merit additional mention. That both groups are seen converging to the group B mean in the lower right panel of Fig. 7 is unlikely to consistently replicate; repeating this simulation many times, with the same parameter settings, should yield approximately as many outcomes which converge to the group A mean. This is because simulation requires that we work with a finite population sample, whereas the ‘true’ (deterministic) stable state is contingent on the assumption of infinite populations.

This simplistic scenario could be further complicated in many ways, e.g., by introducing differences in the orientations of the distributions between groups. However, it already shows us that, in the context of our modeling framework, the subpopulations in contact scenario fulfills our modeling goals, with a dynamics very similar to that seen by the introduction of a production bias: there exist parameter regimes in which both stability and change are possible.

Figure 7: Results of subpopulation simulation modeling contact between groups with minimal coarticulation (Group A) and full coarticulation (Group B) at $t = 0$.

3.2 Models 2-4: social weighting

The next three models explore whether both stability and change are possible in the presence of social value associated with more coarticulated variants, with valued speakers who coarticulate more, and with groups characterized by coarticulation. We are intentionally remaining agnostic as to exactly what motivates the association of ‘social value’ with a variant, a speaker, or a group, but see Section 1, as well as Garrett & Johnson (2013) and Salmons (2021, chapter 7), for some discussion.

In our social weighting settings, each token $y_i$ is transmitted as a vector of an F1 value together with a social weight value $w_i \in [1, w_{max}]$. How this value is determined varies across the three models. In Model 2, higher social weight is associated with more coarticulated tokens, i.e., tokens of /a   i/ are more highly weighted the more similar they are to /i/. In Model 3, tokens of /a   i/ from a high-coarticulation group are given greater weight, so exposure to highly coarticulated tokens of /a   i/ is modulated by probability of group interaction. And in Model 4, the weight is associated with tokens of /a   i/ from individual teachers who coarticulate more. These weights are distinct from the probabilities (aProb and bProb) with which an agent of one group learns a token from another.

As before, the learner’s task is to estimate $c$ using the weighted average of $y_i$. In these scenarios, however, tokens with values of $w_i > 1$ (i.e., those that are either more coarticulated, or from teachers who coarticulate more) will have greater influence on the evolution of $c$ in the population.
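The effect of social weights on the estimate can be seen in a toy calculation. The sketch below computes only the weighted average itself; it omits the categoricity prior, and the function name `weighted_estimate` and the three token values are illustrative assumptions.

```python
def weighted_estimate(tokens, weights):
    """Weighted mean of F1 tokens y_i with social weights w_i (no prior applied)."""
    if len(tokens) != len(weights):
        raise ValueError("each token needs exactly one weight")
    return sum(w * y for w, y in zip(weights, tokens)) / sum(weights)

tokens = [720.0, 700.0, 540.0]  # F1 in Hz; the last token is /i/-like

print(weighted_estimate(tokens, [1, 1, 1]))  # unweighted mean, about 653.3
print(weighted_estimate(tokens, [1, 1, 3]))  # -> 608.0
```

Tripling the weight of the single coarticulated token moves the estimate about 45 Hz toward /i/, even though the tokens themselves are unchanged.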

3.2.1 Model 2: social weighting by variant

In the Model 2 simulations, we return to a setting with a single population in which /a/ is only slightly coarticulated in the context of /i/ (Fig. 8). In these simulations, we vary the parameter $w$ controlling the reference social weight accorded to coarticulated variants. The stronger the weight, the more that more strongly coarticulated tokens of $y_i$ (i.e., those produced with a lower F1, more like /i/ than /a/) contribute to the maximum likelihood estimate of the /a   i/ category.

Figure 8: Initial distribution of parameter $c$ (dashed line: mean pronunciation of V12) in the Model 2 population at time $t = 0$.

Figure 9: Model 2 simulation: Distribution of $c$ in the population at time $t$ ($C^t$) for varying weights $w$ indicating preference for the coarticulated variant.

The results of simulations for different values of $w$ run for 250 generations are shown in Fig. 9. Stability can be preserved when coarticulated variants have some social weight, but having the social weight of the coarticulated variant be just 10% more than that of the uncoarticulated variant can be enough to induce change to full coarticulation in the whole population. Note that in this scenario, it is assumed that all learners weight coarticulated variants the same and that this weight never changes over time; as such, even for low $w$, convergence to the coarticulated variant may be inevitable given enough simulation generations. Further exploration of the parameter space would be required to determine this for certain. Once again, however, we see that (a) the introduction of a bias does not make change inevitable, and (b) the dynamics of the change are rapid and non-linear for at least some parameter settings – where by ‘rapid and non-linear’ we mean that small numerical changes in parameter settings in certain regions of the parameter space can result in qualitatively different outcomes.

3.2.2 Model 3: social weighting by group

Model 3 has the same basic architecture as Model 1 (§3.1), but with two additional parameters $aWeight$ and $bWeight$, corresponding to how much data from group A is weighted for a learner in group B and how much data from group B is weighted for a learner in group A, respectively. Weight values ranged from 0 to 1. As in Model 1, learners in each generation learn from teachers both in their own group and, potentially, the other group. In Model 1, we saw how for low settings of $aProb$ and $bProb$ (the parameters controlling the number of tokens learners receive from the opposing group), the distribution of $c$ in both groups (one with no coarticulation and one with stable coarticulation) would remain constant. In Model 3 we experimented with different values of these parameters, as well as with the weights given to tokens from each group.

While space does not permit a visual display of the entire parameter space exploration, some representative results are shown in Fig. 10 for the case where $aProb = bProb$ (using the value $aProb = bProb = 0.03$). Here again, both stability and change are possible. Stability can be preserved even when tokens from the high-coarticulation group B are socially valued by low-coarticulation group A ($bWeight = 0.2$–$0.5$). But even a small preference for tokens from the coarticulating group can be enough to induce change to full coarticulation in the whole population. Simulation results for other low values of $aProb$/$bProb$ (e.g., $aProb = bProb = 0.05$) look qualitatively similar, though the exact values of $aWeight$/$bWeight$ where transitions from stability to change occur differ.

Figure 10: Results of Model 3 simulations for fixed $aProb = bProb = 0.03$ and varying weights $aWeight$ and $bWeight$.

3.2.3 Model 4: social weight by individual

Finally, in Model 4 we return to a single population, and consider the case where social weight values are associated with individual speakers. In this scenario, we associate every teacher $m \in M$ in generation $t$ with a social weight value $w_m$ as well as a value $c_m$, the mean of their coarticulated variant. If these happen to be positively correlated – i.e., if tokens from teachers who coarticulate more also have higher social weight – this implies more coarticulation in generation $t+1$, which could, but need not, accumulate and lead to change (cf. Baker et al. 2011). The degree of correlation between the social weight $w_m$ of any individual teacher $m$ and their coarticulation parameter $c_m$ is controlled by a simulation-level parameter $\rho$, ranging from 0 (uncorrelated) to 1 (perfect correlation, so the strongest coarticulators also have the highest social weight). Individual teacher weights were sampled at random from the range $\{1, w_{max}\}$. [Footnote 10: The exception to this was for speakers in the first 25 generations, for whom weights were assigned as the weighted average of the teacher’s coarticulation parameter $c$ and the uniformly randomly sampled value in $\{1, w_{max}\}$. This was done to encourage individual simulation runs to start in roughly similar areas of the parameter space.]
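One way to picture how $\rho$ ties social weight to coarticulation is to mix a coarticulation-based score with random noise. The construction below is a hypothetical illustration and should not be read as the procedure used in the simulations: the function `assign_weights`, the linear mixing rule, and the rescaling of $c_m$ to the unit interval are all assumptions, and the mixing parameter only approximates a correlation coefficient at its endpoints.

```python
import random

def assign_weights(teacher_cs, w_max, rho, rng):
    """Give each teacher a weight in [1, w_max] (hypothetical mixing scheme).

    rho = 1: weight is fully determined by how strongly the teacher coarticulates.
    rho = 0: weight is uniform random and unrelated to c_m.
    """
    lo, hi = min(teacher_cs), max(teacher_cs)
    span = (hi - lo) or 1.0  # avoid division by zero if all c_m are equal
    weights = []
    for c in teacher_cs:
        coart = (hi - c) / span   # 1 = lowest F1 (most /i/-like), 0 = highest F1
        noise = rng.random()      # uniform on [0, 1)
        mix = rho * coart + (1.0 - rho) * noise
        weights.append(1.0 + (w_max - 1.0) * mix)
    return weights

rng = random.Random(4)
teacher_cs = [700.0, 720.0, 710.0]
print(assign_weights(teacher_cs, w_max=100.0, rho=1.0, rng=rng))  # -> [100.0, 1.0, 50.5]
print(assign_weights(teacher_cs, w_max=100.0, rho=0.0, rng=rng))  # random, unrelated to c_m
```

Under this scheme, intermediate values of $\rho$ leave a large random component in each teacher's weight, which gives some intuition for why a weak link between weight and coarticulation might fail to push the population in a consistent direction.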

Figure 11: Results of Model 4 simulations for varying maximum weight $w_{max}$ and correlation $\rho$ between $w$ and $c$ for any individual teacher $m$.

The results for several values of $\rho$ and $w_{max}$ are given in Fig. 11 (for runs of 500 generations). In this setting, stability is the default; it is only when $\rho \approx 1$ that we observe trajectories that appear to indicate change. In other words, in this setting, change requires a near-perfect correlation between coarticulation and social weight, where (tokens from) individuals who coarticulate are weighted 100–1000 times higher than those who do not. These findings are arguably expected given the way that weights are assigned to teachers (uniformly at random) and how learners receive input from teachers (also uniformly at random) in this setting. However, further simulations with many more generations are needed before any firm conclusions can be drawn.

4 General discussion

In this paper, we have considered the question of whether, in a population setting with a categoricity bias, propagation of a phonetic production bias has a unique dynamics (question 1), and if not, will any kind of driving force produce the same dynamics (question 2). To study these questions, we compared our earlier work with two new external forces: interacting groups with different stable states (Model 1, §3.1), and various types of social weights assigned to variants, groups, or individuals (Models 2-4, §3.2).

The dynamics of Model 1, in which subpopulations with different distributions of a bias variant were put in contact, were broadly similar to those observed in Kirby & Sonderegger (2015): there exist parameter setting regimes (here, the degree of contact between groups) in which both stability and change are possible. Similar dynamics were observed in Models 2 and 3 (§3.2.1 and 3.2.2), in which social weights were assigned to variants and groups, respectively. Thus the answer to our first question – does the introduction of a phonetic production bias have a unique dynamics? – is clearly ‘no’. This was not a foregone conclusion, given that previous models including phonetic production biases have had extremely idealized or simplified population structures.

Models 2, 3, and 4 are all implementations of social weight. Yet the dynamics of Model 4 (§3.2.3) were markedly different from the others, in the sense that there is less of a qualitative difference between the regimes compared with the previous models. Thus the answer to our second question – will any kind of driving force produce the same dynamics? – is also ‘no’.

Why are the dynamics of Model 4 so different? The answer has less to do with how weights are assigned than it does with the correlation between weights and observations. In Model 2, where social weight was associated with variants, the correlation between weights and observations is perfect: the more strongly an observation was coarticulated, the more highly it was weighted, by all learners across all generations. In Model 3, the social weight is associated with groups, but applies to all observations equally. That is, an /a/-like token of /a   i/ from the highly weighted group is valued just as much as an /i/-like token. The correlation between weights and observations is therefore somewhat reduced. In Model 4, where weights are properties of individuals, the correlation is weakened even further. For the scenarios we explored, in which the distribution of $w$ in the population was uniform, near-perfect correlation between weights and observations was required for change to be seen in the population-level distribution of $c$.

The results of our simulations cannot be taken as evidence that any of the scenarios we have considered – production bias, interacting subpopulations, or differences in social weighting – are responsible for any particular empirically observed instance of sound change. Given the current state of our scientific understanding of sound change, it is far from clear how the parameter values in any computational model should be set on the basis of real-world properties of utterances, individuals, and social groups. This is at least in part a consequence of our modeling strategy, which is focused on describing the possible space of long-term, stable behaviors of a parameterized system, rather than the temporal trajectory of change given any particular parameterization. However, we believe that our findings are still useful in demonstrating that (i) change is not inevitable in the presence of bias and (ii) not all biases engender the same evolutionary dynamics. This gives us confidence in the conclusions we can draw from (and the hypotheses we might generate with) such models.

5 Conclusion

In this paper, we have shown how different combinations of external forces can result in a similar evolutionary dynamics of the spread of change, at least in a setting with a strong categoricity bias. This implies that a similar evolutionary dynamics may actually underlie the actuation of changes from different sources. These results show us that although sound changes may have different sources, the resulting dynamics of propagation throughout a population may nonetheless be quite similar. We take this finding to be reassuring, given that empirical studies of many different changes show a similar trajectory.

At the same time, we have shown that not all external forces give rise to evolutionary dynamics where both stability and change are possible. Some intuitively plausible mechanisms, such as high social value associated with the speech of individuals, appear to be too noisy to have an effect when iterated over time in a speech community. This, too, is a positive result, because it demonstrates that the models in this general framework are not ‘doomed to success’, but that there exist regions of the parameter space in which stability is possible. In this respect, population dynamics can provide at least a partial answer to the actuation problem and to the ‘non-phonologization problem’ (Kiparsky 2015) of why change does not automatically arise whenever its conditioning environment is present. Actuation is possible without production bias, and the dynamics of change driven by production bias are not unique.

Supplementary materials

Simulation code to replicate and extend the simulations reported in this chapter can be found at this paper’s OSF archive: https://osf.io/b7sgq/.

Acknowledgements

Portions of this work were originally presented at the 2014 UC-Berkeley workshop ‘Sound Change in Interacting Human Systems’. We are grateful to that audience, as well as audiences at the Ohio State University and McGill University, editors Felicitas Kleber and Tamara Rathcke, an anonymous reviewer, and the participants in our 2013 and 2015 Linguistic Institute courses, for comments, suggestions, and inspiration. We alone remain responsible for any errors of fact or interpretation. This work was made possible in part by grants from the Fonds de recherche du Québec (#183356) and the Canada Foundation for Innovation (#32451) to M. Sonderegger.

References

  • Auer & Hinskens (2005) Auer, Peter & Frans Hinskens. 2005. The role of interpersonal accommodation in a theory of language change. In Peter Auer, Frans Hinskens & Paul Kerswill (eds.), Dialect Change, 335–357. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486623.015.
  • Baker (2008) Baker, Adam. 2008. Addressing the actuation problem with quantitative models of sound change. University of Pennsylvania Working Papers in Linguistics 14(1). 29–41.
  • Baker et al. (2011) Baker, Adam, Diana Archangeli & Jeff Mielke. 2011. Variability in American English s-retraction suggests a solution to the actuation problem. Language Variation and Change 23(3). 347–374. https://doi.org/10.1017/S0954394511000135.
  • Bermúdez-Otero (2020) Bermúdez-Otero, Ricardo. 2020. The initiation and incrementation of sound change: Community-oriented momentum-sensitive learning. Glossa: a journal of general linguistics 5(1). 1–32. https://doi.org/10.5334/gjgl.627.
  • Blevins (2004) Blevins, Juliette. 2004. Evolutionary phonology. Cambridge: Cambridge University Press.
  • Blevins (2006) Blevins, Juliette. 2006. A theoretical synopsis of Evolutionary Phonology. Theoretical Linguistics 32(2). 117–166. https://doi.org/10.1515/TL.2006.009.
  • Blevins (2015) Blevins, Juliette. 2015. Evolutionary Phonology: a holistic approach to sound change typology. In Patrick Honeybone & Joseph Salmons (eds.), The Oxford Handbook of Historical Phonology, 485–500. Oxford: Oxford University Press.
  • Blythe & Croft (2012) Blythe, Richard A. & William Croft. 2012. S-curves and the mechanisms of propagation in language change. Language 88(2). 269–304. https://doi.org/10.1353/lan.2012.0027.
  • Burnett (2019) Burnett, Heather. 2019. Signalling games, sociolinguistic variation and the construction of style. Linguistics and Philosophy 42(5). 419–450. https://doi.org/10.1007/s10988-018-9254-y.
  • Croft (2000) Croft, William. 2000. Explaining language change: an evolutionary approach. Harlow: Pearson Education.
  • Dediu & Moisik (2019) Dediu, Dan & Scott R. Moisik. 2019. Pushes and pulls from below: Anatomical variation, articulation and sound change. Glossa: a journal of general linguistics 4(1). https://doi.org/10.5334/gjgl.646.
  • Eckert (2000) Eckert, Penelope. 2000. Language Variation as Social Practice: The Linguistic Construction of Identity in Belten High. Malden, MA & Oxford, UK: Wiley-Blackwell.
  • Eckert & Labov (2017) Eckert, Penelope & William Labov. 2017. Phonetics, phonology and social meaning. Journal of Sociolinguistics 21(4). 467–496. https://doi.org/10.1111/josl.12244.
  • Garrett (2015) Garrett, Andrew. 2015. Sound change. In Claire Bowern & Bethwyn Evans (eds.), The Routledge handbook of historical linguistics, 227–248. London and New York: Routledge.
  • Garrett & Johnson (2013) Garrett, Andrew & Keith Johnson. 2013. Phonetic bias in sound change. In Alan C. L. Yu (ed.), Origins of sound change: Approaches to phonologization, 51–97. Oxford: Oxford University Press.
  • Griffiths & Kalish (2007) Griffiths, Thomas L. & Michael L. Kalish. 2007. Language evolution by iterated learning with Bayesian agents. Cognitive Science 31(3). 441–480.
  • Gubian et al. (2023) Gubian, Michele, Johanna Cronenberg & Jonathan Harrington. 2023. Phonetic and phonological sound changes in an agent-based model. Speech Communication 147. 93–115. https://doi.org/10.1016/j.specom.2023.01.004.
  • Hall-Lew et al. (2021a) Hall-Lew, Lauren, Amanda Cardoso & Emma Davies. 2021a. Social meaning and sound change. In Emma Moore, Lauren Hall-Lew & Robert J. Podesva (eds.), Social Meaning and Linguistic Variation: Theorizing the Third Wave, 27–53. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108578684.002.
  • Hall-Lew et al. (2021b) Hall-Lew, Lauren, Patrick Honeybone & James Kirby. 2021b. Individuals, communities, and sound change: an introduction. Glossa: a journal of general linguistics 6(1). 1–17. https://doi.org/10.5334/gjgl.1630.
  • Harrington et al. (2018) Harrington, Jonathan, Felicitas Kleber, Ulrich Reubold, Florian Schiel & Mary Stevens. 2018. Linking cognitive and social aspects of sound change using agent-based modeling. Topics in Cognitive Science 10(4). 707–728. https://doi.org/10.1111/tops.12329.
  • Harrington et al. (2000) Harrington, Jonathan, Sallyanne Palethorpe & Catherine I. Watson. 2000. Does the Queen speak the Queen’s English? Nature 408(6815). 927–928.
  • Harrington & Schiel (2017) Harrington, Jonathan & Florian Schiel. 2017. /u/-fronting and agent-based modeling: The relationship between the origin and spread of sound change. Language 93(2). 414–445. https://doi.org/10.1353/lan.2017.0019.
  • Iverson & Salmons (1996) Iverson, Gregory K. & Joseph C. Salmons. 1996. The primacy of primary umlaut. Beiträge zur Geschichte der deutschen Sprache und Literatur 118. 69–86. https://doi.org/10.1515/bgsl.1996.1996.118.69.
  • Janda (2003) Janda, Richard. 2003. ‘Phonologization’ as the start of dephonetization—Or, on sound change and its aftermath: Of extension, generalization, lexicalization and morphologization. In Brian D. Joseph & Richard D. Janda (eds.), The handbook of historical linguistics, 401–422. Malden, MA: Blackwell.
  • Kauhanen (2020) Kauhanen, Henri. 2020. Replicator–mutator dynamics of linguistic convergence and divergence. Royal Society Open Science 7(11). 201682. https://doi.org/10.1098/rsos.201682.
  • Kiparsky (2015) Kiparsky, Paul. 2015. Phonologization. In Patrick Honeybone & Joseph C. Salmons (eds.), The Oxford handbook of historical phonology, 563–582. Oxford: Oxford University Press.
  • Kirby (2013) Kirby, James. 2013. The role of probabilistic enhancement in phonologization. In Alan C. L. Yu (ed.), Origins of sound change: Approaches to phonologization, 228–246. Oxford: Oxford University Press.
  • Kirby & Sonderegger (2013) Kirby, James & Morgan Sonderegger. 2013. A model of population dynamics applied to phonetic change. In Markus Knauff, Michael Pauen, Natalie Sebanz & Ipke Wachsmuth (eds.), Proceedings of the 35th Annual Conference of the Cognitive Science Society, 776–781.
  • Kirby & Sonderegger (2015) Kirby, James & Morgan Sonderegger. 2015. Bias and population structure in the actuation of sound change. arXiv:1507.04420 [cs.CL] https://doi.org/10.48550/arXiv.1507.04420.
  • Kirby et al. (2007) Kirby, Simon, Mike Dowman & Thomas L Griffiths. 2007. Innateness and culture in the evolution of language. Proceedings of the National Academy of Sciences 104(12). 5241–5245. https://doi.org/10.1073/pnas.0608222104.
  • Labov (1990) Labov, William. 1990. The intersection of sex and social class in the course of linguistic change. Language Variation and Change 2(2). 205–254. https://doi.org/10.1017/S0954394500000338.
  • Labov (2001) Labov, William. 2001. Principles of linguistic change vol. 2: Social factors. Oxford: Oxford University Press.
  • Labov (2007) Labov, William. 2007. Transmission and diffusion. Language 83(2). 344–387.
  • Labov et al. (2006) Labov, William, Sharon Ash & Charles Boberg. 2006. The atlas of North American English: phonetics, phonology, and sound change: a multimedia reference tool. Berlin & New York: Mouton de Gruyter.
  • Labov et al. (2013) Labov, William, Ingrid Rosenfelder & Josef Fruehwald. 2013. One hundred years of sound change in Philadelphia: Linear incrementation, reversal, and reanalysis. Language 89(1). 30–65.
  • Lindblom et al. (1995) Lindblom, Björn, Susan Guion, Susan Hura, Seung-Jae Moon & Raquel Willerman. 1995. Is sound change adaptive? Rivista di Linguistica 7(1). 5–37.
  • McElreath & Boyd (2007) McElreath, Richard & Robert Boyd. 2007. Mathematical models of social evolution: A guide for the perplexed. Chicago: University of Chicago Press.
  • Milroy (1992) Milroy, James. 1992. Linguistic variation and change: on the historical sociolinguistics of English. Oxford: B. Blackwell.
  • Milroy & Milroy (1985) Milroy, James & Lesley Milroy. 1985. Linguistic change, social network and speaker innovation. Journal of Linguistics 21(2). 339–384.
  • Niyogi (2006) Niyogi, Partha. 2006. The computational nature of language learning and evolution. Cambridge: MIT Press.
  • Niyogi & Berwick (1996) Niyogi, Partha & Robert C. Berwick. 1996. A language learning model for finite parameter spaces. Cognition 61(1-2). 161–193. https://doi.org/10.1016/S0010-0277(96)00718-4.
  • Niyogi & Berwick (2009) Niyogi, Partha & Robert C. Berwick. 2009. The proper treatment of language acquisition and change in a population setting. Proceedings of the National Academy of Sciences 106(25). 10124–10129.
  • Ohala (1981) Ohala, John J. 1981. The listener as a source of sound change. In Carrie S. Masek, Roberta A. Hendrick & Mary Frances Miller (eds.), Proceedings from the 17th Annual Meeting of the Chicago Linguistic Society, Volume 2: Papers from the Parasession on Language and Behavior, 178–203. Chicago: Chicago Linguistic Society.
  • Ohala (1989) Ohala, John J. 1989. Sound change is drawn from a pool of synchronic variation. In Leiv E. Breivik & Ernst H. Jahr (eds.), Language change: contributions to the study of its causes (Trends in Linguistics. Studies and Monographs [TiLSM] 43), 173–198. Berlin & New York: De Gruyter Mouton. https://doi.org/10.1515/9783110853063.173.
  • Ohala (1993) Ohala, John J. 1993. The phonetics of sound change. In Charles Jones (ed.), Historical linguistics: Problems and perspectives, 237–278. London: Longman.
  • Ohala (1997) Ohala, John J. 1997. Aerodynamics of phonology. In Proceedings of the 4th Seoul International Conference on Linguistics, 92–97. Seoul.
  • Pierrehumbert (2001) Pierrehumbert, Janet B. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Joan L. Bybee & Paul J. Hopper (eds.), Frequency and the Emergence of Linguistic Structure (Typological Studies in Language 45), 137–158. Amsterdam: John Benjamins Publishing Company. https://doi.org/10.1075/tsl.45.08pie.
  • Salmons (2021) Salmons, Joseph. 2021. Sound change. Edinburgh: Edinburgh University Press.
  • Silverstein (2003) Silverstein, Michael. 2003. Indexical order and the dialectics of sociolinguistic life. Language & Communication 23(3-4). 193–229. https://doi.org/10.1016/S0271-5309(03)00013-2.
  • Smith (2009) Smith, Kenny. 2009. Iterated learning in populations of Bayesian agents. In Proceedings of the Annual Meeting of the Cognitive Science Society, 697–702. Austin, TX.
  • Sonderegger & Niyogi (2010) Sonderegger, Morgan & Partha Niyogi. 2010. Combining data and mathematical models of language change. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 1019–1029. Uppsala, Sweden.
  • Sonderegger & Niyogi (2013) Sonderegger, Morgan & Partha Niyogi. 2013. Variation and change in English noun/verb pair stress: Data and dynamical systems models. In Alan C. L. Yu (ed.), Origins of sound change: Approaches to phonologization, 262–284. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199573745.003.0013.
  • Sonderegger et al. (2020) Sonderegger, Morgan, Jane Stuart-Smith, Thea Knowles, Rachel Macdonald & Tamara Rathcke. 2020. Structured heterogeneity in Scottish stops over the twentieth century. Language 96(1). 94–125. https://doi.org/10.1353/lan.2020.0003.
  • Stevens & Harrington (2014) Stevens, Mary & Jonathan Harrington. 2014. The individual and the actuation of sound change. Loquens 1(1). e003. https://doi.org/10.3989/loquens.2014.003.
  • Stevens & Harrington (2022) Stevens, Mary & Jonathan Harrington. 2022. Individual variation and the coarticulatory path to sound change: agent-based modeling of /str/ in English and Italian. Glossa: a journal of general linguistics 7(1). https://doi.org/10.16995/glossa.8869.
  • Stevens et al. (2019) Stevens, Mary, Jonathan Harrington & Florian Schiel. 2019. Associating the origin and spread of sound change using agent-based modelling applied to /s/-retraction in English. Glossa: a journal of general linguistics 4(1). https://doi.org/10.5334/gjgl.620.
  • Stuart-Smith et al. (2017) Stuart-Smith, Jane, Brian José, Tamara Rathcke, Rachel Macdonald & Eleanor Lawson. 2017. Changing sounds in a changing city: an acoustic phonetic investigation of real-time change over a century of Glaswegian. In Chris Montgomery & Emma Moore (eds.), Language and a Sense of Place, 38–64. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781316162477.004.
  • Sóskuthy (2015) Sóskuthy, Márton. 2015. Understanding change through stability: A computational study of sound change actuation. Lingua 163. 40–60. https://doi.org/10.1016/j.lingua.2015.05.010.
  • Sóskuthy & Stuart-Smith (2020) Sóskuthy, Márton & Jane Stuart-Smith. 2020. Voice quality and coda /r/ in Glasgow English in the early 20th century. Language Variation and Change 32(2). 133–157. https://doi.org/10.1017/S0954394520000071.
  • Trudgill (1986) Trudgill, Peter. 1986. Dialects in contact (Language in society 10). Oxford, UK & New York, NY: Basil Blackwell.
  • Wedel (2004) Wedel, Andrew. 2004. Category competition drives contrast maintenance within an exemplar-based production/perception loop. In Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology, 1–10. Association for Computational Linguistics.
  • Wedel (2006) Wedel, Andrew. 2006. Exemplar models, evolution and language change. The Linguistic Review 23(3). https://doi.org/10.1515/TLR.2006.010.
  • Weinreich et al. (1968) Weinreich, Uriel, William Labov & Marvin I. Herzog. 1968. Empirical foundations for a theory of language change. In Winfred P. Lehmann & Yakov Malkiel (eds.), Directions for historical linguistics, 95–195. Austin: University of Texas Press.
  • Yang (2000) Yang, Charles D. 2000. Internal and external forces in language change. Language Variation and Change 12(3). 231–250. https://doi.org/10.1017/S0954394500123014.
  • Yu et al. (2013) Yu, Alan C. L., Carissa Abrego-Collier & Morgan Sonderegger. 2013. Phonetic imitation from an individual-difference perspective: subjective attitude, personality and “autistic” traits. PLoS ONE 8(9). e74746. https://doi.org/10.1371/journal.pone.0074746.
  • Yu & Zellou (2019) Yu, Alan C. L. & Georgia Zellou. 2019. Individual differences in language processing: phonology. Annual Review of Linguistics 5(1). 131–150. https://doi.org/10.1146/annurev-linguistics-011516-033815.