Vocalization in conversation can couple the movements in human interaction (Condon and Ogston 1966; Kendon 1972, 1983; Gill 2002; Shockley et al. 2003, 2007). Consider the case of talking on the phone. In addition to listening to what the other is saying, we rely on phonological features, including breath, modulation, pitch, pulse, tempo, and rhythm, to gauge and grasp the intent and sense of the other. As we do so, our bodies move with these features, both as we speak and as the other speaks (Bavelas 2007). In grounding our communication, we occasionally move simultaneously in body (even in a silent pause, e.g. nodding our heads at the same time) and speech (Gill and Kawamori 2002; Shockley et al. 2003).
Work in phonetics by Local (2003, 2007) shows how qualities in the vocal sounds of participants are taken up by each other, where pitch, tempo, and melody function to bind turn-taking dynamics. ‘Participants attend to the moment-by-moment evolution of complexes of phonetic detail and what that detail encodes about other levels of linguistic organization so that they can locate the precise temporal moment to begin their talk’ (Local 2003 p. 4). They seem to monitor the phonetic and timing detail of both their own talk and the talk of others and can entrain the rate, rhythm, timing, and also pitch range and loudness characteristics of their speech to that which has just been produced by another speaker (Local 2007).