Intellectual Merits Symposium

Basileo Martínez Cruz and Christian DiCanio – recording Triqui; Tlaxiaco, Oaxaca, Mexico. Photo from May, 2019. (© Christian DiCanio)

For a 1.5-hour Symposium, invited participants will focus on the history and achievements of the Documenting Endangered Languages (DEL) program in the fifteen years since it was formed within the National Science Foundation and National Endowment for the Humanities. The participants in this Symposium will consider DEL particularly in the context of “Intellectual Merits,” namely the potential of documentation research to advance knowledge in linguistics and related fields in significant and potentially transformative ways. This organized session proposal is being submitted in tandem with two other session proposals (a Workshop and a cluster of Themed Posters) both to celebrate DEL and its accomplishments and to critically consider its future.


Demonstration of traditional fonio harvesting in Boundé, Burkina Faso, accompanied by traditional music.

The five Symposium speakers included in this proposal represent active scholars conducting well-known and respected documentation research via DEL funding. They will present their findings by illustrating the ways in which their work and outputs have contributed to the field of linguistics and to other disciplines. 

Laura McPherson collecting narratives with village elders in Boundé, Burkina Faso.


Keren Rice: A brief introduction to DEL: reflections on the intellectual merit of language documentation

I served on the Documenting Endangered Languages (DEL) panel in its early years, between 2006 and 2013, and was the Program Officer for DEL in 2011-12. Prior to my involvement with DEL, I had been a member of the Linguistics panel. This puts me in a good position to think about DEL and its accomplishments over the past 15 years. I will comment in particular on the challenges that DEL has been faced with in terms of what is perceived to be intellectual merit.

The intellectual merit portion of the NSF guidelines asks a very broad question: what is the potential for the proposed activity to advance knowledge and understanding within its own field or across different fields? There are several supplementary questions concerning the potential for transformation, the quality of the plan and of the team, and resources.

A view frequently heard in the early years of DEL was that the proposals were strong in broader impacts but that DEL research was not strong in intellectual merit. My own sense is that this is far from the reality of what was happening. While perhaps it was true at the start, in fact the turn towards recognizing the merits of documentation is transforming linguistics. As documentation has come to be understood, it is best viewed as a leader in terms of intellectual merit. This will be amply illustrated by the talks in this session, and I will draw attention to just a few points in the introduction to the session.

Documentation has brought together a diverse range of perspectives including linguistics, anthropology, cognitive science, biology, geography, climate change, philosophy, and computer science. Documentation has challenged the concept of what intellectual merit means with advances in methodology, with the championing of non-western conceptions of science, with the blurring of the boundary between intellectual merit and broader impacts, and with changing conceptions of the goals of linguistics. Documentation has strengthened the focus on ethics in research with humans and on social justice as a responsibility of researchers. The notion of ecological validity of research has been strengthened. There is a reinvigoration of old areas, including language change, language contact, the role of ethnography in linguistics, and the importance of considering social factors in studying language. There are changing conceptions of what it means to know a language, and changing ideas of what dictionaries and grammars can be. There are developments in technology that perhaps would not have happened without the collection of texts that need to be transcribed and parsed.

Rather than considering the type of research funded by DEL as ground-breaking in broader impacts but basically inconsequential in terms of intellectual merit, language documentation, and with it DEL, should be regarded as a leader in terms of intellectual merit, playing an important role in transforming linguistics into the vibrant and exciting field that it is today.


Laura McPherson: Speaking through music: The role of balafon surrogate speech in documentation and analysis of Seenku

An unexpected consequence of documenting Seenku (Mande, Burkina Faso) was the discovery of a speech surrogate system on the balafon, a resonator xylophone, which musically encodes the language’s rhythm and complex tone system in order to communicate. In this talk, I highlight three examples of how exploring the surrogate speech in the process of language documentation has yielded surprising and varied benefits, in the domains of both intellectual merit and broader impacts:

1. It has been an important tool in tonal analysis, revealing speakers’ sensitivity to underlying contrasts in lexical and grammatical tone while eschewing automatic tonal simplifications in natural speech. This supports a psychologically real distinction between levels of structure posited by phonological theory, accessible to musicians when encoding speech. Consequently, the fact that the balafon encodes complex sandhi alternations shows that these cannot be postlexical and must instead belong to a deeper layer of morphophonology.

2. It has served as a catalyst for text collection across various cultural domains, such as farming, oral history, and caste structure, while also documenting an endangered musical tradition. These recorded narratives, transcribed and annotated for grammatical information, provide a naturalistic corpus of speech for linguistic analysis, while at the same time providing cultural data of interdisciplinary interest.

3. It has provided an unexpected way in which to present DEL-funded work to non-linguistic audiences, including school children, neurologists, and philanthropists of the arts. The result has been to raise awareness of disappearing linguistic and cultural diversity in places like Burkina Faso while presenting a taste of linguistics to audiences who may otherwise never have the opportunity.

The case of Seenku is not unusual. Every language presents unique linguistic challenges, and every culture and community may offer unique ways of studying these challenges. But the very nature of endangered and understudied languages means that we as researchers may have little idea of either before embarking on documentation. The DEL program provides the opportunity to take a dynamic and adaptive approach to language documentation and analysis, with discoveries made in the field—like the balafon surrogate language—shaping the project trajectory. In other words, the innovative results of DEL projects are often unforeseen, but the nature of the work supported by DEL nearly guarantees such results.

Ultimately, having the freedom to take a broader and more interdisciplinary approach to language documentation through exploring these unexpected paths has the potential both to advance linguistic theory and to build bridges between our work, our linguistic communities, and the general public.


Christian DiCanio: Phonetics and DEL: experimental methods and tools for endangered language corpora

Over the past few decades, the development of easily accessible phonetic analysis software and more portable speech recording equipment has enabled the careful investigation of phonetic detail in the speech signal (cf. Whalen & McDonough 2015). With these tools, acoustic and articulatory data from language documentation projects have, for instance, informed debates regarding how patterns of consonant harmony function (Whalen et al. 2011), the nature of variation across languages with similar stop inventories (Kakadelis 2018), and the universality of prosody in human language (DiCanio & Hatcher 2018). The language documentarian plays a critical role not only in the phonetic research, but also in advances in corpus phonetics, an expanding area of research involving advanced computational methods applied to large, unstructured sets of recordings (cf. Liberman 2019). Phoneticians, and researchers more generally, have become increasingly interested in examining speech in ecologically typical contexts, both in well-studied languages (Stuart-Smith et al. 2015, Davidson 2016, Chodroff & Wilson 2017) and in endangered languages (DiCanio et al. 2015, DiCanio & Whalen 2015). In this talk, I highlight two areas where the particular marriage of technical skills in phonetic/computational methods and language documentation has produced advances reaching beyond each of these disciplines.

The first example is the development of a computational model predicting surface phonetic allophony (DiCanio et al. 2017). Following the recording and careful morphophonological transcription of approximately 100 hours of Yoloxóchitl Mixtec speech (Amith and Castillo García, no date), a research team of documentarians and phoneticians developed a phonological transducer to produce surface phonological representations, e.g. transcribed /yu3ba3=on4/ ‘father=2S’ was converted to /ju³βõ⁴/. This surface phonological form was then phonetically segmented using forced alignment (DiCanio et al. 2013). Stop consonants demonstrate great variability in spontaneous speech, and surface phonetic variant types were tagged, e.g. /t/ may be realized with some degree of voicing. A deep neural network trained on these data was able to predict surface allophony (voiced, devoiced, spirantized) with almost 90% accuracy. This result is relevant not only to Mixtec phonetics: automatic allophone detection is also useful for the diagnosis of apraxia of speech, a disorder typified by variation in stop production (Davis et al. 1998).
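
The transcription-to-surface conversion step can be pictured with a toy sketch. The rewrite rules below are assumptions inferred solely from the single example given above (/yu3ba3=on4/ to /ju³βõ⁴/); they are not the project's actual transducer, and the names `to_surface` and `CHAR_MAP` are hypothetical.

```python
import re

# Hypothetical character-level mapping, inferred from the one example in the
# text: orthographic y -> IPA j, b -> IPA β, tone digits -> superscripts.
CHAR_MAP = str.maketrans({
    "y": "j", "b": "β",
    "1": "¹", "2": "²", "3": "³", "4": "⁴", "5": "⁵",
})


def to_surface(form: str) -> str:
    """Convert a transcribed form to a surface-like form (toy rules only)."""
    # Assumed rule: the enclitic "=on" fuses with the preceding vowel,
    # yielding nasalized õ that carries the clitic's tone.
    fused = re.sub(r"[aeiou]\d+=on(\d+)", r"õ\1", form)
    # Apply the character-level mapping to whatever remains.
    return fused.translate(CHAR_MAP)


print(to_surface("yu3ba3=on4"))
```

A real transducer would need rules covering the full morphophonology of the language; this sketch only reproduces the quoted example and leaves forms without the enclitic otherwise unchanged apart from the character mapping.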

The second example is the development of automatic alignment/recognition systems. Here, documentarians work closely with phoneticians and computational linguists on improving phonetic annotation in endangered language corpora. I focus on the specific linguistic and computational challenges that Itunyoso Triqui, an endangered and prosodically complex Mesoamerican language, poses for creating a forced alignment system. Variation in tone production in spontaneous speech is then examined with this corpus, a current topic of particular relevance in speech recognition (cf. Lin et al. 2018).

In both areas highlighted above, it is not only the documentation data but also the documentarian who crucially helps to advance the phonetic research. This collaboration can result in advances extending beyond phonetics as a linguistic science and demonstrates the relevance of language documentation to science more generally.


Lenore A. Grenoble: Experimental methods in documenting multilingualism and change

Language documentation has largely (and understandably) focused on recording proficient speakers while still possible. This project, however, takes a different approach in studying contact-induced change and shift in progress, in an effort to understand the complex interplay of social, cognitive and linguistic factors at work in language loss. The analysis of documentary materials largely involves qualitative approaches that are combined with elicitation-based fieldwork and ethnography. This project combines these more traditional methods with experimental psycholinguistic methods to test the range and limits of changes in morphosyntax, and the acceptability of both new and pre-shift constructions for current speakers of varying proficiency levels.

The language ecologies of the northeastern region of Russia in the Sakha Republic (Yakutia) provide an excellent testing ground for hypotheses about language contact, as multiple Indigenous languages there are undergoing change and loss as speakers shift to Russian, the national language. Across the Indigenous populations, historically high levels of multilingualism in local languages have given way to bilingualism in Russian or, increasingly, Russian monolingualism. 

I focus on the use of experimental methods, with particular attention to word order changes in two of the languages, Sakha (Turkic) and Even (Tungusic), in different stages of shift. Word order is well-known to be susceptible to contact-induced change (Heine 2008) and to correlate with a number of other typological parameters (Dryer 2007; Song 2001). Thus, if Sakha and Even adopt VO order, we might also find indications of other syntactic changes, e.g. prefixation, prepositions, and finite subordinate clauses. These predictions stem from the hypothesis that word order parameters are consistent within a language, and that these correlations are functionally and structurally motivated. Whether this is accurate remains an open empirical question; Dunn et al. (2011) provide contrary evidence.

Our preliminary work, funded by NSF DEL, suggests that these changes take place sporadically in the speech of Even speakers, with language shift probably impeding them from becoming grammaticalized. They are diffusing more systematically in Sakha speech. This needs systematic testing with a combination of experimental and sociolinguistic field data to determine whether these changes are indicative of imperfect learning and language shift rather than contact-induced convergence. Research in this area has primarily focused on heritage speakers of majority languages in immigrant communities (Polinsky 2018); this research brings a broader range of linguistic data from diverse languages in differing ecologies. 

Beyond contributions to linguistic theory, the project aims to expand and fine-tune our understanding of the mechanisms behind language shift, so that interested communities can enact measures to stop or offset them. By studying varieties that result from language contact, we can facilitate revitalization efforts with new speakers who often produce contact varieties, as in the case of an emergent creole-like variety of Dena’ina (Athabascan), in Alaska, that shows heavy influence from English syntax (Holton 2009). Moving away from puristic tendencies in documentation offers potential advantages, including leveraging the structure of the majority L1 to facilitate acquisition of the revitalized L2, a known learning effect (Onnis & Thiessen 2013).


Gary Holton: What is DEL and what is it good for?

Nearly three decades ago in Chicago, Ken Hale and other perceptive LSA members identified a crisis within the discipline of linguistics, warning that the field risked becoming “the only science that presided obliviously over the disappearance of 90% of the very field to which it is dedicated” (Krauss 1992:10). The response has been both global and paradigm-shifting, leading to a renewed focus on language documentation and language reclamation. There is no doubt that the National Science Foundation Documenting Endangered Languages (DEL) program has been central to that effort. Not only has DEL provided direct support for documentation efforts, it has also contributed to capacity building, the development of new tools for documentation, and the adoption of new approaches to documentation. As we reflect on 15 years of DEL funding, two important questions emerge. First, to what degree has a distinct DEL program contributed to the successful response to the endangered languages crisis? Specifically, could this effort have been equally well achieved directly within existing NSF programs? Even if we answer this question in the affirmative, there remains a second, perhaps more relevant, question. Namely, is a distinct DEL program still useful? In other words, has the DEL program now achieved its intended purpose and outlived its usefulness as a distinct program?

This second question is less heretical than it might first appear. The success of the DEL program has led to an increased awareness of endangered languages across the entire field of linguistics. It is no longer unusual for a theoretician to engage in field work, and few field workers now ignore the plight of language loss within indigenous communities where they work. DEL-funded efforts such as the Austin Principles of Data Citation in Linguistics (Berez-Kroeker et al. 2018) promote increased citations of primary data in linguistics publications. No longer a fringe topic, endangered language documentation and reclamation are now part of the fabric of the discipline of linguistics. So it is not unreasonable to argue that while DEL may have once been useful, the changes in the field over the last three decades have made the need for a distinct funding program less compelling.

In this paper I argue that the answer to both of the questions posed above is “no.” Drawing on examples from several DEL-funded projects, I show that at least in some cases DEL projects would have been unlikely candidates for NSF support without a dedicated funding stream. I then review key features of DEL which distinguish it both from the Linguistics program and from other programs within NSF, showing that the innovations facilitated by DEL are largely unique to this program and are replicated neither elsewhere within NSF nor in other funding agencies (cf. Holton and Seyfeddinipur 2018). Extrapolating from these examples, I conclude that there is still a need for a distinct DEL program. Whether or not the world’s languages are still in crisis, there remains much work to be done. We are far from completing the work of documenting the world’s languages; there is still great need for new tools and infrastructure for language documentation; and there is much yet to do to build capacity for undertaking documentation work.