During the Poster Session, invited participants share methodologies and outcomes representing 15 years of the Documenting Endangered Languages (DEL) grant program of the National Science Foundation (NSF) and National Endowment for the Humanities (NEH).
Posters will present this work in the context of documentation practices and technologies, and their implications both for linguistic theory and for broadening participation and collaboration in linguistic research. The session will include participants representing different awardee profiles (indigenous, non-indigenous, members of “the academy,” community members). Many projects include innovations in information technology, such as computer-aided grammar analysis, socio-spatial analysis of multilingualism, and “talking dictionaries.” Online resources for scholars and community members, such as dictionaries, pedagogical texts, and archives, are also featured; one example is the Kala online encyclopedia. Posters will also represent a broad range of geographic areas and language classifications.
We view this organized poster session as an opportunity both for reflection on significant achievements made possible by DEL-funded research and for consideration of what directions these projects should take in the future. At the same time, the United Nations has declared 2019 to be the International Year of Indigenous Languages (IYIL), and the LSA has committed to tailoring a number of programs and events at its annual meeting and institute to celebrate indigenous languages, community-centered initiatives, and resources for further involvement and investment. We view this organized session as complementary to IYIL-connected events. This session is aligned with LSA priorities for research funding, for endangered languages and their preservation, and for enhanced understanding of the essential role of language in human life.
Emily M. Bender, Joshua Crowgey, Michael Wayne Goodman, Kristen Howell, Haley Lepp, Fei Xia, and Olga Zamaraeva: AGGREGATION: Building Computational Resources Automatically from IGT
The AGGREGATION Project, supported under NSF DEL grants BCS-1160274 and BCS-1561833, has been working to bring the benefits of grammar engineering to language documentation without requiring field linguists to become grammar engineers. We achieve this by automatically creating precision grammars on the basis of three resources: analyses and annotations already produced by field linguists in the form of interlinear glossed text (IGT); a typologically grounded cross-linguistic grammar resource (the LinGO Grammar Matrix, Bender et al. 2002, 2010); and natural language processing systems developed for enriching IGT for low-resource languages (Georgi 2016; Xia et al. 2016).
Precision grammars are machine-readable encodings of mutually-consistent linguistic hypotheses, in our case, concerning morphotactics, morphosyntax and the syntax-semantics interface. They can be used to automatically process text, producing syntactic structures and semantic representations from input strings and generating output strings from semantic representations. Text processed in this way can then be searched for sentences or word forms with structures of interest or items that are not covered by the grammar (i.e. fall outside current hypotheses).
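The search for items that fall outside current hypotheses can be illustrated with a minimal sketch. It is not the AGGREGATION software: the "grammar" here is a toy stand-in consisting of an invented stem list and two invented suffix slots, and the function names `licensed_forms` and `uncovered` are hypothetical. A real precision grammar encodes far richer hypotheses and also produces syntactic and semantic analyses.

```python
from itertools import product

# Toy stand-in for a grammar: invented stems plus an ordered suffix template.
# Each slot lists its allowed fillers; "" means the slot may stay empty.
STEMS = {"mika", "tolu"}
SUFFIX_SLOTS = [("", "ta"), ("", "n", "s")]


def licensed_forms(stems, slots):
    """Enumerate every word form that the toy hypotheses license."""
    return {
        stem + "".join(fillers)
        for stem in stems
        for fillers in product(*slots)
    }


def uncovered(sentences, forms):
    """Return (sentence_index, token) pairs that the hypotheses do not cover."""
    return [
        (index, token)
        for index, sentence in enumerate(sentences)
        for token in sentence.lower().split()
        if token not in forms
    ]


FORMS = licensed_forms(STEMS, SUFFIX_SLOTS)
corpus = ["mikata tolun", "mikan toluta", "tolutas mikax"]  # invented data
print(uncovered(corpus, FORMS))
```

Tokens returned by `uncovered` are the ones a linguist would inspect first, since each either reflects an error in the data or a gap in the current analysis.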
Research products of AGGREGATION to date include the Xigt data format for encoding interlinear glossed text (IGT) enriched with additional annotations such as part of speech tags, parse trees and dependency structures (Goodman et al. 2015); the MOM system for inferring morphotactic rule sets on the basis of collections of IGT (Zamaraeva 2016, Zamaraeva et al. 2019a, building on Wax 2014); an interactive visualization system for viewing MOM output (Lepp et al. 2019); several libraries of customizable linguistic analyses of phenomena not previously covered by the Grammar Matrix customization system (Howell and Zamaraeva 2018, Howell et al. 2018, Zamaraeva et al. 2019b); and a system for inferring typological parameters for the customization system, resulting in skeleton grammars created automatically from IGT (Bender et al. 2013, Bender et al. 2014, Howell et al. 2017, Zamaraeva et al. 2019a).
Jeff Good: Individual-based socio-spatial networks and multilingual repertoires
Investigations of the distribution of languages over geographic space are typically based on simplified representations where a set of points or polygons is overlaid onto a map, and a single language is assigned to each (see, e.g., Dryer & Haspelmath 2013). Such approaches fail to acknowledge the well-known fact that it is people who speak languages, not places, and they inhibit precise modeling of language distributions, especially in contexts where multiple languages are used within a single community, which is often the case for communities associated with endangered languages. There are still significant barriers to collecting and analyzing data on individual-based multilingual repertoires for large-scale areas. However, existing techniques make this possible for small areas, with potential lessons for the study of endangered languages (see, e.g., Hildebrandt & Hu 2014).
This paper reports on the results of interdisciplinary research applying socio-spatial analytical methods to a database of information on the multilingual repertoires of individuals from a rural region of Cameroon where the average adult speaks around five to six languages (Esene Agwara 2013), many of which are endangered. Patterns of language competence were examined with respect to social and spatial networks (see Bian 2016).
Figure 1 provides an example of the results of this work. It represents individuals in the region of focus who are competent in three local linguistic varieties. Grey circles represent sampled individuals not competent in all of these varieties, and orange circles represent individuals who are competent in them. Colored straight lines connect individuals to the villages associated with each of these varieties. This network is overlaid onto a spatial representation of the local road and footpath network. The color of the lines represents difficulty of movement, with green indicating least difficulty, red most difficulty, and orange moderate difficulty.
An important pattern that emerges from Figure 1 is the distribution of individuals who have competence in the three focus varieties but who do not live in the villages that those varieties are associated with. For the most part, they live in socio-economically marginal villages where competence in many languages of the region is a significant social asset given limited opportunities within their own villages (see Di Carlo et al. 2019). From the perspective of understanding patterns of language maintenance, this suggests that, at least in this part of the world, a significant number of speakers of a given language may be highly multilingual individuals from less socio-economically powerful communities who interact with speakers from more powerful communities. This has clear implications for our understanding of the linguistic ecologies of regions characterized by high degrees of multilingualism and also suggests important motivations for the maintenance of multilingual practices in such parts of the world.
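As a rough illustration of the kind of query behind this pattern, the sketch below selects individuals who are competent in every focus variety but reside outside the villages associated with those varieties. All names, villages, and repertoires are invented placeholders, and the `Person` record and `competent_outsiders` function are hypothetical; the project's actual database and socio-spatial methods are not described at this level of detail here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    village: str           # village of residence
    languages: frozenset   # varieties the person reports competence in


# Hypothetical mapping: focus variety -> village associated with it.
FOCUS = {"Variety A": "Village A", "Variety B": "Village B", "Variety C": "Village C"}


def competent_outsiders(people, focus):
    """People competent in all focus varieties who live outside the focus villages."""
    focus_varieties = set(focus)
    focus_villages = set(focus.values())
    return [
        person
        for person in people
        if focus_varieties <= person.languages
        and person.village not in focus_villages
    ]


# Invented sample records for illustration only.
people = [
    Person("P1", "Village A", frozenset({"Variety A", "Variety B", "Variety C"})),
    Person("P2", "Village D", frozenset({"Variety A", "Variety B", "Variety C", "Variety D"})),
    Person("P3", "Village D", frozenset({"Variety A", "Variety D"})),
    Person("P4", "Village E", frozenset({"Variety A", "Variety B", "Variety C"})),
]

print([person.name for person in competent_outsiders(people, FOCUS)])
```

Keeping the repertoire attached to the individual rather than to the village is the design point: it lets one person count toward several languages at once, which a one-language-per-polygon map cannot represent.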
Larry Kimura & Danielle Yarbrough: Kaniʻāina, Voices of the Land: A DEL/TCUP-funded digital repository for spoken ʻŌlelo Hawaiʻi
In this poster we present “Kaniʻāina, Voices of the Land,” the first online repository of ʻŌlelo Hawaiʻi spoken by L1 speakers. This poster will be bilingual in ʻŌlelo Hawaiʻi and English.
Kaniʻāina (http://ulukau.org/kaniaina/) is a digital repository with a bilingual ʻŌlelo Hawaiʻi and English interface that currently provides interactive access to some 525 hours of audio recordings, including the celebrated Ka Leo Hawaiʻi radio broadcasts that aired between 1972 and 1988. These recordings are a treasure chest of Hawaiian language and cultural knowledge shared by Hawaiʻi’s last L1 speakers of ʻŌlelo Hawaiʻi. Most of these speakers were born between 1882 and 1920, and just a few are from younger generations of first language speakers. The Kaniʻāina website is hosted on Ulukau, a bilingual digital library interface that, with some 2 million page-hits per month, is already arguably the single most-accessed site for ʻŌlelo Hawaiʻi materials.
In addition to providing an interface for listening to spoken ʻŌlelo Hawaiʻi recordings, Kaniʻāina, in partnership with the Kaipuleohone Digital Language Archive, will also preserve those recordings and transcripts permanently in a world-class digital language archive and implement a procedure for crowdsourced transcription of additional recordings by, for example, members of the public and University of Hawaiʻi students of ʻŌlelo Hawaiʻi. The project will also serve as a catalyst for a cross-campus graduate educational exchange between students at the University of Hawaiʻi at Hilo (UHH) and the University of Hawaiʻi at Mānoa (UHM) through a course to be offered in the 2020-2021 academic year.
Kaniʻāina grows out of decades of successful, cutting-edge immersion-based language education and statewide interest in promoting ʻŌlelo Hawaiʻi use at every level. This project represents a continuing refinement of the methods of language documentation and of unparalleled technologies for preserving, disseminating and mobilizing four decades of documentation of spoken ʻŌlelo Hawaiʻi.
The Kaniʻāina project is representative of the broad scope of the NSF DEL program. Sponsorship comes from the NSF/NEH DEL program and the NSF Tribal Colleges and Universities Program. The project PIs are Indigenous and non-Indigenous scholars and activists from Ka Haka ʻUla O Keʻelikōlani College of Hawaiian Language at the University of Hawaiʻi at Hilo and the Department of Linguistics at the University of Hawaiʻi at Mānoa.
Kaniʻāina has been made possible in part by a major grant from the National Science Foundation (BCS 1664070) and the National Endowment for the Humanities: Exploring the human endeavor (PD-255910-17).
Brook Lillehaugen, Felipe Lopez, and Savita Deo: Zapotec Talking Dictionaries: DEL impact in creating resources, supporting language activists, & educating undergraduates
This poster examines the impact of DEL funding (DEL/NSF Research Experience for Undergraduates Site Grant, PI K. David Harrison, Award #1451056) which has supported three cohorts of undergraduates to work alongside linguists and language activists in Oaxaca building Talking Dictionaries for four Valley Zapotec language varieties (Otomanguean, zab). We view collaboration and engaged reflection as processes in the creation of the dictionaries and the pedagogical experience for undergraduates, and consider their role in language documentation more generally.
Lexical projects on under-resourced languages provide an opportunity to raise new questions about lexicography generally (Mosel 2011) and digital lexicography in particular. These Valley Zapotec Talking Dictionaries are situated within a larger array of digital language activism projects, as the Zapotec co-authors of each of the Talking Dictionaries also engage in other online promotion of their language, including tweeting. The Talking Dictionaries, then, have been extended in these directions: tweets and YouTube videos can be embedded in lexical entries, bringing Zapotec knowledge to the forefront while making use of the relatively small corpus of written Zapotec texts (Harrison et al. 2019).
The Zapotec collaborators on each of the Talking Dictionaries are also Zapotec language teachers, and sharing Zapotec language publicly is one of their goals in creating these dictionaries. Members of the Zapotec diaspora are explicitly viewed as potential users. The flexibility of the design, which allows real time revisions, is viewed as a benefit, especially to the diaspora community. Not only does this allow the addition of specialized knowledge when the opportunity arises, such as the names and uses of medicinal plants which are known by few, but it also facilitates language activism as a platform for the creation of new terminology, such as Zapotec words for ‘tweet’.
This is a dynamic lexicographic project: the platform, goals, and collaboration all evolve. As much as the work is about language and linguistics, it is also about working together in a multilingual, transnational, diverse team. Undergraduate participants gain skills that will prepare them for a large range of (academic and non-academic) career paths. Students not only expand their knowledge about linguistics, language, and culture, but also practice applying patience, cooperation, and persistence in a complex and detail-oriented research environment. Ultimately, the reflective pedagogical environment that supports students in thinking through the dimensions of this work also serves the entire team and the larger collaborative language documentation work.
Christine Schreyer: Kala Walo Nua: Collaborating across communities and disciplines through the documentation of the Kala language in aquatic environments
This community-based language documentation project includes a range of collaborations, both among Kala-speaking villages and among researchers with backgrounds in biology, anthropology, and linguistics. The project focuses on Kala, an unwritten language until 2010, when its speakers adopted a standardized writing system. The name of our project, Kala Walo Nuã (literally, the Kala Mouth One, or the one Kala language project), emphasizes the collaboration between all six of the villages which speak Kala despite there being four divergent dialects of the language. Kala is a threatened language spoken by about 2000 people living in six coastal villages in Papua New Guinea. A country roughly the size of California, Papua New Guinea is home to more than 850 languages, many of which are under-documented and face extinction because they are spoken by small populations under intense development pressures. The exceptional biological diversity of Papua New Guinea is also under threat and, because ecological knowledge is deeply embedded in oral language traditions, language conservation may promote sustainable use of natural resources. This project simultaneously documents language and environmental knowledge. Kala speakers live near-subsistence lifestyles, growing most of their food in riverside gardens and obtaining most of their protein from coastal fishing. Given Kala speakers’ coastal adaptation and their deep historical, economic and cultural attachment to rivers, a focus on the aquatic environment provides an ideal basis for a deep understanding of their language and the relationship of linguistic to biological diversity.
Our project outcomes include: 1) expanding a Kala-Tok Pisin-English dictionary from 282 to 1500 words; 2) developing an online environmental encyclopedia with 500 entries in Kala and English; and 3) writing a sketch pedagogical grammar of the language. Information for the environmental encyclopedia will be gathered mainly through oral narratives recorded with Kala knowledge experts along river banks or near other aquatic habitats, as well as through review of underwater marine video with Kala knowledge keepers. The narratives are expected to include accounts of historic relationships of specific clans to rivers, place name origins, species descriptions, descriptions of resource-use practices, and descriptions of religious beliefs associated with these environments. Digitized data is archived by Kaipuleohone, the University of Hawaiʻi Digital Language Archive, and an upcoming interactive exhibit at Bishop Museum will share project results with the public.
Siri Tuttle: DEL and ANLC build bridges: Texts, dictionary, grammar, archives, and CoLang 2016
The Alaska Native Language Center, established in 1972 by Alaska statute, has found a responsive partner in the Documenting Endangered Languages program at NSF. In my projects at ANLC (2005-present), academic goals are balanced with goals for language communities. A third goal is also present throughout my work: that of building bridges between earlier research, often inaccessible to non-specialists, and those who can most benefit from the products of this research.
Specifically designed to include community input in the design of a publication, “Lower Tanana Dictionary and Literacy” produced an ANLC publication, Benhti Kokht’ana Kenaga’, Lower Tanana Pocket Dictionary (2009; published in shirt pocket and parka pocket sizes). Members of the Minto language community wanted a small-sized dictionary, comprising a word list agreed on by community elders. During this project, elders also called for the study of traditional song lyrics, a directive that has led to further research supported by NEH and the Swedish Research Council.
Alaska researchers create many recordings that never receive full transcription or translation. With co-investigator James Kari, in “Ahtna Texts” I brought to publication Ahtna texts that had originally been recorded as many as four decades earlier. Two ANLC publications resulted: Ahtna Travel Narratives, edited by Kari (2010), and Yenida’a, Tsuts’aede, K’adiide, a book of myths, histories and oratory, edited by Kari and Tuttle (2018). The oratory in the latter book was provided by Ahtna speakers who participated in editing the older texts, and whose contribution changed the focus of the publication. Yenida’a is the first book of traditional Ahtna narratives to be published since 1989.
In “Alaskan Athabascan Grammar Database Development,” I worked with Olga Lovick to design a comparative grammar database for the closely related but quite distinct Dene languages of Alaska’s interior. This project illuminated the need for more precise glossing and sourcing of published examples from endangered languages, and helped establish guidelines for grammatical and functional ontology for different user communities.
“RAPID: Increasing access and discoverability at ANLA” is part of a long-term effort supporting the Alaska Native Language Archive, changing it from ANLC’s working library into a modern physical and digital archive with access for language communities and researchers. Discoverability has been improved through the development of processing protocols, workflows and metadata design. Access has been increased through the involvement of a graduate and an undergraduate student in the processing and cataloguing of their heritage languages.
The Institute on Collaborative Language Research – ALASKA was presented at the University of Alaska Fairbanks in 2016. Lawrence Kaplan, Alice Taff and I were leads. Over 150 language workers visited during the longest days of the year and engaged with multiple workshops and other activities for two weeks. Three field methods classes were also offered following the workshops: in Hän Dene, Unangam Tunuu (an archive-based workshop), and Miyako. We continue to hear from CoLang participants about the inspiration and long-term community support they have drawn from this institute.
Racquel-Maria Sapien and Chief Ferdinand Mandé: Rewards and Challenges of Long-Term Collaboration: 15 years in Konomerume (and counting!)
Konomerume, Suriname, is a village on the banks of the Wajambo River that represents the geographic border between two languages: Kari’nja (Cariban) and Lokono (Arawakan). Since both languages are now severely endangered, community members have been involved with several projects to document and preserve their native and heritage languages. This poster traces the nearly 15-year collaboration between the former village Chief, Ferdinand Mandé, a team of community leaders, and Racquel-Maria Sapien, an academic linguist with an interest in community-collaborative language research. Together with other community members, they have produced tangible outcomes that include multimedia documentary corpora, academic articles describing aspects of morphosyntax, and pedagogical materials such as an elementary school curriculum and lessons for the community-led adult class. Broader impacts include increased participation for members of an underrepresented group in the academic discussion of their language as well as ongoing training. For example, community members have presented aspects of their work at international academic conferences in French Guiana and the United States. In addition, they have participated in training workshops such as CoLang (formerly InField) and are now able to conduct language projects of their own design. More recently, one member of the community was invited to participate in a workshop on Community-Based Language Research Across the Americas (CBLRAA) that took place prior to the 2019 LSA and SSILA meetings. Her participation included presenting a poster at the LSA meeting. Chief Mandé, Dr. Sapien, and other Konomerume community members are currently engaged in an NSF-DEL supported documentation project that has brought together members of several communities with a shared goal of documenting Lokono. For this community-to-community project, native and heritage speakers of one endangered language (Kari’nja) are documenting another (Lokono).
In addition, they are training members of other communities, including Powakka and Witsantie, in tools and techniques for documentation. This poster highlights previous and ongoing projects, and describes challenges such as competing goals, communication lapses, and unreliable transportation with an eye toward identifying and implementing collaborative problem-solving strategies. The poster illustrates a multi-faceted collaboration that highlights the benefits of effective relationship building to scientific inquiry in linguistics and other social sciences.