Archives strengthening historical narratives: Sharing digital and linked data resources for broader reach and sustainability
Abstract
Private collections provide engaging windows into little-known subjects that, when made discoverable, are incredibly relevant to many diverse audiences. The Texas Coastal Bend Collection (TCBC) is a digital-first private collection that offers rich insight into the culture of the Texas Coastal Bend ranching communities, starting with the Irish immigration in 1834. The site’s topic-based framework immerses people in the region’s cultural history. Rich, well-structured metadata (subjects, people, places, historic events, relationships) allows every page to be a gateway for exploring over 200 artistic photographs, 9,000 images, archival documents, books, maps, genealogies, and 1,400 hours of oral history. We describe the strategies and tools that enable rich exploration of the TCBC’s unique resources, its maintenance by a small dedicated staff, and how meaningful digital connections with other institutions can foster storytelling across an array of subjects. The digital approach that underpins the TCBC, incorporating highly structured categorization, linked data, IIIF, and a unique audio player, provides insights that can be used by other museums and archives. This talk is valuable for people who manage historic and/or archival collections that aspire to be online, and institutions that plan to create shareable online digital resources that can be connected with other institutions’ collections. It provides insights into ways that oral history and images can be used to assemble flexible topic-based narrative structures, and it describes technical and curatorial approaches, so others don’t have to “start from scratch.”
Keywords: Online museum, Archives, Linked Data, Collaboration, Storytelling, Oral history
Private collections provide engaging windows into little-known subjects that, when made discoverable, are remarkably relevant to many diverse audiences. The Texas Coastal Bend Collection (TCBC, http://texascoastalbend.org) is an online cultural history museum and archive based on a private collection, offering rich insights into the Texas Coastal Bend ranching communities starting with the Irish immigration in 1834. The site’s design has a topic-based framework to introduce and immerse people in the region’s cultural history. Rich, well-structured metadata (subjects, people, places, historic events, relationships) allows every page to be a gateway for exploring 1,400 hours of oral history audio, over 100 artistic portrait photographs, 9,000 images, a wealth of archival documents, books, maps, and genealogies.
Extending beyond the traditional definition of institutional or personal archives, the TCBC is centered around the voices of the African American and Mexican cowboys, the primarily Anglo-American ranchers, as well as the women and the townspeople populating the Coastal Bend region. To facilitate discovery of the embedded stories found in this historical region, the TCBC has focused on cross-linking related assets with an emphasis on subjects, people, and the sense of place. Other cultural and oral history collections can not only use this site as a digital resource, but also build on its approaches to design, relationship-based linking and navigation, subject and topic management, and oral history interaction.
What makes this collection interesting and unique?
The Texas Coastal Bend Collection is focused on a three-county geographic area with one economic engine (ranching), and a second insulating industry (oil) that shifts, but doesn’t disrupt, the first. The ranching culture in South Texas was fairly isolated until the 1970s, when oil money allowed many of the big ranches to continue to employ many of the old ways. The relative isolation was a container for a culture that was only beginning to be diluted by the modern world when cousins Louise and Nancy O’Connor, descendants of the O’Connor ranching family, started collecting oral history interviews in 1981.
The collection reflects the first-hand experience of a relatively self-contained culture, centered around the land and the natural environment. Human connection to nature is the beating heart of this most human collection.
There is also the nuanced story of three ethnicities blending into a tripartite culture, where one community grows out of the working interactions of three interdependent cultures. The Irish immigration into South Texas “is, in many ways, coincident with the founding of the republic and the development of the state” (Fry, 2010), so it’s not surprising that many Irish ranchers came to embody the Texas mythology of “courage, determination, ingenuity, and loyalty” (Busby, 1995). The collection also presents a day-to-day telling of the experiences of the African-American cowhands, who illuminate Professor Michael ‘Cowboy Mike’ Searles’ (teacher of African-American history at Augusta State University) insight: “probably nowhere do the blacks have as much independence and freedom as the black cowboy” (Foehr, 2002). The third leg of the tripartite culture was the Vaqueros, the Mexican ranchers and cowhands who taught the Irish the Spanish traditions of working cattle (Davis, 2002). How these three peoples melded into a community provides an important cultural perspective.
Figure 1: Texas Coastal Bend Collection homepage; http://texascoastalbend.org
Like many private and historical collections, this one begins with a personal, in this case artistic, passion. Louise O’Connor is a photographer who was photographing what she knew and loved—the people who surrounded her in the ranching community of the Texas Coastal Bend. Nancy O’Connor is a museum installation artist who was using the ranching culture as her primary subject. As they worked, they listened to the stories that they had been hearing since childhood. Louise’s “aha” moment came when she asked cowhand L.V. Terrell to tell her about her grandfather. As the stories tumbled out, she realized she had to preserve not only a dying culture, but the lives of these people and “their importance to the Universe” (O’Connor, 2012). So, in 1981, the interviews began. Louise and Nancy adopted an approach to capturing oral history that was comfortable for the people in their community; sitting around together in groups of three, four, or five, swapping stories, and encouraging each other to share their memories. Louise asked the historical questions and Nancy asked the personal questions—together, they were able to make the interview process a social event for reminiscing. Consequently, the oral histories are deep and personal, and the portrait photographs are intimate. Add in the visual context of the historical photographs, drawn from all social strata, bottom to top, and the extensive document collection from lives long past, and the assembled assets reflect a detailed study of a way of life. The collection is a treasure trove for researchers. It is a living history, a human collection, not just an interpretation of objects.
Figure 2: each feature page provides that person’s portrait photograph, and access to their audio, images, and text assets
Essentially, the TCBC is a collection of voices delivering history as an actual lived experience, told by those who labor—rancher and cowhand alike.
Why present it as a digital collection?
Creating a digital collection is a natural extension of Louise’s and Nancy’s goal of sharing these stories with a world increasingly disconnected from the natural environment. They seek to make the history of the region and the ranching lifestyle approachable for everyone—not just raw material for scholars. This is not a collection that is a physical “destination” for researchers to sit in a room in South Texas, listening to hour after hour and reading archival documents. This collection seeks to reach a broad audience where researchers and the general public are both welcome. Presenting digital content (images, voices, texts) is increasingly a motivation for many holders of small collections and community/state archives, who recognize that their interests and materials can be combined with other collections electronically, extending the reach and value of the collection (Gruber, 2018; Hansen & Moya, 2015).
Audio presents a unique set of challenges to anyone wanting to explore the content. Two barriers confront the user out of the gate. First, analog audio tapes degrade over time, and handling analog tapes accelerates the degradation. Digital playback is the only medium that allows unlimited access with no cumulative degradation, and the large volume of audio material in the collection rules out any practical analog access. Second, the only way to know what exists is to listen, and listening demands a heavy time commitment. While transcription can support access to the audio (and we use it), verbatim words are no substitute for the inflected voice. Digitization and curation together make the audio searchable, greatly decreasing the time commitment. By using a custom keyword hierarchy supported by verbatim and highlight transcripts, a search can generally locate key audio passages within a few minutes.
The same keyword hierarchy that serves the audio search capabilities also serves to connect our image and text assets. The data-rich digital platform allows all relational assets to be linked, and thus provides the opportunity for the relationships found between our three major media assets to be curated and woven into narratives. Likewise, our digital tools will facilitate the user constructing the stories they discover.
The digital world allows us to fully leverage the ability to have multiple copies of assets. We have set up a workspace for researchers and “re-users” to aggregate their search results by saving assets to customized folders.
The Texas Coastal Bend Collection is currently a private collection looking to find a forever home with a university or museum. We believe that our assets, and particularly our strong oral histories, are more valuable as an accessible digital collection, and that the platform we are creating will be a significant addition to an established institution’s existing collections.
Presenting the collection as a linked network of relationships
The creation of an integrated online museum and archive, where the assets are presented in relationship to each other, is the overarching goal of our asset presentation. Integrated and networked collections are increasingly identified as highly desirable to scholars and the general public (Jones, 2015). A primary access goal for TCBC is to encourage users to browse as a means of discovery. We believe that through self-directed discovery, our assets will more effectively engage users.
Our browsing is accomplished by lateral navigation. Lateral navigation works to cut across pre-determined hierarchical relationships that tend to silo assets. Lateral navigation lends itself to discovering relationships, our defining unit of cultural history, by expanding the user’s choice of what’s next. The particular style of digital storytelling we are exploring requires an active user whom we envision as residing at a center point surrounded by a globe of data points, each equidistant from the user.
The website continually introduces users to new, related content, implicitly educating users on both the themes and the structure of the collection. A linked browsing approach can help users become increasingly comfortable with the subject and what is available. Early research indicated that people who are not already familiar with the subject of the collection would not succeed with only tight, self-directed navigation.
Having our collection online alongside other collections that touch on related themes and historical subjects allows our assets to be used as a resource for historians and as a curriculum enhancement for teachers. These assets are powerful additions to other collections of black history, Mexican culture, Irish immigration, and Western life. But beyond the educational utility, our main thrust is to engage browsers and create a platform for makers. The reuse of our assets by others is highly encouraged, and our tools seek to facilitate those opportunities. Currently, the underlying data is being prepared to migrate to Linked Open Data, extending the value of the collection to others.
The underlying technology and models have evolved as the understanding of the content relationships has evolved. The design centers around the following key principles:
- Focus on relationships, not just objects themselves
- Provide thematic entry points that make unfamiliar subjects approachable and exploratory
- Encourage discovery and serendipity
- Offer a clean, modern experience
- Take a linked data approach that allows the collection to share assets and integrate with other thematic collections
- Allow the collection to grow, and the design patterns to extend for use with other related collections
Important aspects of the design
Focus on relationships
Relationships illuminate meaning, meaning reveals story, and stories engage people. And, people are the heart of the culture’s stories, whether they’re the subject or the storyteller. They are the voices in the oral histories; they are the focal points in the portraits; they are both the narrators and subjects in the books and digital archival materials.
The design of our site seeks to illuminate people’s relationships to each other, to places, to the natural world, and to events, both big and small, that reflect their work and their community. The design also illuminates the broader themes that connect them to people and events outside the region. These relationships are best found in the recorded audio conversations, but it is all of our assets together, in relation to each other, that facilitate an understanding of the culture of the Texas Coastal Bend.
Figure 3: oral history audio page segment, showing the people in the conversation. The multi-track scrolling timeline provides a visual cue of the shared conversation among the participants
Figure 4: a segment of a subtopic page combines snippets of audio, photographs, and book quotes that together reflect and present the interwoven relationships among the people and the available content
Provide thematic entry points
As noted above, we recognize that many people who engage with the collection will not have a deep prior knowledge of the subject and the stories. Topic and subtopic pages are a key focal point in the site design, to draw people in and provide a “landscape” for their exploration of the rich content in the collection.
“Topics” are distinct from “subjects” in the metadata model:
- Topics are thematic, constructed as a collection of concepts and ideas that together provide an orientation to the life and work of the people. Topics and subtopic examples include “life in a cowboy camp”; “the pride of standing tall in the saddle”; “from horses to trucks to helicopters”; “strong women of the TCB”; “conservation and stewardship”
- Subject hierarchies can be literal depictions or conceptual statements (“about-ness”), where in both cases they apply to individual assets (images, book pages and text documents, audio segments). Subject examples include “throwing calves”; “pasture riders”; “midwives”; “securing food”; “attending church”; “living in nature”
There is an integrated relationship between topics and subjects that provides significant capabilities for building relationships between the assets, optimizes search and relevance ranking, provides connecting points with other collections and scholarship, and eventually allows the curator, researchers, and users to construct story narratives that extend our understanding of the material.
Figure 5: a layered classification approach for subject-tagging each asset, using multiple fields in the database to provide more nuanced understanding of subjects. Topics, which are used to provide entry points for broader narratives, are separate but connected to subjects
Figure 6: hierarchical topic structure is linked to subjects that are reflected in the topic. The combination of the two provides enhanced technical capabilities, and also provides bridges out to other collections and scholarship through linked data and content sharing.
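The relationship between topics and subjects can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TCBC's actual schema: the parent subjects ("working cattle", "community life"), the topic-to-subject links, and the names `SUBJECT_PARENT`, `TOPIC_SUBJECTS`, `Asset`, `descendants`, and `assets_for_topic` are all hypothetical. Only the leaf subject labels and topic titles are taken from the examples above.

```python
from dataclasses import dataclass, field

# Hypothetical subject hierarchy: child -> parent (None marks a root).
SUBJECT_PARENT = {
    "working cattle": None,
    "throwing calves": "working cattle",
    "pasture riders": "working cattle",
    "community life": None,
    "attending church": "community life",
    "midwives": "community life",
}

# Hypothetical links from thematic topics to the subjects they draw on.
TOPIC_SUBJECTS = {
    "life in a cowboy camp": {"working cattle"},
    "strong women of the TCB": {"midwives"},
}


@dataclass
class Asset:
    asset_id: str
    kind: str  # e.g. "image", "audio segment", "book page"
    subjects: set = field(default_factory=set)


def descendants(subject):
    """Return the subject plus every subject beneath it in the hierarchy."""
    found = {subject}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBJECT_PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def assets_for_topic(topic, assets):
    """Collect assets tagged with any subject linked to the topic,
    including narrower subjects below the linked ones."""
    wanted = set()
    for subject in TOPIC_SUBJECTS[topic]:
        wanted |= descendants(subject)
    return [a for a in assets if a.subjects & wanted]
```

Because assets are tagged only with subjects, a topic page in this sketch never needs its own asset list: linking a topic to a broad subject automatically pulls in assets tagged with any narrower subject.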
The thematic structure encourages exploration at many different levels for people with varying degrees of knowledge about, and interest in, the subject:
- Easy orientation for a first exposure to the collection.
- Topical interest within a specific area, e.g. ranching and cowboy life, black history, Mexican history, the tripartite culture, rural life in the early 20th century.
- Aggregated entry points for scholarly and archival research into previously unavailable resources.
Encourage discovery and serendipity
Assets are classified at a very detailed level using the extensive vocabulary that reflects the different thematic, contextual, and personal relationships in the material. This deep tagging offers users unexpected insights about the material as well as launch-points for further, more focused exploration of a particular subject.
The audio interviews range from half an hour or so to as much as five hours. Each interview is broken down into tracks (individual audio files) of 20 minutes to over an hour, with each track broken into segments. Each segment is then tagged to identify the people, places, and subjects that are mentioned in the segment. The user interface provides the ability to filter the entire track down to only the segments that reflect a specific subject or subjects, based on a user’s interests. At the same time, each segment shows the user the related tags, prompting new subject explorations.
Figure 7: side-by-side images of the audio playback page, with segments of the interview unfiltered and then filtered by a specific subject, person, or place
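The segment filtering described above can be sketched as follows. This is an illustrative model only: the `Segment` fields, the function names, and the rule that a segment must carry every selected tag are our assumptions for the example, not a description of the production code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start_s: int          # offset of the segment within its track, in seconds
    end_s: int
    people: frozenset     # people mentioned in the segment
    places: frozenset     # places mentioned in the segment
    subjects: frozenset   # subject tags applied to the segment


def filter_segments(segments, subjects=(), people=(), places=()):
    """Keep only segments that carry every requested tag.

    With no tags requested, the whole track is returned unfiltered.
    """
    want_subjects = set(subjects)
    want_people = set(people)
    want_places = set(places)
    return [
        s for s in segments
        if want_subjects <= s.subjects
        and want_people <= s.people
        and want_places <= s.places
    ]


def suggested_tags(segments):
    """List the subject tags on the given segments, as prompts for
    further exploration."""
    tags = set()
    for s in segments:
        tags |= s.subjects
    return sorted(tags)
```

Calling `suggested_tags` on the filtered result mirrors the behavior described in the text: each visible segment exposes its related tags, which can seed the next filter.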
Offer a clean, modern experience
The Texas Coastal Bend Collection aims to reach a wide audience by presenting history as approachable and fascinating. There is a large amount of data connected with each asset, and it is a design challenge to prioritize that which facilitates the browsing experience. The browsing experience brings its own design challenges to find the sweet spot between too many and too few navigation choices.
The audio page is laid out vertically, rather than the more traditional horizontal layout, to allow more effective scrolling through the segments; the ability to see the ebb and flow of the conversation in the channel indicator; the ability to navigate quickly to different places in the conversation; a clear way to connect transcript to audio; and a clear way to present related subject tags and additional controls for users to manage. The design will also extend to presenting related images, pulled automatically from the collection via the subject/topic relationship model described earlier in this paper.
Books are presented in two different page formats: a traditional book replication and a Web view.
- Each book is available in its entirety and shown as a scanned digital page by page layout. Navigation is available to skim through the entire book quickly, when looking for something specific. The book is fully searchable, and since each page has its own tagging, users have the ability to filter down to specific pages of interest
- Each book’s chapters are made up of thematic sections. The Web view reformats these sections as a vertical scroll with text in the left column and the book’s images in the right column. This allows an increased flexibility to add photos or audio snippets to the book pages and enhances the ability to link to the collection’s other assets.
Figure 8: illustration of two different book layouts, showing the same section of the book presented in digitized page format and as a content section with related assets
The underlying linked structure of the data
Take a linked data approach
All the content presented in the Texas Coastal Bend Collection has been modeled from an entity-based, networked, linked data approach (Degler & Johnson, 2015). What does that mean?
- All the major “entities” of the collection—people, organizations, places, events, subjects, themes, content types—were modeled as primary items in the data model, allowing them to have behaviors and relationships associated with them programmatically
- The “network” of relationships between entities is inherent in the way that content is managed, rather than having to be applied as an after-thought
- Linked data modeling identified the nature of the relationships between the entities, and how that could be used to optimize the search and browse experience, as well as prepare for future sharing of data and objects with others
Figure 9: TCBC’s high level object model, showing the way that entity relationships are established across the collection
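An entity-based, networked model of this kind can be illustrated with a small in-memory graph. The sketch below is hypothetical throughout: the class name `EntityGraph`, the identifiers, and the predicate names are invented for the example and do not reflect the TCBC database or any published vocabulary.

```python
from collections import defaultdict


class EntityGraph:
    """Entities are first-class items; relationships are typed links."""

    def __init__(self):
        self.types = {}                    # entity id -> entity type
        self.outgoing = defaultdict(set)   # subject id -> {(predicate, object id)}
        self.incoming = defaultdict(set)   # object id -> {(predicate, subject id)}

    def add_entity(self, entity_id, entity_type):
        self.types[entity_id] = entity_type

    def relate(self, subject_id, predicate, object_id):
        """Record a typed relationship between two known entities."""
        for entity_id in (subject_id, object_id):
            if entity_id not in self.types:
                raise KeyError(f"unknown entity: {entity_id}")
        self.outgoing[subject_id].add((predicate, object_id))
        self.incoming[object_id].add((predicate, subject_id))

    def neighbors(self, entity_id, entity_type=None):
        """Entities linked in either direction, optionally limited to one type."""
        linked = {obj for _, obj in self.outgoing[entity_id]}
        linked |= {subj for _, subj in self.incoming[entity_id]}
        return sorted(
            e for e in linked
            if entity_type is None or self.types[e] == entity_type
        )

    def triples(self):
        """Flatten the graph to (subject, predicate, object) statements."""
        return sorted(
            (s, p, o) for s, links in self.outgoing.items() for p, o in links
        )
```

Storing links in both directions is what lets any page act as a gateway: a person page can list the segments that mention that person without those segments being attached to the person record by hand. The flat list from `triples` is the shape that a later linked data export would typically start from.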
The introduction of a publicly exposed and fully integrated linked data framework is vital to extend the value of the TCBC, as has been identified as a trend for other museums and archival collections (Baumgaertner & Lehner, 2017). The model was designed to eventually align with the emerging Linked.Art model for museum collections (http://Linked.Art), as well as emerging norms/standards for archival linked data modeling.
As described previously, the 1,400 hours of audio is managed in various ways so that it can be cataloged and described precisely, linked at precise points, and used in various ways on the site.
- Many of the original interviews were four-channel master recordings relying on four lavalier microphones for the participants. For the online presentation they have been merged into a single playback file with each channel distributed across the playback stereo spectrum. Additionally, each of the channels was mapped to the visual waveform to make clear on the page who is talking when.
- The interviews are presented as multiple audio files, which we’re calling “tracks,” that can range from twenty minutes to over an hour. An interview could contain as many as fifteen twenty-minute “tracks.” Each track (audio file) is accessed individually.
- Each digital audio track is further subdivided into a set of “cue points”—sequential segments that are typically two to four minutes long—that play one after the other, linearly, on the page. Attached to each segment is either a verbatim transcript or a much shortened highlight transcript tagged appropriately with entity keywords (subjects, people and places). This creates the linked network throughout the stories on the site. Segments are determined by the flow of the conversation and the subjects that are discussed.
- In the future, the Web user interface will allow registered users to save individual segments.
- Curated audio snippets, typically between 20 seconds and two minutes long, are integrated throughout the topic and subtopic pages and will eventually find their way on to the book web view pages. The snippets are either directly pulled from an audio file or are manually mixed from the four-track masters. These curated snippets are then linked back to the original interview audio page. In some cases, the individual speaker’s statement may be edited, or there may be multiple speakers from multiple interviews edited together to provide further insight into a topic.
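To make the track and cue point structure concrete, the following sketch shows one way to find which segment is playing at a given playback offset. It assumes each track stores its cue points as a sorted list of segment start times in seconds; that storage format and the function name `segment_index` are assumptions for illustration.

```python
from bisect import bisect_right


def segment_index(cue_points, offset_s):
    """Return the index of the segment that contains offset_s.

    cue_points is a sorted list of segment start times in seconds.
    Each segment runs from its own cue point up to the next one;
    the last segment runs to the end of the track.
    """
    if not cue_points or offset_s < cue_points[0]:
        raise ValueError("offset falls before the first cue point")
    return bisect_right(cue_points, offset_s) - 1


# Hypothetical track with four segments of two to four minutes each.
cue_points = [0, 150, 390, 600]
current = segment_index(cue_points, 400)  # falls in the third segment (index 2)
```

A lookup like this is what allows a curated snippet or a search result to link back to a precise point in the original interview: the link only needs a track and an offset, and the player can highlight the matching segment, its transcript, and its tags.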
The users’ ability to search over 30,000 segments of TCBC audio gives an unprecedented access to the oral history. We believe this is one of the most comprehensive oral history studies of a singular culture available online.
Allow the collection to grow and extend to other related collections
A key requirement for the project is to make it possible for a very small staff to maintain and enhance the online collection. The underlying database, tagging, and linking capabilities aim to streamline the cataloging process, and to prompt suggested terms. Batch tools allow metadata to be entered and modified across multiple assets. Related content is generated from the rich entity/theme/subject data structure and delivered as the site is used, so it is enhanced as the content and metadata are enhanced.
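The interplay between batch tagging and generated related content can be sketched as follows. The ranking rule (count of shared tags, ties broken by identifier) and the names `related_assets` and `batch_add_tag` are assumptions made for the example. The actual relevance logic is not described here.

```python
def related_assets(target_id, tags_by_asset, limit=5):
    """Rank other assets by how many tags they share with the target."""
    target_tags = tags_by_asset[target_id]
    scored = []
    for asset_id, tags in tags_by_asset.items():
        if asset_id == target_id:
            continue
        shared = len(target_tags & tags)
        if shared:
            scored.append((-shared, asset_id))
    # Most shared tags first; identifier order breaks ties.
    return [asset_id for _, asset_id in sorted(scored)][:limit]


def batch_add_tag(tags_by_asset, asset_ids, tag):
    """Apply one tag to many assets in a single cataloging step."""
    for asset_id in asset_ids:
        tags_by_asset[asset_id].add(tag)
```

Because the related list is computed when a page is requested rather than stored, a single batch edit changes what every affected page recommends, with no extra curation step.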
Another benefit of these capabilities is that it should make it easier in the future to incorporate additional collection objects and digitized archives from other ranches in the region or the state, if there is a desire to extend the scope of the current website.
IIIF (the International Image Interoperability Framework, http://iiif.io) and the OpenSeadragon viewer (https://openseadragon.github.io) are used for presenting all photographs, books, and archival objects. This provides a level of extensibility and integration over time.
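As an illustration of what IIIF makes possible, the sketch below builds request URLs that follow the IIIF Image API pattern of identifier, region, size, rotation, quality, and format. The server address and image identifier are placeholders, not TCBC endpoints, and the helper names are hypothetical. The default size keyword `max` is defined in versions 2.1 and 3.0 of the Image API.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    base is the image service prefix on a hypothetical server; the
    identifier is percent-encoded so it is safe as one path segment.
    """
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """URL of the image information document (info.json) for an image."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Placeholder server and identifier, for illustration only.
thumb = iiif_image_url("https://images.example.org/iiif", "photo 001", size="200,")
```

A viewer such as OpenSeadragon can be pointed at the `info.json` URL to load the same image as zoomable tiles. Other institutions can request regions or sizes of an image through the same URL pattern, without needing a copy of the file.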
The TCBC’s underlying architecture mostly uses commonly available open source components, so it is easier to extend and leverage enhancements developed by the software community. At the same time, the modeling of the data is rich enough to allow a future sophisticated linked data representation to be presented for users who want to connect the collection with other topically related collections.
There are a number of items on the TCBC roadmap. In content terms, we will continue work on expanding the thematic pages, and presenting topics and subtopics as entry points. The archival assets will continue to be digitized and transcribed, which may include the introduction of crowdsourced transcription.
In the future, we will add two audio features: individual segment return and persistent audio. Persistent audio, the capability of playing an audio piece while navigating anywhere on the site, provides the audio with the missing visual component, using active navigation of the website for related imagery.
The “Place” entity is not yet implemented. Like the “Person” page, the site will have a page template for those places that significantly inform the ranching culture of the Coastal Bend. The individual ranches would be the primary places identified, but the list would also include the towns and settlements that impact the culture. The place page will be where our collection of maps will be integrated.
The TCBC site will use infographics and interactive tools to visualize our data. We plan to develop a genealogy presentation graphic that illustrates the interrelatedness of the ranching and cowhand families in the Coastal Bend. We will develop interactive timelines that change what is being plotted to better look at lifespans, innovations, events (whether regional, national, or international), weather, epidemics, the cattle market, etc. In keeping with the site’s cross-linking goals, the data points used in these graphic presentations will link to any relational assets elsewhere in the project to encourage the browsing experience.
The project has, in its collection, a limited amount of archival video that has yet to be incorporated. We do have a growing number of on-camera interviews with various academic experts that will be used as short clips within the topics and subtopics to explain, clarify, provide insights and context, and generally give weight and authority to our curated presentations.
Interestingly, the ranching culture produces its fair share of artists. It’s not unusual for the ranchers themselves to have an artistic side. We plan on integrating two major artists who grew up in the culture and mine it to inform their work. Nancy O’Connor, one of the project’s founders, has a lifetime of museum installation sculptures that pay homage to the people and their spirit. Kermit Oliver, the son of one of our cowhand interviewees, is the only American artist to design for Hermès in Paris, and two of his works were shown at the new Smithsonian National Museum of African American History and Culture. His paintings evoke the human connection to nature in ways words can never achieve. In both cases, the artistic expressions give a depth of understanding to a culture that few other collections can offer.
At some point, the collection and the digital infrastructure/website will find an institutional home. The project is already reframing its roadmap to prepare the collection for a future move, allowing for further expansion with other, related private collections. This prompts the need to address considerations for portability of the assets and code. Two features that have been identified as important for its future are the full implementation of Linked Open Data, as well as additional refinements in the ways that cataloging and cross-linking are handled in the administrative parts of the application. These enhancements can also extend the capabilities for TCBC to develop into a framework for other private collections to use.
As our curator Mark Coffey always says—“Onward.”
References
Fry, P.L. (2010). “Irish.” In Handbook of Texas Online. Published by the Texas State Historical Association. June 15, 2010. Consulted March 3, 2018. Available http://www.tshaonline.org/handbook/online/articles/pii01
Busby, M. (1995). Larry McMurtry and the West: An Ambivalent Relationship. Denton, TX: University of North Texas Press. Page 45.
Foehr, S. (2002). Waking Up in Nashville. London: Sanctuary Publishing. Page 186.
Davis, G. (2002). Land!: Irish Pioneers in Mexican and Revolutionary Texas. College Station, TX: Texas A&M University Press.
Gruber, E. (2018). “Fralin Museum coin collection leaps forward, embraces IIIF.” Numishare blog, American Numismatic Society. Published January 3, 2018. Consulted January 4, 2018. Available http://numishare.blogspot.com/2018/01/fralin-museum-coin-collection-leaps.html
Hansen, E. & M. Moya. (2015). “In Next Week’s Episode…: Serializing the Online Exhibit.” Texas Archive of the Moving Image. Museum Computer Network presentation. November 7, 2015.
Jones, M. (2015). “Artefacts and archives: Considering cross-collection knowledge networks in museums.” MWA2015: Museums and the Web Asia 2015. Published August 15, 2015. Consulted September 29, 2017. http://mwa2015.museumsandtheweb.com/paper/artefacts-and-archives-considering-cross-collection-knowledge-networks-in-museums/
O’Connor, L.S. (2012). Audio interview from the Texas Coastal Bend Collection. Track 4, 21:20. Recorded March 20, 2012. Consulted January 13, 2018. Available http://www.texascoastalbend.org/collection/view/AI120320/interview/?q=ai120320&p=1&so=rv:d&n=0#t=14691&kf=&pf=&lf=&s=1278
Degler, D. & N. Johnson. (2015). “How and when to use LOD? Starting your institution’s conversations about Linked (Open) Data.” Workshop at Museums and the Web Conference. April 8, 2015. Accessed http://mw2015.museumsandtheweb.com/proposal/how-and-when-to-use-lod-starting-your-institutions-conversations-about-linked-open-data/
Baumgaertner, T. & F. Lehner. (2017). “The technical aspects of museum information vs. the museum professionals point of view: a conceptual change of perspective on data processing.” MW17: MW 2017. Published February 16, 2017. Consulted January 13, 2018.
Coffey, Mark, Watts, Alan and Degler, Duane. "Archives strengthening historical narratives: Sharing digital and linked data resources for broader reach and sustainability." MW18: MW 2018. Published January 19, 2018.