In 2021, the San Diego Natural History Museum (SDNHM) received funding from the US Institute of Museum and Library Services (IMLS) to improve the condition, management, and accessibility of the dry Marine Invertebrate Collection (IMLS MA-249928-OMS-21). The collection is one of the oldest at the museum and dates back to the founding of the San Diego Society of Natural History in 1874. It is a large, dry collection of primarily gastropods and bivalves and is estimated to contain nearly five million specimens in over 91,000 specimen lots, which include 135 holotypes and 856 paratypes. The collection has lacked dedicated staff for nearly 25 years and has suffered from inaccessible data and inadequate physical and taxonomic curation (Fig. 1). This project aims to address these issues through dedicated staffing, specimen digitization, collection analysis, and rehousing.
The research value of the collection is exceptional, owing to both its geographic and its temporal uniqueness. Although the collection is worldwide in representation, it houses some of the earliest known collections from the museum’s focal region: southern California and the peninsula of Baja California. It also houses an important regional collection of land snails, perhaps the fauna most impacted by the current mass extinction (Cowie et al. 2017, Regnier et al. 2009).
Historically, each specimen lot in the collection was recorded on a paper catalog card (Fig. 2). All cards were scanned and then processed with optical character recognition (OCR), which gave moderate results. The OCR output required large-scale data cleaning in Microsoft Excel to produce an initial digital dataset. Iterative bulk updates to selected fields using data keys helped minimize the number of fields requiring manual review, and trained volunteers then manually reviewed the remaining fields of each database record. This approach maximized the quality of our dataset and culminated in the collection’s first publicly accessible digital specimen catalog. The SDNHM Marine Invertebrate Collection records can be found online via the InvertEBase Symbiota portal (Gries et al. 2014).
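As an illustration only, the bulk-update step with a data key can be sketched in Python. The actual cleaning was done in Microsoft Excel, so the function name, field names, and key entries below are hypothetical assumptions rather than the project's workflow. The idea shown is that values matched by the key are corrected in bulk, while unmatched values are set aside for manual review.

```python
# Hypothetical data key: maps raw OCR variants (lowercased) to a cleaned value.
# The entries are illustrative, not taken from the SDNHM dataset.
COUNTRY_KEY = {
    "u.s.a.": "United States",
    "usa": "United States",
    "mex.": "Mexico",
    "mexico": "Mexico",
}


def apply_data_key(records, field, key):
    """Bulk-update one field using a data key.

    Returns (updated, needs_review). Values found in the key are replaced.
    Non-empty values missing from the key are left unchanged and listed in
    needs_review. Empty values are left as they are. The input is not mutated.
    """
    updated, needs_review = [], []
    for rec in records:
        new_rec = dict(rec)  # copy so the original record stays untouched
        raw = (rec.get(field) or "").strip()
        cleaned = key.get(raw.lower())
        if cleaned is not None:
            new_rec[field] = cleaned
        elif raw:
            needs_review.append(new_rec)
        updated.append(new_rec)
    return updated, needs_review
```

Running the function repeatedly, and extending the key with each batch of reviewed values, mirrors the iterative character of the updates described above: every pass shrinks the set of records that a volunteer has to inspect by hand.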
Analyses of the digitized collection records revealed a remarkably wide geographic spread, extreme disparities in data quality, many unlabeled specimens, and outdated scientific names. Approximately 20% of catalog cards and their associated digital records lack scientific names. For a majority of these specimens, however, a determination can be recovered: either a name is written on the in-drawer card, or the card is unlabeled but sits within an identified species grouping in the drawer. Thousands of specimens have already been updated through this process and are now also digitally discoverable. A tree map was generated to show the relative proportion of localities based on the number of specimen lots (Fig. 3). There are specimens from every state in the United States and from nearly every country in the world. However, the collection’s coverage is shallow, with a mean of only 29 records per country. A number of records contain only vague locality information: approximately 5% of records lack country information and include only very general data such as “Pacific Ocean”, while 16% of records have only country-level information. Lastly, only about 30% of the collection records have associated collection dates. To mitigate this weakness, collector “date bands” based on birth and death dates are added to the dataset, which allows for certain kinds of scientific studies.
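The summary figures and date bands described above could be derived along the following lines. This is a minimal sketch under stated assumptions: the record fields (`country`, `locality`, `year`, `collector`), the function names, and the placeholder collector lifespan are all hypothetical and do not reflect the project's actual schema or data.

```python
from collections import Counter
from statistics import mean

# Hypothetical collector lifespans as (birth year, death year); placeholder values.
COLLECTOR_LIFESPANS = {"A. Example": (1850, 1920)}


def summarize(records):
    """Compute simple completeness statistics for a list of record dicts."""
    total = len(records)
    if total == 0:
        return {}
    lots_per_country = Counter(r["country"] for r in records if r.get("country"))
    no_country = sum(1 for r in records if not r.get("country"))
    # "Country only" here means a country is present but no finer locality.
    country_only = sum(
        1 for r in records if r.get("country") and not r.get("locality")
    )
    dated = sum(1 for r in records if r.get("year"))
    return {
        "mean_records_per_country": (
            mean(lots_per_country.values()) if lots_per_country else 0
        ),
        "pct_no_country": 100 * no_country / total,
        "pct_country_only": 100 * country_only / total,
        "pct_with_date": 100 * dated / total,
    }


def date_band(record, lifespans=COLLECTOR_LIFESPANS):
    """Return (earliest_year, latest_year) for a record, or None if unknown.

    Dated records get a zero-width band. Undated records fall back to the
    collector's birth and death years when the collector is in the lookup.
    """
    if record.get("year"):
        return (record["year"], record["year"])
    return lifespans.get(record.get("collector"))
```

A band derived from birth and death years is necessarily loose, since no one collects for an entire lifetime, but it still places an undated lot within a bounded period, which is what makes it usable for coarse temporal comparisons.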
Alongside digital advancements, this project improved physical storage through the purchase of more space-efficient cabinets and the replacement of wood drawers, and increased accessibility by expediting specimen retrieval through drawer reorganization, labeling with updated scientific names, and the creation of an efficient specimen locator system (Fig. 4).
This project provides the foundation for future curatorial and scholarly work in the collection through enhanced collection accessibility and ease of use. A refined triage strategy, informed by the digitized collection and aided by this project’s organizational improvements, will guide upcoming efforts toward the potential deaccession of irrelevant materials. We are exploring collaborations with other institutions and research initiatives to maximize the scientific impact of the collection. The SDNHM is now positioned to reintroduce its Marine Invertebrate Collection as a leading, efficient, and lasting resource.