Skip to main content
eScholarship
Open Access Publications from the University of California

A Guide to Using GitHub for Developing and Versioning Data Standards and Reporting Formats

  • Author(s): Crystal-Ornelas, R;
  • Varadharajan, C;
  • Bond-Lamberty, B;
  • Boye, K;
  • Burrus, M;
  • Cholia, S;
  • Crow, M;
  • Damerow, J;
  • Devarakonda, R;
  • Ely, KS;
  • Goldman, A;
  • Heinz, S;
  • Hendrix, V;
  • Kakalia, Z;
  • Pennington, SC;
  • Robles, E;
  • Rogers, A;
  • Simmonds, M;
  • Velliquette, T;
  • Weierbach, H;
  • Weisenhorn, P;
  • Welch, JN;
  • Agarwal, DA
  • et al.
Abstract

Data standardization combined with descriptive metadata facilitate data reuse, which is the ultimate goal of the Findable, Accessible, Interoperable, and Reusable (FAIR) principles. Community data or metadata standards are increasingly created through an approach that emphasizes collaboration between various stakeholders. Such an approach requires platforms for collaboration on the development process that centers on sharing information and receiving feedback. Our objective in this study was to conduct a systematic review to identify data standards and reporting formats that use version control for developing data standards and to summarize common practices, particularly in earth and environmental sciences. Out of 108 data standards and reporting formats identified in our review, 32 used GitHub as the version control platform, and no other platforms were used. We found no universally accepted methodology for developing and publishing data standards. Many GitHub repositories did not use key features that could help developers to gather user feedback, or to create and revise standards that build on previous work. We provide guidance for community-driven standard development and associated documentation on GitHub based on a systematic review of existing practices.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View