An Overview of Ground Truth and Data Collection

What is ground truth data?

Ground truth data is data collected at scale from real-world scenarios and used to train algorithms on contextual information such as verbal speech, natural language text, human gestures and behaviors, and spatial orientation. The broad use of the term “ground truth” derives from the geological and earth sciences, where it describes validating data by going out into the field and checking “on the ground.” It has since been adopted in other fields to express the notion of data that is “known” to be correct.
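To make the concept concrete: ground-truth labels are what a model’s outputs are scored against. The following minimal Python sketch uses entirely invented labels and predictions, purely for illustration, to show how accuracy is measured against ground truth:

```python
# Hypothetical example: scoring model predictions against ground-truth labels.
# The labels and predictions below are invented for illustration only.

ground_truth = ["cat", "dog", "dog", "cat", "bird", "dog"]   # labels verified as correct
predictions  = ["cat", "dog", "cat", "cat", "bird", "dog"]   # a model's outputs

# Accuracy: the fraction of predictions that match the known-correct labels.
correct = sum(gt == p for gt, p in zip(ground_truth, predictions))
accuracy = correct / len(ground_truth)
print(f"accuracy = {accuracy:.2f}")  # 5 of 6 match, so accuracy = 0.83
```

Without trustworthy ground-truth labels, a figure like this accuracy score is meaningless, which is why the quality of the underlying data collection matters.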

“Repeatability and Reproducibility”

“In measuring the quality of experiments, repeatability and reproducibility are key.”

The preceding title and text come from an article posted on LabTube. The article emphasizes the importance of repeatability and reproducibility in scientific experiments, explains the differences between the two standards, and describes how to achieve both.

Elsevier on Peer Review

Peer review is one of the quality standards that CRE intends to propose for adoption and use as an international data quality standard. Elsevier publishes technical articles in numerous peer-reviewed journals and has issued a position statement on the importance and use of peer review. It reads in part as follows:

“What is peer review?

Reviewers play a pivotal role in scholarly publishing. The peer review system exists to validate academic work, helps to improve the quality of published research, and increases networking possibilities within research communities. Despite criticisms, peer review is still the only widely accepted method for research validation and has continued successfully with relatively minor changes for some 350 years.”

“1,500 scientists lift the lid on reproducibility”

Editor’s note: Reproducibility of data is one of the data quality standards that CRE intends to propose for international adoption and use. The above-titled Nature article discusses what could be described as a reproducibility crisis in research. Part of the article reads as follows:

“More than 70% of researchers have tried and failed to reproduce another scientist’s experiments, and more than half have failed to reproduce their own experiments. Those are some of the telling figures that emerged from Nature’s survey of 1,576 researchers who took a brief online questionnaire on reproducibility in research.”

Peer Review Week

Editor’s note: Alice Meadows announced the 2020 theme of Peer Review Week in a post on The Scholarly Kitchen.
Peer Review Week is a yearly global event celebrating the essential role that peer review plays in maintaining scientific quality. The event brings together individuals, institutions, and organizations committed to sharing the central message that good peer review, whatever shape or form it might take, is critical to scholarly communications.

PEERE International Conference on Peer Review

The 2nd PEERE International Conference on Peer Review 2020 will be held as a fully open virtual event on 29 September – 1 October 2020. The conference organizers posted the following description of the conference:

The conference aims to provide a forum for scholars, practitioners and science stakeholders to share research, models, tools and experience on peer review in different fields, e.g., medicine, computer science, social sciences and humanities. It aims to stimulate the use of evidence-based research in the design and implementation of peer review in a variety of fields and encourage more systematic research.

Peer Review Congress Seeks Research

The Ninth International Congress on Peer Review and Scientific Publication will be held in Chicago, September 12–14, 2021. The Congress’ aim is to encourage research into the quality and credibility of peer review and scientific publication, and to establish the evidence base on which scientists can improve the conduct, reporting, and dissemination of scientific research. A detailed discussion of peer review and the Congress, together with a call for research, has been published in JAMA.

Arctic Council on Models Validation

The Protection of the Arctic Marine Environment working group of the multi-nation Arctic Council has produced and published Underwater Noise in the Arctic, a State of Knowledge Report. On page 39, the report questions the accuracy of certain “modeling studies” of the effects of underwater noise on marine animals:

“Modeling studies allow for examination of potential noise levels and impacts on animals in regions where direct empirical measurements are difficult to obtain. Modeling studies can also be used to forecast future impacts that have not occurred yet. Challenges associated with modeling studies in the Arctic, especially for all of the studies reported here, is that there has been almost no ground-truthing, so the precision and accuracy of the results are unknown.”

U.S. State Department Seeks Comment on Updated NEPA Rules

The U.S. Department of State is issuing a final rule to update the Department’s Regulations for Implementation of the National Environmental Policy Act (NEPA) to reflect a recent Executive Order that revised the process for the development and issuance of Presidential permits for certain facilities and land transportation crossings at the international boundaries of the United States. The rule is effective July 13, 2020. Comments will be received until June 29, 2020.


U.S. EPA Seeks Comment on Changes to Proposed Strengthening of Transparency in Regulatory Science Rules

On March 3, 2020, the United States Environmental Protection Agency released a “Supplemental notice of proposed rulemaking” for the Agency’s previously proposed rule entitled “Strengthening Transparency in Regulatory Science.” EPA explained in its notice:

“This supplemental notice of proposed rulemaking (SNPRM) includes clarifications, modifications and additions to certain provisions published on April 30, 2018. This SNPRM proposes that the scope of the rulemaking apply to influential scientific information as well as significant regulatory decisions. This notice proposes definitions and clarifies that the proposed rulemaking applies to data and models underlying both pivotal science and pivotal regulatory science. In this SNPRM, EPA is also proposing a modified approach to the public availability provisions for data and models that would underly significant regulatory decisions and an alternate approach. Finally, EPA is taking comment on whether to use its housekeeping authority independently or in conjunction with appropriate environmental statutory provisions as authority for taking this action.”