What is ground truth data?

Ground truth data is data collected at scale from real-world scenarios and used to train algorithms on contextual information such as speech, natural language text, human gestures and behaviors, and spatial orientation. The term “ground truth” comes from the geological and earth sciences, where it describes validating data by going out into the field and checking “on the ground.” Other fields have adopted it to express the notion of data that is “known” to be correct.
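
To make the idea of data that is “known” to be correct more concrete, here is a minimal sketch of ground truth serving as the reference against which an algorithm's output is checked. The gesture labels, the predictions, and the `accuracy` helper are all hypothetical and exist only for illustration; they are not taken from any particular dataset or library.

```python
# Hypothetical ground-truth labels, e.g. gestures verified by human annotators.
ground_truth = ["wave", "point", "wave", "nod", "point"]

# Hypothetical outputs of a gesture classifier on the same five samples.
predictions = ["wave", "point", "nod", "nod", "point"]


def accuracy(truth, predicted):
    """Return the fraction of predictions that match the ground truth."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        raise ValueError("at least one labeled sample is required")
    matches = sum(t == p for t, p in zip(truth, predicted))
    return matches / len(truth)


score = accuracy(ground_truth, predictions)
print(f"{score:.0%}")  # prints "80%"
```

Four of the five predictions agree with the ground-truth labels, so the score is 80%. The same labels could also serve as training targets; the sketch shows only the checking step.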