Big Data to Knowledge – Harnessing Semiotic Relationships of Data Quality and Skills in Genome Curation Work

Hong Huang

Research output: Contribution to journal, Article, peer-review

Abstract

This article examines how genomic scientists view data quality assurance in relation to semiotics and the data–information–knowledge (DIK) hierarchy. The communication of signs generated by genome curation work was found at different semantic levels of DIK, which correlate specific data quality dimensions with their respective skills. Syntactic data quality dimensions were ranked highest among all semiotic data quality dimensions, indicating that scientists spend considerable effort on data wrangling activities in genome curation work. Semantic- and pragmatic-related sign communications concerned meaningful interpretation and thus required additional adaptive and interpretative skills to deal with data quality issues. This expanded concept of ‘curation’ as sign/semiotic has not previously been explored from the practical to the theoretical perspective. The findings can inform policy makers and practitioners who develop frameworks and cyberinfrastructure to facilitate the ‘Big Data to Knowledge’ initiatives and advocacy of funding agencies. The findings can also help plan data quality assurance policies and thus maximise the efficiency of genomic data management. Our results strongly support the relevance of communicating data quality skills to data quality assurance in genome curation activities.

Original language: American English
Journal: Journal of Information Science
State: Published - Jan 1 2018

Keywords

  • data quality
  • DIK hierarchy
  • genome curation
  • semiotics

Disciplines

  • Medicine and Health Sciences
