Sparse Glider Datasets: A Case Study for NoSQL Databases

Michael Lindemuth, Chad Lembke

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-sensor platforms like buoys and gliders produce one or more readings per sensor at varying, discrete time frequencies. The resulting datasets are matrices whose rows contain readings from the sensors that reported at a given moment in time and NULL for the sensors that did not. Traditional Relational Database Management Systems (RDBMS) are well suited to dense matrices in which NULL values are infrequent. The efficiency of these systems deteriorates, however, as the data become more sparse. The University of South Florida College of Marine Science Ocean Technology Group (COT) operates four gliders. Each glider produces different, dynamic, sparse datasets. Other data management solutions exist, but they are based on an RDBMS. COT has been investigating an alternative that does not use an RDBMS. Glider Database Alternative with Mongo (GDAM) is a data management system for gliders built on the MongoDB NoSQL database engine. It is live in production at COT. GDAM is a collection of scripts that parse, process, and store real-time glider datasets. Data are parsed as soon as they are transmitted via satellite to our shore-based servers. The system was tested during two Slocum G1 glider deployments in September and October of 2012. Archival datasets dating back to March of 2009 have also been uploaded into the system. Records are indexed by time, GPS position, and depth, and more indexes can be added as necessary. The paper outlines dataset problems identified using data from COT glider operations in 2012. These problems inform a discussion of design decisions and possible options, considering both RDBMS and NoSQL systems. The paper concludes by discussing the current implementation of GDAM.
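
The contrast between a NULL-padded row and a sparse document can be illustrated with a minimal Python sketch. It is not taken from GDAM: the sensor names, readings, and helper functions below are hypothetical and chosen only to show the idea that a document store can omit unreported sensors instead of storing NULL for them.

```python
from datetime import datetime, timezone

# Hypothetical glider readings laid out as fixed-width rows, as a relational
# table would hold them. None stands in for NULL (the sensor did not report).
rows = [
    {"time": datetime(2012, 9, 1, 12, 0, 0, tzinfo=timezone.utc),
     "depth_m": 10.2, "temperature_c": 24.1,
     "salinity_psu": None, "chlorophyll": None},
    {"time": datetime(2012, 9, 1, 12, 0, 4, tzinfo=timezone.utc),
     "depth_m": 10.9, "temperature_c": None,
     "salinity_psu": None, "chlorophyll": None},
    {"time": datetime(2012, 9, 1, 12, 0, 8, tzinfo=timezone.utc),
     "depth_m": 11.5, "temperature_c": 23.9,
     "salinity_psu": 36.2, "chlorophyll": None},
]


def null_fraction(table):
    """Return the fraction of cells in the table that are None."""
    total = sum(len(row) for row in table)
    missing = sum(1 for row in table for value in row.values() if value is None)
    return missing / total


def to_sparse_document(row):
    """Keep only the fields that carry a reading, dropping None values."""
    return {key: value for key, value in row.items() if value is not None}


# Each document now has only the keys for sensors that actually reported.
documents = [to_sparse_document(row) for row in rows]

if __name__ == "__main__":
    print("fraction of NULL cells:", null_fraction(rows))
    for doc in documents:
        print(sorted(doc))
```

Dictionaries of this shape are the kind of object a MongoDB driver accepts as documents (for example, pymongo's `Collection.insert_many`), so documents in the same collection need not share the same set of fields.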

Original language: American English
Journal: 2013 OCEANS - San Diego
State: Published - Jan 1 2013

Keywords

  • Servers
  • Educational institutions
  • Relational databases
  • Indexes
  • Marine technology

Disciplines

  • Life Sciences
  • Marine Biology
