itsonlyamodel

Argovis R API

R API to www.argovis.com

As mentioned in an earlier blog post, Argovis has an API that returns JSON data for Argo profiles, platforms, and selections, along with their metadata. This post again retrieves Argo data, this time in an R environment.

This script will guide an R user to:

1. Query a specific profile by its id, which is its platform (WMO) number and its cycle number joined by an underscore. For example, '3900737_9'.

2. Query a specified platform by number. For example, '3900737'.

3. Query profiles within a given shape, date range, and pressure range.

4. Query profile metadata for a given month and year.
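As a small illustration of the id convention in item 1, here is a sketch that builds and splits a profile id using base R only. The helper names make.profile.id and parse.profile.id are my own and are not part of Argovis or httr.

```r
# Hypothetical helpers (not part of Argovis or httr) for working with profile ids.

# Join a platform (WMO) number and a cycle number with an underscore.
make.profile.id <- function(platformNumber, cycleNumber) {
  paste(platformNumber, cycleNumber, sep = "_")
}

# Split a profile id back into its platform number and cycle number.
parse.profile.id <- function(profileId) {
  parts <- strsplit(profileId, "_", fixed = TRUE)[[1]]
  list(platform = parts[1], cycle = as.integer(parts[2]))
}

profileId <- make.profile.id("3900737", 9)
idParts <- parse.profile.id("3900737_279")
```

Passing the platform number as a string avoids any surprises from numeric formatting.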

1. Get A Profile

In [1]:
library(httr)
options(warn=-1)
In [2]:
get.profile <- function(profileName){
  baseURL <- "https://argovis.colorado.edu/catalog/profiles/"
  url <- paste(baseURL, profileName, sep="")
  resp <- GET(url)
  if(resp$status_code==200) {
    # Parse the JSON body into a nested list
    profile <- content(resp, "parsed")
  }
  else {
    # On failure, return the raw response body for inspection
    profile <- content(resp, "raw")
  }
  return(profile)
}

parse.into.df <- function(profile) {
    # Each measurement is a named list (e.g. temp, psal, pres)
    meas <- profile$measurements
    measNames <- unique(names(unlist(meas)))
    # One row per measurement; assumes every measurement has the same keys in the same order
    df <- data.frame(matrix(unlist(meas), nrow=length(meas), byrow=TRUE))
    colnames(df) <- measNames
    # Repeat the profile-level fields on every row
    df$profile_id <- profile$`_id`
    df$date <- profile$date
    df$cycle_number <- profile$cycle_number
    df$lat <- profile$lat
    df$lon <- profile$lon
    return(df)
}
In [3]:
profileName = "3900737_279"
profile = get.profile(profileName)
df = parse.into.df(profile)
In [4]:
head(df)
temp    psal    pres  profile_id   date                      cycle_number  lat     lon
27.165  35.421   4.4  3900737_279  2017-09-12T23:18:42.002Z  279           -4.363  -150.066
27.063  35.421  10.0  3900737_279  2017-09-12T23:18:42.002Z  279           -4.363  -150.066
27.055  35.422  16.9  3900737_279  2017-09-12T23:18:42.002Z  279           -4.363  -150.066
27.048  35.422  23.7  3900737_279  2017-09-12T23:18:42.002Z  279           -4.363  -150.066
27.046  35.421  30.9  3900737_279  2017-09-12T23:18:42.002Z  279           -4.363  -150.066
27.043  35.421  37.5  3900737_279  2017-09-12T23:18:42.002Z  279           -4.363  -150.066
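The date column comes back as text. If you want to sort or subtract dates, one option is to convert it with base R's as.POSIXct. This is a sketch that assumes every date string follows the format seen in these outputs (ISO 8601 with fractional seconds and a trailing Z); I have not checked whether Argovis ever returns other formats.

```r
# Date strings in the format Argovis returns in the date field
dateStrings <- c("2017-09-12T23:18:42.002Z", "2019-12-15T10:31:17.001Z")

# %OS reads seconds with an optional fractional part; the trailing Z is matched literally
parsedDates <- as.POSIXct(dateStrings, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Date-times can now be compared and subtracted
elapsedDays <- as.numeric(difftime(parsedDates[2], parsedDates[1], units = "days"))
```

The same call can be applied to a whole column, for example df$date <- as.POSIXct(df$date, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC").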

2. Get A Platform

In [5]:
get.platform <- function(platformNumber){
  baseURL <- "https://argovis.colorado.edu/catalog/platforms/"
  url <- paste(baseURL, platformNumber, sep="")
  resp <- GET(url)
  if(resp$status_code==200) {
    profiles <- content(resp, "parsed")
  }
  else {
    profiles = content(resp, "raw")
  }
  return(profiles)
}
platformNumber <- '3900737'
platformProfiles <- get.platform(platformNumber)

platformDf <- data.frame()
for (profile in platformProfiles)
{
    df <- parse.into.df(profile)
    platformDf <- rbind(platformDf, df)
}
In [6]:
tail(platformDf)
       temp   psal    pres    profile_id   date                      cycle_number  lat    lon
25028  2.928  34.603  1556.5  3900737_355  2019-12-15T10:31:17.001Z  355           -6.51  -168.018
25029  2.817  34.610  1635.6  3900737_355  2019-12-15T10:31:17.001Z  355           -6.51  -168.018
25030  2.689  34.617  1719.8  3900737_355  2019-12-15T10:31:17.001Z  355           -6.51  -168.018
25031  2.536  34.625  1808.8  3900737_355  2019-12-15T10:31:17.001Z  355           -6.51  -168.018
25032  2.383  34.634  1900.5  3900737_355  2019-12-15T10:31:17.001Z  355           -6.51  -168.018
25033  2.266  34.641  1995.7  3900737_355  2019-12-15T10:31:17.001Z  355           -6.51  -168.018
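Growing a data frame with rbind inside a for loop copies the accumulated frame on every iteration, which gets slow for a platform with tens of thousands of rows. A common base R alternative is to build one data frame per profile with lapply and bind them all in a single do.call(rbind, ...) call. The sketch below uses made-up mock profiles (the ids and values are invented for illustration) and a simplified stand-in for the parser, named to.df here, so that it runs without a network connection.

```r
# Mock stand-ins for parsed profiles; ids and values are invented for illustration.
mockProfiles <- list(
  list(`_id` = "1111111_1",
       measurements = list(list(temp = 20.1, psal = 35.0, pres = 5),
                           list(temp = 19.8, psal = 35.1, pres = 10))),
  list(`_id` = "1111111_2",
       measurements = list(list(temp = 21.3, psal = 34.9, pres = 5)))
)

# Simplified, hypothetical parser: one row per measurement plus the profile id.
to.df <- function(profile) {
  df <- do.call(rbind, lapply(profile$measurements, as.data.frame))
  df$profile_id <- profile$`_id`
  df
}

# Build every per-profile data frame first, then bind them once.
platformDf <- do.call(rbind, lapply(mockProfiles, to.df))
```

With real data, the same pattern would be do.call(rbind, lapply(platformProfiles, parse.into.df)).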

3. Get A Selection

Selections require a start date, an end date, and a shape given as a nested array of [longitude, latitude] pairs. A pressure range is optional.

In [7]:
get.selection <- function(startDate, endDate, shape, presRange){
  baseURL <- "https://argovis.colorado.edu/selection/profiles/"
  startDateQuery <- paste('?startDate=', startDate, sep="")
  endDateQuery <- paste('&endDate=', endDate, sep="")
  shapeQuery <- paste('&shape=', shape, sep="")

  # presRange is optional; include it in the query only when supplied
  if(missing(presRange)) {
    url <- paste(baseURL, startDateQuery, endDateQuery, shapeQuery, sep="")
  }
  else {
    presRangeQuery <- paste('&presRange=', presRange, sep="")
    url <- paste(baseURL, startDateQuery, endDateQuery, presRangeQuery, shapeQuery, sep="")
  }
  resp <- GET(url)
  if(resp$status_code==200) {
    profiles <- content(resp, "parsed")
  }
  else {
    profiles <- content(resp, "raw")
  }
  return(profiles)
}
In [8]:
startDate='2017-9-15'
endDate='2017-10-31'
shape = '[[[-18.6,31.7],[-18.6,37.7],[-5.9,37.7],[-5.9,31.7],[-18.6,31.7]]]'
presRange='[0,30]'

selectionProfiles = get.selection(startDate, endDate, shape, presRange)
In [9]:
selectionDf <- data.frame()
for (profile in selectionProfiles)
{
    df <- parse.into.df(profile)
    selectionDf <- rbind(selectionDf, df)
}
In [10]:
head(selectionDf)
temp    pres  psal    profile_id  date                      cycle_number  lat     lon
23.054   6    36.948  6902664_81  2017-10-29T19:44:00.000Z  81            32.535  -16.905
23.052   7    36.948  6902664_81  2017-10-29T19:44:00.000Z  81            32.535  -16.905
23.045   8    36.948  6902664_81  2017-10-29T19:44:00.000Z  81            32.535  -16.905
23.039   9    36.948  6902664_81  2017-10-29T19:44:00.000Z  81            32.535  -16.905
23.036  10    36.948  6902664_81  2017-10-29T19:44:00.000Z  81            32.535  -16.905
23.033  11    36.948  6902664_81  2017-10-29T19:44:00.000Z  81            32.535  -16.905
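Typing the shape string by hand is error prone. Here is a sketch that assembles it from a two-column matrix of longitude/latitude vertices and closes the ring when the last vertex does not repeat the first. The helper name make.shape is my own, not something Argovis provides.

```r
# Hypothetical helper: build the nested-array shape string from lon/lat vertices.
make.shape <- function(coords) {
  # Close the polygon by repeating the first vertex if needed
  if (!all(coords[1, ] == coords[nrow(coords), ])) {
    coords <- rbind(coords, coords[1, ])
  }
  # Format each vertex as [lon,lat], then wrap the list in two more brackets
  pairs <- paste0("[", coords[, 1], ",", coords[, 2], "]")
  paste0("[[", paste(pairs, collapse = ","), "]]")
}

# Column 1 is longitude, column 2 is latitude
coords <- matrix(c(-18.6, 31.7,
                   -18.6, 37.7,
                    -5.9, 37.7,
                    -5.9, 31.7), ncol = 2, byrow = TRUE)
shape <- make.shape(coords)
```

The result is the same string used for the shape argument in the selection query of this section.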

4. Get Metadata for a Given Month and Year

Metadata queries require a month and year.

In [21]:
get.monthly.profile.pos <- function(month, year){
    baseURL <- 'https://argovis.colorado.edu/selection/profiles'
    url <- paste(baseURL, '/', toString(month), '/', toString(year), sep="")
    resp <- GET(url)
    if(resp$status_code==200) {
    profiles <- content(resp, "parsed")
    }
    else {
    profiles <- content(resp, "raw")
    }
    return(profiles)
}

parse.meta.into.df <- function(metaDataProfs){
    allNames <- unique(names(unlist(metaDataProfs)))
    # Drop the unwrapped station_parameters* names and keep a single column instead
    dfNames <- allNames[!grepl('station_parameters', allNames)]
    dfNames <- c(dfNames, 'station_parameters')

    metaDf <- data.frame(matrix(ncol = length(dfNames), nrow = 0))
    colnames(metaDf) <- dfNames
    for (row in metaDataProfs) {
      newRow <- list()
      rowNames <- names(row)
      for (key in dfNames) {
        if (is.na(match(key, rowNames))) {
          # Key is missing from this record; use -999 as a placeholder
          newRow[key] <- -999
        }
        else {
          newRow[key] <- row[key]
        }
      }
      metaDf[nrow(metaDf)+1, ] <- newRow
    }
    return(metaDf)
}
In [22]:
# SLOW... I still need to find a better way to merge lists into a data.frame
metaDataProfs = get.monthly.profile.pos(1, 2018)
# R lists are 1-indexed, so take the first 50 records with 1:50
metaDf <- parse.meta.into.df(metaDataProfs[1:50])
In [23]:
head(metaDf, 5)
_id | POSITIONING_SYSTEM | PI_NAME | VERTICAL_SAMPLING_SCHEME | DATA_MODE | PLATFORM_TYPE | date | date_added | date_qc | lat | dac | platform_number | BASIN | containsBGC | isDeep | pres_max_for_TEMP | pres_min_for_TEMP | pres_max_for_PSAL | pres_min_for_PSAL | station_parameters
6901909_96 | ARGOS | Anja SCHNEEHORST | Primary sampling: discrete [] | D | APEX | 2018-01-31T23:57:47.000Z | 2019-10-24T07:03:02.037Z | 1 | 68.33600 | coriolis | 6901909 | 2 | FALSE | FALSE | 1298.60 | 5.20 | 1298.60 | 5.20 | temp
6901826_387 | GPS | Pierre-Marie POULAIN | Primary sampling: averaged [10 sec sampling, 25 dbar average from 700 dbar to 700 dbar; 10 sec sampling, 10 dbar average from 700 dbar to 100 dbar; 10 sec sampling, 2 dbar average from 100 dbar to 5.4 dbar] | R | ARVOR | 2018-01-31T23:55:59.999Z | 2019-10-24T06:33:03.688Z | 1 | 36.99538 | coriolis | 6901826 | 4 | FALSE | FALSE | 630.60 | 6.30 | 630.60 | 6.30 | temp
1901665_196 | GPS | BRECK OWENS | Primary sampling: averaged [nominal 2 dbar binned data sampled at 0.5 Hz from a SBE41CP] | R | S2A | 2018-01-31T23:52:32.002Z | 2019-10-24T16:24:51.021Z | 1 | -27.68992 | aoml | 1901665 | 1 | FALSE | FALSE | 1005.08 | 1.16 | 1005.08 | 1.16 | pres
6901838_113 | GPS | Pierre-Marie POULAIN | Primary sampling: averaged [10 sec sampling, 50 dbar average from 2000 dbar to 700 dbar; 10 sec sampling, 10 dbar average from 700 dbar to 100 dbar; 10 sec sampling, 5 dbar average from 100 dbar to 5.6 dbar] | R | ARVOR | 2018-01-31T23:47:59.999Z | 2019-10-24T06:40:02.519Z | 1 | -56.07378 | coriolis | 6901838 | 10 | FALSE | FALSE | 1975.00 | 7.90 | -999.00 | -999.00 | temp
2902600_124 | ARGOS | ZENGHONG LIU | Primary sampling: averaged [] | A | PROVOR | 2018-01-31T23:46:38.999Z | 2019-10-25T17:17:14.375Z | 1 | -29.64900 | csio | 2902600 | 3 | FALSE | FALSE | 1983.00 | 1.00 | 1983.00 | 1.00 | temp

Notes on formatting JSON into data.frame

Full disclosure: my R skills could be improved. There are most likely more efficient ways to create these data frames, particularly for the metadata query. The difficulty is in building the data frame: the JSON objects do not all share the same keys, so each record has to be padded with a placeholder (-999 in the code above) before it can be added as a row.
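One possible way around the mismatched keys, sketched with base R only: collect the union of keys across all records, fill in whatever a record lacks, and bind the rows in a single call. Note that this sketch swaps in NA where my code uses -999, and it only keeps scalar fields (anything longer, such as a list of station parameters, becomes NA). The records below are invented for illustration.

```r
# Invented records with differing keys, standing in for parsed metadata JSON.
records <- list(
  list(`_id` = "1111111_1", lat = 10.5, dac = "aoml"),
  list(`_id` = "2222222_7", lat = -3.2, pres_max_for_PSAL = 1500)
)

# Union of every key seen in any record, in order of first appearance
allKeys <- unique(unlist(lapply(records, names)))

rows <- lapply(records, function(rec) {
  # Keep scalar fields; absent or non-scalar fields become NA
  vals <- lapply(allKeys, function(key) {
    if (is.null(rec[[key]]) || length(rec[[key]]) != 1) NA else rec[[key]]
  })
  names(vals) <- allKeys
  # check.names = FALSE keeps column names such as _id unchanged
  data.frame(vals, check.names = FALSE, stringsAsFactors = FALSE)
})

# Bind all one-row data frames in one call rather than row by row
metaDf <- do.call(rbind, rows)
```

I have not timed this against the loop above on a full month of metadata, so treat it as a starting point rather than a measured improvement.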

Conclusion

This API is good for getting profiles, platforms, and selections into an R environment quickly. We did find a bottleneck in converting metadata JSON into a data.frame.

If you are working with larger data projects, I would recommend using the Python API.

Thanks for reading. If you see anything you would like improved, feel free to email me at tyler.tucker@colorado.edu.