Using the UN Comtrade data API with R

Using the R sytem

Many users rely on non-database tools for analysing the data extracted UN Comtrade, and among those tools, the R system is one the most popular. R is an free, integrated software environment for data manipulation, calculation and graphical display, which implements a huge set of classical and modern statistical techniques. For an comprehensive introduction, please refer to http://cran.r-project.org/doc/manuals/R-intro.html.

Extracting codes and descriptions of reporters from the UN Comtrade API

Here is quick way to read the list of country/area codes into R:

library(rjson)
string <- "http://comtrade.un.org/data/cache/partnerAreas.json"
reporters <- fromJSON(file=string)
reporters <- as.data.frame(t(sapply(reporters$results,rbind)))

A user-defined function to extract data from the UN Comtrade API

The function defined in this example, get.Comtrade(), extracts data from UN Comtrade using either the csv or the json format.

get.Comtrade <- function(url="http://comtrade.un.org/api/get?"
                         ,maxrec=50000
                         ,type="C"
                         ,freq="A"
                         ,px="HS"
                         ,ps="now"
                         ,r
                         ,p
                         ,rg="all"
                         ,cc="TOTAL"
                         ,fmt="json"
)
{
  string<- paste(url
                 ,"max=",maxrec,"&" #maximum no. of records returned
                 ,"type=",type,"&" #type of trade (c=commodities)
                 ,"freq=",freq,"&" #frequency
                 ,"px=",px,"&" #classification
                 ,"ps=",ps,"&" #time period
                 ,"r=",r,"&" #reporting area
                 ,"p=",p,"&" #partner country
                 ,"rg=",rg,"&" #trade flow
                 ,"cc=",cc,"&" #classification code
                 ,"fmt=",fmt        #Format
                 ,sep = ""
  )
  
  if(fmt == "csv") {
    raw.data<- read.csv(string,header=TRUE)
    return(list(validation=NULL, data=raw.data))
  } else {
    if(fmt == "json" ) {
      raw.data<- fromJSON(file=string)
      data<- raw.data$dataset
      validation<- unlist(raw.data$validation, recursive=TRUE)
      ndata<- NULL
      if(length(data)> 0) {
        var.names<- names(data[[1]])
        data<- as.data.frame(t( sapply(data,rbind)))
        ndata<- NULL
        for(i in 1:ncol(data)){
          data[sapply(data[,i],is.null),i]<- NA
          ndata<- cbind(ndata, unlist(data[,i]))
        }
        ndata<- as.data.frame(ndata)
        colnames(ndata)<- var.names
      }
      return(list(validation=validation,data =ndata))
    }
  }
}

The following three examples illustrate its use with a few queries:

Example 1:

Using default parameters (i.e., latest annual HS total trade data flows, in json format), specifying only reporting and partner areas

> library("rjson")
> s1 <- get.Comtrade(r="842", p="124,484")
> s1
$validation
status.name status.value
"Ok" "0"
status.category status.description
"0" ""
status.helpUrl count.value
"" "6"
count.started count.finished
"2014-06-12T12:18:33.3861105-04:00" "2014-06-12T12:18:33.6513122-04:00"
count.durationSeconds datasetTimer.started
"0.2652017" "2014-06-12T12:18:33.6513122-04:00"
datasetTimer.finished datasetTimer.durationSeconds
"2014-06-12T12:18:34.7745194-04:00" "1.1232072"

$data
pfCode yr period periodDesc aggrLevel IsLeaf rgCode rgDesc rtCode rtTitle rt3ISO ptCode ptTitle
1 H4 2013 <NA> <NA> 0 0 1 Import 842 USA USA 124 Canada
2 H4 2013 <NA> <NA> 0 0 2 Export 842 USA USA 124 Canada
3 H4 2013 <NA> <NA> 0 0 3 Re-Export 842 USA USA 124 Canada
4 H4 2013 <NA> <NA> 0 0 1 Import 842 USA USA 484 Mexico
5 H4 2013 <NA> <NA> 0 0 2 Export 842 USA USA 484 Mexico
6 H4 2013 <NA> <NA> 0 0 3 Re-Export 842 USA USA 484 Mexico
pt3ISO cmdCode cmdDescE qtCode qtDesc TradeQuantity NetWeight TradeValue estCode
1 CAN TOTAL All Commodities 1 No Quantity <NA> <NA> 336685686292 0
2 CAN TOTAL All Commodities 1 No Quantity <NA> <NA> 300175625112 0
3 CAN TOTAL All Commodities 1 No Quantity <NA> <NA> 48503512727 0
4 MEX TOTAL All Commodities 1 No Quantity <NA> <NA> 283043473043 0
5 MEX TOTAL All Commodities 1 No Quantity <NA> <NA> 226152895832 0
6 MEX TOTAL All Commodities 1 No Quantity <NA> <NA> 44462740132 0

Example 2:

Same as Example 1 above, but in csv format. Note that in this case, there is no validation information included in the data output.

> library("rjson")
> s2 <- get.Comtrade(r="842", p="124,484", fmt="csv")
> s2
$validation
NULL

$data
Classification Year Period Period.Desc. Aggregate.Level Is.Leaf.Code Trade.Flow.Code Trade.Flow
1 H4 2013 NA NA 0 0 1 Import
2 H4 2013 NA NA 0 0 2 Export
3 H4 2013 NA NA 0 0 3 Re-Export
4 H4 2013 NA NA 0 0 1 Import
5 H4 2013 NA NA 0 0 2 Export
6 H4 2013 NA NA 0 0 3 Re-Export
Reporter.Code Reporter Reporter.ISO Partner.Code Partner Partner.ISO Commodity.Code Commodity
1 842 USA USA 124 Canada CAN TOTAL All Commodities
2 842 USA USA 124 Canada CAN TOTAL All Commodities
3 842 USA USA 124 Canada CAN TOTAL All Commodities
4 842 USA USA 484 Mexico MEX TOTAL All Commodities
5 842 USA USA 484 Mexico MEX TOTAL All Commodities
6 842 USA USA 484 Mexico MEX TOTAL All Commodities
Qty.Unit.Code Qty.Unit Qty Netweight..kg. Trade.Value..US.. Flag
1 1 No Quantity NA NA 336685686292 0
2 1 No Quantity NA NA 300175625112 0
3 1 No Quantity NA NA 48503512727 0
4 1 No Quantity NA NA 283043473043 0
5 1 No Quantity NA NA 226152895832 0
6 1 No Quantity NA NA 44462740132 0

Example 3:

An example to extract monthly data

> library("rjson")
> s3 <- get.Comtrade(r="842", p="0", ps="201201,201202,201203", freq="M")
> s3
$validation
status.name status.value
"Ok" "0"
status.category status.description
"0" ""
status.helpUrl count.value
"" "9"
count.started count.finished
"2014-06-12T14:56:01.9922781-04:00" "2014-06-12T14:56:02.0702786-04:00"
count.durationSeconds datasetTimer.started
"0.0780005" "2014-06-12T14:56:01.9922781-04:00"
datasetTimer.finished datasetTimer.durationSeconds
"2014-06-12T14:56:02.0702786-04:00" "0.0780005"

$data
pfCode yr period periodDesc aggrLevel IsLeaf rgCode rgDesc rtCode rtTitle
1 HS 2012 201201 January 2012 0 0 1 Imports 842 United States of America
2 HS 2012 201201 January 2012 0 0 2 Exports 842 United States of America
3 HS 2012 201201 January 2012 0 0 3 Re-exports 842 United States of America
4 HS 2012 201202 February 2012 0 0 1 Imports 842 United States of America
5 HS 2012 201202 February 2012 0 0 2 Exports 842 United States of America
6 HS 2012 201202 February 2012 0 0 3 Re-exports 842 United States of America
7 HS 2012 201203 March 2012 0 0 1 Imports 842 United States of America
8 HS 2012 201203 March 2012 0 0 2 Exports 842 United States of America
9 HS 2012 201203 March 2012 0 0 3 Re-exports 842 United States of America
rt3ISO ptCode ptTitle pt3ISO cmdCode cmdDescE qtCode qtDesc TradeQuantity NetWeight
1 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
2 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
3 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
4 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
5 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
6 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
7 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
8 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
9 <NA> 0 World <NA> TOTAL All Commodities 0 <NA> <NA> <NA>
TradeValue estCode
1 188224982925 0
2 118156544582 0
3 14874235690 0
4 176338299043 0
5 123448585034 0
6 15647392580 0
7 202088822547 0
8 139760502675 0
9 17447257948 0