Downloading data from the Hidroweb portal with R

The National Water Agency (ANA) makes the historical series recorded by various monitoring stations available for download on its Hidroweb portal.

I would like to automate the download of these historical series, but ANA has changed its website and I am having a hard time accomplishing the task. The R code I relied on to download the files is in this question.

At first I replaced the http://hidroweb.ana.gov.br/ links with http://www.snirh.gov.br/hidroweb/, but the download still fails.

I imagine the problem is in the line r <- POST(url = paste0(baseurl[1], est, baseurl[2]), body = list(cboTipoReg = "10"), encode = "form"), since POST does not seem to be able to access the link http://www.snirh.gov.br/hidroweb/Estacao.asp?Codigo=2851050&CriaArq=true&TipoArq=1 properly.

When accessed with the query string &CriaArq=true&TipoArq=1, the site shows only the message "request not allowed", so it is not possible to scrape the page for the link to the .zip file.

I had imagined that a POST to the URL above with the form field cboTipoReg = '10' would suffice, but it does not seem to reach the page in the expected way. Can anyone explain what is wrong?

Author: Renato, 2018-03-11

1 answer

Hi,

We can do something similar by updating the base URL to http://www.snirh.gov.br/hidroweb/rest/api/documento/convencionais?tipo=1&documentos= and appending the station code at the end.

Suggestion in R:

## Automatic download of data from SNIRH conventional stations -----

library(httr)

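# Base URL; the file type (tipo) is filled in below and the station code is appended last
baseurl = "http://www.snirh.gov.br/hidroweb/rest/api/documento/convencionais?tipo=&documentos="
# Station codes, kept as strings so that paste0() never prints them in scientific notation
ListaEstacaoes = c("40025000","40050000","40070000","40100000")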

setwd("D:/PastaParaDownload")
destino = getwd()

# tipo=1: Access file (*.mdb)
# tipo=2: text file   (*.txt)
# tipo=3: Excel file  (*.csv)
tipo = 1

# Insert the chosen type into the base URL
baseurl = gsub("tipo=",paste0("tipo=",tipo),baseurl)

for(i in seq_along(ListaEstacaoes)){
  baseurl_est = paste0(baseurl,ListaEstacaoes[i])

  # Connection: the code only proceeds when the POST is rejected with
  # 405 (Method Not Allowed); the file itself is then fetched by
  # download.file(), which issues a GET
  r = POST(url = baseurl_est, body = list(cboTipoReg = "10"), encode = "form")
  if (status_code(r) == 405) {
    download.file(baseurl_est, file.path(destino, paste0(ListaEstacaoes[i], ".zip")), mode = "wb")
  }
}
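
As a variation on the loop above, the URL construction and the download can be split into small functions, so that one failing station does not stop the rest. This is only a sketch under assumptions: the names station_code, build_station_url and download_station are hypothetical, it assumes the endpoint answers a plain GET (which is what download.file() in the loop relies on), and it keeps the choice of saving every file as .zip.

```r
# Hypothetical helpers (not part of httr or of the Hidroweb service)

# Convert a station code to text without scientific notation
# (paste0(40000000) would give "4e+07")
station_code <- function(station) {
  if (is.numeric(station)) sprintf("%.0f", station) else as.character(station)
}

# Build the download URL for one or more conventional stations
build_station_url <- function(station, tipo = 1) {
  stopifnot(tipo %in% 1:3)
  paste0("http://www.snirh.gov.br/hidroweb/rest/api/documento/convencionais",
         "?tipo=", tipo, "&documentos=", station_code(station))
}

# Download one station; returns TRUE on success and FALSE on any
# error or warning, instead of aborting the whole run
download_station <- function(station, tipo = 1, dest_dir = getwd()) {
  dest <- file.path(dest_dir, paste0(station_code(station), ".zip"))
  tryCatch({
    status <- download.file(build_station_url(station, tipo), dest,
                            mode = "wb", quiet = TRUE)
    status == 0
  }, error = function(e) FALSE, warning = function(w) FALSE)
}

# The URLs can be inspected without touching the network
print(build_station_url(c(40025000, 40000000), tipo = 1))

# Actual download (commented out because it needs network access):
# ok <- vapply(c(40025000, 40050000), download_station, logical(1))
```

Collecting the TRUE/FALSE results makes it easy to see afterwards which stations need to be retried.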

Additionally, for telemetry stations:

## Automatic download of data from SNIRH telemetry stations -----

library(httr)

# Base URL 1: generates the file
baseurl1 = "http://www.snirh.gov.br/hidroweb/rest/api/documento/gerarTelemetricas?codigosEstacoes=&tipoArquivo=&periodoInicial=&periodoFinal="
# Base URL 2: downloads the file
baseurl2 = "http://www.snirh.gov.br/hidroweb/rest/api/documento/baixarTelemetricas?codigosEstacoes="

# Required information for each telemetry station: ID and code
# To find the ID, consult the station's descriptive sheet at http://gestorpcd.ana.gov.br/
idEstacaoes     = c(84863570,91264360,84763570,84663550,94765311)
CodigoEstacaoes = c(15400000,15341000,15360000,15380000,15326010)

#setwd("D:/PastaParaDownload")
destino = getwd()

# Parameters shared by all stations
# Telemetry stations are available in only two formats
#tipo = 2 # tipo=2: text file  (*.txt)
tipo = 3 # tipo=3: Excel file (*.csv)

# Start and end dates of the series, in yyyy-mm-dd format
dinicio = "2019-01-01"
dfim = "2019-12-31"

# Start and end times of the series. Note: these values are pasted after
# the fixed "T03:" hour below, so they fill the mm:ss part of the timestamp
# (hfim = "23:00" yields T03:23:00.000Z), not hh:mm
hinicio = "00:00"
hfim = "23:00"

# Insert the parameters shared by all stations into base URL 1
baseurl1 = gsub("&tipoArquivo=",paste0("&tipoArquivo=",tipo),baseurl1)
baseurl1 = gsub("&periodoInicial=",paste0("&periodoInicial=",dinicio,"T03:",hinicio,".000Z"),baseurl1)
baseurl1 = gsub("&periodoFinal=",paste0("&periodoFinal=",dfim,"T03:",hfim,".000Z"),baseurl1)

for(i in seq_along(idEstacaoes)){
  baseurl1i = gsub("codigosEstacoes=",paste0("codigosEstacoes=",idEstacaoes[i]),baseurl1)
  baseurl2i = gsub("codigosEstacoes=",paste0("codigosEstacoes=",idEstacaoes[i]),baseurl2)

  # Connection: BROWSE() opens each URL in the default browser,
  # first to generate the file and then to download it
  BROWSE(baseurl1i)
  BROWSE(baseurl2i)
}
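
The chained gsub() calls above can also be replaced by one function that assembles both URLs for a station. A minimal sketch, with assumptions: build_telemetry_urls and its arguments are hypothetical names, and the time part defaults to the T03:00:00.000Z value that the code above produces for the start of the period (the end time it produces, T03:23:00.000Z, is not reproduced here). Which timestamps the service actually accepts is not confirmed.

```r
# Hypothetical helper (not part of httr or of the Hidroweb service):
# returns the "generate" and "download" URLs for one telemetry station ID
build_telemetry_urls <- function(station_id, tipo = 3,
                                 start = "2019-01-01", end = "2019-12-31",
                                 time_suffix = "T03:00:00.000Z") {
  # Telemetry data come only as text (2) or csv (3), as noted above
  stopifnot(tipo %in% c(2, 3))
  base <- "http://www.snirh.gov.br/hidroweb/rest/api/documento/"
  # Avoid scientific notation when the ID is stored as a number
  id <- if (is.numeric(station_id)) sprintf("%.0f", station_id) else as.character(station_id)
  list(
    gerar = paste0(base, "gerarTelemetricas?codigosEstacoes=", id,
                   "&tipoArquivo=", tipo,
                   "&periodoInicial=", start, time_suffix,
                   "&periodoFinal=", end, time_suffix),
    baixar = paste0(base, "baixarTelemetricas?codigosEstacoes=", id)
  )
}

# Inspect the URLs for the first station ID; no network access is needed
urls <- build_telemetry_urls(84863570)
print(urls$gerar)
print(urls$baixar)
```

The two URLs can then be passed to BROWSE() in the same order as in the loop: first gerar, then baixar.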
Author: Paula Wessling, 2020-06-08 00:15:33