Dealing with dates in AWK scripts

Context: I have a CSV file from which I want to extract and process some columns. AWK seemed like the perfect tool for this, and everything was fine until I had to deal with timestamps, for example 2008-07-31T21:42:52.667.

Problem 1: I need to calculate the number of days that have passed between a base date (say 2008-07-31) and each timestamp in the first column of the file.

Detail: I know I can do the subtraction if I can use the date command from Bash, because the following command gives me the number of seconds elapsed since the system's epoch:

date -d"2008-07-31" +%s # RESULT: 1217473200s
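To make the idea concrete, here is a minimal sketch of the subtraction in plain shell. It assumes GNU date (the -d option is a GNU extension), and the function name days_between is only illustrative. The -u flag is my addition: it forces UTC so that daylight-saving changes cannot make a day shorter or longer than 86400 seconds.

```shell
# Hypothetical helper: whole days between two YYYY-MM-DD dates.
# Assumes GNU date; -u keeps every day exactly 86400 seconds long.
days_between() {
  start=$(date -u -d "$1" +%s)   # seconds since the epoch for the base date
  end=$(date -u -d "$2" +%s)     # seconds since the epoch for the later date
  echo $(( (end - start) / 86400 ))
}

days_between 2008-07-31 2008-08-31   # prints 31
```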

This way my problem can be reduced to the following:

Problem 2: How to run a bash command from within AWK?

Author: Nigini, 2014-08-07

2 answers

OK. Here is an answer to Problem 2, which already solves my problem in general, but maybe someone has another, cooler solution.

I can run a bash command in GAWK using the following construct:

COMMAND_STRING | getline RESULT_VAR_NAME
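As a small, hedged illustration of that construct, the sketch below wraps it in a shell function (epoch_of is a made-up name) that asks date for the epoch seconds of one day. It assumes GNU date for the -d option and uses -u so the result does not depend on the local time zone. Building the command in a variable first avoids the ambiguity of an unparenthesized concatenation before the pipe, and close() releases the pipe once the line has been read.

```shell
# Hypothetical helper: run a shell command from awk and capture its output.
# Assumes GNU date for -d; works with any POSIX awk.
epoch_of() {
  awk -v d="$1" 'BEGIN {
    cmd = "date -u -d \"" d "\" +%s"            # build the command string first
    if ((cmd | getline secs) > 0) print secs    # read one line of its output
    close(cmd)                                  # release the pipe
  }'
}

epoch_of 2008-07-31   # prints 1217462400 (midnight UTC)
```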

So I wrote the following script to take the first column of a file - which has a timestamp - and calculate the difference from the base date in seconds.

#!/usr/bin/gawk -f
BEGIN {
  base_date = "2008-07-31"
  # Command that returns the base date as seconds since the epoch
  cmd = "date -d\"" base_date "\" +%s"
  cmd | getline first_day
  close(cmd)
  print "BASE: " base_date " -> " first_day
  # These variables cache the last date seen, to avoid repeated shell calls.
  # This only helps because I know my file's dates are in ascending order.
  now_date = base_date
  now_day = first_day
}
{
  # Split the timestamp into temp = [DATE, TIME]
  split($1, temp, "T")
  # Only call date again if the date has changed
  if (temp[1] != now_date) {
    now_date = temp[1]
    cmd = "date -d\"" now_date "\" +%s"
    cmd | getline now_day
    close(cmd)
  }
  print now_date ", " now_day ", " (now_day - first_day)
}
Author: Nigini, 2014-08-07 15:26:11

Regarding the date arithmetic, here is an alternative, this time using Perl.

Assuming file F looks like this:

timestamp | legume | preço
2008-07-31T21:42:52.667 | batatas | 30
2008-08-31T21:42:52.667 | cebolas | 40

As a demonstration, I will prepend a first column with the number of days until Christmas 2016:

perl -MDate::Simple=date -pe '
  print /(\d{4}-\d\d-\d\d)(?=T)/ ? date("2016-12-25")-date($1) : "","|"' F

Gives:

|timestamp | legume | preço
3069|2008-07-31T21:42:52.667 | batatas | 30
3038|2008-08-31T21:42:52.667 | cebolas | 40
Author: JJoao, 2016-11-07 20:37:37