Dealing with dates in AWK scripts
Context: I have a CSV file from which I want to extract and process some columns. AWK seemed like the perfect tool for this, and everything went fine until I had to deal with timestamps, for example 2008-07-31T21:42:52.667
Problem 1: I need to calculate the number of days that have passed between a base date (say 2008-07-31) and each timestamp in the first column of the file.
Detail: I know I can do the subtraction if I can use the date command from bash, because the following command gives me the number of seconds elapsed since the system's epoch:
date -d"2008-07-31" +%s #RESULT: 1217473200 (seconds, in my local time zone)
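To make the arithmetic concrete, here is a minimal sketch of turning two such epoch values into a day count. It assumes GNU date (the -d option is a GNU extension), and it adds -u so both dates are read as UTC, which the command above does not do; the function name days_between is only illustrative.

```shell
#!/bin/sh
# days_between is a hypothetical helper name, not part of any tool.
# Assumes GNU date: -d parses a date string, +%s prints epoch seconds.
# -u pins both conversions to UTC so a daylight-saving change cannot
# make one of the days 23 or 25 hours long.
days_between() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( (end - start) / 86400 ))    # 86400 seconds per day
}

days_between 2008-07-31 2008-08-31   # prints 31
```

Without -u the two midnights are taken in the local time zone, so the difference may be off by an hour around a DST switch and the integer division could then lose a day.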
This way my problem can be reduced to the following:
Problem 2: How to run a bash command from within AWK?
2 answers
OK. What follows is an answer to Problem 2, which already solves my problem in general, but maybe someone has another, cooler solution.
I can run a bash command in GAWK using the following construct:
COMMAND_STRING | getline RESULT_VAR_NAME
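As a small illustration of that construct on its own, the sketch below reads one line of output from date into an awk variable. It assumes GNU date and uses -u (UTC) so the printed value does not depend on the local time zone; epoch_of is a made-up helper name.

```shell
#!/bin/sh
# epoch_of is a hypothetical wrapper around an awk one-liner.
# Assumes GNU date for the -d option.
epoch_of() {
    awk -v day="$1" 'BEGIN {
        cmd = "date -u -d \"" day "\" +%s"   # build the command string
        if ((cmd | getline secs) > 0)        # read its first output line into secs
            print secs
        close(cmd)                           # release the pipe when done
    }'
}

epoch_of 2008-07-31   # prints 1217462400 (midnight UTC)
```

Storing the command in a variable makes it easy to pass the exact same string to close(), which is how awk identifies the pipe to shut down.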
So I wrote the following script to take the first column of a file - which has a timestamp - and calculate the difference from the base date in seconds.
#!/usr/bin/gawk -f
BEGIN {
    base_date = "2008-07-31"
    # Command that returns the base date as seconds since the epoch
    cmd = "date -d\"" base_date "\" +%s"
    cmd | getline first_day
    close(cmd)    # close the pipe so open commands do not pile up
    print "BASE: " base_date " -> " first_day
    # Variables used to avoid repeated calls to date.
    # This only helps because I know my file has dates in increasing order.
    now_date = base_date
    now_day = first_day
}
{
    # Build temp = [DATE, TIME]
    split($1, temp, "T")
    # Only call date again if the date has changed
    if (temp[1] != now_date) {
        now_date = temp[1]
        cmd = "date -d\"" now_date "\" +%s"
        cmd | getline now_day
        close(cmd)
    }
    print now_date ", " now_day ", " (now_day - first_day)
}
Regarding the date arithmetic, here is an alternative, this time with Perl.
Assuming file F looks like this:
timestamp | legume | preço
2008-07-31T21:42:52.667 | batatas | 30
2008-08-31T21:42:52.667 | cebolas | 40
As a demo, I will add a new first column with the number of days until Christmas 2016.
perl -MDate::Simple=date -pe '
print /(\d{4}-\d\d-\d\d)(?=T)/ ? date("2016-12-25")-date($1) : "","|"' F
Gives:
|timestamp | legume | preço
3069|2008-07-31T21:42:52.667 | batatas | 30
3038|2008-08-31T21:42:52.667 | cebolas | 40