Merging a set of .tsv files into one
I'm using ML to predict communication quality in wireless mesh networks. My dataset consists of a large number of .tsv files (about 4000). File name format: topo-2016-01-15-00_00.tsv; topo-2016-01-15-00-00_05.tsv; ...; topo-2016-01-15-15-23_55.tsv. The data was collected over 14 days at 5-minute intervals. The file contents are shown in the figure:
How can I combine these files into one so that I can work with them later?
1 answer
If the whole dataset fits in the computer's memory, the simplest option is pandas (Python):
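import pandas as pd
from glob import glob

# glob returns paths in arbitrary order, so sort them by file name first
files = sorted(glob("/folder/dataset/*.tsv"))
# Read each tab-separated file, skipping its first row, and stack the results
df = pd.concat([pd.read_csv(f, sep="\t", skiprows=1) for f in files],
               ignore_index=True)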
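If memory does not allow loading everything at once, one possible alternative is to stream the files into a single .tsv on disk, line by line. The sketch below is only an illustration: the function name `merge_tsv_files` and the `header_lines` parameter are hypothetical, and it assumes every file starts with the same number of header lines (one by default) and is UTF-8 encoded. Neither assumption can be checked from the question.

```python
import glob


def merge_tsv_files(pattern, out_path, header_lines=1):
    """Concatenate all files matching `pattern` into `out_path`.

    Assumption: every input file begins with the same `header_lines`
    header lines; they are copied from the first file only.
    Write `out_path` outside the folder matched by `pattern`, otherwise
    the output file could be picked up as an input on a later run.
    Returns the number of input files processed.
    """
    files = sorted(glob.glob(pattern))  # sort by name for a stable order
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        for index, path in enumerate(files):
            with open(path, "r", encoding="utf-8", newline="") as src:
                for line_no, line in enumerate(src):
                    # Skip the header lines of every file except the first
                    if index > 0 and line_no < header_lines:
                        continue
                    # Guard against a missing newline at the end of a file
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)
    return len(files)
```

Only one line is held in memory at a time, so the size of the dataset is limited by disk space rather than RAM. The merged file can then be read with pd.read_csv(..., sep="\t"), optionally with the chunksize argument to process it in pieces.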
Author: MaxU, 2020-05-01 12:58:46