How to read a specific amount of data or lines in Python?

Question

I have a .lis, .txt or .csv file and I need to take only a certain amount of data or lines from it and skip the rest, keeping only the data between given lines or words. In other words, how do I find a word or line and then show the lines or data from there up to another word or line where the block ends?

So far I have only been able to read the file with this code:

abrir = open('clase1.lis','r')
while True:
    linea = abrir.readline()
    if not linea: break
    print(linea)

Another approach I tried was:

abrir = open('clase1.lis','r')
for q in abrir:
    print(q)

These and my other attempts just print the whole file on the screen. But as I said above, I only need a block of that file. The file is very large.

14

Author: DaxTter77, 2016-03-31

4 answers

score 6 · Answer 1

If the file is large, you should read it line by line instead of loading the entire file into memory. For example, take the following archivo.txt:

--------------------------
Hola me llamo Cesar
Soy de Lima
Me gusta Python
--------------------------
Hola me llamo Juan
Yo no soy de Lima
Odio Python
--------------------------
Hola me llamo Jose
Vivo cerca a Lima
Nunca he usado Python
--------------------------

To search for the keyword Lima, you can collect every line that contains it:

palabra = 'Lima'
ocurrencias = []
with open('archivo.txt') as lineas:
    for linea in lineas:
        if palabra in linea:
            ocurrencias.append(linea)
print(ocurrencias)

Or something more compact using filter:

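palabra = 'Lima'
with open('archivo.txt') as lineas:
    # Iterate over the file itself (no readlines()) so it is still read line by line;
    # list() is needed in Python 3, where filter returns a lazy iterator
    ocurrencias = list(filter(lambda linea: palabra in linea, lineas))
print(ocurrencias)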

In both cases the result is a list of the matching lines:

['Soy de Lima\n', 'Yo no soy de Lima\n', 'Vivo cerca a Lima\n']
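For a very large file it may help to wrap the loop in a generator, so that matches are produced one at a time instead of being accumulated in a list. This is only a minimal sketch for Python 3: the name lineas_con_palabra is hypothetical, the function accepts any iterable of lines (an open file, for instance), and io.StringIO stands in for archivo.txt so the example runs on its own.

```python
import io


def lineas_con_palabra(lineas, palabra):
    """Yield (line number, line) for every line that contains palabra."""
    for numero, linea in enumerate(lineas, start=1):
        if palabra in linea:
            yield numero, linea.rstrip('\n')


# io.StringIO plays the role of open('archivo.txt') in this sketch.
ejemplo = io.StringIO(
    "Hola me llamo Cesar\n"
    "Soy de Lima\n"
    "Me gusta Python\n"
)

resultado = list(lineas_con_palabra(ejemplo, "Lima"))
print(resultado)  # [(2, 'Soy de Lima')]
```

With a real file you would pass the object returned by open() and loop over the generator directly, without calling list().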

score 5 · Answer 2

Let's try a little trick: every file object behaves like an iterator, so you can loop over the file line by line. To get the text between two line numbers (n, m) you can use the iterator utilities of the itertools module:

import itertools

with open("datos.txt") as data:
    texto = itertools.islice(data, n, m)

    for linea in texto:
        ....
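Keep in mind that islice counts lines from zero and stops before m. A small sketch with made-up values (n = 1, m = 4) and io.StringIO in place of datos.txt, assuming Python 3:

```python
import io
import itertools

# Stand-in for open("datos.txt"); the contents are invented for the example.
data = io.StringIO("linea 0\nlinea 1\nlinea 2\nlinea 3\nlinea 4\n")

n, m = 1, 4  # hypothetical bounds: start at index 1, stop before index 4
texto = [linea.rstrip('\n') for linea in itertools.islice(data, n, m)]
print(texto)  # ['linea 1', 'linea 2', 'linea 3']
```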

If you are looking for occurrences of palabra in some lines:

with open("datos.txt") as data:
    ocurrencias = (linea for linea in data if palabra in linea)

    for linea in ocurrencias:
        ....

You can even combine both:

import itertools

with open("datos.txt") as data:
    texto = itertools.islice(data, n, m)
    ocurrencias = (linea for linea in texto if palabra in linea)

    for linea in ocurrencias:
        ....
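Since the question asks for the block between two words, that case could be sketched as follows. This is only one possible approach, assuming Python 3: the name bloque_entre is hypothetical, and I am assuming the line with the starting word should be included and the line with the ending word should not. The loop stops at the ending word, so the rest of a large file is never read.

```python
import io


def bloque_entre(lineas, inicio, fin):
    """Yield lines from the first one containing inicio up to,
    but not including, the next line containing fin."""
    dentro = False
    for linea in lineas:
        if not dentro:
            if inicio in linea:
                dentro = True
                yield linea
        elif fin in linea:
            return  # end marker reached: stop reading the file
        else:
            yield linea


# Stand-in for a real file, using part of the sample archivo.txt above.
ejemplo = io.StringIO(
    "--------------------------\n"
    "Hola me llamo Juan\n"
    "Yo no soy de Lima\n"
    "Odio Python\n"
    "--------------------------\n"
    "Hola me llamo Jose\n"
)

bloque = [linea.rstrip('\n') for linea in bloque_entre(ejemplo, "Juan", "-----")]
print(bloque)  # ['Hola me llamo Juan', 'Yo no soy de Lima', 'Odio Python']
```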

score 4 · Answer 3

Assuming your .csv file has the following content:

Irlanda,33°02'N,128°12'W
Rumania,33°03'N,128°25'W
Colombia,12°43'46″N,54°02'11″W
Los Angeles,34°03'N,118°15'W
Panama,40°42'46″N,74°00'21″W
Paris,48°51'24″N,2°21'03″E
Munchen,42°53'24″N,22°21'33″E
Mexico,30°42'36″N,44°00'21″W
Paris,48°51'24″N,2°21'03″E
Colombia,32°42'36″N,34°04'21″W

You can create a function to extract the records that contain the text you want:

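def buscaPalabra(palabra, archivo):
    # Build a fresh list on every call instead of appending to a global one
    lista = []
    for linea in archivo:
        # Test the whole line so that names with spaces ("Los Angeles") also match
        if palabra in linea:
            lista.append(linea)
    return lista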

For example when searching for "Colombia"

# Raw string so the backslashes in the Windows path are not read as escapes
with open(r'C:\Data\datos.csv', 'r') as archivo:
    print(buscaPalabra("Colombia", archivo))

You would get the lines that match "Colombia":

["Colombia,12°43'46″N,54°02'11″W\n", "Colombia,32°42'36″N,34°04'21″W\n"]
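If you need the name column to match exactly rather than anywhere in the line, the standard csv module is an alternative. The following is a rough sketch for Python 3, not part of the answer's original approach: the function name buscar_registros is hypothetical, I am assuming the name is always the first column, and io.StringIO stands in for datos.csv.

```python
import csv
import io


def buscar_registros(lineas, nombre):
    """Return the rows whose first column is exactly nombre."""
    return [fila for fila in csv.reader(lineas) if fila and fila[0] == nombre]


# Stand-in for the real file; with a file on disk you would use something like
# open('datos.csv', newline='', encoding='utf-8') (the encoding is an assumption).
ejemplo = io.StringIO(
    "Irlanda,33°02'N,128°12'W\n"
    "Rumania,33°03'N,128°25'W\n"
    "Los Angeles,34°03'N,118°15'W\n"
)

registros = buscar_registros(ejemplo, "Los Angeles")
print(registros)  # [['Los Angeles', "34°03'N", "118°15'W"]]
```

Each match comes back as a list of columns, so the latitude and longitude are available separately as fila[1] and fila[2].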

score 3 · Answer 4

A question: the result I get when searching a .txt file is as follows:

['Usuario: carlos.lopez\r\n', 'gital<br><br>Usuario: carlos.carus<br><br>BP: 1378704 <br><br>CUIL: 2025201=\r\n']

What I need in this case is to store carlos.lopez in one variable, 1378704 in another, and the CUIL in another. Can you help me with this?

The code is as follows:

lista = []
file = open('archivo.txt','r')

def buscaPalabra(str, file):
    for line in file:
        for part in line.split():
            if str in part:
                lista.append(line)
    return lista

print(buscaPalabra("Usuario:", file))
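One possible way to split those lines into separate values is with regular expressions. This is just a sketch for Python 3, built only on the two sample lines shown above: the names PATRONES and extraer_campos are hypothetical, and the patterns assume that each label is followed by a colon and that BP and CUIL contain only digits. The trailing = in the second line suggests the CUIL may continue on the next line of the file, which this sketch does not handle.

```python
import re

# The two lines from the result above, used as sample input.
lineas = [
    'Usuario: carlos.lopez\r\n',
    'gital<br><br>Usuario: carlos.carus<br><br>BP: 1378704 <br><br>CUIL: 2025201=\r\n',
]

# One pattern per field; the capture group holds the value after the label.
PATRONES = {
    'usuario': re.compile(r'Usuario:\s*([\w.]+)'),
    'bp': re.compile(r'BP:\s*(\d+)'),
    'cuil': re.compile(r'CUIL:\s*(\d+)'),
}


def extraer_campos(linea):
    """Return a dict with whichever of the fields appear in linea."""
    campos = {}
    for nombre, patron in PATRONES.items():
        coincidencia = patron.search(linea)
        if coincidencia:
            campos[nombre] = coincidencia.group(1)
    return campos


for linea in lineas:
    print(extraer_campos(linea))
```

Each value can then be assigned to its own variable, for example usuario = campos.get('usuario'), which gives None when the field is missing from that line.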