Parsing beru.ru fails with errors: '403' or 'Connection aborted'

When I try to parse the site beru.ru, the following errors are returned:

403:

import requests
from bs4 import BeautifulSoup
import time

URL = 'https://m.beru.ru/catalog/tovary-dlia-avto-i-mototekhniki/76688/list?hid=90402&how=aprice#1-0'

class ParserBeru:
    def __init__(self, url):
        self.url = url
        self.session = requests.Session()
        self.session.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36',
            'Accept-Language': 'ru',
        }

    def get_page(self):
        res = self.session.get(url=self.url)
        res.raise_for_status()
        return res.text

def main():
    parser = ParserBeru(url=URL)
    soup = BeautifulSoup(parser.get_page(), 'lxml')
    print(soup)

if __name__ == '__main__':
    try:
        main()
    except Exception as e:
        print(f'Error reading the page. Please wait...\n{e}')
        time.sleep(5)

And "('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))" when I change the headers to:

self.session.headers = {
    'Host': 'https://m.beru.ru/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'ru,en-US;q=0.5',
    'Accept-Encoding': 'gzip, deflate, br',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Pragma': 'no-cache',
    'Cache-Control': 'no-cache'
}

P.S. If I parse with Selenium, everything works fine. I assume that either there is a problem with cookies, or that beru sends some request of its own which requests, of course, does not respond to. If you could write a couple of lines of code that solve this problem, I would be very grateful :). If you read the response body, it says: "Access to our service is temporarily prohibited!

It is possible that your computer is infected with malware that automatically accesses Yandex." etc.

Thank you in advance for your help.

Author: Andruxa_Xren, 2020-09-15

1 answer

I may be wrong, but perhaps you should send cookies to the site. For example:

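import requests
# The header name is 'Cookie'; the value below is a placeholder for a real cookie string
HEADERS = {'Cookie': 'erhfuiwhgfiuwerhiw4ueghfwoqghuyq4go7w4gfuier(something)'}
r = requests.get(url, headers=HEADERS)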
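Since the question mentions that Selenium works, one possible source for that cookie string is the browser session itself. Below is a minimal sketch of assembling a Cookie header value from a list of cookie dicts, shaped like the ones Selenium's driver.get_cookies() returns. The helper name build_cookie_header and the sample cookie names and values are made up for illustration, and I have not checked that beru.ru accepts cookies passed this way.

```python
from typing import Any, Iterable, Mapping


def build_cookie_header(cookies: Iterable[Mapping[str, Any]]) -> str:
    """Join cookie dicts (each with 'name' and 'value' keys) into one Cookie header value."""
    return '; '.join(f"{c['name']}={c['value']}" for c in cookies)


# Hypothetical cookies; in practice these could come from driver.get_cookies()
# after opening the page in Selenium.
browser_cookies = [
    {'name': 'session_id', 'value': 'abc123'},
    {'name': 'lang', 'value': 'ru'},
]

HEADERS = {'Cookie': build_cookie_header(browser_cookies)}
print(HEADERS['Cookie'])
```

The resulting dict could then be merged with the User-Agent header from the question and passed as headers=HEADERS to requests.get. The cookies may also expire, so they would probably need to be refreshed from the browser from time to time.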

I hope this helps.

 2
Author: Leonid M, 2020-11-25 14:17:13