What is the best way to parse a single-page application (SPA) website in Python?

Question

I want to scrape data about a player from the ATP Tour site. The player's page is the URL used in the code below.

I wrote this code for this purpose:

import requests
from bs4 import BeautifulSoup


def get_html():
    # Download the raw HTML exactly as the server sends it (no JavaScript is run)
    r = requests.get(url='https://www.atptour.com/en/players/felix-auger-aliassime/ag37/overview')
    return r.text


def get_career(html):
    soup = BeautifulSoup(html, 'lxml')
    # Take the first table row, which was expected to hold the career data
    career = soup.find('tr')
    print(career)


html = get_html()
get_career(html)

The problem is that the page I'm parsing is a single-page application, so the data I need is not in the HTML that the server returns; it is filled in later by JavaScript.
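To make the problem concrete, here is a minimal sketch using only the standard library. Both HTML strings are invented for illustration (they are not copied from the ATP site), and `TagCounter` and `count_tags` are hypothetical helper names. The first string stands in for what a plain HTTP request might receive from an SPA, the second for the same page after a browser has run its scripts.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the response to a plain HTTP GET on an SPA:
# an empty mount point plus a script that builds the page in the browser.
SPA_SHELL = """
<html>
  <body>
    <div id="app"></div>
    <script src="/bundle.js"></script>
  </body>
</html>
"""

# Hypothetical markup for the same page after the script has run.
RENDERED = """
<html>
  <body>
    <div id="app">
      <table><tr><td>stat name</td><td>stat value</td></tr></table>
    </div>
  </body>
</html>
"""


class TagCounter(HTMLParser):
    """Counts how many times a given start tag appears in the fed HTML."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.count += 1


def count_tags(html, tag):
    parser = TagCounter(tag)
    parser.feed(html)
    return parser.count


print(count_tags(SPA_SHELL, 'tr'))  # 0: the rows do not exist yet
print(count_tags(RENDERED, 'tr'))   # 1: the row appears only after rendering
```

A static parser only ever sees the first kind of document, which would explain why a search for `tr` comes back empty.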

How can I parse SPA sites, and what is the best way to do it?

python requests web-scraping

Author: Дух сообщества, 2019-12-28

1 answer

score 2 · Accepted Answer

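# Written for Selenium 4; Selenium 3 used executable_path= and
# find_elements_by_class_name() instead.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chromedriver = 'C:\\Program Files (x86)\\chromedrv\\chromedriver.exe'   # the driver can live at any path
opts = webdriver.ChromeOptions()
opts.add_argument('headless')   # run Chrome without opening a window
browser = webdriver.Chrome(service=Service(chromedriver), options=opts)
# browser.implicitly_wait(20)
browser.get('https://www.atptour.com/en/players/felix-auger-aliassime/ag37/player-stats')
mtlist = browser.find_elements(By.CLASS_NAME, 'mega-table')
for mt in mtlist:
    print(mt.text + '\n')
browser.quit()   # close the browser when done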

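This is one way to do it: load the page in headless Chrome through the Selenium library, so the page's JavaScript runs before the text of the elements with the mega-table class is read.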
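The commented-out `implicitly_wait(20)` line hints at a timing issue: right after `get()` returns, the elements may not have been rendered yet. Selenium's explicit waits (such as `WebDriverWait`) deal with this by polling until a condition holds. The sketch below shows that polling idea in plain Python. `wait_until` and `fake_lookup` are hypothetical names, and the fake lookup merely simulates a page whose tables show up on the third check.

```python
import time


def wait_until(condition, timeout=20.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} s')
        time.sleep(interval)


# Stand-in for an element lookup that returns an empty list
# until the page has finished rendering (here: on the third call).
calls = {'n': 0}


def fake_lookup():
    calls['n'] += 1
    return ['table'] if calls['n'] >= 3 else []


print(wait_until(fake_lookup, timeout=2.0, interval=0.01))  # ['table']
```

With a real browser, the condition would be a function that runs the mega-table lookup from the answer and returns the resulting list, so the loop ends as soon as at least one element is found.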