How to properly parse HTML tags in Python
How do I parse an HTML page in Python? Here is the link to the page: https://3dtoday.ru/3d-models?page=1. I need to parse this piece of markup:
<div class="threedmodels_models_list__elem__title">
<a href="https://3dtoday.ru/3d-models/for-home/kitchen/derzhatel-filtra-rozhka-kofevarki" title="">
Держатель фильтра рожка кофеварки.
</a>
</div>
I don't understand how to get the text inside the <a> tag. How can I do that?
2 answers
You should use requests and bs4 (BeautifulSoup):
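import requests
from bs4 import BeautifulSoup

# Download the page and parse it with the built-in HTML parser
r = requests.get('https://3dtoday.ru/3d-models?page=1')
soup = BeautifulSoup(r.text, 'html.parser')
# Collect the containers of the listed models (class name as used in this answer)
element = soup.find_all('div', class_='threedmodels_models_list__elem__str')
# A Tag can be searched directly, so there is no need to re-parse str(element[1]);
# the third link inside the container holds the title
title = element[1].find_all('a')[2].text.strip()
print(title)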
Output: Держатель фильтра рожка кофеварки. (in English: "Coffee maker horn filter holder.")
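To see the idea without touching the network, here is a minimal sketch that parses only the fragment quoted in the question. It assumes the title sits in a div with the class threedmodels_models_list__elem__title, exactly as in that fragment; whether the live page still uses this markup is not verified here.

```python
from bs4 import BeautifulSoup

# The fragment from the question, inlined so the example runs offline.
html = """
<div class="threedmodels_models_list__elem__title">
<a href="https://3dtoday.ru/3d-models/for-home/kitchen/derzhatel-filtra-rozhka-kofevarki" title="">
Держатель фильтра рожка кофеварки.
</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the title container by its class (taken from the question's fragment),
# then take the <a> tag inside it.
div = soup.find("div", class_="threedmodels_models_list__elem__title")
link = div.find("a")

# get_text(strip=True) returns the tag's text without the surrounding newlines.
title = link.get_text(strip=True)
# Attributes are read like dictionary keys.
url = link["href"]

print(title)
print(url)
```

The same two calls, find() on the soup and then find() on the returned tag, should work on the full page text from requests as long as the class name matches.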
In response to a comment: to get the titles of all 18 elements, add a loop:
# The page lists 18 models; 'element' is the list returned by find_all above
for index in range(18):
    # The third link inside each container holds the title
    title = element[index].find_all('a')[2].text.strip()
    print(title)
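A variant that does not hard-code the number of items could use a CSS selector instead of indexes. The sketch below is only an illustration: extract_titles is a hypothetical helper name, the selector assumes the class from the question's fragment, and the second item in the sample markup is made up so the example can run offline.

```python
from bs4 import BeautifulSoup


def extract_titles(html):
    """Return (title, href) pairs for every link inside a title div."""
    soup = BeautifulSoup(html, "html.parser")
    # CSS selector: every <a> nested in a div with the title class
    links = soup.select("div.threedmodels_models_list__elem__title a")
    return [(a.get_text(strip=True), a.get("href")) for a in links]


# Sample markup: the first item comes from the question,
# the second one is a made-up placeholder.
sample = """
<div class="threedmodels_models_list__elem__title">
<a href="https://3dtoday.ru/3d-models/for-home/kitchen/derzhatel-filtra-rozhka-kofevarki" title="">
Держатель фильтра рожка кофеварки.
</a>
</div>
<div class="threedmodels_models_list__elem__title">
<a href="https://example.com/hypothetical-model" title="">
Hypothetical model
</a>
</div>
"""

for title, href in extract_titles(sample):
    print(title, href)
```

For the live page, the response text from requests.get(...) could be passed to extract_titles in place of sample; the loop then covers however many items the page contains.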
Author: JackWolf, 2020-06-23 09:42:01
BeautifulSoup is a popular library for parsing XML and HTML. Tutorial: http://zetcode.com/python/beautifulsoup/
A similar question is discussed here: https://stackoverflow.com/q/13240700/13468321
Author: oxog hex, 2020-06-23 07:51:20