web-scraping

bs4 parsing of specific rows from an HTML table

I'm asking for help. Given a table of this type: <h1>Modems</h1> <a href="..">zurück</a> <b ... you understand the cumbersomeness of this code. Do bs4 or other Python tools offer something more concise and more suitable?

What is the best way to parse an SPA website in Python?

There is a site from which I want to parse data about this player, using this link. I wrote this code for the purpose: import re ... e data that I need doesn't come in the full HTML code of the page. What is the best way to parse SPA sites?

Is Twitter blocking oEmbed?

On my site, I added the ability to scrape a tweet and display it in a frame when a post contains a link to Twitter. Using h ... ns, are not blocked! But a request as ordinary as: curl https://twitter.com comes back with a page-not-found response.

The same news headlines from the site are repeated many times

I'm a beginner. I found out about BeautifulSoup a couple of days ago, so don't judge me harshly if I made a stu ... for comp in comps: print(comps['title']) As a result:

Web scraping in Node.js with the tress library

I'm writing a simple parser for a site. I use the tress library to set up the queue; what can I replace it with? I got an error and a ... t, null, 4)); }; q.push(URL); UPD: I made the links absolute, but now it doesn't recognize the .push method

Using translate.google.ru from your Node.js applications

After reading the Habr article series "Web scraping with Node.js", I scraped the site ferra.ru and wanted to ... m the community in which direction to think (as a last resort, use the Yandex API for Yandex.Translate ;-)) Sincerely, Gerasim.

How do I scrape a site that uses the POST method?

I'm having trouble scraping sites that use the POST method. For example, I need to extract all political party ... t is not possible to solve my problem, I would like pointers to where I can find programming examples that use the POST method.

What is the best way to scrape the Datasus website in Python?

The link is this: http://tabnet.datasus.gov.br/cgi/tabcgi.exe?sih/cnv/nrbr.def I'm trying to send a POST through requests ... then the URL remains static. Do you think Selenium would be more suitable for this? Has anyone ever done anything like this?

R: downloading data from the Hidroweb portal

The National Water Agency offers, on its Hidroweb portal, downloads of historical series for the data obt ... ffice, but it doesn't seem to be accessing the page in the expected way. Can anyone explain what's wrong?

Scraping with R: xpathSApply returns a list of 0

I am learning to read XML data in R. I wanted to extract information about Brazilian football (name of the championship, p ... the code on the Globo Esporte website and it worked, so I imagine the problem lies in the XML of the Terra website itself.

XPath with Python: get the text after a tag in a div

I'm trying to grab the text after a tag that's inside a div in an HTML document. The problem is that I'm not getting the tex ... ly I get an empty string. My imports: import lxml.html as parser import requests from urllib.parse import urlsplit, urljoin

Web scraping with BeautifulSoup: find next does not return text

I want to extract the text from the excerpt below: <div class="matchDate renderMatchDateContainer" data-kickoff="1583784 ... ner" data-kickoff="1583784000000"></div> While in the browser I can see the text: Could anyone help? Thanks.

Request API with JavaScript

I am building a web application whose goal is to use an API only to list some information (GET), and for this I woul ... id of a linked resource console.log(response); }); The question is: how can I connect to an API with headers?

Sites with authentication: web scraping in Python

BR: I am trying to automate the process of getting data from the web using Python. In my case, I need to pull the information from p ... in the browser, I can access some file information with the POST method. But I cannot print this information.

Automate web scraping in Python

I am trying to get the deputies' speeches, which can be found here. The site has several pages (roughly 1 to 300) and on ... to go to the next page or go forward two, and sometimes it advances two pages, which results in wrong data.

I can't properly scrape a comic strip site with Python

Well, I was writing code that would check the day of each strip/GIF on the page and, if the day is the same as the current ... e: print("Não foi possível baixar a imagem!") return False n += 1 return True get_img()

Web scraping a Microsoft Forms form returns None [python]

Hello, I'm having a hard time scraping a form made with Microsoft Forms. (Note: I made the form myself.) I hav ... file can be seen in Inspect > Network > XHR, but I'm not sure if this is possible. I'd be grateful to anyone who can help :)

How to get a person's friends and followers on Twitter using the tweepy library?

The function getting_friends_follwers() below works if I remove the value 100 from (cursor2.items(100)) . My goal is to take ... f.write(str(user.screen_name)+ "\n") print('follower: ' + user.screen_name) f.close() getting_friends_follwers()

Scraping data using RoboBrowser

I'm trying to scrape a form, attach a file, and submit it using RoboBrowser. To open the page I do: browser.open('u ... ution to this? Or is there another way to fill out a form and submit it with tools other than RoboBrowser and BeautifulSoup?

Using lambda expressions to refine the parameters of a for loop in C#

Good afternoon! I have a question. I am developing data collection code, and at a certain point if it is necessary ... plit()[2].Trim(); } documento.itens.Add(itens); var ex = ""; } }