Python: requests.get("https://pt.stackoverflow.com/") never returns anything
When I try to use requests.get(url), I never get any response from the server. However, if I add the kwarg timeout=1, for example, I get the response after 1 second.
Example:
import requests
url = "https://google.com/"
r = requests.get(url, timeout=1)
print(r.elapsed)
I get:
0:00:01.211611
Using
r = requests.get(url, timeout=5)
I get:
0:00:05.223328
From what I can tell, the function only returns when the timeout is reached. Maybe I'm doing something very wrong, but I think get should return as soon as the server replies, not only after the timeout.
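For context, timeout is an upper bound, not a delay: requests (like sockets in general) returns as soon as the server replies. A minimal sketch using only the stdlib illustrates this; the tiny local server is my own construction so the example runs offline, and the same timeout semantics apply to requests.get:

```python
import http.client
import http.server
import socketserver
import threading
import time

# Hypothetical local server so the demo needs no network access
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # silence per-request logging

server = socketserver.TCPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# timeout=5 is a ceiling: the call returns as soon as data arrives
start = time.monotonic()
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
body = conn.getresponse().read()
elapsed = time.monotonic() - start
conn.close()
server.shutdown()

print(body, elapsed)  # elapsed is well under the 5-second timeout
```

If a request only ever completes when the timeout fires, something in between (DNS, routing, firewall) is hanging the connection, which matches the behavior described above.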
Printing r.__dict__ I get the following result (I removed the content):
_content_consumed True
_next None
status_code 200
headers {'Date': 'Mon, 29 Jun 2020 20:40:42 GMT', 'Expires': '-1', 'Cache-Control': 'private, max-age=0', 'Content-Type': 'text/html; charset=ISO-8859-1', 'P3P': 'CP="This is not a P3P policy! See g.co/p3phelp for more info."', 'Content-Encoding': 'gzip', 'Server': 'gws', 'Content-Length': '5387', 'X-XSS-Protection': '0', 'X-Frame-Options': 'SAMEORIGIN', 'Set-Cookie': '1P_JAR=2020-06-29-20; expires=Wed, 29-Jul-2020 20:40:42 GMT; path=/; domain=.google.com; Secure, NID=204=AuH0fx2X4m3kT6AeVtg0YMDEGr6uehL7Kt8WyzO7cmIlNDq_qnh4QXcUybI9aPOMAuC8_PuHsidpBN--vMfU1jJRreb2lM340XOSv2-CZAkK1qfXbrSSii9cRG-uX1caNB3HlnL4QDjErvgcYPtedlatyvLEaLALJ4Lj0aigT7c; expires=Tue, 29-Dec-2020 20:40:42 GMT; path=/; domain=.google.com; HttpOnly'}
raw <urllib3.response.HTTPResponse object at 0x7f1cb7b5f0b8>
url http://www.google.com/
encoding ISO-8859-1
history []
reason OK
cookies <RequestsCookieJar[<Cookie 1P_JAR=2020-06-29-20 for .google.com/>, <Cookie NID=204=AuH0fx2X4m3kT6AeVtg0YMDEGr6uehL7Kt8WyzO7cmIlNDq_qnh4QXcUybI9aPOMAuC8_PuHsidpBN--vMfU1jJRreb2lM340XOSv2-CZAkK1qfXbrSSii9cRG-uX1caNB3HlnL4QDjErvgcYPtedlatyvLEaLALJ4Lj0aigT7c for .google.com/>]>
elapsed 0:00:05.147145
request <PreparedRequest [GET]>
connection <requests.adapters.HTTPAdapter object at 0x7f1cb6d8e1d0>
[Finished in 6.0s]
I also realized that on simpler HTML pages, such as:
http://www.brainjar.com/java/host/test.html
the problem does not happen and I get the response almost immediately:
0:00:00.212320
2 answers
The fact that it returns only at the end of the timeout leads me to believe that the request is failing. I ran the code pretty much as-is and it worked. Is it possible that your firewall is blocking the Python executable? Try printing the status code:
url = "https://google.com/"
r = requests.get(url, timeout=10)
print(r.elapsed)
print(r.status_code)
0:00:00.180191
200
I just found out that the problem occurs when I make the request over IPv6. When using IPv4, the problem does not occur.
This code, taken from Jeff Kaufman's answer at https://stackoverflow.com/questions/33046733/force-requests-to-use-ipv4-ipv6, solved my problem:
# Monkey patch to force IPv4, since FB seems to hang on IPv6
import socket

old_getaddrinfo = socket.getaddrinfo

def new_getaddrinfo(*args, **kwargs):
    responses = old_getaddrinfo(*args, **kwargs)
    return [response
            for response in responses
            if response[0] == socket.AF_INET]

socket.getaddrinfo = new_getaddrinfo
This forces all requests to use IPv4.
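As a quick sanity check (a minimal sketch; the localhost lookup is just a convenient assumption, since localhost typically resolves to both 127.0.0.1 and ::1), you can verify that the patched resolver only returns IPv4 entries:

```python
import socket

old_getaddrinfo = socket.getaddrinfo

def new_getaddrinfo(*args, **kwargs):
    # Keep only IPv4 (AF_INET) results from the real resolver
    return [res for res in old_getaddrinfo(*args, **kwargs)
            if res[0] == socket.AF_INET]

socket.getaddrinfo = new_getaddrinfo

# After the patch, any IPv6 (AF_INET6) entries have been filtered out
results = socket.getaddrinfo("localhost", 80)
print(all(res[0] == socket.AF_INET for res in results))  # True
```

Note the patch is process-wide: every library that resolves names through socket.getaddrinfo (including requests via urllib3) is affected until the process exits or you restore old_getaddrinfo.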