Is Twitter blocking oEmbed?

On my site, I added a feature that fetches a tweet and displays it in a frame whenever a post contains a link to Twitter. It uses https://github.com/oscarotero/Embed/ (I mention this only for context; the problem is not in the library).

Everything worked fine for a year or two. But at the beginning of this week I noticed that tweets no longer render properly, and while investigating I found that my server's IP is simply blocked.

For example, running this from my server: lynx https://twitter.com/QuinnyPig/status/1250910042246660096 returns a 404.

From localhost or any other IP, everything works fine.

So I conclude that it is specifically my server's IP that is blocked.

Note that the site gets no more than 10-20 such embedded posts per day.

While searching for a solution, I found nothing clear about restrictions on oEmbed embedding of tweets. And why would Twitter do that?

I removed all restrictions from the firewall and the fail2ban rules and checked again; the problem remained.

However, other oEmbed embeds, such as those from FB and Instagram, work fine.

And, as I pointed out above, the only apparent cause is the suspicious blocking of my server's IP.

What should I do, and how?

If the problem is that I exceeded some limit on oEmbed requests, please tell me where to look, because I could not find this in the Twitter help.

Of course, I could use a proxy or similar means for scraping, but I am interested in the root of the problem: is my server to blame, or does Twitter restrict this kind of thing? If the latter, how do other sites cope?


Update:

Our team lead contacted Twitter support, and during the discussion it turned out that (for an unknown reason) our server's IP address is "blocked", i.e. every request coming from our server gets a 404 response.

At the same time, authorized requests using the keys issued to us, and our applications in general, are not blocked!

But even a request as ordinary as:

curl https://twitter.com

returns a "page not found" response.
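
To compare what the server sees with what another machine sees, the same request can be scripted. This is only a minimal sketch using the Python standard library; the function name `fetch_status`, the `opener` parameter, and the `status-check/0.1` User-Agent string are my own illustrative choices, not anything Twitter or the Embed library prescribes.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, user_agent="status-check/0.1"):
    """Return the HTTP status code of a GET request to `url`.

    `opener` defaults to urllib.request.urlopen and is a parameter only so
    that the function can be exercised without network access.
    `user_agent` is a hypothetical value chosen for this sketch.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses; the status code
        # is exactly what we want to report in that case.
        return error.code
```

Calling `fetch_status("https://twitter.com")` on the affected server and again on any other machine should show whether the 404 depends on the source IP, which is what the curl test above suggests.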

Author: ShadowTrix, 2020-04-20

1 answer

Well, without any changes to the code or the server, Twitter unexpectedly "returned" our access to its resources.

I believe the problem was the following.

The site's robots.txt lacked a permissive rule for the Twitter bot:

User-agent: Twitterbot
Disallow:

We fixed this a few days ago, and most likely it took Twitter a while to restore our access (maybe a cache was cleared or something was updated on their side).
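
For anyone who wants to verify such a rule before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the sample robots.txt contents, and the example.com URL are assumptions made for illustration; our real file is not reproduced here, and this checks only how a standard parser reads the rules, not how Twitter itself behaves.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks every crawler.
RESTRICTIVE = """\
User-agent: *
Disallow: /
"""

# The same file with the permissive Twitterbot rule added.
# An empty "Disallow:" value means nothing is disallowed for that agent.
WITH_TWITTERBOT = """\
User-agent: Twitterbot
Disallow:

User-agent: *
Disallow: /
"""

PAGE = "https://example.com/post/1"  # placeholder URL

print(is_allowed(RESTRICTIVE, "Twitterbot", PAGE))
print(is_allowed(WITH_TWITTERBOT, "Twitterbot", PAGE))
```

With the restrictive file the first check reports that Twitterbot is not allowed; once the dedicated Twitterbot section is present, the second check reports that it is.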

In any case, everything is working now, just like before.

Author: ShadowTrix, 2020-04-27 12:25:07