import requests
from bs4 import BeautifulSoup
respons = requests.get("https://www.reddit.com")
soup = BeautifulSoup(respons.text, "html.parser")
trend_news = soup.select("._3GfG_jvS9X-90Q_8zU4uCu _3Y1KnhioRYkYGb93uAKhBZ")
for news in trend_news:
    link = news.find("a")["href"]
    print(link)
I'm trying to get the links of the "trending news" on Reddit. However, when I run this script, nothing happens: there is no output.
There are two problems with your code:
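1. Your CSS selector, ._3GfG_jvS9X-90Q_8zU4uCu _3Y1KnhioRYkYGb93uAKhBZ, has a space where there should be a dot (.). With the space, the second name is read as a tag name nested inside the first class, not as a second class on the same element. It should instead be ._3GfG_jvS9X-90Q_8zU4uCu._3Y1KnhioRYkYGb93uAKhBZ.
2. The page is loaded dynamically with JavaScript, and the requests module does not execute JavaScript, so the content you want is not in the HTML it receives. (Fixing the CSS selector alone will not help.)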
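To illustrate the first problem, here is a minimal sketch of how a space and a dot change what a selector matches. The markup and the class names card and headline are made up for the example; they are not Reddit's.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the class names are placeholders, not Reddit's.
html = '<div class="card headline"><a href="/r/example/1">Example post</a></div>'
soup = BeautifulSoup(html, "html.parser")

# Space: descendant combinator, so this looks for a <headline> tag inside .card
with_space = soup.select(".card headline")

# Dot: both classes must be present on the same element
with_dot = soup.select(".card.headline")

links = [tag.find("a")["href"] for tag in with_dot]
print(len(with_space), links)
```

Only the dotted form finds the element, which is why the original selector returned an empty list without raising an error.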
You can get the data for "today's trending" by sending a GET request to their API, https://www.reddit.com/api/trending_searches_v1.json. It returns JSON, which .json() parses into a Python dictionary (dict) whose keys and values you can access.
import requests
# Add the user-agent header; otherwise, the site thinks that you're a bot and will block you
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
}
response = requests.get(
    "https://www.reddit.com/api/trending_searches_v1.json", headers=headers
).json()
>>> # You can access all data like a regular python dictionary
>>> print(type(response))
<class 'dict'>
For example, to access the titles:
...
for data in response["trending_searches"]:
    print(data["results"]["data"]["children"][0]["data"]["title"])
Output (currently):
Buttigieg promises 'infrastructure week' won't be a joke when he's transportation secretary
HBO Max now on the PlayStation 5
Mark Hunt vs Paul Gallen | 6-Rounds Boxing | Bankwest Stadium Australia
What fun in D2 dawning looks like
Line-ups Mashup for 7pm: Fulham vs Brighton, Liverpool vs Tottenham, West Ham vs Crystal Palace
[Roscher] Report: Russell Westbrook 'appalled' by Rockets team culture, which revolves around James Harden
Or, to access the links:
for data in response["trending_searches"]:
    print("https://reddit.com" + data["results"]["data"]["children"][0]["data"]["permalink"])
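If you want both values at once, the two loops can be combined. The sketch below assumes the response keeps the nesting used above; first_posts is a hypothetical helper name, and sample is a made-up payload that only mirrors that nesting so the code runs without a network request. It also skips any trending search whose children list is empty, where indexing [0] would otherwise raise an IndexError.

```python
def first_posts(payload):
    """Return (title, url) pairs for the first post of each trending search."""
    posts = []
    for search in payload.get("trending_searches", []):
        children = search["results"]["data"]["children"]
        if not children:  # no posts for this search, nothing to index
            continue
        post = children[0]["data"]
        posts.append((post["title"], "https://reddit.com" + post["permalink"]))
    return posts


# Made-up payload that only mirrors the nesting of the real response
sample = {
    "trending_searches": [
        {"results": {"data": {"children": [
            {"data": {"title": "Example title",
                      "permalink": "/r/example/comments/abc/"}}
        ]}}},
        {"results": {"data": {"children": []}}},
    ]
}

for title, url in first_posts(sample):
    print(title, url)
```

With the real data you would pass response in place of sample.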