Python: getting the YouTube author image from the video link

ubuntuMAN

Hello, I am trying to scrape the author image URL from a given video link using urllib.request, but because the URL length varies, my match runs on into other properties such as the width and height:

https://yt3.ggpht.com/ytc/AKedOLS-Bwwebj7zfYDDo43sYPxD8LN7q4Lq4EvqfyoDbw=s400-c-k-c0x00ffffff-no-rj","width":400,"height"

instead of the author image link that I want:

https://yt3.ggpht.com/ytc/AKedOLS-Bwwebj7zfYDDo43sYPxD8LN7q4Lq4EvqfyoDbw=s400-c-k-c0x00ffffff-no-rj

The code I am using:

import re
import urllib.request

def get_author(uri):
    html = urllib.request.urlopen(uri)
    author_image = re.findall(r'yt3.ggpht.com/(\S{99})', html.read().decode())
    return f"https://yt3.ggpht.com/{author_image[1]}"

Sorry for my bad English, thanks in advance =)

Wiktor Stribiżew

If you are not sure about the length of the match, do not hardcode the number of characters to match. {99} is not going to work with arbitrary strings.

Besides, you are matching a string inside markup text, so you need to be sure you only match up to the delimiting character. If that delimiter is a " character, match everything up to it with [^"]+.

Also, dots in regex are special and you need to escape them to match literal dots.
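As a small illustration of the point about dots, here is a sketch that compares the two patterns. The `lookalike` string is made up for the example and is not from the question.

```python
import re

unescaped = re.compile(r'yt3.ggpht.com/')    # each dot matches any character
escaped = re.compile(r'yt3\.ggpht\.com/')    # each dot matches only a literal "."

lookalike = "yt3xggpht-com/"  # hypothetical string that is not the real host

print(bool(unescaped.search(lookalike)))  # the unescaped pattern accepts it
print(bool(escaped.search(lookalike)))    # the escaped pattern rejects it
```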

Besides, re.findall collects all occurrences. If you only need the first one, use re.search, which stops at the first match and saves some resources.

So, a fix could look like

import re
import urllib.request

def get_author(uri):
    html = urllib.request.urlopen(uri)
    # Match from the host name up to, but not including, the closing double quote
    author_image = re.search(r'yt3\.ggpht\.com/[^"]+', html.read().decode())
    if author_image:
        return f"https://{author_image.group()}"
    return None  # you will need to handle this condition in your later code

Here, author_image is the match object returned by re.search, or None if nothing matched. If there is a match, prepend https:// to the matched text (author_image.group()) and return it. Otherwise, return a default value that the calling code can check for (here, None).
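To check the pattern without making a network request, a minimal sketch can run it against the fragment quoted in the question. The `sample` variable and the `extract_author_image` helper are illustrative names, not part of the original code, and the sketch assumes the URL is always terminated by a double quote in the page source.

```python
import re

# Fragment copied from the question: the URL is followed by other JSON properties.
sample = (
    'https://yt3.ggpht.com/ytc/AKedOLS-Bwwebj7zfYDDo43sYPxD8LN7q4Lq4EvqfyoDbw'
    '=s400-c-k-c0x00ffffff-no-rj","width":400,"height"'
)

def extract_author_image(text):
    """Hypothetical helper: return the first yt3.ggpht.com URL in text, or None."""
    match = re.search(r'yt3\.ggpht\.com/[^"]+', text)
    return f"https://{match.group()}" if match else None

print(extract_author_image(sample))           # URL cut off at the closing quote
print(extract_author_image("no image here"))  # None, nothing to match
```

Keeping the regex in a helper that takes plain text makes it easy to test separately from the download step.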
