How do I convert a Selenium WebElement to a pandas DataFrame?

Zwink

I am trying to define two functions to easily grab any table off the web as a pandas DataFrame using a link and an XPath. However, once I try to use pd.read_html I get the error 'ValueError: No tables found'. I added a print(html) and, to my surprise, the html contains my data as plain text; all HTML tags have disappeared. Any idea why this is happening, and how to convert from a WebElement to a pandas DataFrame?

my code:

import pandas as pd

def openchrome():
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    
    #open browser
    opt = webdriver.ChromeOptions()
    opt.add_argument('headless')
    # raw string so the backslashes in the Windows path are not read as escape sequences
    serv = Service(r"d:\webdrivers\chromedriver")
    browser = webdriver.Chrome(service=serv,options=opt)
    return browser

def scrape(browser, link, xpath):
    from selenium.webdriver.common.by import By
    browser.get(link)
    html = browser.find_element( By.XPATH , xpath)
    print(html)
    df = pd.read_html(html)
    return df
    #df=pd.dataframe()
    #return df

browser = openchrome()
df = scrape(browser, 'https://www.multpl.com/s-p-500-pe-ratio/table/by-year', '/html/body/div[2]/div[2]/div[2]/div[1]/div[3]/div/div[1]/table')
  
montovaneli

As the error states, no tables are being found. Why?

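  1. pd.read_html cannot parse a Selenium WebElement; it accepts only a URL, a file-like object, or a string containing HTML. Pass it the element's raw HTML instead, which you can get with html.get_attribute('outerHTML'). Note that pd.read_html returns a list of DataFrames, one per table found, so take the first item if you want a single DataFrame.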
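def scrape(browser, link, xpath):
    from io import StringIO
    from selenium.webdriver.common.by import By
    browser.get(link)
    element = browser.find_element(By.XPATH, xpath)
    # outerHTML is the element's markup as a string, including the <table> tag itself
    table_html = element.get_attribute('outerHTML')
    # wrap the string in StringIO: recent pandas versions deprecate passing literal HTML
    tables = pd.read_html(StringIO(table_html))
    # read_html returns a list of DataFrames; the XPath selects one table, so take the first
    return tables[0]


browser = openchrome()
df = scrape(browser, 'https://www.multpl.com/s-p-500-pe-ratio/table/by-year',
            '/html/body/div[2]/div[2]/div[2]/div[1]/div[3]/div/div[1]/table')
browser.quit()  # close the headless browser once the table has been read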
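
The difference between a WebElement and an HTML string can be checked without a browser. Here is a minimal sketch, assuming pandas has an HTML parser such as lxml installed. The sample table and the helper name table_from_outer_html are made up for illustration and stand in for what get_attribute('outerHTML') would return:

```python
from io import StringIO

import pandas as pd

# Made-up markup standing in for element.get_attribute('outerHTML').
SAMPLE_OUTER_HTML = """
<table>
  <tr><th>Year</th><th>Ratio</th></tr>
  <tr><td>2020</td><td>10.5</td></tr>
  <tr><td>2021</td><td>12.0</td></tr>
</table>
"""


def table_from_outer_html(outer_html):
    """Return the first table in an HTML string as a DataFrame (hypothetical helper)."""
    # StringIO makes the string file-like, which read_html accepts.
    tables = pd.read_html(StringIO(outer_html))
    # read_html returns a list with one DataFrame per <table> found.
    return tables[0]


df = table_from_outer_html(SAMPLE_OUTER_HTML)
print(df)
```

If the string passed in contains no <table> tag at all, for example the element's visible text rather than its outerHTML, read_html raises the same 'No tables found' ValueError as in the question.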

