Fetching data from a web page into a pandas DataFrame

Userabc

I am trying to scrape time series data from a web page (http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm) into a pandas DataFrame using Python 2.7. Could somebody please help me write the code? Thanks!

Here is what I tried:

html =urllib.urlopen("http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm");
text= html.read();
df=pd.DataFrame(index=datum, columns=['m_ta','m_tax','m_taxd', 'm_tan','m_tand'])

But it doesn't return anything. I want to display the table as it appears on the page.

jezrael

You can use BeautifulSoup to extract the text of all font tags, then split column a on whitespace, call set_index on column idx, and use rename_axis(None) to remove the index name:

import pandas as pd
import urllib
from bs4 import BeautifulSoup

html = urllib.urlopen("http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm")
soup = BeautifulSoup(html)
#print soup

#collect all font tags
fontTags = soup.findAll('font')
#print fontTags

#get text from the font tags
li = [x.text for x in fontTags]

#skip the first 13 tags, which do not contain the data rows
df = pd.DataFrame(li[13:], columns=['a'])

#split data by arbitrary whitespace
df = df.a.str.split(r'\s+', expand=True)

#set column names
df.columns = ['idx','m_ta','m_tax','m_taxd', 'm_tan','m_tand']

#convert column idx to period
df['idx'] = pd.to_datetime(df['idx']).dt.to_period('M')

#convert the date columns to datetime
df['m_taxd'] = pd.to_datetime(df['m_taxd'])
df['m_tand'] = pd.to_datetime(df['m_tand'])

#set column idx to index, remove index name
df = df.set_index('idx').rename_axis(None)
print df

         m_ta m_tax     m_taxd  m_tan     m_tand
1901-01  -4.7   5.0 1901-01-23  -12.2 1901-01-10
1901-02  -2.1   3.5 1901-02-06   -7.9 1901-02-15
1901-03   5.8  13.5 1901-03-20    0.6 1901-03-01
1901-04  11.6  18.2 1901-04-10    7.4 1901-04-23
1901-05  16.8  22.5 1901-05-31   12.2 1901-05-05
1901-06  21.0  24.8 1901-06-03   14.6 1901-06-17
1901-07  22.4  27.4 1901-07-30   16.9 1901-07-04
1901-08  20.7  25.9 1901-08-01   14.7 1901-08-29
1901-09  15.9  19.9 1901-09-01   11.8 1901-09-09
1901-10  12.6  17.9 1901-10-04    8.3 1901-10-31
1901-11   4.7  11.1 1901-11-14   -0.2 1901-11-26
1901-12   4.2   8.4 1901-12-22   -1.4 1901-12-07
1902-01   3.4   7.5 1902-01-25   -2.2 1902-01-15
1902-02   2.8   6.6 1902-02-09   -2.8 1902-02-06
1902-03   5.3  13.3 1902-03-22   -3.5 1902-03-13
1902-04  10.5  15.8 1902-04-21    6.1 1902-04-08
1902-05  12.5  20.6 1902-05-31    8.5 1902-05-10
1902-06  18.5  23.8 1902-06-30   14.4 1902-06-19
1902-07  20.2  25.2 1902-07-01   15.5 1902-07-03
1902-08  21.1  25.4 1902-08-07   14.7 1902-08-13
1902-09  16.1  23.8 1902-09-05    9.5 1902-09-24
1902-10  10.8  15.4 1902-10-12    4.9 1902-10-25
1902-11   2.4   9.1 1902-11-01   -4.2 1902-11-18
1902-12  -3.1   7.2 1902-12-27  -17.6 1902-12-15
1903-01  -0.5   8.3 1903-01-11  -11.5 1903-01-23
1903-02   4.6  13.4 1903-02-23   -2.7 1903-02-17
1903-03   9.0  16.1 1903-03-28    4.9 1903-03-09
1903-04   9.0  16.5 1903-04-29    2.6 1903-04-19
1903-05  16.4  21.2 1903-05-03   11.3 1903-05-19
1903-06  19.0  23.1 1903-06-03   15.6 1903-06-07
...       ...   ...        ...    ...        ...
1998-07  22.5  30.7 1998-07-23   15.0 1998-07-09
1998-08  22.3  30.5 1998-08-03   14.8 1998-08-29
1998-09  16.0  21.0 1998-09-12   10.4 1998-09-14
1998-10  11.9  17.2 1998-10-07    8.2 1998-10-27
1998-11   3.8   8.4 1998-11-05   -1.6 1998-11-21
1998-12  -1.6   6.2 1998-12-14   -8.2 1998-12-26
1999-01   0.6   4.7 1999-01-15   -4.8 1999-01-31
1999-02   1.5   6.9 1999-02-05   -4.8 1999-02-01
1999-03   8.2  15.5 1999-03-31    3.0 1999-03-16
1999-04  13.1  17.1 1999-04-16    6.1 1999-04-18
1999-05  17.2  25.2 1999-05-31   11.1 1999-05-06
1999-06  19.8  24.4 1999-06-07   12.2 1999-06-22
1999-07  22.3  28.0 1999-07-06   16.3 1999-07-23
1999-08  20.6  26.7 1999-08-09   17.3 1999-08-23
1999-09  19.3  22.9 1999-09-26   15.0 1999-09-02
1999-10  11.5  19.0 1999-10-03    5.7 1999-10-18
1999-11   3.9  12.6 1999-11-04   -2.2 1999-11-21
1999-12   1.3   6.4 1999-12-13   -8.1 1999-12-25
2000-01  -0.7   8.7 2000-01-31   -6.6 2000-01-25
2000-02   4.5  10.2 2000-02-01   -0.1 2000-02-23
2000-03   6.7  11.6 2000-03-09    0.6 2000-03-17
2000-04  14.8  22.1 2000-04-21    5.8 2000-04-09
2000-05  18.7  23.9 2000-05-27   12.3 2000-05-22
2000-06  21.9  29.3 2000-06-14   15.4 2000-06-17
2000-07  20.3  26.6 2000-07-03   14.0 2000-07-16
2000-08  23.8  29.7 2000-08-20   18.5 2000-08-31
2000-09  16.1  21.5 2000-09-14   12.7 2000-09-24
2000-10  14.1  18.7 2000-10-04    8.0 2000-10-23
2000-11   9.0  14.9 2000-11-15    3.7 2000-11-30
2000-12   3.0   9.4 2000-12-14   -6.8 2000-12-24

[1200 rows x 5 columns]
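
If you want to try the split / set_index / rename_axis steps without hitting the web page, here is a minimal offline sketch. The three sample rows reuse values from the output above, but their raw layout (in particular the full-date form of the first field) is an assumption about what the font tags contain, not something taken from the page itself. The sketch also converts the temperature columns with pd.to_numeric, because str.split leaves them as strings; that step is an addition and is not part of the code above.

```python
import pandas as pd

# Illustrative rows only: values come from the printed output above, but the
# raw text layout (especially the first date field) is an assumption.
rows = [
    "1901-01-01  -4.7   5.0 1901-01-23  -12.2 1901-01-10",
    "1901-02-01  -2.1   3.5 1901-02-06   -7.9 1901-02-15",
    "1901-03-01   5.8  13.5 1901-03-20    0.6 1901-03-01",
]

names = ['idx', 'm_ta', 'm_tax', 'm_taxd', 'm_tan', 'm_tand']

# one string per row, then split on runs of whitespace into separate columns
df = pd.DataFrame(rows, columns=['a'])
df = df['a'].str.strip().str.split(r'\s+', expand=True)
df.columns = names

# first column becomes a monthly period, the two date columns become datetimes
df['idx'] = pd.to_datetime(df['idx'], format='%Y-%m-%d').dt.to_period('M')
for col in ['m_taxd', 'm_tand']:
    df[col] = pd.to_datetime(df[col], format='%Y-%m-%d')

# extra step (not in the answer above): temperatures are still strings here
for col in ['m_ta', 'm_tax', 'm_tan']:
    df[col] = pd.to_numeric(df[col])

# move idx into the index and drop the index name
df = df.set_index('idx').rename_axis(None)

print(df.dtypes)
```

With numeric dtypes in place, calls such as df['m_ta'].mean() or df.resample('A').mean() operate on numbers rather than text. The single-argument print(...) form is used so the snippet runs on both Python 2.7 and Python 3.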
