Creating tuples from the columns of a DataFrame

Kirti Purohit

[Image: screenshot of the dataset]

I have a dataset like this, and I want to create a list of tuples of the form:

(Name_of_State , Literacy_rate)
(JAMMU&KASHMIR, 89.78) #example

I had to do some cleanup first, removing the districts and keeping only the states.

data = data[data['Name'] != 'India']    # remove the row for India as a whole
data = data[data['TRU'] == 'Total']     # keep only the totals; drop the rural and urban rows
states_group = data[data['Level'] == 'State']
states_group

After that, this is the main code I want to focus on:

literacy_rate=[]
total_state_pop=0
total_literate_pop=0
for key,group in states_group.iterrows():
    total_state_pop+=states_group['TOT_P']
    
    total_literate_pop+=states_group['P_LIT']
    total_literate_pop+=states_group['F_LIT']
    rate=(total_literate_pop/total_state_pop)*100
    literacy_rate.append((states_group['Name'],rate))
    
print(literacy_rate) 

But the output I get is:

(3            JAMMU & KASHMIR
72          HIMACHAL PRADESH
111                   PUNJAB
174               CHANDIGARH
180              UTTARAKHAND
222                  HARYANA
288             NCT OF DELHI
318                RAJASTHAN
420            UTTAR PRADESH
636                    BIHAR
753                   SIKKIM
768                  MANIPUR
798                  MIZORAM
825                  TRIPURA
840                MEGHALAYA
864                    ASSAM
948              WEST BENGAL
1008               JHARKHAND
1083                  ODISHA
1176            CHHATTISGARH
1233          MADHYA PRADESH
1386                 GUJARAT
1467             DAMAN & DIU
1476    DADRA & NAGAR HAVELI
1482             MAHARASHTRA
1590          ANDHRA PRADESH
1662               KARNATAKA
1755                     GOA
1764                  KERALA
1809              TAMIL NADU
1908              PUDUCHERRY
Name: Name, dtype: object, 3        85.484832
72       99.946393
111      80.810862
174      93.793637
180      89.689123
222      79.608418
288      97.531743
318      67.745833
420      69.971651
636      52.937273
753      98.691424
768      96.236438
798     109.113300
825     116.065370
840      84.108326
864      96.451609
948      87.437511
1008     63.211190
1083     85.260257
1176     85.104889
1233     78.055310
1386     99.236215
1467    121.848465
1476    112.301972
1482    100.968386
1590     79.671587
1662     81.400129
1755    110.110417
1764    120.140132
1809     94.529868
1908    101.165414
dtype: float64), ...

And it goes on much longer. Here is the link to the whole dataset. Where did I go wrong? Thanks in advance.

Edward Cullen

Avoid iterating over rows wherever possible, as it is an anti-pattern in pandas. Good read.
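
As for why the loop in the question prints whole columns: inside the loop it indexes `states_group[...]` (the entire column) instead of the current row, so each appended tuple holds two full Series, and the running totals are not needed for a per-state rate. Here is a minimal sketch of a per-row fix, using an invented two-row stand-in for `states_group`. It assumes that `P_LIT` already counts all literate persons, so the rate is taken as `P_LIT / TOT_P`; if that assumption does not hold for this dataset, adjust the formula accordingly.

```python
import pandas as pd

# Toy stand-in for states_group; the names and numbers are invented for illustration.
states_group = pd.DataFrame(
    {
        "Name": ["STATE A", "STATE B"],
        "TOT_P": [1000, 2000],
        "P_LIT": [750, 1000],
    },
    index=[3, 72],
)

literacy_rate = []
for _, row in states_group.iterrows():
    # row["TOT_P"] is one number for this state;
    # states_group["TOT_P"] would be the whole column.
    # Assumption: P_LIT is the total number of literate persons.
    rate = row["P_LIT"] / row["TOT_P"] * 100
    literacy_rate.append((row["Name"], rate))

print(literacy_rate)
```

The loop now yields one `(name, rate)` tuple per state. A vectorized version avoids the loop altogether: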

import pandas as pd
data = pd.read_excel('state_dist_sc.xls')
data=data[data['Name']!='India']
data=data[data['TRU']=='Total']
states_group=data[data['Level']=='State']

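# Create a copy of the data on which to calculate the literacy rate
# (this avoids pandas' SettingWithCopyWarning when adding a column to a filtered frame).
states_group = states_group.copy()

# Calculate the literacy rate with a vectorized formula, which is faster than looping over rows.
# Note: if P_LIT already counts all literate persons, adding F_LIT counts women twice,
# which would explain the rates above 100% in the output.
states_group['literacy_rate'] = 100*(states_group['P_LIT'] + states_group['F_LIT'])/states_group['TOT_P']

# to_records returns a NumPy record array whose entries are (Name, literacy_rate) records
ans = states_group[['Name','literacy_rate']].to_records(index=False)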
ans

Output:

rec.array([('JAMMU & KASHMIR',  85.48483174),
           ('HIMACHAL PRADESH',  99.94639301), ('PUNJAB',  80.81086172),
           ('CHANDIGARH',  93.79363692), ('UTTARAKHAND',  89.68912284),
           ('HARYANA',  79.60841792), ('NCT OF DELHI',  97.53174349),
           ('RAJASTHAN',  67.74583313), ('UTTAR PRADESH',  69.97165068),
           ('BIHAR',  52.93727261), ('SIKKIM',  98.69142352),
           ('MANIPUR',  96.23643761), ('MIZORAM', 109.11330049),
           ('TRIPURA', 116.06537002), ('MEGHALAYA',  84.10832613),
           ('ASSAM',  96.45160871), ('WEST BENGAL',  87.43751069),
           ('JHARKHAND',  63.21118996), ('ODISHA',  85.26025661),
           ('CHHATTISGARH',  85.10488906),
           ('MADHYA PRADESH',  78.05530967), ('GUJARAT',  99.23621537),
           ('DAMAN & DIU', 121.84846506),
           ('DADRA & NAGAR HAVELI', 112.3019722 ),
           ('MAHARASHTRA', 100.96838647),
           ('ANDHRA PRADESH',  79.67158709), ('KARNATAKA',  81.40012899),
           ('GOA', 110.11041691), ('KERALA', 120.14013153),
           ('TAMIL NADU',  94.529868  ), ('PUDUCHERRY', 101.16541449)],
          dtype=[('Name', 'O'), ('literacy_rate', '<f8')])
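
The result above is a NumPy record array rather than a plain Python list. If a literal list of tuples is needed, as the question asks, one way to get it is sketched below. The two-row frame is an invented stand-in for `states_group` with its `literacy_rate` column already computed.

```python
import pandas as pd

# Toy stand-in for states_group after the literacy_rate column has been added;
# the names and values are invented for illustration.
states_group = pd.DataFrame(
    {"Name": ["STATE A", "STATE B"], "literacy_rate": [75.0, 50.0]}
)

# itertuples(index=False, name=None) yields one plain tuple per row.
pairs = list(
    states_group[["Name", "literacy_rate"]].itertuples(index=False, name=None)
)

# Equivalent alternative: zip the two columns together.
pairs_zip = list(zip(states_group["Name"], states_group["literacy_rate"]))

print(pairs)
```

Both forms give the same list; `zip` is convenient for two columns, while `itertuples` scales to any number of columns without extra arguments.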
