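Fair warning: this question requires the non-standard Python package nba_api.

I have a list of 3 elements, and each element contains another list of 2 elements: a player data frame and a team data frame. What is the recommended way to get the following result: 1 combined player data frame and 1 combined team data frame?

Coming from an R background, I would solve this by: 1. joining the players data frames and the team data frames into joined_list, then 2. calling do.call(rbind, joined_list) to row-bind the results into a single data frame. I realize this is probably very basic for many experienced Python users, but after a lot of searching here I am still struggling to find the right approach.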
import nba_api
import requests
import pandas as pd
from nba_api.stats.endpoints import boxscoreadvancedv2
# vector of game ids (test purposes)
gameids = ['0021900001','0021900002','0021900012']
headers1 = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}
# store player and team results for each gameids as elements of list temp
temp = list()
for i in range(len(gameids)):
    temp.append(boxscoreadvancedv2.BoxScoreAdvancedV2(game_id = gameids[i], headers=headers1))
# manually access elements of list and output to data frame
## there has to be an easier way to access list elements and rowbind the results!!!
df_out0 = temp[0].get_data_frames()
df_player0 = df_out0[0]
df_team0 = df_out0[1]
df_out1 = temp[1].get_data_frames()
df_player1 = df_out1[0]
df_team1 = df_out1[1]
First of all, congratulations on sticking with it and finding a solution on your own! :D
lst_1 = [1, 2, 3, 4]
for i in range(len(lst_1)):
    print(lst_1[i])
can be written as
lst_1 = [1, 2, 3, 4]
for item in lst_1:
    print(item)
Bonus: note the changes I made to the variable names. For a general reference on Python style, see PEP 8.
gameids = ['0021900001','0021900002','0021900012']
headers1 = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}
# store player and team results for each gameids as elements of list temp
temp = list()
for i in range(len(gameids)):
    temp.append(boxscoreadvancedv2.BoxScoreAdvancedV2(game_id = gameids[i], headers=headers1))
can be written as
game_ids = ['0021900001','0021900002','0021900012']
api_headers = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}
api_results = [boxscoreadvancedv2.BoxScoreAdvancedV2(game_id=curr_game_id, headers=api_headers) for curr_game_id in game_ids]
# output player frames
i=0
df_out=[]
df_players=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_players.append(df_out[0]) # index 0 will always contain player frame
df_players = pd.concat(df_players)
print(df_players)
# output team frames
i=0
df_out=[]
df_team=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_team.append(df_out[1]) # index 1 will always contain team frame
df_team = pd.concat(df_team)
print(df_team)
Using the previous two tips, we arrive at the following:
players_lst = []
team_lst = []
for curr_res in api_results:
    curr_dfs = curr_res.get_data_frames()
    players_lst.append(curr_dfs[0])
    team_lst.append(curr_dfs[1])
players_df = pd.concat(players_lst)
team_df = pd.concat(team_lst)
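One detail of the pd.concat calls above that may surprise someone coming from R's rbind: by default, concat keeps each input frame's original index, so row labels repeat in the combined frame. The following is a minimal sketch using two made-up frames (the column names and values are illustrative, not real API output) that shows the difference ignore_index=True makes.

```python
import pandas as pd

# Two toy per-game frames (hypothetical data standing in for API results).
game_a = pd.DataFrame({"GAME_ID": ["001", "001"], "PTS": [10, 12]})
game_b = pd.DataFrame({"GAME_ID": ["002", "002"], "PTS": [7, 20]})

# Default behaviour: each frame keeps its own index, so labels repeat.
stacked = pd.concat([game_a, game_b])

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([game_a, game_b], ignore_index=True)

print(stacked.index.tolist())     # [0, 1, 0, 1]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Whether to pass ignore_index=True depends on whether the per-game row labels carry any meaning for you; for a plain row-bind they usually do not.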
The whole thing can also be written with generators and zip; here it is, broken down slightly for clarity.
import pandas as pd
from nba_api.stats.endpoints.boxscoreadvancedv2 import BoxScoreAdvancedV2
game_ids = ['0021900001', '0021900002', '0021900012']
api_headers = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}
# generator of results from the API
api_results = (BoxScoreAdvancedV2(game_id=curr_game_id, headers=api_headers) for curr_game_id in game_ids)
# generator of lists of DataFrames from the API results
# think of it like: [[Player DF, Team DF], [Player DF, Team DF], ...]
api_res_dfs = (curr_res.get_data_frames() for curr_res in api_results)
# unpacking the size 2 lists of DataFrames into 2 flat lists
# [[Player DF, Team DF], [Player DF, Team DF], ...] -> [Player DF, Player DF, ...], [Team DF, Team DF, ...]
# see https://stackoverflow.com/q/2921847/11301900 for more on the use of the asterisk (*)
players_tupe, team_tupe = zip(*api_res_dfs)
# concatenating the various DataFrames, exactly the same as in your original code
players_df = pd.concat(players_tupe)
team_df = pd.concat(team_tupe)
print(players_df)
print(team_df)
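This relies on the fact that, as you pointed out, the player DataFrame always comes first in the list and the team DataFrame always comes second, and also on the fact that those are the only two items in the result list.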
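To make that dependency concrete, here is a small sketch that needs no network access. The fake_frames helper and its data are hypothetical stand-ins for get_data_frames(); the point is only to show what happens to the zip(*...) unpacking if a result list ever held a third frame, and one possible guard (slicing to the first two items), which is my assumption about how you might want to handle it rather than anything the API prescribes.

```python
import pandas as pd

def fake_frames(game_id, extra=False):
    """Hypothetical stand-in for get_data_frames(): [player DF, team DF, (extra DF)]."""
    players = pd.DataFrame({"GAME_ID": [game_id, game_id], "PLAYER": ["A", "B"]})
    teams = pd.DataFrame({"GAME_ID": [game_id], "TEAM": ["X"]})
    frames = [players, teams]
    if extra:
        frames.append(pd.DataFrame({"GAME_ID": [game_id]}))
    return frames

# Exactly two frames per game: unpacking into two names works.
two_per_game = [fake_frames(g) for g in ("001", "002", "003")]
players_tuple, team_tuple = zip(*two_per_game)
players_df = pd.concat(players_tuple, ignore_index=True)
team_df = pd.concat(team_tuple, ignore_index=True)

# Three frames per game: zip yields three tuples, so unpacking into two names fails.
three_per_game = [fake_frames(g, extra=True) for g in ("001", "002")]
try:
    bad_players, bad_teams = zip(*three_per_game)
    unpack_failed = False
except ValueError:
    unpack_failed = True

# One possible guard: keep only the first two frames of each result before zipping.
safe_players, safe_teams = zip(*(frames[:2] for frames in three_per_game))

print(players_df.shape, team_df.shape, unpack_failed)
```

The explicit loop with curr_dfs[0] and curr_dfs[1] does not have this sensitivity, which is one reason to prefer it if the shape of the result is not guaranteed.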
Let me know if you have any questions :)