Pandas: add a new column with random numbers to a DataFrame based on multiple conditions in a row

adfly

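I'm a beginner. I've looked around and read a bunch of related questions, but it still isn't clear to me. I know I'm the problem and I'm missing something, but I hope someone can bear with me and help. I'm trying to convert data from one video game (a college basketball sim) into a format consistent with another video game (a pro basketball sim).

I have a DataFrame with the following columns: Name, Pos, Height, Weight, Shot, Points

Values include: Jon Smith, C, 84, 235, Exc, 19.4 and Greg Jones, PG, 72, 187, Average, 12.0

I want to create a new column, "InsideScoring". What I want to do is assign each player a randomly generated number within a certain range, based on the player's position, height, weight, shot rating, and points.

I've tried a number of things, for example: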

df1['InsideScoring'] = 0
df1.loc[(df1.Pos == "C") &
        (df1.Height > 82) &
        (df1.Points > 19.0) &
        (df1.Weight > 229), 'InsideScoring'] = np.random.randint(85,100)

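When I do this, every player who meets the criteria gets the same value between 85 and 100 in the "InsideScoring" column, rather than a random mix of numbers between 85 and 100.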
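The likely cause is that `np.random.randint(85, 100)` called without a `size` argument returns a single integer, which pandas then broadcasts to every selected row. A minimal sketch of one possible fix follows, assuming the column names from the question and a few made-up rows: pass `size=mask.sum()` so that one number is drawn per matching row. Note that the upper bound of `randint` is exclusive, so the values fall in 85..99.

```python
import numpy as np
import pandas as pd

# Made-up sample rows for illustration; column names follow the question.
df1 = pd.DataFrame({
    'Name': ['Jon Smith', 'Greg Jones', 'Al Brown', 'Bo Green'],
    'Pos': ['C', 'PG', 'C', 'C'],
    'Height': [84, 72, 85, 83],
    'Weight': [235, 187, 250, 240],
    'Points': [19.4, 12.0, 21.0, 20.5],
})

# Same four criteria as in the question, stored as a boolean mask.
mask = (
    (df1.Pos == 'C')
    & (df1.Height > 82)
    & (df1.Points > 19.0)
    & (df1.Weight > 229)
)

df1['InsideScoring'] = 0
# size=mask.sum() draws one value per matching row instead of a single scalar.
# The upper bound of randint is exclusive, so values fall in 85..99.
df1.loc[mask, 'InsideScoring'] = np.random.randint(85, 100, size=mask.sum())
```

Rows that do not match the mask keep the initial value of 0.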

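Ultimately, what I want to do is go through the list of players and assign values from different ranges based on these four criteria. Any ideas are appreciated.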

Pandas: create a new column with random values based on a condition

Numpy "where" with multiple conditions

Username

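My suggestion is to use np.select here. You set up your conditions and your outputs, and you're good to go. However, to avoid iterating, and also to avoid assigning the same random value to every row that meets a condition, create arrays of random values as long as the DataFrame and select from those:

Setup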

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Chris', 'John'],
    'Height': [72, 84],
    'Pos': ['PG', 'C'],
    'Weight': [165, 235],
    'Shot': ['Amazing', 'Fair'],
    'Points': [999, 25]
})

    Name  Height Pos  Weight     Shot  Points
0  Chris      72  PG     165  Amazing     999
1   John      84   C     235     Fair      25

Now set up the ranges and conditions (create as many conditions as you need):

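# One boolean Series per rule
cond1 = df.Pos.eq('C') & df.Height.gt(80) & df.Weight.gt(200)
cond2 = df.Pos.eq('PG') & df.Height.lt(80) & df.Weight.lt(200)

# One full-length array of candidates per rule (upper bound is exclusive)
range1 = np.random.randint(85, 100, len(df))
range2 = np.random.randint(50, 85, len(df))

# For each row, take the value from the first rule whose condition is True
df.assign(InsideScoring=np.select([cond1, cond2], [range1, range2]))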

    Name  Height Pos  Weight     Shot  Points  InsideScoring
0  Chris      72  PG     165  Amazing     999             72
1   John      84   C     235     Fair      25             89
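One detail worth keeping in mind: rows that match none of the conditions receive the `default` value of `np.select`, which is 0 unless you pass something else. The sketch below illustrates this with a made-up third player ("Sam") and a sentinel of -1, both of which are assumptions added for illustration, so that players not covered by any rule are easy to spot.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Chris', 'John', 'Sam'],  # 'Sam' is a made-up extra row
    'Height': [72, 84, 78],
    'Pos': ['PG', 'C', 'SF'],
    'Weight': [165, 235, 210],
})

cond1 = df.Pos.eq('C') & df.Height.gt(80) & df.Weight.gt(200)
cond2 = df.Pos.eq('PG') & df.Height.lt(80) & df.Weight.lt(200)

range1 = np.random.randint(85, 100, len(df))
range2 = np.random.randint(50, 85, len(df))

# default=-1 marks rows that no condition covers (here: the SF player).
out = df.assign(
    InsideScoring=np.select([cond1, cond2], [range1, range2], default=-1)
)
```

Filtering on `out.InsideScoring == -1` then lists the players that still need a rule.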

Now verify that this doesn't assign the same value multiple times:

df = pd.concat([df]*5)

... # Setup the ranges and conditions again

df.assign(InsideScoring=np.select([cond1, cond2], [range1, range2]))

    Name  Height Pos  Weight     Shot  Points  InsideScoring
0  Chris      72  PG     165  Amazing     999             56
1   John      84   C     235     Fair      25             96
0  Chris      72  PG     165  Amazing     999             74
1   John      84   C     235     Fair      25             93
0  Chris      72  PG     165  Amazing     999             63
1   John      84   C     235     Fair      25             97
0  Chris      72  PG     165  Amazing     999             55
1   John      84   C     235     Fair      25             95
0  Chris      72  PG     165  Amazing     999             60
1   John      84   C     235     Fair      25             90

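We can see that different random values are assigned, even though every row matches one of the two conditions. This approach is less memory-efficient than iterating and picking random values, because we create many numbers that are never used, but it will still be faster because these are vectorized operations.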
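If the unused numbers are a concern, a middle ground is to loop over the rules (not the rows) and draw only as many values as each rule needs. The following is a rough sketch under stated assumptions, not part of the original answer: `assign_by_rules` is a hypothetical helper name, and it uses NumPy's `default_rng` generator, whose `integers` method also treats the upper bound as exclusive.

```python
import numpy as np
import pandas as pd

def assign_by_rules(df, rules, column, default=0, seed=None):
    """Hypothetical helper: rules is a list of (mask, low, high) tuples.

    Draws integers in [low, high) only for rows a rule actually claims.
    Earlier rules take priority, mirroring np.select's first-match behavior.
    """
    rng = np.random.default_rng(seed)
    values = np.full(len(df), default, dtype=np.int64)
    unassigned = np.ones(len(df), dtype=bool)
    for mask, low, high in rules:
        # Only rows matching this rule that no earlier rule has claimed
        take = np.asarray(mask, dtype=bool) & unassigned
        values[take] = rng.integers(low, high, size=take.sum())
        unassigned &= ~take
    return df.assign(**{column: values})

# Same two players as above, repeated five times.
df = pd.DataFrame({
    'Name': ['Chris', 'John'] * 5,
    'Height': [72, 84] * 5,
    'Pos': ['PG', 'C'] * 5,
    'Weight': [165, 235] * 5,
})

rules = [
    (df.Pos.eq('C') & df.Height.gt(80) & df.Weight.gt(200), 85, 100),
    (df.Pos.eq('PG') & df.Height.lt(80) & df.Weight.lt(200), 50, 85),
]
result = assign_by_rules(df, rules, 'InsideScoring', seed=0)
```

Each rule is still applied with a vectorized assignment, so the Python-level loop runs once per rule rather than once per player.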
