when using group_by: TypeError: incompatible index of inserted column with frame index

A7DC Published at Dev

A7DC

I have a df that I've read from sql:

      id  stock_id symbol        date   open   high      low  close  volume
0      1        35   ABSI  2022-09-28   3.06   3.33   3.0400   3.27  217040
1      2        35   ABSI  2022-09-29   3.19   3.19   3.0300   3.12  187309
2      3        35   ABSI  2022-09-30   3.11   3.27   3.0700   3.13  196566
3      4        35   ABSI  2022-10-03   3.16   3.16   2.8600   2.97  310441
4      5        35   ABSI  2022-10-04   3.04   3.37   2.9600   3.27  361082
..   ...       ...    ...         ...    ...    ...      ...    ...     ...
383  384        16    VVI  2022-10-03  31.93  33.85  31.3050  33.60  151357
384  385        16    VVI  2022-10-04  34.41  35.46  34.1900  35.39  105773
385  386        16    VVI  2022-10-05  34.67  35.30  34.5000  34.86   59605
386  387        16    VVI  2022-10-06  34.80  35.14  34.3850  34.50   55323
387  388        16    VVI  2022-10-07  33.99  33.99  33.3409  33.70   45187

[388 rows x 9 columns]

I then try and get the average of the last 5 days and add it to a new column:

df['volume_5_day'] = df.groupby('stock_id')['volume'].rolling(5).mean()

Which gives me the following error:

Traceback (most recent call last):
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 11003, in _reindex_for_setitem
    reindexed_value = value.reindex(index)._values
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/series.py", line 4672, in reindex
    return super().reindex(**kwargs)
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 4966, in reindex
    return self._reindex_axes(
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 4981, in _reindex_axes
    new_index, indexer = ax.reindex(
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 4237, in reindex
    target = self._wrap_reindex_result(target, indexer, preserve_names)
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 2520, in _wrap_reindex_result
    target = MultiIndex.from_tuples(target)
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 204, in new_meth
    return meth(self_or_cls, *args, **kwargs)
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 559, in from_tuples
    arrays = list(lib.tuples_to_object_array(tuples).T)
  File "pandas/_libs/lib.pyx", line 2930, in pandas._libs.lib.tuples_to_object_array
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/dan/Documents/code/wolfhound/add_indicators_daily.py", line 10, in <module>
    df['volume_10_day'] = df.groupby('stock_id')['volume'].rolling(1).mean()
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 3655, in __setitem__
    self._set_item(key, value)
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 3832, in _set_item
    value = self._sanitize_column(value)
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 4535, in _sanitize_column
    return _reindex_for_setitem(value, self.index)
  File "/home/dan/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 11010, in _reindex_for_setitem
    raise TypeError(
TypeError: incompatible index of inserted column with frame index

Any ideas what's going wrong here? Previously this worked and now it's throwing an error - and I can't seem to figure out why

ouroboros1

Chain Series.to_numpy to add the values as a np.array and make sure to add sort=False inside df.groupby:

df['volume_5_day'] = df.groupby('stock_id', sort=False)['volume']\
    .rolling(5).mean().to_numpy()

print(df)

      id  stock_id symbol        date  ...      low  close  volume  volume_5_day
0      1        35   ABSI  2022-09-28  ...   3.0400   3.27  217040           NaN
1      2        35   ABSI  2022-09-29  ...   3.0300   3.12  187309           NaN
2      3        35   ABSI  2022-09-30  ...   3.0700   3.13  196566           NaN
3      4        35   ABSI  2022-10-03  ...   2.8600   2.97  310441           NaN
4      5        35   ABSI  2022-10-04  ...   2.9600   3.27  361082      254487.6
383  384        16    VVI  2022-10-03  ...  31.3050  33.60  151357           NaN
384  385        16    VVI  2022-10-04  ...  34.1900  35.39  105773           NaN
385  386        16    VVI  2022-10-05  ...  34.5000  34.86   59605           NaN
386  387        16    VVI  2022-10-06  ...  34.3850  34.50   55323           NaN
387  388        16    VVI  2022-10-07  ...  33.3409  33.70   45187       83449.0

Your initial approach fails, because the df.groupby method that you are using, returns a pd.Series with a different index than your df. E.g.:

print(df.groupby('stock_id')['volume'].rolling(5).mean().index)
MultiIndex([(16, 383),
            (16, 384),
            (16, 385),
            (16, 386),
            (16, 387),
            (35,   0),
            (35,   1),
            (35,   2),
            (35,   3),
            (35,   4)],
           names=['stock_id', None])

So, it is saying it is unable to map this onto:

print(df.index)
Int64Index([0, 1, 2, 3, 4, 383, 384, 385, 386, 387], dtype='int64')

With a np.array you don't have this problem. You could also have used:

df['volume_5_day'] = df.groupby('stock_id', as_index=False)['volume']\
    .rolling(5).mean()['volume']

In this case, you don't need to add sort=False, as it will match correctly on the index values.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at2022-10-10

Comments

0 comments

when using group_by: TypeError: incompatible index of inserted column with frame index

when using group_by: TypeError: incompatible index of inserted column with frame index

pump.io port in URL

Loopback Error: connect ECONNREFUSED 127.0.0.1:3306 (MAMP)

Can't pre-populate phone number and message body in SMS link on iPhones when SMS app is not running in the background

How to import an asset in swift using Bundle.main.path() in a react-native native module

Failed to listen on localhost:8000 (reason: Cannot assign requested address)

Spring Boot JPA PostgreSQL Web App - Internal Authentication Error

ngClass error (Can't bind ngClass since it isn't a known property of div) in Angular 11.0.3

Using Response.Redirect with Friendly URLS in ASP.NET

Can a 32-bit antivirus program protect you from 64-bit threats

Double spacing in rmarkdown pdf

How to fix "pickle_module.load(f, **pickle_load_args) _pickle.UnpicklingError: invalid load key, '<'" using YOLOv3?

3D Touch Peek Swipe Like Mail

Bootstrap 5 Static Modal Still Closes when I Click Outside

Assembly definition can't resolve namespaces from external packages

Vector input in shiny R and then use it

Emulator wrong screen resolution in Android Studio 1.3

Svchost high CPU from Microsoft.BingWeather app errors

Graphics Context misaligned on first paint

Python connect to firebird docker database

Is this docker-for-mac password dialog legit?

How to save models trained locally in Amazon SageMaker?