I would like to forward-fill (ffill) NaN values in a NumPy array, repeating the last non-NaN value at most N times. If a run of NaNs is longer than N, the remaining NaNs in that run should be filled with zero. How do I do this in pure NumPy without iteration?
import numpy as np
n = 2
arr = np.array([np.nan, 0, 0, np.nan, 5, 4, 4, np.nan, np.nan, np.nan, 1, 5, 3, np.nan, 2, np.nan, np.nan])
def ffill(arr: np.array, n: int):
    pass
    return arr
result = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 4.0, 4.0, 4.0, 4.0, 0.0, 1.0, 5.0, 3.0, 3.0, 2.0, 2.0, 2.0])
For example, with n = 2 the value 4 is forward-filled twice and the remaining NaN becomes 0: [... 4, np.nan, np.nan, np.nan ...] -> [... 4, 4, 4, 0 ...]
[Solution]
Thanks to @Homer512 for the answer. I adapted it to avoid the Python loop when n is very large.
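def ffill(arr: np.ndarray, n: int):
    # A leading NaN has no previous value to copy, so it becomes 0 (modifies the input in place).
    if np.isnan(arr[0]):
        arr[0] = 0
    isnan = np.isnan(arr)
    notnan = ~isnan
    valid = arr[notnan]
    # Index of the most recent non-NaN value for every position.
    indices = np.cumsum(notnan) - 1
    arr = valid[indices]
    # overlimit[c] is True when isnan[c : c + n] are all NaN (requires 1 <= n < arr.size).
    overlimit = np.lib.stride_tricks.sliding_window_view(isnan[:-1][::-1], isnan.size-n)[:, ::-1].all(axis=0)
    # Position c + n must also be NaN to exceed the fill limit.
    overlimit &= isnan[n:]
    indices = np.flatnonzero(overlimit) + n
    arr[indices] = 0
    return arr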
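As a cross-check, here is a minimal sketch of the same idea as a standalone function. It is not the code above verbatim: it assumes NumPy 1.20 or newer (for `sliding_window_view`), windows over `n + 1` mask entries directly instead of reversing the mask, copies the input rather than modifying it, and skips the window step when `n` is at least the array length. The name `ffill_limited` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def ffill_limited(arr: np.ndarray, n: int) -> np.ndarray:
    """Forward-fill each NaN run at most n times, zero the rest (hypothetical helper)."""
    arr = np.array(arr, dtype=float)  # copy, so the caller's array is untouched
    if arr.size == 0:
        return arr
    if np.isnan(arr[0]):
        arr[0] = 0.0  # a leading NaN has nothing to copy from
    isnan = np.isnan(arr)
    notnan = ~isnan
    # cumsum(notnan) - 1 is the position, among the valid values,
    # of the most recent non-NaN value at every index.
    filled = arr[notnan][np.cumsum(notnan) - 1]
    if n < arr.size:
        # Window c covers isnan[c : c + n + 1]. If it is all True, index c + n
        # is at least the (n + 1)-th NaN of its run and must become zero.
        overlimit = sliding_window_view(isnan, n + 1).all(axis=1)
        filled[np.flatnonzero(overlimit) + n] = 0.0
    return filled


arr = np.array([np.nan, 0, 0, np.nan, 5, 4, 4, np.nan, np.nan, np.nan,
                1, 5, 3, np.nan, 2, np.nan, np.nan])
print(ffill_limited(arr, 2).tolist())
```

With the array from the question and `n = 2`, this should reproduce the `result` array given there.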
Here is a trick that works:
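import math

# A leading NaN has no previous value to copy, so set it to 0 first.
if math.isnan(arr[0]):
    arr[0] = 0
isnan = np.isnan(arr)
notnan = ~isnan
valid = arr[notnan]
# Index of the most recent non-NaN value for every position.
indices = np.cumsum(notnan) - 1
arr = valid[indices]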
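To see why the cumulative-sum indexing above performs an unlimited forward fill, it may help to trace it on a tiny array. This is only an illustrative sketch; the sample values are made up and the intermediate results are shown as comments.

```python
import numpy as np

arr = np.array([0.0, np.nan, 5.0, np.nan, np.nan, 4.0])

notnan = ~np.isnan(arr)          # [ True, False,  True, False, False,  True]
valid = arr[notnan]              # [0., 5., 4.]  (only the non-NaN values)
# cumsum counts how many valid values have been seen so far;
# subtracting 1 turns that count into an index into `valid`.
indices = np.cumsum(notnan) - 1  # [0, 0, 1, 1, 1, 2]
filled = valid[indices]          # [0., 0., 5., 5., 5., 4.]
print(filled.tolist())
```

Every NaN position inherits the index of the last valid value before it, which is why the first element must not be NaN: its index would be -1.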
Next, we need the positions where more than n NaNs occur in a row. We could use
np.convolve(isnan, (1,) * (n + 1), mode='same') > n
to find the indices. But because convolve is centered, it's a bit complicated to find the correct index from the convolution. Let's do it manually instead. Yes, this will use an iteration, but only n iterations, independent of the array length:
overlimit = np.copy(isnan[n:])
for i in range(1, n + 1):
    overlimit &= isnan[n-i:-i]
indices = np.flatnonzero(overlimit) + n
arr[indices] = 0
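The centering problem mentioned for `mode='same'` can probably be sidestepped with `mode='full'`, where output position `k` sums the mask over `k - n` through `k`. The sketch below is not part of the original answer; it assumes that trailing-window reading of `np.convolve`, and the helper name `overlimit_indices` is hypothetical.

```python
import numpy as np


def overlimit_indices(isnan: np.ndarray, n: int) -> np.ndarray:
    """Indices of NaNs that come after n consecutive NaNs (hypothetical helper)."""
    if isnan.size == 0:
        return np.array([], dtype=int)  # np.convolve rejects empty input
    kernel = np.ones(n + 1, dtype=int)
    # In 'full' mode, output[k] sums isnan[k - n : k + 1] (clipped at the left
    # edge), so the first isnan.size entries are trailing-window NaN counts.
    counts = np.convolve(isnan.astype(int), kernel, mode="full")[: isnan.size]
    # A count of n + 1 means the whole window, including index k, is NaN.
    return np.flatnonzero(counts == n + 1)


mask = np.isnan(np.array([0, 0, 0, np.nan, 5, 4, 4, np.nan, np.nan, np.nan,
                          1, 5, 3, np.nan, 2, np.nan, np.nan]))
print(overlimit_indices(mask, 2))  # only index 9 exceeds the limit
```

The returned indices can be used directly in place of `indices` in `arr[indices] = 0`, with no `+ n` offset, because each count is already aligned to the last element of its window.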