I have a DataFrame like this:
id date
1 01-02-2013
2 01-06-2013
3 05-31-2013
4 07-06-2013
and I want to build a matrix that shows, for each id, the time elapsed (in days) between that id's date and the dates of all the others, i.e. something like this:
1 2 3 4
1 0 4 -149 -185
2 4 0 -145 -181
....
Thx
# Parse the month-first date strings into datetimes, then index the frame by id
df['date'] = pd.to_datetime(df['date'], format='%m-%d-%Y')
df.set_index('id', inplace=True)
You can then subtract the whole date column from each of its values, which yields one row of differences per id:
df['date'].apply(lambda x:x-df['date'])
id        1        2         3         4
id
1    0 days  -4 days -149 days -185 days
2    4 days   0 days -145 days -181 days
3  149 days 145 days    0 days  -36 days
4  185 days 181 days   36 days    0 days
If you don't want the "days" string displayed, you can use the dt.days attribute to get the number of days as integers:
df['date'].apply(lambda x:x-df['date']).apply(lambda x: x.dt.days)
id    1    2    3    4
id
1     0   -4 -149 -185
2     4    0 -145 -181
3   149  145    0  -36
4   185  181   36    0
Finally, you can use the .values attribute if you want a NumPy array:
df['date'].apply(lambda x:x-df['date']).apply(lambda x: x.dt.days).values
array([[   0,   -4, -149, -185],
       [   4,    0, -145, -181],
       [ 149,  145,    0,  -36],
       [ 185,  181,   36,    0]], dtype=int64)
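As an alternative to the nested apply calls, the same matrix can probably be built in one step with NumPy broadcasting. This is only a sketch, not part of the approach above: the sample frame is reconstructed from the question, and the names dates, delta, days and result are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question (assumed month-first dates)
df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'date': ['01-02-2013', '01-06-2013', '05-31-2013', '07-06-2013'],
})
df['date'] = pd.to_datetime(df['date'], format='%m-%d-%Y')
df = df.set_index('id')

# Pull the dates out as a 1-D datetime64 array
dates = df['date'].to_numpy()

# Broadcast a column vector against a row vector:
# delta[i, j] = dates[i] - dates[j], the same sign convention as the apply version
delta = dates[:, None] - dates[None, :]

# Floor-dividing by one day turns the timedeltas into plain integers
days = delta // np.timedelta64(1, 'D')

# Wrap the array back into a DataFrame labelled by id on both axes
result = pd.DataFrame(days, index=df.index, columns=df.index)
print(result)
```

This avoids calling a Python lambda once per row, which may matter for larger frames, but note that the full n-by-n matrix is still held in memory.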