Sort the list of dictionaries based on date column of dataframe in pandas

Danish

I have an input list and a DataFrame as shown below.

[{"type": "linear",
  "from": "2020-02-04T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "quadratic",
  "from": "2020-02-03T20:00:00.000Z",
  "to": "2020-02-10T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "polynomial",
  "from": "2020-02-05T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  }]

df:

Date                t_factor     
2020-02-01             5             
2020-02-02             23              
2020-02-03             14           
2020-02-04             23
2020-02-05             23  
2020-02-06             23          
2020-02-07             30            
2020-02-08             29            
2020-02-09             100
2020-03-10             38
2020-03-11             38               
2020-03-12             38                    
2020-03-13             70           
2020-03-14             70 

Step 1: Sort the list by the value of the "from" key in each dictionary.

[{"type": "quadratic",
  "from": "2020-02-03T20:00:00.000Z",
  "to": "2020-02-10T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "linear",
  "from": "2020-02-04T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "polynomial",
  "from": "2020-02-05T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  }]

Step 2: Add a dictionary whose "from" value is the minimum date of df and whose "to" value is the "from" date of the first dictionary in the sorted list, with "days" = 0 and "coef": [0.1,0.1,0.1,0.1,0.1,0.1].

{"type": "df_first",
      "from": "2020-02-01T20:00:00.000Z",
      "to": "2020-02-03T20:00:00.000Z",
      "days":0,
      "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
      }

Step 3: Add a dictionary whose "from" value is 7 days after the minimum date of df and whose "to" value is one day after its "from".

{"type": "df_mid",
      "from": "2020-02-08T20:00:00.000Z",
      "to": "2020-02-09T20:00:00.000Z",
      "days":0,
      "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
      }

Step 4: Add a dictionary whose "from" value is the maximum date of df and whose "to" value is the same as its "from".

{"type": "df_last",
      "from": "2020-02-14T20:00:00.000Z",
      "to": "2020-02-14T20:00:00.000Z",
      "days":0,
      "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
      }

Step 5: Sort all the dictionaries by their "from" date.

Expected Output:

[{"type": "df_first",
  "from": "2020-02-01T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":0,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "quadratic",
  "from": "2020-02-03T20:00:00.000Z",
  "to": "2020-02-10T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "linear",
  "from": "2020-02-04T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "polynomial",
  "from": "2020-02-05T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "df_mid",
  "from": "2020-02-08T20:00:00.000Z",
  "to": "2020-02-09T20:00:00.000Z",
  "days":0,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "df_last",
  "from": "2020-02-14T20:00:00.000Z",
  "to": "2020-02-14T20:00:00.000Z",
  "days":0,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  }]

Step 6:

Replace the "to" value of each dictionary with the "from" value of the next dictionary. The "to" value of the last dictionary stays as it is.

Expected Final output:

[{"type": "df_first",
  "from": "2020-02-01T20:00:00.000Z",
  "to": "2020-02-03T20:00:00.000Z",
  "days":0,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "quadratic",
  "from": "2020-02-03T20:00:00.000Z",
  "to": "2020-02-04T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "linear",
  "from": "2020-02-04T20:00:00.000Z",
  "to": "2020-02-05T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "polynomial",
  "from": "2020-02-05T20:00:00.000Z",
  "to": "2020-02-08T20:00:00.000Z",
  "days":3,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "df_mid",
  "from": "2020-02-08T20:00:00.000Z",
  "to": "2020-02-14T20:00:00.000Z",
  "days":0,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  },
 {"type": "df_last",
  "from": "2020-02-14T20:00:00.000Z",
  "to": "2020-02-14T20:00:00.000Z",
  "days":0,
  "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
  }]

Shubham Sharma

Define a function add_dct that takes a list of dictionaries lst together with _type, _from and _to, and appends a new dictionary to lst:

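import pandas as pd

# Assumes df['Date'] already has a datetime dtype (e.g. via pd.to_datetime)
dmin, dmax = df['Date'].min(), df['Date'].max()

def add_dct(lst, _type, _from, _to):
    # Strings are stored unchanged; Timestamps are formatted with the hour fixed at 20
    lst.append({
        'type': _type,
        'from': _from if isinstance(_from, str) else _from.strftime("%Y-%m-%dT20:%M:%S.000Z"),
        'to': _to if isinstance(_to, str) else _to.strftime("%Y-%m-%dT20:%M:%S.000Z"),
        'days': 0,
        "coef":[0.1,0.1,0.1,0.1,0.1,0.1]
    })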
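
The snippet above relies on df and lst already existing, with the Date column holding real datetimes rather than strings. As a rough sketch of that setup, rebuilt from the data in the question (the pd.to_datetime conversion is an assumption about how the column was loaded), something like this would be run first:

```python
import pandas as pd

# Rebuild the DataFrame from the question; dates start out as plain strings.
df = pd.DataFrame({
    "Date": [
        "2020-02-01", "2020-02-02", "2020-02-03", "2020-02-04", "2020-02-05",
        "2020-02-06", "2020-02-07", "2020-02-08", "2020-02-09",
        "2020-03-10", "2020-03-11", "2020-03-12", "2020-03-13", "2020-03-14",
    ],
    "t_factor": [5, 23, 14, 23, 23, 23, 30, 29, 100, 38, 38, 38, 70, 70],
})

# Assumption: the column is converted to datetime64 so that min()/max()
# return Timestamps, which support strftime and Timedelta arithmetic.
df["Date"] = pd.to_datetime(df["Date"])

# The input list from the question, unchanged.
lst = [
    {"type": "linear", "from": "2020-02-04T20:00:00.000Z",
     "to": "2020-02-03T20:00:00.000Z", "days": 3, "coef": [0.1] * 6},
    {"type": "quadratic", "from": "2020-02-03T20:00:00.000Z",
     "to": "2020-02-10T20:00:00.000Z", "days": 3, "coef": [0.1] * 6},
    {"type": "polynomial", "from": "2020-02-05T20:00:00.000Z",
     "to": "2020-02-03T20:00:00.000Z", "days": 3, "coef": [0.1] * 6},
]
```

If the Date column were left as strings, min() and max() would still work on this data, but the strftime and pd.Timedelta calls used later would fail.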

Then follow these steps, according to your requirements:

# STEP 1
lst = sorted(lst, key=lambda d: pd.Timestamp(d['from']))

# STEP 2
add_dct(lst, 'df_first', dmin, lst[0]['from'])

# STEP 3
add_dct(lst, 'df_mid', dmin + pd.Timedelta(days=7), dmin + pd.Timedelta(days=8))

# STEP 4
add_dct(lst, 'df_last', dmax, dmax)

# STEP 5
lst = sorted(lst, key=lambda d: pd.Timestamp(d['from']))

Result (after step 5):

[{'type': 'df_first',
  'from': '2020-02-01T20:00:00.000Z',
  'to': '2020-02-03T20:00:00.000Z',
  'days': 0,
  'coef': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]},
 {'type': 'quadratic',
  'from': '2020-02-03T20:00:00.000Z',
  'to': '2020-02-10T20:00:00.000Z',
  'days': 3,
  'coef': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]},
 {'type': 'linear',
  'from': '2020-02-04T20:00:00.000Z',
  'to': '2020-02-03T20:00:00.000Z',
  'days': 3,
  'coef': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]},
 {'type': 'polynomial',
  'from': '2020-02-05T20:00:00.000Z',
  'to': '2020-02-03T20:00:00.000Z',
  'days': 3,
  'coef': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]},
 {'type': 'df_mid',
  'from': '2020-02-08T20:00:00.000Z',
  'to': '2020-02-09T20:00:00.000Z',
  'days': 0,
  'coef': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]},
 {'type': 'df_last',
  'from': '2020-03-14T20:00:00.000Z',
  'to': '2020-03-14T20:00:00.000Z',
  'days': 0,
  'coef': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]}]
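
The result above stops at step 5, so the "to" values are not yet chained as step 6 of the question asks. A minimal sketch of that last step follows; the helper name chain_to_from is hypothetical, and the three-entry list is a shortened stand-in for the sorted lst:

```python
def chain_to_from(dicts):
    """Return copies in which each 'to' equals the next entry's 'from'.

    The last entry keeps its own 'to'. Copies are shallow, so nested
    values such as 'coef' lists are shared with the input.
    """
    out = [dict(d) for d in dicts]
    # Pair every entry with its successor; the last entry has no pair.
    for cur, nxt in zip(out, out[1:]):
        cur["to"] = nxt["from"]
    return out


# Shortened stand-in for the list produced by step 5 (already sorted by 'from').
sorted_lst = [
    {"type": "df_first", "from": "2020-02-01T20:00:00.000Z",
     "to": "2020-02-03T20:00:00.000Z"},
    {"type": "quadratic", "from": "2020-02-03T20:00:00.000Z",
     "to": "2020-02-10T20:00:00.000Z"},
    {"type": "df_last", "from": "2020-03-14T20:00:00.000Z",
     "to": "2020-03-14T20:00:00.000Z"},
]

final = chain_to_from(sorted_lst)
```

With the full list from step 5, the call would be lst = chain_to_from(lst). Returning copies rather than mutating in place keeps the step 5 result available for comparison.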
