How to create a nested dictionary from a csv file with N rows in Python

shayelk

I was looking for a way to read a csv file with an unknown number of columns into a nested dictionary. For example, for input of the form

file.csv:
1,  2,  3,  4
1,  6,  7,  8
9, 10, 11, 12

I want a dictionary of the form:

{1:{2:{3:4}, 6:{7:8}}, 9:{10:{11:12}}}

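The goal is fast lookup of a value from the csv file: each level of the dictionary is an O(1) hash lookup, so the cost of a search depends on the number of columns, not the number of rows. Building the dictionary may take a relatively long time, which is acceptable because my application creates it only once but searches it millions of times.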
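To make the lookup side concrete, here is a minimal sketch of how such a dictionary could be searched. The helper name `nested_lookup` and its `default` parameter are hypothetical (they are not part of the original post); the idea is simply to follow one key per nesting level.

```python
def nested_lookup(tree, keys, default=None):
    """Follow `keys` through nested dicts; return `default` if a key is missing."""
    node = tree
    for key in keys:
        # Stop early if we hit a leaf value or an unknown key
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


table = {1: {2: {3: 4}, 6: {7: 8}}, 9: {10: {11: 12}}}
print(nested_lookup(table, [1, 6, 7]))   # 8
print(nested_lookup(table, [9, 10]))     # {11: 12}
print(nested_lookup(table, [1, 5, 7]))   # None
```

Passing fewer keys than there are levels returns the remaining sub-dictionary, which can be handy for inspecting all rows that share a prefix.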

I also wanted an option to name the relevant columns, so that I can ignore unnecessary ones.
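As an illustration of selecting columns by name, the following sketch uses the standard library's `csv.DictReader`. The header names `a`, `b`, `c`, `d` and the in-memory sample are assumptions made for the example; a real file would need a header line and would be opened with `open(path, newline='')`.

```python
import csv
import io

# Hypothetical sample with a header line (stands in for a real file object)
sample = io.StringIO("a,b,c,d\n1,2,3,4\n1,6,7,8\n9,10,11,12\n")
wanted = ["a", "c", "d"]  # hypothetical column names, in nesting order

reader = csv.DictReader(sample)
missing = [name for name in wanted if name not in reader.fieldnames]
if missing:
    raise ValueError("Columns not found in csv: {}".format(missing))

# Keep only the wanted columns, in the requested order, converted to floats
selected = [[float(row[name]) for name in wanted] for row in reader]
print(selected)
# [[1.0, 3.0, 4.0], [1.0, 7.0, 8.0], [9.0, 11.0, 12.0]]
```

Each inner list can then be nested in the order given by `wanted`, so the column order in the file does not matter.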

shayelk

Here is what I came up with. Feel free to comment and suggest improvements.

import csv
import itertools

def list_to_dict(lst):
    # Takes a list and recursively turns it into a nested dictionary, where
    # the first element is a key whose value is the dictionary created from the
    # rest of the list. The last element in the list becomes the value of the
    # innermost dictionary. Note that the input list is modified in place.
    # INPUTS:
    #   lst - a list (e.g. of strings or floats)
    # OUTPUT:
    #   A nested dictionary
    # EXAMPLE RUN:
    #   >>> lst = [1, 2, 3, 4]
    #   >>> list_to_dict(lst)
    #   {1: {2: {3: 4}}}
    if len(lst) == 1:
        return lst[0]
    else:
        data_dict = {lst[-2]: lst[-1]}
        lst.pop()
        lst[-1] = data_dict
        return list_to_dict(lst)


def dict_combine(d1, d2):
    # Combines two nested dictionaries into one.
    # INPUTS:
    #   d1, d2: Two nested dictionaries. The function might change d1 and d2, 
    #           therefore if the input dictionaries are not to be mutated, 
    #           you should pass copies of d1 and d2.
    #           Note that the function works more efficiently if d1 is the 
    #           bigger dictionary.
    # OUTPUT:
    #   The combined dictionary
    # EXAMPLE RUN:
    #   >>> d1 = {1: {2: {3: 4, 5: 6}}}
    #   >>> d2 = {1: {2: {7: 8}, 9: {10: 11}}}
    #   >>> dict_combine(d1, d2)
    #   {1: {2: {3: 4, 5: 6, 7: 8}, 9: {10: 11}}}

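    for key in d2:
        if key in d1 and isinstance(d1[key], dict) and isinstance(d2[key], dict):
            # Both sides are branches: merge them recursively
            d1[key] = dict_combine(d1[key], d2[key])
        else:
            # New key, or at least one side is a leaf value: d2's entry wins.
            # Without this check, duplicate rows would raise a TypeError.
            d1[key] = d2[key]
    return d1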


def csv_to_dict(csv_file_path, params=None, n_row_max=None):
    # NAME: csv_to_dict
    #
    # DESCRIPTION: Reads a csv file and turns relevant columns into a nested 
    #              dictionary.
    #
    # INPUTS:
    #   csv_file_path: The full path to the data file
    #   params:        A list of relevant column names. The resulting dictionary
    #                  will be nested in the same order as parameters in 'params'.
    #                  Default is None (read all columns)
    #   n_row_max:     The maximum number of rows to read. Default is None
    #                  (read all rows)
    #
    # OUTPUT:
    #   A nested dictionary containing all the relevant csv data

    csv_dictionary = {}

    # newline='' is what the csv module documentation recommends for file objects
    with open(csv_file_path, 'r', newline='') as csv_file:
        csv_data = csv.reader(csv_file, delimiter=',')
        names = next(csv_data)           # Read title line
        if not params:
            # A list of column indices to read from csv (all columns,
            # including the last one)
            relevant_param_indices = list(range(len(names)))
        else:
            # A list of column indices to read from csv
            relevant_param_indices = []  
            for name in params:
                if name not in names:
                    # Parameter name is not found in title line
                    raise ValueError('Could not find {} in csv file'.format(name))
                else:
                    # Get indices of the relevant columns
                    relevant_param_indices.append(names.index(name))
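        # The title line was already consumed by next(), so start from the
        # first data row and read at most n_row_max rows
        for row in itertools.islice(csv_data, n_row_max):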
            # Get a list containing relevant columns only
            relevant_cols = [row[i] for i in relevant_param_indices] 
            # Convert the strings to numbers (optional; skip this step to
            # keep the values as strings)
            float_row = [float(element) for element in relevant_cols]
            # Build nested dictionary
            csv_dictionary = dict_combine(csv_dictionary, list_to_dict(float_row))  

        return csv_dictionary
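
For comparison, the same structure can also be built in a single pass, without recursion or merging. This is a minimal sketch rather than the code above: `rows_to_nested` is a hypothetical helper name, it takes rows that are already parsed, and it assumes every row has at least two values and that no key path is used both as a branch and as a leaf.

```python
def rows_to_nested(rows):
    """Build a nested dict from an iterable of rows (hypothetical helper).

    All values except the last two select (or create) a branch; the
    second-to-last value is the innermost key and the last is its value.
    """
    root = {}
    for row in rows:
        *path, last_key, value = row
        node = root
        for key in path:
            # Descend into the existing branch, creating it if missing
            node = node.setdefault(key, {})
        node[last_key] = value
    return root


rows = [[1, 2, 3, 4], [1, 6, 7, 8], [9, 10, 11, 12]]
nested = rows_to_nested(rows)
print(nested)
# {1: {2: {3: 4}, 6: {7: 8}}, 9: {10: {11: 12}}}
```

Because each row only touches the branches along its own path, the build cost grows with the number of rows times the number of columns, and a later row with the same key path simply overwrites the earlier value.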
