How do I turn a nested dictionary into a plot?

丹尼斯·弗拉尔

I have a nested dictionary that looks like this:

{'Track_108': {'Track_3994': [(1, 6)],
               'Track_4118': [(8, 9)],
               'Track_4306': [(25, 26), (28, 30)]},
 'Track_112': {'Track_4007': [(19, 20)]},
 'Track_121': {'Track_4478': [(102, 104)]},
 'Track_130': {'Track_4068': [(132, 134)]},
 'Track_141': {'Track_5088': [(93, 95)],
               'Track_5195': [(103, 104), (106, 107)]}}

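Each list holds the intervals (durations) of some event. The first number is the "start frame" and the second is the "last frame", both inclusive. So "Track_3994" has one event with a duration of 6 frames.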

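I want to draw a histplot with the event duration on the x-axis and a count on the y-axis. I need one histplot for the whole dictionary and, preferably, also one histplot for each of the tracks you see in the first column.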

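This would be the plot for the whole dictionary. The y-axis shows how many times a duration occurs in the dictionary. In the data I provided there is only one event with a duration of 6, so that bar has a height of 1. The bar at 2 on the x-axis has a height of 5, because there are 5 events with a duration of 2 frames.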

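For the per-track plots, the histogram shows only the duration distribution of that track, so those plots will be much smaller. For example, Track_108 would get a plot with a height of 2 at x=2, a height of 1 at x=3, and a height of 1 at x=6.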

ftjahn8

To handle the calculating and counting, you can use the following:

from typing import Dict, List, Tuple # just typing hints for used/expected types in functions, could be left out

def calculate_track_event_data(data_dict: Dict[str, List[Tuple[int, int]]]) -> Dict[int, int]:
    """
    Counts the durations for a single track sub-dict (a dict of other tracks, each with a list of intervals as specified in the question).
    Returns a dict with duration to count as key-value pairs.
    """
    hist_plot_data = {}
    for track, track_data in data_dict.items():
        for duration_info in track_data:
            duration = duration_info[1] - duration_info[0] + 1  # calculate duration
            try:
                hist_plot_data[duration] += 1  # count up for calculated duration
            except KeyError:
                hist_plot_data[duration] = 1  # add duration if not added yet
    return hist_plot_data


def calculate_top_layer_event_data(data_dict: Dict[str,  Dict[str, List[Tuple[int, int]]]]) -> Dict[int, int]:
    """
    Counts the durations across the entire dict.
    Returns a dict with duration to count as key-value pairs.
    """

    hist_plot_data = {}

    for top_level_track, top_level_track_data in data_dict.items():
        hist_for_track = calculate_track_event_data(top_level_track_data)
        for duration, count in hist_for_track.items():
            try:
                hist_plot_data[duration] += count  # sum up collected count for calculated duration
            except KeyError:
                hist_plot_data[duration] = count  # add duration if not added yet
    return hist_plot_data

For the given dict, this results in:

# Data definition
data = {'Track_108': {'Track_3994': [(1, 6)],
                      'Track_4118': [(8, 9)],
                      'Track_4306': [(25, 26), (28, 30)]},
        'Track_112': {'Track_4007': [(19, 20)]},
        'Track_121': {'Track_4478': [(102, 104)]},
        'Track_130': {'Track_4068': [(132, 134)]},
        'Track_141': {'Track_5088': [(93, 95)],
                      'Track_5195': [(103, 104), (106, 107)]}}

# Call in code:
print(calculate_track_event_data(data['Track_108']))
print(calculate_top_layer_event_data(data))

# Result on output:
# {6: 1, 2: 2, 3: 1}  <-- Result for Track_108
# {6: 1, 2: 5, 3: 4}  <-- Result for complete dictionary
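
As a side note, the try/except counting above can also be written with collections.Counter from the standard library. The following is only a sketch of an equivalent approach, not a replacement for the functions above; the names count_durations and count_all_durations are made up for this illustration.

```python
from collections import Counter

data = {'Track_108': {'Track_3994': [(1, 6)],
                      'Track_4118': [(8, 9)],
                      'Track_4306': [(25, 26), (28, 30)]},
        'Track_112': {'Track_4007': [(19, 20)]},
        'Track_121': {'Track_4478': [(102, 104)]},
        'Track_130': {'Track_4068': [(132, 134)]},
        'Track_141': {'Track_5088': [(93, 95)],
                      'Track_5195': [(103, 104), (106, 107)]}}


def count_durations(sub_tracks):
    """Count event durations for one top-level track (hypothetical helper)."""
    # duration = last frame - start frame + 1, same formula as above
    return Counter(end - start + 1
                   for intervals in sub_tracks.values()
                   for start, end in intervals)


def count_all_durations(nested):
    """Sum the per-track counts over the whole dictionary (hypothetical helper)."""
    total = Counter()
    for sub_tracks in nested.values():
        total.update(count_durations(sub_tracks))  # update() adds counts
    return total
```

A Counter is a dict subclass, so the results can be used anywhere the plain dicts from the functions above are used.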

To visualize the results, you can use one of the Python plotting libraries such as matplotlib (see, for example, "How to plot a histogram using Matplotlib in Python with a list of data?" or https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html).
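
Since the durations are already counted, a bar chart of the counts may be simpler than calling hist on raw data. The following is a minimal sketch under that assumption; bar_data and plot_counts are hypothetical helper names, and the sample counts are the results shown above.

```python
def bar_data(counts):
    """Turn a {duration: count} dict into sorted x positions and bar heights."""
    durations = sorted(counts)
    heights = [counts[d] for d in durations]
    return durations, heights


def plot_counts(counts, title):
    """Draw one bar per duration; returns the matplotlib figure."""
    import matplotlib.pyplot as plt  # imported here so bar_data works without matplotlib

    durations, heights = bar_data(counts)
    fig, ax = plt.subplots()
    ax.bar(durations, heights)
    ax.set_xticks(durations)          # only label durations that actually occur
    ax.set_xlabel("Duration (frames)")
    ax.set_ylabel("Count")
    ax.set_title(title)
    return fig


# Counts taken from the output above
whole_dict_counts = {6: 1, 2: 5, 3: 4}
track_108_counts = {6: 1, 2: 2, 3: 1}
```

Calling plot_counts(whole_dict_counts, "All tracks") followed by matplotlib.pyplot.show() should display the plot for the whole dictionary. For the per-track plots, the same function can be called in a loop over the top-level keys, passing the counts computed for each track.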
