Seaborn fails to plot a heatmap for a specific feature (Titanic dataset)

German Brunini

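I'm working with some neural networks and am having trouble plotting a correlation heatmap for the Titanic dataset with seaborn. In short, the 'n_siblings_spouses' feature seems to cause a problem during plotting. I don't know whether the problem comes from the feature itself (the spacing, perhaps?) or from something inherent in seaborn.

Can I fix the problem without removing the feature from the dataset?

Here is the MWE. Thanks in advance!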

from __future__ import absolute_import,division,print_function,unicode_literals
import numpy as np 
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
from matplotlib import rc, font_manager
%matplotlib inline

from IPython.display import clear_output
from six.moves import urllib
import tensorflow.compat.v2.feature_column as fc 
import tensorflow as tf 
import seaborn as sns

rc('text', usetex=True)
matplotlib.rcParams['text.latex.preamble'] = [r'\usepackage{amsmath}']

# only if needed
#!apt install texlive-fonts-recommended texlive-fonts-extra cm-super dvipng
plt.rc('font', family='serif')

# URL address of data

# Downloading data
train_file_path = tf.keras.utils.get_file("train.csv", TRAIN_DATA_URL)

# Setting numpy default values.
np.set_printoptions(precision=3, suppress=True)

# Reading data
data_train = pd.read_csv(train_file_path)

print("\n TRAIN DATA SET")

def heatMap(df):
    #Create Correlation df
    corr = df.corr()
    #Plot figsize
    fig, ax = plt.subplots(figsize=(10, 10))
    #Generate Color Map
    colormap = sns.diverging_palette(220, 10, as_cmap=True)
    #Generate Heat Map, allow annotations and place floats in map
    sns.heatmap(corr, cmap=colormap, annot=True, fmt=".2f")
    #Apply xticks
    plt.xticks(range(len(corr.columns)), corr.columns);
    #Apply yticks
    plt.yticks(range(len(corr.columns)), corr.columns)
    #show plot
    plt.show()


Here is the error I get when I try to run the heatMap function (I'm working in Colab, but it also happens in the console).

CalledProcessError                        Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/matplotlib/ in _run_checked_subprocess(self, command, tex)
    305                                              cwd=self.texcache,
--> 306                                              stderr=subprocess.STDOUT)
    307         except FileNotFoundError as exc:

22 frames
CalledProcessError: Command '['latex', '-interaction=nonstopmode', '--halt-on-error', '/root/.cache/matplotlib/tex.cache/bf616eae1512bede263889c8e1d8fb21.tex']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/matplotlib/ in _run_checked_subprocess(self, command, tex)
    317                     prog=command[0],
    318                     tex=tex.encode('unicode_escape'),
--> 319                     exc=exc.output.decode('utf-8'))) from exc
    320         _log.debug(report)
    321         return report

RuntimeError: latex was not able to process the following string:

Here is the full report generated by latex:
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=latex)
 restricted \write18 enabled.
entering extended mode
LaTeX2e <2017-04-15>
Babel <3.18> and hyphenation patterns for 3 language(s) loaded.
Document Class: article 2014/09/29 v1.4h Standard LaTeX document class

Package geometry Warning: Over-specification in `h'-direction.
    `width' (5058.9pt) is ignored.

Package geometry Warning: Over-specification in `v'-direction.
    `height' (5058.9pt) is ignored.

) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty
For additional information on amsmath, use the `?' option.
*geometry* driver: auto-detecting
*geometry* detected driver: dvips
! Missing $ inserted.
<inserted text> 
l.19 {\rmfamily n_
No pages of output.
Transcript written on bf616eae1512bede263889c8e1d8fb21.log.

<Figure size 720x720 with 2 Axes>

While looking into this, I came across information that Colab needs some TeX-related packages to be installed. There was also a great answer on SO.

You need to install the following:

  • ! sudo apt-get install texlive-latex-recommended
  • ! sudo apt-get install dvipng texlive-fonts-recommended
  • ! wget
  • ! unzip -d /tmp/type1cm
  • ! cd /tmp/type1cm/type1cm/ && sudo latex type1cm.ins
  • ! sudo mkdir /usr/share/texmf/tex/latex/type1cm
  • ! sudo cp /tmp/type1cm/type1cm/type1cm.sty /usr/share/texmf/tex/latex/type1cm
  • ! sudo texhash
  • ! sudo apt install cm-super

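As a rough sketch (not part of the original answer), it can help to check from Python whether the TeX executables are on the PATH before turning on LaTeX text rendering. The helper name tex_toolchain_available and its default tool list are assumptions made for illustration. The traceback in the question shows matplotlib invoking latex, and dvipng is one of the packages installed in the steps above.

```python
import shutil

import matplotlib


def tex_toolchain_available(tools=("latex", "dvipng")):
    """Return True only if every named executable is found on PATH.

    The default tool list is an assumption for this sketch; adjust it to
    whatever your matplotlib backend actually needs.
    """
    return all(shutil.which(tool) is not None for tool in tools)


# Enable LaTeX text rendering only when the toolchain is installed;
# otherwise keep matplotlib's built-in text renderer.
matplotlib.rcParams["text.usetex"] = tex_toolchain_available()
print("usetex enabled:", matplotlib.rcParams["text.usetex"])
```

Run this after the installation steps: if it still reports False, the LaTeX lines in the notebook are better left commented out.
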
from __future__ import absolute_import,division,print_function,unicode_literals
import numpy as np 
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
# from matplotlib import rc, font_manager
%matplotlib inline

from IPython.display import clear_output
from six.moves import urllib
import tensorflow.compat.v2.feature_column as fc 
import tensorflow as tf 
import seaborn as sns

# rc('text', usetex=True)
# matplotlib.rcParams['text.latex.preamble'] = [r'\usepackage{amsmath}']

# only if needed
#!apt install texlive-fonts-recommended texlive-fonts-extra cm-super dvipng
# plt.rc('font', family='serif')

# URL address of data

# Downloading data
train_file_path = tf.keras.utils.get_file("/content/sample_data/train.csv", TRAIN_DATA_URL)

# Setting numpy default values.
np.set_printoptions(precision=3, suppress=True)

# Reading data
data_train = pd.read_csv(train_file_path)

print("\n TRAIN DATA SET")

def heatMap(df):
    #Create Correlation df
    corr = df.corr()
    #Plot figsize
    fig, ax = plt.subplots(figsize=(10, 10))
    #Generate Color Map
    colormap = sns.diverging_palette(220, 10, as_cmap=True)
    #Generate Heat Map, allow annotations and place floats in map
    sns.heatmap(corr, cmap=colormap, annot=True, fmt=".2f")
    #Apply xticks
    plt.xticks(range(len(corr.columns)), corr.columns);
    #Apply yticks
    plt.yticks(range(len(corr.columns)), corr.columns)
    #show plot
    plt.show()
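
Note that the code above has the usetex lines commented out, so the tick labels never go through LaTeX. If LaTeX rendering is wanted, the "Missing $ inserted" message at "n_" in the traceback points to the underscore in the label: outside math mode, LaTeX does not accept a bare "_". A minimal sketch of one possible workaround, assuming the column names are the only labels containing underscores, is to escape them in the correlation matrix before plotting. The helper names escape_underscores and tex_safe_corr, and the small demo frame, are made up for illustration.

```python
import pandas as pd


def escape_underscores(label):
    """Escape '_' so LaTeX treats it as a literal character."""
    return str(label).replace("_", r"\_")


def tex_safe_corr(df):
    """Correlation matrix of the numeric columns, with LaTeX-safe labels."""
    # Restrict to numeric columns so string features (e.g. 'sex') are skipped.
    corr = df.select_dtypes(include="number").corr()
    return corr.rename(index=escape_underscores, columns=escape_underscores)


# Small made-up frame standing in for the Titanic training data.
demo = pd.DataFrame({
    "survived": [0, 1, 1, 0],
    "n_siblings_spouses": [1, 1, 0, 3],
    "fare": [7.25, 71.28, 7.92, 21.07],
    "sex": ["male", "female", "female", "male"],
})
corr = tex_safe_corr(demo)
print(list(corr.columns))
```

Passing tex_safe_corr(data_train) to sns.heatmap in place of df.corr() keeps the feature in the plot. The heatmap takes its tick labels from the DataFrame's index and columns, so the separate plt.xticks and plt.yticks calls would also need the escaped names, or could be dropped.
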



