NameError about some variables not being defined (even though they are) when importing a module

Javier

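So I am trying to import a module/script (a .py file) into a Jupyter notebook, mostly for readability and conciseness. However, when I try to run the class from the script, I get the following error message: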

NameError                                 Traceback (most recent call last)
<ipython-input-48-4d8cbba46ed0> in <module>()
      8 
      9 test_KMeans = KMeans(k=3, maxiter=1000, tol=1e-9)
---> 10 cluster_center = test_KMeans.fit(X)
     11 clusters = test_KMeans.predict(X)
     12 

~/KMeans.py in fit(self, X)
     42         #Choose k random rows of X as the initial cluster centers.
     43         initial_cluster_centers = []
---> 44 
     45         sample = np.random.randint(0,m,size=k)
     46 

NameError: name 'maxiter' is not defined

Here is my script:

import numpy as np
from sklearn.decomposition import PCA

k = 3
maxiter = 1000
tol = 1e-9

class KMeans:
    """A K-Means object class. Implements basic k-means clustering.

    Attributes:
        k (int): The number of clusters
        maxiter (int): The maximum number of iterations
        tol (float): A convergence tolerance
    """
    def __init__(self, k, maxiter, tol):
        """Set the parameters.

        Parameters:
            k (int): The number of clusters
            maxiter (int): The maximum number of iterations
            tol (float): A convergence tolerance
        """
        k = 3
        maxiter = 1000
        tol = 1e-9

        self.k = k   # Initialize some attributes.
        self.maxiter = maxiter
        self.tol = tol

    def fit(self, X):
        """Accepts an mxn matrix X of m data points with n features.
        """
        m,n = X.shape
        k = 3
        maxiter = 1000
        tol = 1e-9
        self.m = m
        self.n = n

        #Choose k random rows of X as the initial cluster centers.
        initial_cluster_centers = []

        sample = np.random.randint(0,m,size=k)

        initial_cluster_centers = X[sample, :]

        # Run the k-means iteration until consecutive centers are within the convergence tolerance, or until 
        # iterating the maximum number of times.
        iterations = 0
        old_cluster = np.zeros(initial_cluster_centers.shape)
        new_cluster = initial_cluster_centers

        while iterations < maxiter or np.linalg.norm(old_cluster - new_cluster) >= tol:
            #assign each data point to the cluster center that is closest, forming k clusters
            clusters = np.zeros(m)
            for i in range(0,m):
                distances = np.linalg.norm(X[i] - initial_cluster_centers, ord=2, axis=1) # axis=1 was crucial
                cluster = np.argmin(distances)                                            #in getting this to work
                clusters[i] = cluster
            # Store the old/initial centroid values
            old_cluster = np.copy(new_cluster)
            #Recompute the cluster centers as the means of the new clusters
            for i in range(k):
                points = [X[j] for j in range(m) if clusters[j] == i]
                new_cluster[i] = np.mean(points, axis=0)
                #If a cluster is empty, reassign the cluster center as a random row of X.
                if new_cluster[i] == []:
                    new_cluster[i] = X[np.random.randint(0,m,size=1)]
            iterations += 1

        #Save the cluster centers as attributes.
        self.new_cluster = new_cluster

        #print("New cluster centers:\n", new_cluster)

        return new_cluster

    def predict(self, X):
        """Accept an l × n matrix X of data.
        """
        # Return an array of l integers where the ith entry indicates which 
        # cluster center the ith row of X is closest to.
        clusters = np.zeros(self.m)
        for i in range(0,self.m):
            distances = np.linalg.norm(X[i] - self.new_cluster, ord=2, axis=1)
            cluster = np.argmin(distances)
            clusters[i] = cluster

        print("\nClusters:", clusters)

        return clusters  

Then I try to do the following:

from KMeans import KMeans

X = features_scaled

# k = 3
# maxiter = 1000
# tol = 1e-9

test_KMeans = KMeans(k=3, maxiter=1000, tol=1e-9)
cluster_center = test_KMeans.fit(X)
clusters = test_KMeans.predict(X)

pca = PCA(n_components=2)

pr_components = pca.fit_transform(X) # these are the first 2 principal components

#plot the first two principal components as a scatter plot, where the color of each point is det by the clusters
plt.scatter(pr_components[:,0], pr_components[:,1],
           c=clusters, edgecolor='none', alpha=0.5, #color by clusters
            cmap=plt.cm.get_cmap('tab10', 3)) 
plt.xlabel('principal component 1')
plt.ylabel('principal component 2')
plt.colorbar()
plt.title("K-Means Clustering:")
plt.show()

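After running the code above, I get the NameError I described. I don't understand why it tells me maxiter is not defined. You will see that I defined the variables k, maxiter, tol several times in the script to try to get it to work, but no luck. I used to have self.maxiter and self.tol, but that did not fix it either.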

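I know this code works because I have used it several times now. Originally I just defined the variables k, maxiter and tol, then instantiated the class and called the fit and predict methods, and since they are stored as attributes on self, everything worked fine. But now that I am trying to import it as a module, I have no idea why it is not working.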

Thanks for your help!

Edit: this is what my code looks like in a single cell in the Jupyter notebook. In this case it does run and work:

from sklearn.decomposition import PCA

class KMeans:
    """A K-Means object class. Implements basic k-means clustering.

    Attributes:
        k (int): The number of clusters
        maxiter (int): The maximum number of iterations
        tol (float): A convergence tolerance
    """
    def __init__(self, k, maxiter, tol):
        """Set the parameters.

        Parameters:
            k (int): The number of clusters
            maxiter (int): The maximum number of iterations
            tol (float): A convergence tolerance
        """
        self.k = k   # Initialize some attributes.
        self.maxiter = maxiter
        self.tol = tol

    def fit(self, X):
        """Accepts an mxn matrix X of m data points with n features.
        """
        m,n = X.shape
        self.m = m
        self.n = n

        #Choose k random rows of X as the initial cluster centers.
        initial_cluster_centers = []

        sample = np.random.randint(0,m,size=self.k)

        initial_cluster_centers = X[sample, :]

        # Run the k-means iteration until consecutive centers are within the convergence tolerance, or until 
        # iterating the maximum number of times.
        iterations = 0
        old_cluster = np.zeros(initial_cluster_centers.shape)
        new_cluster = initial_cluster_centers

        while iterations < maxiter or np.linalg.norm(old_cluster - new_cluster) >= tol:
            #assign each data point to the cluster center that is closest, forming k clusters
            clusters = np.zeros(m)
            for i in range(0,m):
                distances = np.linalg.norm(X[i] - initial_cluster_centers, ord=2, axis=1) # axis=1 was crucial
                cluster = np.argmin(distances)                                            #in getting this to work
                clusters[i] = cluster
            # Store the old/initial centroid values
            old_cluster = np.copy(new_cluster)
            #Recompute the cluster centers as the means of the new clusters
            for i in range(k):
                points = [X[j] for j in range(m) if clusters[j] == i]
                new_cluster[i] = np.mean(points, axis=0)
                #If a cluster is empty, reassign the cluster center as a random row of X.
                if new_cluster[i] == []:
                    new_cluster[i] = X[np.random.randint(0,m,size=1)]
            iterations += 1

        #Save the cluster centers as attributes.
        self.new_cluster = new_cluster

        #print("New cluster centers:\n", new_cluster)

        return new_cluster

    def predict(self, X):
        """Accept an l × n matrix X of data.
        """
        # Return an array of l integers where the ith entry indicates which 
        # cluster center the ith row of X is closest to.
        clusters = np.zeros(self.m)
        for i in range(0,self.m):
            distances = np.linalg.norm(X[i] - self.new_cluster, ord=2, axis=1)
            cluster = np.argmin(distances)
            clusters[i] = cluster

        print("\nClusters:", clusters)

        return clusters

X = features_scaled

k = 3
maxiter = 1000
tol = 1e-9

test_KMeans = KMeans(k,maxiter,tol)
test_KMeans.fit(X)
clusters = test_KMeans.predict(X)

pca = PCA(n_components=2)

pr_components = pca.fit_transform(X) # these are the first 2 principal components

#plot the first two principal components as a scatter plot, where the color of each point is det by the clusters
plt.scatter(pr_components[:,0], pr_components[:,1],
           c=clusters, edgecolor='none', alpha=0.5, #color by clusters
            cmap=plt.cm.get_cmap('tab10', 3)) 
plt.xlabel('principal component 1')
plt.ylabel('principal component 2')
plt.colorbar()
plt.title("K-Means Clustering:")
plt.show()
unutbu

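The traceback seems to show that Jupyter is out of sync with the current state of the code in KMeans.py (since it points to line 44, which is blank). So, if the computation does not take too long, you could try to fix the problem by quitting and restarting Jupyter.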

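Python executes a module's code when the module is imported. If changes are made to the module's code after it has been imported, those changes are not reflected in the state of the Python interpreter. This would explain why the Jupyter notebook's error appears out of sync with the state of KMeans.py.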

Instead of quitting and restarting Python, you could also reload the module. For example, in Python 3.4 or newer you could use

import sys
import importlib
from KMeans import KMeans

# make changes to KMeans.py
importlib.reload(sys.modules['KMeans'])
from KMeans import KMeans  # rebind the name to the reloaded class
# now the Python interpreter should be aware of changes made to KMeans.py
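To make the stale-module behaviour concrete, here is a minimal sketch. It does not use the asker's file: `stale_demo`, `limit` and `write_module` are hypothetical names, and the module is written to a temporary directory purely for illustration. It shows a function failing on a bare global name, still failing after the file is fixed on disk, and working only after a reload.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical throwaway module, written to a temp directory for the demo.
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
sys.dont_write_bytecode = True        # keep cached .pyc files out of the demo
module_path = os.path.join(tmpdir, "stale_demo.py")

def write_module(source):
    """Overwrite the demo module's source file on disk."""
    with open(module_path, "w") as f:
        f.write(source)
    importlib.invalidate_caches()

# Version 1: the function reads a bare name that the module never defines.
write_module("def limit():\n    return maxiter\n")
import stale_demo

maxiter = 1000                        # defined by the caller, not by the module
try:
    stale_demo.limit()
    first_call = "ok"
except NameError:
    first_call = "NameError"          # bare names resolve in stale_demo's globals

# Version 2: fix the file on disk. The already-imported module is unchanged.
write_module("maxiter = 1000\n\ndef limit():\n    return maxiter\n")
try:
    stale_demo.limit()
    second_call = "ok"
except NameError:
    second_call = "NameError"         # still executing version 1

importlib.reload(stale_demo)          # re-execute the file from disk
third_call = stale_demo.limit()

print(first_call, second_call, third_call)  # NameError NameError 1000
```

The first failure happens because a function looks up bare names in the globals of the module where it was defined, not in the namespace of whoever calls it, which is also why code that relies on notebook-level variables can work in a single cell and fail once it is moved to a .py file. The second failure is the out-of-sync state described above. Tracebacks read the source text from the file on disk when they are displayed, so they can show lines that do not match the code that is actually running.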

With IPython, however, there is an easier way. You can enable autoreload.

From the command line, run:

ipython profile create

Then edit ~/.ipython/profile_default/ipython_config.py by adding

c.InteractiveShellApp.extensions = ['autoreload']     
c.InteractiveShellApp.exec_lines = ['%autoreload 2']

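Quit and restart IPython for this change to take effect. Now IPython will automatically reload any module when a change is made to the underlying code that defines it. In most situations autoreload works well, but there are cases where it may fail to reload a module. See the docs for more on autoreload and its caveats.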
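If editing the profile is more than a single notebook needs, autoreload can also be switched on for the current session only. The sketch below assumes IPython is installed; `enable_autoreload` is a hypothetical helper name, and it deliberately does nothing when run outside an IPython or Jupyter session.

```python
def enable_autoreload():
    """Turn on autoreload for the current IPython/Jupyter session, if any.

    Returns True when the extension was enabled, False otherwise.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        return False                  # IPython is not installed
    shell = get_ipython()
    if shell is None:
        return False                  # plain Python interpreter, nothing to do
    shell.run_line_magic("load_ext", "autoreload")
    shell.run_line_magic("autoreload", "2")   # reload all modules before each run
    return True

enabled = enable_autoreload()
```

In a notebook cell the same effect is usually achieved by typing the magics directly: `%load_ext autoreload` followed by `%autoreload 2`.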
