Why does read_csv load the first column of one CSV file as datetime64[ns], but load my txt file as object?

Posted by SLE in Dev

SLE

Thanks to this forum and to others here, I now have the following code:

import pandas as pd

names=['Date','Wind Speed','Wind Direction']
df2 = pd.read_csv('test_met.csv', index_col=0, names=names, parse_dates=[0])

aethalometer=['Date','Conc']
df1=pd.read_csv('BC_2012_1min.csv', index_col=0, names=aethalometer, parse_dates=[0])
df1=df1[df1['Conc']>-10]

print(len(df1))

print("here")

df1.index = df1.index.to_period('h')
df2['per'] = df2.index.to_period('h')
pers = df2.loc[(df2['Wind Direction'] > 340) | (df2['Wind Direction'] < 12) , 'per'].unique()

At this point I get:

TypeError: unorderable types: str() > int()

Printing df1.index, I get:

Index(['TimeW_1min', '01/04/2012 00:00', '01/04/2012 00:01',
       '01/04/2012 00:02', '01/04/2012 00:03', '01/04/2012 00:04',
       '01/04/2012 00:05', '01/04/2012 00:06', '01/04/2012 00:07',
       '01/04/2012 00:08',
       ...
       '30/09/2012 23:50', '30/09/2012 23:51', '30/09/2012 23:52',
       '30/09/2012 23:53', '30/09/2012 23:54', '30/09/2012 23:55',
       '30/09/2012 23:56', '30/09/2012 23:57', '30/09/2012 23:58',
       '30/09/2012 23:59'],
      dtype='object', name='Date', length=491589)

In this case the CSV file looks like this (it was originally a text file, which I re-saved as CSV):

TimeW_1min,CONC_1min
01/04/2012 00:00,17.9
01/04/2012 00:01,-1.2
01/04/2012 00:02,16.8

Meanwhile, if I use the original txt file, I get:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

At this point df1.index looks like:

Index([], dtype='object', name='Date')

But when I use another dataset, which looks like this:

01-mar-05 12:00,  22.7,  8.1, 0.0214, 1.3727, 0.0214, 1.6969, 1.00,30.603
01-mar-05 12:05, -11.7,  8.1, 0.0214, 1.3725, 0.0214, 1.6965, 1.00,30.5871

not only does the program run, but df1.index looks like:

DatetimeIndex(['2005-03-01 12:00:00', '2005-03-01 12:10:00',
               '2005-03-01 12:15:00', '2005-03-01 12:20:00',
etc.

 '2005-03-03 12:00:00'],
              dtype='datetime64[ns]', name='Date', freq=None)

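So how can I get the first file, whether as txt or as csv, to be read with a datetime64[ns] index?

Many thanks

Here is a link to the original text file, which is the one I would really like the code to work with: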

http://expirebox.com/download/fe01dc85c38e9bf13d477508006d7c94.html

But that gave a weird format, so I opened it in Excel and saved it as CSV, which can be found here:

http://expirebox.com/download/b984ecf365c4c19387a650eeb17f008f.html

The second one is the file I am trying to use, but to no avail.

I changed the code to:

aethalometer=['Date','Conc']
df1=pd.read_csv('BC_2012_1min.txt', names=aethalometer, parse_dates=True,skiprows=1,sep='\t').set_index('Date')
df1.index = df1.index.to_period('h')

It now prints as:

2012/9/30 23:58:00  12.40
2012/9/30 23:59:00  2.60

But it says:

AttributeError: 'Index' object has no attribute 'to_period'

df1.index is still of object dtype:

dtype='object', name='Date', length=491588)

I tried:

df1.index = pd.to_datetime(df1.index)

but that fails with an "unknown string format" error.

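EdChum

OK, your file looks like it has been munged by whatever method created it: the header is repeated on these lines:

43202, 87843, 132482, 174243, 186697, 231338, 274539, 319180, 363821, 407022, 448389

It looks like this: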

2012/4/30 23:59:00  -16.00
TimeW_1min  CONC_1min
2012/8/1 00:00:00   15.10
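To check a file for this problem before loading it, something like the following could be used. It is only a sketch: `find_repeated_headers` is a hypothetical helper, and the sample text is made-up data imitating the tab-separated layout shown above, not the real file.

```python
import io

# Hypothetical sample imitating the layout above (tab-separated, header repeated).
sample = (
    "TimeW_1min\tCONC_1min\n"
    "2012/4/30 23:58:00\t12.40\n"
    "2012/4/30 23:59:00\t-16.00\n"
    "TimeW_1min\tCONC_1min\n"
    "2012/8/1 00:00:00\t15.10\n"
)

def find_repeated_headers(handle):
    """Return 0-based line numbers (excluding line 0) that repeat the header."""
    header = next(handle).rstrip("\r\n")
    return [
        number
        for number, line in enumerate(handle, start=1)
        if line.rstrip("\r\n") == header
    ]

repeated = find_repeated_headers(io.StringIO(sample))
print(repeated)
```

For the real file, `open('BC_2012_1min.txt')` would take the place of the `io.StringIO` object.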

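So what you can do is not try to parse the date column while reading, and instead convert it afterwards with to_datetime, passing errors='coerce'. This converts the erroneous rows to NaT, which you can then filter out before setting the index and converting it to a PeriodIndex as desired: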

In [126]:
df = pd.read_csv(r'c:\data\BC_2012_1min.txt', sep='\t', names=['Date','Conc'], skiprows=1 )
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df = df[df['Date'].notnull()].set_index('Date').to_period('h')
df.index

Out[126]:
PeriodIndex(['2012-04-01 00:00', '2012-04-01 00:00', '2012-04-01 00:00',
             '2012-04-01 00:00', '2012-04-01 00:00', '2012-04-01 00:00',
             '2012-04-01 00:00', '2012-04-01 00:00', '2012-04-01 00:00',
             '2012-04-01 00:00',
             ...
             '2012-09-30 23:00', '2012-09-30 23:00', '2012-09-30 23:00',
             '2012-09-30 23:00', '2012-09-30 23:00', '2012-09-30 23:00',
             '2012-09-30 23:00', '2012-09-30 23:00', '2012-09-30 23:00',
             '2012-09-30 23:00'],
            dtype='int64', name='Date', length=491577, freq='H')
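The same steps can be tried without the original file. The sketch below uses a small made-up sample in place of the real data. It also adds one step that is not in the code above: because the stray header rows put the text 'CONC_1min' into the Conc column, that column is likely read as strings as well, which would explain the str() > int() error in the question, so it is converted with `pd.to_numeric` before the `> -10` filter is applied.

```python
import io
import pandas as pd

# Hypothetical sample: tab-separated, with the header repeated mid-file.
sample = (
    "TimeW_1min\tCONC_1min\n"
    "2012/4/30 23:58:00\t12.40\n"
    "2012/4/30 23:59:00\t-16.00\n"
    "TimeW_1min\tCONC_1min\n"
    "2012/8/1 00:00:00\t15.10\n"
)

df = pd.read_csv(io.StringIO(sample), sep="\t", names=["Date", "Conc"], skiprows=1)

# Unparseable values (the repeated header row) become NaT / NaN instead of raising.
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
df["Conc"] = pd.to_numeric(df["Conc"], errors="coerce")

# Drop the bad rows, index by date, and convert to hourly periods.
df = df[df["Date"].notna()].set_index("Date").to_period("h")

# The numeric comparison from the question now works.
df = df[df["Conc"] > -10]
print(df)
```

Depending on the pandas version, `to_datetime` may warn that it could not infer a single format for these strings; the conversion should still succeed.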

So in your case, change my first line to:

aethalometer=['Date','Conc']
df1=pd.read_csv('BC_2012_1min.csv', names=aethalometer, sep='\t', skiprows=1)
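If the Excel-exported CSV is used rather than the text file, note that the index printed in the question contains day-first strings such as '30/09/2012 23:59'. Passing an explicit format is one way to make sure day and month are not swapped. This is a sketch under assumptions: the sample rows are made up, the file is assumed to be comma-separated, and the format string '%d/%m/%Y %H:%M' is inferred from the strings shown in the question.

```python
import io
import pandas as pd

# Hypothetical sample imitating the Excel-exported CSV (comma-separated, day-first dates).
csv_sample = (
    "TimeW_1min,CONC_1min\n"
    "01/04/2012 00:00,17.9\n"
    "01/04/2012 00:01,-1.2\n"
    "TimeW_1min,CONC_1min\n"
    "30/09/2012 23:59,16.8\n"
)

df1 = pd.read_csv(io.StringIO(csv_sample), names=["Date", "Conc"], skiprows=1)

# Explicit day-first format; rows that do not match (stray headers) become NaT.
df1["Date"] = pd.to_datetime(df1["Date"], format="%d/%m/%Y %H:%M", errors="coerce")
df1 = df1.dropna(subset=["Date"]).set_index("Date")
print(df1.index)
```

With the format given, '01/04/2012' is read as 1 April 2012 rather than 4 January 2012.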


Edited on 2021-04-22
