How to access HDFS from Drill in embedded mode?

lampShadesDrifter

Trying to run Apache Drill on a single node, following the "Accessing HDFS from embedded Drill" article, but am getting an error:

➜  Apps /home/hph_etl/Apps/apache-drill-1.16.0/bin/sqlline -u "jdbc:drill:zk=local;schema=dfs"

...

apache drill (dfs)> select * from dfs.`tmp/`;
Error: RESOURCE ERROR: Failed to load schema for "dfs"!

java.net.ConnectException: Call From HW04.ucera.local/172.18.4.49 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused


[Error Id: 2fd541ee-2290-4cf8-979b-aca3c77859e2 ] (state=,code=0)
apache drill (dfs)> !q
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
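A "Connection refused" from HW04.ucera.local to localhost:8020 usually just means nothing on this node is listening on that address. Before touching Drill at all, a minimal sanity check is to confirm a NameNode process is actually running here and to see which interface and port it is bound to. The sketch below uses standard Linux/Hadoop tooling; 8020 is only the conventional NameNode RPC default, not something Drill itself requires:

# Is a NameNode process running on this host at all?
jps | grep -i namenode

# Is anything listening on port 8020, and on which address (127.0.0.1 vs. the host's real IP)?
ss -ltn | grep 8020        # or: netstat -tln | grep 8020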

The dfs storage plugin configuration looks like...

{
  "type": "file",
  "connection": "hdfs://localhost:8020/",
  "config": null,
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
   ....
}

(Note that I don't really know how to determine which port the HDFS connection should use), and the link in the error message (http://wiki.apache.org/hadoop/ConnectionRefused) leads nowhere. Trying an alternative solution from another SO post throws an error:
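As for which port the HDFS connection should use: the value Drill needs in the storage plugin is exactly the cluster's fs.defaultFS URI from the Hadoop configuration, so the simplest approach is to read it back from a node that has the Hadoop client installed. A small sketch, assuming the Hadoop binaries are on the PATH and the config lives in /etc/hadoop/conf (that path varies by distribution):

# Ask the Hadoop client for the configured default filesystem URI
hdfs getconf -confKey fs.defaultFS
# typically prints something like hdfs://<namenode-host>:8020

# Or read it directly from core-site.xml (location depends on the distribution)
grep -A1 fs.defaultFS /etc/hadoop/conf/core-site.xml

Whatever that prints is what the "connection" field of the dfs plugin should be set to.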

➜  Apps /home/hph_etl/Apps/apache-drill-1.16.0/bin/sqlline -u "jdbc:drill:drillbit=localhost:31010;schema=dfs"
Error: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: CONNECTION : io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:31010 (state=,code=0)
java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: CONNECTION : io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:31010
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:178)
    at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)

Not sure what to check at this point; any debugging suggestions or fixes?

lampShadesDrifter

Ultimately, what worked was setting the connection HDFS address to the IP of the Hadoop cluster's namenode (following another SO post about connecting to HDFS in general), so the Drill dfs storage plugin config looks like the following (namenode host shown as a placeholder here):

{
  "type": "file",
  "connection": "hdfs://localhost:8020/",
  "config": null,
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
   ....
}
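In case it is useful: with embedded Drill the plugin JSON above can be edited and saved through the Web UI (http://localhost:8047/storage by default), and it can be read back over Drill's REST API to confirm the change took effect. The sketch below assumes the default 8047 port and the plugin name dfs; the <namenode-host> placeholder is whatever the lookup step above returned:

# Read the saved dfs storage plugin definition back from the embedded Drill REST API
curl -s http://localhost:8047/storage/dfs.json

# Independently confirm HDFS is reachable from this node using the same URI
hdfs dfs -ls hdfs://<namenode-host>:8020/tmp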

and we can now do:

➜  bin /home/hph_etl/Apps/apache-drill-1.16.0/bin/sqlline -u "jdbc:drill:zk=local;schema=dfs"
Apache Drill 1.16.0
"Got Drill?"
apache drill (dfs)> select * from dfs.`tmp/`;
Error: PERMISSION ERROR: Not authorized to read table [tmp/] in schema [dfs.default]


[Error Id: 2e248da5-ba30-43f7-a983-1784d77cf81b ] (state=,code=0)
apache drill (dfs)> 

(Note that I now need to fix the permission error, but at least the location can be queried now.)
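For that remaining permission error, the usual cause is a mismatch between the HDFS permissions on /tmp and the OS user that the embedded Drillbit runs as. A hedged starting point (standard hdfs dfs commands; whether loosening permissions is acceptable depends on the cluster's security policy):

# Check who owns the directory and what its mode is
hdfs dfs -ls -d /tmp
hdfs dfs -ls /tmp

# Either start sqlline as a user that has read access to the data, or,
# if policy allows, grant read/execute to others on the queried path:
hdfs dfs -chmod -R o+rX /tmp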
