Iterating over a tar.gz in Java

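Wolfone:

I have a tar.gz file containing a huge number of small XML files (slightly under 1.5m) and no subdirectories. I want to iterate over its contents, and I am trying to do this with Apache Commons Compress. I do not want to extract anything or write it to new files, as is often shown in similar threads. I only want to read the information incrementally (ideally with the ability to stop at some point and continue in a later run of the program, but that is secondary).

So for starters, I thought I should begin with something like this (the counter is only there for testing, to cut down the run time):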

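import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

public static void readTar(String in) throws IOException {
    // Stream chain: file -> gunzip -> buffer -> tar
    try (TarArchiveInputStream tarArchiveInputStream =
                 new TarArchiveInputStream(
                         new BufferedInputStream(
                                 new GzipCompressorInputStream(
                                         new FileInputStream(in))))){
        TarArchiveEntry entry;
        int counter = 0; // test-only limit on the number of entries visited
        while ((entry = tarArchiveInputStream.getNextTarEntry()) != null && counter < 1000) {
            counter++;
            // Prints null for every entry; this is the behavior being asked about
            System.out.println(entry.getFile());
        }
    }
}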

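However, entry.getFile() always returns null, so I cannot work with the contents, whereas entry.getName() returns the expected result.

I would be glad if someone could point out my mistake.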

thejavagirl:

The documentation for getFile essentially says that it is not useful for entries read from an archive.

https://commons.apache.org/proper/commons-compress/apidocs/org/apache/commons/compress/archivers/tar/TarArchiveEntry.html#getFile--

I believe you need to use read instead:

https://commons.apache.org/proper/commons-compress/javadocs/api-1.18/org/apache/commons/compress/archivers/tar/TarArchiveInputStream.html#read-byte:A-int-int-
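To make the read suggestion concrete, here is a minimal sketch of how it might look. It assumes Commons Compress is on the classpath and that each entry is small enough to hold in memory. The class name TarGzReader, the method forEachEntry, and the handler callback are hypothetical names chosen for illustration; they are not part of the library. The point it shows is that read on the TarArchiveInputStream returns -1 at the end of the current entry, so the bytes read between two getNextTarEntry calls belong to exactly one file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.function.BiConsumer;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

public class TarGzReader {

    /**
     * Hypothetical helper: passes the name and raw bytes of each regular file
     * in a tar.gz stream to the handler, without writing anything to disk.
     * Stops after maxEntries files and returns how many were handled.
     * Closes the given stream when done.
     */
    public static int forEachEntry(InputStream tarGz, int maxEntries,
                                   BiConsumer<String, byte[]> handler) throws IOException {
        int handled = 0;
        try (TarArchiveInputStream tar = new TarArchiveInputStream(
                new GzipCompressorInputStream(new BufferedInputStream(tarGz)))) {
            TarArchiveEntry entry;
            byte[] buffer = new byte[8192];
            while (handled < maxEntries && (entry = tar.getNextTarEntry()) != null) {
                if (!entry.isFile()) {
                    continue; // skip directories, links, and other non-file entries
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry, not of the whole archive
                while ((n = tar.read(buffer, 0, buffer.length)) != -1) {
                    out.write(buffer, 0, n);
                }
                handler.accept(entry.getName(), out.toByteArray());
                handled++;
            }
        }
        return handled;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the tar.gz file; UTF-8 content is an assumption
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            forEachEntry(in, 1000, (name, bytes) ->
                    System.out.println(name + ": "
                            + new String(bytes, StandardCharsets.UTF_8).length() + " chars"));
        }
    }
}
```

Because the archive is a gzip stream, entries can only be reached sequentially. Continuing in a later run would therefore most likely mean remembering how many entries were already handled and skipping that many with getNextTarEntry before processing again.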

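Another thing I do when working out how a library behaves is to attach its source code and read through the library code to see what is actually happening behind the scenes.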
