Parsing emails

Abilia:

I am trying to split mail files like this one:

Message-ID: <53197.1075859003723.JavaMail.evans@thyme>
Date: Tue, 23 Oct 2001 10:31:09 -0700 (PDT)
From: [email protected]
To: [email protected], [email protected], [email protected]
Subject: RE: CMS Deal #1027152
Cc: [email protected], [email protected]
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Bcc: [email protected], [email protected]
X-From: Dozier, Scott </O=ENRON/OU=NA/CN=RECIPIENTS/CN=SDOZIER>
X-To: Donohoe, Tom </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Tdonoho>, Chang, Bonnie </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Bchang>, Love, Phillip M. </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Plove>
X-cc: Valderrama, Lisa </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Lvalde2>, McFatridge, Thomas </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Tmcfatri>
X-bcc: 
X-Folder: \TDONOHO (Non-Privileged)\Inbox
X-Origin: Donohoe-T
X-FileName: TDONOHO (Non-Privileged).pst

I am not sure if they have confirmed either deal.  However, deal #1034254 was never pathed by us, whereas 1027152 was.  Therefore, nothing billed out under 1034254.

Bonnie - I am including you on this note in case you can add anything about the pathing of the two deals mentioned in this note.  Niether CMS orgaination shows anything on Trunkline that matches this.  We spoke briefly about this last week.

Phillip - I am including you in case you can add any clarity or determine who we did this deal(s) with.

Thank you,
Scott
5-7213

 -----Original Message-----
From:   Donohoe, Tom  
Sent:   Tuesday, October 23, 2001 12:02 PM
To: Dozier, Scott
Subject:    RE: CMS Deal #1027152

if they are not confirming this deal are they confirming 1034254?

 -----Original Message-----
From:   Dozier, Scott  
Sent:   Tuesday, October 23, 2001 9:24 AM
To: Donohoe, Tom
Cc: Valderrama, Lisa; McFatridge, Thomas
Subject:    RE: CMS Deal #1027152
Importance: High

Tom,

In contacting our scheduler and subsequently a CMS scheduler, neither CMS Field Services nor CMS Marketing, Services, and Trading are able to identify the deal.  Currently, I am preparing to fax a copy of our confirmation on the deal to CMS Field Services,  Again, it is not an executed copy, but I am assuming they may not have sent it back.  Furthermore, the CMS Field Services scheduler has told me that they don't even schedule any Trunkline deals.

Considering all of this, I am assuming the worst - that unless we can provide a trader name etc. they will short pay on this deal.  So, do you know who represented us with CMS on this deal any one that might know who their trader is or how this deal was booked?  We are getting ready to settle for Sep prod so any help asap would be appreciated.

Scott
5-7213

 -----Original Message-----
From:   Dozier, Scott  
Sent:   Thursday, October 18, 2001 12:21 PM
To: Donohoe, Tom
Subject:    RE: CMS Deal #1027152

They do not recognize that deal at all.

The most recent name and number is a Conoco trader.  I have a confirmation on this deal with CMS.  However, it is not an executed copy (i.e. sent back or confirmed by CMS).  Is there some one who represented us with CMS on this that might know who their trader is or how this deal was booked?  I will attempt to contact the scheduler in the mean time but any help would be good.

thanks.

into several files like this:

Header
Body
Original message 1
Original message 2 
...

I have read a few articles about splitting mails, and Mime4j looked like a good choice. So this is what I did:

public class test {

    public static void main(String[] args) throws IOException, MimeException {
        // TODO Auto-generated method stub
        MimeTokenStream stream = new MimeTokenStream();
        stream.parse(new FileInputStream("test"));
        File header = new File ("header");
        File body = new File ("body");
        BufferedWriter headerWriter = new BufferedWriter(new FileWriter(header));
        BufferedWriter bodyWriter = new BufferedWriter(new FileWriter(body));
        String str;
        for (EntityState state = stream.getState();
                state != EntityState.T_END_OF_STREAM;
                state = stream.next()) {
            switch (state) {
              case T_BODY:
                  str = stream.getInputStream().toString();
                  bodyWriter.write(str);
                break;
              case T_FIELD:
                  str = stream.getField().toString() + "\n";
                  headerWriter.write(str);
                break;
            }
          }
        headerWriter.close();
        bodyWriter.close();

    }

}

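This code correctly splits the mail into two files: header and body. There may be a better way, but I did not find the Mime4j Javadoc very helpful... well, I am still trying to fully understand how it works.

However, I have two problems:

1) The body starts with a line that Mime4j apparently creates, like this: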

[LineReaderInputStreamAdaptor: [pos: 937][limit: 4096][

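and I don't know how to get rid of it.

2) The "original messages" are all inside the body. I don't know how to split the body into further parts based on those "Original Message" markers. Besides, not all mails use this format. Sometimes the original message is only "shown" by a tab or a > character before each line, or only by a small "From, To" header, or by another line such as ------- forwarded ----------, and so on... so I cannot split on one fixed format.

I thought Mime4j would recognize those parts as a "multipart" message, but it does not seem to (nothing is ever found in the T_START_MULTIPART case).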

Robert:

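The strange text comes from stream.getInputStream().toString(); whose result you write to the body file.

The toString() method is mainly meant for debugging. Calling it on an InputStream does not give you the stream's content (which could be large), only a description of the stream, and that is what you are seeing.

To get the stream's data, you need to read from the input stream and copy what you read to an output stream. See this answer for the various ways to do that.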
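As a minimal sketch of that copy step, using only the JDK: the class name StreamCopy and the method copy are hypothetical names chosen for illustration. In the question's code, the InputStream passed in would be the one returned by stream.getInputStream() in the T_BODY case, and the OutputStream would be a FileOutputStream for the body file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class StreamCopy {
    // Hypothetical helper: reads 'in' until end of stream, writes every byte
    // to 'out', and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the body stream; a real run would use stream.getInputStream().
        InputStream body = new ByteArrayInputStream(
                "They do not recognize that deal at all.".getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(body, sink);
        System.out.println(copied + " bytes copied");
    }
}
```

On Java 9 or later, in.transferTo(out) does the same job without a hand-written loop.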

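As for the original messages: your example is a single email. It has only one MIME part, the plain-text part. People simply quoted the original message and put their reply above it.

If they had forwarded the mail as an attachment, the MIME structure would look different: you would see Content-Type: multipart/mixed; boundary="...", and the boundary text would then separate the individual messages. Apache James would probably detect those and handle them correctly.

MIME multipart is used for attachments or for alternative versions of an email (plain text vs. HTML). It has nothing to do with people top-posting their replies.

Since your example email has no MIME structure, your best option is to analyse the email body manually, looking for -----Original Message-----. Be aware that this is fragile: you don't know which mail clients people use, and people may edit the quoted text by hand (possibly by accident).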

import org.apache.james.mime4j.stream.*;
import java.io.*;

public class Library {
    // Separator line exactly as it appears in the sample mail (note the leading space).
    private static final String SEP = " -----Original Message-----";
    private static final String CRLF = "\r\n";

    // Index of the part file currently being written (0.eml, 1.eml, ...).
    static int fileNo = 0;

    public static void main(String[] args) throws Exception {
        MimeTokenStream stream = new MimeTokenStream();
        try (InputStream mail = new FileInputStream(args[0]);
             BufferedWriter headerWriter = new BufferedWriter(new FileWriter("header"))) {
            stream.parse(mail);
            for (EntityState state = stream.getState();
                    state != EntityState.T_END_OF_STREAM;
                    state = stream.next()) {
                switch (state) {
                case T_BODY:
                    // Read the body's content instead of calling toString() on the stream.
                    writePart(new BufferedReader(new InputStreamReader(stream.getInputStream())));
                    break;
                case T_FIELD:
                    headerWriter.write(stream.getField().toString());
                    headerWriter.write(CRLF);
                    break;
                default:
                    // The other parser states are not needed here.
                    break;
                }
            }
        }
    }

    // Copies the body line by line and starts a new file at every separator line.
    private static void writePart(BufferedReader in) throws IOException {
        BufferedWriter out = new BufferedWriter(new FileWriter(fileNo + ".eml"));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (SEP.equals(line)) {
                    out.close();
                    fileNo++;
                    out = new BufferedWriter(new FileWriter(fileNo + ".eml"));
                }
                out.write(line);
                out.write(CRLF);
            }
        } finally {
            out.close();
        }
    }
}
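
The exact-match comparison against SEP is where the fragility mentioned above shows up first: a missing leading space or a "forwarded" marker is not detected. The following is a rough sketch of a more tolerant check using a regular expression. The class QuoteSplitter, its methods, and the pattern itself are assumptions made for illustration; the pattern only covers dashed "Original Message" and "forwarded" lines and does not handle replies quoted with > or tabs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class QuoteSplitter {
    // Assumed marker shape: optional whitespace, two or more dashes, a label,
    // two or more dashes. Examples: " -----Original Message-----",
    // "------- forwarded ----------". Matching is case-insensitive.
    private static final Pattern SEPARATOR = Pattern.compile(
            "^\\s*-{2,}\\s*(original message|forwarded.*?)\\s*-{2,}\\s*$",
            Pattern.CASE_INSENSITIVE);

    static boolean isSeparator(String line) {
        return SEPARATOR.matcher(line).matches();
    }

    // Splits a body into parts; every separator line starts a new part
    // and is kept as that part's first line.
    static List<String> split(String body) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : body.split("\\r?\\n")) {
            if (isSeparator(line)) {
                parts.add(current.toString());
                current.setLength(0);
            }
            current.append(line).append("\r\n");
        }
        parts.add(current.toString());
        return parts;
    }
}
```

Each returned part could then be written to its own file, the same way writePart does. Any real mail corpus will likely need more marker variants than this pattern knows about.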
