JavaMail IMAP over SSL quite slow - Bulk fetching multiple messages

Justmaker :

I am currently trying to use JavaMail to get emails from IMAP servers (Gmail and others). Basically, my code works: I indeed can get the headers, body contents and so on. My problem is the following: when working on an IMAP server (no SSL), it basically takes 1-2ms to process a message. When I go on an IMAPS server (hence with SSL, such as Gmail) I reach around 250m/message. I ONLY measure the time when processing the messages (the connection, handshake and such are NOT taken into account).

I know that since this is SSL, the data is encrypted. However, the time for decryption should not be that important, should it?

I have tried setting a higher ServerCacheSize value, a higher connectionpoolsize, but am seriously running out of ideas. Anyone confronted with this problem? Solved it one might hope?

My fear is that the JavaMail API uses a different connection each time it fetches a mail from the IMAPS server (involving the overhead for handshake...). If so, is there a way to override this behavior?

Here is my code (although quite standard) called from the Main() class:

 public static int connectTest(String SSL, String user, String pwd, String host) throws IOException,
                                                                               ProtocolException,
                                                                               GeneralSecurityException {

    Properties props = System.getProperties();
    props.setProperty("mail.store.protocol", SSL);
    props.setProperty("mail.imaps.ssl.trust", host);
    props.setProperty("mail.imaps.connectionpoolsize", "10");

    try {


        Session session = Session.getDefaultInstance(props, null);

        // session.setDebug(true);

        Store store = session.getStore(SSL);
        store.connect(host, user, pwd);      
        Folder inbox = store.getFolder("INBOX");

        inbox.open(Folder.READ_ONLY);                
        int numMess = inbox.getMessageCount();
        Message[] messages = inbox.getMessages();

        for (Message m : messages) {

            m.getAllHeaders();
            m.getContent();
        }

        inbox.close(false);
        store.close();
        return numMess;
    } catch (MessagingException e) {
        e.printStackTrace();
        System.exit(2);
    }
    return 0;
}

Thanks in advance.

Justmaker :

after a lot of work, and assistance from the people at JavaMail, the source of this "slowness" is from the FETCH behavior in the API. Indeed, as pjaol said, we return to the server each time we need info (a header, or message content) for a message.

If FetchProfile allows us to bulk fetch header information, or flags, for many messages, getting contents of multiple messages is NOT directly possible.

Luckily, we can write our own IMAP command to avoid this "limitation" (it was done this way to avoid out of memory errors: fetching every mail in memory in one command can be quite heavy).

Here is my code:

import com.sun.mail.iap.Argument;
import com.sun.mail.iap.ProtocolException;
import com.sun.mail.iap.Response;
import com.sun.mail.imap.IMAPFolder;
import com.sun.mail.imap.protocol.BODY;
import com.sun.mail.imap.protocol.FetchResponse;
import com.sun.mail.imap.protocol.IMAPProtocol;
import com.sun.mail.imap.protocol.UID;

public class CustomProtocolCommand implements IMAPFolder.ProtocolCommand {
    /** Index on server of first mail to fetch **/
    int start;

    /** Index on server of last mail to fetch **/
    int end;

    public CustomProtocolCommand(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Object doCommand(IMAPProtocol protocol) throws ProtocolException {
        Argument args = new Argument();
        args.writeString(Integer.toString(start) + ":" + Integer.toString(end));
        args.writeString("BODY[]");
        Response[] r = protocol.command("FETCH", args);
        Response response = r[r.length - 1];
        if (response.isOK()) {
            Properties props = new Properties();
            props.setProperty("mail.store.protocol", "imap");
            props.setProperty("mail.mime.base64.ignoreerrors", "true");
            props.setProperty("mail.imap.partialfetch", "false");
            props.setProperty("mail.imaps.partialfetch", "false");
            Session session = Session.getInstance(props, null);

            FetchResponse fetch;
            BODY body;
            MimeMessage mm;
            ByteArrayInputStream is = null;

            // last response is only result summary: not contents
            for (int i = 0; i < r.length - 1; i++) {
                if (r[i] instanceof IMAPResponse) {
                    fetch = (FetchResponse) r[i];
                    body = (BODY) fetch.getItem(0);
                    is = body.getByteArrayInputStream();
                    try {
                        mm = new MimeMessage(session, is);
                        Contents.getContents(mm, i);
                    } catch (MessagingException e) {
                        e.printStackTrace();
                    }
                }
            }
        }
        // dispatch remaining untagged responses
        protocol.notifyResponseHandlers(r);
        protocol.handleResult(response);

        return "" + (r.length - 1);
    }
}

the getContents(MimeMessage mm, int i) function is a classic function that recursively prints the contents of the message to a file (many examples available on the net).

To avoid out of memory errors, I simply set a maxDocs and maxSize limit (this has been done arbitrarily and can probably be improved!) used as follows:

public int efficientGetContents(IMAPFolder inbox, Message[] messages)
        throws MessagingException {
    FetchProfile fp = new FetchProfile();
    fp.add(FetchProfile.Item.FLAGS);
    fp.add(FetchProfile.Item.ENVELOPE);
    inbox.fetch(messages, fp);
    int index = 0;
    int nbMessages = messages.length;
    final int maxDoc = 5000;
    final long maxSize = 100000000; // 100Mo

    // Message numbers limit to fetch
    int start;
    int end;

    while (index < nbMessages) {
        start = messages[index].getMessageNumber();
        int docs = 0;
        int totalSize = 0;
        boolean noskip = true; // There are no jumps in the message numbers
                                           // list
        boolean notend = true;
        // Until we reach one of the limits
        while (docs < maxDoc && totalSize < maxSize && noskip && notend) {
            docs++;
            totalSize += messages[index].getSize();
            index++;
            if (notend = (index < nbMessages)) {
                noskip = (messages[index - 1].getMessageNumber() + 1 == messages[index]
                        .getMessageNumber());
            }
        }

        end = messages[index - 1].getMessageNumber();
        inbox.doCommand(new CustomProtocolCommand(start, end));

        System.out.println("Fetching contents for " + start + ":" + end);
        System.out.println("Size fetched = " + (totalSize / 1000000)
                + " Mo");

    }

    return nbMessages;
}

Do not that here I am using message numbers, which is unstable (these change if messages are erased from the server). A better method would be to use UIDs! Then you would change the command from FETCH to UID FETCH.

Hope this helps out!

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related