Duplicate emails - Spring Integration Mail

I am trying to set up an integration flow that reads UNSEEN emails from a mailbox and processes them. My initial setup is below, and I am running into two issues.

   @Bean
   public IntegrationFlow mailListener() {
      return IntegrationFlows.from(Mail.imapInboundAdapter(CONNECTION_URL)
                              .shouldDeleteMessages(false)
                              .shouldMarkMessagesAsRead(true)
                              .searchTermStrategy(this::searchTerm)
                              .simpleContent(true)
                              .autoCloseFolder(false),
                      e -> e.poller(Pollers.fixedRate(5000).maxMessagesPerPoll(100).taskExecutor(taskExecutor())))
              .<Message>handle((payload, header) -> logMail(payload))
              .get();
   }

   @Bean
   public TaskExecutor taskExecutor() { // @Bean methods in a @Configuration class must not be private
      ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
      threadPoolTaskExecutor.setCorePoolSize(10);
      threadPoolTaskExecutor.setMaxPoolSize(20);
      threadPoolTaskExecutor.setQueueCapacity(200);
      threadPoolTaskExecutor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
      threadPoolTaskExecutor.setKeepAliveSeconds(500);
      threadPoolTaskExecutor.setWaitForTasksToCompleteOnShutdown(true);
      threadPoolTaskExecutor.initialize();
      return threadPoolTaskExecutor;
   }

   private SearchTerm searchTerm(Flags flags, Folder folder) {
      SearchTerm[] searchTerms = {new FlagTerm(new Flags(Flags.Flag.SEEN), false)};
      return new AndTerm(searchTerms);
   }

Issue 1

I am seeing a lot of duplicate messages: the same emails are read again, even though I have added a SearchTerm to filter for unseen messages. Am I doing something wrong here?
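
To make the duplication measurable while the cause is being investigated, the handler could track which messages it has already seen. This is only a minimal sketch, not part of the flow above: the class SeenMessageGuard and its method are hypothetical names, it assumes each mail carries a unique Message-ID header value that is passed in as a string, and it keeps the ids in memory only, so they are lost on restart.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not a Spring Integration class): remembers the
// Message-ID values that were already handled so repeats can be detected.
public class SeenMessageGuard {

    // Thread-safe set, because the poller above hands work to a thread pool.
    private final Set<String> seenIds = ConcurrentHashMap.newKeySet();

    /** Returns true the first time an id is offered and false for every repeat. */
    public boolean firstTimeSeen(String messageId) {
        return seenIds.add(messageId);
    }

    public static void main(String[] args) {
        SeenMessageGuard guard = new SeenMessageGuard();
        System.out.println(guard.firstTimeSeen("<abc@example.com>")); // first delivery
        System.out.println(guard.firstTimeSeen("<abc@example.com>")); // duplicate delivery
    }
}
```

Calling firstTimeSeen from logMail and logging the repeats would show whether the same message is delivered by overlapping polls or by later ones.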

Issue 2

When there are a lot of emails in the mailbox (approximately 600,000), the IMAP inbound adapter tries to fetch all of them at once, even though I have set max-messages-per-poll on the poller. If I set max-fetch-size on the adapter, it fetches only the configured number of messages, but processing is then really slow, around 5-10 seconds per email. The fetch process (search -> flag all the emails -> fetch the content) takes a lot of time. So my second question is: how can I increase performance to process around 200,000 emails per day?
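
For scale, the figures above can be turned into a rough throughput target. The following is a back-of-the-envelope sketch only: the class and method names are hypothetical, and it assumes the 5-10 seconds per email can be spread evenly across independent workers, which has not been verified for a single IMAP folder.

```java
// Hypothetical helper: converts a daily email target into a required
// per-second rate and a minimum number of parallel workers.
public class ThroughputEstimate {

    static final int SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

    /** Emails that must complete per second to reach the daily target. */
    public static double requiredRatePerSecond(long emailsPerDay) {
        return (double) emailsPerDay / SECONDS_PER_DAY;
    }

    /** Minimum workers if each email occupies one worker for secondsPerEmail. */
    public static int workersNeeded(long emailsPerDay, double secondsPerEmail) {
        return (int) Math.ceil(requiredRatePerSecond(emailsPerDay) * secondsPerEmail);
    }

    public static void main(String[] args) {
        long target = 200_000; // daily volume from the question
        System.out.println("required emails/s: " + requiredRatePerSecond(target));
        System.out.println("workers at 5 s/email: " + workersNeeded(target, 5.0));
        System.out.println("workers at 10 s/email: " + workersNeeded(target, 10.0));
    }
}
```

200,000 emails per day is about 2.3 emails per second, so at 5-10 seconds per email roughly 12 to 24 emails would have to be in flight at any moment. The executor above has a core pool size of 10, and a ThreadPoolExecutor only grows beyond its core size once its queue (capacity 200 here) is full.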

Any suggestions would be helpful. Thanks.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
