
Nothing more than brainstorming, for now.

This page is a bit of a hodgepodge, and begs a bit for reorganizing.

  • Matcher Parameters
  • Extensible User Model
  • BSF Matcher/Mailet
  • [DONE v2] Replace Mordred
  • Full Function Mailing Lists
  • Dynamic Reconfiguration
  • Configuration in Database
  • User-specific Pipeline
  • Mailet Security
  • [DONE v2] Mailet Classloaders
  • Remove Finalizers
  • Security
  • Scalable/clusterable pipelining (e.g., via tuplespace)
  • Updated SMTP support (extensions)
  • HypersonicSQL standard so that non-mailstore repositories can count on the availability of some JDBC database
  • [DONE] Sieve (http://libsieve.sourceforge.net/) support.

  • Support for SOAP Services
  • [DONE] Configurable bounce processor esp. for RemoteDelivery

  • Add JMX monitoring of spool queues etc.
  • Use separate interface for Spool and queue repositories vs message store repositories

Introduce mailet application deployment features to make mailet deployment *easy*

  • mailet-specific classloading for easy deployment of mailets and their dependencies

[Progress: mailet classloader now looks in apps/james/SAR-INF/classes and apps/james/SAR-INF/lib/*.jar]

  • common base classloader for shared jars
  • classloader separation between packaged mailet applications, and between James and mailets
  • mailet application packaging, including automatic deployment of configuration "snippets" from application directory

  • DRAC support

If you're familiar with this protocol (or whatever you call it), the idea is that when a POP3 user is authenticated (it could be IMAP too), that IP address gets registered for, say, the next 30 minutes. The SMTP server can then check that list and accept messages from that IP address within that time frame. This is an alternative way of letting remote dial-up users send email without resorting to SMTP-AUTH (and thus possibly sending passwords unencrypted, since it's easier to require that POP3/IMAP4 connections use SSL).
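As a rough illustration of the idea, here is a minimal sketch of such a registry. The class and method names (RecentAuthRegistry, recordAuthentication, mayRelay) are made up for this example and are not part of James or of DRAC itself; the caller passes in the current time so that the 30-minute window is easy to reason about.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not an existing James class: remembers which IP
// addresses recently authenticated over POP3/IMAP so that the SMTP server
// can accept mail from them for a limited time.
class RecentAuthRegistry {
    private final Map<String, Instant> expiryByIp = new ConcurrentHashMap<>();
    private final Duration window;

    RecentAuthRegistry(Duration window) {
        this.window = window;
    }

    // Called by the POP3 (or IMAP) handler after a successful login.
    void recordAuthentication(String ipAddress, Instant now) {
        expiryByIp.put(ipAddress, now.plus(window));
    }

    // Called by the SMTP handler before accepting mail from this address.
    boolean mayRelay(String ipAddress, Instant now) {
        Instant expiry = expiryByIp.get(ipAddress);
        if (expiry == null) {
            return false;
        }
        if (now.isBefore(expiry)) {
            return true;
        }
        // Lazily drop the stale entry (only if nobody refreshed it meanwhile).
        expiryByIp.remove(ipAddress, expiry);
        return false;
    }
}
```

A POP3 handler would call recordAuthentication(ip, Instant.now()) after login, and the SMTP handler would call mayRelay(ip, Instant.now()) before accepting a message for relay.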

  • Mail and User attributes

Right now the Mail object has a String error message, which is something of a hack, and there's no way to attach extra objects to a Mail object. Attributes would allow one mailet to do some checking and set a flag on a message, without having to add new headers to the message. I prefer to make these attributes rather than properties, and we could support any serializable object. [see JamesMailetV3]
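To make the proposal concrete, here is a minimal sketch of what an attribute holder could look like. The class name MailAttributes and its method names are assumptions for illustration only; the actual API is whatever JamesMailetV3 ends up defining.

```java
import java.io.Serializable;
import java.util.HashMap;

// Hypothetical attribute holder that a Mail object could carry alongside
// the message. Values are restricted to Serializable so that the whole
// set can be persisted with the Mail.
class MailAttributes implements Serializable {
    private static final long serialVersionUID = 1L;

    private final HashMap<String, Serializable> values = new HashMap<>();

    // Returns the previous value for this name, or null if there was none.
    Serializable setAttribute(String name, Serializable value) {
        return values.put(name, value);
    }

    Serializable getAttribute(String name) {
        return values.get(name);
    }

    boolean hasAttribute(String name) {
        return values.containsKey(name);
    }

    Serializable removeAttribute(String name) {
        return values.remove(name);
    }
}
```

With something like this, a checking mailet could call setAttribute("spam.suspect", Boolean.TRUE) and a later matcher could test hasAttribute("spam.suspect"), with no header added to the message itself. The attribute name here is just an example.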

  • Message repository directory structure

Ok, this is my big idea for the week. This does not relate to changing the API of repositories, but instead changing the way we reference a given repository.

Currently, each repository is defined with a URL metaphor, e.g., db://maildb/inboxes or file:/var/mail/inboxes. While workable, it makes it somewhat ornery to support multiple formats (we have file, db, and dbfile, and would hope to have at least mbox, maildir, remote IMAP folders, and probably others). It also makes it more difficult to use tools to copy/move messages between these repository instances because you have to remember these URLs and use the same ones already defined in the configuration file. Finally, it makes it very difficult to pass configuration settings to a given repository implementation. For instance, it would be nice to configure a specific dbfile implementation to have different thresholds before the message body is stored in the file system (instead of just in the DB). Doing this kind of configuration by appending query string parameters to the URL quickly gets out of hand.

What I propose is a virtual message repository structure that uses a *mount* metaphor. For example, you could mount the db repository implementation to /inboxes and the file repository implementation to /spool. Or instead maybe you mount /inboxes/lokitech.com to the db repository implementation and /inboxes/otherdomain.com to the maildir repository implementation.

Admin tools become much easier to use, as you can simply read messages in /spam and decide whether to delete them or move them back into /spool, without having to worry about underlying implementations and URLs. As a more advanced example, you could at run-time mount /tmp/remote1 to a remote IMAP account, move all messages from /tmp/remote1 to /inboxes/sergek, and then unmount /tmp/remote1. This would transfer all my messages (and folder structure) from that remote server into James's store (and of course it doesn't matter whether that's maildir or db or whatever).

This allows us to more easily use the repository abstraction that we've defined (or that we could revise). Our repository architecture already supports child repositories, so it's well understood that you can start with file:/var/mail/inboxes, and if you asked for a child named "sergek", you would get file:/var/mail/inboxes/sergek. This translates very nicely to the mount metaphor and in fact makes it that much easier to do.

While this is a big conceptual and configuration change, it does not impact the existing code very much. The repository implementations can potentially work without any change because we're only changing how they are initially looked up. Existing mailets/matchers would need minimal change since they would be configured to ask for /inboxes rather than file:/var/mail/inboxes (we might change what classes they use to do the lookups, but I would hope that's the limit of the changes they would need). The code to do the mounting would be new and relatively separate, so it should not be overly difficult.
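The lookup side of the mount metaphor could be sketched roughly as follows. This is only an illustration under simplifying assumptions: MountTable and Resolved are hypothetical names, and a plain String stands in for whatever object would represent a repository implementation. A path resolves to the longest mount point that is a prefix of it on a path-segment boundary, and the remainder maps onto the existing child-repository notion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mount table: maps virtual paths such as /inboxes or
// /inboxes/otherdomain.com onto repository backends.
class MountTable {
    static final class Resolved {
        final String backend;      // what was mounted (stand-in for a repository)
        final String relativePath; // remainder below the mount point, "" if none

        Resolved(String backend, String relativePath) {
            this.backend = backend;
            this.relativePath = relativePath;
        }
    }

    private final Map<String, String> mounts = new HashMap<>();

    void mount(String mountPoint, String backend) {
        mounts.put(normalize(mountPoint), backend);
    }

    boolean unmount(String mountPoint) {
        return mounts.remove(normalize(mountPoint)) != null;
    }

    // Walks from the full path up towards the root, one segment at a time,
    // and returns the first (that is, the longest) mount point found.
    Resolved resolve(String path) {
        String full = normalize(path);
        String candidate = full;
        while (true) {
            String backend = mounts.get(candidate);
            if (backend != null) {
                return new Resolved(backend, full.substring(candidate.length()));
            }
            if (candidate.isEmpty()) {
                return null; // nothing mounted anywhere above this path
            }
            candidate = candidate.substring(0, candidate.lastIndexOf('/'));
        }
    }

    // Ensures a leading slash and strips trailing ones; "/" becomes "".
    private static String normalize(String path) {
        String p = path.startsWith("/") ? path : "/" + path;
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }
}
```

With /inboxes mounted to a db backend and /inboxes/otherdomain.com mounted to a maildir backend, resolving /inboxes/otherdomain.com/bob would pick the maildir backend with relative path /bob, while /inboxes/lokitech.com/sergek would fall through to the db backend.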

[I agree with the premise of why, but I'm not sure about the how. Seems overly complex, but maybe I'm reading something into it that isn't there.

If we're going to have User attributes, then why not associate a repository directly with the user? This would provide flexibility, and given the presence of a tool for copying between repositories, seems to provide a nice admin capability. And an admin can switch repository types without accidentally screwing up existing users, OR migrate users to another repository type.

What do you think of handling mail repositories that way?]

  • Use internal mail attributes to record state, and use a mailet to take specified mail attributes and generate X-headers as configured.

[see JamesMailetV3]
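A minimal sketch of the mapping step such a mailet would perform, using plain maps rather than any real Mail or MimeMessage API. AttributeHeaderMapper and the attribute and header names in the usage note are invented for illustration.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical core of an "attributes to X-headers" mailet: given the
// configured attribute-to-header mapping, compute which headers to add.
class AttributeHeaderMapper {
    // attribute name -> header name, as it might come from mailet configuration
    private final Map<String, String> headerByAttribute;

    AttributeHeaderMapper(Map<String, String> headerByAttribute) {
        this.headerByAttribute = new LinkedHashMap<>(headerByAttribute);
    }

    // Returns header name -> header value for every configured attribute
    // that is actually present on the mail; unconfigured attributes are ignored.
    Map<String, String> headersFor(Map<String, ? extends Serializable> attributes) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : headerByAttribute.entrySet()) {
            Serializable value = attributes.get(entry.getKey());
            if (value != null) {
                headers.put(entry.getValue(), String.valueOf(value));
            }
        }
        return headers;
    }
}
```

For example, a configuration mapping spam.score to X-Spam-Score would turn an internal attribute spam.score=7 into the header X-Spam-Score: 7, while internal attributes that are not configured never leave the server.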

Spooler Changes

One thing to consider is that the spool doesn't have to be a store, it just has to use a store. This introduces a number of possibilities:

  • We don't store Mail objects in message stores, just MimeMessages

    • This allows a wider range of message stores. For example, mbox and maildir only store MimeMessage, not Mail objects, so a) you can't really use them for spools and b) you're in a similar situation with fetchpop3 if you try to move it into a spool.

    • The spool can be persistent; that is a separate issue.
  • We can use the JavaMail API for folders.

    • Then as a rule, we can store NNTP, IMAP, POP3, everything into these folders.
  • We can store the MimeMessage once, and only move the Mail objects around (if at all) before they are moved to the final message store.

  • By separating the spool and store abstractions, we can use something like a t-space for the spooler. The object moving through the spooler could be just the Mail object, which would have a relationship to a stored message.
    • So long as the stored message is available to all t-space clients, we're golden. IOW, if we have a distributed t-space, we'd better have a distributed message store, or that mail object has to have some proxy capability.
    • It increases performance, especially with large objects. Question: How do you move X amount of data in Y time, when that would be quite challenging? Answer: You don't move it. You store it once, and just move the reference (Mail) object.

  • By separating the Mail object from the Message, we better enable the necessary ability to support spool processing for other types of messages.
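The "store once, move the reference" idea above could look roughly like this. Everything here is a sketch under assumptions: SpoolSketch, MessageStore, MailRef and Spool are hypothetical names, the store is an in-memory map, and the spool is a set of in-memory queues rather than a t-space.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class SpoolSketch {
    // Holds each raw message exactly once, keyed by an opaque id.
    static final class MessageStore {
        private final Map<String, byte[]> messages = new HashMap<>();
        private int nextId = 0;

        String store(byte[] rawMessage) {
            String key = "msg-" + nextId++;
            messages.put(key, rawMessage);
            return key;
        }

        byte[] load(String key) {
            return messages.get(key);
        }
    }

    // The lightweight object that actually travels through the spool.
    static final class MailRef {
        final String messageKey; // reference to the stored message
        final String sender;
        final String recipient;

        MailRef(String messageKey, String sender, String recipient) {
            this.messageKey = messageKey;
            this.sender = sender;
            this.recipient = recipient;
        }
    }

    // One queue per processor; only MailRef objects move between queues.
    static final class Spool {
        private final Map<String, Deque<MailRef>> queues = new HashMap<>();

        void enqueue(String processor, MailRef mail) {
            queues.computeIfAbsent(processor, k -> new ArrayDeque<>()).addLast(mail);
        }

        // Returns the next mail for this processor, or null if none is waiting.
        MailRef take(String processor) {
            Deque<MailRef> queue = queues.get(processor);
            return (queue == null) ? null : queue.pollFirst();
        }
    }
}
```

Moving a mail from one processor to the next is then a take followed by an enqueue of the same small MailRef; the message bytes stay where they were stored.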

Protocol-based "fast-fail"

All would be configuration options:

  • If a matcher discovers spam from a currently active connection, then we could kill that connection (with a proper 5xx code) ASAP.
    • Caveat: This is not to say that we want to add matcher/mailets within the SMTP session. That is hard to configure, and we believe that we could hit 90% of the use cases with half a dozen fast-fail configuration options. This mechanism is simply for outright rejection, and is predicated upon an IpFilter mechanism, and a mailet being able to educate the IpFilter mechanism.

  • Limit parallel connections from a single IP address
  • The consensus is that the right track is to add various fast-fail cases. Sendmail allows you to limit the number of messages that can come from a given IP address at once. Peter Goldstein listed other approaches (cap # of invalid commands, cap # of rsets, etc.)
  • A suggestion was made to reject all non-local addresses; however, enabling SMTP AUTH appears to make simple rejection of non-local domains redundant.
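As an illustration of the "limit parallel connections from a single IP address" option, here is a minimal counter-based sketch. ConnectionLimiter and its method names are hypothetical; how the SMTP handler responds when tryOpen returns false is left to the handler.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-IP limiter for concurrently open connections.
class ConnectionLimiter {
    private final int maxPerIp;
    private final ConcurrentHashMap<String, Integer> openByIp = new ConcurrentHashMap<>();

    ConnectionLimiter(int maxPerIp) {
        this.maxPerIp = maxPerIp;
    }

    // Returns true and counts the connection if the address is under its
    // limit; returns false (and counts nothing) otherwise.
    boolean tryOpen(String ipAddress) {
        boolean[] accepted = {false};
        openByIp.compute(ipAddress, (ip, count) -> {
            int current = (count == null) ? 0 : count;
            if (current >= maxPerIp) {
                return count; // leave the mapping unchanged
            }
            accepted[0] = true;
            return current + 1;
        });
        return accepted[0];
    }

    // Must be called once for every connection that tryOpen accepted.
    void close(String ipAddress) {
        openByIp.computeIfPresent(ipAddress, (ip, count) -> {
            if (count <= 1) {
                return null; // removes the entry
            }
            return count - 1;
        });
    }
}
```

The connection acceptor would call tryOpen when a client connects and close when an accepted connection ends, so the limit is just one more configuration option.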

Planned fast-fail types:

  • per-IP address
  • per-session restrictions (# command, # total bytes)

  • per-message restrictions (# recipients)

Note: RFC2821/4.5.1 mandates the ability to deliver e-mail to the local postmaster. One option is to restrict a "rejected" IP from being able to do anything other than send a single, limited-length e-mail to postmaster.

Currently implemented fast fail allows rejecting relay based upon sender IP. One thought is to add a few more calls that provide a fast fail framework, e.g.,

  - acceptRCPT_TO(), returns null or an error [code, message] 

The acceptance logic would be opaque to the handler. The handler would simply implement the protocol response.
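A rough sketch of how that split could look. The names FastFailSketch, Rejection, RcptPolicy and handleRcptTo are assumptions, as are the parameters passed to the policy; only the shape (the policy returns null or a [code, message] pair, and the handler turns that into a reply line) comes from the description above.

```java
class FastFailSketch {
    // The "[code, message]" error returned by a policy.
    static final class Rejection {
        final int code;
        final String message;

        Rejection(int code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    // Acceptance logic, opaque to the handler: null means "accept".
    interface RcptPolicy {
        Rejection acceptRcptTo(String remoteIp, String recipient, int recipientsSoFar);
    }

    // Handler side: knows how to phrase the protocol response, but not
    // why a recipient was accepted or rejected.
    static String handleRcptTo(RcptPolicy policy, String remoteIp,
                               String recipient, int recipientsSoFar) {
        Rejection rejection = policy.acceptRcptTo(remoteIp, recipient, recipientsSoFar);
        if (rejection == null) {
            return "250 Recipient <" + recipient + "> OK";
        }
        return rejection.code + " " + rejection.message;
    }
}
```

A per-message recipient cap, for instance, could be supplied as a lambda such as (ip, rcpt, n) -> n >= 100 ? new FastFailSketch.Rejection(452, "Too many recipients") : null, where the limit of 100 is only an example value.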

Vincenzo's list:

  1) Extend AttachmentFileNameIs to optionally recurse one level in zip attachments.
  2) Code an AttachmentFileNameIsRegex matcher (also recursing zips).
  3) Polish up and finally commit my Bayesian stuff.
  4) Polish up and finally commit my antivirus matcher (perhaps having it become a mailet setting attributes).
  5) Polish up and finally commit my AddServerSignature mailet.
  6) Start designing and coding a set of functionalities oriented towards server signing and checking, timestamping, logging, whitelisting and more generally "certified electronic mail" (with the additional goal of supporting new legislation just enacted in my country, which could become of interest to others, but first of all I'm thinking of a set of common stuff that can be customised).

Jason's list:

  0) Debug the mbox random access file class, update the mbox file handler and commit them both
  1) Sort out user attributes. This is done for file user stores, but the db stuff needs to be re-worked as I'm not happy with it
  2) Get the current mailbox system to support folders
  3) Get the IMAP server to work with the new folder support
  4) Tie in the current IMAP2 proposal into the main source tree
  5) Get my twins to sleep through the night (easy stuff first :))

Other things I'd like to look at are (but probably won't):

  a) Logging needs to be consistent so I can track mails across the system
  b) JMX support. We still need to be able to authenticate users in JMX using the James user dbs. Until this is done JMX is a security hazard
  c) A scalable mail queue system. Dumping 150k files into one flat folder is a bad idea on any OS. A folder/file system a la Qmail might be fun
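One possible way to avoid a single flat folder is to spread queue files over a fixed number of subdirectories chosen by a hash of the message key. The sketch below is an assumption about what such a layout could look like, not a description of Qmail's actual on-disk format; HashedSpoolLayout and pathFor are hypothetical names.

```java
import java.nio.file.Path;

// Hypothetical layout helper: root/<bucket>/<messageKey>
class HashedSpoolLayout {
    // Picks one of 'buckets' subdirectories deterministically from the key,
    // so each directory holds roughly 1/buckets of the queue files.
    static Path pathFor(Path root, String messageKey, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        // floorMod keeps the result in [0, buckets) even for negative hash codes.
        int bucket = Math.floorMod(messageKey.hashCode(), buckets);
        return root.resolve(Integer.toString(bucket)).resolve(messageKey);
    }
}
```

With, say, 23 buckets, 150k queue files would average roughly 6,500 per directory instead of all landing in one; the bucket count is only an example and would be a tuning parameter.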

Soren's list:

  • allowing RemoteDelivery to use SMTP-SSL (port 465)

  • support for STARTTLS in SMTPHandler
  • handling source routes by stripping them
  • RemoteDelivery uses HELO not EHLO, due to a bug in JavaMail back in 2001, so I believe it is time to revisit that.
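For the source-route item, a minimal sketch of the stripping step: a path such as <@hosta.int,@jkl.org:userc@d.bar.org> is reduced to <userc@d.bar.org>. SourceRoutes.strip is a hypothetical helper, and it deliberately ignores corner cases such as address literals containing colons inside the route.

```java
// Hypothetical helper for dropping an SMTP source route from a path.
class SourceRoutes {
    // If the path starts with a route ("@host,@host:"), removes everything
    // up to and including the first colon; otherwise returns it unchanged.
    // Angle brackets, if present, are preserved.
    static String strip(String path) {
        String trimmed = path.trim();
        boolean bracketed = trimmed.startsWith("<") && trimmed.endsWith(">");
        String inner = bracketed
                ? trimmed.substring(1, trimmed.length() - 1)
                : trimmed;
        if (inner.startsWith("@")) {
            int colon = inner.indexOf(':');
            if (colon >= 0) {
                inner = inner.substring(colon + 1);
            }
        }
        return bracketed ? "<" + inner + ">" : inner;
    }
}
```

The SMTP handler could apply this to the argument of MAIL FROM and RCPT TO before any further address parsing.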

JamesV3/Plans (last edited 2009-09-20 22:58:40 by localhost)