The mail receive tool automatically picks up the contents of incoming email messages as metadata. Specifically, for each incoming message the mail receive tool creates a metadata dataset that contains the relevant components of the message, such as the sender, to, cc, and reply-to addresses and the complete email body text. This dataset is then associated with the job ticket for all files delivered by the message under the standard dataset name "Email".
The email message dataset uses the XML data model with a simple schema without namespaces.
The XML document element name is "email". It contains an element for each message component as described in the table below; each element contains the text of the corresponding message component. If a message component is not present or contains only white space, the corresponding element is omitted.
Leading and trailing white space is removed (except for the body text, where it may be significant). Otherwise, no parsing or reformatting is performed. For example, a list of email addresses (including its separators) is stored exactly as it was provided in the incoming message.
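Because address lists are stored unparsed, a script that consumes the dataset has to split them itself. The following is a minimal sketch, assuming the stored string follows the usual comma-separated email header syntax; the helper name `split_addresses` and the sample value are hypothetical and not part of the tool.

```python
from email.utils import getaddresses


def split_addresses(raw: str) -> list[str]:
    """Return the bare addresses found in an unparsed address-list string."""
    # getaddresses() takes a list of header values and returns
    # (display name, address) pairs; entries without an address are dropped.
    return [addr for _name, addr in getaddresses([raw]) if addr]


# Hypothetical value, as it might be stored in a "to" or "cc" element.
raw_to = "Alice Example <alice@example.com>, bob@example.com"
print(split_addresses(raw_to))
```

Using the standard library parser rather than splitting on commas avoids breaking on display names that themselves contain commas inside quotes.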
Element name | Message component |
---|---|
message-id | An identifier for the message (assigned by the host which generated the message) |
subject | The subject line of the message |
date | The date and time when the message was sent (formatted as it appeared in the message header) |
from | The email address(es) of the author(s) of the message |
sender | The single email address of the sender of the message; if there is a single author the sender is identical to the author, otherwise the sender should be one of the authors |
reply-to | The email addresses to which a response to this message should be sent |
to | The email addresses of the primary recipients of this message |
cc | The email addresses of the secondary recipients of this message |
body | A plain-text rendition (without markup) of the body text of this message |
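
To illustrate the structure described above, the sketch below reads an "Email" dataset into a dictionary. It assumes the dataset has already been obtained as an XML string. The sample values, the `SAMPLE` constant, and the function name `read_email_dataset` are invented for illustration; only the element names come from the table.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset content: element names are taken from the table,
# all values are made up. There is no <cc> element because the sample
# message has no secondary recipients.
SAMPLE = """<email>
  <message-id>&lt;1234@mail.example.com&gt;</message-id>
  <subject>Artwork for approval</subject>
  <from>alice@example.com</from>
  <to>prepress@example.com, bob@example.com</to>
  <body>Please find the files attached.
</body>
</email>"""


def read_email_dataset(xml_text: str) -> dict[str, str]:
    """Map each message component element to its text content."""
    root = ET.fromstring(xml_text)
    if root.tag != "email":
        raise ValueError(f"unexpected document element: {root.tag}")
    # The schema has no namespaces, so child tags can be used as keys directly.
    return {child.tag: child.text or "" for child in root}


fields = read_email_dataset(SAMPLE)
print(fields["subject"])
print(fields.get("cc"))  # absent components have no element, so this is None
```

Since elements for absent or blank components are left out of the dataset, lookups should tolerate missing keys (here via `dict.get`) rather than assume every element exists.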