Camel.FolderSummary

The CamelFolderSummary manages lists of information about messages. It is a class used by implementors to create and store the information accessed by client code, and also to store any relevant information they might need on each message.

All information is based on an Evolution/#Camel.MessageInfo structure, which is a sort of lightweight object attached to a specific CamelFolderSummary instance, which also provides virtual method accessors.

The CamelFolderSummary class provides almost all of the required functionality itself; it only needs to be subclassed to provide backend-specific information.

Base class

 struct _CamelFolderSummary {
        CamelObject parent;
 
        struct _CamelFolderSummaryPrivate *priv;
 
        guint32 version;
        guint32 flags;
        guint32 nextuid;
        time_t time;
        guint32 saved_count;
        guint32 unread_count;
        guint32 deleted_count;
        guint32 junk_count;
 
        guint32 message_info_size;
        guint32 content_info_size;
 
        struct _EMemChunk *message_info_chunks;
        struct _EMemChunk *content_info_chunks;
 
        char *summary_path;
        gboolean build_content;
 
        GPtrArray *messages;
        GHashTable *messages_uid;
 
        struct _CamelFolder *folder;
 };
 
 CamelFolderSummary *camel_folder_summary_new(struct _CamelFolder *folder);

Most of this stuff is just used internally or can be used by implementors to access required information.

Summary storage

The summary is designed to be stored on disk, so that values which may be expensive to compute can be calculated once and then retrieved more efficiently later on. Note that it needn't be stored on disk, as is the case with CamelVeeSummary.

 void camel_folder_summary_set_filename(CamelFolderSummary *summary, const char *filename);
 
 int camel_folder_summary_load(CamelFolderSummary *summary);
 int camel_folder_summary_save(CamelFolderSummary *summary);
 
 int camel_folder_summary_header_load(CamelFolderSummary *summary);

You need to set the filename to a non-NULL value before loading or saving the summary. The header can be loaded separately, which will just initialise any values from the header into the CamelFolderSummary object. This can be used to find out information about a folder without having to load all of the message information, which could take a significant amount of time.

 void camel_folder_summary_touch(CamelFolderSummary *summary);

When a messageinfo changes, the summary may not know about it. So if an implementation makes its own changes directly to the CamelMessageInfo, it needs to notify the summary using the touch command. This will mark the summary dirty, and force a physical write next time save is invoked.
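The dirty-marking behaviour described above can be sketched in isolation. This is a minimal stand-in, not Camel's code: the `Summary` type, `summary_touch`, `summary_save` and the `disk_writes` counter are all hypothetical names used only for illustration, and the sketch assumes that a save with nothing changed skips the write.

```c
/* Hypothetical stand-in for the summary's dirty tracking. */
typedef struct {
    int dirty;        /* set by touch, cleared by a successful save */
    int disk_writes;  /* counts physical writes, for illustration only */
} Summary;

/* Mark the summary as changed so the next save really writes. */
void summary_touch(Summary *s)
{
    s->dirty = 1;
}

/* Returns 0 on success; skips the write when nothing has changed
 * (an assumption of this sketch). */
int summary_save(Summary *s)
{
    if (!s->dirty)
        return 0;
    s->disk_writes++;   /* a real implementation would write the file here */
    s->dirty = 0;
    return 0;
}
```

In this sketch, saving twice after a single touch performs only one write, which is why an implementation that edits a CamelMessageInfo behind the summary's back has to call touch itself.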

File format

The file basically consists of a versioned header, which may contain multiple levels, one for each sub-class that wishes to add extra information; followed by a count, and then count records.

The Evolution/Camel.Misc#Camel.FileUtils binary encoding mechanisms are used to encode the records in the file.

See the source code for examples, particularly of how versioning and subclassing work (I'm not going to explain in depth because the mechanism is radically different in the disksummary branch).

Options

A couple of options exist for controlling the summariser that is run when new_from_parser or new_from_message is called.

 void camel_folder_summary_set_index(CamelFolderSummary *summary, CamelIndex *index);

If the index is non-NULL, then the summariser will also generate an Evolution/Camel.Index entry for a name based on the UID of the message. Only text/* parts are processed (no snooping to verify or calculate the type is done). Encrypted parts are not indexed (on purpose, for security, but also because indexing them may require user interaction). text/html parts have their tags stripped.

 void camel_folder_summary_set_build_content(CamelFolderSummary *summary, gboolean state);

If build_content is true, then the summariser will also create CamelMessageContentInfo data in the base CamelMessageInfoBase structure. This must only be called in the initialisation code for the summary based on the implementation, as it also affects whether this data is loaded or saved as well, which cannot be detected from the file format.

Camel.MessageInfo

The CamelMessageInfo contains a whole swathe of information about an RFC822 message and message in a folder. As well as the basic read-only envelope information, it includes read-write state information, and can also store information on the full mime tree of the message.

 struct _CamelMessageInfo {
        CamelFolderSummary *summary;
 
        guint32 refcount;
        char *uid;
 };

The basic CamelMessageInfo object contains no information; it is up to implementations to store the information however they see fit, and to override the accessor methods appropriately. The uid must be a g_malloc'd string.

However, and this is a little bit tricky, if an implementation doesn't override the base class accessors, it will implement a standard CamelMessageInfo structure with all of the normal content in it. This also allows implementations to inherit from it if they wish.

 struct _CamelMessageInfoBase {
        CamelFolderSummary *summary;
 
        guint32 refcount;
        char *uid;
 
        const char *subject;
        const char *from;
        const char *to;
        const char *cc;
        const char *mlist;
 
        guint32 flags;
        guint32 size;
 
        time_t date_sent;
        time_t date_received;
 
        CamelSummaryMessageID message_id;
        CamelSummaryReferences *references;
 
        struct _CamelFlag *user_flags;
        struct _CamelTag *user_tags;
        CamelMessageContentInfo *content;
 };

This strange setup is so that methods may operate on a 'class-less' CamelMessageInfo, whose summary is NULL, and still share the code with implementations that don't want to have to implement every function.

If you are subclassing CamelMessageInfoBase, then all of the string fields apart from uid must be allocated using the camel_pstring functions.

Creating

These interfaces are used both by implementation and client code. Client code may need to create a 'dummy' CamelMessageInfo for operations like Evolution/Camel.Folder.append_message(), and client implementations will need to build CamelMessageInfo structures as appropriate.

These functions (and only these functions) may take a NULL summary pointer, in which case an 'anonymous' CamelMessageInfo will be created. It can be passed around to various other

 void *camel_message_info_new(CamelFolderSummary *summary);
 void camel_message_info_ref(void *info);
 CamelMessageInfo *camel_message_info_new_from_header(CamelFolderSummary *summary, struct _camel_header_raw *header);
 void camel_message_info_free(void *info);
 void *camel_message_info_clone(const void *info);

If you need to extend the CamelMessageInfo or CamelMessageInfoBase, then you also need to override the various virtual methods and chain them back to the parent class as appropriate.

Camel.MessageContentInfo

This can be used to store information about the mime structure of a message, without having to instantiate the message itself. It has been used for example by the IMAP implementation to store the result of a BODY fetch, so that individual message parts can be retrieved directly.

 struct _CamelMessageContentInfo {
        struct _CamelMessageContentInfo *next;
        
        struct _CamelMessageContentInfo *childs;
        struct _CamelMessageContentInfo *parent;
        
        CamelContentType *type;
        char *id;
        char *description;
        char *encoding;
        guint32 size;
 };

In most cases however, it just isn't worth the effort.
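For readers who do want to use the structure, the three link fields form an ordinary tree. The following is a minimal sketch, assuming that next links siblings and childs points at the first child; `ContentInfo` is a cut-down stand-in for CamelMessageContentInfo and `count_parts` is a hypothetical helper, not part of Camel.

```c
#include <stddef.h>

/* Stand-in carrying only the link fields of CamelMessageContentInfo. */
typedef struct _ContentInfo {
    struct _ContentInfo *next;    /* next sibling (assumed) */
    struct _ContentInfo *childs;  /* first child (assumed) */
    struct _ContentInfo *parent;
} ContentInfo;

/* Count this part, its later siblings, and all of their descendants. */
int count_parts(const ContentInfo *ci)
{
    int n = 0;

    for (; ci != NULL; ci = ci->next)
        n += 1 + count_parts(ci->childs);
    return n;
}
```

Calling count_parts on the root of a message with two top-level parts, one of which has a single child, would give 4 (the root, the two parts, and the child).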

Camel.Flag

CamelFlags are used to store simple set/unset values based on arbitrary string names. An Evolution/Camel.Folder will store these permanently if its permanent_flags contains CAMEL_MESSAGE_USER.

 typedef struct _CamelFlag {
        struct _CamelFlag *next;
        char name[1];
 } CamelFlag;
 
 gboolean camel_flag_get(CamelFlag **list, const char *name);
 gboolean camel_flag_set(CamelFlag **list, const char *name, gboolean state);
 gboolean camel_flag_list_copy(CamelFlag **to, CamelFlag **from);
 int camel_flag_list_size(CamelFlag **list);
 void camel_flag_list_free(CamelFlag **list);

The above are utility functions used by implementations if they need to access or create user flags; normally client code will just use the camel_message_info_ accessors.
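The `char name[1]` member means the name is stored inline, in the same allocation as the list node. A minimal sketch of that idiom follows; `Flag`, `flag_get` and `flag_set` are stand-ins written for this example rather than the Camel functions, and the 'returns 1 if the list changed' convention is an assumption of the sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in with the same layout as CamelFlag. */
typedef struct _Flag {
    struct _Flag *next;
    char name[1];   /* really strlen(name) + 1 bytes, allocated with the node */
} Flag;

/* Returns 1 if `name` is present in the list, 0 otherwise. */
int flag_get(Flag **list, const char *name)
{
    for (Flag *f = *list; f != NULL; f = f->next)
        if (strcmp(f->name, name) == 0)
            return 1;
    return 0;
}

/* Adds (state != 0) or removes (state == 0) `name`.
 * Returns 1 if the list changed, 0 otherwise. */
int flag_set(Flag **list, const char *name, int state)
{
    Flag **link = list;

    for (Flag *f = *list; f != NULL; link = &f->next, f = f->next) {
        if (strcmp(f->name, name) == 0) {
            if (state)
                return 0;       /* already set */
            *link = f->next;    /* unlink and free the node */
            free(f);
            return 1;
        }
    }
    if (!state)
        return 0;               /* already unset */

    /* sizeof(Flag) already includes one byte of name, which covers the NUL. */
    size_t len = strlen(name);
    Flag *node = malloc(sizeof(Flag) + len);
    if (node == NULL)
        return 0;
    memcpy(node->name, name, len + 1);
    node->next = NULL;
    *link = node;               /* append at the end of the list */
    return 1;
}
```

Storing the name inline saves one allocation and one pointer per flag, which matters when every message in a large folder may carry several of them.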

Camel.Tag

CamelTags are similar to CamelFlags, although they contain arbitrary name-value pairs instead of boolean values. Note that almost no implementation stores these tags anywhere permanently beyond the summary file.

 typedef struct _CamelTag {
        struct _CamelTag *next;
        char *value;
        char name[1];
 } CamelTag;
 
 const char *camel_tag_get(CamelTag **list, const char *name);
 gboolean camel_tag_set(CamelTag **list, const char *name, const char *value);
 gboolean camel_tag_list_copy(CamelTag **to, CamelTag **from);
 int camel_tag_list_size(CamelTag **list);
 void camel_tag_list_free(CamelTag **list);

Again, these functions are for implementations that want to do their own thing, otherwise the client code will use the accessor methods.

Note that currently in Evolution, labels are implemented using CamelTag, although they should be using CamelFlag (or even system flags), since that would allow them to interoperate with other mail clients properly.

System Flags

Apart from the arbitrary user_flags, there are also specific system flags which have more defined semantics. Some of these will map in an interoperable way directly to the backend storage format, others may not.

 typedef enum _CamelMessageFlags {
        CAMEL_MESSAGE_ANSWERED = 1<<0,
        CAMEL_MESSAGE_DELETED = 1<<1,
        CAMEL_MESSAGE_DRAFT = 1<<2,
        CAMEL_MESSAGE_FLAGGED = 1<<3,
        CAMEL_MESSAGE_SEEN = 1<<4,
 
        CAMEL_MESSAGE_ATTACHMENTS = 1<<5,
        CAMEL_MESSAGE_ANSWERED_ALL = 1<<6,
        CAMEL_MESSAGE_JUNK = 1<<7,
        CAMEL_MESSAGE_SECURE = 1<<8,
 
        CAMEL_MESSAGE_FOLDER_FLAGGED = 1<<16,
 
        CAMEL_MESSAGE_JUNK_LEARN = 1<<30,
        CAMEL_MESSAGE_USER = 1<<31
 } CamelMessageFlags;
 
 #define CAMEL_MESSAGE_SYSTEM_MASK (0xffff << 16)

First, the basic system flags:

; CAMEL_MESSAGE_ANSWERED : The message has been replied to.
; CAMEL_MESSAGE_DELETED : The message has been deleted. It will be removed at the next expunge.
; CAMEL_MESSAGE_DRAFT : This is a draft message.
; CAMEL_MESSAGE_FLAGGED : This message is 'flagged'. In Evolution this means the message has a ! next to it, or is 'important'.
; CAMEL_MESSAGE_SEEN : This message has been read.

Then there are some other Camel-specific flags which are just used to spruce up the UI a little bit.

; CAMEL_MESSAGE_ATTACHMENTS : The message probably has attachments. This is a little hard to gauge, as the various multipart/ and encrypted types make it difficult if not impossible to determine automatically whether the message has any. It is a hint only.
; CAMEL_MESSAGE_ANSWERED_ALL : The message has been replied to all, rather than just the sender.
; CAMEL_MESSAGE_JUNK : The message is junk.
; CAMEL_MESSAGE_SECURE : The message is signed or encrypted. This is only partially implemented.
; CAMEL_MESSAGE_FOLDER_FLAGGED : This is up to the implementation to use. Actually it is set whenever the CamelMessageInfo changes. Implementations can use it to determine if anything needs to be done with the CamelMessageInfo, if they are operating in a batch-update mode.
; CAMEL_MESSAGE_JUNK_LEARN : This is a special flag to be used by the client. If set, it means that the CAMEL_MESSAGE_JUNK flag should be used to determine if the message is 'learned' or 'unlearned' through the junk processor. Once the junk has been learned or unlearned then this flag is automatically cleared.
; CAMEL_MESSAGE_USER : This is never set on a CamelMessageInfo, but is used by Evolution/Camel.Folder to indicate through permanent_flags that user Evolution/#Camel.Flags are supported. This is a waste of a bit I guess ...

Message IDs

Because message IDs are arbitrarily long strings, aren't much use in client code, and are inefficient to process, the message IDs in CamelMessageInfo are not stored as strings. They are stored as the first 64 bits of an MD5 hash of the message ID. This simplifies the Evolution/Camel.FolderThread implementation a bit, and there isn't really much use for these otherwise.

 typedef struct _CamelSummaryMessageID {
        union {
                guint64 id;
                unsigned char hash[8];
                struct {
                        guint32 hi;
                        guint32 lo;
                } part;
        } id;
 } CamelSummaryMessageID;

The references information, used to build the conversation thread, is just stored as an array of these. References are listed from parent to root of the conversation.

 typedef struct _CamelSummaryReferences {
        int size;
        CamelSummaryMessageID references[1];
 } CamelSummaryReferences;

Note that because a hash is used, there is a very small but non-zero likelihood that different message IDs will result in the same hash code. Any algorithm that depends on absolute matching would probably have to load the messages in question to verify that the message IDs are identical - but it would have to compare content anyway, since message IDs aren't guaranteed to be unique.
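The truncation can be illustrated without an MD5 implementation. In this sketch the 16-byte digest is assumed to have been computed elsewhere; `MessageID` mirrors the layout of CamelSummaryMessageID, while `message_id_from_digest` and `message_id_equal` are hypothetical helpers that do not exist in Camel.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in with the same layout as CamelSummaryMessageID. */
typedef struct {
    union {
        uint64_t id;
        unsigned char hash[8];
        struct {
            uint32_t hi;
            uint32_t lo;
        } part;
    } id;
} MessageID;

/* Keep only the first 8 bytes of a 16-byte MD5 digest computed elsewhere. */
void message_id_from_digest(MessageID *mid, const unsigned char digest[16])
{
    memcpy(mid->id.hash, digest, sizeof mid->id.hash);
}

/* Two ids match when their truncated hashes match. A match is not proof
 * that the original message ID strings were identical. */
int message_id_equal(const MessageID *a, const MessageID *b)
{
    return a->id.id == b->id.id;
}
```

Two digests that differ only in their last 8 bytes compare as equal here, which is exactly the kind of false match described above.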

Accessors

To simplify the api and to provide automatic extensibility, most of the basic accessors are done through vectored interfaces:

 const void *camel_message_info_ptr(const CamelMessageInfo *mi, int id);
 guint32 camel_message_info_uint32(const CamelMessageInfo *mi, int id);
 time_t camel_message_info_time(const CamelMessageInfo *mi, int id);

But there are also special accessors for the more complex types:

 gboolean camel_message_info_user_flag(const CamelMessageInfo *mi, const char *id);
 const char *camel_message_info_user_tag(const CamelMessageInfo *mi, const char *id);
 
 gboolean camel_message_info_set_flags(CamelMessageInfo *mi, guint32 mask, guint32 set);
 gboolean camel_message_info_set_user_flag(CamelMessageInfo *mi, const char *id, gboolean state);
 gboolean camel_message_info_set_user_tag(CamelMessageInfo *mi, const char *id, const char *val);

set_flags needs some more explanation since nobody seems to understand how it works. Basically, mask selects which bits are to change: any 0 bits in mask are left unchanged in the message info flags. The bits that mask does select are then copied directly from set. In this way, a single call can set or clear any combination of bits efficiently.

So for example, if you wanted to clear the deleted bit, you would use:

 camel_message_info_set_flags(mi, CAMEL_MESSAGE_DELETED, 0);

To set the deleted bit, you could use:

 camel_message_info_set_flags(mi, CAMEL_MESSAGE_DELETED, CAMEL_MESSAGE_DELETED);

Or, since all the 0 bits in mask are ignored, you could equally use:

 camel_message_info_set_flags(mi, CAMEL_MESSAGE_DELETED, ~0);
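The mask/set rule can be written as a single expression. The following is a minimal sketch of that rule only, not Camel's implementation: `apply_flags` is a hypothetical helper, and the two flag values are copied from the CamelMessageFlags enum in the System Flags section.

```c
#include <stdint.h>

/* Flag values copied from the CamelMessageFlags enum. */
#define MSG_DELETED (1u << 1)
#define MSG_SEEN    (1u << 4)

/* Hypothetical helper: bits outside `mask` keep their old value,
 * bits inside `mask` take their value from `set`. */
uint32_t apply_flags(uint32_t old, uint32_t mask, uint32_t set)
{
    return (old & ~mask) | (set & mask);
}
```

With this rule, `apply_flags(flags, MSG_DELETED | MSG_SEEN, MSG_SEEN)` marks a message as read and undeletes it in one step, leaving every other bit alone.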

Then, there are macros which should be used to access each type. For convenience the incoming pointer is type-cast to the correct type, to avoid messy casts in code.

 #define camel_message_info_uid(mi) ((const char *)((const CamelMessageInfo *)mi)->uid)
 
 #define camel_message_info_subject(mi) ((const char *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_SUBJECT))
 #define camel_message_info_from(mi) ((const char *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_FROM))
 #define camel_message_info_to(mi) ((const char *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_TO))
 #define camel_message_info_cc(mi) ((const char *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_CC))
 #define camel_message_info_mlist(mi) ((const char *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_MLIST))
 
 #define camel_message_info_flags(mi) camel_message_info_uint32((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_FLAGS)
 #define camel_message_info_size(mi) camel_message_info_uint32((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_SIZE)
 
 #define camel_message_info_date_sent(mi) camel_message_info_time((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_DATE_SENT)
 #define camel_message_info_date_received(mi) camel_message_info_time((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_DATE_RECEIVED)
 
 #define camel_message_info_message_id(mi) ((const CamelSummaryMessageID *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_MESSAGE_ID))
 #define camel_message_info_references(mi) ((const CamelSummaryReferences *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_REFERENCES))
 #define camel_message_info_user_flags(mi) ((const CamelFlag *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_USER_FLAGS))
 #define camel_message_info_user_tags(mi) ((const CamelTag *)camel_message_info_ptr((const CamelMessageInfo *)mi, CAMEL_MESSAGE_INFO_USER_TAGS))
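The vectored pattern itself is easy to show on a cut-down structure. This is only a sketch of the idea: `Info`, the `INFO_*` ids, `info_ptr` and `info_uint32` are hypothetical names invented for the example, and the real CAMEL_MESSAGE_INFO_* values are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field ids, standing in for CAMEL_MESSAGE_INFO_*. */
enum { INFO_SUBJECT, INFO_FROM, INFO_FLAGS, INFO_SIZE };

/* Cut-down stand-in for a message info structure. */
typedef struct {
    const char *subject;
    const char *from;
    uint32_t flags;
    uint32_t size;
} Info;

/* One entry point for every pointer-valued field. */
const void *info_ptr(const Info *mi, int id)
{
    switch (id) {
    case INFO_SUBJECT: return mi->subject;
    case INFO_FROM:    return mi->from;
    default:           return NULL;   /* unknown or non-pointer field */
    }
}

/* One entry point for every 32-bit field. */
uint32_t info_uint32(const Info *mi, int id)
{
    switch (id) {
    case INFO_FLAGS: return mi->flags;
    case INFO_SIZE:  return mi->size;
    default:         return 0;        /* unknown or non-integer field */
    }
}

/* Typed convenience wrappers, in the style of the macros above. */
#define info_subject(mi) ((const char *)info_ptr((const Info *)(mi), INFO_SUBJECT))
#define info_flags(mi)   info_uint32((const Info *)(mi), INFO_FLAGS)
```

The benefit of this shape is that a new field only needs a new id and a new macro; the number of virtual methods a subclass has to override stays the same.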

Accessing the summary

There's another bucket of stuff for implementation and client code to use to access the individual CamelMessageInfo items in the summary.

Building

The basic interface is to add a new CamelMessageInfo to the end of the summary. The info supplied MUST be allocated using camel_message_info_new functions from THIS summary. The info must also have a valid and unique UID set before it is added; otherwise a new one may be assigned which is probably not what you want. This rather annoying behaviour is just a plain bad design decision.

 void camel_folder_summary_add(CamelFolderSummary *summary, CamelMessageInfo *info);

But there are also a bunch of helpers to add a summary item from more structured sources. These are a little tricky because of the way they may need to assign UIDs. This makes the implementation tricky - you either need to override these methods and calculate the UID some other way, or override the get_next_uid method. Neither is very good; you'd probably be best off avoiding them - although if you want to use the automatic indexing, then you still need to assign a unique UID as part of the new functions anyway.

 CamelMessageInfo *camel_folder_summary_add_from_header(CamelFolderSummary *summary, struct _camel_header_raw *headers);
 CamelMessageInfo *camel_folder_summary_add_from_parser(CamelFolderSummary *summary, CamelMimeParser *parser);
 CamelMessageInfo *camel_folder_summary_add_from_message(CamelFolderSummary *summary, CamelMimeMessage *message);

And apart from a new blank summary item, you can also create them from the same sources. The UID or other details can then be set, and the item added to the summary.

 CamelMessageInfo *camel_folder_summary_info_new_from_header(CamelFolderSummary *summary, struct _camel_header_raw *headers);
 CamelMessageInfo *camel_folder_summary_info_new_from_parser(CamelFolderSummary *summary, CamelMimeParser *parser);
 CamelMessageInfo *camel_folder_summary_info_new_from_message(CamelFolderSummary *summary, CamelMimeMessage *message);

Again, any CamelMessageContentInfos created by an implementation must be allocated on the summary.

 CamelMessageContentInfo *camel_folder_summary_content_info_new(CamelFolderSummary *summary);
 void camel_folder_summary_content_info_free(CamelFolderSummary *summary, CamelMessageContentInfo *ci);

Then a whole host of functions to remove summary items. remove_range is much more efficient if you need to remove a number of CamelMessageInfos, since the implementation just uses an array to store them.

 void camel_folder_summary_remove(CamelFolderSummary *summary, CamelMessageInfo *info);
 void camel_folder_summary_remove_uid(CamelFolderSummary *summary, const char *uid);
 void camel_folder_summary_remove_index(CamelFolderSummary *summary, int index);
 void camel_folder_summary_remove_range(CamelFolderSummary *summary, int start, int end);
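Why a range removal is cheaper than repeated single removals on an array can be seen in a short sketch. It assumes an inclusive end index and a plain array of pointers; `remove_range` here is a hypothetical stand-alone function, not the Camel one, and it does not free the removed items.

```c
#include <assert.h>
#include <string.h>

/* Remove elements start..end (inclusive) from an array holding `*count`
 * pointers, closing the gap with a single memmove. */
void remove_range(void **items, int *count, int start, int end)
{
    assert(start >= 0 && start <= end && end < *count);

    int tail = *count - end - 1;   /* elements after the removed range */
    memmove(&items[start], &items[end + 1], (size_t)tail * sizeof items[0]);
    *count -= end - start + 1;
}
```

Removing the same messages one at a time would shift the tail of the array once per message, whereas here the tail moves only once.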

And a cheap way to clear out the summary entirely.

 void camel_folder_summary_clear(CamelFolderSummary *summary);

UID assignment

There are a couple of functions used to calculate new UIDs, which are just sequential numbers. These will also be called automatically if there is a UID clash; so next_uid_string may need to do more work if the UIDs are not numerically based for this backend. set_uid can be used to force a minimum UID value - useful if recreating a summary from data that already has assigned UIDs.

 guint32 camel_folder_summary_next_uid(CamelFolderSummary *summary);
 char *camel_folder_summary_next_uid_string(CamelFolderSummary *summary);
 void camel_folder_summary_set_uid(CamelFolderSummary *summary, guint32 uid);
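The sequential scheme can be sketched with a single counter. This is an illustration under assumptions, not the Camel source: `UidCounter`, `next_uid`, `set_uid` and `next_uid_string` are hypothetical names, the string form is assumed to be plain decimal, and the caller supplies the buffer instead of receiving an allocated string.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the summary's nextuid field. */
typedef struct {
    uint32_t nextuid;
} UidCounter;

/* Hand out the current value and advance the counter. */
uint32_t next_uid(UidCounter *c)
{
    return c->nextuid++;
}

/* Force a minimum: the counter never moves backwards. */
void set_uid(UidCounter *c, uint32_t uid)
{
    if (uid > c->nextuid)
        c->nextuid = uid;
}

/* Format the next UID as a decimal string into a caller-provided buffer. */
void next_uid_string(UidCounter *c, char *buf, size_t len)
{
    snprintf(buf, len, "%u", (unsigned)next_uid(c));
}
```

When rebuilding a summary from data that already carries UIDs, calling set_uid with a value above the highest existing UID keeps later automatic assignments from clashing with it.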

Retrieving

Again, there are a bunch of various accessors to get information from within the summary. These are used by Evolution/Camel.Folder implementations to look up working information.

 int camel_folder_summary_count(CamelFolderSummary *summary);
 CamelMessageInfo *camel_folder_summary_index(CamelFolderSummary *summary, int index);
 CamelMessageInfo *camel_folder_summary_uid(CamelFolderSummary *summary, const char *uid);
 GPtrArray *camel_folder_summary_array(CamelFolderSummary *summary);
 void camel_folder_summary_array_free(CamelFolderSummary *summary, GPtrArray *array);

Utilities

A couple of special file-utils type encoders that encode some common mail strings as tokens. e.g. mime types. Unknown tokens are encoded like strings. Probably not worth using in any new code.

 int camel_folder_summary_encode_token(FILE *out, const char *str);
 int camel_folder_summary_decode_token(FILE *in, char **str);

A couple of basically unrelated functions which convert a standard 'token' into a system flag name. I think these are helpers just for the filter and search code, and don't really belong in this class.

 guint32 camel_system_flag(const char *name);
 gboolean camel_system_flag_get(guint32 flags, const char *name);

And finally some debugging functions for printing out the contents of a CamelMessageInfo.

 void camel_content_info_dump(CamelMessageContentInfo *ci, int depth);
 void camel_message_info_dump(CamelMessageInfo *mi);

Events

CamelFolderSummary itself doesn't send out any events, but because it is manipulating objects that represent folder items, it can create its own events on the folder object. Strictly speaking it should probably just create the events on itself, and then the Evolution/Camel.Folder class could proxy the events to client code.

But why bother; it is just overhead which will always be required, and these objects are too tightly integrated to worry about it.

So, any changes to the flags, tags, and user flags are automatically transformed into the appropriate Evolution/Camel.Folder folder_changed events.

Note that added and removed events are not automatically created; it is up to any implementation to do these itself. Yes this is silly ... it has been changed on disksummary branch to be more consistent.

Notes

This is another fairly complex object, but most of the complexity is warranted, and hidden from view of the client.

Some of the APIs are a little strange - the API is geared toward append-only operation, and includes some details like UID assignment which belong only to specific implementations. Although at first glance this makes sense, this is actually a backend working object, so the API should allow more flexible and useful control of the list of messages, which may not always be processed in delivery order.

This object tries to do too much for efficiency of specific implementations. e.g. it includes the full body indexing code, so that messages can be 'summarised' and indexed at the same time. The problem is only the local backends need this, so it is just clutter for a base class.

This object is the main memory hog in Evolution mail. Although the structures are allocated efficiently and contain minimal amounts of data, this starts to fall down when you have upwards of 100 000 messages to deal with. This was the primary driving reason for the disksummary branch to exist in the first place.

Apps/Evolution/Camel.FolderSummary (last edited 2013-08-08 22:50:05 by WilliamJonMcCann)