
Module mbox

Provides an extension to back up mbox email files.

Backing up email

Email folders (often stored as mbox flatfiles) are not well suited to being backed up with an incremental backup like the one offered by Cedar Backup. Because mbox files often change on a daily basis, the incremental backup process is forced to back up the entire file every day in order to avoid losing data. This can waste quite a bit of space when backing up large folders. (Note that the alternative maildir format does not share this problem, since it typically uses one file per message.)

One solution to this problem is to design a smarter incremental backup process, which backs up baseline content on the first day of the week, and then backs up only new messages added to that folder on every other day of the week. This way, the backup for any single day is only as large as the messages placed into the folder on that day. The backup isn't as "perfect" as the incremental backup process, because it doesn't preserve information about messages deleted from the backed-up folder. However, it should be much more space-efficient, and in a recovery situation, it seems better to restore too much data rather than too little.
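The selection step described above can be sketched as follows. This is only an illustration of the idea, not the extension's implementation: the extension delegates the work to the grepmail utility, while this sketch uses Python's standard mailbox module instead. The function name messages_since is hypothetical, and the sketch assumes that the Date header is a good enough proxy for when a message arrived and that last_revision is a timezone-aware datetime.

```python
import mailbox
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def messages_since(mbox_path: str, last_revision: Optional[datetime] = None):
    """Return the messages whose Date header is later than last_revision.

    Hypothetical helper. With last_revision=None (the baseline backup),
    every message in the folder is returned.
    """
    selected = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            if last_revision is None:
                selected.append(message)
                continue
            try:
                sent = parsedate_to_datetime(message["Date"])
            except (TypeError, ValueError):
                # Missing or unparseable date: keep the message, since
                # restoring too much is better than restoring too little.
                selected.append(message)
                continue
            if sent.tzinfo is None:
                # Assumption: treat dates without a zone as UTC.
                sent = sent.replace(tzinfo=timezone.utc)
            if sent > last_revision:
                selected.append(message)
    finally:
        box.close()
    return selected
```

Calling messages_since(path) corresponds to the baseline backup, and messages_since(path, last_revision) to the backup taken on every other day. Note that nothing here records deletions, which matches the trade-off described above.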

What is this extension?

This is a Cedar Backup extension used to back up mbox email files via the Cedar Backup command line. Individual mbox files or directories containing mbox files can be backed up using the same collect modes allowed for filesystems in the standard Cedar Backup collect action: weekly, daily, incremental. It implements the "smart" incremental backup process discussed above, using functionality provided by the grepmail utility.

This extension requires a new configuration section <mbox> and is intended to be run either immediately before or immediately after the standard collect action. Aside from its own configuration, it requires the options and collect configuration sections in the standard Cedar Backup configuration file.

The mbox action is conceptually similar to the standard collect action, except that mbox directories are not collected recursively. This implies some configuration changes (i.e. there's no need for global exclusions or an ignore file). If you back up a directory, all of the mbox files in that directory are backed up into a single tar file using the indicated compression method.


Author: Kenneth J. Pronovici <pronovic@ieee.org>

Classes
  MboxFile
Class representing mbox file configuration.
  MboxDir
Class representing mbox directory configuration.
  MboxConfig
Class representing mbox configuration.
  LocalConfig
Class representing this extension's configuration document.
Functions
  executeAction(configPath, options, config)
Executes the mbox backup action.
  _getCollectMode(local, item)
Gets the collect mode that should be used for an mbox file or directory.
  _getCompressMode(local, item)
Gets the compress mode that should be used for an mbox file or directory.
  _getRevisionPath(config, item)
Gets the path to the revision file associated with an mbox file or directory.
  _loadLastRevision(config, item, fullBackup, collectMode)
Loads the last revision date for this item from disk and returns it.
  _writeNewRevision(config, item, newRevision)
Writes new revision information to disk.
  _getExclusions(mboxDir)
Gets exclusions (file and patterns) associated with an mbox directory.
  _getBackupPath(config, mboxPath, compressMode, newRevision, targetDir=None)
Gets the backup file path (including correct extension) associated with an mbox path.
  _getTarfilePath(config, mboxPath, compressMode, newRevision)
Gets the tarfile backup file path (including correct extension) associated with an mbox path.
  _getOutputFile(backupPath, compressMode)
Opens the output file used for saving backup information.
  _backupMboxFile(config, absolutePath, fullBackup, collectMode, compressMode, lastRevision, newRevision, targetDir=None)
Backs up an individual mbox file.
  _backupMboxDir(config, absolutePath, fullBackup, collectMode, compressMode, lastRevision, newRevision, excludePaths, excludePatterns)
Backs up a directory containing mbox files.
Variables
  logger = logging.getLogger("CedarBackup3.log.extend.mbox")
  GREPMAIL_COMMAND = ['grepmail']
  REVISION_PATH_EXTENSION = 'mboxlast'
  __package__ = 'CedarBackup3.extend'
Function Details

executeAction(configPath, options, config)

Executes the mbox backup action.

Parameters:
  • configPath (String representing a path on disk.) - Path to configuration file on disk.
  • options (Options object.) - Program command-line options.
  • config (Config object.) - Program configuration.
Raises:
  • ValueError - Under many generic error conditions
  • IOError - If a backup could not be written for some reason.

_getCollectMode(local, item)

Gets the collect mode that should be used for an mbox file or directory. Use file- or directory-specific value if possible, otherwise take from mbox section.

Parameters:
  • local - LocalConfig object.
  • item - Mbox file or directory
Returns:
Collect mode to use.

_getCompressMode(local, item)

Gets the compress mode that should be used for an mbox file or directory. Use file- or directory-specific value if possible, otherwise take from mbox section.

Parameters:
  • local - LocalConfig object.
  • item - Mbox file or directory
Returns:
Compress mode to use.
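The fallback rule shared by _getCollectMode and _getCompressMode can be sketched as below. The MboxItem and MboxSection classes are simplified, hypothetical stand-ins for the real configuration objects (which hold more fields and are reached through LocalConfig), so treat this as an illustration of the rule rather than the module's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MboxItem:
    """Hypothetical stand-in for an mbox file or directory entry."""
    absolutePath: str
    collectMode: Optional[str] = None
    compressMode: Optional[str] = None


@dataclass
class MboxSection:
    """Hypothetical stand-in for the section-wide mbox defaults."""
    collectMode: Optional[str] = None
    compressMode: Optional[str] = None


def get_collect_mode(section: MboxSection, item: MboxItem) -> Optional[str]:
    # Prefer the item-specific value, otherwise fall back to the section.
    return section.collectMode if item.collectMode is None else item.collectMode


def get_compress_mode(section: MboxSection, item: MboxItem) -> Optional[str]:
    # Same rule, applied to the compress mode.
    return section.compressMode if item.compressMode is None else item.compressMode
```

With section defaults of "incr" and "gzip", an item that sets only collectMode="daily" resolves to daily collection with gzip compression.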

_getRevisionPath(config, item)

Gets the path to the revision file associated with an mbox file or directory.

Parameters:
  • config - Cedar Backup configuration.
  • item - Mbox file or directory
Returns:
Absolute path to the revision file associated with the mbox file or directory.

_loadLastRevision(config, item, fullBackup, collectMode)

Loads the last revision date for this item from disk and returns it.

If this is a full backup, or if the revision file cannot be loaded for some reason, then None is returned. This indicates that there is no previous revision, so the entire mail file or directory should be backed up.

Parameters:
  • config - Cedar Backup configuration.
  • item - Mbox file or directory
  • fullBackup - Indicates whether this is a full backup
  • collectMode - Indicates the collect mode for this item
Returns:
Revision date as a datetime.datetime object or None.

Note: We write the actual revision object to disk via pickle, so we don't deal with the datetime precision or format at all. Whatever's in the object is what we write.

_writeNewRevision(config, item, newRevision)

Writes new revision information to disk.

If we can't write the revision file successfully for any reason, we'll log the condition but won't throw an exception.

Parameters:
  • config - Cedar Backup configuration.
  • item - Mbox file or directory
  • newRevision - Revision date as a datetime.datetime object.

Note: We write the actual revision object to disk via pickle, so we don't deal with the datetime precision or format at all. Whatever's in the object is what we write.
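The pickle round trip described for _loadLastRevision and _writeNewRevision might look roughly like the sketch below. The function names and signatures are hypothetical and simplified (they take a path directly and ignore the collect mode); the point is only to show a datetime object being stored and recovered unchanged, with failures logged or turned into None rather than raised.

```python
import logging
import pickle

logger = logging.getLogger(__name__)


def write_revision(revision_path, new_revision):
    """Pickle the revision object to disk; log failures instead of raising."""
    try:
        with open(revision_path, "wb") as handle:
            pickle.dump(new_revision, handle)
    except OSError as error:
        logger.error("Failed to write revision file [%s]: %s", revision_path, error)


def load_revision(revision_path, full_backup):
    """Return the pickled revision object, or None if there is no usable one."""
    if full_backup:
        # A full backup ignores any previous revision.
        return None
    try:
        with open(revision_path, "rb") as handle:
            return pickle.load(handle)
    except (OSError, EOFError, pickle.UnpicklingError) as error:
        logger.warning("Failed to read revision file [%s]: %s", revision_path, error)
        return None
```

Because the object itself is pickled, a value such as datetime(2024, 1, 2, 3, 4, 5, 678) comes back with its microseconds intact, which is why precision and format never need to be handled explicitly.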

_getExclusions(mboxDir)

Gets exclusions (file and patterns) associated with an mbox directory.

The returned files value is a list of absolute paths to be excluded from the backup for a given directory. It is derived from the mbox directory's relative exclude paths.

The returned patterns value is a list of patterns to be excluded from the backup for a given directory. It is derived from the mbox directory's list of patterns.

Parameters:
  • mboxDir - Mbox directory object.
Returns:
Tuple (files, patterns) indicating what to exclude.
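As a rough sketch of the derivation described above, the function below joins each relative exclude path onto the directory's absolute path and passes the patterns through unchanged. The name get_exclusions and its flat argument list are hypothetical; the real function reads these values from an MboxDir object.

```python
import os


def get_exclusions(abs_path, relative_exclude_paths=None, exclude_patterns=None):
    """Return (files, patterns) to exclude for one mbox directory.

    Hypothetical helper: either list may be None, meaning no exclusions.
    """
    # Relative exclude paths are resolved against the mbox directory itself.
    files = [os.path.join(abs_path, rel) for rel in (relative_exclude_paths or [])]
    # Patterns are copied as-is so the caller cannot mutate the configuration.
    patterns = list(exclude_patterns or [])
    return files, patterns
```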

_getBackupPath(config, mboxPath, compressMode, newRevision, targetDir=None)

Gets the backup file path (including correct extension) associated with an mbox path.

If a target directory is passed in, we assume that we're backing up a directory. Under these circumstances, we'll just use the basename of the individual path as the output file.

Parameters:
  • config - Cedar Backup configuration.
  • mboxPath - Path to the indicated mbox file or directory
  • compressMode - Compress mode to use for this mbox path
  • newRevision - Revision this backup path represents
  • targetDir - Target directory in which the path should exist
Returns:
Absolute path to the backup file associated with the mbox path.

Note: The backup path only contains the current date in YYYYMMDD format, but that's OK because the index information (stored elsewhere) is the actual date object.
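The sketch below illustrates how such a path could be assembled. The "mbox-YYYYMMDD-..." file name, the way the mbox path is flattened, and the collect_dir argument (standing in for the collect directory taken from the Cedar Backup configuration) are all assumptions made for illustration; only the YYYYMMDD date stamp, the basename rule for target directories, and the compression-dependent extension come from the description above.

```python
import os

# Assumed mapping from compress mode to file extension.
EXTENSIONS = {"none": "", "gzip": ".gz", "bzip2": ".bz2"}


def backup_path(collect_dir, mbox_path, compress_mode, new_revision, target_dir=None):
    """Build the backup file path for one mbox file (hypothetical naming)."""
    if target_dir is None:
        # Standalone file: encode the date and the flattened source path.
        stamp = new_revision.strftime("%Y%m%d")
        flattened = mbox_path.strip("/").replace("/", "-")
        path = os.path.join(collect_dir, "mbox-%s-%s" % (stamp, flattened))
    else:
        # File inside a backed-up directory: just reuse its basename.
        path = os.path.join(target_dir, os.path.basename(mbox_path))
    return path + EXTENSIONS[compress_mode]
```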

_getTarfilePath(config, mboxPath, compressMode, newRevision)

Gets the tarfile backup file path (including correct extension) associated with an mbox path.

Along with the path, the tar archive mode is returned in a form that can be used with BackupFileList.generateTarfile.

Parameters:
  • config - Cedar Backup configuration.
  • mboxPath - Path to the indicated mbox file or directory
  • compressMode - Compress mode to use for this mbox path
  • newRevision - Revision this backup path represents
Returns:
Tuple of (absolute path to tarfile, tar archive mode)

Note: The tarfile path only contains the current date in YYYYMMDD format, but that's OK because the index information (stored elsewhere) is the actual date object.
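A minimal sketch of returning the path together with an archive mode follows. The mode labels "tar", "targz" and "tarbz2" are an assumption about what BackupFileList.generateTarfile accepts, and the file naming and collect_dir argument are hypothetical, as is the function name.

```python
import os

# Assumed mapping: compress mode -> (file extension, archive mode label).
TAR_FORMATS = {
    "none": (".tar", "tar"),
    "gzip": (".tar.gz", "targz"),
    "bzip2": (".tar.bz2", "tarbz2"),
}


def tarfile_path(collect_dir, mbox_path, compress_mode, new_revision):
    """Return (absolute tarfile path, archive mode) for an mbox directory."""
    extension, archive_mode = TAR_FORMATS[compress_mode]
    stamp = new_revision.strftime("%Y%m%d")
    # Flatten the source path so it can be embedded in a single file name.
    flattened = mbox_path.strip("/").replace("/", "-")
    filename = "mbox-%s-%s%s" % (stamp, flattened, extension)
    return os.path.join(collect_dir, filename), archive_mode
```

Returning both values from one place keeps the file extension and the archive mode from drifting apart.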

_getOutputFile(backupPath, compressMode)

Opens the output file used for saving backup information.

If the compress mode is "gzip", we'll open a GzipFile, and if the compress mode is "bzip2", we'll open a BZ2File. Otherwise, we'll just return a file object from the built-in open() function.

Parameters:
  • backupPath - Path to file to open.
  • compressMode - Compress mode of file ("none", "gzip", "bzip2").
Returns:
Output file object, opened in binary mode for use with executeCommand()
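The dispatch described above can be sketched with the standard gzip and bz2 modules. The function name open_output_file is hypothetical, and the sketch assumes that any mode other than "gzip" or "bzip2" means no compression.

```python
import bz2
import gzip


def open_output_file(backup_path, compress_mode):
    """Open backup_path for binary writing, compressing if requested."""
    if compress_mode == "gzip":
        return gzip.GzipFile(backup_path, "wb")
    if compress_mode == "bzip2":
        return bz2.BZ2File(backup_path, "wb")
    # Assumption: anything else is treated as "none" (no compression).
    return open(backup_path, "wb")
```

All three return values accept bytes through write() and can be used in a with statement, so the caller does not need to know which kind of file it was given.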

_backupMboxFile(config, absolutePath, fullBackup, collectMode, compressMode, lastRevision, newRevision, targetDir=None)

Backs up an individual mbox file.

Parameters:
  • config - Cedar Backup configuration.
  • absolutePath - Path to mbox file to back up.
  • fullBackup - Indicates whether this should be a full backup.
  • collectMode - Indicates the collect mode for this item
  • compressMode - Compress mode of file ("none", "gzip", "bzip2")
  • lastRevision - Date of last backup as datetime.datetime
  • newRevision - Date of new (current) backup as datetime.datetime
  • targetDir - Target directory to write the backed-up file into
Raises:
  • ValueError - If some value is missing or invalid.
  • IOError - If there is a problem backing up the mbox file.

_backupMboxDir(config, absolutePath, fullBackup, collectMode, compressMode, lastRevision, newRevision, excludePaths, excludePatterns)

Backs up a directory containing mbox files.

Parameters:
  • config - Cedar Backup configuration.
  • absolutePath - Path to mbox directory to back up.
  • fullBackup - Indicates whether this should be a full backup.
  • collectMode - Indicates the collect mode for this item
  • compressMode - Compress mode of file ("none", "gzip", "bzip2")
  • lastRevision - Date of last backup as datetime.datetime
  • newRevision - Date of new (current) backup as datetime.datetime
  • excludePaths - List of absolute paths to exclude.
  • excludePatterns - List of patterns to exclude.
Raises:
  • ValueError - If some value is missing or invalid.
  • IOError - If there is a problem backing up the mbox directory.