nobodd.fs
The nobodd.fs module contains the FatFileSystem class which is the primary entry point for reading FAT file-systems. Constructed with a buffer object representing a memory mapping of the file-system, the class will determine whether the format is FAT12, FAT16, or FAT32. The root attribute provides a Path-like object representing the root directory of the file-system.
>>> from nobodd.disk import DiskImage
>>> from nobodd.fs import FatFileSystem
>>> img = DiskImage('test-gpt.img')
>>> fs = FatFileSystem(img.partitions[1].data)
>>> fs.fat_type
'fat16'
>>> fs.root
FatPath(<FatFileSystem label='TEST' fat_type='fat16'>, '/')
Warning
At the time of writing, the implementation is strictly not thread-safe. Attempting to write to the file-system from multiple threads (whether in separate instances or not) is likely to result in corruption. Attempting to write to the file-system from one thread, while reading from another will result in undefined behaviour including incorrect reads.
Warning
The implementation will not handle certain “obscure” extensions to FAT, such as sub-directory style roots on FAT-12/16. It will attempt to warn about these and abort if they are found.
FatFileSystem
- class nobodd.fs.FatFileSystem(mem, atime=False, encoding='iso-8859-1')[source]
Represents a FAT file-system, contained at the start of the buffer object mem.
This class supports the FAT-12, FAT-16, and FAT-32 formats, and will automatically determine which to use from the headers found at the start of mem. The type in use may be queried from fat_type. Of primary use is the root attribute which provides a FatPath instance representing the root directory of the file-system.
Instances can (and should) be used as a context manager; exiting the context will call the close() method implicitly. If certain header bits are set, DamagedFileSystem and DirtyFileSystem warnings may be generated upon opening.
If atime is False, the default, then accesses to files will not update the atime field in file meta-data (when the underlying mem mapping is writable). Finally, encoding specifies the character set used for decoding and encoding DOS short filenames.
- close()[source]
Releases the memory references derived from the buffer the instance was constructed with. This method is idempotent.
- open_dir(cluster)[source]
Opens the sub-directory in the specified cluster, returning a FatDirectory instance representing it.
Warning
This method is intended for internal use by the FatPath class.
- open_entry(index, entry, mode='rb')[source]
Opens the specified entry, which must be a DirectoryEntry instance and a member of index, an instance of FatDirectory. Returns a FatFile instance associated with the specified entry. This permits writes to the file to be properly recorded in the corresponding directory entry.
Warning
This method is intended for internal use by the FatPath class.
- open_file(cluster, mode='rb')[source]
Opens the file at the specified cluster, returning a FatFile instance representing it with the specified mode. Note that the FatFile instance returned by this method has no directory entry associated with it.
Warning
This method is intended for internal use by the FatPath class, specifically for “files” underlying the sub-directory structure which do not have an associated size (other than that dictated by their FAT chain of clusters).
- property atime
If the underlying mapping is writable, then atime (last access time) will be updated upon reading the content of files, when this property is True (the default is False).
- property clusters
A FatClusters sequence representing the clusters containing the data stored in the file-system.
Warning
This attribute is intended for internal use by the FatFile class, but may be useful for low-level exploration or manipulation of FAT file-systems.
- property fat
A FatTable sequence representing the FAT table itself.
Warning
This attribute is intended for internal use by the FatFile class, but may be useful for low-level exploration or manipulation of FAT file-systems.
- property fat_type
Returns a str indicating the type of FAT file-system present. Returns one of “fat12”, “fat16”, or “fat32”.
- property label
Returns the label from the header of the file-system. This is an ASCII string up to 11 characters long.
- property root
Returns a FatPath instance (a Path-like object) representing the root directory of the FAT file-system. For example:
from nobodd.disk import DiskImage
from nobodd.fs import FatFileSystem

with DiskImage('test.img') as img:
    with FatFileSystem(img.partitions[1].data) as fs:
        print('ls /')
        for p in fs.root.iterdir():
            print(p.name)
- property sfn_encoding
The encoding used for short (8.3) filenames. This defaults to “iso-8859-1” but unfortunately there’s no way of determining the correct codepage for these.
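Because the codepage is not recorded on disk, the same short-filename bytes can decode to different text depending on the encoding chosen. The following is a small illustration using only Python's standard codecs; the byte string is made up for the example and is not taken from a real image.

```python
# Hypothetical SFN bytes as a DOS system using codepage 437 might write them.
raw = b'CAF\x82'

as_cp437 = raw.decode('cp437')         # the original IBM PC codepage
as_latin1 = raw.decode('iso-8859-1')   # the default sfn_encoding

print(repr(as_cp437))    # 'CAFé'
print(repr(as_latin1))   # 'CAF\x82' (0x82 is a control character here)
```

If short filenames look wrong with the default, constructing FatFileSystem with a different encoding argument (for instance 'cp437') may give better results, though this remains a guess about how the volume was written.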
FatFile
- class nobodd.fs.FatFile(fs, start, mode='rb', index=None, entry=None)[source]
Represents an open file from a FatFileSystem.
You should never need to construct this instance directly. Instead it (or wrapped variants of it) is returned by the open() method of FatPath instances. For example:
from nobodd.disk import DiskImage
from nobodd.fs import FatFileSystem

with DiskImage('test.img') as img:
    with FatFileSystem(img.partitions[1].data) as fs:
        path = fs.root / 'bar.txt'
        with path.open('r', encoding='utf-8') as f:
            print(f.read())
Instances can (and should) be used as context managers to implicitly close references upon exiting the context. Instances are readable and seekable, and writable, depending on their opening mode and the nature of the underlying FatFileSystem.
As a derivative of io.RawIOBase, all the usual I/O methods should be available.
- close()[source]
Flush and close the IO object.
This method has no effect if the file is already closed.
- classmethod from_cluster(fs, start, mode='rb')[source]
Construct a FatFile from a FatFileSystem, fs, and a start cluster. The optional mode is equivalent to the built-in open() function.
Files constructed via this method do not have an associated directory entry. As a result, their size is assumed to be the full size of their cluster chain. This is typically used for the “file” backing a FatSubDirectory.
Warning
This method is intended for internal use by the FatPath class.
- classmethod from_entry(fs, index, entry, mode='rb')[source]
Construct a FatFile from a FatFileSystem, fs, a FatDirectory, index, and a DirectoryEntry, entry. The optional mode is equivalent to the built-in open() function.
Files constructed via this method have an associated directory entry which will be updated if/when reads or writes occur (updating atime, mtime, and size fields).
Warning
This method is intended for internal use by the FatPath class.
- readable()[source]
Return whether object was opened for reading.
If False, read() will raise OSError.
- seek(pos, whence=0)[source]
Change the stream position to the given byte offset.
- offset
The stream position, relative to ‘whence’.
- whence
The relative position to seek from.
The offset is interpreted relative to the position indicated by whence. Values for whence are:
os.SEEK_SET or 0 – start of stream (the default); offset should be zero or positive
os.SEEK_CUR or 1 – current stream position; offset may be negative
os.SEEK_END or 2 – end of stream; offset is usually negative
Return the new absolute position.
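The three whence modes can be tried without a disk image. In this sketch io.BytesIO stands in for an open FatFile; both follow the seek() contract of Python's io module, but BytesIO is only a stand-in and not part of nobodd.

```python
import io
import os

# BytesIO stands in for an open FatFile; the seek() semantics are the same.
f = io.BytesIO(b'0123456789')

pos_set = f.seek(4, os.SEEK_SET)    # absolute: 4 bytes from the start
pos_cur = f.seek(-2, os.SEEK_CUR)   # relative: 2 bytes back from position 4
pos_end = f.seek(-3, os.SEEK_END)   # relative: 3 bytes before the end
tail = f.read()                     # read from the new position to the end

print(pos_set, pos_cur, pos_end, tail)   # 4 2 7 b'789'
```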
- seekable()[source]
Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
Exceptions and Warnings
- exception nobodd.fs.FatWarning[source]
Base class for warnings issued by FatFileSystem.
- exception nobodd.fs.DirtyFileSystem[source]
Raised when opening a FAT file-system that has the “dirty” flag set in the second entry of the FAT.
- exception nobodd.fs.DamagedFileSystem[source]
Raised when opening a FAT file-system that has the I/O errors flag set in the second entry of the FAT.
- exception nobodd.fs.OrphanedLongFilename[source]
Raised when a LongFilenameEntry is found with a mismatched checksum, terminal flag, out of order index, etc. This usually indicates an orphaned entry as the result of a non-LFN aware file-system driver manipulating a directory.
- exception nobodd.fs.BadLongFilename[source]
Raised when a LongFilenameEntry is unambiguously corrupted, e.g. including a non-zero cluster number, in a way that would not be caused by a non-LFN aware file-system driver.
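Since these are warnings rather than errors, callers decide how strictly to treat them using the standard warnings module. The sketch below uses stand-in classes so that it runs without nobodd installed; it assumes that the specific warnings derive from FatWarning (as the "base class" description suggests), and open_fs() is a hypothetical placeholder for constructing a FatFileSystem on a volume with the "dirty" flag set.

```python
import warnings

# Stand-ins for the nobodd.fs classes. ASSUMPTION: the specific warnings
# are subclasses of FatWarning.
class FatWarning(Warning):
    pass

class DirtyFileSystem(FatWarning):
    pass

def open_fs():
    # Hypothetical placeholder for FatFileSystem(...) on a dirty volume
    warnings.warn(DirtyFileSystem('file-system has the dirty flag set'))
    return 'fs'

# Option 1: record the warnings and inspect them afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    fs = open_fs()
dirty = [w for w in caught if issubclass(w.category, DirtyFileSystem)]

# Option 2: escalate the whole family to exceptions and refuse to continue.
with warnings.catch_warnings():
    warnings.simplefilter('error', category=FatWarning)
    try:
        open_fs()
        refused = False
    except FatWarning:
        refused = True
```

With the real library, the same filters would be applied around the FatFileSystem constructor call instead of open_fs().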
Internal Classes and Functions
You should never need to interact with these classes directly; use FatFileSystem instead. These classes exist to enumerate and manipulate the FAT, and different types of root directory under FAT-12, FAT-16, and FAT-32, and sub-directories (which are common across FAT types).
- class nobodd.fs.FatTable[source]
Abstract MutableSequence class representing the FAT table itself.
This is the basis for Fat12Table, Fat16Table, and Fat32Table. While all the implementations are potentially mutable (if the underlying memory mapping is writable), only direct replacement of FAT entries is valid. Insertion and deletion will raise TypeError.
A concrete class is constructed by FatFileSystem (based on the type of FAT format found). The chain() method is used by FatFile (and indirectly FatSubDirectory) to discover the chain of clusters that make up a file (or sub-directory). The free() method is used by writable FatFile instances to find the next free cluster to write to. The mark_free() and mark_end() methods are used to mark a cluster as being free or as the terminal cluster of a file.
- chain(start)[source]
Generator method which yields all the clusters in the chain starting at start.
- free()[source]
Generator that scans the FAT for free clusters, yielding each as it is found. Iterating to the end of this generator raises OSError with the code ENOSPC (out of space).
- abstract get_all(cluster)[source]
Returns the value of cluster in all copies of the FAT, as a tuple (naturally, under normal circumstances, these should all be equal).
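The behaviour of chain() and free() can be pictured with a plain list standing in for the table. This is a simplified sketch, not the nobodd implementation: the toy table, the FREE and END constants, and the module-level functions are assumptions made for illustration, and a real table would check entries against min_valid and max_valid.

```python
import errno
import os

END = 0xFFFF   # FAT-16 style end-of-chain mark
FREE = 0       # a zero entry marks a free cluster

# Toy FAT: the index is a cluster number, the value is the next cluster.
# Entries 0 and 1 are reserved, so data clusters start at 2.
fat = [0xFFF8, END, 5, FREE, FREE, 6, END, FREE]

def chain(fat, start):
    """Yield each cluster in the chain beginning at *start*."""
    cluster = start
    while 2 <= cluster < len(fat):
        yield cluster
        cluster = fat[cluster]

def free(fat):
    """Yield free clusters; raise ENOSPC once the table is exhausted."""
    for cluster in range(2, len(fat)):
        if fat[cluster] == FREE:
            yield cluster
    raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))

file_clusters = list(chain(fat, 2))
first_free = next(free(fat))
print(file_clusters, first_free)   # [2, 5, 6] 3
```

A writer that only needs one cluster can take the first value from free(); iterating past the last free cluster surfaces the out-of-space condition as an exception.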
- class nobodd.fs.Fat12Table(mem, fat_size, info_mem=None)[source]
Concrete child of FatTable for FAT-12 file-systems.
- min_valid = 2
- max_valid = 4079
- end_mark = 4095
- class nobodd.fs.Fat16Table(mem, fat_size, info_mem=None)[source]
Concrete child of FatTable for FAT-16 file-systems.
- min_valid = 2
- max_valid = 65519
- end_mark = 65535
- class nobodd.fs.Fat32Table(mem, fat_size, info_mem=None)[source]
Concrete child of FatTable for FAT-32 file-systems.
- min_valid = 2
- max_valid = 268435439
- end_mark = 268435455
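The decimal constants above are easier to recognise in hexadecimal: each max_valid ends in ...FEF and each end_mark in ...FFF. The following sketch derives them from the entry width; limits() is a hypothetical helper, and it relies on the fact that FAT-32 entries use only their low 28 bits for the cluster number.

```python
def limits(bits):
    """Derive the table limits from the number of usable bits per entry."""
    top = (1 << bits) - 1          # all usable bits set
    return {
        'min_valid': 2,            # clusters 0 and 1 are reserved
        'max_valid': top - 0x10,   # ...FEF: highest usable cluster number
        'end_mark': top,           # ...FFF: end-of-chain marker
    }

fat12 = limits(12)
fat16 = limits(16)
fat32 = limits(28)   # 32-bit entries, of which 28 bits are significant

for name, table in (('fat12', fat12), ('fat16', fat16), ('fat32', fat32)):
    print(name, hex(table['max_valid']), hex(table['end_mark']))
```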
- class nobodd.fs.FatClusters(mem, cluster_size)[source]
MutableSequence representing the clusters of the file-system itself.
While the sequence is mutable, clusters cannot be deleted or inserted, only read and (if the underlying buffer is writable) re-written.
- property size
Returns the size (in bytes) of clusters in the file-system.
- class nobodd.fs.FatDirectory[source]
An abstract MutableMapping representing a FAT directory. The mapping is ostensibly from filename to DirectoryEntry instances, but there are several oddities to be aware of.
In VFAT, many files effectively have two filenames: the original DOS “short” filename (SFN hereafter) and the VFAT “long” filename (LFN hereafter). All files have an SFN; any file may optionally have an LFN. The SFN is stored in the DirectoryEntry which records details of the file (mode, size, cluster, etc). The optional LFN is stored in leading LongFilenameEntry records.
Even when LongFilenameEntry records do not precede a DirectoryEntry, the file may still have an LFN that differs from the SFN in case only, recorded by flags in the DirectoryEntry. Naturally, some files still only have one filename because the LFN doesn’t vary in case from the SFN, e.g. the special directory entries “.” and “..”, and anything which conforms to original DOS naming rules like “README.TXT”.
For the purposes of listing files, most FAT implementations (including this one) ignore the SFNs. Hence, iterating over this mapping will not yield the SFNs as keys (unless the SFN is equal to the LFN), and they are not counted in the length of the mapping. However, for the purposes of testing existence, opening, etc., FAT implementations allow the use of SFNs. Hence, testing for membership, or manipulating entries via the SFN will work with this mapping, and will implicitly manipulate the associated LFNs (e.g. deleting an entry via a SFN key will also delete the associated LFN key).
In other words, if a file has a distinct LFN and SFN, it has two entries in the mapping (a “visible” LFN entry, and an “invisible” SFN entry). Further, note that FAT is case retentive (for LFNs; SFNs are folded uppercase), but not case sensitive. Hence, membership tests and retrieval from this mapping are case insensitive with regard to keys.
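The lookup rules just described can be modelled with a toy mapping. This is only a sketch of the visible and invisible key behaviour: ToyDirectory and its add() method are hypothetical, and the values are plain strings rather than DirectoryEntry instances.

```python
class ToyDirectory:
    """Toy model of the key rules only; not the nobodd implementation."""

    def __init__(self):
        self._entries = {}   # folded LFN -> (display name, value)
        self._sfn = {}       # folded SFN -> folded LFN ("invisible" keys)

    def add(self, lfn, sfn, value):
        self._entries[lfn.casefold()] = (lfn, value)
        if sfn.casefold() != lfn.casefold():
            self._sfn[sfn.casefold()] = lfn.casefold()

    def _resolve(self, name):
        key = name.casefold()            # lookups are case insensitive
        return self._sfn.get(key, key)   # an SFN resolves to its LFN

    def __contains__(self, name):
        return self._resolve(name) in self._entries

    def __getitem__(self, name):
        return self._entries[self._resolve(name)][1]

    def __delitem__(self, name):
        key = self._resolve(name)
        del self._entries[key]
        # Drop any SFN alias that pointed at the removed entry
        self._sfn = {s: l for s, l in self._sfn.items() if l != key}

    def __iter__(self):
        # Only the case-preserved LFNs are listed
        return (display for display, _ in self._entries.values())

    def __len__(self):
        return len(self._entries)

d = ToyDirectory()
d.add('default.config', 'DEFAUL~1.CON', 'entry-1')
d.add('README.TXT', 'README.TXT', 'entry-2')

print(list(d), len(d))          # ['default.config', 'README.TXT'] 2
print('DEFAUL~1.CON' in d)      # True, although it is never listed
print(d['Default.Config'])      # entry-1
```

Deleting via either key removes both, which mirrors the note above about SFN keys implicitly manipulating the associated LFN.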
Finally, note that the values in the mapping are always instances of DirectoryEntry. LongFilenameEntry instances are neither accepted nor returned; these are managed internally.
- MAX_SFN_SUFFIX = 65535
- _clean_entries()[source]
Find and remove all deleted entries from the directory.
The method scans the directory for all directory entries and long filename entries which start with 0xE5, indicating a deleted entry, and overwrites them with later (not deleted) entries. Trailing entries are then zeroed out. The return value is the new offset of the terminal entry.
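The compaction described here can be sketched over a bytearray of fixed-size entries. This is an illustration under simplifying assumptions rather than the library's code: clean_entries() and entry() are hypothetical helpers, the buffer length is assumed to be a multiple of 32, and an entry whose first byte is zero is treated as the terminal entry.

```python
ENTRY_SIZE = 32   # every FAT directory entry occupies 32 bytes
DELETED = 0xE5    # first byte of a deleted entry
TERMINAL = 0x00   # first byte of the entry that ends the directory

def clean_entries(buf):
    """Compact *buf* in place; return the new offset of the terminal entry."""
    write = 0
    for read in range(0, len(buf), ENTRY_SIZE):
        entry = bytes(buf[read:read + ENTRY_SIZE])
        if entry[0] == TERMINAL:
            break
        if entry[0] != DELETED:
            # Move the live entry down over any deleted ones before it
            buf[write:write + ENTRY_SIZE] = entry
            write += ENTRY_SIZE
    # Zero everything from the new end of the directory onwards
    buf[write:] = bytes(len(buf) - write)
    return write

def entry(first):
    """Build a dummy 32-byte entry starting with the byte *first*."""
    return bytes([first]) + b'x' * (ENTRY_SIZE - 1)

directory = bytearray(
    entry(ord('A')) + entry(DELETED) + entry(ord('B')) + entry(TERMINAL))
end = clean_entries(directory)
print(end)   # 64
```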
- _get_names(filename)[source]
Given a filename, generate an appropriately encoded long filename (encoded in little-endian UCS-2), short filename (encoded in the file-system’s SFN encoding), extension, and the case attributes. The result is a 4-tuple: lfn, sfn, ext, attr. lfn, sfn, and ext will be bytes strings, and attr will be an int. If filename is capable of being represented as a short filename only (potentially with non-zero case attributes), lfn in the result will be zero-length.
- _get_unique_sfn(prefix, ext)[source]
Given prefix and ext, the str prefix and extension of the short filename, find a suffix that is unique in the directory (amongst both long and short filenames, because these are still in the same namespace).
For example, in a directory containing default.config (which has shortname DEFAUL~1.CON), given the filename and extension default.conf, this function will return the str DEFAUL~2.CON.
Because the search requires enumeration of the whole directory, which is expensive, an artificial limit of MAX_SFN_SUFFIX is enforced. If this is reached, the search will terminate with an OSError with code ENOSPC (out of space).
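The search can be sketched as a loop over numeric suffixes. This is a minimal illustration, not the method's actual code: unique_sfn() is a hypothetical function, existing is assumed to be a set of upper-case names already present in the directory, and the prefix is simply truncated so that prefix plus suffix fits in eight characters.

```python
import errno
import os

MAX_SFN_SUFFIX = 0xFFFF   # 65535, the limit documented above

def unique_sfn(prefix, ext, existing):
    """Return the first 'PREFIX~N.EXT' that is not in *existing*."""
    for n in range(1, MAX_SFN_SUFFIX + 1):
        suffix = f'~{n}'
        # Trim the prefix so that prefix + suffix fits in 8 characters
        candidate = prefix[:8 - len(suffix)] + suffix
        if ext:
            candidate += '.' + ext
        if candidate not in existing:
            return candidate
    raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))

taken = {'DEFAUL~1.CON'}
print(unique_sfn('DEFAULT', 'CON', taken))   # DEFAUL~2.CON
```

Note how a two-digit suffix costs one more character of the prefix, so the tenth clash in this directory would yield DEFAU~10.CON.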
- _group_entries()[source]
Generator which yields an offset, and a sequence of LongFilenameEntry and DirectoryEntry instances.
Each tuple yielded represents a single (extant, non-deleted) file or directory with its long-filename entries at the start, and the directory entry as the final element. The offset associated with the sequence is the offset of the directory entry (not its preceding long filename entries). In other words, for a file with three long-filename entries, the following might be yielded:
(160, [<LongFilenameEntry>, <LongFilenameEntry>, <LongFilenameEntry>, <DirectoryEntry>])
This indicates that the directory entry is at offset 160, preceded by long filename entries at offsets 128, 96, and 64.
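The grouping can be sketched with strings standing in for entry objects. Everything here is hypothetical: group_entries() is a free function rather than the method, each entry is reduced to a kind of 'lfn', 'dir', or 'deleted', and discarding collected long-filename entries when a deleted entry is met is an assumption of the sketch.

```python
def group_entries(entries):
    """Group (offset, kind) pairs into one list per file.

    *kind* is 'lfn', 'dir', or 'deleted'; these strings are stand-ins
    for real entry objects.
    """
    group = []
    for offset, kind in entries:
        if kind == 'deleted':
            group = []             # discard anything collected so far
        elif kind == 'lfn':
            group.append(kind)     # keep collecting long-filename parts
        else:
            group.append(kind)
            yield offset, group    # offset of the directory entry itself
            group = []

entries = [
    (0, 'dir'), (32, 'deleted'),
    (64, 'lfn'), (96, 'lfn'), (128, 'lfn'), (160, 'dir'),
]
groups = list(group_entries(entries))
print(groups[-1])   # (160, ['lfn', 'lfn', 'lfn', 'dir'])
```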
- abstract _iter_entries()[source]
Abstract generator that is expected to yield successive offsets and the entries at those offsets as DirectoryEntry instances or LongFilenameEntry instances, as appropriate.
All instances must be yielded, in the order they appear on disk, regardless of whether they represent deleted, orphaned, corrupted, terminal, or post-terminal entries.
- _join_lfn_entries(entries, checksum, sequence=0, lfn=b'')[source]
Given entries, a sequence of LongFilenameEntry instances, decode the long filename encoded within them, ensuring that all the invariants (sequence number, checksums, terminal flag, etc.) are obeyed.
Returns the decoded (str) long filename, or None if no valid long filename can be found. Emits various warnings if invalid entries are encountered during decoding, including OrphanedLongFilename and BadLongFilename.
- _prefix_entries(filename, entry)[source]
Given entry, a DirectoryEntry, generate the LongFilenameEntry instances (if any) that are necessary to associate entry with the specified filename.
This function merely constructs the instances, ensuring the (many, convoluted!) rules are followed, including that the short filename, if one is generated, is unique in this directory, and the long filename is encoded and check-summed appropriately.
Note
The filename and ext fields of entry are ignored by this method. The only filename that is considered is the one explicitly passed in which becomes the basis for the long filename entries and the short filename stored within the entry itself.
The return value is the sequence of long filename entries and the modified directory entry in the order they should appear on disk.
- _split_entries(entries)[source]
Given entries, a sequence of LongFilenameEntry instances, ending with a single DirectoryEntry (as would typically be found in a FAT directory index), return the decoded long filename, short filename, and the directory entry record as a 3-tuple.
If no long filename entries are present, the long filename will be equal to the short filename (but may have lower-case parts).
Note
This function also carries out several checks, including the filename checksum, that all checksums match, that the number of entries is valid, etc. Any violations found may raise warnings including OrphanedLongFilename and BadLongFilename.
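The filename checksum mentioned in the note ties each long filename entry to its short name. The sketch below shows the standard VFAT calculation over the 11-byte padded short name; lfn_checksum() is a hypothetical helper name, and the form of the library's own check may differ.

```python
def lfn_checksum(sfn):
    """Checksum of an 11-byte short name (8 name bytes + 3 extension bytes)."""
    if len(sfn) != 11:
        raise ValueError('expected the 11-byte padded 8.3 name')
    total = 0
    for byte in sfn:
        # Rotate the running sum right by one bit, then add the next byte,
        # keeping the result within 8 bits
        total = (((total & 1) << 7) + (total >> 1) + byte) & 0xFF
    return total

# "README.TXT" as stored on disk: name padded to 8 bytes, no dot
checksum = lfn_checksum(b'README  TXT')
print(hex(checksum))
```

Every long filename entry preceding a directory entry carries this value; a mismatch suggests that the short name was changed by a driver unaware of long filenames, which is the orphaned case described under OrphanedLongFilename.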
- abstract _update_entry(offset, entry)[source]
Abstract method which is expected to (re-)write entry (a DirectoryEntry or LongFilenameEntry instance) at the specified offset in the directory.
- class nobodd.fs.FatRoot(mem, encoding)[source]
An abstract derivative of FatDirectory representing the (fixed-size) root directory of a FAT-12 or FAT-16 file-system. Must be constructed with mem, which is a buffer object covering the root directory clusters, and encoding, which is taken from FatFileSystem.sfn_encoding. The Fat12Root and Fat16Root classes are (trivial) concrete derivatives of this.
- class nobodd.fs.FatSubDirectory(fs, start, encoding)[source]
A concrete derivative of FatDirectory representing a sub-directory in a FAT file-system (of any type). Must be constructed with fs (a FatFileSystem instance), start (the first cluster of the sub-directory), and encoding, which is taken from FatFileSystem.sfn_encoding.
- class nobodd.fs.Fat12Root(mem, encoding)[source]
Concrete, trivial derivative of FatRoot which simply declares the root as belonging to a FAT-12 file-system.
- fat_type = 'fat12'
- class nobodd.fs.Fat16Root(mem, encoding)[source]
Concrete, trivial derivative of FatRoot which simply declares the root as belonging to a FAT-16 file-system.
- fat_type = 'fat16'
- class nobodd.fs.Fat32Root(fs, start, encoding)[source]
This is a trivial derivative of FatSubDirectory because, in FAT-32, the root directory is represented by the same structure as a regular sub-directory.
- nobodd.fs.fat_type(mem)[source]
Given a FAT file-system at the start of the buffer mem, determine its type, and decode its headers. Returns a four-tuple containing:
one of the strings “fat12”, “fat16”, or “fat32”
a BIOSParameterBlock instance
an ExtendedBIOSParameterBlock instance
a FAT32BIOSParameterBlock, if one is present, or None otherwise
- nobodd.fs.fat_type_from_count(bpb, ebpb, ebpb_fat32)[source]
Derives the type of the FAT file-system when it cannot be determined directly from the bpb and ebpb headers (the BIOSParameterBlock, and ExtendedBIOSParameterBlock respectively).
Uses known limits on the number of clusters to derive the type of FAT in use. Returns one of the strings “fat12”, “fat16”, or “fat32”.
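The "known limits" are assumed here to be the thresholds given in Microsoft's FAT specification. The following sketch applies them to an already computed cluster count; fat_type_by_cluster_count() is a hypothetical helper with a different signature from the function above, which works from the header structures instead.

```python
def fat_type_by_cluster_count(clusters):
    """Classify a volume by its number of data clusters.

    Thresholds are those of the Microsoft FAT specification; treating them
    as the limits used by nobodd is an assumption of this sketch.
    """
    if clusters < 4085:
        return 'fat12'
    elif clusters < 65525:
        return 'fat16'
    else:
        return 'fat32'

for count in (4084, 4085, 65524, 65525):
    print(count, fat_type_by_cluster_count(count))
```

The count is conventionally the number of data sectors divided by the sectors per cluster, rounded down, so it has to be derived from several header fields before a rule like this can be applied.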