- Source: Design of the FAT file system
The FAT file system is a file system used on MS-DOS and Windows 9x family of operating systems. It continues to be used on mobile devices and embedded systems, and thus is a well-suited file system for data exchange between computers and devices of almost any type and age from 1981 through to the present.
Structure
A FAT file system is composed of four regions:
FAT uses little-endian format for all entries in the header (except for, where explicitly mentioned, some entries on Atari ST boot sectors) and the FAT(s). It is possible to allocate more FAT sectors than necessary for the number of clusters. The end of the last sector of each FAT copy can be unused if there are no corresponding clusters. The total number of sectors (as noted in the boot record) can be larger than the number of sectors used by data (clusters × sectors per cluster), FATs (number of FATs × sectors per FAT), the root directory (n/a for FAT32), and hidden sectors including the boot sector: this would result in unused sectors at the end of the volume. If a partition contains more sectors than the total number of sectors occupied by the file system it would also result in unused sectors, at the end of the partition, after the volume.
Reserved sectors area
= Boot Sector
=On non-partitioned storage devices, such as floppy disks, the Boot Sector (VBR) is the first sector (logical sector 0 with physical CHS address 0/0/1 or LBA address 0). For partitioned storage devices such as hard disks, the Boot Sector is the first sector of a partition, as specified in the partition table of the device.
BIOS Parameter Block
DOS 3.0 BPB:
The following extensions were documented since DOS 3.0, however, they were already supported by some issues of DOS 2.11. MS-DOS 3.10 still supported the DOS 2.0 format, but could use the DOS 3.0 format as well.
DOS 3.2 BPB:
Officially, MS-DOS 3.20 still used the DOS 3.0 format, but SYS and FORMAT were adapted to support a 6 bytes longer format already (of which not all entries were used).
DOS 3.31 BPB:
Officially introduced with DOS 3.31 and not used by DOS 3.2, some DOS 3.2 utilities were designed to be aware of this new format already. Official documentation recommends to trust these values only if the logical sectors entry at offset 0x013 is zero.
A simple formula translates a volume's given cluster number CN to a logical sector number LSN:
Determine (once) SSA=RSC+FN×SF+ceil((32×RDE)/SS), where the reserved sector count RSC is stored at offset 0x00E, the number of FATsFN at offset 0x010, the sectors per FAT SF at offset 0x016 (FAT12/FAT16) or 0x024 (FAT32), the root directory entries RDE at offset 0x011, the sector size SS at offset 0x00B, and ceil(x) rounds up to a whole number.
Determine LSN=SSA+(CN−2)×SC, where the sectors per cluster SC are stored at offset 0x00D.
On unpartitioned media the volume's number of hidden sectors is zero and therefore LSN and LBA addresses become the same for as long as a volume's logical sector size is identical to the underlying medium's physical sector size. Under these conditions, it is also simple to translate between CHS addresses and LSNs as well:
LSN=SPT×(HN+(NOS×TN))+SN−1, where the sectors per track SPT are stored at offset 0x018, and the number of sides NOS at offset 0x01A. Track number TN, head number HN, and sector number SN correspond to Cylinder-head-sector: the formula gives the known CHS to LBA translation.
Extended BIOS Parameter Block
Further structure used by FAT12 and FAT16 since OS/2 1.0 and DOS 4.0, also known as Extended BIOS Parameter Block (EBPB) (bytes below sector offset 0x024 are the same as for the DOS 3.31 BPB):
FAT32 Extended BIOS Parameter Block
In essence FAT32 inserts 28 bytes into the EBPB, followed by the remaining 26 (or sometimes only 7) EBPB bytes as shown above for FAT12 and FAT16. Microsoft and IBM operating systems determine the type of FAT file system used on a volume solely by the number of clusters, not by the used BPB format or the indicated file system type, that is, it is technically possible to use a "FAT32 EBPB" also for FAT12 and FAT16 volumes as well as a DOS 4.0 EBPB for small FAT32 volumes. Since such volumes were found to be created by Windows operating systems under some odd conditions, operating systems should be prepared to cope with these hybrid forms.
Exceptions
Versions of DOS before 3.2 totally or partially relied on the media descriptor byte in the BPB or the FAT ID byte in cluster 0 of the first FAT in order to determine FAT12 diskette formats even if a BPB is present. Depending on the FAT ID found and the drive type detected they default to use one of the following BPB prototypes instead of using the values actually stored in the BPB.
Originally, the FAT ID was meant to be a bit flag with all bits set except for bit 2 cleared to indicate an 80 track (vs. 40 track) format, bit 1 cleared to indicate a 9 sector (vs. 8 sector) format, and bit 0 cleared to indicate a single-sided (vs. double-sided) format, but this scheme was not followed by all OEMs and became obsolete with the introduction of hard disks and high-density formats. Also, the various 8-inch formats supported by 86-DOS and MS-DOS do not fit this scheme.
Microsoft recommends to distinguish between the two 8-inch formats for FAT ID 0xFE by trying to read of a single-density address mark. If this results in an error, the medium must be double-density.
The table does not list a number of incompatible 8-inch and 5.25-inch FAT12 floppy formats supported by 86-DOS, which differ either in the size of the directory entries (16 bytes vs. 32 bytes) or in the extent of the reserved sectors area (several whole tracks vs. one logical sector only).
The implementation of a single-sided 315 KB FAT12 format used in MS-DOS for the Apricot PC and F1e had a different boot sector layout, to accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were omitted, and the MS-DOS BPB parameters (offsets 0x00B-0x017 in the standard boot sector) were located at offset 0x050. The Portable, F1, PC duo and Xi FD supported a non-standard double-sided 720 KB FAT12 format instead. The differences in the boot sector layout and media IDs made these formats incompatible with many other operating systems. The geometry parameters for these formats are:
315 KB: Bytes per logical sector: 512 bytes, logical sectors per cluster: 1, reserved logical sectors: 1, number of FATs: 2, root directory entries: 128, total logical sectors: 630, FAT ID: 0xFC, logical sectors per FAT: 2, physical sectors per track: 9, number of heads: 1.
720 KB: Bytes per logical sector: 512 bytes, logical sectors per cluster: 2, reserved logical sectors: 1, number of FATs: 2, root directory entries: 176, total logical sectors: 1440, FAT ID: 0xFE, logical sectors per FAT: 3, physical sectors per track: 9, number of heads: 2.
Later versions of Apricot MS-DOS gained the ability to read and write disks with the standard boot sector in addition to those with the Apricot one. These formats were also supported by DOS Plus 2.1e/g for the Apricot ACT series.
The DOS Plus adaptation for the BBC Master 512 supported two FAT12 formats on 80-track, double-sided, double-density 5.25" drives, which did not use conventional boot sectors at all. 800 KB data disks omitted a boot sector and began with a single copy of the FAT. The first byte of the relocated FAT in logical sector 0 was used to determine the disk's capacity. 640 KB boot disks began with a miniature ADFS file system containing the boot loader, followed by a single FAT. Also, the 640 KB format differed by using physical CHS sector numbers starting with 0 (not 1, as common) and incrementing sectors in the order sector-track-head (not sector-head-track, as common). The FAT started at the beginning of the next track. These differences make these formats unrecognizable by other operating systems. The geometry parameters for these formats are:
800 KB: Bytes per logical sector: 1024 bytes, logical sectors per cluster: 1, reserved logical sectors: 0, number of FATs: 1, root directory entries: 192, total logical sectors: 800, FAT ID: 0xFD, logical sectors per FAT: 2, physical sectors per track: 5, number of heads: 2.
640 KB: Bytes per logical sector: 256 bytes, logical sectors per cluster: 8, reserved logical sectors: 16, number of FATs: 1, root directory entries: 112, total logical sectors: 2560, FAT ID: 0xFF, logical sectors per FAT: 2, physical sectors per track: 16, number of heads: 2.
DOS Plus for the Master 512 could also access standard PC disks formatted to 180 KB or 360 KB, using the first byte of the FAT in logical sector 1 to determine the capacity.
The DEC Rainbow 100 (all variations) supported one FAT12 format on 80-track, single-sided, quad-density 5.25" drives. The first two tracks were reserved for the boot loader, but didn't contain an MBR nor a BPB (MS-DOS used a static in-memory BPB instead). The boot sector (track 0, side 0, sector 1) was Z80 code beginning with DI 0xF3. The 8088 bootstrap was loaded by the Z80. Track 1, side 0, sector 2 starts with the Media/FAT ID byte 0xFA. Unformatted disks use 0xE5 instead. The file system starts on track 2, side 0, sector 1. There are 2 copies of the FAT and 96 entries in the root directory. In addition, there is a physical to logical track mapping to effect a 2:1 sector interleaving. The disks were formatted with the physical sectors in order numbered 1 to 10 on each track after the reserved tracks, but the logical sectors from 1 to 10 were stored in physical sectors 1, 6, 2, 7, 3, 8, 4, 9, 5, 10.
= FS Information Sector
=The "FS Information Sector" was introduced in FAT32 for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a logical sector number specified in the FAT32 EBPB boot record at position 0x030 (usually logical sector 1, immediately after the boot record itself).
The sector's data may be outdated and not reflect the current media contents, because not all operating systems update or use this sector, and even if they do, the contents is not valid when the medium has been ejected without properly unmounting the volume or after a power-failure. Therefore, operating systems should first inspect a volume's optional shutdown status bitflags residing in the FAT entry of cluster 1 or the FAT32 EBPB at offset 0x041 and ignore the data stored in the FS information sector, if these bitflags indicate that the volume was not properly unmounted before. This does not cause any problems other than a possible speed penalty for the first free space query or data cluster allocation; see fragmentation.
If this sector is present on a FAT32 volume, the minimum allowed logical sector size is 512 bytes, whereas otherwise it would be 128 bytes. Some FAT32 implementations support a slight variation of Microsoft's specification by making the FS information sector optional by specifying a value of 0xFFFF (or 0x0000) in the entry at offset 0x030.
FAT region
= File Allocation Table
=Cluster map
A volume's data area is divided into identically sized clusters—small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the drive; typical cluster sizes range from 2 to 32 KiB.
Each file may occupy one or more clusters depending on its size. Thus, a file is represented by a chain of clusters (referred to as a singly linked list). These clusters are not necessarily stored adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region.
Each version of the FAT file system uses a different size for FAT entries. Smaller numbers result in a smaller FAT, but waste space in large partitions by needing to allocate in large clusters.
The FAT12 file system uses 12 bits per FAT entry, thus two entries span 3 bytes. It is consistently little-endian: if those three bytes are considered as one little-endian 24-bit number, the 12 least significant bits represent the first entry (e.g. cluster 0) and the 12 most significant bits the second (e.g. cluster 1). In other words, while the low eight bits of the first cluster in the row are stored in the first byte, the top four bits are stored in the low nibble of the second byte, whereas the low four bits of the subsequent cluster in the row are stored in the high nibble of the second byte and its higher eight bits in the third byte.
The FAT16 file system uses 16 bits per FAT entry, thus one entry spans two bytes in little-endian byte order:
The FAT32 file system uses 32 bits per FAT entry, thus one entry spans four bytes in little-endian byte order. The four top bits of each entry are reserved for other purposes; they are cleared during formatting and should not be changed otherwise. They must be masked off before interpreting the entry as 28-bit cluster address.
The File Allocation Table (FAT) is a contiguous number of sectors immediately following the area of reserved sectors. It represents a list of entries that map to each cluster on the volume. Each entry records one of four things:
the cluster number of the next cluster in a chain
a special end of cluster-chain (EOC) entry that indicates the end of a chain
a special entry to mark a bad cluster
a zero to note that the cluster is unused
For very early versions of DOS to recognize the file system, the system must have been booted from the volume or the volume's FAT must start with the volume's second sector (logical sector 1 with physical CHS address 0/0/2 or LBA address 1), that is, immediately following the boot sector. Operating systems assume this hard-wired location of the FAT in order to find the FAT ID in the FAT's cluster 0 entry on DOS 1.0-1.1 FAT diskettes, where no valid BPB is found.
Special entries
The first two entries in a FAT store special values:
The first entry (cluster 0 in the FAT) holds the FAT ID since MS-DOS 1.20 and PC DOS 1.1 (allowed values 0xF0-0xFF with 0xF1-0xF7 reserved) in bits 7-0, which is also copied into the BPB of the boot sector, offset 0x015 since DOS 2.0. The remaining 4 bits (if FAT12), 8 bits (if FAT16) or 20 bits (if FAT32, the 4 MSB bits are zero) of this entry are always 1. These values were arranged so that the entry would also function as a "trap-all" end-of-chain marker for all data clusters holding a value of zero. Additionally, for FAT IDs other than 0xFF (and 0x00) it is possible to determine the correct nibble and byte order (to be) used by the file system driver, however, the FAT file system officially uses a little-endian representation only and there are no known implementations of variants using big-endian values instead. 86-DOS 0.42 up to MS-DOS 1.14 used hard-wired drive profiles instead of a FAT ID, but used this byte to distinguish between media formatted with 32-byte or 16-byte directory entries, as they were used prior to 86-DOS 0.42.
The second entry (cluster 1 in the FAT) nominally stores the end-of-cluster-chain marker as used by the formater, but typically always holds 0xFFF / 0xFFFF / 0x0FFFFFFF, that is, with the exception of bits 31-28 on FAT32 volumes these bits are normally always set. Some Microsoft operating systems, however, set these bits if the volume is not the volume holding the running operating system (that is, use 0xFFFFFFFF instead of 0x0FFFFFFF here). (In conjunction with alternative end-of-chain markers the lowest bits 2-0 can become zero for the lowest allowed end-of-chain marker 0xFF8 / 0xFFF8 / 0x?FFFFFF8; bit 3 should be reserved as well given that clusters 0xFF0 / 0xFFF0 / 0x?FFFFFF0 and higher are officially reserved. Some operating systems may not be able to mount some volumes if any of these bits are not set, therefore the default end-of-chain marker should not be changed.) For DOS 1 and 2, the entry was documented as reserved for future use.
Since DOS 7.1 the two most-significant bits of this cluster entry may hold two optional bitflags representing the current volume status on FAT16 and FAT32, but not on FAT12 volumes. These bitflags are not supported by all operating systems, but operating systems supporting this feature would set these bits on shutdown and clear the most significant bit on startup:
If bit 15 (on FAT16) or bit 27 (on FAT32) is not set when mounting the volume, the volume was not properly unmounted before shutdown or ejection and thus is in an unknown and possibly "dirty" state. On FAT32 volumes, the FS Information Sector may hold outdated data and thus should not be used. The operating system would then typically run SCANDISK or CHKDSK on the next startup (but not on insertion of removable media) to ensure and possibly reestablish the volume's integrity.
If bit 14 (on FAT16) or bit 26 (on FAT32) is cleared, the operating system has encountered disk I/O errors on startup, a possible indication for bad sectors. Operating systems aware of this extension will interpret this as a recommendation to carry out a surface scan (SCANDISK) on the next boot. (A similar set of bitflags exists in the FAT12/FAT16 EBPB at offset 0x1A or the FAT32 EBPB at offset 0x36. While the cluster 1 entry can be accessed by file system drivers once they have mounted the volume, the EBPB entry is available even when the volume is not mounted and thus easier to use by disk block device drivers or partitioning tools.)
If the number of FATs in the BPB is not set to 2, the second cluster entry in the first FAT (cluster 1) may also reflect the status of a TFAT volume for TFAT-aware operating systems. If the cluster 1 entry in that FAT holds the value 0, this may indicate that the second FAT represents the last known valid transaction state and should be copied over the first FAT, whereas the first FAT should be copied over the second FAT if all bits are set.
Some non-standard FAT12/FAT16 implementations utilize the cluster 1 entry to store the starting cluster of a variable-sized root directory (typically 2). This may occur when the number of root directory entries in the BPB holds a value of 0 and no FAT32 EBPB is found (no signature 0x29 or 0x28 at offset 0x042). This extension, however, is not supported by mainstream operating systems, as it conflicts with other possible uses of the cluster 1 entry. Most conflicts can be ruled out if this extension is only allowed for FAT12 with less than 0xFEF and FAT16 volumes with less than 0x3FEF clusters and 2 FATs.
Because these first two FAT entries store special values, there are no data clusters 0 or 1. The first data cluster (after the root directory if FAT12/FAT16) is cluster 2, marking the beginning of the data area.
Cluster values
FAT entry values:
FAT32 uses 28 bits for cluster numbers. The remaining 4 bits in the 32-bit FAT entry are usually zero, but are reserved and should be left untouched. A standard conformant FAT32 file system driver or maintenance tool must not rely on the upper 4 bits to be zero and it must strip them off before evaluating the cluster number in order to cope with possible future expansions where these bits may be used for other purposes. They must not be cleared by the file system driver when allocating new clusters, but should be cleared during a reformat.
Root directory region
The root directory table in FAT12 and FAT16 file systems occupies the special Root Directory Region location.
Data region
Aside from the root directory table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all directory tables are stored in the data region. The actual number of entries in a directory stored in the data region can grow by adding another cluster to the chain in the FAT.
= Directory table
=A directory table is a special type of file that represents a directory (also known as a folder). Since 86-DOS 0.42, each file or (since MS-DOS 1.40 and PC DOS 2.0) subdirectory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the address of the first cluster of the file/directory's data, the size of the file/directory, and the date and (since PC DOS 1.1) also the time of last modification. Earlier versions of 86-DOS used 16-byte directory entries only, supporting no files larger than 16 MB and no time of last modification.
The FAT file system itself does not impose any limits on the depth of a subdirectory tree for as long as there are free clusters available to allocate the subdirectories, however, the internal Current Directory Structure (CDS) under MS-DOS/PC DOS limits the absolute path of a directory to 66 characters (including the drive letter, but excluding the NUL byte delimiter), thereby limiting the maximum supported depth of subdirectories to 32, whatever occurs earlier. Concurrent DOS, Multiuser DOS and DR DOS 3.31 to 6.0 (up to including the 1992-11 updates) do not store absolute paths to working directories internally and therefore do not show this limitation. The same applies to Atari GEMDOS, but the Atari Desktop does not support more than 8 sub-directory levels. Most applications aware of this extension support paths up to at least 127 bytes. FlexOS, 4680 OS and 4690 OS support a length of up to 127 bytes as well, allowing depths down to 60 levels. PalmDOS, DR DOS 6.0 (since BDOS 7.1) and higher, Novell DOS, and OpenDOS sport a MS-DOS-compatible CDS and therefore have the same length limits as MS-DOS/PC DOS.
Each entry can be preceded by "fake entries" to support a VFAT long filename (LFN); see further below.
Legal characters for DOS short filenames include the following:
Upper case letters A–Z
Numbers 0–9
Space (though trailing spaces in either the base name or the extension are considered to be padding and not a part of the file name; also filenames with space in them could not easily be used on the DOS command line prior to Windows 95 because of the lack of a suitable escaping system). Another exception are the internal commands MKDIR/MD and RMDIR/RD under DR-DOS which accept single arguments and therefore allow spaces to be entered.
! # $ % & ' ( ) - @ ^ _ ` { } ~
Characters 128–228
Characters 230–255
This excludes the following ASCII characters:
" * / : < > ? \ | Windows/MS-DOS has no shell escape character
+ , . ; = [ ] Allowed in long file names only
Lower case letters a–zStored as A–Z; allowed in long file names
Control characters 0–31
Character 127 (DEL)
Character 229 (0xE5) was not allowed as first character in a filename in DOS 1 and 2 due to its use as free entry marker. A special case was added to circumvent this limitation with DOS 3.0 and higher.
The following additional characters are allowed on Atari's GEMDOS, but should be avoided for compatibility with MS-DOS/PC DOS:
" + , ; < = > [ ] |
The semicolon (;) should be avoided in filenames under DR DOS 3.31 and higher, PalmDOS, Novell DOS, OpenDOS, Concurrent DOS, Multiuser DOS, System Manager and REAL/32, because it may conflict with the syntax to specify file and directory passwords: "...\DIRSPEC.EXT;DIRPWD\FILESPEC.EXT;FILEPWD". The operating system will strip off one (and also two—since DR-DOS 7.02) semicolons and pending passwords from the filenames before storing them on disk. (The command processor 4DOS uses semicolons for include lists and requires the semicolon to be doubled for password protected files with any commands supporting wildcards.)
The at-sign character (@) is used for filelists by many DR-DOS, PalmDOS, Novell DOS, OpenDOS and Multiuser DOS, System Manager and REAL/32 commands, as well as by 4DOS and may therefore sometimes be difficult to use in filenames.
Under Multiuser DOS and REAL/32, the exclamation mark (!) is not a valid filename character since it is used to separate multiple commands in a single command line.
Under IBM 4680 OS and 4690 OS, the following characters are not allowed in filenames:
? * : . ; , [ ] ! + = < > " - / \ |
Additionally, the following special characters are not allowed in the first, fourth, fifth and eight character of a filename, as they conflict with the host command processor (HCP) and input sequence table build file names:
@ # ( ) { } $ &
The DOS file names are in the current OEM character set: this can have surprising effects if characters handled in one way for a given code page are interpreted differently for another code page (DOS command CHCP) with respect to lower and upper case, sorting, or validity as file name character.
Directory entry
Before Microsoft added support for long filenames and creation/access time stamps, bytes 0x0C–0x15 of the directory entry were used by other operating systems to store additional metadata, most notably the operating systems of the Digital Research family stored file passwords, access rights, owner IDs, and file deletion data there. While Microsoft's newer extensions are not fully compatible with these extensions by default, most of them can coexist in third-party FAT implementations (at least on FAT12 and FAT16 volumes).
32-byte directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename):
The FlexOS-based operating systems IBM 4680 OS and IBM 4690 OS support unique distribution attributes stored in some bits of the previously reserved areas in the directory entries:
Local: Don't distribute file but keep on local controller only.
Mirror file on update: Distribute file to server only when file is updated.
Mirror file on close: Distribute file to server only when file is closed.
Compound file on update: Distribute file to all controllers when file is updated.
Compound file on close: Distribute file to all controllers when file is closed.
Some incompatible extensions found in some operating systems include:
Size limits
The FAT12, FAT16, FAT16B, and FAT32 variants of the FAT file systems have clear limits based on the number of clusters and the number of sectors per cluster (1, 2, 4, ..., 128). For the typical value of 512 bytes per sector:
Legend: 268435444+3 is 0x0FFFFFF7, because FAT32 version 0 uses only 28 bits in the 32-bit cluster numbers, cluster numbers 0x0FFFFFF7 up to 0x0FFFFFFF flag bad clusters or the end of a file, cluster number 0 flags a free cluster, and cluster number 1 is not used. Likewise 65524+3 is 0xFFF7 for FAT16, and 4084+3 is 0xFF7 for FAT12. The number of sectors per cluster is a power of 2 fitting in a single byte, the smallest value is 1 (0x01), the biggest value is 128 (0x80). Lines in square brackets indicate the unusual cluster size 128, and for FAT32 the bigger than necessary cluster sizes 32 or 64.
Because each FAT32 entry occupies 32 bits (4 bytes) the maximal number of clusters (268435444) requires 2097152 FAT sectors for a sector size of 512 bytes. 2097152 is 0x200000, and storing this value needs more than two bytes. Therefore, FAT32 introduced a new 32-bit value in the FAT32 boot sector immediately following the 32-bit value for the total number of sectors introduced in the FAT16B variant.
The boot record extensions introduced with DOS 4.0 start with a magic 40 (0x28) or 41 (0x29). Typically FAT drivers look only at the number of clusters to distinguish FAT12, FAT16, and FAT32: the human readable strings identifying the FAT variant in the boot record are ignored, because they exist only for media formatted with DOS 4.0 or later.
Determining the number of directory entries per cluster is straightforward. Each entry occupies 32 bytes; this results in 16 entries per sector for a sector size of 512 bytes. The DOS 5 RMDIR/RD command removes the initial "." (this directory) and ".." (parent directory) entries in subdirectories directly, therefore sector size 32 on a RAM disk is possible for FAT12, but requires 2 or more sectors per cluster. A FAT12 boot sector without the DOS 4 extensions needs 29 bytes before the first unnecessary FAT16B 32-bit number of hidden sectors, this leaves three bytes for the (on a RAM disk unused) boot code and the magic 0x55 0xAA at the end of all boot sectors. On Windows NT the smallest supported sector size is 128.
On Windows NT operating systems the FORMAT command options /A:128K and /A:256K correspond to the maximal cluster size 0x80 (128) with a sector size 1024 and 2048, respectively. For the common sector size 512 /A:64K yields 128 sectors per cluster.
Both editions of each ECMA-107 and ISO/IEC 9293 specify a Max Cluster Number MAX determined by the formula MAX=1+trunc((TS-SSA)/SC), and reserve cluster numbers MAX+1 up to 4086 (0xFF6, FAT12) and later 65526 (0xFFF6, FAT16) for future standardization.
Microsoft's EFI FAT32 specification states that any FAT file system with less than 4085 clusters is FAT12, else any FAT file system with less than 65,525 clusters is FAT16, and otherwise it is FAT32. The entry for cluster 0 at the beginning of the FAT must be identical to the media descriptor byte found in the BPB, whereas the entry for cluster 1 reflects the end-of-chain value used by the formatter for cluster chains (0xFFF, 0xFFFF or 0x0FFFFFFF). The entries for cluster numbers 0 and 1 end at a byte boundary even for FAT12, e.g., 0xF9FFFF for media descriptor 0xF9.
The first data cluster is 2, and consequently the last cluster MAX gets number MAX+1. This results in data cluster numbers 2...4085 (0xFF5) for FAT12, 2...65525 (0xFFF5) for FAT16, and 2...268435445 (0x0FFFFFF5) for FAT32.
The only available values reserved for future standardization are therefore 0xFF6 (FAT12) and 0xFFF6 (FAT16). As noted below "less than 4085" is also used for Linux implementations, or as Microsoft's FAT specification puts it:
...when it says <, it does not mean <=. Note also that the numbers are correct. The first number for FAT12 is 4085; the second number for FAT16 is 65525. These numbers and the "<" signs are not wrong."
Fragmentation
The FAT file system does not contain built-in mechanisms which prevent newly written files from becoming scattered across the partition. On volumes where files are created and deleted frequently or their lengths often changed, the medium will become increasingly fragmented over time.
While the design of the FAT file system does not cause any organizational overhead in disk structures or reduce the amount of free storage space with increased amounts of fragmentation, as it occurs with external fragmentation, the time required to read and write fragmented files will increase as the operating system will have to follow the cluster chains in the FAT (with parts having to be loaded into memory first in particular on large volumes) and read the corresponding data physically scattered over the whole medium reducing chances for the low-level block device driver to perform multi-sector disk I/O or initiate larger DMA transfers, thereby effectively increasing I/O protocol overhead as well as arm movement and head settle times inside the disk drive. Also, file operations will become slower with growing fragmentation as it takes increasingly longer for the operating system to find files or free clusters.
Other file systems, e.g., HPFS or exFAT, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas. Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks.
In fact, seeking for files in large subdirectories or computing the free disk space on FAT volumes is one of the most resource intensive operations, as it requires reading the directory tables or even the entire FAT linearly. Since the total amount of clusters and the size of their entries in the FAT was still small on FAT12 and FAT16 volumes, this could still be tolerated on FAT12 and FAT16 volumes most of the time, considering that the introduction of more sophisticated disk structures would have also increased the complexity and memory footprint of real-mode operating systems with their minimum total memory requirements of 128 KB or less (such as with DOS) for which FAT has been designed and optimized originally.
With the introduction of FAT32, long seek and scan times became more apparent, particularly on very large volumes. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased. FAT32 therefore introduced a special file system information sector where the previously computed amount of free space is preserved over power cycles, so that the free space counter needs to be recalculated only when a removable FAT32 formatted medium gets ejected without first unmounting it or if the system is switched off without properly shutting down the operating system, a problem mostly visible with pre-ATX-style PCs, on plain DOS systems and some battery-powered consumer products.
With the huge cluster sizes (16 KB, 32 KB, 64 KB) forced by larger FAT partitions, internal fragmentation in form of disk space waste by file slack due to cluster overhang (as files are rarely exact multiples of cluster size) starts to be a problem as well, especially when there are a great many small files.
Various optimizations and tweaks to the implementation of FAT file system drivers, block device drivers and disk tools have been devised to overcome most of the performance bottlenecks in the file system's inherent design without having to change the layout of the on-disk structures. They can be divided into on-line and off-line methods and work by trying to avoid fragmentation in the file system in the first place, deploying methods to better cope with existing fragmentation, and by reordering and optimizing the on-disk structures. With optimizations in place, the performance on FAT volumes can often reach that of more sophisticated file systems in practical scenarios, while at the same time retaining the advantage of being accessible even on very small or old systems.
DOS 3.0 and higher will not immediately reuse disk space of deleted files for new allocations but instead seek for previously unused space before starting to use disk space of previously deleted files as well. This not only helps to maintain the integrity of deleted files for as long as possible but also speeds up file allocations and avoids fragmentation, since never before allocated disk space is always unfragmented.
DOS accomplishes this by keeping a pointer to the last allocated cluster on each mounted volume in memory and starts searching for free space from this location upwards instead of at the beginning of the FAT, as it was still done by DOS 2.x. If the end of the FAT is reached, it would wrap around to continue the search at the beginning of the FAT until either free space has been found or the original position has been reached again without having found free space. These pointers are initialized to point to the start of the FATs after bootup, but on FAT32 volumes, DOS 7.1 and higher will attempt to retrieve the last position from the FS Information Sector.
This mechanism is defeated, however, if an application often deletes and recreates temporary files as the operating system would then try to maintain the integrity of void data effectively causing more fragmentation in the end. In some DOS versions, the usage of a special API function to create temporary files can be used to avoid this problem.
Additionally, directory entries of deleted files will be marked 0xE5 since DOS 3.0. DOS 5.0 and higher will start to reuse these entries only when previously unused directory entries have been used up in the table and the system would otherwise have to expand the table itself.
Since DOS 3.3 the operating system provides means to improve the performance of file operations with FASTOPEN by keeping track of the position of recently opened files or directories in various forms of lists (MS-DOS/PC DOS) or hash tables (DR-DOS), which can reduce file seek and open times significantly. Before DOS 5.0 special care must be taken when using such mechanisms in conjunction with disk defragmentation software bypassing the file system or disk drivers.
Windows NT will allocate disk space to files on FAT in advance, selecting large contiguous areas, but in case of a failure, files which were being appended will appear larger than they were ever written into, with a lot of random data at the end.
Other high-level mechanisms may read in and process larger parts or the complete FAT on startup or on demand when needed and dynamically build up in-memory tree representations of the volume's file structures different from the on-disk structures. This may, on volumes with many free clusters, occupy even less memory than an image of the FAT itself. In particular on highly fragmented or filled volumes, seeks become much faster than with linear scans over the actual FAT, even if an image of the FAT would be stored in memory. Also, operating on the logically high level of files and cluster-chains instead of on sector or track level, it becomes possible to avoid some degree of file fragmentation in the first place or to carry out local file defragmentation and reordering of directory entries based on their names or access patterns in the background.
Some of the perceived problems with fragmentation of FAT file systems also result from performance limitations of the underlying block device drivers, which becomes more visible the lesser memory is available for sector buffering and track blocking/deblocking:
While the single-tasking DOS had provisions for multi-sector reads and track blocking/deblocking, the operating system and the traditional PC hard disk architecture (only one outstanding input/output request at a time and no DMA transfers) originally did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks. Such features became available later. Later DOS versions also provided built-in support for look-ahead sector buffering and came with dynamically loadable disk caching programs working on physical or logical sector level, often utilizing EMS or XMS memory and sometimes providing adaptive caching strategies or even run in protected mode through DPMS or Cloaking to increase performance by gaining direct access to the cached data in linear memory rather than through conventional DOS APIs.
Write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a power failure or crash, made easier by the lack of hardware protection between applications and the system.
VFAT long file names
VFAT Long File Names (LFNs) are stored on a FAT file system using a trick: adding additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered as empty and is allowed to be deleted; such a situation appears if files created with long names are deleted from plain DOS. This method is very similar to the DELWATCH method to utilize the volume attribute to hide pending delete files for possible future undeletion since DR DOS 6.0 (1991) and higher. It is also similar to a method publicly discussed to store long filenames on Ataris and under Linux in 1992.
Because older versions of DOS could mistake LFN names in the root directory for the volume label, VFAT was designed to create a blank volume label in the root directory before adding any LFN name entries (if a volume label did not already exist).
Each phony entry can contain up to 13 UCS-2 characters (26 bytes) by using fields in the record which contain file size or time stamps (but not the starting cluster field, for compatibility with disk utilities, the starting cluster field is set to a value of 0. See 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UCS-2 characters.
If the position of the LFN's last character is not at a directory entry boundary (13, 26, 39, ...), then a 0x0000 terminator is added in the next character position. Then, if that terminator is also not at the boundary, remaining character positions are filled with 0xFFFF. No directory entry containing a lone terminator will exist.
LFN entries use the following format:
If there are multiple LFN entries required to represent a file name, the entry representing the end of the filename comes first. The sequence number of this entry has bit 6 (0x40) set to represent that it is the last logical LFN entry, and it has the highest sequence number. The sequence number decreases in the following entries. The entry representing the start of the filename has sequence number 1. A value of 0xE5 is used to indicate that the entry is deleted.
On FAT12 and FAT16 volumes, testing for the values at 0x1A to be zero and at 0x1C to be non-zero can be used to distinguish between VFAT LFNs and pending delete files under DELWATCH.
For example, a filename like "File with very long filename.ext" would be formatted like this:
A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (pFCBName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename, and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README␠␠TXT".)
If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions of Windows such as XP. Instead, two bits in byte 0x0C of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support it. This creates a backwards-compatibility problem with older Windows versions (Windows 95 / 98 / 98 SE / ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported between operating systems, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.
See also
Comparison of file systems
Drive letter assignment
exFAT
Extended Boot Record (EBR)
FAT filesystem and Linux
List of file systems
Master Boot Record (MBR)
Partition type
Timeline of DOS operating systems
Transaction-Safe FAT File System
Turbo FAT
Volume Boot Record (VBR)
Notes
References
External links
ECMA-107 Volume and File Structure of Disk Cartridges for Information Interchange, identical to ISO/IEC 9293.
Microsoft Extensible Firmware Initiative FAT32 File System Specification, FAT: General Overview of On-Disk Format
Understanding FAT32 file systems (explained for embedded firmware developers)
Understanding FAT including lots of info about LFNs
Detailed Explanation of FAT Boot Sector: Microsoft Knowledge Base Article 140418
Description of the FAT32 File System: Microsoft Knowledge Base Article 154997
FAT12/FAT16/FAT32 file system implementation for *nix: Includes libfat libraries and fusefat, a FUSE file system driver
MS-DOS: Directory and Subdirectory Limitations: Microsoft Knowledge Base Article 39927
Overview of FAT, HPFS, and NTFS File Systems: Microsoft Knowledge Base Article 100108
Volume and file size limits of FAT file systems: Microsoft Technet, copy made by Internet Archive Wayback Machine
Microsoft TechNet: A Brief and Incomplete History of FAT32 by Raymond Chen
FAT32 Formatter Archived 2009-07-21 at the Wayback Machine: allows formatting volumes larger than 32 GB with FAT32 under Windows 2000, Windows XP and Windows Vista
Fdisk does not recognize full size of hard disks larger than 64 GB: Microsoft Knowledge Base Article 263044, copy made by Internet Archive Wayback Machine. Explains inability to work with extremely large volumes under Windows 95/98.
Microsoft Windows XP: FAT32 File System. Copy made by Internet Archive Wayback Machine of an article with summary of limits in FAT32 which is no longer available on Microsoft website.
Visual Layout of a FAT16 drive
Kata Kunci Pencarian:
- Sejarah Microsoft Windows
- PlayStation 3
- Daftar pangram
- Design of the FAT file system
- File Allocation Table
- ExFAT
- Transaction-Safe FAT File System
- NTFS
- 8.3 filename
- FAT filesystem and Linux
- List of file formats
- High Performance File System
- Installable File System