Windows-Server-2003/base/fs/ntfs/specnot2.txt

Here is a list of all NTFS design issues which have come up that effect the
structure, along with current resolution (if there is one) of the issue.  The
resolution of these issues affects the "NTFS Design Specification 1.1" issued
May 29, 1991.  This list will be the final qualification to the spec until
there is time to update it to a form which reflects the actual implementation.
Of course the most precise definition of NTFS will always be in the header
file which describes its structure: ntfs.h.

These issues have been collected primarily from our own internal review and
the feedback received from MarkZ.  They are listed here in no particular
order.

Issue 1:

    Support for nontagged attributes is a pain in the low-level attribute
    routines, as well as in Format and ChkDsk.  They are of very little
    value to the File System in terms of space or performance.

    Resolution:

    Nontagged attributes are being dropped for the purposes of NTFS's own
    use of attributes to implement the disk structure.  Nontagged attributes
    will be supported with the general table support.

Issue 2:

    The EXTERNAL_ATTRIBUTES attribute, should have a better name, and its
    definition should be changed to simplify various NTFS algorithms.

    Resolution:

    The attribute name has been changed to the ATTRIBUTE_LIST attribute.
    It is still only created when a file requires more than one file record
    segment.  At that time it is created to list all attributes (including
    those in the base file record) by type code and (optional) name.  it is
    ordered by Attribute Type Code and Attribute Name.

    One reason for this change is to facilitate the enumeration of all
    attributes for a file with multiple file record segments.  This
    slightly different definition also gives NTFS's attribute placement
    policy more freedom to shuffle attributes around within the file
    record segments.

Issue 3:

    Attribute ordering rules within the file, within each file record segment,
    and within the ATTRIBUTE_LIST were not completely specified.

    Resolution:

    The only rule for the ordering of attributes within each file, if there
    are multiple file record segments, is that STANDARD_INFORMATION must be
    in the base file record segment, and (at least the first part of) the
    ATTRIBUTE_LIST attribute must also be in the base file record segment.
    In general, the system should try to keep the other system-defined
    attributes with the lowest Attribute Type Codes present in the base file
    record segment when possible, for performance reasons.

    Within each file record segment, attributes will be ordered by type code,
    name, and then value.  (If an attribute is not unique in type code and
    name, then it must be indexed and the value must be referenced.)

    The entries of the ATTRIBUTE_LIST will be ordered by attribute code and
    name.

    Reliance on these ordering rules may be used to speed up attribute lookup
    algorithms.

Issue 4:

    NTFS is NOT secure on removeable media without data encryption.

    Resolution:

    Functionality for the encryption of communications and physical media
    is already planned for Product 2 of NT, at which time we will decide
    what the best mechanism will be for integrating this support with
    removeable NTFS volumes.  We must insure now that this can be implemented
    in a upward-compatible manner.

Issue 5:

    It would be very desirable for WINX to have the ability to uniquely
    identify and open files by a small number.

    Resolution:

    Logically the ability to use this functionality must be controlled by
    some privilege, as it is expensive and nearly impossible to come up with a
    consistent strategy on how to do correct path traversal checking, in a
    system such as NTFS which supports multiple directory links to a single
    file.  Once the requirement for a special privilege is accepted, it is
    relatively easy for NTFS to support an API which would allow files to
    be opened by their (64-bit) File Reference number.  The File Reference
    is perfect for this purpose, as it includes a 16-bit cyclically-reused
    sequence number to detect the attempt to use a stale File Reference.
    I.e., the original file with the same 48-bit Base File Record address has
    been deleted, and a new file has been created at the same address.)

    THIS REQUIRES A NEW NT I/O API.

Issue 6:

    Enumeration of files in a directory in NT could be very slow, since
    to get more than just a file's name requires reading (at least) the
    base file record for the file.

    Resolution:

    The initial NT-based implementation of NTFS will come up with a
    strategy for clustering file record segments together in the MFT for
    files created in the same directory.  Current thinking is that this
    will be done *without* change to the NTFS structure definition.  So,
    for example, the first 128 files in a directory might be contiguous in
    the MFT, and then the second 128 will also be contiguous, etc.  This
    will allow the implementation to prefetch files up to 128 file record
    segments at a time with a large spiral read, then expect cache hits during
    the enumeration.

    Secondly, at some point the implementation will cache enumeration
    information, to make subsequent enumeration of the same directory
    extremely fast.

Issue 7:

    Is it an unnecessary complexity to NTFS to support multiple collating
    rules, as opposed to a simple byte-comparison collation?  Note that
    frequently the caller collates himself anyway.

    Resolution:

    This is not resolved yet pending further discussion.

    The current reason NTFS plans to support multiple collating rules,
    is that collating in the caller can have bad performance and response
    characteristics in large directories.  For example, consider a Windows
    App which requests the enumeration of a directory with 200 files (possibly
    over the network to a heavily loaded server), and it is going to
    display this enumeration in a List box with 10 or 20 lines.  If it
    does not have to collate the enumeration, it can start displaying
    as soon as it receives part of the enumeration.  Otherwise it has
    to wait to get the entire enumeration before it can collate and display
    anything.

Issue 8:

    Should there be a bit in STANDARD_INFORMATION to indicate whether a
    file record has an INDEX attribute or not?

    Resolution:

    There is no plan to do this, unless we find additional reasons
    to do so that we are missing.  Currently we see how this bit could
    speed the rejection of illegal path specifications, but it would
    not speed the acceptance of correct ones.  Note that from the structure
    of NTFS, it is legal for a file to have both an INDEX attribute *and*,
    for example, a DATA attribute.

Issue 9:

    The algorithms and consistency rules surrounding the 8.3 indices need to
    be clarified.

    Resolution:

    This will be done by 7/31.

Issue 10:

    Why not eliminate the VERSION attribute and move it to
    STANDARD_INFORMATION?

    Resolution:

    We will do this, and then define an additional file attribute
    and/or field which controls whether or not versioning is enabled and
    possibly how many versions are allowed for a file.

Issue 11:

    There should be a range of system-defined attribute codes which are
    not allowed to be duplicated, as this will speed up some of the
    lookup algorithms.

    Resolution:

    This will be done.

Issue 12:

    Is duplication of the log file the correct way to add redundancy to
    NTFS to allow mounting in the event of read errors.

    Resolution:

    Upon further analysis, it was determined that the needed redundancy
    was incorrectly placed.  It is more important to duplicate the first
    few entries of the MFT, than to duplicate the start of the log file.
    This change will be made.

Issue 13:

    The spec describes how access to individual attribute types may be
    controlled by special ACEs, which is incompatible with the current
    NT APIs and our security strategy.

    Resolution:

    This will be fixed.  Access to user-defined attributes will be controlled
    by the READ_ATTRIBUTES and WRITE_ATTRIBUTES access rights.

Issue 14:

    A file attribute should be added which supports more efficient handling
    of temporary files.

    Resolution:

    An attribute will be added for files, and possibly directories, which
    will enable NTFS to communicate "temporary file" handling to the Cache
    Manager.  Temporary files will never be set dirty in the Cache Manager
    or written to disk by the Lazy Writer, although the File Record will
    be correctly updated to keep the volume consistant.  If a temporary file
    is deleted, then all writes to its data are eliminated.  If MM discovers
    that memory is getting tight, it may choose to flush data to temporary
    files, so that it can free the pages.  In this case the
    correct data for the file will eventually be faulted back in.

    This makes the performance of I/O to temporary files approach the
    performance of putting them on a RAM disk.  An advantage over RAM disk,
    though, is that no one has to specify how much space should be used
    for this purpose.

Issue 15:

    It would be nice to have some flag in each file record segment to say
    if it is in use or not.  This would simplify chkdsk algorithms, although
    it would require the record to be written on deletion.

    Resolution:

    This will be done.  It is difficult to suppress the write of the file
    record on deletion anyway.
Initiall commit 2024-08-04 01:28:15 +02:00			`Here is a list of all NTFS design issues which have come up that effect the`
			`structure, along with current resolution (if there is one) of the issue. The`
			`resolution of these issues affects the "NTFS Design Specification 1.1" issued`
			`May 29, 1991. This list will be the final qualification to the spec until`
			`there is time to update it to a form which reflects the actual implementation.`
			`Of course the most precise definition of NTFS will always be in the header`
			`file which describes its structure: ntfs.h.`

			`These issues have been collected primarily from our own internal review and`
			`the feedback received from MarkZ. They are listed here in no particular`
			`order.`

			`Issue 1:`

			`Support for nontagged attributes is a pain in the low-level attribute`
			`routines, as well as in Format and ChkDsk. They are of very little`
			`value to the File System in terms of space or performance.`

			`Resolution:`

			`Nontagged attributes are being dropped for the purposes of NTFS's own`
			`use of attributes to implement the disk structure. Nontagged attributes`
			`will be supported with the general table support.`

			`Issue 2:`

			`The EXTERNAL_ATTRIBUTES attribute, should have a better name, and its`
			`definition should be changed to simplify various NTFS algorithms.`

			`Resolution:`

			`The attribute name has been changed to the ATTRIBUTE_LIST attribute.`
			`It is still only created when a file requires more than one file record`
			`segment. At that time it is created to list all attributes (including`
			`those in the base file record) by type code and (optional) name. it is`
			`ordered by Attribute Type Code and Attribute Name.`

			`One reason for this change is to facilitate the enumeration of all`
			`attributes for a file with multiple file record segments. This`
			`slightly different definition also gives NTFS's attribute placement`
			`policy more freedom to shuffle attributes around within the file`
			`record segments.`

			`Issue 3:`

			`Attribute ordering rules within the file, within each file record segment,`
			`and within the ATTRIBUTE_LIST were not completely specified.`

			`Resolution:`

			`The only rule for the ordering of attributes within each file, if there`
			`are multiple file record segments, is that STANDARD_INFORMATION must be`
			`in the base file record segment, and (at least the first part of) the`
			`ATTRIBUTE_LIST attribute must also be in the base file record segment.`
			`In general, the system should try to keep the other system-defined`
			`attributes with the lowest Attribute Type Codes present in the base file`
			`record segment when possible, for performance reasons.`

			`Within each file record segment, attributes will be ordered by type code,`
			`name, and then value. (If an attribute is not unique in type code and`
			`name, then it must be indexed and the value must be referenced.)`

			`The entries of the ATTRIBUTE_LIST will be ordered by attribute code and`
			`name.`

			`Reliance on these ordering rules may be used to speed up attribute lookup`
			`algorithms.`

			`Issue 4:`

			`NTFS is NOT secure on removeable media without data encryption.`

			`Resolution:`

			`Functionality for the encryption of communications and physical media`
			`is already planned for Product 2 of NT, at which time we will decide`
			`what the best mechanism will be for integrating this support with`
			`removeable NTFS volumes. We must insure now that this can be implemented`
			`in a upward-compatible manner.`

			`Issue 5:`

			`It would be very desirable for WINX to have the ability to uniquely`
			`identify and open files by a small number.`

			`Resolution:`

			`Logically the ability to use this functionality must be controlled by`
			`some privilege, as it is expensive and nearly impossible to come up with a`
			`consistent strategy on how to do correct path traversal checking, in a`
			`system such as NTFS which supports multiple directory links to a single`
			`file. Once the requirement for a special privilege is accepted, it is`
			`relatively easy for NTFS to support an API which would allow files to`
			`be opened by their (64-bit) File Reference number. The File Reference`
			`is perfect for this purpose, as it includes a 16-bit cyclically-reused`
			`sequence number to detect the attempt to use a stale File Reference.`
			`I.e., the original file with the same 48-bit Base File Record address has`
			`been deleted, and a new file has been created at the same address.)`

			`THIS REQUIRES A NEW NT I/O API.`

			`Issue 6:`

			`Enumeration of files in a directory in NT could be very slow, since`
			`to get more than just a file's name requires reading (at least) the`
			`base file record for the file.`

			`Resolution:`

			`The initial NT-based implementation of NTFS will come up with a`
			`strategy for clustering file record segments together in the MFT for`
			`files created in the same directory. Current thinking is that this`
			`will be done without change to the NTFS structure definition. So,`
			`for example, the first 128 files in a directory might be contiguous in`
			`the MFT, and then the second 128 will also be contiguous, etc. This`
			`will allow the implementation to prefetch files up to 128 file record`
			`segments at a time with a large spiral read, then expect cache hits during`
			`the enumeration.`

			`Secondly, at some point the implementation will cache enumeration`
			`information, to make subsequent enumeration of the same directory`
			`extremely fast.`

			`Issue 7:`

			`Is it an unnecessary complexity to NTFS to support multiple collating`
			`rules, as opposed to a simple byte-comparison collation? Note that`
			`frequently the caller collates himself anyway.`

			`Resolution:`

			`This is not resolved yet pending further discussion.`

			`The current reason NTFS plans to support multiple collating rules,`
			`is that collating in the caller can have bad performance and response`
			`characteristics in large directories. For example, consider a Windows`
			`App which requests the enumeration of a directory with 200 files (possibly`
			`over the network to a heavily loaded server), and it is going to`
			`display this enumeration in a List box with 10 or 20 lines. If it`
			`does not have to collate the enumeration, it can start displaying`
			`as soon as it receives part of the enumeration. Otherwise it has`
			`to wait to get the entire enumeration before it can collate and display`
			`anything.`

			`Issue 8:`

			`Should there be a bit in STANDARD_INFORMATION to indicate whether a`
			`file record has an INDEX attribute or not?`

			`Resolution:`

			`There is no plan to do this, unless we find additional reasons`
			`to do so that we are missing. Currently we see how this bit could`
			`speed the rejection of illegal path specifications, but it would`
			`not speed the acceptance of correct ones. Note that from the structure`
			`of NTFS, it is legal for a file to have both an INDEX attribute and,`
			`for example, a DATA attribute.`

			`Issue 9:`

			`The algorithms and consistency rules surrounding the 8.3 indices need to`
			`be clarified.`

			`Resolution:`

			`This will be done by 7/31.`

			`Issue 10:`

			`Why not eliminate the VERSION attribute and move it to`
			`STANDARD_INFORMATION?`

			`Resolution:`

			`We will do this, and then define an additional file attribute`
			`and/or field which controls whether or not versioning is enabled and`
			`possibly how many versions are allowed for a file.`

			`Issue 11:`

			`There should be a range of system-defined attribute codes which are`
			`not allowed to be duplicated, as this will speed up some of the`
			`lookup algorithms.`

			`Resolution:`

			`This will be done.`

			`Issue 12:`

			`Is duplication of the log file the correct way to add redundancy to`
			`NTFS to allow mounting in the event of read errors.`

			`Resolution:`

			`Upon further analysis, it was determined that the needed redundancy`
			`was incorrectly placed. It is more important to duplicate the first`
			`few entries of the MFT, than to duplicate the start of the log file.`
			`This change will be made.`

			`Issue 13:`

			`The spec describes how access to individual attribute types may be`
			`controlled by special ACEs, which is incompatible with the current`
			`NT APIs and our security strategy.`

			`Resolution:`

			`This will be fixed. Access to user-defined attributes will be controlled`
			`by the READ_ATTRIBUTES and WRITE_ATTRIBUTES access rights.`

			`Issue 14:`

			`A file attribute should be added which supports more efficient handling`
			`of temporary files.`

			`Resolution:`

			`An attribute will be added for files, and possibly directories, which`
			`will enable NTFS to communicate "temporary file" handling to the Cache`
			`Manager. Temporary files will never be set dirty in the Cache Manager`
			`or written to disk by the Lazy Writer, although the File Record will`
			`be correctly updated to keep the volume consistant. If a temporary file`
			`is deleted, then all writes to its data are eliminated. If MM discovers`
			`that memory is getting tight, it may choose to flush data to temporary`
			`files, so that it can free the pages. In this case the`
			`correct data for the file will eventually be faulted back in.`

			`This makes the performance of I/O to temporary files approach the`
			`performance of putting them on a RAM disk. An advantage over RAM disk,`
			`though, is that no one has to specify how much space should be used`
			`for this purpose.`

			`Issue 15:`

			`It would be nice to have some flag in each file record segment to say`
			`if it is in use or not. This would simplify chkdsk algorithms, although`
			`it would require the record to be written on deletion.`

			`Resolution:`

			`This will be done. It is difficult to suppress the write of the file`
			`record on deletion anyway.`