Junction Points and Symbolic Links

NTFS defines the concept of a “reparse point”, an optional attribute of files and directories which requests some preprocessing before the file or directory is accessed. For instance, reparse points can be used to redirect access to files which have been moved to long-term storage, so that some application can retrieve them and make them directly accessible.

A junction point is a specific reparse point which redirects access from a directory to another directory, located on the same volume or on another one. There are two sorts of junction points: volume junctions, which redirect directories to a whole volume (for instance to escape the limit of 26 drive letters in Windows), and directory junctions, which redirect directories to another directory. In both situations the redirection target is defined by an absolute path.

The similar concept of symbolic link has also been available since Windows Vista. Symbolic links can redirect to a file or a directory defined by an absolute or a relative path. When defined on a remote file system, they are processed on the local system, whereas directory junctions are processed on the file server, which makes a difference when the target is not accessible by the file server. The symbolic links in Vista are different from the Interix symbolic links created by ntfs-3g, which are also interoperable with Windows XP and Vista.

Junction points were available in Windows 2000 and Windows XP, but they were not widely used until Windows Vista used directory junctions to redirect access to legacy directories (such as \Documents and Settings), in order to avoid breaking older software accessing directories for which Vista defines a new location. The symbolic links are new to Vista and used in paths (such as \Users\All Users) which were not used before Vista.

We will hereafter describe how junction points and symbolic links are made to appear in Linux as symbolic links. Junction points and symbolic links created by Windows can thus be dereferenced, hard linked, renamed and deleted, but new ones cannot be created.

Finally, we will examine the use of reparse points to trigger upper layer features which ntfs-3g implements as plugins. Two of them are currently available : one for reading system compressed files, another for reading deduplicated files.

Directory Junctions

A directory junction, as created by Windows, always defines the full (case-insensitive) path to the target, including a drive letter. Examples of target definitions are :

Notes :

  • Windows does not accept the character '/' as a directory separator in the target definition,
  • when creating a junction, Windows translates a relative target definition to a full target.
  • only empty directories can be made directory junctions by setting reparse data.
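To make the target definition more concrete, here is a minimal sketch of how the stored target could be extracted from the raw reparse data of a directory junction. It assumes the mount-point buffer layout published by Microsoft (a 32-bit tag, two 16-bit fields, four 16-bit name offsets and lengths, then a UTF-16LE path buffer) and the usual `\??\` prefix in front of the drive letter. The function name `junction_target` is hypothetical and this is not the ntfs-3g code.

```python
import struct

# Tag value published by Microsoft for junctions (mount points)
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003


def junction_target(reparse_data):
    """Return the target path stored in a mount-point reparse buffer."""
    if len(reparse_data) < 16:
        raise ValueError("reparse data too short")
    # Header: tag (4 bytes), data length (2 bytes), reserved (2 bytes)
    tag, _data_len, _reserved = struct.unpack_from("<IHH", reparse_data, 0)
    if tag != IO_REPARSE_TAG_MOUNT_POINT:
        raise ValueError("not a junction: tag 0x%08X" % tag)
    # Offsets and lengths (in bytes) of the substitute and print names
    subst_off, subst_len, _print_off, _print_len = struct.unpack_from(
        "<HHHH", reparse_data, 8)
    start = 16 + subst_off  # offsets are relative to the path buffer
    name = reparse_data[start:start + subst_len].decode("utf-16-le")
    # Assumption: Windows stores the target with an NT prefix, \??\C:\Users
    if name.startswith("\\??\\"):
        name = name[4:]
    return name
```

The returned string still uses '\' as a separator and a drive letter, so it has to go through the translations described below before it can be shown as a Linux symbolic link.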

In order to translate a directory junction to a Linux symbolic link, the following points have to be addressed :

  • translate the drive letter to a mount point
  • translate the case-insensitive path to a case-sensitive one
  • and, as these are not always possible, detect and signal problems

Translating the drive letter

The drive letter is a physical address loosely related to the semantics of the target. A pluggable device (such as a USB key) gets different drive letters on different computers and, on a specific computer, different devices get the same drive letter if they are plugged in turn into the same slot.

Translating drive letters to Linux paths can probably not be done automatically, but there are two possible ways to deal with them : recognizing directory junctions local to a device, which can be translated to relative paths, and relying on some user defined mapping of drive letters to mount points.

Checking whether the drive letter designates the current volume can be approximated by making sure the target path designates an existing directory in the volume. After validity checks C:\Users can be converted to ./Users and C:\Users\Tom\AppData\Local converted to ../AppData/Local. This is subject to errors as a similar (case-insensitive) path meant for another volume may be found on the current volume. This would be the case for any target defined as the root of a volume, as there would be no directory to be checked, and it is wise to always reject such target guesses.

Another option is to let the user define what a drive letter should be mapped to in Linux. Such definitions should be located in the .NTFS-3G directory of the current file system, as symbolic links to the matching mount point. C:\Users will be converted to ./.NTFS-3G/C:/Users with C: being defined as a symbolic link to some mount point.

Both are implemented in ntfs-3g, according to the following rules :

  • if the drive letter is not defined in /.NTFS-3G, an attempt to interpret the junction point target as a path to an existing directory on the same volume is first made. If such directory is found, the path is converted to a relative symbolic link whose names are translated to match the directory chain exactly.
  • if the drive letter is defined in /.NTFS-3G or the attempt to find a local directory fails (even if there is no drive letter defined in /.NTFS-3G), the junction is translated to a relative symbolic link referring to the possible definition in /.NTFS-3G. The drive letter should be defined in upper case followed by a colon, and the path should match the characters used in the junction point definition.

Note that .NTFS-3G is a hidden directory located at the root of the file system containing the junction point. It may have to be replicated if there are several NTFS file systems with junction points in them.
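As an illustration of a user-defined mapping, the following sketch creates the .NTFS-3G directory at the root of a mounted volume and defines a drive letter in it as a symbolic link. The helper name `define_drive` and the paths in the usage note are hypothetical; the same result can be obtained with mkdir and ln -s.

```python
import os


def define_drive(volume_root, drive_letter, mount_point):
    """Create <volume_root>/.NTFS-3G/<LETTER>: as a symlink to mount_point."""
    if len(drive_letter) != 1 or not ("A" <= drive_letter.upper() <= "Z"):
        raise ValueError("expected a single drive letter")
    config_dir = os.path.join(volume_root, ".NTFS-3G")
    os.makedirs(config_dir, exist_ok=True)
    # The definition uses an upper-case letter followed by a colon
    link = os.path.join(config_dir, drive_letter.upper() + ":")
    if os.path.islink(link):
        os.remove(link)  # replace a previous definition
    os.symlink(mount_point, link)
    return link
```

For example, define_drive("/Vista", "d", "/media/data") would define /Vista/.NTFS-3G/D: as a symbolic link to /media/data, assuming the volume is mounted read-write on /Vista.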

Translating the case-insensitive path

The target is defined in Windows as a case-insensitive path, with characters which may have a different “casing” from those stored in directory levels, but an exact case-sensitive match is required for a symbolic link to be valid on Linux.

This obviously leads to examining the path and adjusting the names to those defined in the directory levels. However, walking along a case-insensitive path may lead to ambiguities. For instance both c:\Users and c:\users may be present and designate different directories. Trying to solve such an ambiguity is probably useless, as the target is supposed to have been created by Windows according to its own rules, and Windows would not be able to make a better guess when faced with the same ambiguity.

Because of the possible ambiguities, the translation of a case insensitive path is only done when searching the target on the current volume. Only the drive letter is translated (and made upper case) when redirecting to a definition in .NTFS-3G, and user definitions should always match the target.
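The walk along a case-insensitive path can be sketched as follows. This is only an illustration: the volume is represented by nested dictionaries (a hypothetical stand-in for the real directory lookups), and Python's lower() is used as an approximation of the case folding NTFS applies. Any missing or ambiguous level makes the translation fail.

```python
def resolve_case(tree, parts):
    """Map case-insensitive names to the exact names stored in each level.

    tree:  nested dicts standing in for the directories of the volume
    parts: target components, for instance ['USERS', 'tom']
    Returns the exact names, or None if a level is missing or ambiguous.
    """
    exact = []
    node = tree
    for wanted in parts:
        if not isinstance(node, dict):
            return None  # the previous component was not a directory
        matches = [n for n in node if n.lower() == wanted.lower()]
        if len(matches) != 1:
            return None  # not found, or both Users and users are present
        exact.append(matches[0])
        node = node[matches[0]]
    return exact
```

With a tree holding Users/Tom, the parts ['USERS', 'tom'] are translated to ['Users', 'Tom'], whereas a level holding both Data and data is rejected rather than guessed.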


Assuming the C: volume is mounted on /Vista and /Vista/.NTFS-3G/D:/Packages is defined as a symbolic link to /shared/packages :

  • if /Vista/Documents and Settings is a directory junction to C:\USERS and /Vista/Users exists on the same volume, it will be seen as a symbolic link to ./Users
  • if /Vista/global is defined as a directory junction to c:\Shared and there is no directory /Vista/shared to be found whatever the letter case, it will be seen as a symbolic link to ./.NTFS-3G/C:/Shared
  • if /Vista/Users/Tom/TomData is a directory junction to d:\shared\TomData, it will be seen as a symbolic link to ../../.NTFS-3G/D:/shared/TomData, even if there is no such directory.

Except in the first case, a second symbolic link has to be defined to get to the target directory.
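The rules above can be sketched in a few lines. This is a simplified illustration, not the ntfs-3g implementation: `translate_junction`, `mapped_drives` and the `find_local` callback (which is expected to return the exact-case names of an existing local directory, or None) are hypothetical names introduced for the example.

```python
def split_target(target):
    """Split 'c:\\Users\\Tom' into ('C', ['Users', 'Tom'])."""
    if len(target) < 2 or target[1] != ":" or not target[0].isalpha():
        raise ValueError("no drive letter in target: %r" % target)
    parts = [p for p in target[2:].split("\\") if p]
    return target[0].upper(), parts


def prefix_to_root(link_path):
    """Relative prefix leading from the junction's directory to the root."""
    depth = len([p for p in link_path.split("/") if p]) - 1
    return "../" * depth if depth > 0 else "./"


def translate_junction(link_path, target, mapped_drives, find_local):
    """Return the symlink target shown for a directory junction.

    link_path:     junction location relative to the volume root
    target:        Windows target, for instance 'C:\\USERS'
    mapped_drives: drive names defined in .NTFS-3G, for instance {'D:'}
    find_local:    hypothetical callback resolving names on this volume
    """
    drive, parts = split_target(target)
    prefix = prefix_to_root(link_path)
    # A target which is the root of a volume is never guessed as local
    if drive + ":" not in mapped_drives and parts:
        local = find_local(parts)
        if local is not None:
            return prefix + "/".join(local)
    # Otherwise refer to the (possibly missing) definition in .NTFS-3G;
    # only the drive letter is made upper case, the path is kept as is
    return prefix + "/".join([".NTFS-3G", drive + ":"] + parts)
```

With a callback which only knows the local directory Users, a junction in "Documents and Settings" to C:\USERS yields ./Users, and a junction in "global" to c:\Shared yields ./.NTFS-3G/C:/Shared, as in the first two cases above.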

Volume Junctions

A volume junction, as created by Windows, defines a GUID to designate a physical drive. For example a target definition for a volume junction would appear as :
\\?\Volume{cb71f9d2-945f-11dd-8eac-00188b73099c}\

As the GUID is related to the physical drive (or USB port), the relation to the semantics of the data is poorly established, much like a drive letter. The Volume junction itself can apparently not be defined on a pluggable file system, but the target can be, allowing the usage of the same path to mean different data when the media is changed.

The way to make a volume junction appear like a symbolic link is also to define the volumes as symbolic links in the predefined location .NTFS-3G of the volume in which the junction is defined.

For instance, a volume junction in C:\Users\Tom\Data defined as \\?\Volume{[ID]}\ meaning a USB key which mounts in /media/TomData in Linux, will be seen in /Vista/Users/Tom/Data as a symbolic link to ../../.NTFS-3G/Volume{[ID]} which is expected to be defined as a symbolic link to /media/TomData. Of course plugging in another USB key with a different label can only be done if the definition is adjusted.
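The translation of a volume junction can be sketched in the same way. The function name `translate_volume_junction` and the regular expression are assumptions made for the example; only the shape of the target and of the resulting link comes from the description above.

```python
import re

# Matches \\?\Volume{GUID}\ and captures the Volume{GUID} part
VOLUME_RE = re.compile(r"^\\\\\?\\(Volume\{[0-9A-Fa-f-]+\})\\?$")


def translate_volume_junction(link_path, target):
    """Return the symlink target shown for a volume junction, or None.

    link_path: junction location relative to the volume root
    target:    Windows target such as '\\\\?\\Volume{...}\\'
    """
    match = VOLUME_RE.match(target)
    if match is None:
        return None  # not a volume target
    depth = len([p for p in link_path.split("/") if p]) - 1
    prefix = "../" * depth if depth > 0 else "./"
    return prefix + ".NTFS-3G/" + match.group(1)
```

For a junction located in Users/Tom/Data, the result is ../../.NTFS-3G/Volume{...}, and the user still has to define that name in .NTFS-3G as a symbolic link to the actual mount point.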

Symbolic Links defined since Vista

A symbolic link, as created by Windows, is very similar to a directory junction, but unlike a directory junction it can point to a file or to a remote network file or directory. The target may be defined as a path relative to the symbolic link position, or as an absolute path in the current volume or another one. Also note that symbolic links to files are different from symbolic links to directories and the target must match the definition.

If the target is defined as an absolute path, it is processed like a directory junction :

  • if no drive letter is present in the target definition, an attempt is made to translate the path to a case-sensitive one in the current volume,
  • if a drive letter is present in the target definition and not defined in .NTFS-3G, an attempt is also made to recognize the path in the current volume,
  • if a drive letter is present in the target definition, and defined in .NTFS-3G, the path is interpreted as if it were relative to .NTFS-3G. The path is not translated or checked, only the drive letter is capitalized.

In the three situations a symbolic link relative to the current location is generated.

If the target is defined as a relative path, an attempt is made to translate the path to a case-sensitive one. The translation fails if it leads to a loop or leads out of the current volume. If successful, a new symbolic link with the translated path is generated.
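A minimal sketch of the relative case is shown below, assuming the volume is represented by nested dictionaries (files being stored as None) and that lower() approximates the NTFS case folding. It rejects a path which climbs above the root of the volume and any missing or ambiguous name; loop detection is left out of this example. All names are hypothetical.

```python
def lookup(tree, names):
    """Return the directory reached by exact names, or None."""
    node = tree
    for name in names:
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node if isinstance(node, dict) else None


def translate_relative(link_path, target, tree):
    """Translate a relative Windows target to a case-sensitive Linux one.

    link_path: symlink location relative to the volume root (exact case)
    target:    relative Windows path, for instance '..\\PUBLIC'
    tree:      nested dicts standing in for the volume directories
    """
    here = [p for p in link_path.split("/") if p][:-1]  # link's directory
    out = []
    for name in target.split("\\"):
        if name in ("", "."):
            continue
        if name == "..":
            if not here:
                return None  # would lead out of the current volume
            here.pop()
            out.append("..")
            continue
        node = lookup(tree, here)
        if node is None:
            return None  # the path goes through a missing directory or a file
        matches = [n for n in node if n.lower() == name.lower()]
        if len(matches) != 1:
            return None  # missing or ambiguous name
        here.append(matches[0])
        out.append(matches[0])
    return "/".join(out) if out else "."
```

A link located in Users/Tom and defined as ..\PUBLIC is translated to ../Public when the directory is stored as Public, while a target with more ".." levels than the link has parent directories is rejected.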

Other Types of Reparse Points

Reparse points may be used by Windows to force some special processing when a file is requested. When such processing is not supported by ntfs-3g, the file or directory which holds the reparse point is made to appear as a symlink to “unsupported reparse point”.

Since ntfs-3g-2016.2.22AR.1, plugins may be used to implement features defined by reparse points :

  • System compression
  • Deduplicated files

System compression is used by Windows 10 (only on powerful computers) to save space in the Windows system directory by compressing the executables and DLLs. The compression algorithms and the data layout used are different from the usual NTFS compression and they are unsuitable for compressing on the fly.

File deduplication is used by Windows Server 2012 to save storage space by sharing the space used by similar files, thus avoiding redundancy. The algorithms used and the data layout are also unsuitable for deduplicating on the fly.

The plugins for reading system compressed or deduplicated files are available on the download page. Creating or updating such files is not supported; only normal files can be created and updated.
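The selection of the processing from the reparse data can be illustrated by reading the tag stored in its first four bytes. The tag values below are the ones published by Microsoft, not taken from this page, and the returned labels are purely descriptive; this sketch does not reflect the ntfs-3g plugin interface.

```python
import struct

# Reparse tag values as published by Microsoft
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003  # junction points
IO_REPARSE_TAG_SYMLINK = 0xA000000C      # symbolic links since Vista
IO_REPARSE_TAG_DEDUP = 0x80000013        # deduplicated files
IO_REPARSE_TAG_WOF = 0x80000017          # system compressed files

# Descriptive labels only, standing in for the real processing
HANDLERS = {
    IO_REPARSE_TAG_MOUNT_POINT: "junction point",
    IO_REPARSE_TAG_SYMLINK: "symbolic link",
    IO_REPARSE_TAG_DEDUP: "deduplication plugin",
    IO_REPARSE_TAG_WOF: "system compression plugin",
}


def classify(reparse_data):
    """Tell which processing the reparse data asks for."""
    if len(reparse_data) < 4:
        raise ValueError("reparse data too short")
    (tag,) = struct.unpack_from("<I", reparse_data, 0)  # little-endian tag
    return HANDLERS.get(tag, "unsupported reparse point")
```

Any tag which is not in the table falls back to “unsupported reparse point”, which mirrors what is shown when no plugin is available for a file.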


Page is maintained by Jean-Pierre André