Unix File Types (WIP)

While working on a custom ls, I needed to print file types symbolically. I did not know exactly how file types are encoded in st_mode, so I read the inode(7) man page and found the following table:

S_IFMT     0170000   bit mask for the file type bit field

S_IFSOCK   0140000   socket
S_IFLNK    0120000   symbolic link
S_IFREG    0100000   regular file
S_IFBLK    0060000   block device
S_IFDIR    0040000   directory
S_IFCHR    0020000   character device
S_IFIFO    0010000   FIFO

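With the usual letter assignment (the letters find -type uses, so f for a regular file rather than the - that ls -l prints), we have the following lookup table, indexed by the four type bits: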

"?pc?d?b?f?l?s???"[st.st_mode >> 12 & 15]

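Then I noticed that bit 12 of st_mode is unnecessary for telling these seven types apart: FIFO is the only type that sets it, and every other type has a distinct nonzero value in bits 13 to 15. Assuming no further extensions to the list of file types, I could have written (for no real benefit, of course):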

"pcdbfls?"[st.st_mode >> 13 & 7]

But why are there so many gaps? What were the gaps originally for? And why is FIFO the only file type with bit 12 set? I started by digging through The Unix Tree.

I skipped PDP-7 Unix because I do not understand its assembly language. In V1, V2, and V3 Unix, file flags are stored in a 16-bit word. The following bits are defined:

100000  used (always on)
040000  directory
020000  file has been modified (always on)
010000  large file
000040  set user ID
000020  executable
000010  read, owner
000004  write, owner
000002  read, non-owner
000001  write, non-owner

So there were only files (bit 040000 clear) and directories (bit 040000 set).
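
To make the layout concrete, here is a sketch that decodes such a flag word into a short string. The function name v1_describe and the output format are invented for this illustration and are not what the original ls printed; only the bit values come from the list above.

```c
/* Hypothetical decoder for the V1-V3 flag word.
 * Writes 7 characters plus a terminating NUL into out:
 * type, set-user-ID, executable, owner r/w, non-owner r/w. */
void v1_describe(unsigned flags, char out[8])
{
    out[0] = (flags & 040000) ? 'd' : 'f';  /* directory or file */
    out[1] = (flags & 000040) ? 's' : '-';  /* set user ID */
    out[2] = (flags & 000020) ? 'x' : '-';  /* executable */
    out[3] = (flags & 000010) ? 'r' : '-';  /* read, owner */
    out[4] = (flags & 000004) ? 'w' : '-';  /* write, owner */
    out[5] = (flags & 000002) ? 'r' : '-';  /* read, non-owner */
    out[6] = (flags & 000001) ? 'w' : '-';  /* write, non-owner */
    out[7] = '\0';
}
```

For example, the word 0160016 (used, directory, modified, owner read and write, non-owner read) decodes to "d--rwr-".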

Things changed in V4 Unix, when permissions were elaborated and man pages adopted C syntax:

100000  i-node is allocated
060000  2-bit file type:
    000000  plain file
    040000  directory
    020000  character-type special file
    060000  block-type special file.
010000  large file
004000  set user-ID on execution
002000  set group-ID on execution
000400  read (owner)
000200  write (owner)
000100  execute (owner)
000070  read, write, execute (group)
000007  read, write, execute (others)

There are now 4 file types: "plain" files, directories, character-type special files, and block-type special files. V5 and V6 Unix added the sticky bit, but other flags were unchanged. It is clear from the C header that the "large file" bit is not part of the file type.
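
In the style of the lookup tables earlier, the V4 to V6 type field could be decoded as in the sketch below. The helper name v4_type_letter is hypothetical, and the letter f for plain files is carried over from the table above rather than taken from any historical source.

```c
/* Hypothetical helper: decode the 2-bit file type of V4-V6 Unix.
 * The type lives in bits 13-14 (mask 060000):
 *   0 plain file, 1 character special, 2 directory, 3 block special.
 * The "allocated" (100000) and "large file" (010000) bits are ignored. */
char v4_type_letter(unsigned flags)
{
    return "fcdb"[(flags >> 13) & 3];
}
```

Because the mask skips bit 12, a large plain file (0110644) and a small one (0100644) both decode to 'f', which matches the observation that the "large file" bit is not part of the type.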

V7 Unix made the "used" and "large file" bits part of the file type, for a glorious 4-bit file type field. The top bit was assigned to regular files, and the bottom bit marked "multiplexed" variants of character and block special files.

#define S_IFMT  0170000 /* type of file */
#define S_IFDIR 0040000 /* directory */
#define S_IFCHR 0020000 /* character special */
#define S_IFBLK 0060000 /* block special */
#define S_IFREG 0100000 /* regular */
#define S_IFMPC 0030000 /* multiplexed char special */
#define S_IFMPB 0070000 /* multiplexed block special */
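
A sketch of how the 4-bit field could be classified with these values follows. The V7_ prefix and the function name v7_type_name are made up here so the constants do not clash with the host's own <sys/stat.h>; the numbers and descriptions are the ones from the header above.

```c
/* V7 values from the header above, renamed with a hypothetical V7_
 * prefix to avoid colliding with the host's S_IF* macros. */
#define V7_IFMT  0170000 /* type of file */
#define V7_IFDIR 0040000 /* directory */
#define V7_IFCHR 0020000 /* character special */
#define V7_IFBLK 0060000 /* block special */
#define V7_IFREG 0100000 /* regular */
#define V7_IFMPC 0030000 /* multiplexed char special */
#define V7_IFMPB 0070000 /* multiplexed block special */

/* Hypothetical helper: name the V7 file type encoded in mode. */
const char *v7_type_name(unsigned mode)
{
    switch (mode & V7_IFMT) {
    case V7_IFDIR: return "directory";
    case V7_IFCHR: return "character special";
    case V7_IFBLK: return "block special";
    case V7_IFREG: return "regular";
    case V7_IFMPC: return "multiplexed char special";
    case V7_IFMPB: return "multiplexed block special";
    default:       return "unknown";
    }
}
```

A switch on the masked value is used instead of a character table because the multiplexed types have no letter in the table earlier in this post.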

(TO BE CONTINUED)