Section 1 Solutions

Sections Wed Jan 18 to Sat Jan 21

Based on a section handout by Jerry Cain and compiled by Parthiv Krishna, with modifications by Nick Troccoli.

Solutions

1) Unix v6 Filesystem Overview

  • inode: a structure that stores information for a single file / directory, including its size, and up to 8 block numbers.
  • inode table: the contiguous space starting at sector 2 on disk where all inodes are stored, in order by inode number ("inumber"). Up to 16 inodes fit in each sector.
  • direct block numbers: block numbers for blocks that store payload data, like file data or directory entries.
  • singly-indirect blocks: blocks that store direct block numbers; in a large file or directory, its first seven block numbers stored in its inode are block numbers of singly-indirect blocks.
  • doubly-indirect blocks: blocks that store singly-indirect block numbers; in a large file or directory, its eighth block number stored in its inode is the block number of a doubly-indirect block.
  • "small" vs. "large" file scheme: the Unix v6 filesystem has two possible "schemes" for how its 8 blocks numbers in i_addr are used. For "small" files/directories, i_addr stores up to 8 direct block numbers. For "large" files/directories, i_addr's up to first seven entries store singly-indirect block numbers, and the eighth entry (if needed) stores a doubly-indirect block number.
  • directory: a directory is a "file container"; it stores other files or directories. It also has an inode associated with it, but its payload data is directory entries.
  • directory entry: a 16-byte structure that stores information for a single file / directory within a directory. Specifically, it stores its name (up to 14 bytes) and its inode number (2 bytes).

2) Accessing Inodes

  • Inode 256 is at block 2 + (256 - 1) / 16 = 17. It is at index (256 - 1) % 16 = 15.
  • Inode 345 is at block 2 + (345 - 1) / 16 = 23. It is at index (345 - 1) % 16 = 8.

Explanation: For the first calculation (which block is the inode in), the reason we do -1 is because for instance inode 16 should be in block 2, inode 32 in block 3, and so on. If we omitted the -1, the answer would be right except inode 16 would be incorrectly said to be in block 3, inode 32 incorrectly in block 4, etc. For the second calculation (which index is the inode within a block), we need to 0-index the inode numbers, which are 1-indexed. Because inode 16 should be at index 15, for instance, not index 16.

3) Accessing Payload Data

  • fileBlockIndex = 2 means we should look at i_addr[0], since the index-2 payload block number would be in the first singly-indirect block.
  • fileBlockIndex = 542 means we should look at i_addr[2], since 256 block numbers are in the first singly indirect block, another 256 are in the second, and this one would be in the third.

4) Direct, Singly Indirect, and Doubly Indirect Block Numbers

  • What's the maximum file size?
    • Maximum file size is (3 * 512) + (2 * 128 * 512) + (128 * 128 * 512), or 8,521,216 bytes. (The first term is the direct blocks, the second term is the singly-indirect blocks, and the third term is the doubly-indirect block)
  • How large does a file need to be before the relevant inode requires the first singly indirect block number be used?
    • 3 * 512 + 1, or 1,537 bytes. (The +1 is the first byte that cannot fit)
  • How large does a file need to be before the relevant inode requires the first doubly indirect block number be used?
    • 3 * 512 + 2 * 128 * 512 + 1, or 132,609 bytes. (The +1 is the first byte that cannot fit)
  • Draw as detailed an inode as you can if it's to represent a regular file that's 2049 bytes in size.

    • (Diagram created by Parthiv Krishna) The first three block numbers would identify three, saturated payload blocks, and those three payload blocks would store the first 1,536 bytes. The fourth block number—a singly, indirect one—would lead to a block of 512 bytes, although only the first eight bytes will be used. The first four bytes would identify the fourth payload block—itself saturated—to store 512 bytes of payload. The next four bytes would identify a fifth payload block where only the first byte is used.
  • Unlike the Unix V6 filesystem we learned about in class, the ext2/3/4 family of filesystems (commonly used on Linux) use variable-length directory entries. What is the benefit to designing directory entries this way? What are some drawbacks? (Consider what happens when deleting files.)

    • Benefit: you can store long filenames without wasting lots of space if some names are long but most are short
    • Drawback: Iterating over directory entries is more complicated. It’s no longer a simple array of directory entries, and we can no longer have random access to dirents; instead, we need to read each entry one by one, since each entry tells you where the next one is (via the name_length)
    • Drawback: Deleting a directory entry leaves a "hole” of a particular size based on the name length, and if you want to create a new directory entry with a longer name, you won’t be able to put it in that "hole”. There is more complexity in choosing where to put directory entries, since you need to find a suitable spot for the directory entries.

Checkoff Questions

  1. [Problem 2] Which sector stores inode 345? At which index in that sector's inodes is inode 345?

    • Inode 345 is at block 2 + (345 - 1) / 16 = 23. It is at index (345 - 1) % 16 = 8.
  2. [Problem 3] What is one benefit of directories being "just files" in the eyes of the Unix v6 filesystem design (ie. also use inodes, small vs. large scheme, etc.)? (you'll reap this benefit on assign1!)

    • The main benefit is we can use the same logic and code to implement both files and directories; we don't need a separate implementation for much of it. Directories can leverage the same designs that files rely on! (e.g. code for inodes, singly-indirect blocks, etc.) On assign1, this means you can imagine directories are "just files" in a sense when you need to e.g. get their directory entries.
  3. [Problem 3] What is the smallest value of fileBlockIndex that would require accessing the second singly-indirect block in an inode?

    • There are 256 block numbers per singly-indirect block, meaning that fileBlockIndex = 256 is the first one stored in the second singly-indirect block.