From The Phantom on Mon, 01 May 2000
Hello,
I'm wondering if you can answer a few questions on the UNIX rm command. I need a response before May 3rd if possible. Your assistance on this matter is greatly appreciated. Thank you for your time and service. Here are the questions:
Hmm. Wouldn't want this assignment to be late for the prof, heh?
Well, at least you had the brights to use a hotmail account rather than sending this from your flunkme@someuniv.edu address.
The UNIX rm command lowers the link count of an inode. When the link count goes to zero the inode is made available to the system and cleared of extraneous information.
The 'rm' command is basically a parser and wrapper around the unlink() system call.
BTW: This definition is an oversimplification. When the link count is less than 1 AND THERE ARE NO OPEN FILE DESCRIPTORS ON THAT FILE then the system does some sort of maintenance on the inode and any data blocks that were assigned to it.
Exactly what the filesystem does depends on what type of fs it is, and on how it was implemented for that version of that flavor of UNIX.
Usually the inode is marked as "available" in some way --- so that it can be re-used for new files. Usually the data blocks are added to a free list, so that they can be allocated to other files.
(It is possible for some implementations to mark and reserve these to allow for some sort of "undelete" process --- and it would certainly be possible to have "purge" and "salvage" features for some versions of UNIX).
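The "no open file descriptors" caveat above is easy to see for yourself. Here is a minimal sketch in Python, assuming a POSIX-like system with a local filesystem (the function name and the file name are made up for illustration). It unlinks a file while a descriptor is still open on it, then reads the data back through that descriptor.

```python
import os
import tempfile


def read_after_unlink(payload: bytes):
    """Unlink a file that is still open, then read it through the open descriptor."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "victim.txt")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(payload)

        fd = os.open(path, os.O_RDONLY)
        try:
            os.unlink(path)                    # remove the only name for the inode
            name_gone = not os.path.exists(path)
            nlink = os.fstat(fd).st_nlink      # typically 0 on local filesystems
            data = os.read(fd, len(payload))   # the data blocks are still there
        finally:
            os.close(fd)                       # only now can the blocks be freed
    return name_gone, nlink, data


if __name__ == "__main__":
    print(read_after_unlink(b"not gone yet\n"))
```

The name disappears at once, but the inode and its blocks hang around until the last descriptor is closed. Network filesystems such as NFS may behave differently, so treat the reported link count as filesystem dependent.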
1) Explain link count?
The link count is one of the elements (fields) of the inode structure. An inode is a data structure that is used to manage most of the metadata for a file on a UNIX-like filesystem.
On UNIX filesystems a directory entry is (usually) a link to an inode. (On some forms of UNIX, on some types of filesystems, there may be exceptions to this. Some filesystems can store symbolic link data directly in their directory structures without dereferencing that through an inode; some of them can even store the contents of small files there. However --- in most cases the directory entry is a link to an inode.)
This allows one to have multiple links to a file. In other words you can have many different names for a file --- and you can have identical names in different directories.
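To watch the link count move, here is a small Python sketch, assuming a filesystem that supports hard links (the function and file names are hypothetical). It gives one inode a second name, then removes the first name and checks that the data is still reachable through the second.

```python
import os
import tempfile


def link_count_demo():
    """Give one inode two names, then remove the first name."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first.name")
        second = os.path.join(d, "second.name")
        with open(first, "w") as f:
            f.write("same inode\n")

        counts = [os.stat(first).st_nlink]        # one name so far
        os.link(first, second)                    # add a second hard link
        counts.append(os.stat(first).st_nlink)
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino

        os.unlink(first)                          # what 'rm first.name' does
        counts.append(os.stat(second).st_nlink)
        with open(second) as f:
            survived = f.read()                   # data still reachable
    return counts, same_inode, survived


if __name__ == "__main__":
    print(link_count_demo())
```

On an ordinary local filesystem I would expect the counts to go 1, 2, 1, with both names reporting the same inode number.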
It turns out that most filesystems use this feature extensively to support the directory structure. Directories are just inodes that are mostly just like files. Somewhere you have a parent directory. It contains a link to you. Each of your subdirectories contains a ".." link to its parent (you). Thus each directory must have a link count that is equal to its number of subdirectories plus two (one for . and another for ../somelink.to.me).
(Note: On most modern forms of UNIX there is a prohibition against creating additional named hard links to directories -- this is apparently enforced in order to make things easier for fsck).
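The "subdirectories plus two" rule can be checked with a short Python sketch. The function name is hypothetical, and the result is filesystem dependent: traditional UNIX-style filesystems follow the rule, but some others (btrfs, and some overlay setups) report a fixed link count for directories, so the sketch returns both the observed and the expected value rather than insisting that they match.

```python
import os
import tempfile


def directory_link_count(n_subdirs: int):
    """Return (observed st_nlink, n_subdirs + 2) for a freshly built directory."""
    with tempfile.TemporaryDirectory() as top:
        parent = os.path.join(top, "parent")
        os.mkdir(parent)
        for i in range(n_subdirs):
            # each subdirectory's ".." entry is one more link to 'parent'
            os.mkdir(os.path.join(parent, f"sub{i}"))
        observed = os.stat(parent).st_nlink
    expected = n_subdirs + 2  # its own "." plus its name in the directory above
    return observed, expected


if __name__ == "__main__":
    print(directory_link_count(3))
```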
2) Explain why the name of the command is called remove (rm)?
It seems pretty self-explanatory to me. You're removing a link. If that link is the last one to that file, then you've removed the file as well.
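To make the "parser and wrapper around unlink()" idea concrete, here is a toy version in Python. This is only a sketch, not how any real rm is written: the only option it understands is -f, and it makes no attempt at -r, -i, or the safety checks a real rm performs. The function names are my own.

```python
import argparse
import os
import sys


def rm(paths, force=False):
    """Unlink each path; return 0 on success, 1 if anything failed."""
    status = 0
    for path in paths:
        try:
            os.unlink(path)  # the actual work: drop one link to the inode
        except FileNotFoundError:
            if not force:    # -f: stay quiet about names that are already gone
                print(f"rm: cannot remove '{path}': no such file", file=sys.stderr)
                status = 1
        except OSError as err:
            print(f"rm: cannot remove '{path}': {err.strerror}", file=sys.stderr)
            status = 1
    return status


def main(argv=None):
    """Parse the command line, then hand the names to rm()."""
    parser = argparse.ArgumentParser(prog="rm", description="toy rm: unlink names")
    parser.add_argument("-f", "--force", action="store_true",
                        help="ignore nonexistent files")
    parser.add_argument("paths", nargs="+")
    args = parser.parse_args(argv)
    return rm(args.paths, force=args.force)
```

Calling main(["-f", "old.log"]) behaves roughly like 'rm -f old.log'. Everything interesting happens in the one os.unlink() call; the rest is argument handling and error reporting.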
3) What happens to the blocks referenced by the inode when the link count goes to zero?
Normally the data blocks would be returned to the free list. The free list is another data structure on UNIX filesystems. I think it is usually implemented as a bitmap.
Note: On some forms of UNIX the filesystem driver might implement a secure delete feature, which might involve arbitrarily complex sequences of overwriting the data with NULs, with random data, etc. There is a special feature in Linux which is reserved for this use -- but which is not yet implemented. You might find similar features in your favorite form of UNIX.
4) What data is present in these blocks after the inode has been cleared?
That depends on the filesystem implementation. The blocks would usually still contain whatever data was lying around in them at the time that they were freed.
If you're thinking: "Ooooh! That means I can peek at other people's data after they remove it!" Think again. Any decent UNIX implementation will ensure that those data blocks are cleared (zeroed out) as they are re-allocated.
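Here is a toy model of that behavior in Python. It is purely illustrative and does not correspond to any real filesystem: the class, its methods, and the block size are all invented. Freeing a block just flips its entry in the free map and leaves the old bytes in place; the block is zeroed only when it is handed out again.

```python
class ToyBlockStore:
    """Toy block allocator: a free map plus a list of fixed-size blocks."""

    def __init__(self, n_blocks, block_size=16):
        self.block_size = block_size
        self.blocks = [bytearray(block_size) for _ in range(n_blocks)]
        self.free = [True] * n_blocks  # stands in for the free bitmap

    def allocate(self):
        """Hand out the first free block, zeroed so old data cannot leak."""
        for i, is_free in enumerate(self.free):
            if is_free:
                self.free[i] = False
                self.blocks[i][:] = bytes(self.block_size)  # zero on re-allocation
                return i
        raise OSError("no free blocks")

    def write(self, i, data):
        if self.free[i]:
            raise ValueError("block is not allocated")
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        self.blocks[i][:len(data)] = data

    def release(self, i):
        """Mark the block free; the old contents are left untouched."""
        self.free[i] = True

    def raw(self, i):
        """Peek at the raw bytes, as a disk editor might."""
        return bytes(self.blocks[i])
```

After write() and release(), raw() still shows the old bytes, which is why a disk editor can sometimes recover deleted data. The next allocate() of that block returns it zeroed, which is why an ordinary user cannot.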
5) How does the removal of an inode which is a symbolic link change the answer to 3) and 4)?
Symbolic links may be implemented by storing the "data" in the directory entry. In which case the unlink() simply zeros out that directory entry in whatever way is appropriate to the filesystem on which it is found.
Symbolic links may also be implemented by reference to an inode --- and by storing the target filename in the data blocks that are assigned to that inode. In which case they are treated just like any other file.
Note that removing a symbolic link with 'rm' should NEVER affect the target file's links or inode. A symbolic link is completely independent of the hard link to which it points and of the inode to which that link refers.
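A quick Python sketch confirms the point, assuming a system that lets you create symbolic links (the function and file names are hypothetical). It removes a symlink and checks that the target's link count and contents are unchanged.

```python
import os
import tempfile


def remove_symlink_demo():
    """Remove a symlink and report what happened to its target."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        alias = os.path.join(d, "alias")
        with open(target, "w") as f:
            f.write("still here\n")

        os.symlink(target, alias)
        was_symlink = os.path.islink(alias)
        before = os.stat(target).st_nlink   # a symlink adds no hard link

        os.unlink(alias)                    # removes the symlink itself
        after = os.stat(target).st_nlink
        alias_gone = not os.path.lexists(alias)
        with open(target) as f:
            contents = f.read()
    return was_symlink, before, after, alias_gone, contents


if __name__ == "__main__":
    print(remove_symlink_demo())
```

Note that os.unlink() acts on the symlink itself rather than following it, which matches what unlink() does at the system call level.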
Thank you for your help.
As I'm sure you noticed this sounds to me like a "do my homework" message. However, I've decided to answer it since it is likely to be of interest to many of my readers.
You may also have noticed that I was a bit vague on a number of points. Keep in mind that quite a lot of this depends on which version of UNIX you're using, which filesystem you're talking about (Linux, for example, supports over a dozen different types of local filesystem), and how you've configured it.
Of course you could learn quite a bit more about it by reading the sources to a Linux or FreeBSD kernel.