Why File Lock?

We said in the previous lectures that many processes run simultaneously on the machine. If a sequence of commands isn’t atomic, another process may overwrite the file we are working on between two of our commands, which can corrupt the data or crash the program.

A file lock lets a process claim a file (or part of it), which keeps other processes from changing the file while it is in use.


File Lock

How does it work?

  • When a lock is granted, the process has the right to read/write the file
  • When a lock is denied, the process either waits until the lock holder releases the lock (blocking request) or gets an error back immediately (nonblocking request)

System Calls for Locking

  • flock() - lock an entire file
  • fcntl() - lock arbitrary byte ranges in a file
  • lockf() - on many systems (e.g. Linux), just a wrapper around fcntl()

With fcntl(), we can lock only the part of a file we want to read/write instead of locking the whole file.

Types of Lock

Shared read lock

  • Any number of processes can hold a read lock on the same byte at the same time, since concurrent reads don’t interfere with each other
  • While there is a read lock on a byte, no process can get a write lock on it

Exclusive write lock

  • Only one process can hold a write lock on a given byte
  • While there is an exclusive write lock on a byte, no other process can get a read or write lock on it

These rules only apply between different processes.

UNIX assumes that a developer handles file locks correctly within a single process, so these rules aren’t enforced inside one process. We can request a read lock and then a write lock on the same region; the new lock simply replaces the old one. This should never happen in our code, though.
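The same-process behavior can be checked with a small sketch. The helper names set_lock and relock_in_same_process are made up for illustration, and the 10-byte region is an arbitrary choice; the descriptor is assumed to be open for both reading and writing.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: apply a lock of the given type to bytes 0..9 of fd. */
static int set_lock(int fd, short type)
{
    struct flock fl = { .l_type = type, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 10 };
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 1 if one process can take a read lock and then a write lock on
 * the same bytes, 0 otherwise. fd must be open for reading and writing. */
int relock_in_same_process(int fd)
{
    if (set_lock(fd, F_RDLCK) < 0)
        return 0;
    /* Same process, same region: the write lock replaces the read lock
     * instead of conflicting with it. */
    if (set_lock(fd, F_WRLCK) < 0)
        return 0;
    set_lock(fd, F_UNLCK);  /* clean up */
    return 1;
}
```

Calling relock_in_same_process() on a descriptor opened with O_RDWR should return 1, because neither request conflicts with a lock held by a different process.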

Feature of File Lock

  • File locks are stored with the file’s i-node entry, which is shared across processes
  • A lock only records the process ID (pid) of its owner, not which file descriptor was used to request it
  • Critical gotcha: if a process opens the same file twice (fd1 and fd2) and acquires a lock through fd1, closing fd2 also releases the lock, because the system only knows “this pid closed a descriptor to this file”
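The gotcha can be observed with a sketch like the following. Both function names are hypothetical. A child process is used as the observer, since a process never conflicts with its own locks; the sketch assumes traditional process-owned fcntl() locks.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if another process can write-lock `path`, 0 if it is refused,
 * -1 on error. */
static int other_process_can_lock(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                            .l_start = 0, .l_len = 0 };
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        _exit(fcntl(fd, F_SETLK, &fl) == 0 ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}

/* Returns 1 if closing fd2 dropped the lock taken through fd1,
 * 0 if it did not, -1 on error. */
int close_drops_lock(const char *path)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_RDWR);
    if (fd2 < 0 || fcntl(fd1, F_SETLK, &fl) < 0) {
        if (fd2 >= 0)
            close(fd2);
        close(fd1);
        return -1;
    }
    int before = other_process_can_lock(path); /* lock held: expect 0 */
    close(fd2);                                /* fd1 is still open */
    int after = other_process_can_lock(path);  /* lock gone: expect 1 */
    close(fd1);
    if (before < 0 || after < 0)
        return -1;
    return before == 0 && after == 1;
}
```

A practical consequence is that library code which opens and closes a file the program has already locked can silently drop that lock.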

fcntl

Function Prototype

int fcntl(int filedes, int cmd, struct flock *flockptr);

cmd

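  • F_GETLK - test whether the lock described by the third argument could be placed; if an existing lock would block it, information about that lock (including the holder’s pid) is written back into the structure, otherwise the type field is set to F_UNLCK
  • F_SETLK - set or clear a lock; if the lock can’t be granted, return immediately with an error (nonblocking)
  • F_SETLKW - same as F_SETLK, but if the request is blocked by another lock, the caller waits until it can be satisfied (blocking)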

flockptr

struct flock {
	short l_type;
	off_t l_start;
	short l_whence;
	off_t l_len;
	pid_t l_pid;
};
  • l_type - type of lock: F_RDLCK, F_WRLCK, F_UNLCK
  • l_start - offset in bytes, relative to l_whence
  • l_whence - same as lseek: SEEK_SET, SEEK_CUR, or SEEK_END
  • l_len - size of the requested lock region
    • l_len > 0: the lock covers l_len bytes starting at the position given by l_whence and l_start
    • l_len = 0: the lock extends to the largest possible offset of the file, so it also covers data appended later
  • l_pid - filled in with the pid of the process holding the blocking lock when cmd=F_GETLK
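Filling in the structure by hand for every call gets repetitive, so a small wrapper is common. The following is a minimal sketch; the name lock_region and its parameter order are assumptions made for illustration, not part of any standard API.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill in a struct flock and pass it to fcntl().
 * cmd is F_SETLK, F_SETLKW or F_GETLK.
 * Returns whatever fcntl() returns: 0 on success, -1 on error. */
int lock_region(int fd, int cmd, short type,
                off_t start, short whence, off_t len)
{
    struct flock fl;

    fl.l_type = type;      /* F_RDLCK, F_WRLCK or F_UNLCK */
    fl.l_whence = whence;  /* SEEK_SET, SEEK_CUR or SEEK_END */
    fl.l_start = start;    /* offset relative to l_whence */
    fl.l_len = len;        /* 0 means "up to the end of the file" */
    fl.l_pid = 0;          /* only meaningful as an output of F_GETLK */
    return fcntl(fd, cmd, &fl);
}
```

For example, lock_region(fd, F_SETLKW, F_WRLCK, 0, SEEK_SET, 0) would block until the whole file is write-locked, and the same call with F_SETLK and F_UNLCK would release it.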

Remarks

File Access

  • To obtain a read lock, the descriptor must be open for reading
  • To obtain a write lock, the descriptor must be open for writing

Non-atomic Operations

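  • Testing with F_GETLK and then trying to obtain the lock with F_SETLK isn’t atomic: another process can take the lock between the two calls
  • Hence, we tend to call F_SETLK directly without checking first, and handle the failure if the lock is refused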
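A sketch of the "just try it" approach is shown below. The function name try_write_lock and its return convention are assumptions. POSIX allows either EACCES or EAGAIN when F_SETLK is refused because of a conflicting lock, so both are checked.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to write-lock the whole file without blocking.
 * Returns 1 if the lock was acquired, 0 if another process holds a
 * conflicting lock, -1 on any other error (errno is left set). */
int try_write_lock(int fd)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };

    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 1;
    if (errno == EACCES || errno == EAGAIN)
        return 0;   /* someone else has it; the caller may retry later */
    return -1;
}
```

Because the test and the acquisition happen in one system call, there is no window in which another process can slip in between them.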

Implied Inheritance and Release of Locks

  • When a process terminates, all of its locks are released
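  • When a file descriptor is closed, all locks the process holds on the file it refers to are released, regardless of which descriptor was used to obtain them
  • Locks are never inherited by the child across a fork (since file locks are tracked per process, and the child is a different process)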
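The fork rule can be checked with F_GETLK from the child. In this sketch the function name child_sees_parent_lock is hypothetical. The child reuses the inherited descriptor but still sees the lock as belonging to the parent.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if, after fork(), the child sees the lock as held by the
 * parent (so it was not inherited), 0 otherwise, -1 on error. */
int child_sees_parent_lock(const char *path)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        close(fd);
        return -1;
    }

    pid_t parent = getpid();
    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: ask whether a write lock would be blocked, and by whom. */
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                               .l_start = 0, .l_len = 0 };
        if (fcntl(fd, F_GETLK, &probe) < 0)
            _exit(2);
        _exit(probe.l_type != F_UNLCK && probe.l_pid == parent ? 0 : 1);
    }

    int status;
    int ok = waitpid(pid, &status, 0) == pid && WIFEXITED(status);
    close(fd);  /* also releases the parent's lock */
    if (!ok || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

If the child needs a lock of its own, it has to request one after the fork like any other process.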

Advisory & Mandatory Lock

Advisory Lock

  • We assume that every process accessing the shared file calls the file lock functions before accessing it
  • benefit: the kernel doesn’t need to check for locks on every read/write, since advisory locking assumes we request the lock before we read/write
  • drawback: if a rude process doesn’t obey the rule, it can access the file directly without being blocked

Since we control the rwx permissions, we control who can access the file at all. The file lock only keeps cooperating processes from accidentally corrupting the file.
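The drawback is easy to reproduce. In the sketch below (the function name rude_write_succeeds is hypothetical), the parent holds a write lock on the whole file while a child writes to it without ever asking for a lock. It assumes the file is under ordinary advisory locking.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a child that never asks for a lock can still write to a
 * file the parent has write-locked, 0 if the write fails, -1 on error. */
int rude_write_succeeds(const char *path)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* "Rude" child: opens and writes without calling fcntl() at all. */
        int cfd = open(path, O_WRONLY);
        if (cfd < 0)
            _exit(2);
        _exit(write(cfd, "x", 1) == 1 ? 0 : 1);
    }

    int status;
    int ok = waitpid(pid, &status, 0) == pid && WIFEXITED(status);
    close(fd);
    if (!ok || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Under advisory locking the write goes through even though the parent still holds the lock, which is why every program sharing the file has to follow the locking protocol.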

Mandatory Lock

  • The kernel checks every open, read, and write to verify that it doesn’t violate a lock
  • This checking is expensive, so we’ll almost never use mandatory locks