Threadsafe File Consistency in Ruby
A large part of the work in the 0.7.0 release of Acts As Indexed was in guaranteeing the consistency of the index files, which may be written to by many processes. I shall split this into two halves: atomic writes, and locking writes.
I talk mostly of processes here, since most Rails hosting implementations at the moment employ multiple processes, though the same methodologies can be applied to threads.
Atomic Writes
An example: Say we have one process which writes to a file, and many processes which may be reading from that same file. If we do a simple write, it is possible that one of the reading processes may see a half-written file. While digging through the Rails source I discovered a monkey-patch on the Ruby File class which added a method called atomic_write.
The basic operation of this is as follows:
- Write to a temporary file.
- Move that temporary file to be the actual file we want to write to.
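Since we are delegating the move operation to a system call, we can almost guarantee that any process reading the file will only see a fully written one, since all that is being changed during the move is a pointer to the file’s physical location on disk. This only holds when the temporary file and the target are on the same filesystem; across filesystems the move falls back to a copy followed by a delete, which is not atomic. A simple implementation of this would be thus: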
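    require 'fileutils'

    def atomic_write(path, temp_path, content)
      # Write the complete content to a temporary file first.
      File.open(temp_path, 'w+') do |f|
        f.write(content)
      end
      # Then move the finished file into place in one step.
      FileUtils.mv(temp_path, path)
    end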
The Rails implementation goes a lot further than this, creating a tempfile in the OS-mandated location, and making sure the newly written file has the same permissions as the original file.
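To give a feel for those extra steps, here is a rough sketch of my own; it is not the Rails code. The name atomic_write_keeping_mode is made up for illustration, and I have assumed the temporary file should live beside the target rather than in the OS temp directory, so that the final rename never has to cross filesystems.

```ruby
require 'tempfile'

# Hypothetical helper, not the Rails method: writes content to a temp file
# and renames it over path, keeping the original file's permission bits.
def atomic_write_keeping_mode(path, content)
  # Assumption: create the temp file in the target's own directory so the
  # rename stays on a single filesystem.
  temp = Tempfile.new(File.basename(path), File.dirname(path))
  temp.write(content)
  temp.close

  # Tempfile creates files as 0600; copy the original's mode if there is one.
  File.chmod(File.stat(path).mode & 0o7777, temp.path) if File.exist?(path)

  # Swap the finished file into place.
  File.rename(temp.path, path)
ensure
  # Clean up the temp file if the rename never happened.
  temp.unlink if temp
end
```

If the target does not exist yet, the new file simply keeps the restrictive mode that Tempfile gave it.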
Locking Writes
Another example: We have many processes, all of which can write to the same file. Our processes first read the file, and then make some change to it. A race condition for this looks as follows.
- Process A reads the file.
- Process B reads the file.
- A makes changes and writes these.
- B makes changes and writes these.
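The four steps above can be replayed by hand in a single script. This is only a sketch: lost_update_demo and the file name are made up for illustration, and the two "processes" are simply two local variables holding what each one read.

```ruby
require 'tmpdir'

# Hypothetical helper: performs the four steps in order inside a scratch
# directory and returns the final content of the file.
def lost_update_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'list.txt')
    File.write(path, "start\n")

    a_copy = File.read(path)               # Process A reads the file.
    b_copy = File.read(path)               # Process B reads the file.
    File.write(path, a_copy + "from A\n")  # A makes changes and writes them.
    File.write(path, b_copy + "from B\n")  # B writes from its stale copy.

    File.read(path)
  end
end

final_content = lost_update_demo
puts final_content
```

The file ends up holding the starting line and B's line only.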
In this example, changes made by A are lost. The solution to this is to use locks, which are provided by the Ruby File class via the flock method.
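    def lock(path)
      # We can only lock a file that exists, and we must keep the handle:
      # the lock belongs to this open handle, not to the path.
      file = File.open(path) if File.exist?(path)
      file.flock(File::LOCK_EX) if file

      # Carry out the operations.
      yield

      # Unlock through the same handle, then close it.
      if file
        file.flock(File::LOCK_UN)
        file.close
      end
    end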
We can combine this with the atomic_write method as follows:
    lock('my_file') do
      atomic_write('my_file', 'my_file.tmp', 'Hello, World!')
    end
Rails’ file store has a great implementation of this pattern, which automatically unlocks the file again in case of an exception while the lock is applied.
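A minimal sketch of that exception-safe behaviour follows. It is my own illustration and not the Rails code: with_lock is a made-up name, and locking a separate file at path + '.lock' is an assumption of mine. I chose a separate lock file because an atomic write replaces the target with a new file, so a lock held on the old one no longer guards the new one.

```ruby
# Hypothetical helper: holds an exclusive lock on a separate lock file
# (path + '.lock', an assumed convention) while the block runs, and
# releases it even if the block raises.
def with_lock(path)
  File.open("#{path}.lock", File::RDWR | File::CREAT, 0o644) do |lock_file|
    lock_file.flock(File::LOCK_EX)
    begin
      yield
    ensure
      # Runs on both normal exit and exceptions.
      lock_file.flock(File::LOCK_UN)
    end
  end
end
```

Any exception raised inside the block still propagates to the caller; the ensure clause only makes sure the lock is gone by the time it does.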








