A large part of the work in the 0.7.0 release of Acts As Indexed went into guaranteeing the consistency of the index files, which may be written to by many processes. I shall split this into two halves: atomic writes and locking writes.

I talk mostly of processes here, since most Rails hosting implementations at the moment employ multiple processes, though the same methodologies can be applied to threads.

Atomic Writes

An example: Say we have one process which writes to a file, and many processes which may be reading from that same file. If we do a simple write, it is possible that one of the reading processes may see a half-written file. While digging through the Rails source I discovered a monkey-patch on the Ruby File class which added a method called atomic_write.

The basic operation of this is as follows:

  1. Write to a temporary file.
  2. Move that temporary file to be the actual file we want to write to.

Since the move is delegated to a system call, we can almost guarantee that any process reading the file will only see a fully written one: provided both paths are on the same filesystem, all that changes during the move is a pointer to the file’s physical location on disk. A simple implementation of this would be thus:

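require 'fileutils'

def atomic_write(path, temp_path, content)
  # Write the full content to the temporary file first.
  File.open(temp_path, 'w') do |f|
    f.write(content)
  end

  # Move the finished file into place in a single step.
  FileUtils.mv(temp_path, path)
end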

The Rails implementation goes a lot further than this, creating a tempfile in the OS-mandated location and making sure the newly written file has the same permissions as the original file.
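
To give a rough idea of what those extra steps involve, here is a minimal sketch. It is not the Rails code: the method name atomic_write_preserving_mode is made up for illustration, and it assumes the tempfile is created in the target's own directory (rather than the OS temp location) so that the final rename stays on one filesystem.

```ruby
require 'tempfile'

# Hypothetical helper, not the Rails implementation.
def atomic_write_preserving_mode(path, content)
  # Remember the permission bits of the file we are about to replace, if any.
  old_mode = File.exist?(path) ? (File.stat(path).mode & 0o7777) : nil

  # Assumption: create the tempfile next to the target so the rename
  # never has to cross a filesystem boundary.
  temp = Tempfile.new(File.basename(path), File.dirname(path))
  begin
    temp.write(content)
    temp.close

    # Give the new file the same permissions as the one it replaces.
    File.chmod(old_mode, temp.path) if old_mode

    # Swap the finished file into place.
    File.rename(temp.path, path)
  ensure
    # Tidy up if the rename never happened; harmless if it did.
    temp.close
    temp.unlink
  end
end
```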

Locking Writes

Another example: We have many processes, all of which can write to the same file. Each process first reads the file, and then makes some change to it. A race condition for this looks as follows:

  1. Process A reads the file.
  2. Process B reads the file.
  3. A makes changes and writes these.
  4. B makes changes and writes these.
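
The interleaving above can be replayed deterministically in a single script. This is only an illustration: the method name replay_lost_update and the file contents are made up, and the two "processes" are simulated by two in-memory copies of the file.

```ruby
require 'tmpdir'

# Hypothetical helper that replays steps 1 to 4 in order within one process.
def replay_lost_update(path)
  File.write(path, "original\n")

  a_copy = File.read(path)                    # 1. Process A reads the file.
  b_copy = File.read(path)                    # 2. Process B reads the file.
  File.write(path, a_copy + "change by A\n")  # 3. A writes its change.
  File.write(path, b_copy + "change by B\n")  # 4. B writes over A's version.

  File.read(path)
end

Dir.mktmpdir do |dir|
  puts replay_lost_update(File.join(dir, 'index_file'))
end
```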

In this example, changes made by A are lost. The solution to this is to use locks, which are provided by the Ruby File class via the flock method.

def lock(path)
  # We can only lock a file that already exists.
  file = File.open(path) if File.exist?(path)
  file.flock(File::LOCK_EX) if file

  # Carry out the operations.
  yield

  # Unlock and close the same handle we locked.
  if file
    file.flock(File::LOCK_UN)
    file.close
  end
end

We can combine this with the atomic_write method as follows:

lock('my_file') do
  atomic_write('my_file', 'my_file.tmp', 'Hello, World!')
end

Rails’ file store has a great implementation of this pattern, which automatically unlocks the file again in case of an exception while the lock is applied.
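
As a rough sketch of that idea (not the Rails code), the unlock can be placed in an ensure clause so it runs even when the block raises. The names with_lock and increment are hypothetical. This sketch also locks a separate ".lock" file rather than the target itself, on the assumption that flock applies to the opened file and not to the path, so a lock taken on the target would not carry over to the file that the write-then-move step puts in its place.

```ruby
# Hypothetical helper: hold an exclusive lock on "<path>.lock" while the
# block runs, and release it even if the block raises.
def with_lock(path)
  File.open("#{path}.lock", File::RDWR | File::CREAT, 0o644) do |lock_file|
    # Blocks until no other process holds the lock.
    lock_file.flock(File::LOCK_EX)
    begin
      yield
    ensure
      lock_file.flock(File::LOCK_UN)
    end
  end
end

# Hypothetical usage: a counter that several processes can bump safely.
def increment(path)
  with_lock(path) do
    count = File.exist?(path) ? File.read(path).to_i : 0

    # Write-then-move, as in the atomic write section.
    File.write("#{path}.tmp", (count + 1).to_s)
    File.rename("#{path}.tmp", path)
  end
end
```

Because the lock file is never replaced, every process contends for the same lock no matter how often the target file is swapped out.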