Lab 6: Caching Extent Server and Consistency

Introduction

 

In this lab you will build a server and client that cache extents at the client, reducing the load on the server and improving client performance. The main challenge is to ensure consistency of extents cached at different clients. To achieve consistency we will use the caching lock service from Lab 5 - Caching Lock Server.

First you'll add a local write-back extent cache to each extent client. The extent client will serve all extent operations from this cache; the extent client will contact the extent server only to fetch an extent that is not present on the client. Then you'll make the caches at different clients consistent by forcing write-back of the dirty cached extent associated with a file handle (and deletion of the clean extent) when you release the lock on that file handle.

Your client will be a success if it manages to operate out of its local extent and lock cache when reading/writing files and directories that other hosts aren't looking at, but maintains correctness when the same files and directories are concurrently read and updated on multiple hosts.

 

Getting Started

 

First, merge your solution to Lab 5 with the new code for Lab 6. You can use the following commands to get the Lab 6 sources; see the directions for Lab 2 for more background on git:



% cd lab
% git commit -am 'my solution to lab5'
Created commit ... 
% git pull
remote: Generating pack... 
...
% git checkout -b lab6 origin/lab6 
Branch lab6 set up to track remote branch refs/remotes/origin/lab6. 
Switched to a new branch "lab6"
% git merge lab5

 


As before, if git reports any conflicts, edit the files to merge them manually, then run git commit -a. Since you are building on the previous labs, make sure the merged code on your lab6 branch passes all tests for Labs 1 through 5 before starting on this lab.


Testing Performance

 

Our measure of performance is the number of put and get RPCs that your extent server receives. As you did for the lock server in Lab 5, you can tell the extent server to print a line with the current totals every 25 RPCs by setting RPC_COUNT to 25.

Then you can start the servers, run the test-lab-4-c script, and look in extent_server.log to see how many RPCs have been received.



% export RPC_COUNT=25
% ./start.sh
% ./test-lab-4-c ./yfs1 ./yfs2
Create/delete in separate directories: tests completed OK 
% grep "RPC STATS" extent_server.log 
...
RPC STATS: 6001:801 6002:1402 6003:797
% ./stop.sh
 


The RPC STATS line indicates the number of put, get and getattr RPCs received by the extent server. The above line is the output of our solution for Lab 5. Your goal is to reduce those numbers to about a dozen puts and at most a few hundred gets.


Step One: Extent Cache


In Step One you'll add caching to your extent client, without cache consistency. This cache will make your server fast but incorrect. (You can simply modify extent_client.cc and extent_client.h, or if you'd like, you can add the code to a sub-class in a separate file. Remember to git add any new files you create.)

get() should check if the extent is cached, and if so return the cached copy. Otherwise get() should fetch the extent from the extent server, put it in the local cache, and then return it to the YFS client. put() should just replace the cached copy, and not send it to the extent server. You'll find it helpful for the next section if you keep track of which cached extents have been modified by put() (i.e., are "dirty"). remove() should delete the extent from the local cache.
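As a rough illustration of this write-back logic, here is a self-contained sketch in which a plain std::map and two counters stand in for the real extent server and its RPC statistics. All names here (cached_extent_client, fake_server, and so on) are illustrative, not part of the lab skeleton:

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-ins for the extent server and its RPC counters; in the lab these
// would be RPCs to the real extent server.
static std::map<int, std::string> fake_server;
static int server_gets = 0;
static int server_puts = 0;

class cached_extent_client {
  struct entry { std::string data; bool dirty; };
  std::map<int, entry> cache_;
public:
  // get(): serve from the cache; contact the "server" only on a miss.
  std::string get(int eid) {
    std::map<int, entry>::iterator it = cache_.find(eid);
    if (it == cache_.end()) {
      server_gets++;                                  // simulated get RPC
      it = cache_.insert(std::make_pair(eid, entry())).first;
      it->second.data = fake_server[eid];
      it->second.dirty = false;
    }
    return it->second.data;
  }
  // put(): update only the cache and mark the extent dirty; no RPC.
  void put(int eid, const std::string &buf) {
    cache_[eid].data = buf;
    cache_[eid].dirty = true;
  }
  // remove(): drop the extent from the local cache.
  void remove(int eid) { cache_.erase(eid); }
};
```

With this structure, a workload that repeatedly reads and writes the same extents generates at most one get per extent and no puts at all; writing dirty extents back to the server is the subject of Step Two.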

When you're done, set RPC_COUNT and run test-lab-4-c giving the same directory twice, and watch the statistics printed by the extent server. You should see zero puts and somewhere between zero and a few hundred gets (or perhaps no numbers at all, if the value of RPC_COUNT is more than the number of gets). Your server should pass test-lab-4-a.pl and test-lab-4-b if you give it the same directory twice, but it will probably fail test-lab-4-b with two different directories because it has no cache consistency.


Step Two: Cache Consistency

 

In Step Two you'll ensure that each get() sees the latest put(), even when the get() and put() are from different YFS clients. You'll arrange this by ensuring that your extent client writes a file's modified (dirty) cached extents back to the extent server before the client releases the lock on that file. Similarly, your extent client should drop clean extents from its cache when it releases the lock on the relevant file, so that a later get() fetches fresh data from the extent server.

You will need to add a flush() method to the extent client that eliminates an extent from the cache. flush() should first check whether the extent is dirty in the cache and, if so, send it to the extent server before dropping it. Extents that your server has removed (via the extent client's remove() method) should also be removed from the extent server, if the extent server knows about them; this means the extent client must remember which extents it has removed locally.
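A minimal sketch of this flush() logic, using in-memory stand-ins for the cache, the set of locally removed extents, and the extent server (all names here are illustrative):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// In-memory stand-ins: ecache is the client's extent cache, removed_ids
// records locally removed extents, server_store plays the extent server.
struct centry { std::string data; bool dirty; };
static std::map<int, centry> ecache;
static std::set<int> removed_ids;
static std::map<int, std::string> server_store;

// flush(): write back a dirty extent (simulated put RPC), or propagate a
// local remove (simulated remove RPC), then forget the extent locally.
void flush(int eid) {
  std::map<int, centry>::iterator it = ecache.find(eid);
  if (it != ecache.end()) {
    if (it->second.dirty)
      server_store[eid] = it->second.data;  // dirty: send to server
    ecache.erase(it);                       // clean or dirty: drop locally
  } else if (removed_ids.count(eid)) {
    server_store.erase(eid);                // removed: delete on server too
    removed_ids.erase(eid);
  }
}
```

Note that a clean extent is simply dropped: it never generates a put RPC, which is what keeps the put count low when files are only read.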

Your server will need to call flush() just before releasing a lock back to the lock server. You could just add flush() calls to yfs_client.cc before each release(). However, now that your lock client handles the caching of locks, flushing the extents after each release is overkill; what you really want is to flush the extents only once the client is forced to give the lock back to the lock server.

We provide an interface for this in the form of the lock_release_user class, defined in lock_client_cache.h. This is a virtual class supporting only one method: dorelease(std::string lockname). Your job is to subclass lock_release_user and implement that subclass's dorelease method to call flush() on your extent client for whatever data is about to lose its lock. Then, create an instance of this class and pass it into the lock_client_cache object constructed in yfs_client.cc. Finally, your lock_client_cache must call the dorelease() method of its lu object before it releases a lock back to the lock server. (Note that lu was defined and initialized in the code we provided you for Lab 5.) Overall, this will ensure that any dirty extents are flushed back to the cache before the lock is released, so that when the next client gets the lock and fetches the extent, it will see consistent data.
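The wiring might look roughly like the following sketch. The lock_release_user interface matches the description above, while fake_extent_client and lock_release_flusher are made-up names standing in for your real extent client and your subclass:

```cpp
#include <cassert>
#include <string>
#include <vector>

// The virtual interface as described for lock_client_cache.h.
class lock_release_user {
public:
  virtual void dorelease(std::string lockname) = 0;
  virtual ~lock_release_user() {}
};

// Stand-in for the extent client: flush() just records which ids were
// flushed so the behavior can be checked.
struct fake_extent_client {
  std::vector<std::string> flushed;
  void flush(const std::string &eid) { flushed.push_back(eid); }
};

// Our subclass: when a lock is about to go back to the lock server,
// flush the extent associated with the lock name (in YFS a file's lock
// and its extent share the same id).
class lock_release_flusher : public lock_release_user {
  fake_extent_client *ec_;
public:
  lock_release_flusher(fake_extent_client *ec) : ec_(ec) {}
  void dorelease(std::string lockname) { ec_->flush(lockname); }
};
```

In yfs_client.cc you would construct the real equivalent of lock_release_flusher with a pointer to your extent client and pass it to the lock_client_cache constructor, so that lock_client_cache can invoke dorelease() through its lu pointer just before handing a lock back.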

You should also keep extent meta-data cached along with the extents, and flush dirty meta-data back to the extent server along with the extents. If an extent is cached, then any calls that set attributes should change the meta-data in the cache, and need not propagate to the extent server until flush() is called.
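One way to sketch this is a cache entry that carries the attributes alongside the data, with separate dirty flags for each. The types and names here are illustrative, not the lab's real extent_protocol definitions:

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative attribute struct; the lab's real one lives in extent_protocol.
struct xattr { unsigned int atime, mtime, ctime, size; };

struct attr_entry {
  std::string data;
  xattr attr;
  bool data_dirty;   // set by put()
  bool attr_dirty;   // set by attribute updates
};

static std::map<int, attr_entry> attr_cache;

// An attribute update touches only the cached meta-data and marks it
// dirty; nothing reaches the extent server until flush().
void set_cached_size(int eid, unsigned int size) {
  attr_entry &e = attr_cache[eid];
  e.attr.size = size;
  e.attr_dirty = true;
}
```

On flush(), a dirty attr (like dirty data) would be sent to the extent server before the entry is discarded; a getattr() on a cached extent never needs an RPC at all.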

When you're done with Step Two your server should pass all the correctness tests (test-lab-4-a.pl, test-lab-4-b, and test-lab-4-c should execute correctly with two separate YFS directories), and you should see a dramatic drop in the number of puts and gets received by the extent server.


Hints

 

  • Make sure you use a pthreads mutex to protect the extent cache, in case multiple threads access it at once.
  • Make sure that if you only read an extent (or its attributes), you don't flush it back on a release. You probably want to keep a flag in your cache that records whether an extent (or its attributes) has been modified.
  • You should be able to implement the caching in the extent client without modifying the YFS client code. (Of course, if you implemented your caching extent client as a separate class, you will need to instantiate it in yfs_client.cc.)
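For the first hint, the locking pattern might look like this sketch (a deliberately simplified cache with illustrative names):

```cpp
#include <cassert>
#include <map>
#include <pthread.h>
#include <string>

// A simplified cache guarded by a pthreads mutex.
static pthread_mutex_t cache_m = PTHREAD_MUTEX_INITIALIZER;
static std::map<int, std::string> guarded_cache;

std::string cached_get(int eid) {
  pthread_mutex_lock(&cache_m);
  std::string r;
  std::map<int, std::string>::iterator it = guarded_cache.find(eid);
  if (it != guarded_cache.end())
    r = it->second;
  pthread_mutex_unlock(&cache_m);   // never return while holding the lock
  return r;
}

void cached_put(int eid, const std::string &buf) {
  pthread_mutex_lock(&cache_m);
  guarded_cache[eid] = buf;
  pthread_mutex_unlock(&cache_m);
}
```

Depending on your design, you may also want to avoid holding the cache mutex across an RPC to the extent server (for example, on a cache miss), since that would serialize all cache accesses behind network round trips.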


Handin Procedure


Remember to test your code multiple times as the tests introduce some randomness.

You will need to submit your completed code (without binaries) as a gzipped tar file on the Lab 6 Submission Page by the deadline. To do this, first switch to the directory that contains your lab directory and execute this command:


% tar czvf MATR1-MATR2-lab6.tgz lab/


That should produce a file called MATR1-MATR2-lab6.tgz in that directory, where MATR1 and MATR2 are the matriculation numbers of the team members. Then follow the instructions and submit the .tgz file on the submission page.

You will receive full credit if your software passes the same tests we gave you when we run your software on the VM.
