Summary - Related Work - AshFS: 一個支援離線操作的輕量化行動檔案系統

Chapter 2 Related Work

2.7 Summary

Now we list a table to describe our design goals and make a comparison of our related works. And we explain some terminologies listed in the table here.

z Mobile authentication: The target filesystem can recognize and authenticate a mobile user.

z Data encryption: All transferred data are encrypted for the security concerns.

z Differential file reading/writing: When updating a file, the target filesystem can compare the new and old files, and only transfer the modified portion of the file.

z Suitable for wireless environments: The target filesystem can tolerate the variation of wireless networks and consume less bandwidth.

NFS SMB Coda sshfs RSC MAFS

Persistent cache ╳ ╳ ○ ╳ ○ ○

Disconnected operation ╳ ╳ ○ ╳ ○ ○

Mobile authentication ╳ ○ ○ ○ ○ ○

Data encryption

╳ ╳ ○ ○ ╳ ╳

Standard protocols ○ ○ ╳ ○ ○ ╳

Differential file

reading/writing ╳ ╳ ╳ ╳ ╳ ╳

Suitable for wireless env. ╳ ╳ ╳ ╳ ○ ○

Table 1: Comparison of related works

Chapter 3 Design Issues

To realize a file system, there are some basic operations like read and write needed to be implemented. Because we want AshFS can support two key mechanisms: disconnected operation and automatic synchronization, these operations are slightly different than traditional ones. With a view to easing deployment, we adopt two design aspects used by RSC filesystem to develop AshFS: standardized protocol and no server-side changes, but we changed some designs. Here we are going to discuss them, and explain our design tenets.

3.1 Read Issues

Most mobile users do lots of read operations than write, and want their readings as fast as possible without being interfered with fluctuations of bandwidth.

Figure 1: Read flow

Figure 1 shows program flow of a read operation, and we are going to discuss some issues in detail. AshFS first search its cache for information of the requested object. At first time, there is a cache miss, AshFS will contact to server to get data. If the object has been read recently, it can be found in cache, so there is a cache hit. Then AshFS will check its validity, in order to make sure the cached data are fresh enough. Note that every network requests and replies will be sent to Network manager first. Because of unsteady connection of mobile networks, we have designed the network manager to facilitate administrating.

3.1.1 Cache design

In order to fully support disconnected operation, we designed a cache to store all necessary data and utilized whole-file caching; therefore user can still uses some files when there is no connection. Not only the object data are cached, but also its metadata are cached.

The metadata includes filename, size, owner, assess rights, last modification time, etc. which are needed for a file system to interpret its data. When the client access a file for the first time, the entire file is fetched and cached in the local file system. So next time when the same object is requested again, AshFS can answer it directly without contacting server.

The basic idea of AshFS is caching information which is needed of each filesystem operation and redirecting these operations to local disk first. Usually the operation of reading a new object returns until all needed data are fetched. However, users are always impatient of waiting for a long time and think that the program is crashed when fetching a large file. Hence AshFS times each read operation and returns immediately when timeout occurs. Fetching process will continue in background, and a message will be showed when someone tries to read this incomplete file.

Furthermore, AshFS supports persistent cache: The cache contents in memory will be

written into disk periodically in case of cache miss due to power shutdown or programs exits abnormally. Generally the user can use AshFS smoothly without being aware of the existence of cache, however we have make AshFS more flexible: The user can observe the caching state and control a part of cache.

3.1.2 Cache consistency

Because the client always refers to cached file preferentially, the freshness of the cached file becomes an important issue. If a cached file has been changed on the server, client should update this file before using it to prove data consistency; this process is called cache validation.

Traditionally cache validation can be avoided if the file in the server is locked. However, locking is overkill when the file is only opened for reading [3]. In fact, mobile clients usually perform much more reads than writes. Besides, file locking is not provided in the FUSE system [1], so the validity of the cache should be guaranteed by some other ways. One is server-side “callback”: The server actively notifies the client of changed files. This mechanism has been demonstrated to be a useful optimization in cache consistency management in many prior network file systems like Coda and NFSv4 [13]. Nevertheless, we have decided not to change server-side for easy deployment, also mobile network (like GPRS) operators do not typically allow active incoming packets form Internet [3]. So we have let the client to check validity actively: It check the last modification time of a file object on the server.

On the first time the client gets metadata of a file object (a file or a directory) from the server, it also records its last modification time, we say, time stamp in cache. Different from cache validation mechanism of RSC filesystem, we enhanced AshFS’s cache consistency by mixing both timeout (used in RSC) and check-on-demand mechanisms. The time stamp will be verified under two conditions: One is when the lifetime of the cached file object expires;

the other is when the object is requested. If the time stamp of a cached file object is different from server’s one, it means that server’s file have been modified since client have cached it last time. Maybe this cached file is out-of-date, and we call such file a “stale file”. If a cached file is marked as “stale”, AshFS will abandon it and try to download a fresher one.

Verifying cache needs to communicate with the server for every file operation, and it is bandwidth-consuming, so AshFS only perform it when it is in strongly connected state (discussed latter). In weakly connection state, AshFS’s control messages and file transfer deservedly take higher priority of network usage than cache validation. In addition, it’s apparent that such a communication process will take much more time than just deal with cached files. When considering the file operation speed, waiting until checking process finishes is unreasonable. So we do this process in parallel: create another thread to verify validity, and the main thread performs the real operation simultaneously. Main thread should detach his kid and returns quickly. If there is inconsistency, kid thread will mark this state on the file object, and update will be progressed next time. Checking validity in parallel sacrifices some consistency, but has a great advantage in operation speed. We used POSIX threads package (pthreads) [10] provided by the Linux OS to implement thread management and synchronization.

3.1.3 Network manager

AshFS is aware of the connection status, and divides it into three states: Strong, Weak, and Disconnected. When the connection to the server is lost, AshFS will switch to Disconnected mode automatically. In Disconnected mode, all file system operations will be served by cache, if there is a cache miss, an error message will be displayed. If the connection exists, but bandwidth is not strong enough, AshFS will switch to Weak mode. When AshFS is operated in Weak mode, it will still fetch the uncached file from the server, but all modifications will not be propagated to the server. Hence this stage is also called

write-disconnected mode.

As a matter of fact, AshFS’s every network accessing operation will past to Network manager first. Network manager is mainly responsible for identifying connection status, communicating with the server, and (re)connecting to the server. To classify connection status, Network manager uses timeout mechanism. Before Network manager connects to the server, a default timeout, say 20 seconds, was set. When it is connecting to the server, the response time of every request will be recorded. And their average pluses a little amount will be the new timeout. And the next network accesses will take the new timeout as reference. Once upon Network manager successfully connects to the server, it will use the “ping” utility to decide the current connection status is Strong or Weak. Network manager pings the server, and analyzes the response time to decide current status. The default margin of Strong and Weak is 10 ms. If the server doesn’t receive ICMP packets, i.e. it cannot be pinged, AshFS will set connection status as Strong first.

Adapting to variability of mobile networks, our timeout is dynamic. It means that the timeout may be changed in every communications. Please look at Figure 2. When Network manager sends requests to the connected server, it waits for responses in T seconds. If it can receive response, it will compare current response time with the T. If network manager cannot receive any message in T seconds, it will double current T and requests again. More details are described in the figure, and it shows how we define Weak and Disconnected. If AshFS doesn’t get respond in T over two times, it assumes that the server is out of reach and change to disconnected mode. By the way, all default values used here for Network manager can be changed manually by user.

Figure 2: Connection status and timeout

3.2 Write Issues

We have mentioned that all file operations will be redirected to local disk first, it’s also true for the writing. AshFS uses asynchronous writeback mechanism at all bandwidth levels, which means that AshFS will not propagate the update to server when the writing has just finished. This can reduce some unnecessary communications between client and server and utilize the limited bandwidth more efficiently.

Some of MAFS’s design aspects are adopted in AshFS: we utilized asynchronous writeback at all bandwidth levels and defined priorities for each kind of AshFS’s file operations. However we didn’t intend to adopt MAFS’s invalidation-based propagation

scheme, because it needs server-side changes and consumes more resources. Since AshFS is designed for personal use, we abandoned this propagation scheme which is useful in maintaining file consistency in a multi-client system. Besides using priorities like MAFS to make bandwidth-usage more efficient, we also focused on reducing client-server network traffic when transferring files.

Please look at Figure 3. It illustrates how a write operation progresses. Here we omit the procedures of connecting to the server. In fact, when there is a write miss, AshFS will download it from the server first, and then remaining operations are redirect to the local files.

Just similar to the read operation, before writing to the local file, the validity should be verified to prove consistency. If something causes write failure (like disk full or memory leak), the write process will aborts. On the contrary, if it successfully finished, AshFS will record this write operation and push it into the write queue. Then AshFS will create another thread to start Upload manager.

Figure 3: Write flow

3.2.1 Upload manager

Upload manager is designed for uploading all the updates in write queue to the server.

Figure 4 shows how Upload manager synchronizes these modified items to the server and some important mechanisms. To be briefly, it is responsible for all the updating process, including doing optimization, checking network status, resolving conflicts, avoiding read-write contention, labeling time stamp, and marking the upload status. We’ll discuss these processes soon. When an update has been uploaded to the server successfully, write manager will remove this item from the write queue. It is deserved to be mentioned that before processing each queue item, Upload manager will check the network status first. If it is in Disconnected or Weak state, Network manager will cease current upload process until it changes back to Strong state. Because when the connection is weak, uploading will exhaust the poor and precious bandwidth; therefore cause a high latency of AshFS.

Figure 4: Upload manager

3.2.2 Write optimization and read-write contention

Item in queue can be optimized because that some file operations could be canceled out.

For example, if a user creates a new file, and after writing this file for several times, he finally removes the file. Operations to this file can all be canceled and it’s not necessary to upload it ever.

AshFS write updates asynchronously to the server. When upload manager propagates these updates in background, user can still use AshFS to do other read/write operations in foreground. So it’s usual that there are both read traffic and write traffic at the same time. This situation is called read-write contention [5]. However, most of update processes are deferrable, and they contest available bandwidth with foreground activities. Especially when the available bandwidth is poor, such competition will severely slow down the operation speed of file system. Using a network file system over such a low-bandwidth environment must consume less bandwidth, and avoid monopolizing network links in use for other purposes [26].

To smooth user’s file system operations without being interfered by minor tasks, we divided AshFS’s network operations into several classes, and sorted them based on their importance to the most users. Table 1 is the priorities of AshFS’s network operations. The most important operation is getting the metadata. Because it can let the user know quickly what contents an uncached directory has, or how large of an uncached file is. Fetching a file in background, which has been discussed in part B, we gave it a priority number 3, because the user may want to read it as soon as possible. Synchronization messages and upload are last two in our priority table. Although it may cause higher file inconsistency, it’s acceptable to do such a compromise.

AshFS operations Descriptions

Get metadata Directory contents, file attributes, file type.., etc.

Fetch file Ordinary fetch

Background fetch Fetch large file

Synchronize Check time stamps

Upload Upload file to the server

Table 2: Priorities of network accesses

3.2.3 Conflict detection and resolution

Because the server doesn’t positively notify its clients which files were changed, the consistency can only be checked by the client itself. There are two kinds of file inconsistency:

one is Stale, and the other is Conflict, which we are going to discuss in this section. “Stale”

means that the file on the server has been modified before the client writes it, so the cached file is out-of-date. We have talked about this in 3.1.2. Unfortunately, when Upload manager are trying to propagate a modified file to the server, and it finds that this file has already been changed on the server, then conflict occurs. It probably means that there are two clients have modified the same file simultaneously.

Let’s explain how AshFS detects a conflict here. We compare the time stamps. There are three kinds of time stamps which will be tracked by AshFS:

z Tm: time stamp of the file recorded by AshFS.

z T_s:time stamp of server’s file.

z Tc :time stamp of the file in AshFS’s cache now.

Case 1: T_c= T_m < T_s

Î Server’s file has been modified.

Î Our client doesn’t modify (use) it yet.

Î Stale.

Case 2: Tm < Ts and Tm < Tc

Î Server’s file has been modified.

Î Our client has also modified it but doesn’t upload it yet.

Î Conflict.

In short, when these three time stamps are all unequal, we could say there is a conflict.

To resolve a conflict, AshFS applies the “no lost update” principle [3]. It keeps both server’s and client’s new file, but rename client’s one. The new name of file is similar with old one, with a suffix and a time label appended to it. For example, the original file name is “hello.txt”, and its conflict copy could be renamed as “hello-conflict-20080521.txt”. So user can easily identify it as a conflict file. This principle may be a little conservative, but preserving both sides’ files let user can decide by himself / herself, could be the safest one. Resolution should be automatic and transparent without user intervention. In our implementation, users will not receive messages about conflicts. But they can perceive the existence of conflict files by the filename or the upload log.

3.3 Other Issues

Besides read and write issues, there are some other important mechanisms. We are going to interpret them here.

3.3.1 Directory refresh

Listing is almost the most frequently-used function of a file system. User lists a directory and sees if there is something interesting. But how does AshFS know if the directory is updated or not? Let’s begin with an example.

Figure 5: Readdir flow

Figure 5 is a part of flowchart of customized Readdir() function. As we mentioned, the checking time stamp process will be done by another thread, thread 2. Thread 2 asks server for time stamp of the directory. If it is different from one in AshFS’s cache, it will know that something under the directory was deleted or newly added. So it will mark the directory as

“invalid”. The original thread doesn’t check time stamp itself, but it inspects the valid bit of the directory. If the valid bit is set, main thread will contact to the server to reread the directory, and refresh the cache.

3.3.2 Time stamp synchronization

Now we are going to discuss time stamp synchronization. When upload manager finishes propagating one file to the server, it’s intuitive to think that the server’s file is totally

the same as one in the client’s cache. However, their time stamps are rarely the same, since their finishing time are usually different. So when the client uses the file next time, AshFS will think that it was out-of-date, and try to update the cached file. Hence if a file is being uploading to the server, AshFS will temporarily turn off the synchronization of the file. And when the uploading has finished, AshFS will equalize their time stamps and their parent directory’s time stamps. Only when this process succeeds, the propagation of this item can be seen as completed.

3.3.3 Improve operation speed and bandwidth consuming

Because of the uncertainty and preciousness of the mobile networks, we are trying to accomplish most of operations in the local file system without going through Internet. When accessing Internet, we want to use it more efficiently. In addition, comparing with local operations, communicating with the server is time-consuming. There are four circumstances that AshFS consumes a large amount of network bandwidth:

z Retrieving metadata: especially after reading an uncached directory.

z Cache validation: when checking time stamps.

z Time stamp synchronization: After uploading a file.

z Fetching and storing files.

When a user lists an uncached directory, AshFS library will first try to link and call

“readdir” function of AshFS process to find all entries under this directory. Then for each entry, the library will ask “stat” function for the metadata. After knowing what type of this

在文檔中 AshFS: 一個支援離線操作的輕量化行動檔案系統 (頁 19-0)