

Sets the server ID.

Sets the log severity level to output.

Call to create a session.

The first two lines set the seed for random number generation and set the identifier of this master. We use the server_id to identify different masters. (Recall that we can have one or more backup masters as well as a primary master.) Next, we set the severity level of the log messages. The logging implementation is homebrewed (see log.h); for convenience, we copied it from the ZooKeeper distribution (zookeeper_log.h). Finally, we have the call to zookeeper_init, which makes main_watcher the function that processes session events.

Bootstrapping the Master

Bootstrapping the master refers to creating a few znodes used in the operation of the master-worker example and running for primary master. We first create four necessary znodes:

void bootstrap() {
    if(!connected) {
        LOG_WARN(("Client not connected to ZooKeeper"));
        return;
    }

    create_parent("/workers", "");
    create_parent("/assign", "");
    create_parent("/tasks", "");
    create_parent("/status", "");

    ...
}

If not yet connected, log that fact and return.

Create four parent znodes: /workers, /assign, /tasks, and /status.

And here’s the corresponding create_parent function:

void create_parent(const char * path, const char * value) {
    zoo_acreate(zh,
                path,
                value,
                0,
                &ZOO_OPEN_ACL_UNSAFE,
                0,
                create_parent_completion,
                NULL);
}

Asynchronous call to create a znode. It passes a zhandle_t instance, which is a global static variable in our implementation.

The path is a parameter of the call of type const char*. If the client is tied to a subtree of a znode through its connect string, the path is interpreted relative to that subtree, as described in “Managing Client Connect Strings” on page 197.

The second parameter of the call is the data to store with the znode. We pass this data to create_parent just to illustrate that we need to pass it as the completion data of zoo_acreate in case we need to retry the operation. In our example, passing data to create_parent is not strictly necessary because it is empty in all four cases.

This parameter is the length of the value being stored (the previous parameter).

In this case, we set it to zero.

We don’t care about ACLs in this example, so we just set it to be unsafe.

These parent znodes are persistent and not sequential, so we don’t pass any flags.

Because this is an asynchronous call, we pass a completion function that the ZooKeeper client calls upon completion of the operation.

The last parameter is the context of this call, but in this particular case, there is no context to be passed.

Because this is an asynchronous call, we pass a completion function to be called when the operation completes. The definition of the completion function is:

typedef void
(*string_completion_t)(int rc,
                       const char *value,
                       const void *data);

rc is the return code, which appears in all completion functions.

value is the string returned.

data is context data passed by the caller when making an asynchronous call.

Note that the programmer is responsible for freeing any heap space associated with the data pointer.
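As an aside, the ownership rule for the data pointer can be illustrated with a small self-contained sketch. It does not use the ZooKeeper client library: fake_async_create, on_created, create_with_context, and freed_contexts are hypothetical names introduced only for this illustration, and the stand-in call invokes the completion immediately instead of from a separate thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the string completion signature shown above. */
typedef void (*string_completion_fn)(int rc, const char *value,
                                     const void *data);

int freed_contexts = 0;  /* counts contexts released by the completion */

/* Hypothetical stand-in for an asynchronous call (NOT a ZooKeeper API):
 * it invokes the completion right away and hands back the caller's
 * context pointer untouched. */
static void fake_async_create(const char *path, string_completion_fn done,
                              const void *data) {
    done(0, path, data);
}

/* The completion is the last user of the heap-allocated context,
 * so it is the one that frees it. */
static void on_created(int rc, const char *value, const void *data) {
    (void) rc;
    (void) value;
    free((void *) data);
    freed_contexts++;
}

/* Copies the payload to the heap, then passes the copy as context. */
void create_with_context(const char *path, const char *payload) {
    assert(path != NULL && payload != NULL);
    char *ctx = malloc(strlen(payload) + 1);
    if (ctx == NULL) {
        return;  /* allocation failed, nothing was issued */
    }
    strcpy(ctx, payload);
    fake_async_create(path, on_created, ctx);
}
```

In the master-worker code itself, create_parent passes no context, so there is nothing to free in its completion function.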

For this particular example, we have this implementation:

void create_parent_completion (int rc, const char *value, const void *data) {
    switch (rc) {
        case ZCONNECTIONLOSS:
            create_parent(value, (const char *) data);
            break;
        case ZOK:
            LOG_INFO(("Created parent node %s", value));
            break;
        case ZNODEEXISTS:
            LOG_WARN(("Node already exists"));
            break;
        default:
            LOG_ERROR(("Something went wrong when running for master"));
            break;
    }
}

Check the return code to determine what to do.

Try again in the case of connection loss.

Most of the completion function consists simply of logging to inform us of what is going on. In general, completion functions are a bit more complex, although it is good practice to split functionality across different completion methods as we do in this example. Note that if a connection is lost, this code ends up calling create_parent multiple times. This is not a recursive call because the completion function is not called by create_parent.

Also, create_parent simply calls a ZooKeeper function, so it has no side effects that would come, for example, from allocating memory space. If we do create side effects, it is important to clean up before making another call from the completion function.
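The point that the retry is not a recursive call can be checked with a small simulation. This is only a sketch under stated assumptions: fake_acreate, run_event_loop, the one-slot pending queue, and the RC_* constants are hypothetical stand-ins for the client library and its return codes, and the first two attempts are made to fail with a connection loss.

```c
#include <assert.h>

enum { RC_OK = 0, RC_CONNECTION_LOSS = -4 };  /* stand-in return codes */

typedef void (*completion_fn)(int rc, const char *path);

/* One-slot queue of pending completions, standing in for the queue the
 * client library would keep internally (hypothetical). */
static struct {
    completion_fn fn;
    int rc;
    const char *path;
    int full;
} pending;

int attempts = 0;        /* number of calls to create_parent_sim */
int depth = 0;           /* > 0 while create_parent_sim is on the stack */
int max_depth_seen = 0;  /* stays at 1 if the retry is not recursive */
int created = 0;
int failures_left = 2;   /* simulated connection losses still to come */

static void completion_sim(int rc, const char *path);

/* Hypothetical async create: it only queues a result and returns. */
static void fake_acreate(const char *path, completion_fn fn) {
    assert(!pending.full);
    pending.fn = fn;
    pending.path = path;
    pending.rc = (failures_left > 0) ? RC_CONNECTION_LOSS : RC_OK;
    if (failures_left > 0) {
        failures_left--;
    }
    pending.full = 1;
}

void create_parent_sim(const char *path) {
    depth++;
    if (depth > max_depth_seen) {
        max_depth_seen = depth;
    }
    attempts++;
    fake_acreate(path, completion_sim);
    depth--;  /* we return before any completion runs */
}

static void completion_sim(int rc, const char *path) {
    if (rc == RC_CONNECTION_LOSS) {
        create_parent_sim(path);  /* retry, as in create_parent_completion */
    } else if (rc == RC_OK) {
        created = 1;
    }
}

/* Stand-in event loop: delivers queued completions until none remain. */
void run_event_loop(void) {
    while (pending.full) {
        pending.full = 0;
        pending.fn(pending.rc, pending.path);
    }
}
```

Calling create_parent_sim("/workers") followed by run_event_loop() issues three attempts in this setup, and max_depth_seen never exceeds 1, because each attempt returns before its completion is delivered.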

The next task is to run for master. Running for master basically involves trying to create the /master znode to lock in the primary master role. There are a few differences from the asynchronous create call we just discussed for parent znodes, though:

void run_for_master() {
    snprintf(server_id_string, 9, "%x", server_id);
    zoo_acreate(zh,
                ...

Store the server identifier in the /master znode.

We have to pass the length of the data being stored. It is an int, as we have declared here.

This znode is ephemeral, so we have to pass the ephemeral flag.
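The formatting step at the top of run_for_master can be looked at in isolation. The helper below is a minimal sketch, not the book's code: format_server_id and SERVER_ID_BUF_LEN are hypothetical names, and the sketch assumes a 32-bit identifier, which is why nine bytes (eight hexadecimal digits plus the terminating NUL) are enough. The returned int is the kind of length value that would then be passed along with the data; whether the stored length should also count the terminating NUL is a choice this sketch does not make.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SERVER_ID_BUF_LEN 9  /* 8 hex digits for a 32-bit value + NUL */

/* Formats server_id as lowercase hexadecimal into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if formatting failed or the buffer was too small. */
int format_server_id(unsigned int server_id, char *buf, size_t buf_len) {
    assert(buf != NULL);
    int n = snprintf(buf, buf_len, "%x", server_id);
    if (n < 0 || (size_t) n >= buf_len) {
        return -1;
    }
    return n;
}
```

Checking the return value of snprintf guards against silent truncation if the buffer size and the identifier width ever drift apart.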

The completion function also has to do a bit more than the earlier one:

void master_create_completion (int rc, const char *value, const void *data) {
    switch (rc) {
        ...
            break;
        case ZNODEEXISTS:
            master_exists();
            break;
        default:
            LOG_ERROR(LOGCALLBACK(zh),
                      "Something went wrong when running for master.");
            break;
    }
}

Upon connection loss, check whether a master znode has been created by this master or some other master.

If we have been able to create it, then take leadership.

If the master znode already exists (someone else has taken the lock), run a function to watch for the later disappearance of the znode.
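The three outcomes described in these notes amount to a small decision table. The sketch below restates it as a pure function so it can be read and tested without the client library. All names here (the SIM_* codes, the next_step values, and after_master_create) are hypothetical and are not taken from the original listing.

```c
#include <assert.h>

/* Stand-ins for the return codes discussed above (hypothetical values). */
enum rc_sim { SIM_OK, SIM_CONNECTION_LOSS, SIM_NODE_EXISTS, SIM_OTHER };

/* What the master should do next after trying to create /master. */
enum next_step { CHECK_MASTER, TAKE_LEADERSHIP, WATCH_MASTER, LOG_FAILURE };

enum next_step after_master_create(int rc) {
    assert(rc >= SIM_OK && rc <= SIM_OTHER);
    switch (rc) {
        case SIM_CONNECTION_LOSS:
            /* We cannot tell whether the create went through. */
            return CHECK_MASTER;
        case SIM_OK:
            /* We hold /master, so we are the primary. */
            return TAKE_LEADERSHIP;
        case SIM_NODE_EXISTS:
            /* Someone else holds /master; wait for it to disappear. */
            return WATCH_MASTER;
        default:
            return LOG_FAILURE;
    }
}
```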

If this master finds that /master already exists, it proceeds to set a watch with a call to zoo_awexists:

void master_exists() {
    zoo_awexists(zh,
                 "/master",
                 master_exists_watcher,
                 NULL,
                 master_exists_completion,
                 NULL);
}

Defines the watcher for /master.

Callback for this exists call.

Note that this call also allows us to pass a context to the watcher function. Although we do not use it in this case, the context is a (void *) pointing to some structure or variable that represents the context of this call.

Our implementation of the watcher function that processes the notification when the znode is deleted is the following:

void master_exists_watcher (zhandle_t *zh,
                            int type,
                            int state,
                            const char *path,
                            void *watcherCtx) {
    if( type == ZOO_DELETED_EVENT) {
        assert( !strcmp(path, "/master") );
        run_for_master();
    } else {
        LOG_DEBUG(LOGCALLBACK(zh),
                  "Watched event: %s", type2string(type));
    }
}

If /master gets deleted, run for master.

Back to the master_exists call. The completion function we implement is simple and follows the pattern we have been using thus far. The one small but important detail to note is that between the execution of the call to create /master and the execution of the exists request, it is possible that the /master znode has been deleted (i.e., that the previous primary master has gone away). Consequently, the completion function verifies that the znode exists and, if it does not, the client runs for master again:

void master_exists_completion (int rc,
                               const struct Stat *stat,
                               const void *data) {
    switch (rc) {
        case ZCONNECTIONLOSS:
        case ZOPERATIONTIMEOUT:
            master_exists();
            break;
        case ZOK:
            if(stat == NULL) {
                LOG_INFO(LOGCALLBACK(zh),
                         "Previous master is gone, running for master");
                run_for_master();
            }
            break;
        default:
            LOG_WARN(LOGCALLBACK(zh),
                     "Something went wrong when executing exists: %s",
                     rc2string(rc));
            break;
    }
}

Checks whether the znode exists by checking whether stat is null.

Runs for master again if the znode is gone.
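The window between the failed create and the exists request can be replayed with a small simulation. This is a sketch under stated assumptions, not the client library: the /master znode is modeled as a single flag, and stat_sim, run_for_master_sim, exists_sim, and exists_completion_sim are hypothetical stand-ins that run synchronously.

```c
#include <assert.h>
#include <stddef.h>

struct stat_sim { int version; };  /* stand-in for struct Stat */

int master_present = 1;   /* simulated /master, held by another master */
int is_primary = 0;       /* set once this candidate creates /master */
int create_attempts = 0;

/* Simulated attempt to create /master: succeeds only if it is absent. */
void run_for_master_sim(void) {
    assert(!is_primary);  /* a primary has no reason to run again */
    create_attempts++;
    if (!master_present) {
        master_present = 1;
        is_primary = 1;
    }
}

/* Simulated exists request: yields NULL when /master is absent. */
const struct stat_sim *exists_sim(void) {
    static const struct stat_sim s = { 0 };
    return master_present ? &s : NULL;
}

/* Mirrors the check in the completion above: a NULL stat means the
 * znode is gone, so the candidate runs for master again. */
void exists_completion_sim(const struct stat_sim *stat) {
    if (stat == NULL) {
        run_for_master_sim();
    }
}
```

Replaying the scenario: the first run_for_master_sim() fails because /master exists, the previous primary then goes away (master_present = 0), and exists_completion_sim(exists_sim()) sees a NULL stat and triggers a second attempt that succeeds. Without the NULL check, the candidate would wait on a watch for a znode that is already gone.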

Once the master determines it is the primary, it takes leadership, as we explain next.