Understanding the physical layout: nodes and shards

Diving into the functionality

2.2 Understanding the physical layout: nodes and shards

Understanding how data is physically laid out boils down to understanding how Elasticsearch scales. In this section, we’ll explain how scaling works by looking at how multiple nodes work together in a cluster, how data is divided in shards and replicated, and how indexing and searching works with multiple shards and replicas.

To understand the big picture, let’s review what happens when an Elasticsearch index is created. By default, each index is made up of five primary shards, each with one replica, for a total of ten shards. Each shard is a part of your index, so roughly 20% of your data is found on each shard. Those five primary shards are replicated into five more shards, as illustrated in figure 2.3.

#A Arrow from the left to Node1: A node is an instance of Elasticsearch

#B Arrow from the top to primary shard 2 of Node 2: A primary shard is a chunk of your index

#C Arrow from the right to replica 2 of Node 3: A replica is a copy of a primary shard

Figure 2.3 A three-node cluster with an index divided into five shards, with one replica per shard

As we’ll explore next, replicas are good for reliability and search performance. Technically, a shard is a directory of files where Lucene stores the data for your index. A shard is also the smallest unit that Elasticsearch moves from node to node.

2.2.1 Creating a cluster of one or more nodes

A node is an instance of Elasticsearch. When you start Elasticsearch on your server, you have a node. If you start Elasticsearch on another server, it’s another node. You can even have more nodes on the same server, by starting multiple Elasticsearch processes.

Multiple nodes can join the same cluster. With a cluster of multiple nodes, the same data can be spread across multiple servers. This helps performance because Elasticsearch has more resources to work with. It also helps reliability: if you have at least one replica per shard, any node can disappear, and Elasticsearch will still serve you all the data. For an application that’s

using Elasticsearch, having one or more nodes in a cluster is transparent. Applications typically care only about documents, types and indices, not shards and nodes.

Nodes in a cluster are like dancers in a show

You might prefer skaters, or actors, or any other type of performer. Either way, the same way you can have one or more dancers in a show, you can have one or more nodes in a cluster.

Similar to shows, the minute you add a second node in a cluster, you need to make sure that they can communicate to each other. Only then can they share the work. The same way a show remains a single show, designed to entertain the viewer, a cluster remains a cluster, designed to serve the same piece of data.

With Elasticsearch, you typically start with one node so you can test your application and add more nodes as your data grows and you need more performance. We take a deeper look at how you can add nodes to your cluster in chapter 9.

WHAT HAPPENS WHEN YOU INDEX A DOCUMENT?

By default, when you index a document, it’s first sent to one of the primary shards, which is chosen based on a hash of the document’s ID. Then, the document is sent to be indexed in all of that primary shard’s replicas (see left side of figure 2.4). This keeps replicas in sync with data from the primary shards. Being in sync allows replicas to serve searches and to be automatically promoted to primary shards in case the original primary becomes unavailable.

Figure 2.4 Documents are indexed to random primary shards and their replicas. Searches run on complete sets of shards, regardless of their status as primaries or replicas.

WHAT HAPPENS WHEN YOU SEARCH AN INDEX?

When you search an index, Elasticsearch has to look in a complete set of shards for that index (see right side of figure 2.4). Those shards can be either primary or replicas because primary and replica shards typically contain the same documents. Elasticsearch distributes the search load between the primary and replica shards of the index you’re searching, making replicas useful for both search performance and fault tolerance.

Next, we’ll look at the details of what primary and replica shards are and how they’re allocated in an Elasticsearch cluster.

2.2.2 Understanding primary and replica shards

Let’s start with the smallest unit Elasticsearch deals with, a shard. A shard is a Lucene index:

a directory of files containing an inverted index. An inverted index is a structure that enables Elasticsearch to tell you which document contains a term (a word) without having to look at all the documents.

In Figure 2.5, you can see what sort of information the first primary shard of your get-together index may contain. The shard get-get-together0, as we’ll call it from now on, is a Lucene index—an inverted index. By default, it stores the original document’s content plus additional information, such as term dictionary and term frequencies, which helps searching.

The term dictionary maps each term to identifiers of documents containing that term (see figure 2.5). When searching, Elasticsearch doesn’t have to look through all the documents for that term—it uses this dictionary to quickly identify all the documents that match.

Term frequencies give Elasticsearch quick access to the number of appearances of a term in a document. This is important for calculating the relevancy score of results. For example, if you search for “denver”, documents that contain “denver” many times are typically more relevant. Elasticsearch gives them a higher score, and they appear higher in the list of results.

# Arrow from the top to the black title: A shard is a Lucene index Figure 2.5 Term dictionary and frequencies in a Lucene index

A shard can be either a primary or a replica shard, with replicas being exactly that–copies of the primary shard. A replica is used for searching, or it becomes a new primary shard if the original primary shard is lost.

An Elasticsearch index is made up of one or more primary shards and zero or more replica shards. In Figure 2.6, you can see that the Elasticsearch get-together index is made up of six total shards: two primary shards (darker boxes), and two replicas for each shard (lighter boxes) for a total of four replicas.

Figure 2.6 Multiple primary and replica shards make up the get-together index

NOTE You can change the number of replicas per shard at any time because replicas can always be created or removed. This doesn’t apply to the number of primary shards an index is divided into: you have to decide on the number of shards before creating the index. Keep in mind that too few shards limit how much you can scale, and too many shards impact performance. The default setting of five is typically a good start, and we expand the subject in chapter 9, which is all about scaling.

All the shards and replicas you’ve seen so far are distributed to nodes within an Elasticsearch cluster. Next, we’ll look at some details about how Elasticsearch distributes shards and replicas in a cluster having one or more nodes.

2.2.3 Distributing shards in a cluster

The simplest Elasticsearch cluster is one having one node: one machine running one Elasticsearch process. When you first installed Elasticsearch in chapter 1 and started it, you created a one-node cluster.

As you add more nodes to the same cluster, the existing shards get balanced between all the nodes. As a result, both indexing and search requests that work with those shards benefit from the extra power of your added nodes. Scaling this way (by adding nodes to a cluster) is called horizontal scaling; you add more nodes, and requests are then distributed so they all share the work.

The alternative to horizontal scaling is to scale vertically; you add more resources to your Elasticsearch node, perhaps by dedicating more processors to it if it’s a virtual machine, or adding RAM to a physical machine. Although vertical scaling helps performance almost every time, it’s not always possible or cost-effective. Using shards enables you to scale horizontally.

Suppose you want to scale your get-together index, which currently has two shards and no replicas. The first option is to scale vertically by upgrading the node: for example, adding more RAM, more CPUs, faster disks and so on. The second option is to scale horizontally by adding another node and having your data distributed between the two nodes.

Figure 2.7 To improve performance, scale vertically (top left) or scale horizontally (lower right).

We discuss more about performance in chapter 10. For now, let’s see how indexing and searching work across multiple shards and replicas.

2.2.4 Distributed indexing and searching

At this point you might wonder how indexing and searching works with multiple shards spread across multiple nodes.

Let’s take indexing, as shown in figure 2.8. The Elasticsearch node that receives your indexing request first selects the shard to index the document to. By default, documents are distributed evenly between shards⁵.

Once the target shard is determined, the current node forwards the document to the node holding that shard. Subsequently, that indexing operation is replayed by all the replicas of that

5 By default, for each document, the shard is determined by hashing its ID string. Each shard has an equal chunk of the total hash range and receives, in normal conditions, an equal chunk of documents.

shard. The indexing command successfully returns after all the available replicas finish indexing the document.

Figure 2.8 Indexing operation is forwarded to the responsible shard, and then to its replicas

With searching, the node that receives the request forwards it to a set of shards containing all your data, it doesn’t matter if those shards are primaries or replicas. Elasticsearch uses a round-robin format to forward the request to the cluster’s nodes and shards. As shown in figure 2.9, Elasticsearch then gathers results from those shards, aggregates them into a single reply, and forwards the reply back to the client application.

Figure 2.9 Search request is forwarded to shards/replicas containing a complete set of data. Then, results are aggregated and sent back to the client.

Given that requests are sent to primary shards and replicas in round-robin fashion, Elasticsearch assumes that all nodes in your cluster are equally fast. You typically achieve this with identical hardware and software configurations. If that’s not the case, you can organize your data or configure your shards to prevent the slower nodes from becoming a bottleneck.

We explore more about such options in chapter 9. For now, let’s start indexing documents in the single-node Elasticsearch cluster that you started in chapter 1.

在文檔中 MEAP Edition Manning Early Access Program Elasticsearch in Action Version 11 (頁 34-40)