ZooKeeper Basic Principles
Overview
ZooKeeper is a distributed, highly available coordination service. It provides the following functions:
- Prevents single points of failure (SPOFs) and provides reliable services for applications.
- Provides distributed coordination services and manages configuration information.
Architecture
Nodes in a ZooKeeper cluster have three roles: Leader, Follower, and Observer, as shown in Figure 1. Generally, an odd number (2N+1) of ZooKeeper instances is configured in the cluster, and a write operation succeeds only when a majority of at least N+1 votes is obtained.
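The sizing rule above can be checked with a little arithmetic. The following is a minimal sketch, not part of ZooKeeper itself; the function names `quorum_size` and `tolerated_failures` are hypothetical and are introduced here only for illustration. It counts voting members only (the Leader and Followers), because Observers do not vote.

```python
def quorum_size(voting_members: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    if voting_members < 1:
        raise ValueError("a cluster needs at least one voting member")
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """Number of voting members that can fail while a majority remains."""
    return voting_members - quorum_size(voting_members)


# With 2N+1 voting members, the majority is N+1 and N failures are tolerated.
for n in (1, 2, 3, 4):
    members = 2 * n + 1
    print(members, quorum_size(members), tolerated_failures(members))
```

Under this rule, a cluster with 4 voting members tolerates only one failure, the same as a cluster with 3, which is one reason an odd number of instances is generally preferred.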
Table 1 describes the functions of each module shown in Figure 1.
Name | Description |
---|---|
Leader | Only one node serves as the Leader in a ZooKeeper cluster. The Leader, elected by Followers using the ZooKeeper Atomic Broadcast (ZAB) protocol, receives and coordinates all write requests and synchronizes written information to Followers and Observers. |
Follower | Follower has two functions: |
Observer | The Observer does not take part in voting for election and write requests. It only processes read requests and forwards write requests to the Leader, increasing system processing efficiency. |
Client | Reads and writes data from or to the ZooKeeper cluster. For example, HBase can serve as a ZooKeeper client and use the arbitration function of the ZooKeeper cluster to control the active/standby status of HMaster. |
If security services are enabled in the cluster, authentication is required when connecting to ZooKeeper. The authentication modes are as follows:
- Keytab mode: Obtain a human-machine user from the MRS cluster administrator for MRS console login and authentication, and obtain the Keytab file of the user.
- Ticket mode: Obtain a human-machine user from the MRS cluster administrator for subsequent secure login, enable the renewable and forwardable functions of the Kerberos service, set the ticket update period, and restart Kerberos and related components.
- By default, a user password is valid for 90 days, so the obtained Keytab file is also valid for 90 days.
- The parameters for enabling the renewable and forwardable functions and setting the ticket update interval are on the System tab of the Kerberos service configuration page. Set the ticket update interval using the kdc_renew_lifetime or kdc_max_renewable_life parameter based on the actual situation.
Principles
- Write Request
  - After a Follower or Observer receives a write request, it forwards the request to the Leader.
  - The Leader coordinates the Followers to decide by voting whether to accept the write request.
  - If more than half of the voters return a write success message, the Leader commits the write request and returns a success message. Otherwise, it returns a failure message.
  - The Follower or Observer returns the processing result to the client.
- Read-Only Request
  - The client reads data directly from the Leader, Follower, or Observer.
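The request flow above can be sketched as a small in-memory simulation. This is an illustrative model only and does not use the ZooKeeper client API or implement ZAB; the `Node` and `Ensemble` classes and the `healthy` flag are hypothetical names, with `healthy` standing in for whether a node responds to the Leader's proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True  # hypothetical flag: does the node respond?
    data: dict = field(default_factory=dict)


class Ensemble:
    def __init__(self, leader, followers, observers=()):
        self.leader = leader
        self.followers = list(followers)
        self.observers = list(observers)

    def write(self, received_by, key, value):
        # Step 1: the receiving Follower or Observer forwards the request
        # to the Leader (modelled here as a direct call).
        voters = [self.leader] + self.followers  # Observers do not vote
        # Step 2: the Leader collects votes from the voting members.
        acks = sum(1 for node in voters if node.healthy)
        # Step 3: commit only if more than half of the voters agree.
        committed = acks > len(voters) // 2
        if committed:
            for node in voters + self.observers:
                if node.healthy:
                    node.data[key] = value
        # Step 4: the receiving node reports the result to the client.
        return f"{received_by.name}: {'success' if committed else 'failure'}"

    def read(self, served_by, key):
        # Reads are answered locally by whichever node the client contacts.
        return served_by.data.get(key)


leader = Node("leader")
followers = [Node("f1"), Node("f2", healthy=False)]
observer = Node("obs")
zk = Ensemble(leader, followers, [observer])
print(zk.write(observer, "/config/mode", "active"))
print(zk.read(observer, "/config/mode"))
```

In this example two of the three voting members respond, so the write is committed and the Observer can serve the value on a later read. If a second voting member were marked unhealthy, the same call would report a failure and no node would store the value.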
Typical Specifications
The following table lists the typical specifications of the ZooKeeper service.
Specification | Value | Description |
---|---|---|
Maximum number of ZooKeeper instances in a cluster | 9 | - |
Maximum number of connections per IP address for each ZooKeeper instance | 2000 | - |
Maximum number of connections for a ZooKeeper instance | 20000 | - |
Maximum number of ZNodes with the default configuration | 2000000 | Too many ZNodes make the service unstable and degrade the read and write performance of the component. In regular service scenarios, it is recommended that there be no more than 2 million ZNodes. If only ClickHouse and its dependent components are deployed in the cluster, there can be no more than 6 million ZNodes. |
Size of a single ZNode | 4 MB | - |
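As a rough illustration of how these specifications might be used in a capacity check, the sketch below compares observed values against the limits in the table. The `LIMITS` dictionary keys and the `exceeded` function are hypothetical names, the observed values are made up for the example, and the 4 MB limit is assumed to mean 4 x 1024 x 1024 bytes.

```python
# Values taken from the specification table; key names are hypothetical.
LIMITS = {
    "instances": 9,
    "connections_per_ip": 2000,
    "connections_per_instance": 20000,
    "znodes": 2_000_000,
    "znode_bytes": 4 * 1024 * 1024,  # assumes 1 MB = 1024 * 1024 bytes
}


def exceeded(observed: dict) -> list:
    """Return the names of metrics whose observed value is above its limit."""
    return [name for name, limit in LIMITS.items()
            if observed.get(name, 0) > limit]


# Made-up observations for illustration only.
observed = {"instances": 5, "znodes": 2_500_000, "znode_bytes": 512 * 1024}
print(exceeded(observed))
```

Metrics missing from `observed` are treated as zero, so only values that were actually measured can trigger a warning. For a ClickHouse-only cluster, the `znodes` entry would need to be raised to 6 million to match the table.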