Lab Day II: Hack ZooKeeper

Objective

This lab guides students through the internals of Apache ZooKeeper, focusing on core components such as SessionTracker and the internal processing of ephemeral nodes. Students will compile ZooKeeper from source and dig into the codebase to see how ZooKeeper works under the hood.

We recommend forming groups of 3-4 students to do this lab together. Each team plays against another team.

Prerequisites

  • Intermediate knowledge of Java, as ZooKeeper is primarily Java-based.
  • Basic familiarity with Git for cloning the repository.
  • Apache Maven (a Java build and dependency management tool) installed for building ZooKeeper from source. The latest version can be found on the official website.

Part 1: Setting Up

Install Apache Maven

Please follow the guide on the official website. For example, to install on Ubuntu, run the following command:

sudo apt install git maven

Compile ZooKeeper from Source

  1. Clone the latest ZooKeeper code from GitHub:
    git clone https://github.com/apache/zookeeper.git
  2. Compile the ZooKeeper code:
    cd zookeeper
    mvn clean install -DskipTests
  3. Follow the same steps as in the previous lecture to configure ZooKeeper under conf/ (the server reads conf/zoo.cfg by default):
    tickTime=2000
    dataDir=/path/to/zookeeper/data
    clientPort=2181
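
The three settings above are plain key=value pairs, and tickTime matters later in this lab because session timeouts are bounded in units of ticks. The following is a minimal sketch, not ZooKeeper's own loader: it parses the same text with java.util.Properties and computes the documented default bounds on a negotiated session timeout (2 and 20 ticks). The class and method names are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the ZooKeeper codebase.
class ZooCfgSketch {
    /** Parses zoo.cfg-style "key=value" text into a Properties object. */
    static Properties parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Default lower bound for a session timeout: 2 ticks. */
    static int minSessionTimeoutMs(int tickTimeMs) {
        return 2 * tickTimeMs;
    }

    /** Default upper bound for a session timeout: 20 ticks. */
    static int maxSessionTimeoutMs(int tickTimeMs) {
        return 20 * tickTimeMs;
    }

    public static void main(String[] args) {
        String cfg = String.join("\n",
                "tickTime=2000",
                "dataDir=/path/to/zookeeper/data",
                "clientPort=2181");
        Properties props = parse(cfg);
        int tick = Integer.parseInt(props.getProperty("tickTime"));
        System.out.println("clientPort = " + props.getProperty("clientPort"));
        System.out.println("session timeout bounds (ms) = "
                + minSessionTimeoutMs(tick) + " to " + maxSessionTimeoutMs(tick));
    }
}
```

With tickTime=2000, a client asking for a session timeout outside 4000 to 40000 ms would have it clamped to that range, assuming minSessionTimeout and maxSessionTimeout are left at their defaults.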

Start the ZooKeeper Server and Test It

  • Run the following commands in the terminal:
    bin/zkServer.sh start
    bin/zkServer.sh status
  • Use the client to test a few commands and check that the server handles requests correctly:
    bin/zkCli.sh -server 127.0.0.1:2181
    create /myfirstznode "My first ZooKeeper data"
    ls /
  • If you are interested, take a look at the log output under logs/ to see what happened inside the server.

Part 2: New Quest: Immortal Ephemeral Nodes

In this part you will learn how to change the system's behavior. First, let’s understand session management in ZooKeeper.

In ZooKeeper, a session represents a connection between a client and the server. Sessions are essential for maintaining the state of interactions, and their management is critical for the overall reliability and consistency of the system. ZooKeeper tracks sessions to manage ephemeral nodes, which are tied to the lifecycle of each session.

Overview of SessionTrackerImpl.java

SessionTrackerImpl.java (zookeeper-server/src/main/java/org/apache/zookeeper/server/SessionTrackerImpl.java) is part of the ZooKeeper server’s source code that specifically handles session tracking. It implements the SessionTracker interface, providing the functionality to track the creation, expiration, and renewal of sessions.

Key Responsibilities

  • Session Creation: When a new client connects to a ZooKeeper server, a session is created. SessionTrackerImpl assigns a unique session ID and tracks the session’s timeout settings.
  • Session Expiration: If a client does not send a heartbeat within the specified timeout period, SessionTrackerImpl marks the session as expired. This triggers the deletion of any ephemeral nodes associated with the session.
  • Session Renewal: Clients send heartbeats to keep their sessions active. SessionTrackerImpl updates its internal tracking to reflect the latest activity, preventing the session from expiring prematurely.

Core Components and Processes

Session Tracking Mechanism

  • Session Map: SessionTrackerImpl maintains a map of session IDs to session objects, which contain the session’s timeout information and last activity timestamp.
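  • Timeout Handling: The implementation manages session timeouts with a bucketed expiry queue, similar in spirit to a timing wheel. Each session's expiration time is rounded up to the next multiple of a fixed interval, so sessions that expire within the same interval share a bucket and can be updated and checked together.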
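
The bucketing idea behind timeout handling can be sketched in a few lines. This is a simplified model under stated assumptions, not the actual ZooKeeper class: the names ExpiryBucketSketch, add, and sessionsIn are hypothetical, and time is passed in explicitly instead of being read from a clock so the arithmetic is easy to follow.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of bucketed session expiry; names are illustrative only.
class ExpiryBucketSketch {
    private final long intervalMs;
    // Bucket expiry time (ms) -> IDs of the sessions that expire in that bucket.
    private final Map<Long, Set<Long>> buckets = new HashMap<>();

    ExpiryBucketSketch(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    /** Rounds a time up to the next multiple of the interval (always strictly later). */
    long roundToNextInterval(long timeMs) {
        return (timeMs / intervalMs + 1) * intervalMs;
    }

    /** Places a session in the bucket covering nowMs + timeoutMs and returns that bucket. */
    long add(long sessionId, long nowMs, long timeoutMs) {
        long bucket = roundToNextInterval(nowMs + timeoutMs);
        buckets.computeIfAbsent(bucket, k -> new HashSet<>()).add(sessionId);
        return bucket;
    }

    /** Returns the sessions stored in a bucket (empty if the bucket does not exist). */
    Set<Long> sessionsIn(long bucket) {
        return buckets.getOrDefault(bucket, Collections.emptySet());
    }

    public static void main(String[] args) {
        ExpiryBucketSketch q = new ExpiryBucketSketch(2000);
        // Raw expiry times 6000 and 6700 both round up to the 8000 bucket.
        System.out.println(q.add(1L, 1000, 5000));
        System.out.println(q.add(2L, 1500, 5200));
        // Raw expiry time 10000 rounds up to the 12000 bucket.
        System.out.println(q.add(3L, 1000, 9000));
        System.out.println(q.sessionsIn(8000L).size());
    }
}
```

Because many sessions land in the same bucket, an expiry check only needs to look at whole buckets that are due, not at every session individually.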

Expiration and Cleanup

  • Expiration Queue: Tracked sessions are held in an expiration queue (sessionExpiryQueue), grouped by expiration time. A background thread periodically checks this queue to determine which sessions have expired.
  • Ephemeral Node Deletion: On session expiration, SessionTrackerImpl notifies the ZooKeeper server to remove any ephemeral nodes associated with the expired session.
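
To make the link between expiration and ephemeral node cleanup concrete, here is a toy model of one pass of the background check. It is a sketch under stated assumptions: Expirer, ToyDataTree, and ToyTracker are hypothetical names, the data tree is reduced to a map from path to owner session ID, and an owner of 0 is used to mark a persistent node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model; it does not use or mirror the real ZooKeeper classes.
class ExpirationSketch {
    /** Callback invoked by the tracker for each expired session. */
    interface Expirer {
        void expire(long sessionId);
    }

    /** Toy data tree: znode path -> owner session ID (0 means persistent). */
    static class ToyDataTree implements Expirer {
        final Map<String, Long> ownerByPath = new TreeMap<>();

        void create(String path, long ownerSessionId) {
            ownerByPath.put(path, ownerSessionId);
        }

        /** Removes every ephemeral node owned by the expired session. */
        @Override
        public void expire(long sessionId) {
            ownerByPath.values().removeIf(owner -> owner == sessionId);
        }
    }

    /** Toy tracker: session ID -> absolute expiry time in milliseconds. */
    static class ToyTracker {
        final Map<Long, Long> expiryBySession = new HashMap<>();
        final Expirer expirer;

        ToyTracker(Expirer expirer) {
            this.expirer = expirer;
        }

        void track(long sessionId, long expiryMs) {
            expiryBySession.put(sessionId, expiryMs);
        }

        /** One pass of the periodic check: expire every session that is due. */
        List<Long> expireDue(long nowMs) {
            List<Long> expired = new ArrayList<>();
            Iterator<Map.Entry<Long, Long>> it = expiryBySession.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Long, Long> entry = it.next();
                if (entry.getValue() <= nowMs) {
                    long sessionId = entry.getKey();
                    it.remove();                 // stop tracking the session
                    expirer.expire(sessionId);   // let the server clean up its nodes
                    expired.add(sessionId);
                }
            }
            return expired;
        }
    }

    public static void main(String[] args) {
        ToyDataTree tree = new ToyDataTree();
        ToyTracker tracker = new ToyTracker(tree);
        tree.create("/myfirstznode", 0L); // persistent
        tree.create("/e1", 1L);           // ephemeral, owned by session 1
        tracker.track(1L, 5000);
        System.out.println(tracker.expireDue(4000) + " " + tree.ownerByPath.keySet());
        System.out.println(tracker.expireDue(5000) + " " + tree.ownerByPath.keySet());
    }
}
```

The point of the callback is separation of concerns: the tracker only decides when a session is dead, and the component that owns the data decides what to delete.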

Heartbeat Processing

  • Heartbeat Reception: Clients send periodic heartbeats to keep their sessions alive. SessionTrackerImpl processes these heartbeats, updating the session’s last activity timestamp.
  • Session Renewal Logic: The tracker uses the updated activity timestamp to adjust the session’s position in the expiration queue, effectively renewing the session.
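
Renewal can be pictured as moving a session to a later bucket each time a heartbeat arrives. The sketch below is a simplified illustration, not the real update logic: SessionRenewalSketch, touch, and isDue are hypothetical names, the current time is passed in explicitly, and a touch that would not move the session to a later bucket is treated as a no-op.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of heartbeat-driven renewal; names are illustrative only.
class SessionRenewalSketch {
    private final long intervalMs;
    private final Map<Long, Long> bucketBySession = new HashMap<>();
    private final Map<Long, Set<Long>> sessionsByBucket = new HashMap<>();

    SessionRenewalSketch(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    /**
     * Registers or renews a session on a heartbeat.
     * Returns the new bucket time, or null if the session stays where it is.
     */
    Long touch(long sessionId, long nowMs, long timeoutMs) {
        long newBucket = ((nowMs + timeoutMs) / intervalMs + 1) * intervalMs;
        Long oldBucket = bucketBySession.get(sessionId);
        if (oldBucket != null && newBucket <= oldBucket) {
            return null; // heartbeat does not push the deadline any later
        }
        if (oldBucket != null) {
            Set<Long> old = sessionsByBucket.get(oldBucket);
            old.remove(sessionId);
            if (old.isEmpty()) {
                sessionsByBucket.remove(oldBucket);
            }
        }
        sessionsByBucket.computeIfAbsent(newBucket, k -> new HashSet<>()).add(sessionId);
        bucketBySession.put(sessionId, newBucket);
        return newBucket;
    }

    /** True if the session's bucket time has been reached. */
    boolean isDue(long sessionId, long nowMs) {
        Long bucket = bucketBySession.get(sessionId);
        return bucket != null && bucket <= nowMs;
    }

    public static void main(String[] args) {
        SessionRenewalSketch q = new SessionRenewalSketch(2000);
        System.out.println(q.touch(7L, 0, 4000));    // first registration
        System.out.println(q.touch(7L, 500, 4000));  // same bucket, nothing to do
        System.out.println(q.touch(7L, 3000, 4000)); // moved to a later bucket
        System.out.println(q.isDue(7L, 6000));       // old deadline no longer applies
    }
}
```

A client that keeps sending heartbeats keeps hopping to later buckets, so it never sits in a bucket that comes due.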

Exercise Task: Modify the ZooKeeper Code to Make Sessions (and Thus Ephemeral Nodes) Immortal

Modify the code in SessionTrackerImpl.java so that sessions do not expire after clients disconnect.

Hint: take a look at the code related to sessionExpiryQueue.

To test, rebuild ZooKeeper and restart the server, then run:

bin/zkCli.sh -server 127.0.0.1:2181
create -e /e1

Quit the client, wait 30 seconds, and start a new one:

bin/zkCli.sh -server 127.0.0.1:2181
ls /

If you succeeded, the ephemeral node should still exist instead of disappearing after the timeout:

[e1, zookeeper]

Game: Counter Strike, details to be announced

We will announce the task in the middle of class.

Resources