Project 2: Randomized Consensus Protocol

Implement Ben-Or's randomized consensus protocol

Ben-Or Randomized Consensus Protocol:

Each peer implements the protocol below.  See also page 85 of the Consensus slide deck on the Schedule.

input:= random boolean initial value, [0,1]
output:= boolean final consensus value

n:= number of nodes in the system
f:= number of faulty nodes permitted; n/2 > f


begin

  pref = input
  round = 1

    while true
        do_Protocol
    end

end

do_Protocol:
  send {"round":round, "pref": pref, "phase": 1, "ratify":0} message to all peers.
  wait to receive (n – f) (phase 1)  messages
  if received more than (n / 2) ( phase 1) messages with same pref
      send ("phase": 2, "round":round, "pref": pref, "ratify":1) to all processes
  else
      send ("phase":2, "round":round, "pref": -1, "ratify":0) to all processes

  wait to receive (n – f) (phase 2) messages

  if received 1 ("phase":2, "round":round, "pref":v, "ratify":1) message
      pref = v
      if received more than f ("phase":2, "round":round, "pref":v, "ratify":1) messages
           output = v
           print DONE:output
           exit
  else
           preference = CoinFlip()
  round = round+1

def CoinFlip() := random selection of 0 or 1


All peers communicate over a single UDP socket on the default port 50000: 

Your program should print **exactly one line of output:DONE: [output value, 0 or 1]; Round: [round number].  **
There must be NO extra debugging output, or you will lose points.  Use --verbose to enable debugging output in your implementation.

Your program should have a deterministic result, i.e., it should always find the correct solution, and none of the peers should crash.  No more than f peers should time out during any run. 

Your program:

Your program, proj2, must execute on the command line using the following command.

$ ./proj2 [-h] --peers PEERS [PEERS ...] [--port PORT] [--f number of faulty nodes allowed] [--timeout max runtime in seconds] [--verbose]

--peers , --f, and --port is required on the command line.
--timeout(default 300 seconds), and --verbose must be implemented, but are optional on the command line.  

A Dockerfile and several Docker Compose files, proj2-compose-5.yml, proj2-compose-11.yml, and proj2-compose-21.ymlare supplied in the starter code. 

To run and build the containers, the FAULTS environment variable must be set, so the value of –f is passed to the compose file.  For example, for bash or sh: 

  • export FAULTS=[num faults]; docker compose -f proj2-compose-[5,11,21].yml up --build 
To run the container after building:
  • export FAULTS=[num faults]; docker compose -f proj2-compose-[5,11,21].yml up

Asynchronous communication:

We encourage you to write your code in an event-driven style, using select() or poll() on the datagram socket to which the peer is connected (see the starter code for an example). This will keep your code single-threaded, making debugging significantly easier. Alternatively, you can implement your router in a threaded model (with one thread handling each socket), but expect it to be substantially more challenging to debug.

Starter code:

A template repository containing the Dockerfile and Docker compose files is available on the Khoury GitHub server.    

Runtime Analysis:

Once your implementation is running, perform the following runtime analysis.  For 5-, 11-, and 21-node systems, determine the distribution of the number of rounds needed for the system to reach consensus.  The table below lists the values of f to use for your tests.  For each value of f, the system should be run at least 12 times.  More is OK.  The number of runs must be documented in the written evaluation.  Using a whisker or box plot, display the min, max, 1st, and 3rd quartiles of the number of runs to consensus for each configuration.  Plot the results of all the configurations on the same graph; the x-axis is the configuration, and the y-axis is the number of runs. In your written analysis of the results, briefly explain the differences in the individual runs per system given, f, and across systems with different values of n

`n`, Number of Nodes in System `f`, Number of Failures  Number of runs
5 1
2
11 1
2
5
21 1
4
10

Submitting Your Project

To turn in your project, you should submit the following items:

  • A Dockerfile
  • The compose files proj2-compose-*.yml
  • A PDF-formatted README.pdf file. In this file, briefly describe your high-level approach, any challenges you faced, an overview of how you tested your code, and the analysis graphic with a short explanation of your findings (graph 1/2 page, explanation [1/2 page, no more than a page]).
  • The source code of the implementation and any files needed to compile it.  If the source code is a script, confirm it is executable.

These files should all be placed in the root directory of a compressed archive (i.e., a .zip file) and then uploaded to Gradescope. Alternatively, you can check all these items into Khoury’s GitHub, download a zip file from Khoury’s GitHub, and submit that to Gradescope, or any other upload/submission option Gradescope permits.  If your submission is missing any required file for evaluation, it will receive 0 points.

There is NO autograder for this project.  Your submission will be evaluated by hand.  

Grading

The grading of this project will be broken down as follows:

Item Percentage of Grade
Program correctness 70%
Analysis 15%
Style and documentation 15%

Questions:

Ask on Piazza or come to office hours (the TA’s or Prof. Jackson’s).