## Description

## Cryptanalysis of RC4 encryption algorithm

The objective of this assignment is to do some simple cryptanalysis of the RC4 encryption algorithm. RC4

is a stream cipher, internally generating a stream of random bits which is then XOR-ed with the plaintext.

The secret key initializes the random number generator. However, on observation there appears to be a

weakness in RC4. Perhaps two keys that are very similar will result in similar random numbers coming out

of RC4. In this assignment you will implement RC4, make some differential measurements of randomness

for related keys, and you will graph your results. For details about RC4 you can refer wiki RC4.

Credits\Courtesy –

https://www.cs.rice.edu/ dwallach/courses/comp527 s99/ass4.html

## 1 Assignment Details

### 1.1 The Basics

You are going to first implement the RC4 algorithm. You can find some code at the RC4 in three lines of Perl

site (which also includes a full, readable C version). This assignment is a form of related-key cryptanalysis.

With a stream cipher like RC4, it should be the case that, if you toggle exactly one bit in the key, 50% of

output bits should toggle. If you look closely at the key setup, it doesn’t look like this is the case. It appears

that single-bit differences in the key will result in fairly small differences in the internal state of RC4. After

a sufficient amount of output, this small change will eventually propagate and then you would expect 50%

of the output bits to toggle.

### 1.2 Implement the cipher

Don’t worry about XOR-ing the output bits with a message you’re going to keep secret. We’re only going

to study the random bits. Make sure you can accept keys of length up to 2048 bits.

### 1.3 Capture the output bits from two runs and XOR them

Generate a random key of 2048 bits (Unix’s random(3C) is good enough for now). Collect some output

from RC4. Toggle one or more of the key bits. Reinitialize RC4 and collect the output again. XOR them

1

together. This gives you the difference of the two streams. If RC4 were ”perfect”, this difference stream

would be ”perfectly” random.

### 1.4 Analyze the differential output bits for randomness

There are many tests for randomness of numbers. You’re going to implement a simple frequency counting

test. Create an array of counters of length equal to a power of two (say, 256). Now, say the test data is a

sequence of bits (b0, b1, b2, b3…bN ).

You can look at each sequence of 8 bits (b0, b1, b2, b3…b9) as a number

and increment the appropriate counter in the array. When you’re done you would expect that the counters

would be approximately equal.

If the two original bit streams were very similar, you would expect the

counters for lots of zeros to have higher values than the counters for lots of ones.

You can compute a numerical measure of the randomness like so:

• N = number of samples.

• C = number of counters.

• D = standard deviation of counter values.

• R =(D ∗ C)/N;

The closer the randomness (R) is to zero, the more random the data. You might consider using different

numbers of counters and see if your randomness measure changes very much

### 1.5 Run this randomness test lots of times and collect the results

Measure the randomness on outputs ranging from short through long (i.e., 2 bytes, 4 bytes, 8 bytes, 32

bytes, 128 bytes, 1024 bytes). For each of these, you need to consider the effect of toggling one bit in the

key, two bits in the key, and so forth through 32 bits. For each of these pairs, you should make at least 20

measurements of the randomness and average the results. Your results will be more accurate if you make

several thousand measurements for each pair.

### 1.6 Graph the results

You should produce a graph with one line for each output length. The Y axis is the randomness (higher

values imply less randomness). The X axis is the number of bits you toggled. You would expect each line

to start somewhere above zero and, as you toggle more bits, approach zero quickly. You would also expect

to see higher values for the lines corresponding to shorter runs of data.

You can either use Unix tools like gnuplot or a spreadsheet like Excel to generate your graph.

## 2 What to Submit

Create a tar-gz file with name: Assignment1-RollNo1.tgz (e.g. Assignment1-CS17B099.tgz) that will contain

a directory named Assignment1-RollNo1 with all relevant files.

The directory should contain the following files:

• Source Files

2

• A Makefile which generates all your executables

• A technical REPORT (in PDF format) with help of screen shots showing the working of your implementation (Graphs). Report your observations and analyze the results, in 1-2 paragraphs. The report

should include your name, roll number, assignments.

Also, give a brief summary as to what you learnt in this experiment and how much beneficial you feel

the experiment was.

• a README file containing instructions to compile, run and test your program. README file should

be written such that a TA must be able to run your code without your presence/help.

## 3 Help

1. Ask questions EARLY and start your work NOW. Take advantage of the help of the TAs and the

instructor.

2. Submissions PAST the extended deadline SHOULD NOT be mailed to the TAs. Only submissions

approved by the instructor or uploaded to Moodle within the deadline will be graded.

3. Demonstration of code execution to the TAs MUST be done using the student’s code uploaded on

Moodle.

4. NO sharing of code between students, submission of downloaded code (from the Internet, Campus

LAN, or anywhere else) is allowed. Code copying will result in a ’U’ Course Grade. Students may

also be reported to the Campus Disciplinary Committee, which can impose additional penalties.

5. Please protect your Moodle account password. Do not share it with ANYONE. Do not share your

academic disk drive space on the Campus LAN.

6. Implement the solutions, step by step. Trying to write the program in one setting may lead to frustration and errors.

## 4 Grading

• Code : 40 points

• Report: 40 points

• Viva voce: 20 points

3