Project 3: Tree-based encryption and decryption
Due: Sunday, 10/2/2022, by 12:00pm noonYou will receive 100% for turning this project in and passing the functionality test and style check before the deadline. After the deadline is 25% off.
Total points: 90
Before you start this project, you should have read the project submission guidelines and the coding style rules.
After that, get started on the project immediately! You will submit your solution to the new submission site.
Project Description
- Implement an ordered data structure,
- Implement a binary search tree,
- Implement algorithms with constant, linear, and logarithmic time,
- Practice the use of data abstraction,
- Practice coding an encryption algorithm,
- Test your program in the command line environment with input/output redirection and diff utilities, and
- Get you to start using the new submission site for project uploads
Encryption is a procedure for turning information which anyone can understand into information which only people with knowledge of a special key can understand. Decryption is the reverse of this process. Many simple encryption schemes are based on substitutions. For example, we could "rotate the alphabet" by 1 letter, letting b represent a, c for b, and so on. Thus the phrase "four score and seven years ago" would be encrypted into "gpvs tdpsf boe tfwfo zfbst bhp," using this simple substitution cipher. The key needed to encrypt information in this type of substitution cipher is the knowledge that each letter is shifted forward one (and the key for decryption is the to shift each letter backward one). When a message is not encrypted, it is called "plaintext."
For this project you will implement a program which implements encryption and decryption using word substitutions rather than letter substitutions. Every word will be comprised only of lowercase alphabet characters.
The codebook (the key) for this substitution cipher will be a binary search tree. The tree will store all possible words that can be encrypted (or decrypted). The method for encryption will be a representation of the path that one takes from the root of the tree to find the word in the tree.
Picture a binary tree codebook that looks like this:score <-- root node / \ / \ / \ and seven / \ \ ago four years
Then let us encrypt the phrase we had earlier ("four score and seven years ago") by translating each word into the the path taken from the root of the tree to the word. The word "four" translates to "r01", since we start at the root (r) and go to its left child (0), then the right child (1) of that node. The word "score" translates to "r", since it is the root word. The entire phrase is encrypted into "r01 r r0 r1 r11 r00". Each word is encrypted by starting at the root (r), and searching. Each time we move to a left child as we go down the tree, we add a zero (0), and each time we move to a right child, we add a one (1).
The tree is, of course, ordered. Thus the word in each tree node should be greater (in lexicographic sorted order) than the words in all nodes in its left subtree, and lesser than the words in all nodes in its right subtree.
Notice that depending on the codebook (which is the tree), the same input message could be encrypted in many different ways.
To build a codebook, insert the words into the codebook one at a time. One way to build the example codebook above is to start with an empty tree, and then insert the following words in the given order: 'score', 'and', 'seven', 'ago', 'four', 'years' (which you may notice is a level-order traversal of the final tree). However, that is not the only way to form that tree. The order the words are inserted can change the structure of the tree.
Input and output format
Your program should read its input from standard input (System.in
) and
print to standard output (System.out
). Input will be a sequence of
commands which modify the codebook and give the unencrypted message, and output
will be the codebook(s) and the encrypted message(s). All input will be valid
input as specified here, and there will be no blank lines on the input.
i x
— insert word x into the codebook (here x represents any word)r x
— remove word x from the codebook (here x represents any word)e 'cleartext message'
— encrypt the given cleartext message. Each cleartext message will consist of words separated by a single space, and the entire message will be surrounded by single quotes.d 'encrypted message'
— decrypt the given encrypted message. Each encrypted message will have the same format as a normal message, but the words are encrypted (e.g. r00101).p
— print the codebook in preorder format (see section 4.6 in your book to learn about preorder tree traversal), visiting left children before right childrenq
— quit the program (stop processing input)
You may find the getline(istream
&,
string &)
function useful for getting an entire line at a
time (e.g.
after you read a command to encrypt or decrypt). However, this is not
necessary.
Also, messages on the input will not span multiple lines; they will be on the
same line as the 'e' or 'd' commands.
If the input asks the program to encrypt a word that is not in the codebook, the program should print out '?'.
The reason you must output the codebook in preorder is because it would allow the receiver of the encrypted message to reconstruct the codebook. The receiver could take the codebook you printed and insert the words into his codebook in the order you printed them, and obtain the same codebook.
Procedures for inserting and removing words
Insert each new word at the bottom of the tree, at the place where it would be sorted. For example, inserting the word 'zebra' into the example tree would place it as the right child of 'years'.
Removing a word must follow a particular strategy. Assuming that the word to remove is at tree node N:- If N has no children, then N is simply removed.
- If N has only one child, that child replaces N.
- If N has both children, then the left-most node in the right child of N is removed and used to replace N.
Sample inputs
Several sample inputs are available here. You can get the correct output by capturing the output of the sample executable on these inputs.Provided code
You must use the C++ file provided here. I have provided a lot of code in this file. You are not allowed to change the existing code in this file. Instead, write your cpp and submit the implementation file.
Remember that when using templates, all of the code you write goes in the .h
file. So you will turn in 2 files for this project: bst-student-proj3.h and
driver-proj3.cpp. Your driver should
#include "bst-student-proj3.h"
, and that file should
#include "bst-prof-proj3.h"
(which it already does).
The EncryptionTree's two methods (encrypt and decrypt) method should not print anything; instead they should return values to the caller (driver) who can then print something.
Sample executables
Here are sample executables for you. When you design test cases, you can judge your output against the output from the correct solution. Here are the compiled solutions: If you give a command-line argument to these executables, they will print extra information about how they are exploring. For example, this will execute the program like normal, redirecting input from a file called my_input.txt:% project3_solution_dos.exe < my_input.txt
But here is the mode of operation that will cause the program to print out what it is doing in more detail:
% project3_solution_dos.exe printMore < my_input.txt
The command-line argument doesn't have to be the word "printMore", it can be anything.
Milestones
Since this is a large project, it helps to have a plan of attack. The following milestones will not be turned in nor graded, however, I strongly recommend that you do them for your own good and at the suggested pace.
Step | Finish by | Milestone |
---|---|---|
1. | Thursday a.p. | WRITE and TEST all the unfinished methods in the BSTNode. |
2. | Monday w.a.p. | WRITE and TEST the BST insert method, and the encrypt and decrypt methods in the EncryptionTree. |
3. | Wednesday w.a.p. | WRITE and TEST the BST remove method and the project driver. Finish early so you have time to debug. |
Use your book. Your book has code for writing a binary search tree, but your code will be fairly different, since the book uses recursion but your code will mostly not use recursion. However, you should read and understand the code in the book first.
Final notes
Remember when writing this program to adhere to the coding style guidelines. This is an individual project, so you must work alone. No credit will be given for a solution which does not pass all the hidden tests I create, or does not pass in the allowed time. For more detailed instructions, read the project submission guidelines.