FIT2004 S2/2019: Assignment 4

Nathan Companez

DEADLINE: Friday 25th October 2019 23:55:00 AEST

LATE SUBMISSION PENALTY: 10% penalty per day. Submissions more than 7 days

late are generally not accepted. The number of days late is rounded up, e.g. 5 hours late

means 1 day late, 27 hours late is 2 days late. For special consideration, please complete

and send the in-semester special consideration form with appropriate supporting document

before the deadline to [email protected]

PROGRAMMING CRITERIA: It is required that you implement this exercise strictly

using Python programming language (version should not be earlier than 3.5). This

practical work will be marked on the time complexity, space complexity and functionality

of your program.

Your program will be tested using automated test scripts. It is therefore critically impor-

tant that you name your files and functions as specified in this document. If you do not, it

will make your submission difficult to mark, and you will be penalised.

SUBMISSION REQUIREMENT: You will submit a zipped file (named

studentId_A4.zip, e.g. if your student id is XXXX, the name of zipped file must be

XXXX_A4.zip). It should contain a single python file, assignment4.py, and a single pdf

file, description.pdf

PLAGIARISM: The assignments will be checked for plagiarism using an advanced pla-

giarism detector. All suspected cases will be investigated. The faculty takes academic

integrity seriously and consequences for such misconduct can include receiving zero marks

for the assignment or unit or suspension or exclusion. Please see the University Policy

https://www.monash.edu/students/admin/policies/academic-integrity for more details.

“Helping” others is NOT ACCEPTED. Please do not share your solutions partially or/and

completely to others. If someone asks you for help, ask them to visit us during consultation

hours for help.

1

Learning Outcomes

This assignment achieves the Learning Outcomes of:

• Analyse general problem-solving strategies and algorithmic paradigms, and apply them

to solving new problems

• Prove correctness of programs, analyse their space and time complexities

• Develop and implement algorithms to solve computational problems.

In addition, you will develop the following employability skills:

• Text comprehension

• Designing test cases

• Ability to follow specifications precisely

Warning

For all assignments in this unit, you may not use python dictionaries or sets. For all assignments

in this unit, please ensure that you carefully check the complexity of each python function that

you use. Common examples which cause students to lose marks are list slicing, inserting or

deleting elements in a list, using the in keyword to check for membership of an iterable, or

building a string using repeated concatenation of characters. These are just a few examples, so

be careful. Remember, you are responsible for the complexity of every line of code you write!

Assignment timeline

In order to be successful in this assessment, the following steps are provided as a suggestion.

This is an approach which will be useful to you both in future units, and in industry.

Planning

1. Read the assignment specification as soon as possible and write out a list of questions

you have about it.

2. Clarify these questions; You can go to a consultation, talk to your tutor, discuss the tasks

with friends or ask in the forums.

3. As soon as possible, start thinking about the problems in the assignment.

• It is strongly recommended that you do not write code until you have a solid feeling

for how the problem works and how you will solve it.

4. Writing down small examples and solving them by hand is an excellent tool for coming

to a better understanding of the problem.

• As you are doing this, you will also get a feel for the kinds of edge cases your code

will have to deal with.

2

5. Write down a high level description of the algorithm you will use.

6. Determine the complexity of your algorithm idea, ensuring it meets the requirements.

7. Write your description.pdf.

• You should be able to start working on this before you write your code.

• If you cannot, perhaps your it is worth thinking a little more about how exactly your

code will work.

Implementing

1. Think of test cases that you can use to check if your algorithm works.

• Use the edge cases you found during the previous phase to inspire your test cases.

• It is also a good idea to generate large random test cases.

• Sharing test cases is allowed, as it is not helping solve the assignment.

2. Code up your algorithm, (remember decomposition and comments) and test it on the

tests you have thought of.

3. Try to break your code. Think of what kinds of inputs you could be presented with which

your code might not be able to handle.

• Large inputs

• Small inputs

• Inputs with strange properties

• What if everything is the same?

• What if everything is different?

• etc...

Before submission

• Make sure that the input/output format of your code matches the specification.

• Make sure your filenames match the specification.

• Make sure your functions are named correctly and take the correct inputs.

• Make sure you zip your files correctly

3

Background

A word ladder is a series of words, all the same length, where each word differs with the

previous word by exactly one character. In this assignment we will refer to the first word

as the start_word and the last word as the target_word. A word ladder puzzle is

normally presented by giving the start_word, then some number of empty space, then

the target_word. To solve the puzzle, the player must find the sequence of words that

completes the chain. These words are called the intermediate words. In general the words

must be valid English words. In general, the solution to a word ladder will not involve

repeated words, since this loop could be removed without invalidating the solution. An

example of such a puzzle and its solution are:

To make our own word ladders, we need to use a data structure which allows us to easily

check if two words differ by a single letter. In this assignment, we will represent this

information using a graph, where each vertex corresponds to a word, and there is an edge

between two vertices when those two words differ by a single letter. A solution to a word

ladder would be a path in this graph.

Algorithm Descriptions (2 mark)

For each of the three tasks, you must also write a brief description of your algorithm. The

total length should be no more than 800 words.

These two descriptions will be submitted in the pdf file description.pdf mentioned in the

"Submission Requirement" section. These description should explain the steps your algorithm

takes to solve the problem, and the complexity of each step. Please try to keep these descriptions

at a fairly high level, talk about your data structures and algorithms, not individual variables

or lines of code.

0 Building the graph (6 marks)

For all the tasks in this assignment, we will be working with a graph where each vertex repre-

sents a word, and there is an edge between two vertices if and only if those two words differ by

exactly one character. You do not need to check that this is true, you may assume the input

4

files always correctly encode such a graph.

Before we start, we need to construct the graph. You will be provided with two files. You will

need to read these files into an appropriate graph data structure. To do this, you will create

a Python class, Graph. The other tasks in this assignment will each require you to write a

method for your Graph class.

For this assignment, all the graphs will be simple, undirected graphs.

Important: If you do not create a class, or you implement the later tasks as independent

functions rather than methods of the Graph class, you may receive 0 marks for the assignment.

0.1 Input

The graph is given to you as two files, the first containing information about the vertices,

and the second containing information about the edges. For this task, you need to write the

__init__(self, vertices_filename, edges_filename) function for your Graph class. This

function takes two filenames as inputs, and reads the data from them into an appropriate data

structure.

The first line of vertices_filename has a number, n, which is the number of vertices in the

graph. The next n lines each start with a unique integer between 0 and n − 1 inclusive (the

vertex ID) and then a space, and then a word (the word which that vertex represents). The

words will contain only lowercase English alphabet characters. For this assignment, you

may assume that all words are constant size.

Each line of edges_filename consists of two numbers, separated by a a space. Each of these

pairs of numbers represents an undirected edge connecting the two vertices with those IDs. You

are guaranteed that all the numbers in edges_filename are valid vertex IDs (i.e. are integers

between 0 and n− 1 inclusive). The edges are in no particular order.

Example:

Calling this line of code should correctly populate an instance of your Graph class.

my_graph = Graph("vertices.txt", "edges.txt")

Note that here the two files are called vertices.txt and edges.txt, but they could be called

anything. Similarly, the name my_graph is just used as an example, it could be any name:

0.2 Output

No output is required. The marks for this task are awarded for having the correct complexity,

and having no errors in your graph construction. That said, the choices you make about how

your graph is implemented will affect the complexity of later tasks, so make sure you have read

the whole assignment.

0.3 Complexity

__init__ should run in O(V + E), where

5

• V is the number of lines in vertices_filename

• E is the number of lines in edges_filename

6

1 Solving a ladder (10 marks)

Once we have built the graph, we want to be able to solve word ladders! Given a word ladder

puzzle (i.e. a start_vertex and target_vertex, which each have a word), you need to write

a function which finds the shortest list of intermediate words which solves it (or reports that

no such list exists)

You will write a method for the Graph class, called solve_ladder(self, start_vertex,

target_vertex), which will return a list of the intermediate words.

For a given pair of start_vertex and target_vertex, there may be several equally short lists

of intermediate words. Any of them is acceptable.

1.1 Input

start_vertex and target_vertex. These are integers between 0 and V − 1 inclusive, where

V is the number of vertices in the graph.

1.2 Output

The function should return a list of strings, which correspond to the words in the word ladder

from start_vertex to target_vertex. Note that this list can be empty under certain con-

ditions. In some cases, it may not be possible to find a chain from the start_vertex to the

target_vertex. In these cases, the function should return False.

Example: If vertices_filename contains

7

0 aaa

1 aab

2 abb

3 bbb

4 caa

5 cba

6 bba

Then edges_filename would contain

0 1

0 4

1 2

2 3

3 6

4 5

5 6

Suppose that we had created an instance of Graph, called my_graph to store the above graph.

my_graph.solve_ladder(0, 6) returns ["aaa", "caa", "cba", "bba"]

7

1.3 Complexity

solve_ladder must run in O(V + E) time, where

• V is the number of vertices in the graph

• E is the number of edges in the graph

8

2 Restricted word ladder (12 marks)

Now, rather than the shortest word ladder, you want to find the cheapest word ladder. The

cost of a word ladder is found by summing the the costs of the changes made to characters

during that ladder. The cost of changing a character from given by squaring the difference in

alphabet position of the two characters. For example, the cost of transforming "a" into "c"

would be (3− 1)2 = 4. Transforming "a" into "z" would cost (26− 1)2 = 625. The cost of the

word ladder in the background section, which starts with "cold" and ends with "warm" would

be

(L→ R) + (O → A) + (C → W ) + (D →M)

=(12− 18)2 + (15− 1)2 + (3− 23)2 + (4− 13)2

=36 + 196 + 400 + 81

=713

However, there is an additional requirement. In this task, the word ladder must also

contain at least one word which starts with a given letter. This letter will be one of our input

parameters.

You will write a method for the Graph class, called cheapest_ladder(self, start_vertex,

target_vertex, req_char) to solve this problem.

2.1 Input

start_vertex, target_vertex and req_char. start_vertex and target_vertex are integers

between 0 and V − 1 inclusive, where V is the number of vertices in the graph. req_char is a

lowercase english alphabet character.

2.2 Output

cheapest_ladder should return two things as a tuple (or False). The first is the cost of the

cheapest word ladder. The second is a list of strings, which are the intermediate words in the

cheapest word ladder starting at start_vertex and ending at target_vertex, such that at

least one word in the ladder has req_char as its first character (This can be the the word

associated with start_vertex or target_vertex). Note that this list can be empty under

certain conditions.

In some cases, it may not be possible to find a chain from the start_vertex to the tar-

get_vertex with at least one word in the ladder has req_char as its first character . In these

cases, the function should return False.

Example:

7

0 aaa

1 aab

2 abb

3 bbb

4 caa

9

5 cba

6 bba

Then edges_filename would contain

0 1

0 4

1 2

2 3

3 6

4 5

5 6

Suppose that we had created an instance of Graph, called my_graph to store the above graph.

• my_graph.cheapest_ladder(0, 6, ’z’) returns False, as there is no path from "aaa"

to "bba" where one of the words has z as the first letter.

• my_graph.cheapest_ladder(0, 6, ’a’) returns (4, [’aaa’, ’aab’, ’abb’, ’bbb’,

’bba’]), as there are two paths from "aaa" to "bba", both of which have at least one

word starting with a, but this path has total cost 4 while the other has total cost 6.

• my_graph.cheapest_ladder(0, 6, ’c’) returns (6, [’aaa’, ’caa’, ’cba’, ’bba’]),

as there are two paths from "aaa" to "bba", but only this path has at least one word

starting with c

2.3 Complexity

cheapest_ladder must run in O(E log(V )) time, where

• V is the number of vertices in the graph

• E is the number of edges in the graph

10