辅导案例-CSCM12

欢迎使用51辅导,51作业君孵化低价透明的学长辅导平台,服务保持优质,平均费用压低50%以上! 51fudao.top
CSCM12 CW2 Google Self-Driving Car Gary Tam


Scenario: Google is developing the self-driving car business. Their technology captures the surrounding as 3D point
cloud, then it segments and subsequently detects nearby objects. One of the critical computational tasks is to find K
neighbouring 3D points closest to a query point (the car). The operation must be implemented as quick as possible.
However, due to hardware and cost constraint, and the huge amount of point cloud (over million points), they often
found that there is not enough memory to do so. Google is now testing if you can help to solve the problem.
Task: This is an individual coursework. You are given a set of 3D points (x,y,z) stored in a linked list. These points are
in an arbitrary order. To simplify the problem, all coordinates (x,y,z) are integers. Your task is to find the XORsum of
the K closest points given a query point qPoint. Due to limited resources, there are two further requirements:
1) there is limited memory (213MB) available on their hardware, and
2) the algorithm should process ALL provided test cases correctly and within 1 second.
To simplify output checking, you should XORsum all K output points into a three-value-tuple for verification.
Definition: XORsum of a sequence b1, b2, bk, …, bn is defined as b1 XOR b2 XOR bk XOR … XOR bn
Example
INPUT:
qPoint: query 3D point
K: 4 (Number of points closest to qPoint)
N: 10 (Number of input 3D points)
newList: N input 3D points (data)
OUTPUT:
A three-value-tuple which is the
XORsum of all K nearest points
to qPoint.
CONSTRAINTS:
a) 1 ≤ N ≤ 10,000,000
b) N/5 ≤ K ≤ N/2.5 < N
c) all memory use ≤ 213MB
Sample Input
qPoint = [11,10,14]
K = 4
N = 10
newList:
[ 1, 1, 9]
[10, 9, 8]
[ 6, 12, 0]
[12, 4, 6]
[ 5, 5, 2]
[ 4, 14, 13]
[11, 10, 14]
[ 8, 13, 11]
[14, 2, 12]
[13, 0, 7]

Sample Output: [13, 0, 0]
Explanation:
K nearest points and their Euclidean
distances to qPoint are highlighted:
[11, 10, 14] 0.00
[ 8, 13, 11] 5.20
[10, 9, 8] 6.16
[ 4, 14, 13] 8.12
[14, 2, 12] 8.77
[12, 4, 6] 10.05
[13, 0, 7] 12.37
[ 5, 5, 2] 14.32
[ 1, 1, 9] 14.35
[ 6, 12, 0] 15.00

(including the qPoint if matched)
XORsum(
[11, 10, 14],
[ 8, 13, 11],
[10, 9, 8],
[ 4, 14, 13])
11 ^ 8 ^ 10 ^ 4 = 13
10 ^ 13 ^ 9 ^ 14 = 0
14 ^ 11 ^ 8 ^ 13 = 0

Output: [13, 0, 0]

Google shares a piece of algorithm to check your knowledge on complexity.
INPUT: qPoint – query 3D point, K – the number of 3D points closest to qPoint;
newList – N input 3D points; N/5 ≤ K ≤ N/2.5 < N
OUTPUT: The XORsum of K closest points in newList to qPoint.
ALGORITHM:
compute_dist() // compute Euclidean distance to qPoint
BubbleSort(newlist) // sort 3D points in increasing order of distances
i ← 0
xorsum_x ← 0
xorsum_y ← 0
xorsum_z ← 0
cur = newlist.head ← 0
while ( i < K && cur != null )
xorsum_x ← xorsum_x ^ cur.x
xorsum_y ← xorsum_y ^ cur.y
xorsum_z ← xorsum_z ^ cur.z
i++
cur = cur.next
end while
print xorsum_x
print xorsum_y
print xorsum_z

Coursework 2 - Details (Total 100 marks):
You must provide original codes that is created entirely by you. If you do use external reading to inspire your
solution, you should provide references. Otherwise, case of plagiarism will be pursued aggressively.

Task 1 [10 marks] Code and complexity analysis
Analyze the algorithm provided by Google and find the asymptotic complexity in Big-O notation, in term of N alone
[5 marks] (Hint, note the constraint of K in relation to N.) Show your steps clearly [5 marks].
Task 2 [40 marks])
Design an algorithm that is asymptotically faster than that in Task 1.
- Write your idea in plain English and explain your new algorithm [10 marks], and WHY it works [20 marks].
(Do not give step by step pseudocodes. I can read your codes. I want you to EXPLAIN the ideas, preferably with pics.)
- Analyse the complexity of your algorithm using Big-O notation (in term of N), show your steps [10 marks].
Task 3 [50 marks]) Coding & Program Correctness
Develop your algorithm (Task 2) as a java program, using the provided java template (TestClass.java).
Verify that your answers passes all test cases, satisfies the memory requirement, and runs < 1s in total on server.
Tip 1: You are provided a compressed project file for this coursework - CW2.7z.
Download CW2.rar from
Blackboard.

Download and use WinRAR (right)to
decompress it.


Once extracted, the folder structure looks like the following:
/CW2
| diff.exe - a program which finds and shows the difference between two files (see Tip 4-6).
| timeout.exe - a Windows program which kills a process if it runs too long (see Tip 9).
| timeout - a Mac equivalent of timeout.exe (be used on mac only).
| FastRead.java - a java class provided to quickly read a binary integer file - no modification allowed.
| libiconv2.dll - a helper file for diff.exe to run on a Windows machine
| libintl3.dll - a helper file for diff.exe to run on a Windows machine.
| Stopwatch.java - a java class provided to handle timing - no modification allowed.
| LinkList.java - a container class that stores all input 3D point - no modification allowed.
| PointNode.java - a node class that stores a 3D point (to be used in LinkList) - no modification allowed
| TestClass.bat - a script to test your program (see description below, see Tip 6), containing the valid K
| TestClass.sh - same as TestClass.bat, see Tip 6 also.
| TestClassG.sh - script for marking, runs on server (observe the Google’s memory requirement therein.)
| TestClass.java - your java program (see Tip 2). modify this template for submission. submit only this file.
+---input - a folder containing 7 input test cases (see Tip 3)
| A1.bin - test case#1 newlist - see example input on page 1 of this handout.
| B1.txt - test case#1 - contains the qPoint. Note that K is contained in TestClass.bat/.sh/G.sh files
| … … … …
\---result - a folder containing 7 correct results from each test cases (see Tip 4)
Result1.txt - results for test case #1 - see example output on page 1 of this handout.
… … … …
Result7.txt - results for test case #7

Tip 2: Use the provided java code template. FastRead class provides functionality to quickly read data from a file. A
Stopwatch class provides functionality for timing.
Tip 3: You should prepare and test your program locally on your own machine. Use the provided test cases.
Your java program will take three parameters. 1) K – the number of closest points to query, 2) qPoint – the query 3D
point (from B?.txt), and 3) a file name - an external file containing a linklist of integers (A?.bin).
It must then print the XORsum of K closest 3D points to the standard output (screen, System.out.println(...)). If your
program runs correctly, it should produce these results (timing will vary across machines, see also Tip 8):
~/CW1> javac TestClass.java
~/CW1> java -cp . TestClass 4 input/B1.txt input/A1.bin
13
0
0
Elapsed Time: 0.002 s
A1.bin are the list of integers of test case#1 - see page 1. B1.txt contains the qPoint. K = 4 (different test cases use
different K, see Tip 7.)
Tip 4: All test cases #1-7 are provided with expected correct results in respective files Result1.txt ... Result7.txt.
You can easily verify your program output as follows:
Use "> Output1.txt" to redirect/pipe your result from standard output (screen) into a file Output1.txt.
Use "diff Result1.txt Output1.txt" to check and compare your result.
diff is a program to find "difference" within two files. If there is nothing return like below, your result is correct.
~/CW2> javac TestClass.java
~/CW2> java -Xms32m -Xmx213m -cp . TestClass 4 input/B1.txt input/A1.bin > Output1.txt
Elapsed Time: 0.001 s
~/CW2> diff result\Result1.txt Output1.txt

~/CW2>
On Linux/Mac OS, the diff program is available from the operating system by default.
"-Xms32m -Xmx213m" flag simulates the requirement of memory constraints. It is used to restrict JVM memory to use
at most 32MB stack memory (i.e. method stack) and 213MB heap memory (e.g. creating objects with new).
Tip 5: If your program produces incorrect results from the output, you will see the difference when you run the diff
program. For example, if you use the java template and intentionally query just 2 closest points, this is the output.
~/CW2> javac TestClass.java
~/CW2> java -Xms32m -Xmx213m -cp . TestClass 2 input/B1.txt input/A1.bin > Output1.txt
Elapsed Time: 0.0 s
~/CW2> diff result\Result1.txt Output1.txt
1,3c1,3
< 13
< 0
< 0
---
> 3
> 7
> 5
~/CW2>
Tip 6: If you want to process all test cases in one go, use TestClass.bat (on Windows) or TestClass.sh (on Linux/Mac
OS) on your machines (lab machines / laptops). This example indicates that results (5-7) are incorrect/not available.
~/CW2> javac TestClass.java
~/CW2> ./TestClass.bat
Elapsed Time: 0.001 s
Elapsed Time: 0.0 s
Elapsed Time: 0.01 s
Elapsed Time: 0.227 s
Files result/Result5.txt and Output5.txt differ
Files result/Result6.txt and Output6.txt differ
Files result/Result7.txt and Output7.txt differ
To allow Linux/Mac OS to run the TestClass.sh, use the following command to change file permissions:
chmod 755 TestClass.sh and replace ./TestClass.bat with ./TestClass.sh in the above example.
Tip 7: Your program must process all 7 test cases and give the correct results on the benchmark server. Respective
parameters K, qPoint and input file linklist of size N used in each of the test cases, can be found in the TestClass.bat,
TestClass.sh or TestClassG.sh. All test cases must be processed in under 1 seconds (in total) and use less than
213MB heap memory on the benchmark server (IP: 137.44.2.196) to score full marks using TestClassG.sh. Login
details have been sent individually to your emails.
On the benchmark server, there is NO javac compiler available. You should work locally on your own machine, and
upload TestClass.class when you are ready. Use the benchmark server only to test your java program for timing.
Check out the provided A2-ServerSetup.pdf to setup your testing environment on benchmark server. The following
solution will get partial marks from Part2, with timing on the benchmark server (see screenshot):
~/CW2> ./TestClassG.sh
Elapsed Time: 0.0 s
Elapsed Time: 0.001 s
Elapsed Time: 0.001 s
Elapsed Time: 0.01 s
Elapsed Time: 0.034 s
Elapsed Time: 0.527 s
Elapsed Time: 5.565 s
~/CW2>


Tip 8: Test case #7 is a large dataset. You need a fast algorithm to handle them. To score full marks, your program
must obtain the correct result for all test case#1-7 in under 1 second and using at most 213MB memory on the
benchmark server. A sample solution getting full marks will produce the following results, with timing on the
benchmark server (see screenshot):
~/CW2> ./TestClassG.sh
Elapsed Time: 0.001 s
Elapsed Time: 0.0 s
Elapsed Time: 0.001 s
Elapsed Time: 0.003 s
Elapsed Time: 0.018 s
Elapsed Time: 0.18 s
Elapsed Time: 0.693 s
~/CW2>

Tip 9: ./TestClass.bat (Windows) and ./TestClass.sh (Linux/MacOS) are intended to be used on your local machines.
To test on the benchmark server, you should run ./TestClassG.sh (see Tip7 and 8).
./TestClassG.sh and ./TestClass.sh are the same except that it uses files from marker’s account directly without
duplicating resources on the benchmark server.
During testing, your java program may run out of memory (Main memory use is > 213M) with error message like:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
It means your program use more memory than allowed. Sometimes it runs without problem on your PC/Mac, but
fails on the linux server. That is because the memory management of all JVM on PC/Mac/Linux is slightly different.
Marking will stick to ./TestClassG.sh on the benchmark server (linux). Make sure you test your program on the
server before submission.
The Benchmark server has 6GB memory and four 2GHz CPU cores. It will support four testings simultaneously. Use
Ctrl-C to stop your program if it runs too long - no more than 12 seconds!
You are advised to use the timeout command to automatically kill your process if it runs more than 12 seconds on
the benchmark server. Example:
~/CW2> timeout 12 java -cp . -Xms32m -Xmx213m TestClass 4 input/B1.txt input/A1.bin > Output1.txt
Tip 10: The marker will check and run each submitted coursework using ./TestClassG.sh on the benchmark server for
marking purpose. Please make sure your program runs as expected.
Tip 11: There may be issues (e.g. caching issue, multiple users, background tasks etc) that affect the timings.
Run your program 10 times (wait sometime between each run), select the best timing, and report it.
Tip 12: Download your submitted file and verify its content. Make sure you submit the correct version of your java
source file, and that you did submit it. In the past, students got zero because s/he submitted the wrong java files.
Tip 13: How do you achieve full marks? Read the marking assessment matrix below.
Tip 14: This coursework is designed to be challenging. The goal of the coursework is to teach you the skills and
challenges that you will encounter in your job interview and future career. It tests your knowledge of the efficiency
and tools we have in computer science, and make choices in different situations: constraint in speed and memory.
Tip 15: There will be a lot of challenges, e.g. algorithm design, programming and debugging during your coursework.
You can start with simple ideas, get the correct results, and then think of faster ways to improve it. In any case, start
very early!
Tip 16: Your algorithm should be generic, and handle all provided cases. Marker may test your algorithm with new
and larger dataset.

Release Date: 13 March, 2020.
Due Date: 24 April, 2020, 11am. (during Easter break)
Submission: You should submit (electronically) the following to the Blackboard coursework collection page.
1. A written report (pdf ONLY) to Task1, and Task2.
Provide a screenshot to show how long your program runs for each test cases on benchmark server.
Due to other concurrent processes on benchmark server, your timing may be a little slower at time.
Try 10 times, screen capture the best timing, and report it here.
(You need not attached the TestClass.java in your pdf.)
2. ONE working java source file ONLY for Task3: TestClass.java
Marker will use only this submitted file to test run on the benchmark server for exact timing.
The java source file must be clearly commented.
If your program does not run/work and the marker is unable to understand your code, no marks will be given.
You must compress the above two (pdf and java) into a zip file, and submit it to blackboard.
Tip: download your submitted file and verify its content. Make sure you submit the correct version of your java
source file, and that you did submit it. In the past, students got zero mark because s/he submitted the wrong files. If
in doubt, submit again (unlimited submission, but only the last attempt will be marked).

Late Submission: Zero mark - in line with the College of Science policy.
Marking Criteria:
Task 1/Task 2: You must show your steps to obtain the complexity of your algorithm in Big-O notation.
Task 2: If your algorithm has the same or asymptotically slower complexity than that in Task 1, no more than 50%
will be given in this part.
No efforts to articulate
your ideas.

Fail to use Big-O notation
to analyse complexity.

No efforts or knowledge
is shown that you
understand Big-O
notation or complexity
Only steps by steps discussion
of the algorithm is provided. No
summary of ideas or
explanation of WHY the
algorithm works.

There are problems in your
derivation of complexity.

Your algorithm is not
asymptotically faster than that
in Task1.
Some ideas how you derive
your algorithm are
discussed. The explanation
of WHY the algorithm works
is still unclear.

The derivation of complexity
is mostly right, and your
algorithm is asymptotically
faster than that in Task1;
however, there are minor
mistakes in your steps.
The derivation and
explanation of WHY
the algorithm works is
well explained.


Correct complexity
analysis.

Your analysis reflects
your implementation
in Task3.
<50% 50% 75% 100%

Task3: You are NOT allowed to use any standard library provided by JAVA (e.g. java.util) nor other java classes (e.g.
ArrayList), except the LinkList, PointNode, FastRead, Stopwatch class provided. You are allowed to use [] array if
you need to. You must implement all source codes by yourself, and show your understanding. Otherwise, up to 40%
of total marks will be deducted. To reduce difficulty, you can get inspiration and adapt codes available from the
textbook website: https://algs4.cs.princeton.edu/code/ . Note that all adaption must be discussed and referenced.
If you do use external reading to inspire your solution, you must provide references. Otherwise, case of plagiarism
will be pursued aggressively. The following timing is based on the running on benchmark server. If your solution
satisfies the timing, but not the memory constraint, further marks will be deducted. Note that you are also not
allowed to remove any elements from the provided linked list.
Many test cases
failed / too slow to
run
All test cases passed
within 12s + satisfy
mem req.
+ all test cases run <
8s in total
+ all test cases run <
7s in total
+ all test cases run <
1s in total
<50% 50% 75% 80% 100%
General Notes:
(a) The lecturer has tried to provide all the information you require to complete this coursework. If you think
some information is missing or not clear, you can either ask for the information (preferably by email) or
make a reasonable assumption (and document the assumption(s) made in your coursework submission).
(b) An electronic submission of the coursework, using blackboard and the provided answer sheets, is
required. Ensure it is clearly labelled with your details and who the work is for.
(c) Important: You are expected to submit your own work for this coursework. On submitting your
coursework on blackboard, you agree that the following statement is true:

Academic Integrity and Academic Misconduct Statement
By submitting this coursework, electronically and/or hardcopy, you state that you fully understand and are complying
with the university's policy on Academic Integrity and Academic Misconduct. The policy can be found at:
https://myuni.swansea.ac.uk/academic-life/academic-misconduct/

On many occasions when working on coursework and projects, it is useful to ask others (the lecturer, a
postgraduate assistant, or other students) for hints or debugging help, or to talk generally about the written
problems or programming strategies. Such activity is both acceptable and encouraged, but you must indicate
in your submission any assistance you received. Any assistance received that is not given proper citation
will be considered a violation of the policy on plagiarism or collusion. In any event, you are responsible for
understanding and being able to explain on your own all written and programming solutions that you submit.
The course staff will pursue aggressively all suspected cases of plagiarism and collusion, and they will be
handled through official University channels.
What are plagiarism and collusion?
https://myuni.swansea.ac.uk/academic-life/academic-misconduct/
https://myuni.swansea.ac.uk/media/FAQs---School-College-Level.pdf.pdf

51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: abby12468