UNIVERSITY OF WATERLOO
Cheriton School of Computer Science
CS 458/658 Computer Security and Privacy Winter 2021
ASSIGNMENT 2
• Milestone due date: February 26th 2021 at 3:00pm (optional)
• Assignment due date: March 12th 2021 at 3:00 pm (the usual 48-hour automatic
extension applies).
Total marks: 89 + 5 Bonus Marks
Written Response TA: Sajin Sasy
Programming Response TAs: Miti Mazmudar, Matthew Rafuse
TA Office Hours: Mondays 10:00 - 11:00 EDT
Please use Piazza for questions and clarifications. We will be using BigBlueButton for TA office
hours this term; we have separate online rooms for the written and programming parts. To attend
office hours, access the corresponding URL for that assignment part and use the corresponding
access code when prompted. When asked for your name please enter both your first and last name
as they appear in LEARN.
Written: https://bbb.crysp.org/b/you-9rz-fv4 - Access code: 190863
Programming: https://bbb.crysp.org/b/you-nv9-7pn - Access code: 367905
What to hand-in
All assignment submissions take place on the student.cs machines (not ugster or the virtual
environments), using the submit facility. In particular, log in to the Linux student environment
(linux.student.cs.uwaterloo.ca), go to the directory that contains your solution, and
submit using the following command: submit cs458 2 . (dot included). CS 658 students
should also use this command and ignore the warning message.
1. A2 milestone deadline:
Note that the A2 Milestone is optional. If you submit your (functional) A2 milestone by the
deadline, you will receive an additional 5 bonus marks. These marks are on top of the total
marks for the assignment, and not submitting the milestone does not preclude you from getting
full marks on the assignment.
src.tar: Your source files for the first question of the programming assignment, in your
supported language of choice, inside a tarball. After the milestone, your submissions for the
first question will not be tested. See the instructions below on how to create the tarball.
2. A2 deadline:
a2.pdf: A PDF file containing your answers for the written-response questions. It must contain,
at the top of the first page, your name, UW userid, and student number. If it does not, a
three mark penalty will be assessed. Be sure to “embed all fonts” into your PDF files.
Some students’ files were unreadable in the past; if we can’t read it, we can’t mark it.
Note that renaming a .txt file to a .pdf file does not make it a PDF file.
src.tar: Your source files for the 2nd to the 7th questions of the programming assignment, in
your supported language of choice, inside a tarball.
Re-read the output requirements and testing and marking sections, before creating your
tarball. To create the tarball, cd to the directory containing your code, and run the
command
tar cvf src.tar .
(including the .). If you are using an interpreted language, your source should include an
executable script named ids that runs your code using the proper shebang. For compiled
languages, include a Makefile in your source with a default target that builds an
executable named ids.
1 Written Response Questions [42 marks]
1.1 [19 marks] Intelligent Agents of Intelligence fight for Intelligence
CipherIsland Intelligence Service (CIIS), the central espionage service of CipherIsland, uses the
Bell-La Padula confidentiality model to protect its documents with the following sensitivity/clearance
levels:
Director >c Executive >c Handler >c Agent >c Support >c Unclassified
CIIS also compartmentalizes all of its documents by project, using the respective project codenames
for access control. The project codenames are typically Greek letters.
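As an illustration of how such labels are compared, the following Python sketch encodes the dominance relation. It assumes the standard Bell-La Padula rules (a subject may read an object it dominates and may write to an object that dominates it). The function names and the compartment names x, y, z are hypothetical and are not taken from the questions below.

```python
# Sketch of the Bell-La Padula dominance check; all names are illustrative.
LEVELS = ["Unclassified", "Support", "Agent", "Handler", "Executive", "Director"]
RANK = {name: i for i, name in enumerate(LEVELS)}


def dominates(a, b):
    """Return True if label a = (level, compartments) dominates label b."""
    level_a, comps_a = a
    level_b, comps_b = b
    return RANK[level_a] >= RANK[level_b] and set(comps_b) <= set(comps_a)


def access(subject, obj):
    """Return 'read', 'write', 'both', or 'neither' for a subject and an object."""
    can_read = dominates(subject, obj)   # simple security property: no read up
    can_write = dominates(obj, subject)  # *-property: no write down
    if can_read and can_write:
        return "both"
    if can_read:
        return "read"
    if can_write:
        return "write"
    return "neither"


# Hypothetical labels, unrelated to the documents listed in the questions.
subject = ("Handler", {"x", "y"})
print(access(subject, ("Agent", {"x"})))               # read
print(access(subject, ("Director", {"x", "y", "z"})))  # write
print(access(subject, ("Agent", {"z"})))               # neither
```

Two labels can be incomparable (neither dominates the other), which is why "neither" is a possible outcome even when the clearance levels are ordered.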
1. [8 marks] Sterling Archer, the best field agent at CIIS, absolutely detests access control
mechanisms and has no intent to understand how Bell-La Padula works. All he knows is that
his clearance level is (Agent, {β, λ, ρ}). Help Sterling Archer figure out whether he has
read access, write access, both, or neither for each of the following documents under the
Bell-La Padula Confidentiality Model:
(i) F320: (Executive, {α, η})
(ii) F210: (Director, {ρ})
(iii) F102: (Support, {β, λ})
(iv) F513: (Support, {α, λ})
(v) F219: (Agent, {β, λ, η})
(vi) F924: (Support, ∅)
(vii) F100: (Agent, {β, λ, ρ})
(viii) F465: (Director, {α, β, λ, η, ρ})
2. [6 marks] CIIS is actively under attack by its rival agency, the central espionage service of
CryptoLand, CryptoLand Intelligence Service (CLIS). With the help of their secret mind-control
weapon, CLIS has successfully infiltrated CIIS by controlling two of their employees, Ray
Gillete and Cyril Figgis. Unfortunately, while their secret weapon allows them to control the
actions of an individual, it does not retain the target individual’s memory. Hence CLIS has
no idea what credentials Ray Gillete and Cyril Figgis hold.
CLIS knows from its previous infiltration attempts that CIIS has a strict security policy that
triggers an alarm if an employee tries to access files without appropriate credentials more
than once. On an employee’s first attempt to access a file without appropriate credentials,
the employee is warned by the system. The second attempt results in an internal alarm and
the employee being locked out of the system. To avoid triggering the alarm CLIS decides to
access files strategically to figure out the credentials their infiltrators hold. However, it turns
out fine-grained muscle control with their mind-control weapon is extremely hard; Ray and
Cyril effectively end up trying to access random files.
• Ray successfully reads files F111 (Support, {λ, ρ}) and F331 (Handler, {α, γ}), but
triggers a warning when attempting to read F222 (Agent, {β, γ, λ}).
• Cyril successfully reads files F212 (Unclassified, {γ}) and F396 (Agent, {α, β}), but
triggers a warning when attempting to read F579 (Agent, {α, ρ}).
Given that CIIS only has {α, β, γ, δ, η, λ, ρ} as the set of compartments, and the above
interactions of Ray and Cyril:
(a) What is the lowest clearance level and minimal set of compartments Ray must hold?
What is the highest clearance level and maximal set of compartments Ray could have?
(b) What is the lowest clearance level and minimal set of compartments Cyril must hold? What
is the highest clearance level and maximal set of compartments Cyril could have?
(c) CLIS desperately wants to read the file F999 (Handler, {α, β, δ}). Which of their two
infiltrators is the better choice to attempt reading this file, and why?
3. [5 marks] Meanwhile, CIIS is engaged in infiltrating CLIS and gaining access to their
intelligence. From their undercover mole Barry Dilton they come to know that CLIS uses a
Biba integrity model for its documents, specifically one with a Low Watermark property and
with the same sensitivity/clearance levels as CIIS. CLIS also uses Greek letters for its project
codenames.
Barry has been tasked with exposing the file F123 (Agent, {α, β, γ}) by dropping
F123’s integrity down to (Unclassified, ∅). However, he is aware that CLIS has an alarm that will
trigger if the clearance level of a subject or object changes by more than one level in an
action; similarly the alarm also triggers if the integrity level of the subject or object changes
by more than one compartment in an action. (The system will allow for a change in one level
of clearance as well as one change in compartment within the same action.) The alarm will
also instantly lock out the subject whose actions triggered the alarm preventing them from
taking any further actions that might harm CLIS.
Detail the steps Barry needs to take to complete his task, without setting off the alarm, given
that Barry’s credential is (Handler, {α, β, γ, δ, η}) and that he can see the following set of
files in the system:
(i) F546: (Handler, {α, β, γ})
(ii) F101: (Unclassified, ∅)
(iii) F513: (Agent, {β, γ})
(iv) F121: (Executive, {β, δ, λ})
(v) F676: (Agent, {α, β, γ})
(vi) F917: (Agent, {β, γ, η})
(vii) F369: (Support, {β})
(viii) F129: (Agent, {α, β, γ, η})
(Note that a simple way to manipulate credentials of a file would be to add an empty string
to the file in question, so that the file itself doesn’t change but the integrity level gets updated
by the Low Watermark property.)
1.2 [13 marks] Securing password authentication
Among other precautionary measures, CIIS is auditing the security of their password authentication
mechanism. Currently, they store the hash of a password (fingerprint) in a file, and authenticate
their employee login attempts against this fingerprint. Their scheme for generating and verifying
fingerprints is sketched below:
• Every password entry P maintains an 8-bit random salt S used for generating its fingerprint
F, and the system uses a hash function H.
• The fingerprint of a password is computed as F = H(P) ⊕ S and is stored in their password
fingerprint file (essentially their version of /etc/shadow) along with the username and S for
that user, where ⊕ is the bitwise XOR operator.
• When an employee attempts to log in with a password P′, the system verifies the password
by computing H(P′) ⊕ S, where S is the salt for that user in their password fingerprint file,
and comparing the result with the stored fingerprint F.
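To make the data flow of this scheme concrete, here is a minimal Python sketch of fingerprint generation and verification. It is only an illustration: the 8-bit hash is modelled as a CRC-8 with polynomial 0x07, which is an assumption (the scheme description does not name a polynomial), and the function names make_entry and verify are hypothetical.

```python
import secrets


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8. The polynomial 0x07 is an assumption for illustration."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_entry(password: str):
    """Return (salt, fingerprint) where fingerprint = H(P) XOR S, with an 8-bit S."""
    salt = secrets.randbelow(256)
    return salt, crc8(password.encode()) ^ salt


def verify(password: str, salt: int, fingerprint: int) -> bool:
    """Recompute H(P') XOR S and compare it with the stored fingerprint."""
    return (crc8(password.encode()) ^ salt) == fingerprint


salt, fingerprint = make_entry("hunter2")
print(verify("hunter2", salt, fingerprint))  # True
```

Writing the scheme out this way makes it easy to experiment with what an attacker who obtains the fingerprint file can compute from the stored values.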
1. [3 marks] Is this scheme secure? If no, what attacks are they currently susceptible to?
2. [2 marks] How can they improve their scheme without changing the underlying hash
function?
Roger Pollock, CIIS’s resident cryptographer, realizes during the audit that the hash function (H)
they have been using for their password authentication is actually just an 8-bit CRC (Cyclic
Redundancy Check), and is furious at this oversight on the company’s part.
Roger: “We absolutely need to use a Cryptographic Hash Function for password authentication,
since they provide the following desirable properties:
• Pre-image resistance: Given a hash value h, it should be difficult to find any message m
such that H(m) = h.
• Second pre-image resistance: Given an input m1, it should be difficult to find a different
input m2 such that H(m1) = H(m2).
• Collision resistance: It should be difficult to find two different messages m1 and m2 such
that H(m1) = H(m2). [1]
So please make sure we use a strong cryptographic hash function.”
3. [3 marks] Which of the above listed desirable properties of a cryptographic hash function
does an 8-bit CRC have?
4. [3 marks] Erythrina, the CIIS employee responsible for re-implementing the password
authentication module, remembered a cryptographic hash function from a security course she
took as an undergraduate student about a decade ago. Here is a hash of a password
she deems very secure, which she generated using the hash function she remembered (the
hash value below is before the salt is added to it):
EA0C04513C32717F3A09FF7B1FA882C4D8424B2A
Name and justify a candidate hash function that could have produced this hash. What is the
password that hashes to that value, and how did you determine it?
5. [2 marks] Propose an alternate hash function that could provide better security properties
and justify your choice.
1.3 [10 marks] Firing up the Firewall
After a recent security breach and the loss of several confidential files, CIIS has decided to set up its
firewall again. You are tasked with this ordeal alongside the current network security expert Lana
Kane. CIIS owns the IP address range 17.27.13.0/25. The following are the network functionalities
that CIIS requires for its day-to-day operations:
• All employees of CIIS should be able to browse the internet from within their network (i.e.
browse all HTTP and HTTPS web pages).
[1] To clarify, the difference between second pre-image resistance and collision resistance is that the second
pre-image resistance clause starts with a fixed m1 and states that it should then be hard to find an m2 that hashes
to the same value as m1. The collision resistance property, in contrast, states that it should be hard to find any
arbitrary m1 and m2 that hash to the same value, which is clearly a stronger property to ask for. Hence second
pre-image resistance is also often referred to as ‘weak collision resistance’.
• Their public webpage which is hosted on an internal server (with the IP address 17.27.13.7
and served with HTTPS) must be accessible on the internet.
• Employees should be able to ssh into their work devices in the company network from
anywhere in the world.
• CIIS only trusts a special DNS server (located at the IP address 33.99.22.101) hosted by
an allied organization to handle all of its DNS lookups. This DNS server is unique in that
it serves requests on port 1551 (normally DNS servers serve requests on port 53) and also
expects the clients to send these requests from the ports in range 5000 to 5100.
• CIIS also maintains an IRC server (on port 3223 of a server with IP address 17.27.13.17)
which is meant to facilitate communications of their covert agent (with the IP address 9.19.11.217)
with the rest of the organization.
1. [2 marks] Lana Kane is of the opinion that they should reinstate a deny-list with the list of
all known malicious IP addresses along with the source IP address of the recent breach to
protect against future attacks. Is this a good defense strategy? Why or why not?
2. [2 marks] While configuring the firewall, you notice a series of IP packets from outside the
company network that have their source IP addresses as 32.23.11.17. What kind of an attack
is this? What type of firewall can be used to defend against it?
3. [6 marks] Configure the firewall by adding the required rules to meet the aforementioned
requirements of CIIS. Rules must include the following:
• DROP or ALLOW
• Source IP Address(es)
• Destination IP Address(es)
• Source Port(s)
• Destination Port(s)
• TCP or UDP or BOTH
Here is an example rule to allow access to HTTP pages from a server with IP address 5.5.5.5:
ALLOW 5.5.5.5 => 32.23.11.0/25 FROM PORT 80 to all BY TCP
HINTS:
• CIDR Notation may be helpful for this portion of the assignment.
• Some requirements may need more than one rule.
• Ports can be specified as a singular value, range, as a set, or as ‘all’ as seen in the
example above.
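As a quick way to check what a CIDR block covers, Python's standard ipaddress module can be used. The sketch below simply inspects the 17.27.13.0/25 range mentioned earlier; the address 17.27.13.200 is an arbitrary example chosen to fall outside that range.

```python
import ipaddress

# The range owned by CIIS, written in CIDR notation.
network = ipaddress.ip_network("17.27.13.0/25")

print(network.num_addresses)                             # 128
print(network[0], network[-1])                           # 17.27.13.0 17.27.13.127
print(ipaddress.ip_address("17.27.13.7") in network)     # True
print(ipaddress.ip_address("17.27.13.200") in network)   # False
```

A /25 prefix fixes the first 25 bits of the address, leaving 7 host bits and therefore 128 addresses.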
2 Programming Question [47 marks]
In this part of the assignment, you will write some software that interacts with real-world networking
technologies. The goal of this section is to introduce you to the details of network security, as
well as some specific attacks.
This assignment typically involves a substantial amount of programming. For this reason,
we suggest that you start working on your solutions well in advance of the deadline.
An incredibly brief networking primer:
Information is sent across the network in packets—small units of information. Packets contain
multiple layers of information. Each layer serves a different conceptual purpose. For example, a
typical packet containing data as part of a TCP connection contains the following layers:
1. Ethernet Layer: This layer contains the source and destination MAC addresses, which are
used for sending data between machines on a local network segment.
2. Internet Protocol (IP) Layer: This layer contains source and destination IP addresses, which
are used for routing packets between networks.
3. Transmission Control Protocol (TCP) Layer: This layer contains source and destination port
numbers, packet type flags, and connection state information. This information is used for
creating the concept of stateful “connections” on the packet-based network, and for differentiating
services on the recipient machine.
Each layer typically consists of some headers containing information specific to that layer, an
integer specifying the type of the next layer, and then the next layer. Some layers, such as Ethernet,
have headers with known fixed sizes. Other layers may contain the header and content length as
part of the packet. The TCP and UDP headers do not specify the type of data that they contain;
you should guess the format of the data based on the port numbers. Most application protocols
have well-known port numbers (e.g., DNS on port 53, HTTP on port 80, NTP on port 123).
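The layering described above can be sketched in Python with the standard struct module. This is a minimal illustration that assumes Ethernet II framing and an IPv4/TCP packet; the function names are hypothetical, and the sample frame at the end is hand-built for demonstration rather than taken from any capture.

```python
import socket
import struct


def parse_ethernet(frame: bytes):
    """Split an Ethernet II frame into (dst_mac, src_mac, ethertype, payload)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst.hex(":"), src.hex(":"), ethertype, frame[14:]


def parse_ipv4(packet: bytes):
    """Return (protocol, src_ip, dst_ip, payload) for an IPv4 packet."""
    header_len = (packet[0] & 0x0F) * 4       # IHL is counted in 32-bit words
    total_len = struct.unpack("!H", packet[2:4])[0]
    protocol = packet[9]                      # 1 = ICMP, 6 = TCP, 17 = UDP
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    return protocol, src, dst, packet[header_len:total_len]


def parse_tcp(segment: bytes):
    """Return (src_port, dst_port, flags, payload) for a TCP segment."""
    src_port, dst_port = struct.unpack("!HH", segment[:4])
    data_offset = (segment[12] >> 4) * 4      # TCP header length in bytes
    flags = segment[13]                       # e.g. 0x02 = SYN, 0x10 = ACK
    return src_port, dst_port, flags, segment[data_offset:]


# Hand-built demonstration frame: a TCP SYN from 10.0.0.5:40000 to 8.8.8.8:80.
tcp = struct.pack("!HHIIBBHHH", 40000, 80, 0, 0, 5 << 4, 0x02, 1024, 0, 0)
ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(tcp), 0, 0, 64, 6, 0,
                 socket.inet_aton("10.0.0.5"), socket.inet_aton("8.8.8.8"))
frame = bytes(6) + bytes(6) + struct.pack("!H", 0x0800) + ip + tcp

_, _, ethertype, l3 = parse_ethernet(frame)
protocol, src_ip, dst_ip, l4 = parse_ipv4(l3)
src_port, dst_port, flags, data = parse_tcp(l4)
print(hex(ethertype), protocol, src_ip, dst_ip, src_port, dst_port, hex(flags))
```

Each parser returns the remaining payload, so the next layer is parsed from exactly the bytes the previous layer delimits. The EtherType value 0x0800 identifies IPv4, and 0x0806 identifies ARP.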
For this assignment, most questions involve packets that contain information inside an IP layer.
Important protocols include ICMP, which is often used for network troubleshooting (e.g., “pings”),
UDP, which is a simple connectionless protocol for efficiently sending self-contained pieces of
information, and TCP, which is a protocol that provides the notion of streaming connections. You will
also need to work with the DNS, HTTP, and ARP protocols. DNS is a protocol that typically operates
over UDP. It allows machines to ask questions about domain names (e.g., example.com) by
retrieving records (e.g., the A, or “address”, record) containing values set by the domain’s owner
(e.g., the A record for example.com is "93.184.216.34"). HTTP is the protocol that is
used by the worldwide web, allowing web browsers to request web pages from web servers. ARP
is a protocol that deals with low-level LAN infrastructure such as mapping IP addresses to MAC
addresses. We recommend reading Section 10.6 of the van Oorschot textbook to familiarize
yourself with the terminology.
The Setting
Your security consultancy has been approached by a large technology business, Initrode. Initrode
has recently been the victim of several cyberattacks that have evaded their firewalls. They have
hired you to improve their security systems so that they can identify attacks as they occur.
Initrode has purchased a powerful new internal switch that will host an intrusion detection system
(IDS). This IDS is network-based, and it will silently monitor all network traffic for known attacks
based on provided signatures. Your job is to implement the application that runs on this machine.
Your application will receive suspicious packet capture files from a network monitoring program
and output any detected attacks, as well as some details about them. The output from your program
will be used by other scripts to send alerts to the network administrators, or to dynamically add
rules to the firewall, as deemed appropriate.
The Initrode network has the following structure:
[Figure: Initrode network diagram. Components shown: Internet, Firewall, Router, Switch, and IDS. Labels: public IP address 8.5.4.92; private network 10.0.0.0/8.]
The public IP address of the Initrode network is 8.5.4.92. Within the corporate LAN, machines
are assigned IP addresses in the 10.0.0.0/8 range (in CIDR notation). The router connecting the
Initrode network to the internet uses network address translation (NAT) to move packets between
these networks, much like the consumer routers found in homes. Initrode does not permit IPv6
packets to be transmitted, so you will only need to parse IPv4 packets. Note that the switch (and
thus the IDS) is placed on the LAN side of the router and can thus silently observe all internal
traffic.
Your Task
The host machine for the IDS will monitor network traffic using the popular tcpdump utility for
Unix-like operating systems. Another programmer has written software that causes suspicious
network traffic to be saved in pcap files—the file format used by tcpdump to save captured
packet sequences. Your program will be responsible for analyzing these pcap files and raising an
alarm if certain attacks are detected.
Your program must accept a single command-line argument: a file path for a pcap file. You will
need to read the packets from this file to determine if any attacks have occurred. You may assume
that the packets in the pcap file are complete and sorted by timestamp. When an attack is detected,
you will need to print an alert to the standard output stream. The output format for question 1 is
indicated in the question. For questions 2–7, your alert messages should conform exactly to the
following format:
[attack]: details
In your output, both attack and details should be replaced with the information specified
in the relevant section of the assignment. The output from your program will be processed by a
series of scripts written by another programmer. These scripts will react to your alerts in a manner
determined by the network administrators.
Outline: The following sections describe the attacks that you should detect. Marks for the programming
part are allocated as follows:
• Detections: Detect a variety of network-based attacks
– [6 marks] Anomaly detection: Count packets and sizes
– [6 marks] Spoofed packets: Detect packets with clearly spoofed addresses
– [6 marks] ARP spoofing: Detect ARP cache poisoning attacks
– [6 marks] Unauthorized servers: Detect LAN-based servers
– [6 marks] IIS worms: Detect the presence of famous worms
– [6 marks] Sinkhole lookups: Detect DNS queries for sinkholed domains
– [6 marks] NTP reflection DDoS: Detect amplified denial-of-service attacks
• [5 marks] Output requirements: Use correct output formatting
The remainder of the assignment describes each part in detail. Please go through the References
section, even if you are familiar with networks, as it will help you develop a strategy to solve each
question and point you to necessary references that specify details of attacks. We strongly recommend
that you install the Wireshark application to help you with this assignment, as described in
that section. Before you start coding in your language of choice, go through the programming
languages section and familiarize yourself with the testing and marking procedures. You will
find test files, project skeletons, and additional information on LEARN. Refer to the start of this
document for what to hand in for each deadline.
2.1 [6 marks] Anomaly detection
Despite our best efforts, sometimes attacks can pass by our IDS undetected. However, it is sometimes
possible to detect that something is unusual, even if we’re not sure what the problem is.
Anomaly detection is the process of detecting when a system, such as a network, is behaving
unusually. One simple way to do this for a network is to detect when there is more traffic than usual
for the time of day or day of the week; this may indicate that a virus is performing a denial-of-service
attack, or that corporate secrets are being stolen.
Luckily, the IDS machine already has some scripts to determine if a given amount of bandwidth is
unusual; all your program needs to do is report how many packets are contained in the input file,
as well as the sum of the packet sizes, in bytes. Given this information, the external scripts will
determine if there is an unusual amount of suspicious activity.
This task does not use the same output format as other questions. Your program should output the
following line to the standard output stream after processing the pcap file:
Analyzed packet-count packets, size bytes
In your output, packet-count should be replaced with the number of packets contained in the
pcap file, and size should be replaced with the sum of the packet sizes, in bytes. Note that you
should not include the size of the pcap headers, which contain information such as timestamps,
in your computation of size—only the sizes of the captured packets should be summed.
• Sample input for testing: q1-anomaly.pcap
• Sample output for testing: q1-anomaly-output.log
In this assignment, you may assume that all capture files provide the complete contents of every
packet; no packets will be truncated (i.e., caplen == len for every packet in the file).
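The counting described in this section can be sketched without any third-party library by reading the classic pcap layout directly: a 24-byte global header followed by one 16-byte record header per packet. The sketch below is an illustration under the assumption that the input uses this classic layout (not pcapng); the function name summarize_pcap is hypothetical, and the demonstration data is built in memory rather than read from the sample file.

```python
import io
import struct


def summarize_pcap(stream):
    """Return (packet_count, total_bytes) for a classic pcap byte stream."""
    global_header = stream.read(24)
    magic = global_header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"   # little-endian file (microsecond or nanosecond timestamps)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"   # big-endian file
    else:
        raise ValueError("not a classic pcap file")

    count = total = 0
    while True:
        record = stream.read(16)
        if len(record) < 16:
            break
        # Record header: timestamp seconds, timestamp fraction, caplen, len.
        _sec, _frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        stream.seek(incl_len, io.SEEK_CUR)   # skip over the packet bytes
        count += 1
        total += incl_len                    # headers are not counted
    return count, total


# In-memory demonstration: two dummy packets of 60 and 42 bytes.
demo = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
for payload in (b"\x00" * 60, b"\x00" * 42):
    demo += struct.pack("<IIII", 0, 0, len(payload), len(payload)) + payload

count, size = summarize_pcap(io.BytesIO(demo))
print(f"Analyzed {count} packets, {size} bytes")  # Analyzed 2 packets, 102 bytes
```

For a file on disk, the same function could be called on a handle opened in binary mode, for example `open(path, "rb")`. Because caplen equals len for every packet in this assignment, summing either field gives the same total.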
2.2 [6 marks] Spoofed packets
Packets with spoofed sources or destinations are usually part of an attack, such as a distributed
denial-of-service attack, and are unexpected in Initrode’s network. In fact, this problem is common
enough to prompt a best current practice entry from the Internet Engineering Task Force (BCP 38).
Network Ingress Filtering restricts outgoing network traffic from invalid source IP addresses.
Your IDS can monitor all of the network traffic on the corporate LAN, where local computers are
within the 10.0.0.0/8 IP range. Every packet visible to your IDS is expected to be coming
from or traveling to one of these local machines. Write a rule for your IDS that detects packets that
do not satisfy these constraints (i.e., packets that clearly must contain spoofed information).
• Value for attack in your output: Spoofed IP address
• Value for details in your output: src:source, dst:destination where source
and destination are the IP addresses from the packet.
• Sample input for testing: q2-spoofed.pcap
• Expected output for testing: q2-spoofed-output.log
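The rule above can be sketched as a small predicate over the source and destination addresses of each IPv4 packet. This is an illustrative sketch only: the function name spoof_alert is hypothetical, and it assumes the addresses have already been extracted from the packet as dotted-quad strings.

```python
import ipaddress

LAN = ipaddress.ip_network("10.0.0.0/8")


def spoof_alert(src: str, dst: str):
    """Return an alert line if neither endpoint is on the LAN, otherwise None."""
    if ipaddress.ip_address(src) in LAN or ipaddress.ip_address(dst) in LAN:
        return None
    return f"[Spoofed IP address]: src:{src}, dst:{dst}"


print(spoof_alert("10.1.2.3", "8.8.8.8"))      # None
print(spoof_alert("192.0.2.1", "8.8.8.8"))     # [Spoofed IP address]: src:192.0.2.1, dst:8.8.8.8
```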
2.3 [6 marks] ARP Spoofing
How ARP works: On a local network, machines typically communicate by transmitting Ethernet
packets. Ethernet packets are sent to and from MAC addresses that (in theory) uniquely identify
particular networking hardware. The Ethernet frame wraps data for a higher-level protocol, such
as the Internet Protocol (IP). IP packets are sent to and from IP addresses. However, since all IP
information must be wrapped inside of Ethernet frames, sending an IP packet to another machine
requires addressing the packet to a particular MAC address. To communicate with other machines
on the LAN, each computer maintains a dynamic table in memory that maps IP addresses to MAC
addresses. This table is populated using the Address Resolution Protocol (ARP).
ARP packets are wrapped inside Ethernet frames, just like any other packets sent through the LAN.
When Alice wants to talk to Bob, Alice first sends an ARP packet to a special “broadcast” address
asking for the MAC address for Bob. The network switch ensures that this packet is delivered
to all machines on the LAN. When Bob’s machine receives the packet, it responds with an ARP
response packet indicating Bob’s MAC address. Alice then updates her ARP table to map Bob’s IP
to his MAC address. All future packets sent from Alice to Bob can now be addressed to the proper
MAC address.
ARP spoofing: ARP spoofing is an attack where a machine on the LAN maliciously manipulates
ARP tables in order to insert itself as a “man in the middle”. Mallory can send an ARP response
to Alice saying that Bob is located at Mallory’s MAC address. She can then send another ARP
response to Bob saying that Alice is located at Mallory’s MAC address. Now all communications
from Alice to Bob are covertly redirected through Mallory. If Bob is the gateway providing access
to the Internet, then Mallory can effectively monitor and modify all of Alice’s online communications.
Your IDS should detect possible ARP spoofing by maintaining its own mapping from IP addresses
to MAC addresses. Whenever you observe an ARP reply packet, update your table to record the
new mapping for the source address. If any existing entry in the table is ever changed (i.e., if a
given IP address was previously mapped to an old MAC address A, and is now updated to a new
MAC address B), then an alert should be raised. The network administrator should investigate
these events in order to identify if the action was legitimate (e.g., if a user replaced a network card
in their computer) or malicious.
For the purposes of this assignment, you only need to record mappings that appear as the source
MAC/IPv4 address in ARP reply packets.
• Value for attack in your output: Potential ARP spoofing
• Value for details in your output: ip:ip, old:oldmac, new:newmac where ip is
the IP address, oldmac is the previous MAC address, and newmac is the new MAC address.
• Sample input for testing: q3-arp.pcap
• Expected output for testing: q3-arp-output.log
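The table-based check described above can be sketched with a plain dictionary. This is a minimal illustration: the function name check_arp_reply is hypothetical, and it assumes the sender IP and MAC have already been extracted from an ARP reply as strings. The addresses in the example are made up.

```python
def check_arp_reply(table: dict, ip: str, mac: str):
    """Record the sender mapping from an ARP reply; return an alert if it changed."""
    old = table.get(ip)
    table[ip] = mac                      # always keep the most recent mapping
    if old is not None and old != mac:
        return f"[Potential ARP spoofing]: ip:{ip}, old:{old}, new:{mac}"
    return None


table = {}
print(check_arp_reply(table, "10.0.0.1", "aa:aa:aa:aa:aa:aa"))  # None (first sighting)
print(check_arp_reply(table, "10.0.0.1", "aa:aa:aa:aa:aa:aa"))  # None (unchanged)
print(check_arp_reply(table, "10.0.0.1", "bb:bb:bb:bb:bb:bb"))
# [Potential ARP spoofing]: ip:10.0.0.1, old:aa:aa:aa:aa:aa:aa, new:bb:bb:bb:bb:bb:bb
```

Updating the table before returning means a later change back to the original MAC address would raise a second alert, which matches the description of alerting whenever an existing entry changes.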
2.4 [6 marks] Unauthorized servers
Many computer viruses allow the virus author to issue commands to the infected system over the
Internet. A very simple approach for facilitating this remote control is to set up a server on the
infected machine. The virus author can then connect to the server and issue commands. Initrode
has a corporate policy that prohibits any remotely accessible servers on machines in the LAN, as
they are intended to be workstations used by employees. (Here, the “server” does not need to
be an application-level server, e.g. an HTTP server, but rather just a machine that accepts TCP
connections via sockets.)
Write two rules for your IDS:
1. Detect when an external computer (i.e., one outside of the 10.0.0.0/8 IP range) attempts
to connect to a server running within the LAN. These requests should be blocked by the
firewall, so if they are visible to your IDS then this indicates that the firewall has failed or
has been somehow subverted.
• Value for attack in your output: Attempted server connection
2. Detect when a server running within the LAN accepts a connection from an external com-
puter. Note that it is not necessary for the connection to be established; you should raise an
alert as soon as a machine on the LAN expresses the intent to accept a connection from an
external machine.
• Value for attack in your output: Accepted server connection
In both cases, the value for details in your output should be:
rem:remote, srv:server, port:port where remote is the IP address of the external
computer, server is the IP address of the LAN-based server, and port is the port that the
external computer attempted to connect to.
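One possible way to express these two rules is in terms of the TCP handshake flags. The sketch below assumes that a connection attempt appears as a segment with SYN set and ACK clear, and that the intent to accept appears as a segment with both SYN and ACK set; this interpretation, like the function name server_alert, is an assumption and not something the assignment text specifies.

```python
import ipaddress

LAN = ipaddress.ip_network("10.0.0.0/8")
SYN, ACK = 0x02, 0x10   # bit masks within the TCP flags byte


def server_alert(src, dst, src_port, dst_port, flags):
    """Return an alert line for an inbound SYN or an outbound SYN-ACK, else None."""
    src_local = ipaddress.ip_address(src) in LAN
    dst_local = ipaddress.ip_address(dst) in LAN
    syn, ack = bool(flags & SYN), bool(flags & ACK)
    if syn and not ack and not src_local and dst_local:
        # External machine opening a connection to a LAN machine.
        return f"[Attempted server connection]: rem:{src}, srv:{dst}, port:{dst_port}"
    if syn and ack and src_local and not dst_local:
        # LAN machine replying; its own (source) port is the server port.
        return f"[Accepted server connection]: rem:{dst}, srv:{src}, port:{src_port}"
    return None


print(server_alert("8.8.4.4", "10.1.2.3", 51000, 22, SYN))
# [Attempted server connection]: rem:8.8.4.4, srv:10.1.2.3, port:22
print(server_alert("10.1.2.3", "8.8.4.4", 22, 51000, SYN | ACK))
# [Accepted server connection]: rem:8.8.4.4, srv:10.1.2.3, port:22
```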
2.5 [6 marks] IIS worms
Several of the most iconic computer worms in history (Code Red, Code Red II, the Sadmind
worm, and Nimda) all attacked Microsoft IIS web servers by using a directory traversal
vulnerability caused by incorrect parsing of unicode characters. By sending requests for
maliciously-crafted pages, the worms could cause the web servers to execute programs anywhere on the
server’s hard drive. Specifically, it was possible to cause the servers to execute the Windows
command-line interpreter with arguments specifying a command to execute.
While these vulnerabilities were patched by Microsoft over a decade ago, abandoned and infected
machines around the world continue to scan the Internet to this day, looking for potential targets.
Initrode has noticed that one of their employees is constantly being infected by these worms due
to carelessly downloading vintage video games from disreputable websites. Unfortunately, the employee
in question is the CEO and founder of the company and cannot be reprimanded for political
reasons. Instead, you will need to write an IDS rule to detect when their computer has been infected
so that the IT department can quietly remove the worm.
More information about the vulnerability exploited by these worms is available from the SANS
Institute.
Write a rule for your IDS that detects malicious web requests attempting to exploit these unicode
vulnerabilities. Note that you should not attempt to detect a specific worm such as Nimda;
instead, you should detect any of these directory traversal attacks. Your solution for this task should
examine each packet and perform the following steps:
1. Determine if the packet is an IP packet. If so, continue.
2. Determine if the packet is a TCP packet. If so, continue.
3. Determine if the packet is likely to contain an HTTP request (hint: check the destination
port). If so, continue.
4. Parse the TCP packet contents to locate the page that has been requested from the server. If
you found a web request, continue.
5. Check the page for the malicious unicode characters mentioned in the SANS article. If
found, raise an alert.
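Steps 4 and 5 above can be sketched as follows, assuming the TCP payload has already been extracted as bytes. The pattern list here is a placeholder containing a single well-known overlong encoding of "/" ("%c0%af"); it is an assumption for illustration, and the actual set of patterns has to be taken from the SANS article. The function names are hypothetical.

```python
METHODS = ("GET", "POST", "HEAD", "PUT", "DELETE", "OPTIONS")

# Placeholder only: the full pattern list must come from the SANS article.
PATTERNS = ("%c0%af",)


def requested_page(payload: bytes):
    """Return the request target from an HTTP request line, or None."""
    line = payload.split(b"\r\n", 1)[0].decode("latin-1")
    parts = line.split(" ")
    if len(parts) >= 2 and parts[0] in METHODS:
        return parts[1]
    return None


def is_unicode_exploit(payload: bytes) -> bool:
    """Check the requested page (not the worm payload) for known patterns."""
    page = requested_page(payload)
    if page is None:
        return False
    lowered = page.lower()               # percent-escapes may use either case
    return any(pattern in lowered for pattern in PATTERNS)


request = b"GET /scripts/..%c0%af../winnt/system32/cmd.exe?/c+dir HTTP/1.0\r\n\r\n"
print(is_unicode_exploit(request))                              # True
print(is_unicode_exploit(b"GET /index.html HTTP/1.1\r\n\r\n"))  # False
```

Decoding with latin-1 never fails on arbitrary bytes, which keeps the parser from raising on non-HTTP traffic that happens to use the same port.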
Your solution should be able to detect all of the sensible examples provided in the table in the
SANS article irrespective of the worm’s payload. (As mentioned in the article, table entries 42,
44, 45, 47, 52, 53, 54, and 55 are not valid attacks and do not need to be detected. You should be
able to detect all of the other table entries.) The sample file provided for this task contains all of
these examples; your IDS should produce 62 alerts for this input, as in the sample output. Hint:
many of the examples in the article use the same unicode exploit pattern; you will only need to
look for approximately 15 patterns to detect all of the valid cases.
You are not required to detect attacks that take place over TLS connections (i.e., HTTPS) [2]. You
should not attempt to match the payload of the worms (e.g., executing cmd.exe); instead, detect
the unicode attacks. Your IDS should detect attacks that are made via all valid HTTP 1.1 request
types (GET, POST, HEAD, PUT, DELETE, and OPTIONS).
[2] If you were able to reliably detect attacks within encrypted web connections in the given setting, then you would
have broken TLS. If you have broken TLS, please let us know!
• Value for attack in your output: Unicode IIS exploit
• Value for details in your output: src:source, dst:destination where source
and destination are the source and destination IP addresses from the packet in the pcap
file.
• Sample input for testing: q5-unicode.pcap
• Expected output for testing: q5-unicode-output.log
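Steps 4 and 5 of the procedure above can be sketched as follows, assuming Python 3 and a TCP payload that has already been extracted as a bytes object (the destination port check from step 3 is left out). The names requested_page, is_unicode_exploit, and SUSPICIOUS_PATTERNS are hypothetical. The pattern list is deliberately incomplete: it holds only two well-known overlong UTF-8 encodings of "/" and "\", and the full set has to be taken from the table in the SANS article. Matching case-insensitively is also an assumption.

```python
# HTTP request methods that the assignment asks the rule to cover.
HTTP_METHODS = (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS")

# ILLUSTRATIVE AND INCOMPLETE (assumption): percent-encoded overlong UTF-8
# forms of "/" and "\". Take the complete list from the SANS article table.
SUSPICIOUS_PATTERNS = (b"%c0%af", b"%c1%9c")


def requested_page(tcp_payload):
    """Return the request target from an HTTP request line, or None."""
    first_line = tcp_payload.split(b"\r\n", 1)[0]
    parts = first_line.split(b" ")
    if len(parts) != 3:
        return None
    method, target, version = parts
    if method not in HTTP_METHODS or not version.startswith(b"HTTP/"):
        return None
    return target


def is_unicode_exploit(tcp_payload):
    """True if the requested page contains one of the listed patterns."""
    page = requested_page(tcp_payload)
    if page is None:
        return False
    lowered = page.lower()  # hex digits in percent-escapes may be upper case
    return any(pattern in lowered for pattern in SUSPICIOUS_PATTERNS)
```

Only the requested page is inspected, never the rest of the payload, so the check does not depend on what the worm tries to execute.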
2.6 [6 marks] Sinkhole Lookups
Since connections to servers running within the LAN are easily detectable, many viruses receive
commands by making outbound connections to the Internet instead. For example, a virus might
connect to a website operated by the virus author in order to download new commands. In order
to dismantle botnets (networks of infected machines), Internet authorities will often collaborate to
disable these “command and control” servers by seizing control of the domain names and redi-
recting them to harmless IP addresses. These harmless servers, called sinkholes, do not return
any commands to the infected machines, effectively disabling the botnet. They can also log the
incoming connections in order to gauge the size of the former botnet, and possibly notify owners
of infected machines.
In order to detect infected machines on the Initrode network, your IDS should identify DNS re-
quests that resolve to known sinkhole IP addresses. Your IDS should read a list of IP addresses
from sinkholes.txt, which will be located in the current working directory. Each line of this
file contains the plaintext form of an IP address of a known sinkhole. If your IDS observes any
DNS responses specifying any of these IP addresses as the A record for a domain, it should raise
an alert.
You may assume that:
• DNS takes place over UDP only.
• Each DNS request contains only a single question, and that question is for an A record.
Output format:
• Value for attack in your output: Sinkhole lookup
• Value for details in your output: src:source, host:host, ip:ip where source
is the source IP address of the machine performing the DNS query, host is the hostname
being queried, and ip is the IP address of the sinkhole.
• Sample input for testing: q6-sinkholes.pcap
• Expected output for testing: q6-sinkholes-output.log
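The sinkhole check described above could look roughly like the sketch below. It assumes Python 3 (indexing a bytes object yields integers) and a UDP payload that has already been extracted from a packet with source port 53. The function names load_sinkholes, read_name, and sinkhole_answers are hypothetical. Reporting the name from the question section as host is an interpretation of "the hostname being queried", not something the handout spells out.

```python
import socket
import struct


def load_sinkholes(path="sinkholes.txt"):
    """Read one IP address per line, ignoring blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def read_name(message, offset):
    """Decode a (possibly compressed) DNS name; return (name, next_offset)."""
    labels = []
    next_offset = None  # where parsing resumes after the first pointer
    hops = 0
    while True:
        length = message[offset]
        if length & 0xC0 == 0xC0:  # two-byte compression pointer
            if next_offset is None:
                next_offset = offset + 2
            offset = ((length & 0x3F) << 8) | message[offset + 1]
            hops += 1
            if hops > 32:  # guard against pointer loops
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:  # the root label ends the name
            break
        labels.append(message[offset:offset + length].decode("ascii", "replace"))
        offset += length
    return ".".join(labels), (offset if next_offset is None else next_offset)


def sinkhole_answers(dns_message, sinkholes):
    """Return (host, ip) for every A record that points at a sinkhole."""
    _, flags, qdcount, ancount, _, _ = struct.unpack("!6H", dns_message[:12])
    if not flags & 0x8000:  # QR bit clear: this is a query, not a response
        return []
    offset = 12
    host = ""
    for _ in range(qdcount):
        host, offset = read_name(dns_message, offset)
        offset += 4  # skip QTYPE and QCLASS
    hits = []
    for _ in range(ancount):
        _, offset = read_name(dns_message, offset)
        rtype, rclass, _, rdlength = struct.unpack(
            "!HHIH", dns_message[offset:offset + 10])
        offset += 10
        rdata = dns_message[offset:offset + rdlength]
        offset += rdlength
        if rtype == 1 and rclass == 1 and rdlength == 4:  # A record, class IN
            ip = socket.inet_ntoa(rdata)
            if ip in sinkholes:
                hits.append((host, ip))
    return hits
```

Because the alert is raised on the response, the machine that performed the query is presumably the destination IP address of that response packet.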
2.7 [6 marks] NTP reflection DDoS attacks
A common way to attack the availability of an Internet resource is to perform a Distributed Denial-
of-Service (DDoS) attack. In a DDoS attack, a large set of computers sends traffic to the victim
machine as quickly as possible. The traffic in question might be packets full of meaningless data, or
actual requests for the service that the victim provides. The large amount of bandwidth overwhelms
the victim’s service capacity, preventing them from responding to legitimate requests. In this
way, attackers controlling large networks of infected machines (“botnets”) can remove others from
the Internet. Common uses of DDoS attacks include politically-motivated attacks on websites,
extortion of gambling websites before significant sporting events occur, and disconnecting players
from online video games.
The Network Time Protocol (NTP) allows computers to synchronize their system clocks over the
Internet. A few years ago, attackers discovered that many NTP servers support a command that
returns a lot of data in response to a small query. Moreover, NTP requests are delivered using UDP
packets—they do not require connections to be established, as in TCP-based protocols. These
factors allow innocent NTP servers to be used to launch amplified DDoS attacks.
To perform the attack, the attacker sends a MON_GETLIST_1 request to multiple NTP servers.
The UDP packets containing these requests have a forged source address—they appear to originate
from the victim of the attack. The NTP servers then dutifully send lists of their last 600 clients
to the victim, who they believe to be the source of the requests. These responses are typically 50
times larger than the request, resulting in a massive amplification of bandwidth and thus a more
powerful DDoS attack.
Note that these attacks would be identified by your rule that detects packets with obviously spoofed
source addresses. Nonetheless, it is often useful to have your IDS produce more specific alerts.
Initrode has no need to allow outgoing MON_GETLIST_1 requests, so the presence of one indicates
that there is likely an infected machine on the LAN. When observing such a request, your IDS
should output an NTP DDoS alert (in addition to the spoofed packet alert introduced earlier in the
assignment).
• Value for attack in your output: NTP DDoS
• Value for details in your output: vic:victim, srv:server where victim is the
IP address of the intended victim and server is the IP address of the NTP server.
• Sample input for testing: q7-ntp.pcap
• Expected output for testing: q7-ntp-output.log
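One possible way to recognize the request is sketched below, assuming Python 3 and an already extracted UDP payload. The layout used here reflects the ntpd "mode 7" private request format as commonly documented: the top bit of the first byte marks responses, the low three bits hold the mode, and the fourth byte holds the request code, with 42 being MON_GETLIST_1. Treat these offsets as assumptions and confirm them against q7-ntp.pcap in Wireshark. The name is_monlist_request is hypothetical.

```python
NTP_PORT = 123
MODE_PRIVATE = 7         # mode 7: implementation-specific "private" requests
REQ_MON_GETLIST_1 = 42   # request code for MON_GETLIST_1 (assumed, see above)


def is_monlist_request(udp_dst_port, udp_payload):
    """True if the UDP payload looks like a mode 7 MON_GETLIST_1 request."""
    if udp_dst_port != NTP_PORT or len(udp_payload) < 4:
        return False
    first = udp_payload[0]
    is_response = bool(first & 0x80)  # top bit is set on responses
    mode = first & 0x07               # low three bits carry the mode
    request_code = udp_payload[3]
    return (not is_response
            and mode == MODE_PRIVATE
            and request_code == REQ_MON_GETLIST_1)
```

For a matching packet, the forged source IP address is the victim and the destination IP address is the NTP server.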
2.8 [5 marks] Output requirements
The output format for question 1 is:
Analyzed packet-count packets, size bytes
The output format for questions 2–7 is:
[attack]: details
where both attack and details should be replaced with the information specified in the re-
spective question.
Ensure that your output conforms to the given format. In particular, ensure that there are no
differences between your output and the expected output for the sample files. Read the expected
alert formats carefully. Common mistakes that students have made in the past are:
• Incorrect labels (e.g., writing server instead of srv)
• Omitting commas and/or spaces between alert details
• Inserting spaces around, or completely omitting, the colons
Since we mark your output using an automated testing suite, simple discrepancies in your output
tend to lead to incorrect zero marks, needless remark requests, and loss of marks for the output
requirements. Be sure to double check your outputs carefully to avoid this hassle!
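Building every alert line through a single helper is one way to avoid the formatting mistakes listed above. The sketch below assumes Python 3; format_alert and format_details are hypothetical names, and the IP addresses in the usage note are made up for illustration.

```python
def format_details(pairs):
    """Join (label, value) pairs as 'label:value, label:value'."""
    return ", ".join("{}:{}".format(label, value) for label, value in pairs)


def format_alert(attack, details):
    """Render one alert line in the form '[attack]: details'."""
    return "[{}]: {}".format(attack, details)
```

For example, format_alert("NTP DDoS", format_details([("vic", "10.0.0.1"), ("srv", "192.0.2.7")])) yields [NTP DDoS]: vic:10.0.0.1, srv:192.0.2.7, with no spaces around the colons inside the details and a comma plus one space between them.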
2.9 Testing and Marking
Your program must be called ids. This single program should detect all of the different attacks.
It will be invoked in the following manner:
ids /path/to/capture.pcap
For each attack, a file containing an example of the attack has been provided. To test your IDS,
you can provide the path of the sample file as the first argument to your program. Do not attempt
to test your IDS by running actual attacks against systems that you do not own. Each of the
sample files comes with a text file containing the expected output. To ensure that your output is
the expected one for the problem, you can compare the attacks that your program detected against
the expected attacks using diff:
ids sample-file.pcap | diff - expected-output-file.log
If your IDS is working properly, the output from this command should be empty. However, you
should ensure that you are identifying attacks using the requested methodology (e.g., hard-coding
output for the sample files is not a valid solution). Upon submission, your program will be tested
using additional sample files. Your IDS should avoid raising false positives as we will be testing
it to ensure that it does not issue spurious alerts. Your IDS should not require Internet access to
download any libraries or complete any tasks; it is important that network-based IDS software does
not reveal its presence to attackers under any circumstances. For this reason, Initrode’s network
switch executes your IDS inside a virtual machine that does not have any network connectivity.
For marking, we will compile and execute your IDS in a virtual machine with no access to the
Internet. The following steps will be performed:
1. Your submission files will be extracted into the (initially empty) current working directory.
2. If Makefile is found, then make will be executed to compile your code.
3. sinkholes.txt will be copied into the current working directory, overwriting any exist-
ing version.
4. (For students writing their solutions in Go) If there are bin, pkg, and src subdirecto-
ries in the submission directory, and at least one .go file is found in the submission, then
gopacket will be copied into src and go install will be executed for the package.
5. ./ids path will be executed as a non-root user, where path is the absolute path to a
pcap-format packet capture file.
6. The output of step 5 will be compared to the expected output for the test case.
7. If there are more tests to run, go to step 3. Otherwise, derive a mark based on the program
outputs.
We will provide you with access to a system where you can submit your files to ensure that they
will compile successfully in the marking environment. We will make an announcement through
LEARN and/or email when this system becomes available. You will be expected to ensure that
your code compiles in this environment before your final submission.
2.10 Programming Languages
You may implement your solution in any of the languages supported by our marking system, which
include several of the most popular programming languages.
You are expected to use libpcap to parse the contents of the pcap files. For C and C++, you can
use libpcap directly. For other languages, designated wrapper libraries will be made available to
you. While it is very natural to parse packet contents in C and C++ due to native libpcap access
and pointer arithmetic, libraries for other languages may offer richer processing capabilities.
The marking system runs Ubuntu 16.04; your submission is expected to operate in this environ-
ment. The following table enumerates the supported languages, the available pcap libraries in the
marking environment, and, for interpreted languages, the shebang (the line starting with #!) that
you should include as the first line of your source file:
Language  Version     Shebang                   pcap Library
C         gcc 5.4.0   (Makefile)                libpcap
C++       g++ 5.4.0   (Makefile)                libpcap
Go        1.6.2       (Makefile)                gopacket
Python    2.7.12      #!/usr/bin/env python     Scapy & dpkt
Python    3.5.2       #!/usr/bin/env python3    Scapy-python3
You may not use any third-party libraries other than the pcap libraries explicitly mentioned in the
table, and the standard libraries for your chosen language. We do not guarantee support for parsing
libraries in other languages within our marking environment. (In line with the testing and marking
procedures, your IDS should not require Internet access to download any libraries or to complete
any tasks as your code will be executed inside a virtual machine that does not have any network
connectivity.)
On LEARN and/or Piazza, we have provided project skeletons for each of the supported languages;
these skeletons are known to compile in the marking environment. We highly recommend that you
use these files as the base for your implementation.
For solutions implemented in Go, your submission is extracted into the $GOPATH. Consequently,
you will need to have bin, pkg, and src subdirectories inside your submission. However, you
will still need to include a Makefile that copies your final executable to $GOPATH/ids. The
provided skeleton accomplishes all of this. It places your code in src/ids/. If the marking script
detects that your submission resembles a GOPATH, then it will copy and install the gopacket
library within it before running your code. This means that you do not need to include gopacket
with your submission.
2.11 Background & Hints
Writing solutions: To solve the problems in the programming part, you may find it helpful to
follow these general steps:
1. Determine what the question is asking you to detect. What networking protocols are in-
volved? What do these protocols accomplish in general, and how do they work at a high
level? The short textbook sections mentioned in the references will briefly describe each of
the protocols discussed above.
2. Determine what information you will need to check in the packet to identify the attack.
(You may assume that all servers listen on the default ports for their protocols.) What fields
will you need to check? What protocol layer contains the relevant fields? How can you
determine the byte offset for the field? Examine the sample files and locate information on
the web to help you. Installing Wireshark, as discussed below, will help you process the log
files manually and understand their structure. The Wikipedia links will be handy references
for the structure of packets, while you code your parser.
3. Write your implementation and test its correctness using the given sample files. Did you
detect the attack in the sample file? Is your solution likely to produce false positives (i.e., is your IDS
likely to encounter packets that would trigger your alert but are not attacks)?
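As an illustration of step 2, the sketch below locates a few fields by byte offset using only the Python 3 standard library. It assumes an untagged Ethernet II frame (14-byte link header) carrying IPv4, which should be checked against the sample captures. The function name parse_frame and the dictionary keys are hypothetical. The wrapper libraries listed for the marking environment can do this parsing for you; the point here is only to show where the offsets come from.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800
PROTO_TCP, PROTO_UDP = 6, 17


def parse_frame(frame):
    """Return IPv4 and transport fields, or None if not IPv4 TCP/UDP."""
    if len(frame) < 14 + 20:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = frame[14:]
    (total_length,) = struct.unpack("!H", ip[2:4])
    ip = ip[:total_length]        # drop any Ethernet padding after the packet
    ihl = (ip[0] & 0x0F) * 4      # IHL counts 32-bit words
    if ip[0] >> 4 != 4 or ihl < 20 or len(ip) < ihl + 8:
        return None
    protocol = ip[9]
    if protocol not in (PROTO_TCP, PROTO_UDP):
        return None
    segment = ip[ihl:]
    sport, dport = struct.unpack("!HH", segment[:4])
    if protocol == PROTO_TCP:
        if len(segment) < 20:
            return None
        data_offset = (segment[12] >> 4) * 4  # TCP header length in bytes
        payload = segment[data_offset:]
    else:
        payload = segment[8:]                 # a UDP header is always 8 bytes
    return {
        "src": socket.inet_ntoa(ip[12:16]),
        "dst": socket.inet_ntoa(ip[16:20]),
        "proto": protocol,
        "sport": sport,
        "dport": dport,
        "payload": payload,
    }
```

Both header lengths are read from the packet rather than hard-coded, because IPv4 and TCP headers can carry options.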
References:
• ARP: A quick animation of how ARP works. Wikipedia - Structure of ARP packets. Section
11.5 of the van Oorschot textbook → “ARP” and “ARP spoofing” descriptions. Figure 11.8
illustrates an ARP spoofing attack.
• IPv4: Wikipedia. Section 10.6 → Figure 10.14 shows an IP packet encapsulating a TCP
segment/UDP datagram.
• TCP: Wikipedia - TCP server and clients. Section 10.6 of the van Oorschot textbook →
“TCP Header, TCP Connection set-up” and Figure 10.15 for the TCP header structure.
Section 11.6 → Figure 11.9 illustrates the TCP three-way handshake. Wikipedia - TCP Segment
Structure.
• UDP: UDP datagram structure.
• DNS: Section 11.5 of the van Oorschot textbook → “DNS” description and “DNS
resolution” example. Wikipedia - “message format” and “protocol transport”.
• NTP: Overview of the amplification-based DDoS. Structure of NTP packets and detecting
the attack.
Wireshark: Wireshark is a graphical application that can read the packet captures in the sample
files. Using this tool, you can browse the packets in the files and examine their contents. For each
packet, Wireshark will show you the contents of the various protocol layers. It will also interpret
the fields of each layer and highlight the bytes that correspond to each field. Using Wireshark, in
combination with the references, you can discover the fields that your program will need to parse.
Note that integers within packets are stored in “network order”, which means big-endian
format: the most significant byte comes first. On most hosts, this is the opposite of how integers
are stored in memory. Depending on your language and library of choice, you may need to swap
the byte orders before processing integers within your program. If you are writing in C or C++,
you should use the ntoh family of functions (ntohs and ntohl). You may also find inet_ntop to be useful.
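For readers working in Python rather than C, the same conversions can be sketched with the standard library. In the struct module, the "!" prefix selects network byte order, and socket.inet_ntop renders a packed address as text. The byte strings below are made-up illustrative values, not taken from the sample files.

```python
import socket
import struct

raw_port = b"\x00\x50"                     # two bytes as they appear on the wire
(port,) = struct.unpack("!H", raw_port)    # "!" means network (big-endian) order
(wrong,) = struct.unpack("<H", raw_port)   # little-endian read of the same bytes

raw_addr = b"\xc0\x00\x02\x01"             # four packed IPv4 address bytes
address = socket.inet_ntop(socket.AF_INET, raw_addr)

print(port, wrong, address)
```

Here port is 80, while the little-endian read gives 20480, which is the kind of silent error that a missing byte swap produces; address is the string 192.0.2.1.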