辅导案例-EECS 325/425

欢迎使用51辅导，51作业君孵化低价透明的学长辅导平台，服务保持优质，平均费用压低50%以上！ 51fudao.top

EECS 325/425
Project 2
Due December 2, 2020 before 11:59pm.
CSDS325: 80 points + 10 bonus points; CSDS425: 100 points In this project, you will investigate the correlation between two common inter-host distance metrics: the hop count and the RTT. Before you do your measurements, you need to implement a program that measures hop distance and RTT to a remote host. As you know, the traceroute tool gives you not just the number of hops between the two hosts, but also the IP addresses of all the hops and the RTT between the sender and each hop. This is more than what we require, and therefore traceroute is too heavy weight for our purpose: traceroute sends out a lot of probes, while your tool should need many fewer probes. In fact, your tool should be able to measure the number of hops between the sender and destination using only one probe. To do this, you can exploit the fact that ICMP error messages include an initial portion of the datagram that triggered the error. So, if you send out a datagram to a destination to an unreachable UDP port and set the TTL x to this datagram, the residual TTL y that the datagram has when it reaches the destination will be returned to you in the ICMP payload. Then you know that the difference (x-y) represents the number of hops between the sender and the destination. To implement your hops and RTT measurement tool, you need so-called “raw sockets” as they allow you to control the values of the IP header fields (which as you know from Project 1 are filled out by the kernel for you if you use regular UDP or TCP sockets) as well as receive ICMP messages. You can learn about raw sockets in detail on the Internet (e.g.,, https://www.binarytides.com/raw-socket-programming-in-python-linux/) but necessary simple details are provided below. Note: When we send measurement traffic to unsuspecting sites, we try to mark our traffic as explicitly as we can to distinguish it from attack traffic. So, in your payload, please include text “measurement for class project. questions to student [email protected] or professor [email protected]”. Internet measurement experiments are often done at large scale (unlike our “toy” experiments with only 10 destinations) and take a very long time to complete. Therefore, we often try to combine measurements, think ahead, and extract as much information from a single experiment as we can. In this experiment, for the 425 students, I would like you to combine the above measurement with answering another question: while ICMP stipulates that the ICMP error messages include only 28 initial bytes of the datagram that triggered the error, rumor has it that in practice these error messages include much larger part of the datagram. I would like you to construct your probes to include some payload to make the probe datagrams have the maximum length of 1500 bytes (or 1472 of payload plus 8 bytes of UDP header plus 20 bytes of IP header). Then when you receive the ICMP response, see how much of the original datagram the response contains. Produce the result in your report as a table. In other words, pad the payload in your probing datagram (which, again, identifies your traffic) with, e.g., character “a” so that the datagram reaches the maximum length above. Thus, the output of your tool must include, for each destination, (a) the number of router hops between you and the destination, (b) the RTT between you and the destination, (c) the number of probe/response matching criteria that matched for this destination (see explanation below) and – for the 425 students - (d) the number of bytes of the original datagram included in the ICMP error message.
Further details on tasks involved and implementation notes: (1) You will need to use python to implement this program. If you do not know it, just find some intro tutorial with lots of examples. You only need rudimentary understanding of the language, just enough to implement the task at hand. Note: you can find that traceroute
implements somewhat similar functions (and much more). You are welcome to read traceroute code (you can find it on the Internet) or even reuse parts of it as long as you
document carefully what code is yours and what you adopted from elsewhere; if you do use
traceroute code as an example, provide a URL or a copy of this code in your submission zip file.
Your machine has python 3 installed; you need to access it using “python3” command (instead of usual “python” command, which does not exist). (2) You will need to have root privilege to work with raw sockets. You each have received your own (virtual) machine at your disposal (sent to you individually). Extremely important: do
NOT change the passwords associated with your VM. Doing so will invalidate your
credentials and you will lose access! There is no IDE on this machine – you would have to use an old-fashioned text editor and command-line terminal window to write and debug your code. You can of course try to develop/debug on your own machine but it will work well only if your machine runs linux: I am told Window’s support for raw sockets is spotty so your mileage may vary. Rami (my former PhD student) tried to implement this project on Windows 10 and it worked fine. Note: if you do use your own machine, I believe CaseGuest wi-fi blocks ICMP messages, so make sure you are on Case Ethernet or Case Wireless network, or log into the VPN. (3) You will need to create a datagram with custom values of some header fields for your probe. The easiest way to do it is to create a socket of type socket.SOCK_DGRAM and then use setsockopt() to change just the fields of your socket that you need to change. Then you can send it with socket.sendto() without going through the trouble of building the rest of the headers yourself. (4) You will also need to create a raw socket to receive ICMP messages. To create a raw socket for receiving an ICMP datagram, you can use the following function: recv_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP) after which you can get a packet from this socket by using the following: icmp_packet = recv_sock.recv(max_length_of_expected_packet) Note if you try to debug this on your local machine that runs Windows (see note about Windows support for raw sockets): Windows apparently requires you to “bind” the socket to a dummy port and local IP address after you create it and before reading, by using the command below. I put “bind” in quotes because ICMP has no notion of a port so it makes no sense to bind this socket to a port, but you should just view this as an idiom in Windows (and this command does not seem to do any harm in Linux either, so you could leave it in place when you move your tool to your remote machine): recv.socket.bind(“”,0); (5) A few other useful functions: a. To extract a field from a packet, you can use “struct” module, in particular struct.unpack function. Keep in mind that whenever you extract a multi-byte integer (e.g., port number) you need to make make sure the order in which the bytes appear in the packet (the most significant byte first) matches the order in which these integers are represented in the computer you happen to use. A portable way to handle this is to use so-called “network order” when extracting the bytes from the packet. E.g., if packet[x:x+1] represents the two-bytes port number in a packet, you can convert it to int by the following:
port_from_packet = struct.unpack("!H", packet[x:x+2])[0] Here, the first argument “!H” speficies format of the packet fragment being extracted, in this case signifying that the two bytes from packet appear in the network order (“!”) and they represent an unsigned short (“H”). b. However, when you need to extract only a single byte, you can just convert it to integer using ord(byte) function. c. Code to include disclaimer in the probe seems to have to be a bit different in python3 from what would have been in regular python. This snippet seems to work: msg = ‘measurement for class project. questions to student [email protected] or professor [email protected]’ payload = bytes(msg + ‘a’*(1472 - len(msg)),'ascii') send_sock.sendto(payload, (dest_ip, dest_port)) (6) You will need to match ICMP responses with the probes you are sending out. Note that your socket may receive unrelated packets since there is no port number to distinguish “your” packets from someone else’s (i.e., another process running on your host). In principle, there are several possibilities: one technique would compare the IP address of your measurement target to the source IP address of the ICMP response and match probes with ICMP responses when these IP addresses match. This technique is going to work most of the time but not all the time: sometime you may receive the ICMP response from an IP address that is different from the address you are probing. (We have actually seen this behavior. I speculate it ’s due to the use of a load balancer at the destination. In this setup, when you resolve the destination’s hostname, you get the IP address of the load balancer, which receives your packets and forwards them to one of the servers in a server cluster. However, the ICMP messages that the selected server generates, carry the server’s own IP address, which is different from that of the load balance which your probe used.) Another technique matches the datagram ID (the “IPID” field from the IP header) with the datagram ID you find in the ICMP payload (which, as a reminder, contains full IP header plus 8 bytes of the transport-level header of the original datagram in those first 28 bytes of payload). Again, this will work most of the time but does not guarantee to work all the time because a NAT box, a firewall, or another middlebox between you and the destination may change the IPID of the datagram as it forwards it along. Yet another technique involves making the source port number unique for each target and matching the source port number from your probe datagram with the corresponding information from the 28-byte ICMP payload. (If your measurement host is behind a NAT box, the NAT box may change the source port of your probe on the way out, but it’s actually supposed to change the corresponding field in the quotation from the ICMP response back to your source port number before forwarding the response to you, it’s not guaranteed that every NAT box does it). Thus, I would like you to implement all three techniques above and declare a match when any of the three techniques produce a match. For each destination, please report which of the three fields were,
and which were not, modified in-transit. (7) You need to allow for a possibility that you will get no answer – because the target host does not send ICMP error messages, or because the destination’s firewall blocks outgoing ICMP messages, or because the destination’s firewall blocks unexpected UDP messages. Hence your program should not be stuck on reading from a socket forever. You will need to use either a “select” or “poll” call on the socket (read about them – google for “socket select”) before reading from the socket. WARNING: do not process this error by changing the destination port number – stick with 33434 or you may get a nasty call from network admins! This will be interpreted as port scanning. I suggest you just try couple times and if
you still get no answer, give up on this destination and produce some error message and move on to measuring the next site. You can check manually if it is your program’s fault by trying to reach this destination using a standard ping measurement (there is a small chance that ping would be treated differently and experience a different outcome, but if you see that your tool produces no answer while ping succeeds on many destinations, this should increase your doubt in the correctness of your tool). (8) While in theory you can address your probe to any unreachable port, how would you know which port on the target host is unused by other applications? There is a specially allocated port number for this type of probing by traceroute, and please use that port number, which is 33434. Your script should read the file “targets.txt” (collocated in the same directory as your script), which includes the set of websites you are exploring and accept no arguments, and obtain the IP address of each target using socket.gethostbyname method. The script should print out the result on standard output (in addition to any file that you find convenient to produce the plot below) that is understandable to the grader. Please name the script “distMeasurement.py”. Once you have your tool, measure the hop count as well as RTT to each of the websites you worked with in earlier homework assignments (or read instructions for the eecs425 portion of HW 2 if you are a 325 student). From the output of your tool, produce a scatter graph to visualize the correlation (use any tool you want for this, e.g., Excel): have hops on X-axis, RTT on Y-axis, and for each remote host, place a dot with corresponding coordinates on the graph. This is a typical technique to visualize correlation. Note, as mentioned, you may not get responses from some of the servers. In order to produce a meaningful scatterplot, pick some other sites that were not included in your original list of ten. Keep probing until you have around ten. You can say in your report that you are substituting these sites because your original sites did not respond.

Deliverables: 1. Submit to canvas: A single zip file with (a) all programs (make sure they are well commented and include instructions how to run them – arguments etc. Under-documented programs will be penalized.); (b) Project report that includes all measurement results, graphs, and conclusions that you draw from your measurements. 2. Place your project into directory “/project2” in your VM. (You can use sftp to upload.) Then create a copy of that directory, “/project2grading”. We will be testing your work by going into the project2grading directory and issuing the commands “python3 distMeasurement.py”.
Tips:
1. The machines may take a long time to execute your “sudo” commands. To save time, the best thing to do is to start by executing a “sudo bash” command, which will get you a shall prompt as a root. Then you can just run any commands you want there as a root. 2. To debug your tool, you might find it useful to capture packets that your machine sends out and receives back. Jerome and Neo had a study session for Project 1 to discuss how to capture and analyze a packet trace on the lab virtual machines. As they said, the simplest way (which requires no setup) is to use tcpdump from the command line on your VM to capture the trace, then copy it to your local machine, and then use wireshark on your machine to read the trace from the file.
Grading rubrics: 1. Code for measuring hop distance and RTT to a given destination: 40 points; 2. Code for reading the set of destinations from a file and obtaining the above metrics to all the destinations: 20 points; 3. (EECS425 only, 10 bonus points for 325 students): Code for obtaining the number of initial bytes from the original datagram included into the ICMP error message: 20 points; 4. The lab report including the tool output data, the graphs, and the conclusions: 20 points;

欢迎咨询51作业君