辅导案例-CSC 645-01

欢迎使用51辅导,51作业君孵化低价透明的学长辅导平台,服务保持优质,平均费用压低50%以上! 51fudao.top
CSC 645-01: Computer Networks
Peer-to-Peer Network (P2P) BitTorrent Protocol
Programming Assignment #2
Assigned on: 07/02/2020
Due on: 08/06/2020
Jose Ortiz
May 17, 2020
In this programming assignment, you will implement a decentralized peer-to-peer network architec-
ture (P2P), including the basic implementation of the BitTorrent protocol (BTP). General P2P architec-
tures can be classified into centralized and decentralized, In P2P centralized architectures, new peers send
a request to the Traker (normally done via HTTP) requesting a list containing all the IP addresses of
the peers that are already connected to the P2P network sharing the same file. The tracker then, replies
with a response containing such list. In P2P decentralized architecture, each peer is also a Traker, and it
can share only limited resources because it only sees the partial network. On the other hand, P2P decen-
tralized networks avoid single point of failure since they do not depend of centralized trackers. Examples
of P2P applications that are/were implementing centralizing architectures are Nasper, the Berkeley Open
Network Infrastructure (BOINC) and some versions of BitTorrent. Other versions of BitTorrent implement
decentralized architectures as well.
1. General terminology in a P2P BitTorrent protocol (BTP):
(a) Peer
Peers are those who are uploading and downloading data at the same time from other peers.
Note that they do not posses the whole resource that is being shared, only some pieces of it.
Peers become seeders for a specific file when they downloaded all the pieces from that file.
(b) Swarm
A swarm is a network of peers, leeches and seeders sharing the same file. The more seeders
a swarm has, there will be faster downloads rates among peers. In other words, there will be
better chances of getting the file faster at high download speed.
(c) Traker
Trackers in a semi-decentralized network a tracker is a server which only function is to send the
ip addresses of all the peers in the swarm. In a decentralized network, all the peers are also
trackers.
(d) Metainfo File
The metainfo file is also called as a torrent file and has a .torrent extension. This file mainly
contains encoded information regarding the URL of the tracker, name of the file, and hashes of
the pieces of the file for verifying downloaded pieces of the file.
1
(e) Torrent
A torrent file is a file that contains meta-info (not the actual data) of the file that is being shared
on the swarm by all peers connected to it. It normally has extension .torrent and contain useful
info about the trackers, and the pieces of the file being shared in the network such as length,
SHA1 hashes of all the pieces
(f) Piece
A file shared in P2P is divided in pieces by the seeder. All the pieces from a torrent have the
same size with the only exception of the last piece which may have the leftover bytes. The
length of a piece is normally 16KIB, however this is defined by the implementer. For example,
a piece could also have a length of 32KIB, Info about the file,the length of its pieces, and the
hashes of the pieces are found in the Metainfo file (the file with extension .torrent)
(g) Blocks
Pieces from a torrent may became huge in size, we need to divide them into blocks. Blocks are
the basic structure transmitted in the swarm byte by byte. For example if a piece size is 16KIB,
blocks in that piece will be 2KIB. So, the piece will be split into 8 blocks. The final piece of a
file may have eight or less blocks.
(h) Seeder/Seed
Seeders are those who uploaded the torrent seeds to others, or those who downloaded the the file
already. Note that seeders in a swarm do not download pieces of the file that they are seeding.
However, they may download pieces from other files in swarms where they are not seeders.
(i) Leech/Leeches
Leeches are peers that do not have the required piece of a file. For example, if there are zero
seeders in the network at some point most of the peers will become leeches.
2. Overall operation BitTorrent Protocol (BTP)
Note this section is only informative to show you how the BitTorrent protocol works. Implementation
is described in next step.
BTP consist in two main distinct protocols Tracker HTTP Protocol (THP), and Peer Wire Protocol
(PWP). THP defines methods used by peers for contacting Trakers to join the Swarm. PWP
defines mechanisms that allow communication between peers. However, decentralized P2P networks
using BTP rely heavily in TCP.
(a) High level steps for a client to download a torrent
• Normally, in centralized P2P, a client must download the metainfo file with .torrent exten-
sion via HTTP. Then, once the file is downloaded, the client extracts the IP address of the
tracker, and creates a TCP connection. After the mandatory handshaking process between
client and tracker, a list of the peers sharing the resource requested in the swarm is sent to
the new client. Then the client becomes a peer in the swarm. For this assignment, you can
assume that you will be given the .torrent file (you can find it on iLearn).
• The new peer, then creates a connection with all the peers sharing the requested file in the
swarm, and the sharing process is initiated between two peers if the uploader is not choked,
and the downloader is interested (more about this later). Only when the two peers agree
to initiate the downloading process, the downloader becomes a leecher.
• Once a piece of a file is downloaded by a leecher, it will make that piece available for
downloading in the swarm. When the leecher completes the download process of all the
pieces from a file, it becomes a seeder in the swarm for that file.
2
• A higher number of seeders seeding the same file to the network will make a huge positive
impact in the download rates in the swarm
(b) High level steps to publish a torrent (seed)
• There must be at least a tracker in the swarm. Note that in centralized P2P architectures
there may exist more than one tracker. In decentralized ones, there is always more than
one tracker (a peer may be tracker too).
• A meta-info file must exist in the tracker containing information about the structure of the
swarm, peers connected to it, and resources shared by each peer and seeds.
• The swarm must have at least a seed with access to the complete file.
3. High level implementation steps for your programming assignment:
(a) The Meta-info file In this assignment, I will provide a .torrent file (the file is on iLearn).
However, you can also create (with your preferred text editor) a file with extension .torrent that
will contain all the meta-info related to the tracker, and info about the resource that needs to
be downloaded. Normally torrent files are encoded in Bencode. Your peer needs to have the
capacity to decode this file to extract all the info from it. You are allowed to import the python
package bencode to perform this task.
Here is what a de-bencoded torrent file (with piece length 256 KiB = 262,144 bytes) for a file
example.pdf (whose size is 678 301 696 bytes) might look like:
{
’announce’: ’127.0.0.1:12000’,
’info’:
{
’name’: ’example.pdf’,
’piece length’: 262144,
’length’: 678301696,
’pieces’:
}
}
The announce value is the IP Address/port ( or URL in centralized) of the tracker containing
all the info about the peers in the swarm that have the complete ’example.pdf’ file or one or
more pieces of that file.
(b) Tracker
Recall that in this program, every peer is also a tracker. The Tracker class takes as a parameter
the Server instance. It has two functionalities implemented in two different methods
• When a new peer connects, the tracker will broadcast the IP address of the new peer to all
the trackers connected in the same swarm.
• Once all the Trackers objects detect a broadcast transmission from another tracker, they
will include the new IP of address in their list of connected peers to the swarm.
(c) Peer
The Peer class is the main class of this program. It may inherit methods from both, the Client
and the Server classes because in a P2P architecture all the clients are also servers. It may also
create server and client objects instead of inhering their methods. That is a design decision that
you need to make. In addition. the peer class provides the following functionalities:
3
i. Extract the meta-info from the .torrent file, and using that info, create a TCP connection
with the tracker. Note that you don’t need to send any info about the file requested to
the tracker (i.e name or pieces). The tracker do not know anything about it. The only
information that the tracker has is the IP addresses of the peers connected
ii. Once the peer’s tracker that is trying to connect, retrieves all the list of all peers connected
to the swarm, it will create a TCP connection with all the peers in the swarm from the
list provided by the tracker. Take into consideration that there must be a max number
of connections set. After the initial handshake with all the peers, and if the new peer is
interested, and the other peers are not choked, then the downloading process starts. Note
that you can only download data from four peers at the same time in the same swarm
(tic-tac-tic protocol). Try to download from the rarest peers first. See ”Application Layer”
slides for more detail about this protocol.
iii. Set the peer’s status to leech if it still has missing pieces of the file, and interchange symmet-
ric messages in the swarm using the peer protocol described in section 3e of this document.
Those messages contain the blocks from pieces that you are requesting. Messages are ex-
plained in detail in the peer protocol section. Once a piece from a file is completed then,
the piece can be uploaded to the swarm, so others can download it. When you have the
entire file, set your status to seeder.
iv. Like in any P2P client app, we need to log and show in console info about the download-
ing and uploading processes taking place. The below figure illustrates those processes at
one specific moment of the application execution. Note that percentages of download and
upload data must be updated in real time. You do not need to worry about implement-
ing the progress bars to represent the progress of the pieces downloaded because you can
use my python module HTPBS (horizontal threaded progress bars). You can find it here
https://pypi.org/project/htpbs/.
v. Take into account that this program must support sharing different files at the same time
in different swarms. This will be explained in detail in class labs.
(d) Additional Classes: I recommend to think carefully of the classes needed in this program
apart from the client, server, tracker and peer ones. You will also need a connection class to keep
track of all the files that are being share in the network (multiple swarms). Also, you’ll need
a uploads class (think about the client-handler class), and a downloads class. In addition, you
may need the forwarding and routing classes. The routing class is in charge of routing the blocks
downloaded that belong to each file because the peer may be downloading blocks from different
files. The forwarding class is in charge forwarding blocks to their belonging pieces related to the
same file. Think about how routing and forwarding protocols work in real applications (those
protocols were learned in class) and then apply this knowledge here.
(e) The Peer Protocol
The BitTorrent’s peer protocol operates over TCP or uTP. Peer connections are symmetrical.
Messages sent in both directions that look the same, and data can flow in either direction. The
following is a general step by step high level description about how to implement this protocol:
i. The peer protocol refers to pieces of the file by index as described in the metainfo file,
starting at zero. When a peer finishes downloading a piece and checks that the hash
matches, it announces that it has that piece to all of its peers.
ii. Connections, in both ends, must define one of the following states: interested or not inter-
ested and chocked or not. When a peer is chocked it won’t send any data. Connections
always start out choked and not interested. The data transfers begins between two peers
when a peer is interested and the other peer is not chocked.
4
iii. Once the data transfer has began the peer will send pieces using different algorithms. In
this assignment, you will implement the tit-for-tat transfer process, which sends chunks
(bytes from pieces) to the top four peers with highest upload rate, and re-evaluate top 4
each 30 seconds approximately with the goal of find better trading partners, and get the
file faster.
iv. For this assignment, after a peer is run in terminal, it must ask the user to set the maximum
upload and downloads rates that the P2P app is allowed to use. Downloads and Uploads
rates are dynamic and need to be re-evaluated every 30 seconds as follow:
Let P1 be a peer trying to download a resource from the swarm, Dmax and Umax be the
maximum download and upload rates set by the peer, UR = (UR2, .., URi, ...., URn) be
the set of upload rates from the leeches or seeders uploading the file requested by P1 in
the swarm, and Fn be the number of files being uploaded by P1 to the swarm. Then, the
download and uploads rates of P1 are computed as follows:
D(P1) =
n∑
i=2
URi where Dmax >= D(P1)
U(P1) =
Umax
Fn
v. The message sent between peers has the possible values.
• unchoke if set to 1, the peer allows data to be downloaded. Otherwise, peer is in choke
status and cannot sent data.
• interested if set to 1, the requester is interested in downloading data from a peer.
Otherwise is not interested
• have a positive integer that defines the number of pieces from the file that the peer
have.
• bitfield indicates the missing pieces. (i.e [[1,1,1,0,1,1,0,1],[1,1,1,1,1,1,1,1],[0,0,0]]) In
this example, the 0s and 1s are bits representing blogs of the pieces. If a piece is size
16KiB, then the piece is represented by 8 bits (blocks 2KiB size). Note that the last
piece contains the leftover blocks. When a block of a piece is completed it is set 1 bit,
when is missing is set to 0 bit. In the above example, in piece 0, blocks 3, and 6 are
missing, piece 1 is completed, and piece 2 is missing all its blocks.
• request messages contain an index, begin, and length
• piece messages contain an index, begin, and piece.
• cancel messages have the same payload as request messages. They are generally only
sent towards the end of a download,
Note that choke, unchoke, interested and non interested values do not contain payloads. Also
note that this is a basic implementation of the peer message protocol. Real implementation
deals with messages bit by bit, and with offsets. For more detail about this protocol see
class labs.
4. Libraries allowed in this assignment
For this assignment the only libraries/packages allowed are threading, socket, and pickle. In addition,
you can use any libraries or packages to help you with the following tasks:
• Helping with imports problems
5
• Make your data persistent. For example, when a peer reconnects to the swarm, it needs to know
which pieces are missing from the file it was downloading. In my solution, I use configuration
files, but you are free to use any library that will help you to maintain persistent data in disk.
Note that pickle can also save data in files in disk.
5. Submission Guidelines
(a) By the assignment due date specified at the top of this document, your complete and tested
project must be hosted in your GitHub private class repository. All working code must be hosted
in the master branch of the repository
(b) Your README file for this assignment must contain the following info:
• Your name and student id
• Project name
• Project description
• Project purpose
• Clear instructions about how to clone/download/install/execute the project Compatibility
issues
• A paragraph or two about what challenges you have faced in this assignment, and what you
have learned from it.
(c) I will only grade assignments located in the Master branch of your repository.
(d) Commits done after the assignment due date won’t be accepted by any means. So plan accord-
ingly!
(e) You must send me an email (by the deadline of the assignment) to [email protected]. The email
must have the following subject and body:
• Email Subject: CSC645 Computer Networks: P2P
• Email Body: your name, student id and Github username. In addition, it must have a
direct link to the source code of your project in the master branch of your repository.
6. Grading Rubrics
(a) This assignment will be graded based in the correct implementation of all the functionalities
described in this document.
(b) Your code must be readable and self explanatory. Add comments only for some complicated
functions or algorithms that need extra clarification.
(c) P2P programming assignment is 25% of your final grade.
(d) Correct and complete implementation 20%
(e) Appropriate documentation of the code (README.md), coding style and readability. If I cannot
understand your code, then your code is not good. 5%
(f) If I cannot run the peer.py file, you will get a 0 in this assignment. No exceptions!!!. Make sure
that you test this before submission.
6
51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: abby12468