COMP9024 (21T0): Assignment
Simple Graph Structure-based Search Engine
[The specification may change. A notice on the class web page will be posted
after each revision, so please check class notice board frequently.]
Change log:
No entry as yet!
Objectives
to implement a simple search engine based on the well known PageRank algorithm (simplified).
to give you further practice with C and advanced data structures (BST and Graph ADTs)
Admin
Marks: 30 marks towards total course marks. Part-A (10 marks), Part-B (10 marks), Part-C (10 marks).
Due: 10am Monday 01 Feb 2021.
Late Penalty: 10% marks per day off the ceiling.
Last day to submit this assignment is 10am Wednesday 03 Feb 2021, of course with late penalty.
Submit: Read instructions in the "Submission" section below.
Aim
In this assignment, your task is to implement a simple search engine using the well known algorithm
PageRank, simplified for this assignment, of course! You should start by reading the Wikipedia entries on the
topic. Later I will also discuss these topics in the lecture.
PageRank (read up to the section "Damping factor")
The main focus of this assignment is to build a graph structure, calculate PageRanks and rank pages based
on these values. You don't need to spend time crawling, collecting and parsing weblinks for this assignment.
You will be provided with a collection of "web pages" with the required information for this assignment in an
easy to use format. For example, each page has two sections,
Section-1 contains urls representing outgoing links. Urls are separated by one or more blanks, across
multiple lines.
Section-2 contains selected words extracted from the url. Words are separated by one or more spaces,
spread across multiple lines.
Hint: You can assume that the maximum length of a line would be 1000 characters. You need to use a dynamic
data structure(s) to handle words in a file and across files, no need to know max words beforehand.
Example file url31.txt
#start Section-1
url2 url34 url1 url26
url52 url21
url74 url6 url82
#end Section-1
#start Section-2
Mars has long been the subject of human interest. Early telescopic observations
revealed color changes on the surface that were attributed to seasonal vegetation
and apparent linear features were ascribed to intelligent design.
#end Section-2
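
As a rough illustration of reading the page format above, here is a minimal sketch that collects the outgoing links from Section-1 of an already opened page file. The function name readSection1, the MAX_URL bound and the fixed-size output array are assumptions made for this example only; your own solution should use a dynamic structure as the hint suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1002  /* 1000 characters (per the hint) + '\n' + '\0' */
#define MAX_URL  100   /* assumed bound on url length, not from the spec */

/* Copies up to maxLinks urls from Section-1 of an open page file into
 * links[] and returns how many were stored. */
int readSection1(FILE *fp, char links[][MAX_URL], int maxLinks) {
    assert(fp != NULL && links != NULL);
    char line[MAX_LINE];
    int inSection = 0, count = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "#start Section-1", 16) == 0) {
            inSection = 1;
            continue;
        }
        if (strncmp(line, "#end Section-1", 14) == 0) break;
        if (!inSection) continue;

        /* urls are separated by one or more blanks, across several lines */
        for (char *tok = strtok(line, " \t\r\n"); tok != NULL;
             tok = strtok(NULL, " \t\r\n")) {
            if (count < maxLinks) {
                strncpy(links[count], tok, MAX_URL - 1);
                links[count][MAX_URL - 1] = '\0';
                count++;
            }
        }
    }
    return count;
}
```

Called on the example file url31.txt with room for at least nine links, this sketch should return 9, with "url2" first and "url82" last.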

Your tasks in summary:
Calculate PageRanks: You need to create a graph structure that represents a hyperlink structure of
given collection of "web pages" and for each page (node in your graph) calculate PageRank value
and other graph properties.
Inverted Index: You need to create "inverted index" that provides a list of pages for every word in a
given collection of pages.
Search Engine: Your search engine will use the given inverted index to find pages where query
term(s) appear and rank these pages using their PageRank values (see below for more details)
How to get started: Hints and Sample files
Hints on "How to Implement Assignment", will be discussed in the lecture.
Sample files for How to Get Started (ass-getting-started.zip), will be discussed in the lecture.
Sample1.zip
Additional files
You can submit additional supporting files, *.c and *.h, for this assignment. For example, you may
implement your graph adt in files graph.c and graph.h and submit these two files along with other required
files as mentioned below.
Part-A: Calculate PageRanks
You need to write a program in the file pagerank.c that reads data from a given collection of pages in the file
collection.txt and builds a graph structure using Adjacency Matrix or List Representation. Using the
algorithm described below, calculate PageRank for every url in the file collection.txt. In this file, urls are
separated by one or more spaces or/and new line character. Add suffix .txt to a url to obtain file name of the
corresponding "web page". For example, file url24.txt contains the required information for url24.
Example file collection.txt
url25 url31 url2
url102 url78
url32 url98 url33
Simplified PageRank Algorithm (for this assignment)
PageRank(d, diffPR, maxIterations)
    Read "web pages" from the collection in file "collection.txt"
    and build a graph structure using Adjacency List Representation
    N = number of urls in the collection
    For each url p_i in the collection
        PR(p_i; 0) = 1/N
    End For
    iteration = 0;
    diff = diffPR; // to enter the following loop
    While (iteration < maxIterations AND diff >= diffPR)
        iteration++;
        PR(p_i; t+1) = (1 - d)/N + d * \sum_{p_j \in M(p_i)} PR(p_j; t) / L(p_j)
        where,
        - M(p_i) is a set containing links (urls) pointing to p_i
          (ignore self-loops and parallel edges)
        - L(p_j) is out degree of p_j
        - t corresponds to value of "iteration"
        diff = \sum_{i=1}^{N} |PR(p_i; t+1) - PR(p_i; t)|
    End While
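
To make the loop structure concrete, here is a minimal sketch in C. It assumes the standard damped update rule from the Wikipedia article (each page receives (1 - d)/N plus d times the sum of PR/out-degree over the pages linking to it) and measures diff as the sum of absolute changes. The adjacency matrix, the MAX_N cap and the function name pageRank are choices made for this example only; they are not requirements of the assignment.

```c
#include <assert.h>

#define MAX_N 50  /* assumed cap on the number of pages, for this sketch only */

/* adj[j][i] != 0 means page j links to page i. Self-loops are skipped and
 * a 0/1 matrix cannot hold parallel edges. Results are stored in pr[0..n-1]. */
void pageRank(int n, int adj[MAX_N][MAX_N], double d, double diffPR,
              int maxIterations, double pr[MAX_N]) {
    assert(n > 0 && n <= MAX_N);
    int outDeg[MAX_N] = {0};
    double next[MAX_N];

    /* out degree of every page, ignoring self-loops */
    for (int j = 0; j < n; j++)
        for (int i = 0; i < n; i++)
            if (i != j && adj[j][i]) outDeg[j]++;

    for (int i = 0; i < n; i++) pr[i] = 1.0 / n;

    int iteration = 0;
    double diff = diffPR;  /* to enter the following loop */
    while (iteration < maxIterations && diff >= diffPR) {
        iteration++;
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                if (j != i && adj[j][i]) sum += pr[j] / outDeg[j];
            next[i] = (1.0 - d) / n + d * sum;
        }
        /* diff is the total absolute change over all pages */
        diff = 0.0;
        for (int i = 0; i < n; i++) {
            double delta = next[i] - pr[i];
            diff += (delta < 0) ? -delta : delta;
            pr[i] = next[i];
        }
    }
}
```

A separate next[] array is used so that every page in iteration t+1 is computed from the values of iteration t only, rather than from values already updated in the same pass.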

Your program in pagerank.c will take three arguments (d - damping factor, diffPR - difference in PageRank
sum, maxIterations - maximum iterations) and using the algorithm described in this section, calculate
PageRank for every url.
For example,
% pagerank 0.85 0.00001 1000

Your program should output a list of urls in descending order of PageRank values (use format string "%.7f")
to a file named pagerankList.txt. The list should also include out degrees (number of outgoing links) for
each url, along with its PageRank value. The values in the list should be comma separated. For example,
pagerankList.txt may contain the following:
Example file pagerankList.txt
url31, 3, 0.2623546
url21, 1, 0.1843112
url34, 6, 0.1576851
url22, 4, 0.1520093
url32, 6, 0.0925755
url23, 4, 0.0776758
url11, 3, 0.0733884
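
One possible way to produce this output is sketched below, assuming the results are held in an array of records. The PageInfo type and the function names are hypothetical, and the order of pages with equal PageRank is left to qsort because the specification does not define it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    char   url[100];    /* assumed bound on url length */
    int    outDegree;
    double pageRank;
} PageInfo;             /* hypothetical record type for this sketch */

/* qsort comparator: larger PageRank first */
static int byRankDesc(const void *a, const void *b) {
    const PageInfo *x = a, *y = b;
    if (x->pageRank < y->pageRank) return 1;
    if (x->pageRank > y->pageRank) return -1;
    return 0;
}

/* Sorts pages in place and writes one "url, outDegree, pagerank" line each */
void writePagerankList(FILE *out, PageInfo *pages, int n) {
    assert(out != NULL && pages != NULL && n >= 0);
    qsort(pages, (size_t)n, sizeof pages[0], byRankDesc);
    for (int i = 0; i < n; i++)
        fprintf(out, "%s, %d, %.7f\n",
                pages[i].url, pages[i].outDegree, pages[i].pageRank);
}
```

The comparator compares the doubles directly instead of returning their difference, since casting a small difference to int would truncate it to zero.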

Sample Files for Part-A
You can download the following three sample files with expected
pagerankList.txt files.
Use format string "%.7f" to output pagerank values. Please note
that your pagerank values might be slightly different to that
provided in these samples. This might be due to the way you carry
out calculations. However, make sure that your pagerank values
match to say first 6 decimal points to the expected values. For
example, say an expected value is 0.1843112, your value could be
0.184311x where x could be any digit.
All the sample files were generated using the following command:
% pagerank 0.85 0.00001 1000
aEx1
aEx2
aEx3
Part-B: Inverted Index
You need to write a program in the file named inverted.c that reads data from a given collection of pages in
collection.txt and generates an "inverted index" that provides a sorted list (set) of urls for every word in a
given collection of pages. Before inserting words in your index, you need to "normalise" words by,
removing leading and trailing spaces,
converting all characters to lowercase,
removing the following punctuation marks, if they appear at the end of a word:
'.' (dot), ',' (comma), ';' (semicolon), '?' (question mark)
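
The three normalisation steps above could be sketched as follows. The function name normaliseWord is an assumption, and this version removes only a single trailing punctuation mark; whether repeated marks (for example "end..") should all be stripped is not stated here, so treat that as an open choice.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Normalises word in place and returns it. */
char *normaliseWord(char *word) {
    assert(word != NULL);

    /* remove leading spaces by shifting the rest of the string left */
    char *start = word;
    while (*start != '\0' && isspace((unsigned char)*start)) start++;
    memmove(word, start, strlen(start) + 1);

    /* remove trailing spaces */
    size_t len = strlen(word);
    while (len > 0 && isspace((unsigned char)word[len - 1])) word[--len] = '\0';

    /* convert all characters to lowercase */
    for (size_t i = 0; i < len; i++)
        word[i] = (char)tolower((unsigned char)word[i]);

    /* remove one trailing '.', ',', ';' or '?' (assumption: one mark only) */
    if (len > 0 && strchr(".,;?", word[len - 1]) != NULL) word[len - 1] = '\0';
    return word;
}
```

For example, a buffer holding "  Design. " becomes "design". Punctuation inside a word is left untouched, since only marks at the end of a word are to be removed.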
In each sorted list (set), duplicate urls are not allowed. Your program should output this "inverted index" to a
file named invertedIndex.txt. One line per word, words should be alphabetically ordered, using
ascending order. Each list of urls (for a single word) should be alphabetically ordered, using ascending order.
Example file invertedIndex.txt
design url2 url25 url31 url61
mars url101 url25 url31
vegetation url31 url61

Note: for this part, in your output file, on each line, a word and urls must be separated by one (or more)
spaces. The testing program will ignore additional spaces.
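
A sorted list without duplicates can be kept with an ordered insert. The sketch below uses a singly linked list and strcmp ordering, which places "url101" before "url25" as in the example file. The UrlNode type and the function name urlSetInsert are hypothetical; a BST keyed on the word, with one such list per node, is one possible way to combine this with the rest of the index.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct UrlNode {
    char *url;
    struct UrlNode *next;
} UrlNode;   /* hypothetical node type for this sketch */

/* Inserts url into the ascending list unless it is already present.
 * Returns the (possibly new) head of the list. */
UrlNode *urlSetInsert(UrlNode *head, const char *url) {
    assert(url != NULL);

    /* walk a pointer-to-link so that inserting at the front needs no special case */
    UrlNode **link = &head;
    while (*link != NULL && strcmp((*link)->url, url) < 0)
        link = &(*link)->next;
    if (*link != NULL && strcmp((*link)->url, url) == 0)
        return head;                 /* duplicate: leave the set unchanged */

    UrlNode *node = malloc(sizeof *node);
    if (node == NULL) return head;   /* out of memory: set left unchanged */
    node->url = malloc(strlen(url) + 1);
    if (node->url == NULL) { free(node); return head; }
    strcpy(node->url, url);
    node->next = *link;
    *link = node;
    return head;
}
```

Inserting "url31", "url101", "url25" and "url31" again into an empty list gives the three-element list url101, url25, url31.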
Part-C: Search Engine
Write a simple search engine in file searchPagerank.c that given search terms (words) as commandline
arguments, finds pages with one or more search terms and outputs (to stdout) top 30 pages in descending
order of number of search terms found and then within each group, descending order of PageRank. If number
of matches are less than 30, output all of them.
Your program must use data available in two files invertedIndex.txt and pagerankList.txt, and must
derive result from them. We will test this program independently to your solutions for "A" and "B".
Note: For this part,
each line in "invertedIndex.txt" contains - a word and the corresponding urls separated by one (or more)
spaces. Your program for Part-C needs to be able to handle such an input. Please see the sample
program provided "exTkns.c" .
each line in "pagerankList.txt" contains - url, out-degree and pagerank. To simplify your task, you can
assume that they are separated by ", " - that is a comma and one space.
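
The two line formats described above could be read along the following lines. This is only a sketch and is not the provided exTkns.c; the function names parseRankLine and splitIndexLine and the 100-byte url buffer are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parses one "url, outDegree, pagerank" line. Returns 1 on success, 0 otherwise.
 * url must point to a buffer of at least 100 bytes (assumed bound). */
int parseRankLine(const char *line, char *url, int *outDegree, double *rank) {
    assert(line != NULL && url != NULL && outDegree != NULL && rank != NULL);
    /* %99[^,] reads everything up to, but not including, the first comma */
    return sscanf(line, " %99[^,], %d, %lf", url, outDegree, rank) == 3;
}

/* Splits an inverted-index line on blanks. tokens[0] is the word and the
 * remaining tokens are urls. Modifies line; returns the number of tokens stored. */
int splitIndexLine(char *line, char *tokens[], int maxTokens) {
    assert(line != NULL && tokens != NULL);
    int n = 0;
    for (char *tok = strtok(line, " \t\r\n"); tok != NULL && n < maxTokens;
         tok = strtok(NULL, " \t\r\n"))
        tokens[n++] = tok;
    return n;
}
```

Because strtok treats a run of separators as one, splitIndexLine copes with "one (or more) spaces" between the word and its urls without extra work.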
E[ample:
% searchPagerank mars design
url31
url25
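
The required ordering (number of search terms found first, then PageRank, both descending) can be expressed as a single qsort comparator. The Result type and the function names below are hypothetical, and the sketch assumes the match counts and PageRank values have already been collected from the two input files.

```c
#include <assert.h>
#include <stdlib.h>

#define TOP_N 30   /* the spec asks for at most 30 pages */

typedef struct {
    char   url[100];   /* assumed bound on url length */
    int    matches;    /* number of search terms found on this page */
    double pageRank;
} Result;              /* hypothetical record type for this sketch */

/* More matched terms first; within equal counts, larger PageRank first */
static int byMatchesThenRank(const void *a, const void *b) {
    const Result *x = a, *y = b;
    if (x->matches != y->matches) return y->matches - x->matches;
    if (x->pageRank < y->pageRank) return 1;
    if (x->pageRank > y->pageRank) return -1;
    return 0;
}

/* Sorts results in place and returns how many of them should be printed */
int rankResults(Result *results, int n) {
    assert(results != NULL && n >= 0);
    qsort(results, (size_t)n, sizeof results[0], byMatchesThenRank);
    return n < TOP_N ? n : TOP_N;
}
```

After the call, printing the first k urls (where k is the return value) gives the output in the required order.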

Submission
Additional files: You can submit additional supporting files, *.c and *.h, for this assignment.
IMPORTANT: Make sure that your additional files (*.c) DO NOT have "main" function.
For example, you may implement your graph adt in files graph.c and graph.h and submit these two files
along with other required files as mentioned below. However, make sure that these files do not have "main"
function.
I explain below how we will test your submission, hopefully this will answer all of your questions.
You need to submit the following files, along with your supporting files (*.c and *.h):
pagerank.c
inverted.c
searchPagerank.c
Now say we want to mark your pagerank.c program. The auto marking program will take all your supporting
files (other *.h and *.c files), along with pagerank.c and execute the following command to generate
executable file say called pagerank. Note that the other two files from the above list (inverted.c and
searchPagerank.c) will be removed from the dir:
% gcc -Wall -lm -std=c11 *.c -o pagerank
So we will not use your Makefile (if any). The above command will generate object files from your supporting
files and the file to be tested (say pagerank.c), links these object files and generates executable file, say
pagerank in the above example. Again, please make sure that you DO NOT have main function in your
supporting files (other *.c files you submit).
We will use similar approach to generate other two executables (for inverted.c and searchPagerank.c).
How to Submit
Instructions on how to submit your assignment will be available later.
Plagiarism
You are allowed to use code from the course material (for example, available as part of the labs, lectures and
tutorials). If you use code from the course material, please clearly acknowledge it by including a comment(s)
in your file. If you have questions about the assignment, ask your tutor.
Your program must be entirely your own work. Plagiarism detection software compares all submissions
pairwise (including submissions for similar projects in previous years, if applicable) and serious penalties will
be applied, particularly in the case of repeat offences.
Do not copy from others; do not allow anyone to see your code, not even after the deadline.
Please refer to the on-line sources to help you understand what plagiarism is and how it is dealt with at
UNSW:
Plagiarism and Academic Integrity
UNSW Plagiarism Procedure
Before submitting any work you should read and understand the sub section named Plagiarism in the course
outline. We regard unacknowledged copying of material, in whole or part, as an extremely serious offence. For
further information, see the course outline.
-- end --
