程序辅导案例 > Program >

代写辅导接单-FIT5222

欢迎使用51辅导，51作业君孵化低价透明的学长辅导平台，服务保持优质，平均费用压低50%以上！ 51fudao.top

FIT5222-PlanningandAutomatedReasoning

Assignment2:

Pacman-CapturetheFlag

InthisassignmentwewilldevelopanAIcontrollerforamultiplayervariantofPacman.Inthisgame

twoteams,redandblue,competewitheachothertocaptureasmanydotsaspossiblefromthe

oppositesideofthemap.

OriginallydevelopedatUCBerkeley,thissetuppresentsuswithachallengingplanningdomain

andanopportunitytosolveafunproblem.Attheendofthesemesterwewillholdacontesttofind

outwhodevelopedthemosteffectivecontroller.Gloryandafancycertificateawaitthetopthree

students!

Part1:Installation

PigletPDDLSolver

InthisassignmentyouwillusethepigletlibrarytosolvePDDLproblems.Thisrequiresinstalling

piglettothepythonenvironment:

1.Gotopiglet-publicfolder(Ifyouuseavirtualenvironment,activatethevirtualenvfirst.You

coulduseyourflatlandvirtualenvironmentaswell.)

2.gitfetch

3.gitcheckoutpddl_solver//ifsuccessful,youshouldfindapddl_solver.pyunder

lib_piglet/utils/folder

4.pythonsetup.pyinstall

CheckthemyTeam.pystarterimplementationtoseehowtousethenewlyinstalledlibrary.

ThecompetitionserverwillhavethesamepigletinstalledintheenvironmentasaPDDLsolver.

FIT5222-PlanningandAutomatedReasoning

UpdatePacmanCode

MakesureyouhavethelatestversionofthePacmanCTFgameenvironment:

1.Gotothepacman-publicfolder(youclonedtherepoinweek1)

2.gitfetch

3.gitreset--hard//incaseanychangesyoumadelocally

4.gitpull

5.Ifyourcodeisnewest,youshouldseestaffTeam.pyandberkeleyTeam.pyintherepo.

Allcodesforthegameenvironmentarenowfoundinthepacmanfolder.

Part2:GameRules

WecancharacterisePacmanCTFasfollows:

●Multi-agent:Twoagentsneedtoworktogetheragainstanopposingteam.

●Discrete:Theenvironmentisagridmazeandtimeisdiscretisedintounittimesteps.

●Dynamic:Theenvironmentchangesasfoodisconsumed.

●Partiallyobservable:Eachagenthasalimitedsensingrange.

●Sequential:Pastdecisionsaffectfutureactions.

●Deterministic:Thecurrentstatedependsonlyontheteamsandtheiractions.

●Offline:Theworldstopswhileanagentdeliberates.

●Known:Alltherulesofthegameareavailableapriori.

Environment

Thegamemapisdividedintotwohalves:red(left)andblue(right).Whenontheredside,ared

agentisaghost.Whencrossingintoenemyterritory,theagentbecomesaPacman.Red

agentsmustdefendtheredfoodwhiletryingtoeatthebluefood.

Scoring

WhenaPacmanagenteatsafooddot,thatdotisremovedfromtheboardandplacedina

virtualbackpack.Whentheagentreturnstotheirownsideoftheboard,theyautomatically

"deposit"thecontentsoftheirbackpack,earningonepointperdot.TheRedteamscores

positivepoints,whiletheBlueteamscoresnegativepoints.

Watchout!Ifanagentiscaughtbyaghost,beforereachingtheirownsideoftheboard,all

dotsintheirbackpackwillbedepositedbacktotheiroriginalpositionsontheboard.Theagent

alsoreturns(asaghost)toitsoriginalstartingposition.Nopointsarescoredinthiscase.No

pointsareawardedtotheopposingteameither(forcatchingPacman).

FIT5222-PlanningandAutomatedReasoning

PowerCapsules

Asmallnumberofpowercapsulesarescatteredthroughoutthemaze.Thesecanbeeatenby

aPacmanagent.Whenthishappensghostsbecome“scared”andaresusceptibletobeing

caughtbythepoweredupPacman.Thisghosteffectlastsforthenext40timesteps,oruntil

caught,whichevercomessooner.

Observations

Agentscanonlyobserveanopponent'sconfiguration(positionanddirection)iftheyortheir

teammatearewithin5squares(Manhattandistance).Inaddition,anagentalwaysgetsanoisy

distancereadingforeachagentontheboard,whichcanbeusedtoreasonaboutunobserved

opponents.Thereadinghasanerrorof+/-6stepsfromthetruedistance.Ageneraldirection

(towardtheopponent)isnotspecified.

Winning

Agameendswhenoneteamreturnsallbuttwooftheopponents'dots.Gamesarealsolimited

to300timesteps(i.e.,300movesforeachofthefouragents).Ifthismovelimitisreached,

whicheverteamhasreturnedthemostfoodwins.Ifthescoreiszero(i.e.,tied)thisisrecorded

asatiegame.

Part3:Task&StudentWorkflow

Ateverytimestepthesimulatorwillpresentyouwiththecurrentstateofthegameboard.Yourtask

willbetoreasonoverthiscurrentstateanddecidehowyouragentsshouldreact.

Thiswillrequireyoutothinkabouttheplanningprocessinseveraldifferentways:atthehigh-level,

whereyoudecidewhatstrategyyourteamwillpursue;atthelow-level,whereyoutransformthat

strategyintoaconcretesequenceoflow-levelactions;andyouwillneedtodecidewhentoreplan

youragents,inresponsetochangingcircumstances.

Weillustrateanexampleofthestudentworkflowinthediagrambelow.

FIT5222-PlanningandAutomatedReasoning

YourstartingpointformodifyingthisworkflowisthefunctionchooseActioninfilemyTeam.py.

Thisfunctioniscalledbythesimulatorateverytimestepandforeachagent.Thesimulator

providestheagentwithadescriptionofthegameboard.Attheendofthefunctionyoumustreturn

aconcreteactionfortheagenttoexecuteinthenexttimestep.

Howtodecidewhichactiontheagentshouldtakeisentirelyuptoyou.Thestartingcodeprovides

abasictemplateforthedecision-makingprocessandsomesimple,thoughnotveryeffective,

strategies.YouwillneedtomodifythiscodetocreateawinningAI.

Thenextsectionsgivefurtherdetailsaboutthedecision-makingprocess.Formoredetails

regardingthePacmanCTFcode,themyTeam.pyimplementation,andothergeneraltips,referto

theadditionaldocument“Assignment2CodeDocumentation.pdf”!

3.1.ChooseaHighLevelGoal

Whatisthehigh-levelstrategythatagentsshouldpursue?Forexample,theycouldvisitthe

opposingsideandtrytoscorepoints,ortheycouldstayhome,anddefendtheirscore.Eachagent

choosesitsownhighlevelgoal,butthatdecisionmaybeinfluencedbythecurrentstateofthe

board,thecurrentscore,andthestrategyandpositionoftheteammateagent.

YourstartingpointhereisthegetGoalsfunctioninthefilemyTeam.py.GivenaPDDLdescription

ofthecurrentstate,thisfunctionreturnspositiveandnegativePDDLstatedescriptionsthat

togetherdescribeahigh-levelgoal.Youcancreatemultipledifferentgoalsandthenchoose

betweenthembasedontheinformationinthecurrentstate.

FIT5222-PlanningandAutomatedReasoning

3.2.GenerateaHighLevelPlan

Toachieveyourhigh-levelgoalsyouwillneedtocomputeacorrespondingsequenceofhigh-level

actions(i.e.,aplan).GivenaPDDLdescriptionoftheinitial(i.e.,thecurrent)statesandthegoal

states,thefunctiongetHighLevelPlancomputessuchaplan.Youdonotneedtomodifythis

function.Justcallitasnecessary.

Butwhatiftheagentalreadyhasahigh-levelplan,computedinapreviousiteration?It’suptoyou

todecidehowtoproceed.Theagentcouldretaintheexistingplanoritcouldchange,andreplan

anewfromthecurrentstate.Forexample,ifthepre-conditionsforthenextactionarenolonger

satisfied,theagentwillcertainlyneedtogenerateanewhighlevelplan.Theagentmayalso

decidetoreviseitscurrentplan(andpossiblycurrentgoal!)ifitsexecutionproducesan

unexpectedresult(e.g.,beingcaughtbyaghostorapowered-upPacman).

Thestaffimplementationchecksifwecanreuseexistingplan:

Note:OnlyalimitedsetofpredicatesareincludedinthebasicPDDLstatedescription(seethefile

myTeam.pddl)andnotallofthemappearinthegivenactions.Youcancreatearichermodel,and

computemorediverseplans,by(1)refiningexistingactionsorcreatingnewonesthatusemoreof

theavailablepredicates;(2)enhancingthePDDLstatedescriptionwithadditionalinformationfrom

thecurrentgamestate(i.e.,addmorepredicates).Weprovideusefulprogrammaticfacilitiesfor

FIT5222-PlanningandAutomatedReasoning

thispurpose.Seethe get_pddl_statefunctionforexamplesofhowtocollectinformationfromthe

simulatorenvironment(gameState).

3.3.GenerateaLowLevelPlan

Highlevelplanstelltheagentswhattodo,butinbroadstrokes.Forexample,ifthegoalistoscore

points,anecessaryactionistomovetotheoppositesideoftheboard.Buthowshouldtheagent

getthere?Onceontheopposingside,thenexthighlevelactionmightbetoeatafooddot.But

whichone?Lowlevelplanningprovidesanswerstothesequestions.

ThefunctiongetLowLevelPlanisyourstartingpointhere.Itsinputsareahigh-levelactionanda

descriptionofthecurrentgamestate.Thefunctionreturnsasequenceofmoveactionsthatthe

agentneedstotaketoachievethehigh-leveleffectsoftheinputaction.

Youarefreetoimplementthisfunctionusinganymethodyouchoose.Forexample:

●HeuristicSearchStrategy:youcandevelopyourowncustomstrategiesbyimplementing

anewexpanderclassforyourowncustomisedstate-spacerepresentation.

●LearningBasedStrategy:weprovideaverybasicexampleofthisapproach.Youcan

makeyourownimplementationorimproveitbyintroducingmorefeaturestothemodel.

Thenyouwilltrainthenewmodeltoobtainasetofgoodweights.Youshouldcarefully

consider,foreachhighlevelaction,howyouwillrewardorpenalisetheagentforits

performance.Youmayneedtokeepadjustingyourimplementation,observehowyour

agentbehaves,andtrysomethingnewtoovercomethedrawbacksyouobserved.

Butwhatiftheagentalreadyhasalow-levelplan,computedduringapreviousiteration?Again,it’s

uptoyoutodecidehowtoproceed.Theagentcouldcontinuefollowingtheexistinglow-levelplan

oritmightdecidetorevisititsdecisionsonaccountofnewinformationprovidedbythegamestate.

Forexample,ifPacmanisheadingtowardsafooddot,butobservesaghostupahead,continuing

towardthedotmaybeabadidea.

Thestaffimplementationreuselowlevelplans:

FIT5222-PlanningandAutomatedReasoning

Remember!Therearemanydifferentwaystoachieveahigh-levelaction.Youmayalsowishto

considerwhetheryourtwoagentscoordinatetheirlow-levelactionsoractindependentlyofone

another.

3.4.ExecuteMoves

Yourlow-levelstrategyhasgeneratedasequenceofdirectionsfortheagenttomovein.Thefirst

moveinthisplanisgiventothesimulatorforexecution.Eachofyourtwoagents,andthetwo

opposingagents,moveaccordingtotheirowninstructionsandthetimestepisadvancedbyone.

AllofthishappensafterthecompletionofthechooseActionfunction.

Basedonthenewstateofthegameboard,youmayneedtore-evaluateyourgoalsandplans.

Thisworkflowiscyclicandcontinuesuntilthegameisover.

Remember:Youcan(andshould!)compareyourimplementation(myTeam.py)againstthe

existingbaselines(staffTeam.pyorberkeleyTeam.py).Youcanfindtheseimplementationsinthe

pacmanfolder.Thebaselinesareusefultocheckthatyourimplementationworksandtomeasure

yourprogress.

Part4:CompetitionSetup

WewillhaveacompetitiontoseewhohasthestrongestpacmanAI.Youwillbeabletosubmit

yourimplementationtoacontestserverwheretheywillberunagainsteachother.

Uponsubmissiontotheserveryouragentwillbeevaluatedagainstastaffbaselineimplementation

(staffTeam.py).Youragentwillbeevaluatedfor49gameson7differentmaps.Withineachgame

therewillbeatimelimitandtheagentthathassucceededineatingthemostfoodduringthattime

willwinthatgame.Towinthematchwithstaffbaseline,yourteammustwinconvincingly:28outof

49games.

BywinningthestaffTeam.py,youwillgetapassforCriteria1competitionscore(seemoredetails

inmarkingrubrics),haveapositionontheleaderboard,andyouragentwillbeevaluatedagainst

allotherparticipantsontheleaderboardtogetadditionalvictorypoints.

Witheachopponentontheleaderboard(excludingthestaffbaselineimplementation),youragent

willbeevaluatedonthesamesetofproblems:

●Intheeventofaconvincingwin(winatleast28outof49games),youwillget3victory

points.

●Intheeventofatie(winlessthan28games,butloselessthan28gamesaswell,

consideringtiegamesneitherwinnorlost.),youwillget0.5victorypoints.

●Intheeventofaloss(lose28gamesormore),youwillget0victorypoints.

Ifyouwinthestaffbaselineimplementation,yourmarksforCriteria1competitionwillbedecided

by(Yourtotalvictorypoints/Allavailablevictorypoints)%,wheretheallavailablevictorypointsis

3×(numberofallparticipants-1).

FIT5222-PlanningandAutomatedReasoning

Youareexpectedtoclearlywinagainstthestaffbaselineimplementation,otherwiseyouwill

notproceedtothenextstageofcompetingwithentriesfromotherstudents.Inthiscaseyour

gradeforCriteria1ofthemarkingrubricwillbeafail.

Theserverwillkeeparecordofyourmatchesagainsteveryopponentontheleaderboard.When

someonehaveanewbestimplementationrecorded,yourscoreagainstthatstudentwillbe

updatedaswell.

Moredetailsregardingthecompetitionsetupwillbeprovidedinasupplementarydocument

“Assignment2ContestSubmissionInstruction.pdf”.

Part5:Report

Togetherwithyourimplementationsourcecodeyouwillberequiredtosubmitareportthat

describesyourapproach.Youwillneedtowriteadescriptionofyourstrategyandgivea

justificationforalgorithmicchoices.Makesurethatyourreportisasdetailedandcompleteas

possible,includingappropriatereferences.

Whendescribingyourmodelyoushouldexplainnotonlywhatitdoesbutalsoanalyseitsstrengths

andweaknesses(e.g.,complexity,guaranteesetc).Youshouldalsodiscusshowyourcode

implementsthatmodel:forexample,importantdatastructures,necessaryoptimisationsand

parameterchoices.Justifyyourdesignandimplementationchoiceswithdiscussionand,where

appropriate,withexperiments.

Youcanaskquestionsabouttheassignmentandyoucandiscussthemeritsofdifferentdesign

choices(e.g.,onEd).Howeverallsubmittedworkmustbecompletelyyourown.Donotcopy

implementationsyouhappentofindonline.Donotaskyourfriendsorcolleaguesfortheir

implementations.Youwillneedtoworkindependentlyandtakereasonablestepstomakesure

others’don’tcopyormisuseyourwork.Formoreinformationpleasesee

https://www.monash.edu/students/study-support/academic-integrity

PLAGIARISMISNOTACCEPTABLE.

YOUWILLFAILTHEASSIGNMENTANDPOSSIBLYTHEENTIREUNIT.

FIT5222-PlanningandAutomatedReasoning

Part6:MarkingRubric

Therubricspecifiesthemarkingcriteriaforyourimplementationandreport.Toavoiddisappointment,paycloseattentiontotherequirements.

CriteriaN

0%-49%

50%-59%

60%-69%

70%-79%

80%-100%

Weight

(%)

1.CompetitionLosestothestaff

baseline

implementation

Outranksstaffbaseline

implementation.

Get25%~49%ofall

availablevictorypoints.

Get50%~74%ofall

availablevictorypoints.

Get75%~100%ofall

availablevictorypoints.

33.3%

2.Agentstrategy

implementation.

Minorupdatesto

examplePDDL

implementationin

thecodebase(for

example,only

changingparameter

values)

Addressessome

drawbacksinexisting

implementation.

Improvesthemodeland

highlevelplannerby:

(1)Introducingnewactions

tothepddlmodel,which

usesmorelistedpredicates

ornewpredicatesyou

introduced.Or

(2)Designnewgoal

functions.

Sameaspreviouscriteria

plus:

Implementsalternativehigh

levelgoalprioritisation

scheme(agentsswitch

goalsbasedonnew

observations).

Improveslowlevelplanner;

e.g.morefeatures,better

rewardfunctions,oruses

heuristicsearchforlow

levelplanning.

Sameasprevious

criteriaplus:

Customisedlowlevel

plansfordistincthigh

levelactions;e.g.

differentlow-level

strategiesforeatfood

(which?)vs.goto

opposingside(where?).

Sameaspreviouscriteria

plus:

Highlevelandlowlevel

decisionstakeintoaccount

thecurrentactionand

strategyoftheteammate

agent.(Twoagentsshould

cooperatewitheachother,

andshareinformationwith

eachother.)

Ordesignyourownmodel

andagentworkflow.

33.3%

FIT5222-PlanningandAutomatedReasoning

3.Report-

Descriptionof

yourapproach

Incompleteor

insufficient

descriptionofthe

approach.

Insufficientor

missing

justifications.

Littleorno

pseudo-code.

Highleveldescriptionofthe

approachwithsufficient

pseudo-code.

Discusseshowthenew

implementationimproves

uponthestaffbaseline,

staffTeam.py

Sameaspreviouscriteria

plus:

Discussionandalgorithmic

analysis(asapplicable).

E.g.,time,space,

completeness,optimality.

Providemotivationforyour

implementationchoices.

Linktheagentstrategyto

lecturesandtutorials.

Appropriatereferencing

(whereapplicable).

Sameasprevious

criteriaplus:

Reflections:

Discussandcompare

theadvantagesand

disadvantagesofat

least2-3implemented

strategies/attempts.

Sameaspreviouscriteria

plus:

Welldocumentednumerical

experiments,analysingthe

efficiency,advantages,

drawbacksofyourdifferent

implementations.E.g.

improvementsonsuccess

ratebetweendifferent

strategies/attempts,runtime

statisticsonplanningtime,

andappropriatereference

algorithms.

23.3%

4.Report-

Communication

skills

Hardtofollowwith

noclearnarrative.

Inadequateorno

separationof

discussiontextinto

coherentsections.

Writingisnot

accurateor

articulate.

Inadequate

supportingmaterials.

Inadequateor

Thewritinghasatenuously

logicalnarrative.Some

attemptattheexpected

structuralelements(e.g.

Intro,conclusion).

Writingisnotaccurateor

articulatemostofthetime.

Thedocumenthasfew

supportingmaterials

(tables,images,

pseudo-code).

Thestudenthas

attemptedtoundertake

Thetexthasaclearlogical

narrativeandexpected

structuralelements(e.g.

intro,conclusion).

Writingisnotaccurateor

articulatemostofthetime.

Therearesomesupporting

materials(tables,images,

pseudo-code)butnotwell

integratedwiththerestof

thetext.

Thestudentfollowsthe

requirementsforcitingand

Thewritingiswell

composedandhasa

clearandlogical

narrativeandiswell

structured.

Writingisgenerally

accurateandarticulate.

Thedocumenthas

appropriatesupporting

materialsthatarewell

integratedwiththerest

ofthetext.

Thestudentfollowsthe

Thewritingisverywell

composedandhasavery

clearandlogically

formedanarrative.

Writingisaccurateand

articulate.

Thedocumentisexpertly

structuredinthestyleofa

scientificreport,including

appropriatesupporting

materialsthatclearlyimprove

thequalityofassociated

discussion.

10%

FIT5222-PlanningandAutomatedReasoning

missingreferencing.citingandreferencing

withfrequenterrors.

referencing,with

somenotableerrors.

requirementsforciting

andreferencing,with

someminorerrors.

Thestudentfollowsthe

requirementsforcitingand

referencing.

FIT5222-PlanningandAutomatedReasoning

Part7:SubmissionGuide

SubmissiontoMoodle:

Thefollowingsubmissionitemsarerequired:

1.Yourimplementationsourcecodes,inasingledirectorycalled"src"(youcancopy

everythinginthepacmanfolderto"src").Zipthecodesdirectorywithfilename

last_name_student_id_pacman.zip.(Forexample,Chen_123456_pacman.zip)

2.Thereportdescribingyourapproachesasasinglepdffile.Namethepdfas

last_name_student_id_report_pacman.pdf.(Forexample,

Chen_123456_report_pacman.pdf)

SubmissiontoContestServer:

Youmustmakeasuccessfulsubmissiononthecontestserveraspartofyourmark.

Submitearlyandsubmitoften.Don’twaitforthelastday,whenyoumightbesubject

tocircumstancesoutofyourcontrol(likeserveroutages!).

Submissiondeadline:11:55pmWednesday30thOctober2024