FIT5222Planningandautomatedreasoning
IntroductiontoFlatland
Version1.1
Flatland
Flatlandisanopen-sourcetoolkitfordevelopingMulti-AgentPathfindingalgorithmsingridrail
worlds.Itprovidesa
two-dimensionalgridenvironmentinwhichmanyagentscanbeplaced,and
eachagentmustsolveoneormorenavigationaltasksinthegridworld.The
librarywasdeveloped
bySBB,AIcrowd,andnumerouscontributorsandAIcrowdresearchfellowsfromtheAIcrowd
community.
Inthisunit,youcanimplementanythingyoulearntinFlatland.
Part1:Installation
RefertoWeek1InstallationGuidetoInstallFlatlandonyourcomputer.
Youmustupdateyourflatlandcodeandinstallationbeforestartingtheassignment:
●Underflatlandfolder
●gitstashsave(Incaseyouhaveunstagedchangesmadeinweek1)
●gitpull
●gitstashpop
●pythonsetup.pyinstall
●Now,runcommandpython-mpiplist,youshouldseeflatland-rlversionisupdatedto2.2.4
●(Hint,usepython3insteadofpythonifpython3pointstotherightoneonyourmachine)
Part2:BasicRuleswithFlatland
Characteristics:
●Flatlandsimulatesarailwayenvironmentina2-dimensionalgridspaceofanysize.
FIT5222Planningandautomatedreasoning
●Flatlandisadiscrete-timesimulation.Adiscrete-timesimulationperformsallactionswithina
constanttimestep.
●Anagentcanmoveinanydirectiononwell-definedtransitionsfromcellstocells.
●Thecapacityofacellisusuallyone.Thususuallyonlyoneagentcanbelocatedatagivencell.
●Ateachstep,theagentscanchooseanaction.Forthechosenactiontheattachedtransitionwill
beexecuted.
●Whileexecutingatransition,Flatlandcheckswhethertherequestedtransitionisvalid.Ifthe
transitionisvalid,thetransitionwillupdatetheagent’sposition.
●Whileexecutingatransition,amalfunctionmayhappenwhichpreventsagentsfrommoving
towardtheirintendedtargetlocations.Eachmalfunctionlastsforacertainperiodoftimeand
duringthemalfunctionaffectedtrainsareunabletomove.Malfunctionsonlyoccurinquestion3.
●Also,eachagentisassignedastartandgoallocation.Ourtargetistofindpathsforeachagent
whileminimizingthesumoftheirindividualpathcostsortotalmakespan.
●Eachinstancehaveamaxtimesteplimit,whichis1.5*rail.height*rail.width.Allagentsneedto
arriveattheirdestinationbeforethismaxtimesteplimit.
Grid:
Arectangulargridofintegershape(dim_x,dim_y)definesthespatialdimensionsoftheenvironment.
WeuseNorth,East,West,SouthasorientationindicatorswhereNorthisUp,SouthisDown,Westisleft
andEastisRight.
Eachcellinthegridenvironmentcanbelocatedby(row,column)coordinate.Hereisanexample:
Apairofrowandcolumncoordinatescanalsobeconvertedtoasingleintegerbyrow×map_width+
column.
FIT5222Planningandautomatedreasoning
TileTypes:
EachCellwithinthesimulationgridconsistsofadistincttiletypewhichinturnlimitsthemovement
possibilitiesoftheagentthroughthecell.Forrailwayspecificproblem8basictiletypescanbedefined
whichdescribearailnetwork.Asageneralfactintherailwaynetwork,thereareatmosttwobranches
availableatanyjunctionpointwithanyentrydirection.
Thefollowingimagegivesanoverviewoftheeightbasictiletypes.Thesecanberotatedinstepsof45°
andmirroredalongtheNorth-SouthoftheEast-Westaxis.RefertoAppendixAforacompletelistof
tiles.
●Case0representsawall,thusnoagentcanoccupythetileatanytime.
●Case1representsapassagethroughthetile.Whileonthetiletheagentcanmakenonavigation
decision.Theagentcanonlydecidetoeithercontinue,i.e.passingontothenextconnectedtile,
orwaitatthecurrenttile.
●Case2representsasimpleswitch,thuswhencomingtothetopposition(southintheexample)a
navigationchoice(WestorNorth)mustbetaken.Generally,thestraighttransition(S->Ninthe
example)islesscostlythanthebenttransition.ThereforeinCase2thetwochoicesmaybe
rewardeddifferently.Case6isidenticaltocase2fromatopologicalpointofview,however,there
isnopreferredchoicewhencomingfromtheSouth.
●Case3canbeseenasasuperpositionofCase1.Aswithanyothertileatmaximumoneagent
canoccupythecellatagiventime.
●Case4representsasingle-slitswitch.Intheexample,anavigationchoiceispossiblewhen
comingfromtheWestorSouth.
●InCase5comingfromalldirections,anavigationchoicemustbetaken.
●Case7representsadead-end,thusonlystoporbackwardmotionispossiblewhenanagent
occupiesthiscell.
ExecutionandTermination
●Foreachquestion,thesimulatorwillexecutethecomputedplanforeachagent.
●Executionproceedsbyapplyingthenextplannedactionforeachagentinsomespecified
orderingdeterminedbythesimulator.
●Executionstopswhenanyagentisunabletoexecutetheirnextplannedaction.Atthispointthe
simulatorcallsareplanfunctionandallagentshaveanopportunitytorecomputetheirpaths.
●Evaluationofcurrentinstanceterminates,whenoneofthefollowingconditionsismet:
○Allagentsreachtheirdestinationcollision-free.
○Alloftheplannedactionsofallagentsareexecutedor,
○Themaximumnumberoftimestepisreached
FIT5222Planningandautomatedreasoning
Part3:Howtointeractwiththeflatlandenvironment.
Informationrelatedtoquestion1andquestion2
Flatlandenvironment
AlltheinformationaboutthemapisstoredinanobjectofGridTransitionMap.Intheexampleofthenext
chapter,thevariablerailistheobjectofGridTransitionMap.
Widthandheight:
●Getmapwidthbyrail.width
●Getmapheightbyrail.height
Getavailabletransitions:
cell_transitions=rail.get_transitions(x,y,direction)andinreturnyougetanarraywithpossible
transitionsineachofthefourcardinaldirections[North,East,South,West].Thexandyisthelocation
coordinateofthecurrentcellwherexin[0,height]andyin[0,width].
Dependingonthetiletype,atmosttwoofthereturneddirectionswillbevalidtransitions.Becareful
withtheheadingdirection!Differentheadingdirectionforthesamecellwillreturndifferentavailable
transitions.Thusyouneedtocorrectlyspecifythecurrentpositionanddirection.Theseimportantpieces
ofinformationarebothpartsofyouragent’sstate!Also,becarefulaboutheadingdirection’sinfluence
onheuristics!
Cannotfindpaths:
Inquestion2,itispossiblethataverysmallnumberofagents’pathscannotbefoundbyyouralgorithm
asmovingobstaclesmaytraversethroughthestartlocationofyouragent.Youcansimplyreturnan
emptylistinthiscase,andtheagentwithanemptylistwillnotshowonthemap.
Note,thisonlyhappenstonotmorethan5or6instances,withonly1~2agentshavenopaths.
Informationrelatedtoquestion3
Getagentinformation:
AgentsinformationisstoredinagentswhichisalistofEnvAgentobjects.AnEnvAgentobjectcontains
thefollowingattribute:
●initial_position
Atupleof(x,y).Theinitiallocationoftheagent.
●initial_direction
Anintegerin[0,1,2,3].Theinitialheadingdirectionoftheagent.
FIT5222Planningandautomatedreasoning
●target
Atupleof(x,y).Thegoallocationoftheagent.
●position
Atupleof(x,y).Thecurrentlocationoftheagent,thisisusedwhenaplanisinexecution,which
means.
●direction
Anintegerin[0,1,2,3].Thecurrentheadingdirectionoftheagent.
●status
Anintegerin[0,1,2].Thecurrentstatusoftheagent.
●deadline
Anintegerindicatestheexpectedlatestarrivaltimestepoftheagent.
●malfunction_data
Adictionarystoresdatarelatedtomalfunctions.
Youwillusetheinitial_position,initial_direction,andtargetofeachagenttoplanpathsforthem.The
positionanddirectionwillbeusedontheexecutionstage.Agentsarrivingatgoallocationsafterthe
deadlinetimestepwillreceiveapenalty.
Theheadingdirectionofthetrainisrepresentedbyaninteger:
●0:North.
●1:East.
●2:South.
●3:West.
Thestatusisaninteger:
●0:Theagentisnotactive(notonthemap).
●1:Theagentisactive(onthemap).
●2:Theagentreachesthegoallocation.
●3:Theagentreachesthegoallocationandisremovedfromthemap.
Executingyourcoordinatedplan
Bydefault,whenplanexecutionstarts,thestatusoftheagentis0andthepositionisNone.Bygivinga
moveforwardcommand,thestatusturnsto1,position=initial_positionandyoucanstartexecuting
yourplan.Whentheagentreachesitsgoallocationthestatusoftheagentturnsto3.
Duringexecution,itispossibleyourplannedactionsmayfailduetomalfunctions(i.e.,thetrain
experiencesamechanicalproblem).Themalfunction_dataobjectisadictionarythatcontainsthe
followinginformation:
●“malfunction”:int
Anumberthatindicateshowmanytimestepsnthemalfunctionlastsfor.Whenn>0theagent
willnotbeabletoexecuteanycommandforthenextntimesteps.Inotherwords,theagentmust
waitinplaceforthemalfunctiontopass(i.e.,untilthetrainisrepaired!).
FIT5222Planningandautomatedreasoning
Replan
Whenanewmalfunctionoccursoriftwoagentsplantoexecuteapairofincompatibleactions(i.e.,
leadingtoacollision),thesystemwillcallareplanfunctiontoprovideanewsetoffeasiblepaths.
Thenewpathsshouldconsiderthatagentscanonlywaitatthecurrentlocationduringthemalfunction
timesteps.Someextrainformationthatisavailabletothereplanfunction:
●current_timestep
Atwhichtimestepduringtheexecution,thereplanfunctioniscalled.
●existing_paths
Thepathsfromthepreviousget_path/replanfunctioncall.
●new_malfunction_agents
Alistofidsofagentsthathaveanewmalfunctionhappensatthecurrenttimestep(Doesnot
includeagentsalreadyhavemalfunctionedinpasttimesteps).
●failed_agents
Alistofidsofagentsthatfailedtoreachtheirintendedlocationatthecurrenttimestep.Itusually
includesnewmalfunctionagents,otheragentsinfluencedbythemalfunctionagents(i.e.,
knock-oneffects),andagentswithfailureexecutionduetocollisionsintheircurrentplans.
Shouldfindpathsforallagents:
Inquestion3,youhavefullcontrolonallagents.Thus,weexpectyoutofindpathsforallagents.
Part4:CodesforAssignment1.
GetPigletassignment1branch
Wereleasetheassignment1codebaseasaseparatebranchinpiglet.
Followthefollowingstepstoswitchbetweenbranchesinpigletandgetassignment1branch.
1.Assumingyouarecurrentlyworkinginthe“single-agent-prac”branch,andwanttocheckoutto
the“assignment1_2024”branch.(Toknowwhatbranchyouarecurrentlyworkingon,use‘git
branch’commandunderpiglet-public)
2.Discovernewbranchesintheremoterepository:
gitfetch
thiscommandshouldshowwhatitdiscoveredfromremoterepo.
3.Ifyoumadeanychangesinthe“single-agent-prac”branch,youshouldstageandcommityour
changestothepiglet(whichsavesallyourchangesandallowsyoutocheckouttoanother
branch):
cd
tothepiglet-publicfolder.
gitadd*
thiscommandwillstageallyourchangesinpiglet
(orusegitaddpath/to/changed/filetostageasinglefile)
gitcommit-m"Describewhatyoudid"
thiscommandwillcommit(save)allyour
changes.
FIT5222Planningandautomatedreasoning
4.Checkouttothetargetassignmentbranch:
gitcheckoutassignment1_2024
Youcoulddiscovermoregeekstyleusageofgitfromthegittutorialswerecommendedonweek1,which
couldspeedupyourversionmanagementanddevelopmentefficiency.
Questionscripts
Youwillfindquestion1.py,question2.py,andquestion3.pyintherootfolderofthepiglet.
●question1.pyrequiresyoutoimplementasingle-agentpathfindingalgorithmforflatland.
●question2.pyrequiresyoutoavoidconflictswithexistingpaths.
●question3.pyrequiresyoutoimplementamulti-agentpathfindingalgorithm,andareplanfunction
tohandlemalfunctions.
Eachquestionpythonscriptincludesaget_path()functionwithdifferentparameters.Youneedto
implementyouralgorithminsidetheget_path()functionandreturntherequireddata.
Ondefaultget_path()functionsincludesomedummyimplementation.Theyonlycangeneratesome
dummypaths.Replacethemwithyourimplementation.Also,youcanreadthesedummy
implementationstolearnhowweuseparametersandwhatkindofdatawereturn.
Testyourimplementation
Totestyourimplementation,runcommand(Makesureyourcondavirtualenvironmentisactivated):
pythonquestion1.py
Youcanreplacequestion1.pywithquestion2.pyorquestion3.pyforotherquestions.
Thescriptwilloutputstatisticdataatthesametime.
Debugandvisualizer
Youcanfindtwovariables“debug”and“visualizer”ineachquestionscript.
If“debug”issettoTrue,thescriptwill:
●outputdetailedexecutioninformation.
●pauseprogramifconflicthappenedormeetothererrors.
●outputconflictdetails.
If“visualizer”issettoTrue,thescriptwill:
FIT5222Planningandautomatedreasoning
●showyourplanwithanicevisualizer.
●showagentidif“debug”alsoTrue.
If“test_single_instance”issettoTrue,thescriptwill:
●evaluateonlyinstancespecifiedby“level”and“test”.
SubmissionofAssignment
Followinstructionsintheassignmentdescriptiontosubmitthezipfiletomoodle.
FollowtheContestServersubmissioninstructionstosubmittothecontestserver.
Part5:Practice-Navigateatrainwithyourkeyboard
(Optional).
ThisparthelpsyoulearnmoreaboutFlatland.Followtheseinstructionstocreateakeyboardnavigated
traindrivingprogramwithFlatland.
1.Createanemptypythonfile
2.Insertfollowingcodestoimportnecessarymodulesthatwillbeusedinthispractice:
#importnecessarymodulesthatthispythonscriptsneed.
fromflatland.envs.rail_generatorsimportcomplex_rail_generator
fromflatland.envs.schedule_generatorsimportcomplex_schedule_generator
fromflatland.envs.rail_envimportRailEnv
fromflatland.utils.rendertoolsimportRenderTool,AgentRenderVariant
importtime
3.InsertfollowingcodestoinitializeFlatlandRailwayenvironment:
#InitializeFlatlandRailwayenvironment
railWaySettings=complex_rail_generator(
nr_start_goal=10,#Numberofstartlocationandtrainstationpairs
nr_extra=10,#Extrarailsbetweeneachpairofstartandgoal.
min_dist=10,#Minimumdistancebetweenstartandgoallocations
max_dist=99999,
seed=999)
env=RailEnv(width=15,#Width/columnsofgrids
height=15,#Height/rowsofgrids
rail_generator=railWaySettings,
number_of_agents=1)#Howmanytrainsareenabled.
env.reset()#initializerailwayenv
FIT5222Planningandautomatedreasoning
#Initiatethegraphicmodule
render=RenderTool(env,gl="PILSVG",
show_debug=False,
screen_height=500,#Windowresolutionheight
screen_width=500)#Windowresolutionwidth
render.render_env(show=True,frames=False,show_observations=False)
window=render.gl.window#Wewillusethiswindowobjecttocaptureuserinput
4.Insertfollowingcodestocapturekeyboardpresseventandstorethekeytoavariable:
#CapturekeypresseventfromwindowandsetpressedkeytopressedKeyvariable
pressedKey=None#storepressedkeyhere
keyMap={"Left":1,#Turnleft
"Up":2,#Goforward
"Right":3,#Turnright
"Down":4#Stop
}
def_keypress(event):
globalpressedKey
ifevent.keysyminkeyMap.keys():
pressedKey=keyMap[event.keysym]#retrievecorrespondingactionsfromkeyMap
window.bind("
5.Insertfollowingcodestodefinecontrollerfunction,whichisthemostimportantpartyoushould
payattention:
#DefineController
defmy_controller(number_agents,timestep=False):
_action={}
globalpressedKey#declarepressedKeyisaglobalvariable
iftimestep:
#Inthefirstframeofflatland,agentsarendisabled,
#useanyactiontoenableagents.
foriinrange(0,number_agents):
_action[i]=2
return_action
#Retrievepressedactionandputitintoanactiondictionary.
#Weonlyhave1agentinthispractice,
FIT5222Planningandautomatedreasoning
#thusweassignthepressedactiononlytoagent0
ifpressedKeyisnotNone:
_action[0]=pressedKey
else:
_action[0]=0
pressedKey=None
return_action
6.Insertfollowingcodestocreatethemainloopofthispractice:
#Mainloop
cost_dict={}#Recordcostforeachagent
timestep=-1#allagentsaredisabledinthefirstmakespan,sowestartfrom-1..
all_done=False
whilenotall_done:
#Callcontrollerfunctiontogettheactiondictionaryforallagents
_action=my_controller(len(env.agents),timestep==-1)
#Passactiondictionarytorailwayenvandexecuteallactions.
obs,all_rewards,done,info=env.step(_action)
all_done=done["__all__"]
#countcostforeachagent
forkey,valueindone.items():
ifvalue==Falseandkey!="__all__":
ifkeynotincost_dict.keys():
cost_dict[key]=1
else:
cost_dict[key]+=1
timestep+=1
render.render_env(show=True,frames=False,show_observations=False)
time.sleep(0.25)
print("Sumofindividualcost:",sum(cost_dict.values()))
print("Totalmakespan:",makespan)
7.Runthepythonprogramunderthevirtualenvironmentyoucreated.IfyouuseaPythonIDE,you
shouldtrytofindavirtualenvironmentsettingandselectthecorrectone.Makingsurethe
railwaywindowisinthefrontwhenyoupressarraykeys.
FIT5222Planningandautomatedreasoning
Hint:
FlatlandDocumentation:
TheFlatlandsimulatorhasarichAPIwithmanyusefulfunctionsanddescriptionsofhowtheywork.You
mayfindithelpfultoexploretheofficialdocumentationwhendevelopingyourplanner.
Selectingavirtualenvironment:
WheneditingwithVScodeorPycharm,thedefaultenvironmentislikelybasepython.Thisenvironment
doesnothavethenecessarylibrariesfortheFlatlandenvironment.Forthisreasonyouwilllikelyneedto
specifythecorrectvirtualenvironmentyouhavealreadysetupfortheFlatlandchallenge.
VSCode:
1.InstallpythonextensionforyourVSCode.
2.Selectavirtualenvironmentbyclickingthe“Python”textonthebottom-leftcornerofthewindow:
Pycharm:
1.Clickthe“Python”textonthebottom-rightcornertoselectvirtualenvironment: