FIT5222 Planning and automated reasoning
FIT5222 Assignment 2
Pacman: Capture the Flag
In this assignment we will develop an AI controller for a multiplayer variant of Pacman. In this game
two teams, red and blue, compete with each other to capture as many dots as possible from the
opposite side of the map.
Originally developed at UC Berkeley, this setup presents us with a challenging planning domain
and an opportunity to solve a fun problem. At the end of the semester we will hold a contest to find
out who developed the most effective controller. Glory and a fancy certificate await the top three
students!
Part 1: Installation
Piglet PDDL Solver
In this assignment you will use the piglet library to solve PDDL problems. This requires installing
piglet into your Python environment:
1. Go to the piglet-public folder. (If you use a virtual environment, activate it first; you can
reuse your flatland virtual environment.)
2. git fetch
3. git checkout pddl_solver (if successful, you should find pddl_solver.py under the
lib_piglet/utils/ folder)
4. python setup.py install
Check the myTeam.py starter implementation to see how to use the newly installed library.
The competition server will have the same version of piglet installed in its environment as the PDDL solver.
Update Pacman Code
Make sure you have the latest version of the Pacman CTF game environment:
1. Go to the pacman-public folder (you cloned the repo in week 1)
2. git fetch
3. git reset --hard (warning: this discards any local changes you have made)
4. git pull
5. If your code is up to date, you should see staffTeam.py and berkeleyTeam.py in the repo.
All code for the game environment is found in the pacman folder.
Part 2: Game Rules
We can characterise Pacman CTF as follows:
● Multi-agent: Two agents need to work together against an opposing team.
● Discrete: The environment is a grid maze and time is discretised into unit time steps.
● Dynamic: The environment changes as food is consumed.
● Partially observable: Each agent has a limited sensing range.
● Sequential: Past decisions affect future actions.
● Deterministic: The next state depends only on the current state and the agents' actions.
● Offline: The world stops while an agent deliberates.
● Known: All the rules of the game are available a priori.
Environment
The game map is divided into two halves: red (left) and blue (right). When on the red side, a red
agent is a ghost. When crossing into enemy territory, the agent becomes a Pacman. Red
agents must defend the red food while trying to eat the blue food.
Scoring
When a Pacman agent eats a food dot, that dot is removed from the board and placed in a
virtual backpack. When the agent returns to their own side of the board, they automatically
"deposit" the contents of their backpack, earning one point per dot. The Red team scores
positive points, while the Blue team scores negative points.
Watch out! If an agent is caught by a ghost before reaching their own side of the board, all
dots in their backpack are returned to their original positions on the board. The agent
also returns (as a ghost) to its original starting position. No points are scored in this case, and
no points are awarded to the opposing team for catching the Pacman.
Power Capsules
A small number of power capsules are scattered throughout the maze. These can be eaten by
a Pacman agent. When this happens, opposing ghosts become “scared” and are susceptible to
being caught by the powered-up Pacman. This effect lasts for the next 40 timesteps, or until the
ghost is caught, whichever comes sooner.
Observations
Agents can only observe an opponent's configuration (position and direction) if they or their
teammate are within 5 squares (Manhattan distance). In addition, an agent always gets a noisy
distance reading for each agent on the board, which can be used to reason about unobserved
opponents. The reading has an error of +/- 6 steps from the true distance. A general direction
(toward the opponent) is not specified.
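For example, an agent can maintain a set of candidate positions for an unobserved opponent and prune any candidate that is inconsistent with the latest reading. The following is a minimal sketch (the function and parameter names are invented for illustration; only the ±6 noise bound comes from the rules above):

# Hedged sketch: prune candidate opponent positions using a noisy
# distance reading with an error of +/- 6 steps (per the rules above).

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def prune_candidates(candidates, my_pos, noisy_distance, noise=6):
    # Keep only candidates whose true distance could have produced
    # the observed noisy reading.
    return {pos for pos in candidates
            if noisy_distance - noise <= manhattan(my_pos, pos) <= noisy_distance + noise}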
Winning
A game ends when one team returns all but two of the opponents' dots. Games are also limited
to 300 time steps (i.e., 300 moves for each of the four agents). If this move limit is reached,
whichever team has returned the most food wins. If the score is zero (i.e., tied) this is recorded
as a tie game.
Part 3: Task & Student Workflow
At every time step the simulator will present you with the current state of the game board. Your task
will be to reason over this current state and decide how your agents should react.
This will require you to think about the planning process in several different ways: at the high level,
where you decide what strategy your team will pursue; at the low level, where you transform that
strategy into a concrete sequence of low-level actions; and you will need to decide when your
agents should replan in response to changing circumstances.
We illustrate an example of the student workflow in the diagram below.
Your starting point for modifying this workflow is the function chooseAction in file myTeam.py.
This function is called by the simulator at every time step and for each agent. The simulator
provides the agent with a description of the game board. At the end of the function you must return
a concrete action for the agent to execute in the next timestep.
How to decide which action the agent should take is entirely up to you. The starting code provides
a basic template for the decision-making process and some simple, though not very effective,
strategies. You will need to modify this code to create a winning AI.
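As a rough illustration of this decision cycle (a sketch only, not the actual starter code; the plan attributes and *_still_valid helpers are hypothetical placeholders for your own replanning logic):

# Hypothetical sketch of a chooseAction structure. getGoals,
# getHighLevelPlan and getLowLevelPlan are described in the sections
# below; the *_still_valid helpers stand in for your own logic.

def chooseAction(self, gameState):
    # Replan at the high level if there is no plan or it is no longer valid.
    if not self.high_level_plan or not self.high_level_plan_still_valid(gameState):
        pddl_state = self.get_pddl_state(gameState)
        positive_goal, negative_goal = self.getGoals(pddl_state)
        self.high_level_plan = self.getHighLevelPlan(pddl_state, positive_goal, negative_goal)
        self.low_level_plan = []  # a new strategy invalidates the old moves

    # Refine the next high-level action into concrete moves if needed.
    if not self.low_level_plan or not self.low_level_plan_still_valid(gameState):
        self.low_level_plan = self.getLowLevelPlan(gameState, self.high_level_plan[0])

    # Return the next concrete move; the remainder is kept for later timesteps.
    return self.low_level_plan.pop(0)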
The next sections give further details about the decision-making process. For more details
regarding the Pacman CTF code, the myTeam.py implementation, and other general tips, refer to
the additional document “Assignment 2 Code Documentation.pdf”.
3.1. Choose a High Level Goal
What is the high-level strategy that agents should pursue? For example, they could visit the
opposing side and try to score points, or they could stay home and defend their score. Each agent
chooses its own high level goal, but that decision may be influenced by the current state of the
board, the current score, and the strategy and position of the teammate agent.
Your starting point here is the getGoals function in the file myTeam.py. Given a PDDL description
of the current state, this function returns positive and negative PDDL state descriptions that
together describe a high-level goal. You can create multiple different goals and then choose
between them based on the information in the current state.
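For instance (a minimal sketch: the predicate names and the exact return format are invented for illustration; follow the conventions used in the starter myTeam.py and myTeam.pddl):

# Illustrative goal selection. The predicates below are invented for
# the example; use the conventions from myTeam.py / myTeam.pddl.

def getGoals(self, pddl_state):
    if self.winning_by_enough(pddl_state):   # hypothetical helper
        # Defend: end up as a ghost on our own side.
        positive = [("at_home", "a1")]       # must hold in the goal state
        negative = [("is_pacman", "a1")]     # must not hold in the goal state
    else:
        # Attack: carry food back across the border.
        positive = [("food_deposited", "a1")]
        negative = []
    return positive, negative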
3.2. Generate a High Level Plan
To achieve your high-level goals you will need to compute a corresponding sequence of high-level
actions (i.e., a plan). Given a PDDL description of the initial (i.e., the current) states and the goal
states, the function getHighLevelPlan computes such a plan. You do not need to modify this
function. Just call it as necessary.
But what if the agent already has a high-level plan, computed in a previous iteration? It’s up to you
to decide how to proceed. The agent could retain the existing plan, or it could replan anew
from the current state. For example, if the pre-conditions for the next action are no longer
satisfied, the agent will certainly need to generate a new high level plan. The agent may also
decide to revise its current plan (and possibly current goal!) if its execution produces an
unexpected result (e.g., being caught by a ghost or a powered-up Pacman).
The staff implementation checks whether the existing plan can be reused before replanning.
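As a rough illustration (a sketch only, not the staff code; preconditions_of is a hypothetical helper), such a check might look like this:

# Sketch of a plan-reuse test: keep the plan only while the next
# action's preconditions still hold in the current state.

def can_reuse_plan(self, gameState, high_level_plan):
    if not high_level_plan:
        return False
    current_predicates = set(self.get_pddl_state(gameState))
    next_action = high_level_plan[0]
    return all(p in current_predicates
               for p in self.preconditions_of(next_action))  # hypothetical helper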
Note: Only a limited set of predicates are included in the basic PDDL state description (see the file
myTeam.pddl) and not all of them appear in the given actions. You can create a richer model, and
compute more diverse plans, by (1) refining existing actions or creating new ones that use more of
the available predicates; (2) enhancing the PDDL state description with additional information from
the current game state (i.e., add more predicates). We provide useful programmatic facilities for
this purpose. See the get_pddl_state function for examples of how to collect information from the
simulator environment (gameState).
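For example (a hedged sketch: the enemy_nearby predicate and the tuple representation are assumptions; getAgentPosition, which returns None for unobserved agents, and getOpponents are part of the game API):

# Sketch: enrich the basic PDDL state with an extra predicate so that
# new PDDL actions (e.g., a "retreat" action) can condition on it.

def get_enriched_pddl_state(self, gameState):
    predicates = self.get_pddl_state(gameState)  # the basic description
    my_pos = gameState.getAgentPosition(self.index)
    for enemy in self.getOpponents(gameState):
        enemy_pos = gameState.getAgentPosition(enemy)  # None if unobserved
        if enemy_pos is not None:
            dist = abs(my_pos[0] - enemy_pos[0]) + abs(my_pos[1] - enemy_pos[1])
            if dist <= 5:
                predicates.append(("enemy_nearby",))  # invented predicate
                break
    return predicates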
3.3. Generate a Low Level Plan
High level plans tell the agents what to do, but in broad strokes. For example, if the goal is to score
points, a necessary action is to move to the opposite side of the board. But how should the agent
get there? Once on the opposing side, the next high level action might be to eat a food dot. But
which one? Low level planning provides answers to these questions.
The function getLowLevelPlan is your starting point here. Its inputs are a high-level action and a
description of the current game state. The function returns a sequence of move actions that the
agent needs to take to achieve the high-level effects of the input action.
You are free to implement this function using any method you choose. For example:
● Heuristic Search Strategy: you can develop your own custom strategies by implementing
a new expander class for your own customised state-space representation (see the search
sketch after this list).
● Learning Based Strategy: we provide a very basic example of this approach. You can
make your own implementation or improve it by introducing more features to the model.
Then you will train the new model to obtain a set of good weights. You should carefully
consider, for each high level action, how you will reward or penalise the agent for its
performance. You may need to keep adjusting your implementation, observing how your
agent behaves, and trying something new to overcome the drawbacks you observe.
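As referenced in the first option above, here is a minimal sketch of the search route: plain A* with a Manhattan-distance heuristic over grid positions (the walls set and four-move model are assumptions; a full solution would wrap this state space in an expander class and query the maze via gameState):

import heapq

# Minimal A* sketch for low-level planning: find a move sequence from
# start to target on a grid with the four compass moves.

MOVES = {"North": (0, 1), "South": (0, -1), "East": (1, 0), "West": (-1, 0)}

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar_plan(start, target, walls):
    # Each frontier entry is (f, g, position, plan-so-far).
    frontier = [(manhattan(start, target), 0, start, [])]
    best_g = {start: 0}
    while frontier:
        f, g, pos, plan = heapq.heappop(frontier)
        if pos == target:
            return plan
        for action, (dx, dy) in MOVES.items():
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in walls and g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(frontier,
                               (g + 1 + manhattan(nxt, target), g + 1, nxt, plan + [action]))
    return None  # target unreachable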
But what if the agent already has a low-level plan, computed during a previous iteration? Again, it’s
up to you to decide how to proceed. The agent could continue following the existing low-level plan
or it might decide to revisit its decisions on account of new information provided by the game state.
For example, if Pacman is heading towards a food dot, but observes a ghost up ahead, continuing
toward the dot may be a bad idea.
The staff implementation also reuses low-level plans where possible.
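Again as a hedged sketch (not the staff code; the danger_radius value and the planned_positions argument are assumptions), a validity check for an existing low-level plan might be:

# Sketch: abandon the current move sequence if a visible opponent is
# within danger_radius of any square the plan still intends to visit.

def low_level_plan_still_valid(self, gameState, planned_positions, danger_radius=2):
    for enemy in self.getOpponents(gameState):
        enemy_pos = gameState.getAgentPosition(enemy)  # None if unobserved
        if enemy_pos is None:
            continue
        for pos in planned_positions:
            dist = abs(pos[0] - enemy_pos[0]) + abs(pos[1] - enemy_pos[1])
            if dist <= danger_radius:
                return False
    return True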
Remember! There are many different ways to achieve a high-level action. You may also wish to
consider whether your two agents coordinate their low-level actions or act independently of one
another.
3.4. Execute Moves
Your low-level strategy has generated a sequence of directions for the agent to move in. The first
move in this plan is given to the simulator for execution. Each of your two agents, and the two
opposing agents, move according to their own instructions and the timestep is advanced by one.
All of this happens after the completion of the chooseAction function.
Based on the new state of the game board, you may need to re-evaluate your goals and plans.
This workflow is cyclic and continues until the game is over.
Remember: You can (and should!) compare your implementation (myTeam.py) against the
existing baselines (staffTeam.py or berkeleyTeam.py). You can find these implementations in the
pacman folder. The baselines are useful to check that your implementation works and to measure
your progress.
Part 4: Competition Setup
We will have a competition to see who has the strongest Pacman AI. You will be able to submit
your implementation to a contest server, where submissions will be run against each other.
Upon submission to the server, your agent will be evaluated against a staff baseline implementation
(staffTeam.py). Your agent will be evaluated over 49 games on 7 different maps. Within each game
there will be a time limit, and the agent that has succeeded in eating the most food during that time
will win that game. To win the match against the staff baseline, your team must win convincingly: at
least 28 out of 49 games.
By defeating staffTeam.py, you will get a pass for the Criteria 1 competition score (see more details
in the marking rubric), earn a position on the leaderboard, and your agent will be evaluated against
all other participants on the leaderboard to earn additional victory points.
Against each opponent on the leaderboard (excluding the staff baseline implementation), your
agent will be evaluated on the same set of problems:
● In the event of a convincing win (win at least 28 out of 49 games), you will get 3 victory
points.
● In the event of a tie (win fewer than 28 games, but also lose fewer than 28 games; tied
games count as neither a win nor a loss), you will get 0.5 victory points.
● In the event of a loss (lose 28 games or more), you will get 0 victory points.
If you defeat the staff baseline implementation, your mark for the Criteria 1 competition score will
be (your total victory points / all available victory points) × 100%, where the total of all available
victory points is 3 × (number of participants − 1).
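As a worked example with hypothetical numbers: if there are 20 participants, then 3 × 19 = 57 victory points are available; a student who wins convincingly against 15 opponents (45 points) and ties against two (1 point) would receive 46/57 ≈ 81% for this criterion.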
You are expected to clearly win against the staff baseline implementation, otherwise you will
not proceed to the next stage of competing with entries from other students. In this case your
grade for Criteria 1 of the marking rubric will be a fail.
The server will keep a record of your matches against every opponent on the leaderboard. When
someone has a new best implementation recorded, your score against that student will be
updated as well.
More details regarding the competition setup will be provided in a supplementary document
“Assignment 2 Contest Submission Instruction.pdf”.
Part 5: Report
Together with your implementation source code you will be required to submit a report that
describes your approach. You will need to write a description of your strategy and give a
justification for algorithmic choices. Make sure that your report is as detailed and complete as
possible, including appropriate references.
When describing your model you should explain not only what it does but also analyse its strengths
and weaknesses (e.g., complexity, guarantees etc). You should also discuss how your code
implements that model: for example, important data structures, necessary optimisations and
parameter choices. Justify your design and implementation choices with discussion and, where
appropriate, with experiments.
You can ask questions about the assignment and you can discuss the merits of different design
choices (e.g., on Ed). However, all submitted work must be completely your own. Do not copy
implementations you happen to find online. Do not ask your friends or colleagues for their
implementations. You will need to work independently and take reasonable steps to make sure
others don't copy or misuse your work. For more information please see
https://www.monash.edu/students/study-support/academic-integrity
PLAGIARISM IS NOT ACCEPTABLE.
YOU WILL FAIL THE ASSIGNMENT AND POSSIBLY THE ENTIRE UNIT.
Part 6: Marking Rubric
The rubric specifies the marking criteria for your implementation and report. To avoid disappointment, pay close attention to the requirements.
Each criterion is graded in one of five bands: N (0%-49%), P (50%-59%), C (60%-69%), D
(70%-79%) and HD (80%-100%). The weight of each criterion is shown in parentheses.

1. Competition (weight: 33.3%)
● N: Loses to the staff baseline implementation.
● P: Outranks the staff baseline implementation.
● C: Gets 25%-49% of all available victory points.
● D: Gets 50%-74% of all available victory points.
● HD: Gets 75%-100% of all available victory points.
2. Agent strategy implementation (weight: 33.3%)
● N: Minor updates to the example PDDL implementation in the code base (for example, only
changing parameter values).
● P: Addresses some drawbacks in the existing implementation. Improves the model and high
level planner by (1) introducing new actions to the PDDL model that use more of the listed
predicates, or new predicates you introduced; or (2) designing new goal functions.
● C: Same as the previous band, plus: implements an alternative high level goal prioritisation
scheme (agents switch goals based on new observations); improves the low level planner,
e.g. with more features, better reward functions, or heuristic search for low level planning.
● D: Same as the previous band, plus: customised low level plans for distinct high level
actions; e.g. different low-level strategies for eat food (which?) vs. go to opposing side
(where?).
● HD: Same as the previous band, plus: high level and low level decisions take into account
the current action and strategy of the teammate agent (the two agents should cooperate
and share information with each other). Or: design your own model and agent workflow.
3. Report - Description of your approach (weight: 23.3%)
● N: Incomplete or insufficient description of the approach. Insufficient or missing
justifications. Little or no pseudo-code.
● P: High level description of the approach with sufficient pseudo-code. Discusses how the
new implementation improves upon the staff baseline, staffTeam.py.
● C: Same as the previous band, plus: discussion and algorithmic analysis (as applicable),
e.g. time, space, completeness, optimality. Provides motivation for your implementation
choices. Links the agent strategy to lectures and tutorials. Appropriate referencing (where
applicable).
● D: Same as the previous band, plus: reflections: discusses and compares the advantages
and disadvantages of at least 2-3 implemented strategies/attempts.
● HD: Same as the previous band, plus: well documented numerical experiments analysing
the efficiency, advantages and drawbacks of your different implementations, e.g.
improvements in success rate between different strategies/attempts, runtime statistics on
planning time, and appropriate reference algorithms.
4. Report - Communication skills (weight: 10%)
● N: Hard to follow with no clear narrative. Inadequate or no separation of the discussion text
into coherent sections. Writing is not accurate or articulate. Inadequate supporting
materials. Inadequate or missing referencing.
● P: The writing has a tenuously logical narrative, with some attempt at the expected
structural elements (e.g. intro, conclusion). Writing is not accurate or articulate most of the
time. The document has few supporting materials (tables, images, pseudo-code). The
student has attempted citing and referencing, with frequent errors.
● C: The text has a clear logical narrative and the expected structural elements (e.g. intro,
conclusion). Writing is accurate and articulate most of the time. There are some supporting
materials (tables, images, pseudo-code), but they are not well integrated with the rest of
the text. The student follows the requirements for citing and referencing, with some notable
errors.
● D: The writing is well composed, with a clear and logical narrative, and is well structured.
Writing is generally accurate and articulate. The document has appropriate supporting
materials that are well integrated with the rest of the text. The student follows the
requirements for citing and referencing, with some minor errors.
● HD: The writing is very well composed, with a very clear and logically formed narrative.
Writing is accurate and articulate. The document is expertly structured in the style of a
scientific report, including appropriate supporting materials that clearly improve the quality
of the associated discussion. The student follows the requirements for citing and
referencing.
Part 7: Submission Guide
Submission to Moodle:
The following submission items are required:
1. Your implementation source code, in a single directory called "src" (you can copy
everything in the piglet folder to "src"). Zip the code directory with the file name
last_name_student_id_pacman.zip. (For example, Chen_123456_pacman.zip)
2. The report describing your approach, as a single PDF file. Name the PDF
last_name_student_id_report_pacman.pdf. (For example,
Chen_123456_report_pacman.pdf)
Submission to Contest Server:
You must make a successful submission on the contest server as part of your mark.
Submit early and submit often. Don’t wait for the last day, when you might be subject
to circumstances out of your control (like server outages!).
Submission deadline: 11:55 pm, Sunday 28 May 2022

