CSU33061
Online Exam: CSU33061 Artificial Intelligence 1
17 May 2021, noon-14:00 (+20 minutes for LENS students)
Instructions:
Answer two of the three questions, labelled Q1, Q2 and Q3. If you answer all three
questions, your answer to question Q3 will be ignored (and left unmarked). Each
question is worth 50 marks; two questions together are 100 marks.
Please type your answers and upload a plaintext file or PDF. (PDF converters are
available online.)
If you have any questions or encounter any issues with the test, please contact
[email protected].
Tip: As partial credit will be awarded where possible, try not to leave any part of a
question you are answering blank. Write down some thoughts that show effort and
understanding (even if the understanding is that the effort falls short). But to avoid
penalties for overly long answers, please be concise.
Declaration: By taking this exam, you are declaring
I understand that this is an individual assessment and that collaboration is
not permitted.
I have not received any assistance with my work for this assessment.
Where I have used the published work of others, I have indicated this with
appropriate citation.
I have not and will not share any part of my work on this assessment,
directly or indirectly, with any other student.
I have read and I understand the plagiarism provisions in the General
Regulations of the University Calendar for the current year, found at
http://www.tcd.ie/calendar.
I have also completed the Online Tutorial on avoiding plagiarism, located
at https://libguides.tcd.ie/plagiarism/ready-steady-write.
I understand that by taking this exam, I am agreeing with the statement
above.
[end of Declaration]
Question Q1 This question is about searching a graph for a path from a given node
Start to some goal node. The predicate frontierSearch/1 below outlines a general
approach on which to add definitions of goal/1, arc/2 and add2frontier/3.
search(Start):- frontierSearch([Start]).

frontierSearch([Node|_]):- goal(Node).
frontierSearch([Node|More]):-
    findall(Next, arc(Node,Next), Children),
    add2frontier(Children, More, New),
    frontierSearch(New).
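As a sanity check, the outline can be run on a toy instantiation of the unspecified predicates (the graph a → b → g and the append-based add2frontier below are invented for illustration; they are not part of the question):

goal(g).
arc(a,b).
arc(b,g).
add2frontier(Children, More, New) :- append(Children, More, New).

With these clauses, ?- search(a). succeeds: the frontier evolves [a], [b], [g], and goal(g) ends the search. Note that putting Children at the front of the frontier makes this particular add2frontier a depth-first strategy.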
(a) Revise frontierSearch(+Frontier) to froSearch(+Frontier,?Path) so
that froSearch([[Start]],Path) returns a Path from Start to a goal node.
As with the definition of frontierSearch above, leave the predicates goal/1,
arc/2 and add2frontier/3 unspecified.
[4 marks]
(b) To define add2frontier, recall that A-star uses two functions, cost and h, to
assign each arc a cost, and each node a heuristic estimate of its distance to a
goal node.
(i) One way to define distance between nodes is the minimum cost of a path
between the nodes. How is the cost of a path computed from costs of arcs?
(Answer in Prolog or in concise unambiguous English.)
[3 marks]
(ii) True or False: If the heuristic estimate of every node is 0, then A-star does
a min-cost search.
[2 marks]
(iii) How can we define the functions cost and h so that A-star on these
functions does breadth-first search? (Answer in Prolog or in concise
unambiguous English.)
[3 marks]
(iv) True or False: if A-star searches breadth-first then A-star is admissible.
[3 marks]
(v) True or False: If the heuristic function h overestimates the cost of a path
to a goal node, then A-star is not admissible. Briefly explain your answer.
[6 marks]
(c) Recall from Homework 1 that a propositional Prolog knowledge base such as
q:- a.
q:- b,c.
b:- c.
c.
can be represented as the list
[[q,a],[q,b,c],[b,c],[c]]
and used as KB in arc(Node,Next,KB) to pick out arcs (Node,Next) along
which Prolog tries to find a path to [] (the goal node), where
arc([H|T],Node,KB) :- member([H|B],KB), append(B,T,Node).
Thus, we can analyze the Prolog query ?-q against the knowledge base above as
a search for a path from the node [q] to []. The arc from [q] to [a] fails to
lead to a path to [] because
(†) there is no arc from [a] (i.e., no member of KB has head a)
whereas the arc from [q] to [b,c] leads to the path
[q], [b,c], [c,c], [c], []
which we can shorten to
[q], [b,c], [c], []
if we replace arc(Node,Next,KB) by arc2(Node,Next,KB) to reduce Next to
a set (without repeating members) as in
arc2(Node,Next,KB) :- arc(Node,Nx,KB),
                      makeSet(Nx,Next).

makeSet([],[]).
makeSet(List,Set) :- setof(X, member(X,List), Set).
Line (†) above suggests revising arc2(Node,Next,KB) further to
arc3(Node,Next,KB) :- arc2(Node,Next,KB),
                      allHeads(Next,KB).

allHeads([],_).
allHeads([H|T],KB) :- member([H|_],KB), allHeads(T,KB).
That is, arc3(Node,Next,KB) implies Next is a set of heads of members of KB.
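Tracing the sample KB above by hand illustrates the difference (the queries and answers below are worked out from the clauses; worth re-checking in a Prolog system):

?- arc([q], Next, [[q,a],[q,b,c],[b,c],[c]]).
Next = [a] ;
Next = [b,c].

?- arc3([q], Next, [[q,a],[q,b,c],[b,c],[c]]).
Next = [b,c].

arc3 discards the dead-end successor [a] of (†), since a heads no member of KB.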
(i) Which of (A), (B) and (C), if any, are true for all instantiations of Node,
Next and KB?
(A) arc3(Node,Next,KB) implies arc2(Node,Next,KB)
(B) arc2(Node,Next,KB) implies arc(Node,Next,KB)
(C) arc3(Node,Next,KB) implies arc(Node,Next,KB)
[5 marks]
(ii) True or False: if arc3(Node,Next,KB) then there is a path from Next to
[] along arcs satisfying arc3. Briefly explain your answer.
[6 marks]
(iii) True or False: if there is a path of length k from Start to [] along arcs
satisfying the predicate arc, then there is a path of length ≤ k from Start
to [] along arcs satisfying arc3. Briefly explain your answer.
[6 marks]
(iv) True or False: there is a path from Start to [] along arcs satisfying the
predicate arc if and only if there is a path from Start to [] along arcs
satisfying arc3. Briefly explain your answer.
[6 marks]
(v) Briefly describe the advantages and disadvantages (if any) of replacing
arc(Node,Next,KB) by arc3(Node,Next,KB).
[6 marks]
Question Q2 This question is about Markov Decision Processes (MDPs) and
Q-learning.
(a) What is the decision that an MDP is set up to analyze?
[4 marks]
(b) True or False: in an MDP 〈S, A, p, r, γ〉, an agent can do any action a ∈ A at any
state s ∈ S to reach some state s′ ∈ S for an immediate reward of r(s, a, s′).
Briefly explain your answer.
[4 marks]
(c) In an MDP 〈S, A, p, r, γ〉 with discount factor γ = 0, what action a ∈ A should
an agent do at a state s ∈ S?
[4 marks]
(d) What is so odd about a discount factor γ equal to 1?
[4 marks]
(e) The remainder of Q2 is about a baby variant of the grid from Homework 2,
reduced from 5 × 5 to 3 × 3, and differing in other respects. To be precise,
let us fix a discount factor γ = .9 and flesh out an MDP 〈S, A, p, r, .9〉
where a state s ∈ S is a (row,column)-pair of integers from {1, 2, 3},
an action a ∈ A is one of: up, down, left, right
S = {1, 2, 3} × {1, 2, 3}
A = {up, down, left, right}
(1,1) (1,2) (1,3)
(2,1) (2,2) (2,3)
(3,1) (3,2) (3,3)
and every immediate reward r(s, a, s′) is 0 unless s′ is either (2,2) for a losing −1
or (3,3) for a winning +1:

    r(s, a, s′) =  −1 if s′ = (2,2)
                   +1 if s′ = (3,3)
                    0 otherwise

(in the grid pictured above, −1 marks square (2,2) and +1 marks square (3,3))
which leaves transition probabilities p(s, a, s′) to be specified below.
(i) Let us suppose that if s is either (2,2) or (3,3) then for every action a,
p(s, a, s) = 1. Recall that the optimal γ-discounted reward Q(s, a) can be
approximated by value iterations Qn(s, a) converging to it in the limit:

    Q(s, a) = lim_{n→∞} Qn(s, a).
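For reference, the Qn here are the usual value-iteration approximants; assuming the standard definition with Q0(s, a) = 0 everywhere (an assumption, since the question does not spell the update out), each step computes

    Qn+1(s, a) = Σ_{s′ ∈ S} p(s, a, s′) · ( r(s, a, s′) + γ · max_{a′ ∈ A} Qn(s′, a′) ).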
What are Qn((2, 2), a) and Qn((3, 3), a) for every integer n ≥ 0?
[9 marks]
(ii) Let us turn next to a state s different from (2,2) and (3,3). If s has a
square in the direction of an action a from it (in the 3 × 3 grid), let s_a be
that state; otherwise, let s_a be s. For example,

    (1,1)_right = (1,2)  but  (1,1)_left = (1,1)_up = (1,1).

Now, suppose

    p(s, a, s_a) = 1 for every a ∈ A and s ∈ S different from (2,2) and (3,3).

What are Qn((2,3), left) and Qn((2,3), down) for every integer n ≥ 0?
[9 marks]
(iii) Suppose that over an environment described by the MDP above, we
implemented Q-learning, starting every episode at state (1,1) and ending
the episode as soon as we reach (2,2) or (3,3). What learning rate α and
change in learning rate would you recommend so that Q-learning might
most quickly prescribe the optimal policy for γ-discounted rewards
lim_{n→∞} Qn(s, a) for Qn(s, a) from parts (i) and (ii)?
Can we expect our Q-learning approach to give Q-values approaching these
limits as we increase the number of training episodes? Briefly explain your
answer.
[16 marks]
Question Q3
(a) Let P be a Constraint Satisfaction Problem [Var,Dom,Con] over n variables,
each ranging over s values; i.e., Var = [x1, ... , xn] and Dom(xi) has cardinality s,
for every i from 1 to n. To reduce P to Boolean Satisfiability SAT, is it enough
to consider k Boolean variables for a search space of 2k variable assignments,
where k is the smallest integer ≥ n log2 s? Briefly explain your answer.
[15 marks]
(b) True or False: the Herbrand interpretation of a Datalog knowledge base KB is
the same as the Herbrand interpretation of a Datalog knowledge base KB′ if and
only if KB is the same as KB′. Briefly explain your answer.
[10 marks]
(c) What belief network do the transition probabilities p in a Markov decision
process describe?
[15 marks]
(d) Briefly explain the sense in which the temperature T in simulated annealing plays
the same role as ε in an ε-greedy algorithm for Q-learning.
[10 marks]