
   Notation

variable: x, α, β, ... (a.k.a. propositional variable / Boolean variable)
literal: x, ¬x
conjunction of α and β: α ∧ β
disjunction of α and β: α ∨ β
negation of α: ¬α
sentence: variables are sentences; negation, conjunction, and disjunction of sentences are sentences
term: conjunction (∧) of literals
clause: disjunction (∨) of literals
normal forms: universal formats for all logic sentences (every sentence can be transformed into CNF/DNF)
CNF: conjunctive normal form, a conjunction (∧) of clauses (∨)
DNF: disjunctive normal form, a disjunction (∨) of terms (∧)
world: ω, a truth assignment of all variables (e.g. ω |= α means sentence α holds at world ω)
models: Mods(α) = {ω : ω |= α}

2020 Fall By Patricia Xiao

Useful Equations

α ⇒ β = ¬α ∨ β
α ⇒ β = ¬β ⇒ ¬α
¬(α ∨ β) = ¬α ∧ ¬β
¬(α ∧ β) = ¬α ∨ ¬β
γ ∧ (α ∨ β) = (γ ∧ α) ∨ (γ ∧ β)
γ ∨ (α ∧ β) = (γ ∨ α) ∧ (γ ∨ β)

Semantic Relationships

Equivalence: α and β are equivalent iff Mods(α) = Mods(β).

Mutually Exclusive: α and β are mutually exclusive iff Mods(α ∧ β) = Mods(α) ∩ Mods(β) = ∅.

Exhaustive: α and β are exhaustive iff Mods(α ∨ β) = Mods(α) ∪ Mods(β) = W, that is, when α ∨ β is valid.

Entailment: α entails β (α |= β) iff

Mods(α) ⊆ Mods(β)

That is, satisfying α is stricter than satisfying β.

Monotonicity: the property of relations that

• if α implies β, then α ∧ γ implies β;

• if α entails β, then α ∧ γ entails β.

This implies that adding more knowledge to the existing KB (knowledge base) never retracts any conclusion, which is considered a limitation of traditional logic. Proof:

Mods(α ∧ γ) ⊆ Mods(α) ⊆ Mods(β)

Quantified Boolean Logic: Notations

Our discussion on quantified Boolean logic centers around conditioning and restriction. (|, ∃, ∀) With a propositional sentence ∆ and a variable P :

• condition ∆ on P: ∆|P

i.e. replacing all occurrences of P by true.

• condition ∆ on ¬P: ∆|¬P

i.e. replacing all occurrences of P by false.

Boole’s/Shannon’s Expansion:

∆ = (P ∧ (∆|P)) ∨ (¬P ∧ (∆|¬P))

It enables solving logic problems recursively, e.g. in DPLL.
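As a concrete illustration, conditioning on a literal in clausal form can be sketched in Python and used to check Shannon's expansion by brute force. The clause representation and helper names (`condition`, `satisfied_by`) are mine, not from the notes:

```python
from itertools import product

# Clausal form: a set of clauses; each clause is a frozenset of literals,
# where a literal is +v (variable v true) or -v (variable v false).

def condition(cnf, lit):
    """Compute Delta|lit: drop satisfied clauses, delete the falsified literal."""
    return {c - {-lit} for c in cnf if lit not in c}

def satisfied_by(cnf, world):
    """world maps variable id -> bool."""
    return all(any(world[abs(l)] == (l > 0) for l in c) for c in cnf)

# Delta = (A or B) and (not B or C), with A, B, C as 1, 2, 3
delta = {frozenset({1, 2}), frozenset({-2, 3})}

# Shannon's expansion: Delta == (B and Delta|B) or (not B and Delta|not B)
for bits in product([False, True], repeat=3):
    w = dict(zip((1, 2, 3), bits))
    lhs = satisfied_by(delta, w)
    rhs = (w[2] and satisfied_by(condition(delta, 2), w)) or \
          (not w[2] and satisfied_by(condition(delta, -2), w))
    assert lhs == rhs
print("Shannon expansion verified")
```

Conditioning on B drops the satisfied clause (A ∨ B) and shrinks (¬B ∨ C) to (C), matching the "replace P by true" reading above.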

CS264A Automated Reasoning Review Note

         Main Content of CS264A

• Foundations: logic, quantified Boolean logic, SAT solvers, Max-SAT etc., compiling knowledge into tractable circuits (the book chapters)

• Applications: three modern roles of logic in AI

1. logic for computation
2. logic for learning from knowledge / data
3. logic for meta-learning

Syntax and Semantics of Logic

Logic syntax, “how to express”, includes literals etc., all the way up to normal forms (CNF/DNF).

Logic semantics, “what it means”, can be discussed from two perspectives:

• properties: consistency, validity etc. (of a sentence)

• relationships: equivalence, entailment, mutual exclusiveness etc. (of sentences)

Listing the 2^n worlds ωi involving n variables, we have a truth table.

If sentence α is true at world ω, ω |= α, we say:

• sentence α holds at world ω
• ω satisfies α
• ω entails α

otherwise ω ̸|= α.

Mods(α) is called the models/meaning of α:

Mods(α) = {ω : ω |= α}
Mods(α ∧ β) = Mods(α) ∩ Mods(β)
Mods(α ∨ β) = Mods(α) ∪ Mods(β)
Mods(¬α) = W \ Mods(α)

ω |= α: world ω entails/satisfies sentence α. α ⊢ β: sentence α derives sentence β.
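The Mods identities above can be checked by brute-force enumeration of worlds. This small sketch treats sentences as Python predicates; the two-variable vocabulary and helper names are mine:

```python
from itertools import product

VARS = ("A", "B")                                    # tiny vocabulary
W = set(product([False, True], repeat=len(VARS)))    # all worlds

def mods(sentence):
    """Mods(alpha): the set of worlds (bool tuples) where alpha holds."""
    return {w for w in W if sentence(dict(zip(VARS, w)))}

alpha = lambda w: w["A"] or w["B"]     # alpha = A or B
beta  = lambda w: not w["A"]           # beta  = not A

assert mods(lambda w: alpha(w) and beta(w)) == mods(alpha) & mods(beta)
assert mods(lambda w: alpha(w) or beta(w))  == mods(alpha) | mods(beta)
assert mods(lambda w: not alpha(w))         == W - mods(alpha)
print("Mods identities hold")
```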

Semantic Properties

Define ∅ as the empty set and W as the set of all worlds.

Consistency: α is consistent when Mods(α) ̸= ∅.

Validity: α is valid when Mods(α) = W.

α is valid iff ¬α is inconsistent. α is consistent iff ¬α is invalid.

 

   Existential & Universal Quantification

Existential Quantification: ∃P ∆ = ∆|P ∨ ∆|¬P

Universal Quantification: ∀P ∆ = ∆|P ∧ ∆|¬P

Resolution / Inference Rules

Modus Ponens (MP):

α, α ⇒ β
─────────
β

Clausal Form of CNF

CNF, the Conjunctive Normal Form, is a conjunction of clauses.

∆ = C1 ∧ C2 ∧ . . . written in clausal form as:

∆ = {C1,C2 ...}

where each clause Ci is a disjunction of literals:

Ci = li1 ∨ li2 ∨ li3 ∨ ...

written in clausal form as:

Ci = {li1, li2, li3, ...}

Resolution in the clausal form is formalized as:

• Given clauses Ci and Cj where literal P ∈ Ci and literal ¬P ∈ Cj

• The resolvent is (Ci \{P }) ∪ (Cj \{¬P }) (Nota- tion: removing set {P } from set Ci is written as Ci \{P })

If the clausal form of a CNF contains an empty clause (∃i, Ci = ∅ = {}), then the CNF is inconsistent / unsatisfiable.
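Resolution on the clausal form is a single set operation. The following sketch uses string literals with `~` for negation; the representation and the name `resolve` are mine:

```python
def resolve(ci, cj, p):
    """Resolvent of clauses ci, cj over variable p, assuming p is in ci
    and ~p is in cj; clauses are frozensets of string literals."""
    assert p in ci and "~" + p in cj
    return (ci - {p}) | (cj - {"~" + p})

ci = frozenset({"P", "Q"})     # P or Q
cj = frozenset({"~P", "R"})    # not P or R
print(sorted(resolve(ci, cj, "P")))   # → ['Q', 'R'], i.e. the resolvent Q or R
```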

Existential Quantification via Resolution

1. Turn KB ∆ into CNF.
2. To existentially quantify B, perform all B-resolutions (add all resolvents).
3. Drop all clauses mentioning B (either B or ¬B).

Unit Resolution

Unit resolution is a special case of resolution where min(|Ci|, |Cj|) = 1, where |Ci| denotes the size of set Ci. Unit resolution corresponds to modus ponens (MP). It is NOT refutation complete, but it has efficiency benefits: it can be applied in linear time.

Refutation Theorem

∆ |= α iff ∆ ∧ ¬α is inconsistent. (useful in proof)

• resolution finds a contradiction on ∆ ∧ ¬α: ∆ |= α

• resolution does not find any contradiction on ∆ ∧ ¬α: ∆ ̸|= α

Duality:

∃P ∆ = ¬(∀P ¬∆)
∀P ∆ = ¬(∃P ¬∆)

Resolution:

α ∨ β, ¬β ∨ γ
──────────────
α ∨ γ

equivalent to:

¬α ⇒ β, β ⇒ γ
──────────────
¬α ⇒ γ

Quantified Boolean logic is different from first-order logic, for it does not express everything as objects and relations among objects.

Forgetting

The right-hand side of the above-mentioned equation, ∃P ∆ = ∆|P ∨ ∆|¬P, does not mention P.

Here we have an example: ∆ = {A ⇒ B, B ⇒ C},

(In the inference-rule notation, above the line are the known conditions; below the line is what can be inferred from them.)

In the resolution example, α ∨ γ is called a “resolvent”. We can say it any of these ways:

• resolve α ∨ β with ¬β ∨ γ
• resolve over β
• do β-resolution

MP is a special case of resolution where α = false. Resolution is always written as:

∆ = {α ∨ β, ¬β ∨ γ} ⊢R α ∨ γ

Applications of resolution rules:

1. existential quantification

2. simplifying KB (∆)

3. deduction (strategies of resolution, directed resolution)

Completeness of Resolution / Inference Rule

We say rule R is complete iff ∀α, if ∆ |= α then ∆ ⊢R α.

In other words, R is complete when it can “discover everything entailed by ∆”.

Resolution as an inference rule is NOT complete. A counterexample: ∆ = {A, B}, α = A ∨ B. However, when applied to CNF, resolution is refutation complete, which means it is sufficient to discover any inconsistency.

then:

∆ = (¬A ∨ B) ∧ (¬B ∨ C)
∆|B = C
∆|¬B = ¬A

∴ ∃B ∆ = ∆|B ∨ ∆|¬B = ¬A ∨ C

• ∆ |= ∃P ∆

• If α is a sentence that does not mention P, then

∆ |= α ⇐⇒ ∃P ∆ |= α

We can safely remove P from ∆ when considering existential quantification. This is called:

• forgetting P from ∆
• projecting ∆ on all variables but P
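Forgetting can be checked by brute force on the example above. This sketch (variable names and helpers are mine) verifies that ∃B ∆ is equivalent to ¬A ∨ C:

```python
from itertools import product

# Delta = {A => B, B => C}; forgetting B should leave not A or C.
delta  = lambda w: (not w["A"] or w["B"]) and (not w["B"] or w["C"])
forget = lambda w: delta({**w, "B": True}) or delta({**w, "B": False})  # exists B. Delta
target = lambda w: (not w["A"]) or w["C"]                               # not A or C

for a, c in product([False, True], repeat=2):
    assert forget({"A": a, "C": c}) == target({"A": a, "C": c})
print("exists B. Delta is equivalent to not A or C")
```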

 

   Resolution Strategies: Linear Resolution

All the clauses originally included in the CNF ∆ are root clauses.

Linear resolution resolves Ci and Cj only if one of them is a root clause or an ancestor of the other clause.

An example: ∆ = {¬A, C}, {¬C, D}, {A}, {¬C, ¬D}. Starting from the roots, a linear derivation is: resolve {¬A, C} with {A} to get {C}; {C} with {¬C, D} to get {D}; {D} with {¬C, ¬D} to get {¬C}; {¬C} with {C} to get {}.

Resolution Strategies: Directed Resolution

Directed resolution is based on bucket elimination, and requires pre-defining an order in which to process the variables. The steps are as follows:

1. With n variables, we have n buckets, each corresponding to a variable, listed from top to bottom in order.

2. Fill the clauses into the buckets: scanning top to bottom, put each clause into the first bucket whose corresponding variable is mentioned in the clause.

3. Process the buckets top to bottom; whenever we have a P-resolvent Cij, put it into the first following bucket whose corresponding variable is mentioned in Cij.

An example: ∆ = {¬A, C}, {¬C, D}, {A}, {¬C, ¬D}, with variable order A, D, C, initialized as:

A: {¬A, C}, {A}
D: {¬C, D}, {¬C, ¬D}
C:

After processing we find {}: {C} is the A-resolvent (placed into the C-bucket), {¬C} is the D-resolvent (also placed into the C-bucket), and {} is a C-resolvent.
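The bucket-elimination procedure can be sketched in Python on this very example. The representation (+v/−v integer literals) and names are mine, and tautology filtering is omitted:

```python
# Directed resolution via bucket elimination, following the example
# Delta = {~A,C}, {~C,D}, {A}, {~C,~D} with variable order A, D, C.

def directed_resolution(cnf, order):
    buckets = {v: [] for v in order}
    def place(clause):
        for v in order:                       # first bucket mentioning clause
            if v in clause or -v in clause:
                buckets[v].append(clause)
                return
    for c in cnf:
        place(c)
    for i, v in enumerate(order):             # process buckets top to bottom
        pos = [c for c in buckets[v] if v in c]
        neg = [c for c in buckets[v] if -v in c]
        for cp in pos:
            for cn in neg:
                resolvent = (cp - {v}) | (cn - {-v})
                if resolvent == frozenset():
                    return "UNSAT"            # empty clause derived
                for u in order[i + 1:]:       # first later bucket mentioning it
                    if u in resolvent or -u in resolvent:
                        buckets[u].append(resolvent)
                        break
    return "no empty clause derived"

A, C, D = 1, 3, 4   # arbitrary variable ids
delta = [frozenset({-A, C}), frozenset({-C, D}), frozenset({A}), frozenset({-C, -D})]
print(directed_resolution(delta, [A, D, C]))   # → UNSAT
```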

Directed Resolution: Forgetting

Directed resolution can be applied to forgetting / projecting.

When we do existential quantification on variables P1, P2, ..., Pm, we:

1. put them in the first m places of the variable order;

2. after processing the first m (P1, P2, ..., Pm) buckets, remove those first m buckets;

3. keep the clauses (original clauses or resolvents) in the remaining buckets.

Then it is done.

Utility of Using Graphs

Primal Graph: Each node represents a variable. Given CNF ∆, if there is at least one clause C ∈ ∆ containing literals of both Pi and Pj, then the corresponding nodes Pi and Pj are connected by an edge.

The treewidth w (a property of the graph) can be used to estimate time & space complexity, e.g. the complexity of directed resolution: the space complexity over n variables is O(n exp(w)). For more, see the textbook — min-fill heuristic.

Decision Tree: Can be used for model counting. e.g. ∆ = A ∧ (B ∨ C), with n = 3.

SAT Solvers

The SAT solvers we learn in this course:

• require modest space

• are foundations of many other things

Along the line there are: SAT I, SAT II, DPLL, and other modern SAT solvers.

They can be viewed as optimized searchers over all the worlds ωi, looking for a world satisfying ∆.

SAT I


1. SAT-I (∆, n, d):
2.     If d = n:
3.         If ∆ = {}, return {}
4.         If ∆ = {{}}, return FAIL
5.     If L = SAT-I(∆|Pd+1, n, d + 1) ̸= FAIL:
6.         return L ∪ {Pd+1}
7.     If L = SAT-I(∆|¬Pd+1, n, d + 1) ̸= FAIL:
8.         return L ∪ {¬Pd+1}
9.     return FAIL

(Figure: decision tree for ∆ = A ∧ (B ∨ C); each node has a high child (true) and a low child (false).)

∆: a CNF; unsatisfied when {} ∈ ∆, satisfied when ∆ = {}
n: number of variables, P1, P2, ..., Pn
d: the depth of the current node

• the root node has depth 0 and corresponds to P1
• nodes at depth n − 1 try Pn
• leaf nodes are at depth n; each represents a world ωi

A typical DFS (depth-first search) algorithm:

• DFS, thus O(n) space requirement (moderate)
• No pruning, thus O(2^n) time complexity
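A runnable sketch of SAT-I (function and helper names are mine): plain DFS that conditions the clausal-form CNF on Pd+1 and then ¬Pd+1, checking satisfaction only at the leaves:

```python
def condition(cnf, lit):
    """Delta|lit on clausal form: frozensets of +v/-v literals."""
    return {c - {-lit} for c in cnf if lit not in c}

def sat1(delta, n, d=0):
    if d == n:
        if not delta:
            return set()          # all clauses satisfied: found a model
        return None               # FAIL: delta contains the empty clause
    for lit in (d + 1, -(d + 1)): # try P_{d+1}, then not P_{d+1}
        res = sat1(condition(delta, lit), n, d + 1)
        if res is not None:
            return res | {lit}
    return None

# Delta = (P1 or P2) and (not P1)
delta = {frozenset({1, 2}), frozenset({-1})}
print(sorted(sat1(delta, 2)))   # → [-1, 2]
```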

SAT II

1. SAT-II (∆, n, d):
2.     If ∆ = {}, return {}
3.     If ∆ = {{}}, return FAIL
4.     If L = SAT-II(∆|Pd+1, n, d + 1) ̸= FAIL:
5.         return L ∪ {Pd+1}
6.     If L = SAT-II(∆|¬Pd+1, n, d + 1) ̸= FAIL:
7.         return L ∪ {¬Pd+1}
8.     return FAIL

Mostly SAT I, plus early-stop.



For counting purposes we assign the value 2^n = 2^3 = 8 to the root (A in this case), 2^(n−1) = 4 to the next level (its direct children), etc., and finally we sum up the values assigned to all true leaves. Here we have 2 + 1 = 3, so |Mods(∆)| = 3. Constructing the tree:

• If a branch is inconsistent, put false at that node.

• Directed resolution can be used to build a decision tree: the P-bucket gives the P nodes.

 

   Termination Tree

Termination tree is a sub-tree of the complete search space (which is a depth-n complete binary tree), in- cluding only the nodes visited while running the algo- rithm.

When drawing the termination tree of SAT I and SAT II, we put a cross (X) on the failed nodes, with {{}} label next to it. Keep going until we find an answer — where ∆ = {}.

Unit-Resolution

1. Unit-Resolution (∆):
2.     I = unit clauses in ∆
3.     If I = {}: return (I, ∆)
4.     Γ = ∆|I
5.     If Γ = ∆: return (I, Γ)
6.     return Unit-Resolution(Γ)

Used in DPLL, at each node.
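A runnable sketch of unit resolution (the clause representation and names are mine; unlike the pseudocode above, it accumulates the implied literals across rounds and flags contradictory units explicitly):

```python
def unit_resolution(delta):
    """Condition delta on its unit clauses to a fixpoint.
    Returns (implied literals, remaining clauses)."""
    implied, clauses = set(), set(delta)
    while True:
        units = {next(iter(c)) for c in clauses if len(c) == 1} - implied
        if not units:
            return implied, clauses
        implied |= units
        if any(-l in implied for l in units):
            return implied, {frozenset()}   # contradictory units: conflict
        # drop satisfied clauses, delete falsified literals
        clauses = {c - {-l for l in units}
                   for c in clauses if not (c & units)}

# Delta = {P1} and (~P1 or P2) and (~P2 or P3 or P4)
delta = {frozenset({1}), frozenset({-1, 2}), frozenset({-2, 3, 4})}
I, gamma = unit_resolution(delta)
print(I, gamma)   # I = {1, 2}; gamma = {frozenset({3, 4})}
```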

DPLL

01. DPLL (∆):
02.     (I, Γ) = Unit-Resolution(∆)
03.     If Γ = {}, return I
04.     If {} ∈ Γ, return FAIL
05.     choose a literal l in Γ
06.     If L = DPLL(Γ ∪ {{l}}) ̸= FAIL:
07.         return L ∪ I
08.     If L = DPLL(Γ ∪ {{¬l}}) ̸= FAIL:
09.         return L ∪ I
10.     return FAIL

Mostly SAT II, plus unit resolution: Unit-Resolution is used at each node to look for entailed values, saving search steps.

If any implication is made by Unit-Resolution, we write the values next to the node where the implication is made (e.g. A = t, B = f, ...).

This is NOT a standard DFS: the Unit-Resolution component makes the search flexible.
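A compact runnable DPLL sketch with unit propagation at every node (clause representation and names are mine):

```python
def propagate(clauses):
    """Unit propagation to a fixpoint: (implied literals, residual clauses)."""
    implied = set()
    while True:
        units = {next(iter(c)) for c in clauses if len(c) == 1} - implied
        if not units:
            return implied, clauses
        implied |= units
        if any(-l in implied for l in units):
            return implied, {frozenset()}   # contradictory units: conflict
        clauses = {c - {-l for l in units}
                   for c in clauses if not (c & units)}

def dpll(clauses):
    implied, gamma = propagate(clauses)
    if not gamma:
        return implied                 # satisfying literals collected so far
    if frozenset() in gamma:
        return None                    # conflict: FAIL
    lit = next(iter(next(iter(gamma))))
    for branch in (lit, -lit):
        res = dpll(gamma | {frozenset({branch})})
        if res is not None:
            return res | implied
    return None

delta = {frozenset({1, 2}), frozenset({-1, 3}), frozenset({-3, -2})}
model = dpll(delta)
assert model is not None and all(any(l in model for l in c) for c in delta)
print("model:", sorted(model))
```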

Non-chronological Backtracking

Chronological backtracking: when we find a contradiction/FAIL while searching, backtrack to the parent. Non-chronological backtracking is an optimization where we jump back to earlier nodes; a.k.a. conflict-directed backtracking.

Implication Graphs

Implication Graph is used to find more clauses to add to the KB, so as to empower the algorithm.

An example of an implication graph upon the first conflict found when running DPLL+ for ∆:

Asserting Clause & Assertion Level

Asserting Clause: a learned clause including only one variable assigned at the last (highest) decision level. (The last decision level is the level where the last decision/implication is made.)

Assertion Level (AL): the second-highest level in the clause. (Note: 3 is higher than 0.)

An example (following the previous example, on the learned clauses). The KB ∆ is:

1. {A, B}
2. {B, C}
3. {¬A, ¬X, Y}
4. {¬A, X, Y}
5. {¬A, ¬Y, Z}
6. {¬A, X, ¬Z}
7. {¬A, ¬Y, ¬Z}

Decisions and implications by depth: 0: A = t; 1: B = t; 2: C = t; 3: X = t, after which Y = t and Z = t are implied at level 3, reaching a conflict {} at level 3.

Clause          | Decision Levels | Asserting? | AL
{¬A, ¬X}        | {0, 3}          | Yes        | 0
{¬A, ¬Y}        | {0, 3}          | Yes        | 0
{¬A, ¬Y, ¬Z}    | {0, 3, 3}       | No         | 0

DPLL+

01. DPLL+ (∆):
02.     D ← ()
03.     Γ ← {}
04.     While true Do:
05.         (I, L) = Unit-Resolution(∆ ∧ Γ ∧ D)
06.         If {} ∈ L:
07.             If D = (): return false
08.             Else (backtrack to assertion level):
09.                 α ← asserting clause
10.                 m ← AL(α)
11.                 D ← first m + 1 decisions in D
12.                 Γ ← Γ ∪ {α}
13.         Else:
14.             find l where {l} ∉ I and {¬l} ∉ I
15.             If an l is found: D ← D; l
16.             Else: return true

DPLL+ returns true if the CNF ∆ is satisfiable, otherwise false. Γ is the learned clauses; D is the decision sequence. Idea: backtrack to the assertion level, add the conflict-driven (learned) clause to the knowledge base, and apply unit resolution. Selecting α: find the first UIP.

There, the decision and implication assignments of variables are labeled by the depth at which each value is determined.

The edges are labeled by the ID of the corresponding clause in ∆ used to generate a unit clause (make an implication).

Implication Graphs: Cuts

Cuts in an Implication Graph can be used to identify the conflict sets. Still following the previous example:

• Cut #1 separates conflict set A = t, X = t and yields learned clause {¬A, ¬X};
• Cut #2 separates conflict set A = t, Y = t and yields learned clause {¬A, ¬Y};
• Cut #3 separates conflict set A = t, Y = t, Z = t and yields learned clause {¬A, ¬Y, ¬Z}.

UIP (Unique Implication Point)

A UIP is a variable assignment that lies on every path from the last decision level to the contradiction.

The first UIP is the one closest to the contradiction. For example, in the previous example, the last UIP is 3/X = t, while the first UIP is 3/Y = t.

 

   Exhaustive DPLL

Exhaustive DPLL: DPLL that doesn’t stop when finding a solution; it keeps going until it has explored the whole search space.

It is useful for model counting.

However, recall that DPLL is based on the fact that ∆ is satisfiable iff ∆|P is satisfiable or ∆|¬P is satisfiable, which implies that we do not have to test both branches to determine satisfiability.

Therefore, we have a smarter algorithm for model counting using DPLL: CDPLL.

CDPLL

1. CDPLL (Γ, n):
2.     If Γ = {}: return 2^n
3.     If {} ∈ Γ: return 0
4.     choose a literal l in Γ
5.     (I+, Γ+) = Unit-Resolution(Γ ∪ {{l}})
6.     (I−, Γ−) = Unit-Resolution(Γ ∪ {{¬l}})
7.     return CDPLL(Γ+, n − |I+|) + CDPLL(Γ−, n − |I−|)

Certifying UNSAT: Method #2

Verify that the Γ generated by the SAT solver after running on ∆ is correct:

• Does ∆ ∪ Γ produce any inconsistency?
  – Can use Unit-Resolution to check.

• Does each clause of CNF Γ = {α1, α2, ..., αn} come from ∆?
  – ∆ ∧ ¬αi is inconsistent for all clauses αi.
  – Can use Unit-Resolution to check.

Why Unit-Resolution is enough: the clauses {αi} are generated from cuts in an implication graph. The implication graph is built upon conflicts found by Unit-Resolution. Therefore, the conflicts can be detected by Unit-Resolution.

UNSAT Cores

For CNF ∆ = {α1, α2, ..., αn}, an UNSAT core is any subset of clauses αi ∈ ∆ that is inconsistent taken together. There exists at least one UNSAT core iff ∆ is UNSAT.

A minimal UNSAT core is an UNSAT core of ∆ such that if we remove any clause from it, the remaining clauses become consistent together.

More on SAT

• Can a SAT solver be faster than linear time?
  – 2-literal watching (in the textbook)

• The “phase-selection” / variable-ordering problem (including the decision of trying P or ¬P first):
  – An efficient and simple way: “try the phase you’ve tried before”. This works because of the way modern SAT solvers operate (cache, etc.).

SAT using Local Search

The general idea is to start from a random guess of the world ω, if UNSAT, move to another world by flipping one variable in ω (P to ¬P, or ¬P to P).

• Random CNF: n variables, m clauses. When m/n gets extremely small or large, it is easy to decide the satisfiability of a randomly generated formula: when m/n → 0 it is almost always SAT, while m/n → ∞ makes it almost always UNSAT. In practice, the split point is m/n ≈ 4.24.

Two ideas to generate random clauses:

– 1st idea: variable-length clauses

Strategy of Taking a Move:

– Use a cost function to determine the quality of a world.
  ∗ Simplest cost function: the number of unsatisfied clauses.
  ∗ A lot of variations exist.
  ∗ Intend to move in the lower-cost direction (“hill-climbing”).

– Termination criteria: no neighbor is better (smaller cost) than the current world. (This is a local, not necessarily global, optimum.)

– Avoid local optima: randomly restart multiple times.

Algorithms:

– GSAT: hill-climbing + side-moves (moving to neighbors whose cost equals that of the current world ω)

– WALKSAT: iterative repair
  ∗ randomly pick an unsatisfied clause
  ∗ pick a variable within that clause to flip such that flipping it results in the fewest previously satisfied clauses becoming unsatisfied, then flip it

– Combination of logic and randomness:
  ∗ randomly select a neighbor; if it is better than the current node then move, otherwise move with a probability (determined by how much worse it is)

– 2nd idea: fixed-length clauses (k-SAT, e.g. 3-SAT)
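A minimal WALKSAT-style local-search sketch (representation and names are mine; the "fewest newly unsatisfied clauses" criterion is simplified to re-counting all unsatisfied clauses after a tentative flip):

```python
import random

def unsat_clauses(clauses, w):
    """Clauses (frozensets of +v/-v literals) not satisfied by world w."""
    return [c for c in clauses if not any(w[abs(l)] == (l > 0) for l in c)]

def walksat(clauses, n_vars, max_flips=1000, seed=0):
    rng = random.Random(seed)
    w = {v: rng.choice([False, True]) for v in range(1, n_vars + 1)}
    for _ in range(max_flips):
        unsat = unsat_clauses(clauses, w)
        if not unsat:
            return w                       # satisfying world found
        clause = rng.choice(unsat)         # random unsatisfied clause
        def cost_after(v):                 # cost if we flipped variable v
            w[v] = not w[v]
            cost = len(unsat_clauses(clauses, w))
            w[v] = not w[v]
            return cost
        v = min((abs(l) for l in clause), key=cost_after)
        w[v] = not w[v]                    # flip the cheapest variable
    return None

delta = [frozenset({1, 2}), frozenset({-1, 3}), frozenset({-3, -2})]
world = walksat(delta, 3)
assert world is not None and not unsat_clauses(delta, world)
print("found:", world)
```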

n is the number of variables; it is essential when counting the models.
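A runnable sketch of DPLL-based model counting in the spirit of CDPLL, though without the unit-resolution refinement (representation and names are mine). Note how the 2^n rule credits the unassigned variables:

```python
def count_models(clauses, free_vars):
    """Count models of a clausal-form CNF over the given variable set."""
    if frozenset() in clauses:
        return 0                        # conflict: no models down this branch
    if not clauses:
        return 2 ** len(free_vars)      # every assignment of the rest works
    v = abs(next(iter(next(iter(clauses)))))   # pick a variable to decide
    total = 0
    for lit in (v, -v):
        cond = {c - {-lit} for c in clauses if lit not in c}   # condition
        total += count_models(cond, free_vars - {v})
    return total

# Delta = A and (B or C): 3 models out of 2^3
delta = {frozenset({1}), frozenset({2, 3})}
print(count_models(delta, {1, 2, 3}))   # → 3
```

This reproduces the |Mods(∆)| = 3 count from the decision-tree example for ∆ = A ∧ (B ∨ C).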

Certifying UNSAT: Method #1

When a query is satisfiable, we have an answer to certify. However, when it is unsatisfiable, we also want to validate this conclusion.

One method is verifying UNSAT directly (using the example ∆ from the implication-graph section), recording a trail of levels, assignments, and reasons:

level | assignment | reason
0     | A          | (decision)
1     | B          | (decision)
2     | C          | (decision)
3     | X          | (decision)
3     | Y          | ¬A ∨ ¬X ∨ Y
3     | Z          | ¬A ∨ ¬Y ∨ Z

And then the learned clause ¬A ∨ ¬Y is applied. The learned clause is asserting with AL = 0, so we add ¬Y at level 0, right after A, then keep going from ¬Y.

 

   Max-SAT

Max-SAT is an optimization version of SAT; in other words, a Max-SAT solver is an optimizing SAT solver.

Goal: find the assignment of variables that maximizes the number of satisfied clauses in a CNF ∆. (We can easily come up with other variations, such as Min-SAT etc.)

• We assign a weight to each clause as the score of satisfying it / cost of violating it.

• We maximize the score. (This is only one way of framing the problem; we can also minimize the cost. Note: score is different from cost.)

Solving Max-SAT problems generally goes in three directions:

• Local Search
• Systematic Search (branch and bound etc.)
• Max-SAT Resolution

Max-SAT Example

We have images I1, I2, I3, I4, with weights (importance) 5, 4, 3, 6 respectively, knowing: (1) I1, I4 can’t be taken together; (2) I2, I4 can’t be taken together; (3) I1, I2 if overlapping then discount by 2; (4) I1, I3 if overlapping then discount by 1; (5) I2, I3 if overlapping then discount by 1.

Max-SAT Resolution

In Max-SAT, in order to keep the same cost/score before and after resolution, we:

• Abandon the resolved clauses;

• Add compensation clauses.

Consider resolving the following two clauses over x, writing c1 = l1 ∨ ··· ∨ lm and c2 = o1 ∨ ··· ∨ on:

Directed Max-SAT Resolution: Example

∆ = (¬a ∨ c) ∧ (a) ∧ (¬a ∨ b) ∧ (¬b ∨ ¬c)

Variable order: a, b, c. First resolve on a: resolving (a) with (¬a ∨ c) yields (c) with compensation clause (a ∨ ¬c); resolving (a ∨ ¬c) with (¬a ∨ b) yields (b ∨ ¬c) with compensation clauses (¬a ∨ b ∨ c) and (a ∨ ¬b ∨ ¬c).

Then resolve on b: resolving (b ∨ ¬c) with (¬b ∨ ¬c) yields (¬c).

x ∨ l1 ∨ l2 ∨ ··· ∨ lm      (c1 = l1 ∨ ··· ∨ lm)
¬x ∨ o1 ∨ o2 ∨ ··· ∨ on     (c2 = o1 ∨ ··· ∨ on)

The results are the resolvent c1 ∨ c2, and the compensation clauses:

x ∨ c1 ∨ ¬o1
x ∨ c1 ∨ o1 ∨ ¬o2
...
x ∨ c1 ∨ o1 ∨ o2 ∨ ··· ∨ ¬on
¬x ∨ c2 ∨ ¬l1
¬x ∨ c2 ∨ l1 ∨ ¬l2
...
¬x ∨ c2 ∨ l1 ∨ l2 ∨ ··· ∨ ¬lm

Directed Max-SAT Resolution

1. Pick an order of the variables, say x1, x2, ..., xn.
2. For each xi, exhaust all possible Max-SAT resolutions, then move on to xi+1.

When resolving on xi, use only the clauses that do not mention any xj with j < i.

Resolve two clauses on xi only when there is no xj ̸= xi such that xj and ¬xj belong to the two clauses respectively. (Formally: the two clauses must not contain complementary literals on any xj ̸= xi.)

Ignore resolvents and compensation clauses that have appeared before, whether as original clauses, resolvents, or compensation clauses.

In the end, there remain k copies of false (conflicts) and Γ (guaranteed to be satisfiable). k is the minimum cost; each world satisfying Γ achieves this cost.


Then we have the knowledge base ∆ as:

(I1, 5) (I2, 4) (I3, 3) (I4, 6)
(¬I1 ∨ ¬I2, 2)
(¬I1 ∨ ¬I3, 1)
(¬I2 ∨ ¬I3, 1)
(¬I1 ∨ ¬I4, ∞)
(¬I2 ∨ ¬I4, ∞)

Finally, resolve on c: resolving (c) with (¬c) yields false.

To simplify the example we look at I1 and I2 only:

I1 I2 | score | cost
✓  ✓  |   9   |  0
✓  ✗  |   5   |  4
✗  ✓  |   4   |  5
✗  ✗  |   0   |  9

In practice we list the truth table of I1 through I4 (2^4 = 16 worlds).

The final output is: false, (¬a ∨ b ∨ c), (a ∨ ¬b ∨ ¬c).

Here Γ = (¬a ∨ b ∨ c) ∧ (a ∨ ¬b ∨ ¬c) and k = 1, indicating that at least one clause in ∆ cannot be satisfied.
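The claimed k = 1 can be double-checked by brute force over all 2^3 worlds of the example ∆ (with every clause given weight 1; representation and names are mine):

```python
from itertools import product

# Delta = (~a or c) and (a) and (~a or b) and (~b or ~c), with a=1, b=2, c=3.
clauses = [frozenset({-1, 3}), frozenset({1}), frozenset({-1, 2}),
           frozenset({-2, -3})]

def violated(world):
    """Number of clauses falsified by the given world (variable -> bool)."""
    return sum(1 for c in clauses
               if not any(world[abs(l)] == (l > 0) for l in c))

costs = [violated(dict(zip((1, 2, 3), bits)))
         for bits in product([False, True], repeat=3)]
print(min(costs))   # → 1, matching k = 1 from Max-SAT resolution
```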

Beyond NP

Some problems, even ones harder than NP problems, can be reduced to logical reasoning.

 

   Complexity Classes

Shown in the figure are some examples of complete problems (abbr., class, meaning):

SDP (PP^PP): Same-Decision Probability
MAP (NP^PP): Maximum A Posteriori hypothesis
MAR (PP): MArginal Probabilities
MPE (NP): Most Probable Explanation

Bayesian Network to MAJ-SAT Problem

A Maj-SAT problem consists of:

• #SAT Problem (model counting)

• WMC Problem (weighted model counting)

Consider the WMC (weighted model counting) problem:

NNF Properties

Property        | On Whom   | Satisfying NNF
Decomposability | and       | DNNF
Determinism     | or        | d-NNF
Smoothness      | or        | s-NNF
Flatness        | whole NNF | f-NNF
Decision        | each node | BDD (FBDD)
Ordering        | each node | OBDD

(Figure: a Bayesian network in which A is the parent of both B and C.)

e.g. with three variables A, B, C, the weight of world A = t, B = t, C = f should be:

w(A, B, ¬C) = w(A) w(B) w(¬C)

Typically, in a Bayesian network where both B and C depend on A:

Decomposability: for any and-node, any pair of its children must be over disjoint variable sets. (e.g. one child A ∨ B, the other C ∨ D)

Determinism: for any or-node, any pair of its children must be mutually exclusive. (e.g. one child A ∧ B, the other ¬A ∧ B)

Smoothness: for any or-node, any pair of its children must be over the same variable set. (e.g. one child A ∧ B, the other ¬A ∧ ¬B)

Flatness: the height of each sentence is at most 2 (depth 0, 1, 2 only), where a sentence goes from the root — selecting one child at each or-node and all children at each and-node — all the way to the leaves / literals. (e.g. CNF, DNF)

Decision: a decision node N is true, false, or an or-node (X ∧ α) ∨ (¬X ∧ β) (X: variable; α, β: decision nodes; N decides on dVar(N) = X).

Ordering: makes no sense without the decision property (cf. FBDD); variables are decided following a fixed order.


A complete problem is one of the hardest problems of its complexity class. e.g. NP-complete: among all NP problems, none is harder than it.

Our goal: reduce complete problems to prototypical problems (Boolean formulas), then transform them into tractable Boolean circuits.

Prototypical Problems

(Hierarchy figure: SAT (NP) → MAJ-SAT (PP) → E-MAJ-SAT (NP^PP) → MAJ-MAJ-SAT (PP^PP), each complete for its class.)

And we therefore have:

Prob(A = t, B = t, C = t) = θA θB|A θC|A

where Θ = {θA, θ¬A} ∪ {θB|A, θ¬B|A, θB|¬A, θ¬B|¬A} ∪ {θC|A, θ¬C|A, θC|¬A, θ¬C|¬A} are the parameters of the Bayesian network at nodes A, B, C respectively, indicating the probabilities.

Though slightly more complex than treating each variable equally, by working on Θ we can safely reduce any Bayesian network to a Maj-SAT problem.

NNF (Negation Normal Form)

NNF is the form of Tractable Boolean Circuit we are specifically interested in.

In an NNF, leaf nodes are true, false, P or ¬P; internal nodes are either and or or, indicating an operation on all their children.

Tractable Boolean Circuits

We draw an NNF as if it is made up of logical operators; from a circuit perspective, it is made up of gates: and, or, not.

NNF Queries


Abbr. | Spelled Name              | Description
CO    | consistency check         | SAT(∆)
VA    | validity check            | ¬SAT(¬∆)
SE    | sentence entailment check | ∆1 |= ∆2
CE    | clausal entailment check  | ∆ |= clause α
IM    | implicant testing         | term l |= ∆
EQ    | equivalence testing       | ∆1 ≡ ∆2
CT    | model counting            | |Mods(∆)|
ME    | model enumeration         | enumerate ω ∈ Mods(∆)

abbr.       | class  | meaning
SAT         | NP     | satisfiability
MAJ-SAT     | PP     | majority-instantiation satisfiability
E-MAJ-SAT   | NP^PP  | with an (X, Y)-split of the variables, there exists an X-instantiation that satisfies the majority of Y-instantiations
MAJ-MAJ-SAT | PP^PP  | with an (X, Y)-split of the variables, the majority of X-instantiations satisfy the majority of Y-instantiations

Our goal is to get the above-listed queries done on our circuit within polytime.

Besides, we also seek polytime transformations: Projection (existential quantification), Conditioning, Conjoin, Disjoin, Negate, etc.

 Again, those are all complete problems.

 

 The Capability of NNFs on Queries

(Table omitted: for each language — NNF, d-NNF, s-NNF, f-NNF, DNNF, d-DNNF, sd-DNNF, BDD, FBDD, OBDD, OBDD<, DNF, CNF, PI, IP, MODS — it marks which of the queries CO, VA, CE, IM, EQ, SE, CT, ME are supported in polytime.)

          The Capability of NNFs on Transformations

DNNF

CO: check consistency in polytime, because:

SAT(A ∨ B) = SAT(A) ∨ SAT(B)


SAT(A ∧ B) = SAT(A) ∧ SAT(B)   // DNNF only
SAT(X) = true
SAT(¬X) = true
SAT(true) = true
SAT(false) = false

Queries-table legend (columns CO VA CE IM EQ SE CT ME):

√: can be done in polytime
◦: cannot be done in polytime unless P = NP
#: cannot be done in polytime even if P = NP
?: remains unclear (no proof yet)

Variations of NNF

Acronym | Description
NNF     | Negation Normal Form
d-NNF   | Deterministic Negation Normal Form
s-NNF   | Smooth Negation Normal Form
f-NNF   | Flat Negation Normal Form
DNNF    | Decomposable Negation Normal Form


CE: clausal entailment, check ∆ |= α (α = l1 ∨ l2 ∨ ... ∨ ln) by checking the consistency of:

∆ ∧ ¬l1 ∧ ¬l2 ∧ ··· ∧ ¬ln

We construct a new NNF for this by making the NNF of ∆ and the NNF of ¬α the direct children of an and root-node. When a variable P appears in both α and ∆, the new NNF is not decomposable. We fix this by conditioning ∆’s NNF on P or ¬P, depending on whether P or ¬P appears in α (∆ → (¬P ∧ ∆|¬P) ∨ (P ∧ ∆|P)): if P is in α, then ¬P is in ¬α, so we use ∆|¬P.

Interestingly, this transformation might turn a non-DNNF NNF (troubled by P) into a DNNF.

CD: conditioning. ∆|A replaces every A in the NNF with true and every ¬A with false; for ∆|¬A, vice versa.

ME: model enumeration. CO + CD → ME: we keep checking ∆|X, ∆|¬X, etc.

DNNF: Projection / Existential Quantification

Recall: for ∆ = A ⇒ B, B ⇒ C, C ⇒ D, existentially quantifying B, C is the same as forgetting B, C — in other words, projecting onto A, D.

In a DNNF, we existentially quantify {Xi}, i ∈ S (S a selected set of variables) by:

• replacing all occurrences of Xi (both positive and negative, i.e. both Xi and ¬Xi) in the DNNF with true (note: the result is still a DNNF);

• checking whether the resulting circuit is consistent. This can be done on a DNNF because:

NNF Transformations

(Table omitted: for each language, it marks which of the transformations CD, FO, SFO, ∧C, ∧BC, ∨C, ∨BC, ¬C can be done in polytime, with the same legend as the queries table.)

d-DNNF  | Deterministic Decomposable Negation Normal Form
sd-DNNF | Smooth Deterministic Decomposable Negation Normal Form
BDD     | Binary Decision Diagram
FBDD    | Free Binary Decision Diagram
OBDD    | Ordered Binary Decision Diagram
OBDD<   | Ordered Binary Decision Diagram (using order <)
DNF     | Disjunctive Normal Form
CNF     | Conjunctive Normal Form
PI      | Prime Implicates
IP      | Prime Implicants
MODS    | Models

FBDD: the intersection of DNNF and BDD.

OBDD<: if N and M are or-nodes and N is an ancestor of M, then dVar(N) < dVar(M).

OBDD: the union of all OBDD< languages. In this course we always use OBDD to refer to OBDD<.

MODS: the subset of DNF where every sentence satisfies determinism and smoothness.

PI: subset of CNF; every clause entailed by ∆ is subsumed by an existing clause, and no clause in the sentence ∆ is subsumed by another.

IP: dual of PI; subset of DNF; every term entailing ∆ subsumes some existing term, and no term in the sentence ∆ is subsumed by another.

notation | transformation       | description
CD       | conditioning         | ∆|P
FO       | forgetting           | ∃P, Q, ... ∆
SFO      | singleton forgetting | ∃P.∆
∧C       | conjunction          | ∆1 ∧ ∆2
∧BC      | bounded conjunction  | ∆1 ∧ ∆2
∨C       | disjunction          | ∆1 ∨ ∆2
∨BC      | bounded disjunction  | ∆1 ∨ ∆2
¬C       | negation             | ¬∆

Our goal is to transform in polytime while still keeping the properties (e.g. a DNNF stays a DNNF).

Bounded conjunction / disjunction: KB ∆ is bounded under the conjunction / disjunction operation; that is, taking any two formulas from ∆, their conjunction / disjunction also belongs to ∆.

∃X.(α ∨ β) = (∃X.α) ∨ (∃X.β)
∃X.(α ∧ β) = (∃X.α) ∧ (∃X.β)   // DNNF only

In a DNNF, ∃X.(α ∧ β) is α ∧ (∃X.β) or (∃X.α) ∧ β, since X appears in only one conjunct.

 

   Minimum Cardinality

Cardinality: in our case, by default, defined as the number of false values in an assignment (in a world, how many variables’ truth values are false). We seek its minimum.ᵃ

minCard(X) = 0
minCard(¬X) = 1
minCard(true) = 0
minCard(false) = ∞
minCard(α ∨ β) = min(minCard(α), minCard(β))
minCard(α ∧ β) = minCard(α) + minCard(β)

Again, the last rule holds only in DNNF.

Filling these values into the DNNF circuit, we can easily compute the minimum cardinality.

• Minimizing cardinality requires smoothness;

• it can help us optimize the circuit by “killing” the children of or-nodes with higher cardinality, and then further removing dangling nodes.

ᵃ Other definitions are possible, such as the number of true values, seeking its maximum.
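The minCard recursion can be evaluated directly on a nested-tuple circuit; the representation (('lit', ±v), ('and', ...), ('or', ...)) and names are mine:

```python
from math import inf

def min_card(node):
    """minCard on a decomposable NNF given as nested tuples."""
    kind = node[0]
    if kind == 'lit':
        return 0 if node[1] > 0 else 1   # positive literal 0, negative 1
    if kind == 'true':
        return 0
    if kind == 'false':
        return inf
    children = [min_card(c) for c in node[1:]]
    if kind == 'or':
        return min(children)             # best child
    if kind == 'and':                    # sum rule: valid only if decomposable
        return sum(children)

# (A and B) or (not A and not B): cardinalities 0 and 2; minimum is 0
circuit = ('or',
           ('and', ('lit', 1), ('lit', 2)),
           ('and', ('lit', -1), ('lit', -2)))
print(min_card(circuit))   # → 0
```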

d-DNNF

CT: model counting. MC(α) = |Mods(α)|.

(decomposable) MC(α ∧ β) = MC(α) × MC(β)
(deterministic) MC(α ∨ β) = MC(α) + MC(β)

Counting graph: replace ∨ with + and ∧ with × in a d-DNNF. Leaves: MC(X) = 1, MC(¬X) = 1, MC(true) = 1, MC(false) = 0.

Weighted model counting (WMC): can be computed similarly, replacing the 0/1 leaf values with weights.

Note: smoothness is important, otherwise there can be wrong answers. Guarantee smoothness by adding trivial units to a sub-circuit (e.g. α ∧ (A ∨ ¬A)).

Marginal count: counting models under some conditions (e.g. counting ∆|{A, ¬B}): CD + CT.

It is not hard to compute, but marginal counting bridges CT to a structure on which we can compute partial derivatives (input: the conditions / assignment of variables), similar to Neural Networks.

FO: forgetting / projection / existential quantification. Note: a problem occurs — the resulting graph might no longer be deterministic, thus d-DNNF is not considered to support polytime FO.
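The counting graph can be evaluated recursively. This sketch (nested-tuple representation and names are mine) counts the models of a smooth d-DNNF for ∆ = A ∧ (B ∨ C), matching the |Mods(∆)| = 3 result from the decision-tree example:

```python
from math import prod

def mc(node):
    """Model count of a smooth d-DNNF: or -> +, and -> ×."""
    kind = node[0]
    if kind in ('lit', 'true'):
        return 1
    if kind == 'false':
        return 0
    counts = [mc(c) for c in node[1:]]
    return sum(counts) if kind == 'or' else prod(counts)

# A and ((B and (C or not C)) or (not B and C)) — deterministic and smooth
circuit = ('and', ('lit', 1),
           ('or',
            ('and', ('lit', 2), ('or', ('lit', 3), ('lit', -3))),
            ('and', ('lit', -2), ('lit', 3))))
print(mc(circuit))   # → 3
```

Note the trivial unit (C ∨ ¬C) added under the B-branch: this is exactly the smoothing trick mentioned above, and without it the + rule would under-count.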

Arithmetic Circuits (ACs)

The counting graph we used to do CT on d-DNNF is a typical example of Arithmetic Circuits (ACs). Other operations could be in ACs, such as by repla- cing “+” by “max” in the counting graph, running it results in the most-likely instantiation. (MPE)

If a Bayesian network's circuit is decomposable, deterministic, and smooth, then it can be turned into an Arithmetic Circuit.

Succinctness vs. Tractability

Succinctness: not expensive (space-efficient); tractability: easy to use (many polytime operations). Along the line OBDD → FBDD → d-DNNF → DNNF, succinctness goes up (higher and higher space efficiency), but the set of tractable operations shrinks.

Knowledge-Base Compilation

Top-down approaches:

• Based on exhaustive search;

Bottom-up approaches:

• Based on transformations.

Top-Down Compilation via Exhaustive DPLL

Top-down compilation of a circuit can be done by keeping the trace of an exhaustive DPLL.

The trace is automatically a circuit equivalent to the original CNF ∆.

It is a decision tree, where:

• each node has its high and low children;

• leaves are SAT or UNSAT results.

We need to deal with the redundancy of that circuit.

1. Do not record redundant portions of the trace (e.g. with many SAT and UNSAT leaves, keeping only one SAT node and one UNSAT node is enough);

2. Avoid equivalent subproblems (merge nodes of the same variable with exactly the same out-edges, from bottom to top, iteratively).

In practice, formula caching is essential to reduce the amount of work; the trade-off is that it requires a lot of space. A limitation of exhaustive DPLL: some conflicts cannot be detected in advance.
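Exhaustive DPLL with formula caching can be sketched as a model counter (often called #DPLL); the clause encoding below is a hypothetical minimal format:

```python
# Model counting via exhaustive DPLL with formula caching, a minimal sketch;
# clauses are sets of signed integers, e.g. {1, -2} means (x1 ∨ ¬x2).

def count(cnf, n_vars):
    """Count the models of `cnf` over variables 1..n_vars."""

    def condition(clauses, lit):
        out = set()
        for cl in clauses:
            if lit in cl:
                continue                  # clause satisfied, drop it
            cl2 = cl - {-lit}             # remove the falsified literal
            if not cl2:
                return None               # empty clause: conflict
            out.add(cl2)
        return frozenset(out)

    cache = {}  # formula cache: (residual CNF, split depth) -> model count

    def go(clauses, i):
        if clauses is None:
            return 0                      # UNSAT leaf
        if not clauses:
            return 2 ** (n_vars - i + 1)  # SAT leaf: remaining vars are free
        key = (clauses, i)
        if key not in cache:              # equivalent subproblems merged here
            cache[key] = (go(condition(clauses, i), i + 1) +
                          go(condition(clauses, -i), i + 1))
        return cache[key]

    return go(frozenset(frozenset(c) for c in cnf), 1)
```

The cache is what merges equivalent subproblems; keeping the recursion's trace instead of just its count is what would turn this into a compiled circuit.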

OBDD (Ordered Binary Decision Diagrams)

In an OBDD there are two special nodes, 0 and 1, always drawn as squares. Every other node corresponds to a variable (say, xi) and has two out-edges: the high-edge (solid, decides xi = 1, links to the high child) and the low-edge (dashed, decides xi = 0, links to the low child).

[Figure: the OBDD of ∆ = (x1 ∧ x2) ∨ ¬x3, with decision nodes x1, x2, x3 and sinks 0 and 1; an example of a DNF. Its function is f = x1·x2 + (1 − x3).]

We express a KB ∆ as a function f by turning every ∧ into multiplication and every ∨ into addition; ¬ becomes flipping between 0 and 1. All non-zero values are read as 1. Another example expresses the knowledge base whose worlds have an odd number of positive values:

[Figure: the OBDD of the odd-parity function f = (x1 + x2 + x3 + x4) mod 2, with decision nodes x1, . . . , x4 and sinks 0 and 1.]
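The ∆ → f translation can be checked quickly on the first example; this is only a sanity check over all eight worlds:

```python
# A quick check of the ∆ -> f translation: ∆ = (x1 ∧ x2) ∨ ¬x3 and
# f = x1*x2 + (1 - x3), reading any non-zero value of f as 1.
from itertools import product

def delta(x1, x2, x3):
    return bool((x1 and x2) or (not x3))

def f(x1, x2, x3):
    return x1 * x2 + (1 - x3)

# f is non-zero exactly at the models of ∆
agree = all(bool(f(*w)) == delta(*w) for w in product([0, 1], repeat=3))
```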

Reduction rules of OBDD:

[Figure: the two reduction rules of OBDDs.]

• deduction rule: a node whose high and low edges lead to the same child is removed, and its parents are redirected to that child;

• merge rule: nodes labeled with the same variable and with identical high and low children are merged into one.

An OBDD to which none of these rules can be applied is a reduced OBDD. Reduced OBDDs are canonical, i.e. given a fixed variable order, ∆ has exactly one reduced OBDD.
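Reduction can be applied on the fly with a unique table, so equal functions get equal nodes; this is a minimal sketch with a made-up node-id scheme:

```python
# Building a reduced OBDD with a unique table, a minimal sketch: `mk` applies
# both reduction rules on the fly, so structurally equal diagrams get the
# same node ids. Ids 0 and 1 are the sinks.

unique = {}  # (var, high, low) -> node id

def mk(var, high, low):
    if high == low:                  # both edges agree: the test is redundant
        return high
    key = (var, high, low)
    if key not in unique:            # merge rule: reuse isomorphic nodes
        unique[key] = len(unique) + 2
    return unique[key]

# (x1 ∧ x2) with order x1 < x2, built bottom-up
node_x2 = mk('x2', 1, 0)
root = mk('x1', node_x2, 0)

# Rebuilding the same function yields the same node: canonicity in action
root2 = mk('x1', mk('x2', 1, 0), 0)
```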

 

  OBDD: Subfunction and Graph Size

Consider the function f of a KB ∆ with a fixed order v1, v2, . . . , vn of its n variables. After instantiating the first m variables, there are up to 2^m different remaining functions (one per instantiation).

The number of distinct subfunctions (ranging from 1 to 2^m) involving vm+1 determines the number of nodes we need for variable vm+1; smaller is better. An example: f = x1x2 + x3x4 + x5x6, examining two different variable orders, x1, x2, x3, x4, x5, x6 or x1, x3, x5, x2, x4, x6. Check the subfunctions after the first three variables are fixed.

The first order has 3 distinct subfunctions, only 1 of which depends on x4, so the next layer has only 1 node.
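The subfunction counts for both orders can be verified by brute force, identifying each subfunction by its truth table over the remaining variables:

```python
# Brute-force check of the subfunction counts for f = x1x2 + x3x4 + x5x6:
# fix the first three variables of an order and count distinct residual
# truth tables over the remaining variables.
from itertools import product

def f(a):
    return int((a['x1'] and a['x2']) or (a['x3'] and a['x4'])
               or (a['x5'] and a['x6']))

def distinct_subfunctions(order, prefix_len=3):
    fixed_vars, rest = order[:prefix_len], order[prefix_len:]
    tables = set()
    for fixed in product([0, 1], repeat=prefix_len):
        assign = dict(zip(fixed_vars, fixed))
        # identify the subfunction by its truth table over `rest`
        table = tuple(f({**assign, **dict(zip(rest, tail))})
                      for tail in product([0, 1], repeat=len(rest)))
        tables.add(table)
    return len(tables)

good_order = distinct_subfunctions(['x1', 'x2', 'x3', 'x4', 'x5', 'x6'])  # 3
bad_order = distinct_subfunctions(['x1', 'x3', 'x5', 'x2', 'x4', 'x6'])   # 8
```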

OBDD: Transformations

¬C: negation. Negation on an OBDD (and on any BDD) is simple: just swap the sinks 0 and 1, turning 0 into 1 and 1 into 0, and we are done. O(1) time complexity.

CD: conditioning. ∆|X redirects all parent edges of every X-node to its high child and then removes the node; similarly, ∆|¬X redirects all parent edges of X-nodes to their low children and then removes the nodes. This takes time linear in the size of the diagram.

[Figure: a conditioning example over x1, x2, x3: the original OBDD ∆, then ∆|x3, ∆|¬x3, and the resulting reduced OBDD.]

∧C : conjunction.

Subfunctions of f = x1x2 + x3x4 + x5x6 under the first order, after fixing x1, x2, x3:

x1 x2 x3   subfunction
0  0  0    x5x6
0  0  1    x4 + x5x6
0  1  0    x5x6
0  1  1    x4 + x5x6
1  0  0    x5x6
1  0  1    x4 + x5x6
1  1  0    1
1  1  1    1

Conjoining BDDs (where no variable order needs to be preserved) is super easy (O(1)): replace the 1-sink of ∆1 with the root of ∆2, and we are done.

Conjoining OBDDs, since we have to preserve the order, is quadratic. Assuming OBDDs f and g share the same variable order and have sizes (i.e. #nodes) n and m respectively, generating f ∧ g takes O(nm) time. This theoretical optimum is achieved in practice with proper caching.

The second order has 8 distinct subfunctions, 4 of which depend on x2, so the next layer has 4 nodes.

x1 x3 x5   subfunction
0  0  0    0
0  0  1    x6
0  1  0    x4
0  1  1    x4 + x6
1  0  0    x2
1  0  1    x2 + x6
1  1  0    x2 + x4
1  1  1    x2 + x4 + x6

The number of distinct subfunctions is a reliable measurement of the OBDD graph size, and is useful to determine which variable order is better.

[Figure: the two cases of conjoining OBDDs f and g with the same variable order.]

• case 1: f and g share the same root variable x; then f ∧ g has root x, high child f1 ∧ g1, and low child f0 ∧ g0;

• case 2: f's root x comes before g's root y (x < y, so x does not occur in g); then f ∧ g has root x, high child f1 ∧ g, and low child f0 ∧ g.
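The two cases above can be sketched as an Apply-style recursion with result caching; the tuple node encoding and the variable order are made up for illustration:

```python
# An Apply-style conjunction of two OBDDs sharing a variable order, a minimal
# sketch with result caching; nodes are ('node', var, high, low) tuples or the
# constants 0/1, and ORDER is a hypothetical shared variable order.

ORDER = {'x': 0, 'y': 1}
cache = {}

def rank(f):
    return ORDER[f[1]] if isinstance(f, tuple) else float('inf')

def conjoin(f, g):
    # terminal cases
    if f == 0 or g == 0:
        return 0
    if f == 1:
        return g
    if g == 1:
        return f
    if (f, g) in cache:             # caching gives the O(nm) bound
        return cache[(f, g)]
    rf, rg = rank(f), rank(g)
    if rf == rg:            # case 1: same root variable x
        res = ('node', f[1], conjoin(f[2], g[2]), conjoin(f[3], g[3]))
    elif rf < rg:           # case 2: x < y, so x does not occur in g
        res = ('node', f[1], conjoin(f[2], g), conjoin(f[3], g))
    else:                   # symmetric to case 2
        res = ('node', g[1], conjoin(f, g[2]), conjoin(f, g[3]))
    if res[2] == res[3]:    # keep the result reduced
        res = res[2]
    cache[(f, g)] = res
    return res

fx = ('node', 'x', 1, 0)            # the function x
gy = ('node', 'y', 1, 0)            # the function y
fg = conjoin(fx, gy)                # the function x ∧ y
```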

 

 

