Inference in Hidden Markov Models Part 2
Blake VanBerlo
Lecture 11
Readings: RN 14.2.2
CS 486/686: Intro to Artificial Intelligence
Lecturer: Blake VanBerlo. Slides by: Alice Gao.

Outline
▶ Learning Goals
▶ Smoothing Calculations
▶ Smoothing Derivations
▶ The Forward-Backward Algorithm
▶ Revisiting the Learning Goals

Learning Goals
By the end of the lecture, you should be able to
▶ Calculate the smoothing probability for a time step in a hidden Markov model.
▶ Describe the justification for a step in the derivation of the smoothing formulas.
▶ Describe the forward-backward algorithm.

Smoothing Calculations

The Umbrella Model
Let St be true if it rains on day t and false otherwise. Let Ot be true if the director carries an umbrella on day t and false otherwise. The model is a chain of hidden states St−2 → St−1 → St → St+1, where each state St emits the observation Ot.

P(s0) = 0.5
P(st | st−1) = 0.7,  P(st | ¬st−1) = 0.3
P(ot | st) = 0.9,  P(ot | ¬st) = 0.2

Smoothing
Given the observations from day 0 to day t−1, what is the probability that I am in a particular state on day k? That is, compute P(Sk | o0:(t−1)), where 0 ≤ k ≤ t−1.

Smoothing through Backward Recursion
Calculating the smoothed probability P(Sk | o0:(t−1)):

P(Sk | o0:(t−1)) = α P(Sk | o0:k) P(o(k+1):(t−1) | Sk) = α f0:k b(k+1):(t−1)

Calculate f0:k using forward recursion, and calculate b(k+1):(t−1) using backward recursion:
▶ Base case: bt:(t−1) = 1⃗ (a vector of ones).
▶ Recursive case: b(k+1):(t−1) = ∑s(k+1) P(o(k+1) | s(k+1)) ∗ b(k+2):(t−1) ∗ P(s(k+1) | Sk).
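This recursion translates directly into code. Below is a minimal sketch for a two-state model such as the umbrella world; the function name backward_messages and the list-based parameterization are illustrative choices, not something defined in the lecture.

```python
# A minimal sketch of backward recursion for a two-state HMM.
# State 0 plays the role of s (e.g., rain) and state 1 the role of ¬s.

def backward_messages(trans, emit, obs, k):
    """Return b_(k+1):(t-1) = P(o_(k+1):(t-1) | S_k) as a 2-vector.

    trans[i][j] = P(S_m = j | S_(m-1) = i)   (transition model)
    emit[j]     = P(o_m = true | S_m = j)    (sensor model)
    obs         = [o_0, ..., o_(t-1)] as booleans
    """
    t = len(obs)
    b = [1.0, 1.0]  # base case: b_t:(t-1) is a vector of ones
    for m in range(t - 1, k, -1):  # fold in o_(t-1), ..., o_(k+1)
        # likelihood P(o_m | S_m = j) of the observation at time m
        p_o = [emit[j] if obs[m] else 1.0 - emit[j] for j in range(2)]
        # recursive case: sum over the two values of the state at time m
        b = [sum(p_o[j] * b[j] * trans[i][j] for j in range(2))
             for i in range(2)]
    return b
```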
A Smoothing Example
Consider the umbrella story. Assume that O0 = true, O1 = true, and O2 = true. What is the probability that it rained on day 0 (P(S0 | o0 ∧ o1 ∧ o2)) and the probability that it rained on day 1 (P(S1 | o0 ∧ o1 ∧ o2))?

Here are the useful quantities from the umbrella story:
P(s0) = 0.5
P(ot | st) = 0.9,  P(ot | ¬st) = 0.2
P(st | s(t−1)) = 0.7,  P(st | ¬s(t−1)) = 0.3

Calculate P(S1 | o0:2).

(1) What are the values of k and t? Matching P(S1 | o0:2) against P(Sk | o0:(t−1)) gives k = 1 and t = 3.

(2) Write the probability as a product of forward and backward messages:
P(S1 | o0:2) = α P(S1 | o0:1) ∗ P(o2:2 | S1) = α f0:1 ∗ b2:2

(3) We already calculated f0:1 = ⟨0.883, 0.117⟩ in the last lecture. Next, we calculate b2:2 using backward recursion.

A Backward Recursion Example
Calculate b2:2 = P(o2:2 | S1), where k = 1 and t = 3. We expand the recursive case; note that b3:2 = P(o3:2 | s2) is the empty-observation base case, which equals 1.

b2:2 = P(o2:2 | S1)
     = ∑s2 P(o2 | s2) ∗ b3:2 ∗ P(s2 | S1)
     = ∑s2 P(o2 | s2) ∗ P(o3:2 | s2) ∗ ⟨P(s2 | s1), P(s2 | ¬s1)⟩
     = P(o2 | s2) ∗ P(o3:2 | s2) ∗ ⟨P(s2 | s1), P(s2 | ¬s1)⟩
       + P(o2 | ¬s2) ∗ P(o3:2 | ¬s2) ∗ ⟨P(¬s2 | s1), P(¬s2 | ¬s1)⟩

Plugging in the numbers:

b2:2 = 0.9 ∗ 1 ∗ ⟨0.7, 0.3⟩ + 0.2 ∗ 1 ∗ ⟨0.3, 0.7⟩
     = ⟨0.63, 0.27⟩ + ⟨0.06, 0.14⟩
     = ⟨0.69, 0.41⟩

Combining the forward and backward messages:

P(S1 | o0:2) = α P(S1 | o0:1) ∗ P(o2:2 | S1)
             = α f0:1 ∗ b2:2
             = α ⟨0.883, 0.117⟩ ∗ ⟨0.69, 0.41⟩
             = α ⟨0.6093, 0.0480⟩
             = ⟨0.927, 0.073⟩

Calculate P(S0 | o0:2). Here k = 0 and t = 3.

b1:2 = P(o1:2 | S0)
     = P(o1 | s1) ∗ P(o2:2 | s1) ∗ ⟨P(s1 | s0), P(s1 | ¬s0)⟩
       + P(o1 | ¬s1) ∗ P(o2:2 | ¬s1) ∗ ⟨P(¬s1 | s0), P(¬s1 | ¬s0)⟩
     = 0.9 ∗ 0.69 ∗ ⟨0.7, 0.3⟩ + 0.2 ∗ 0.41 ∗ ⟨0.3, 0.7⟩
     = ⟨0.4593, 0.2437⟩

P(S0 | o0:2) = α f0:0 ∗ b1:2
             = α ⟨0.818, 0.182⟩ ∗ ⟨0.4593, 0.2437⟩
             = ⟨0.894, 0.106⟩
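As a sanity check, the backward_messages sketch from above reproduces the numbers in these examples. The forward messages f0:0 = ⟨0.818, 0.182⟩ and f0:1 = ⟨0.883, 0.117⟩ are taken from the last lecture, and the helper smooth (an illustrative name) applies the normalization hidden inside α:

```python
# Reproducing the worked examples; assumes backward_messages from above.
trans = [[0.7, 0.3], [0.3, 0.7]]  # rows: previous state (rain, no rain)
emit = [0.9, 0.2]                 # P(umbrella | rain), P(umbrella | no rain)
obs = [True, True, True]          # umbrella seen on days 0, 1, and 2

def smooth(f, b):
    """Multiply forward and backward messages pointwise, then normalize."""
    unnorm = [fi * bi for fi, bi in zip(f, b)]
    z = sum(unnorm)  # the normalization constant that alpha stands in for
    return [u / z for u in unnorm]

b22 = backward_messages(trans, emit, obs, k=1)  # -> [0.69, 0.41]
b12 = backward_messages(trans, emit, obs, k=0)  # -> [0.4593, 0.2437]
print(smooth([0.883, 0.117], b22))  # P(S1 | o0:2) ≈ [0.927, 0.073]
print(smooth([0.818, 0.182], b12))  # P(S0 | o0:2) ≈ [0.894, 0.106]
```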
Smoothing Derivations

Smoothing (time k)
How can we derive the formula for P(Sk | o0:(t−1)), 0 ≤ k ≤ t−1?

P(Sk | o0:(t−1)) = P(Sk | o(k+1):(t−1) ∧ o0:k)
                 = α P(Sk | o0:k) P(o(k+1):(t−1) | Sk ∧ o0:k)
                 = α P(Sk | o0:k) P(o(k+1):(t−1) | Sk)
                 = α f0:k b(k+1):(t−1)

Calculate f0:k through forward recursion, and b(k+1):(t−1) through backward recursion.

Q: Smoothing Derivation
Each question below has the same answer choices: (A) Bayes' rule, (B) re-writing the expression, (C) the chain/product rule, (D) the Markov assumption, (E) the sum rule.

Q #1: What is the justification for the step below?
P(Sk | o0:(t−1)) = P(Sk | o(k+1):(t−1) ∧ o0:k)
→ Correct answer: (B) Re-writing the expression. Splitting o0:(t−1) into o0:k and o(k+1):(t−1) does not change the event being conditioned on.

Q #2: What is the justification for the step below?
P(Sk | o(k+1):(t−1) ∧ o0:k) = α P(Sk | o0:k) P(o(k+1):(t−1) | Sk ∧ o0:k)
→ Correct answer: (A) Bayes' rule.

Q #3: What is the justification for the step below?
α P(Sk | o0:k) P(o(k+1):(t−1) | Sk ∧ o0:k) = α P(Sk | o0:k) P(o(k+1):(t−1) | Sk)
(Recall the network structure: Sk−1 → Sk → Sk+1 → Sk+2, with each state St emitting the observation Ot.)
→ Correct answer: (D) The Markov assumption. Given Sk, the future observations o(k+1):(t−1) are independent of the past observations o0:k.
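Before moving on, it is worth writing out the Bayes'-rule step from Q #2 with the normalization constant made explicit. This expansion is implied by, but not shown on, the slides:

P(Sk | o(k+1):(t−1) ∧ o0:k) = P(o(k+1):(t−1) | Sk ∧ o0:k) ∗ P(Sk | o0:k) / P(o(k+1):(t−1) | o0:k)

so α = 1 / P(o(k+1):(t−1) | o0:k). Since α does not depend on Sk, we never compute it directly: we multiply the forward and backward messages and normalize the result to sum to 1, exactly as in the numeric examples above.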
Backward Recursion Formula Derivations
How did we derive the formula for backward recursion?

P(o(k+1):(t−1) | Sk)
  = ∑s(k+1) P(o(k+1):(t−1) ∧ s(k+1) | Sk)                                    (1)
  = ∑s(k+1) P(o(k+1):(t−1) | s(k+1) ∧ Sk) ∗ P(s(k+1) | Sk)                   (2)
  = ∑s(k+1) P(o(k+1):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk)                        (3)
  = ∑s(k+1) P(o(k+1) ∧ o(k+2):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk)               (4)
  = ∑s(k+1) P(o(k+1) | s(k+1)) ∗ P(o(k+2):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk)   (5)

Q: Backward Recursion Derivation
The answer choices are the same as before.

Q #4: What is the justification for step (1)?
P(o(k+1):(t−1) | Sk) = ∑s(k+1) P(o(k+1):(t−1) ∧ s(k+1) | Sk)
→ Correct answer: (E) The sum rule. We introduce the state S(k+1) and sum the joint over all of its values.

Q #5: What is the justification for the step from (1) to (2)?
∑s(k+1) P(o(k+1):(t−1) ∧ s(k+1) | Sk) = ∑s(k+1) P(o(k+1):(t−1) | s(k+1) ∧ Sk) ∗ P(s(k+1) | Sk)
→ Correct answer: (C) The chain/product rule.

Q #6: What is the justification for the step from (2) to (3)?
∑s(k+1) P(o(k+1):(t−1) | s(k+1) ∧ Sk) ∗ P(s(k+1) | Sk) = ∑s(k+1) P(o(k+1):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk)
→ Correct answer: (D) The Markov assumption. Given s(k+1), the observations o(k+1):(t−1) are independent of the earlier state Sk.
Q #7: What is the justification for the step from (3) to (4)?
∑s(k+1) P(o(k+1):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk) = ∑s(k+1) P(o(k+1) ∧ o(k+2):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk)
→ Correct answer: (B) Re-writing the expression. o(k+1):(t−1) is simply o(k+1) together with o(k+2):(t−1).

Q #8: What is the justification for the step from (4) to (5)?
∑s(k+1) P(o(k+1) ∧ o(k+2):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk) = ∑s(k+1) P(o(k+1) | s(k+1)) ∗ P(o(k+2):(t−1) | s(k+1)) ∗ P(s(k+1) | Sk)
→ Correct answer: (D) The Markov assumption. Given s(k+1), the observation o(k+1) is independent of the later observations o(k+2):(t−1).

The Forward-Backward Algorithm

Consider a hidden Markov model with 4 time steps: states S0, S1, S2, S3 with observations O0, O1, O2, O3. We can calculate the smoothed probabilities for all time steps using one forward pass and one backward pass through the network:

P(Sk | o0:(t−1)) = α P(Sk | o0:k) P(o(k+1):(t−1) | Sk) = α f0:k b(k+1):(t−1)

The forward pass computes and stores every filtering message f0:k; the backward pass then computes each b(k+1):(t−1) and combines it with the stored f0:k.
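To make the "one forward pass, one backward pass" structure concrete, here is a minimal, self-contained sketch of the forward-backward algorithm for a two-state model. The function name and parameterization are illustrative choices; the printed numbers use the umbrella model and match the worked examples above.

```python
# A minimal sketch of the forward-backward algorithm for a two-state HMM.

def forward_backward(prior, trans, emit, obs):
    """Return the smoothed distributions P(S_k | o_0:(t-1)) for k = 0..t-1.

    prior[j]    = P(S_0 = j)
    trans[i][j] = P(S_m = j | S_(m-1) = i)
    emit[j]     = P(o_m = true | S_m = j)
    obs         = [o_0, ..., o_(t-1)] as booleans
    """
    t = len(obs)

    def p_o(m):  # likelihood P(o_m | S_m = j) for each state value j
        return [emit[j] if obs[m] else 1.0 - emit[j] for j in range(2)]

    def normalize(v):
        z = sum(v)
        return [x / z for x in v]

    # Forward pass: store the filtering messages f[k] = P(S_k | o_0:k).
    f = [normalize([p_o(0)[j] * prior[j] for j in range(2)])]
    for m in range(1, t):
        pred = [sum(trans[i][j] * f[-1][i] for i in range(2))  # predict S_m
                for j in range(2)]
        f.append(normalize([p_o(m)[j] * pred[j] for j in range(2)]))

    # Backward pass: maintain b = b_(k+1):(t-1) and combine it with f[k].
    smoothed = [None] * t
    b = [1.0, 1.0]  # base case b_t:(t-1)
    for k in range(t - 1, -1, -1):
        smoothed[k] = normalize([f[k][j] * b[j] for j in range(2)])
        # recursive case: fold observation o_k into the backward message
        b = [sum(p_o(k)[j] * b[j] * trans[i][j] for j in range(2))
             for i in range(2)]
    return smoothed

print(forward_backward(prior=[0.5, 0.5],
                       trans=[[0.7, 0.3], [0.3, 0.7]],
                       emit=[0.9, 0.2],
                       obs=[True, True, True]))
# -> approximately [[0.894, 0.106], [0.927, 0.073], [0.894, 0.106]]
```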
Revisiting the Learning Goals
By the end of the lecture, you should be able to
▶ Calculate the smoothing probability for a time step in a hidden Markov model.
▶ Describe the justification for a step in the derivation of the smoothing formulas.
▶ Describe the forward-backward algorithm.