School of Computer Science, University of St Andrews, CS5011, 2019-20 CS5011: Assignment 3 - Reasoning with Uncertainty - Bayesian Networks Assignment: A3 - Assignment 3 Deadline: 18 November 2019 Weighting: 25% of module mark Please note that MMS is the definitive source for deadline and credit details. You are expected to have read and understood all the information in this specification and any accompanying documents at least a week before the deadline. You must contact the lecturer regarding any queries well in advance of the deadline. 1 Objective This practical aims to gain understanding on how tomodel and use Bayesian networks to reason with uncertainty. 2 Competencies • Develop understanding of probabilistic inferences. • Design and document a Bayesian network. • Develop a method for probabilistic inference in Bayesian networks, with understanding of conditional probabilities. 3 Practical Requirements 3.1 Introduction Bayesian Belief networks (BNs) are a way of representing probabilistic relationships among a set of variables. They are often much more compact than the full joint probability distribution. They constitute an effective systematic way to represent uncertain representation of the world. Bayesian Networks are probabilistic graphical models representing a set of random variables and their conditional dependencies via directed acyclic graphs. Seminal research in the con- text of Bayesian Networks and causal reasoning, has led Judea Pearl1 one of the first pioneer of BNs, to being awarded the prestigious Turing Award in 20112. Bayesian Networks have important applications in diagnostic expert systems particularly for medical applications, inves- tigation of faulty systems, and legal domains. In recent years, Bayesian Networks have also seen interesting applications in classification tasks, as basis for Bayesian Network Classifiers 1http://amturing.acm.org/bib/pearl_2658896.cfm 2http://amturing.acm.org/alphabetical.cfm 1 (See Chapter 20 of Russell & Norvig3). In this practical, you will gain familiarity with modelling and using Bayesian Networks. In the first part of this practical, we will use the AIspace Belief and Decision Networks tool (version 5.1.10, August 9th, 2016) available from http://aispace.org/. Download and save the bayes.jar file from the download page, then run it from terminal as: java -jar bayes.jar Other methods to run the tool can be found on the AIspace website. We will refer to this tool as the AIspace tool. 3.2 Part 1 - Bayesian Network modelling In this part of the practical, we are tasked to design a bayesian solution for predicting the risk of traffic congestion at the entrance of Meridonia City Hospital 4, using the AIspace tool. Suppose you are designing an alert system for traffic congestion to ensure that emergency services can be alerted if the access to the hospital is congested. The system relies on the following factors gathered from information systems and that are related to the traffic congestion. 1. A visibility sensor indicates that there is either High or Low visibility depending on the weather (sunny, overcast, raining, snowing). The sensor is pretty good at detecting visi- bility parameters but may give the wrong value with 5% probability. 2. Hospital statistics show that the peak time for accessing the hospital is between 6-10 am and 5-10pm. 3. The council information system also provides information on whether it is a working day or a holiday day (assume 80 days holidays in 360 days). Furthermore, national statistics show that on holiday periods 30% of people take their car to circulate around the area, while during non-holiday periods there is a probability of 60% that people will take the car. 4. If there is a prediction of traffic congestion, the emergency service must be alerted. The alert is triggered only 90% of the times, to avoid false alarms. 5. A camera placed in proximity of the entrance of the hospital indicates whether there is either a High or Low traffic flow. The camera has a bad fault tolerance and may give the wrong value with 5% probability. While not used for triggering alerts, these can be useful for checking the predictions. Tasks to be performed: 1. Construct two Bayesian Networks (BN1 and BN2) to represent the problem above: • BN1 should represent all above under the assumption that the factors are indepen- dent (singly connected network). • BN2 should represent all above relaxing the assumption that the factors are indepen- dent, and considering what factors might influence other factors (multiply connected network). 2. The steps to follow are: 3S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Pearson Education, 3 ed., 2010. 4Location from https://wiki.opengeofiction.net/wiki/index.php/Meridonia, all data in this practical is purely fictional. 2 • Decide what your domain variables are and what the relationships are between the domain variables • Add the conditional probabilities for nodes that have parents, and the prior probabil- ities for nodes without parents. • Use the information given to fill these probabilities, and for those probabilities that have not been given explicitly choose values that seem reasonable. • Construct the Bayesian networks in the tool, save them as bn1.xml and bn2.xml and submit them with your assignment. 3. Include an additional section in your report containing the following (max 500 extra words): • An explanation of the two BN designs, how they differ, and how the probabilities were assigned. • Using the Query function, design three queries that can be answered by both net- works. Report the resulting probabilities for both BN1 and BN2 and motivate the results on the basis of each network structure noting the differences between BN1 and BN2. The queries must be one per following type: a predictive, a diagnostic and a profiling query. 3.3 Part 2 - A bayesian inference agent In this part of the practical, you are tasked with developing an agent that can construct a bayesian network and make inferences using the variable elimination algorithm. Details of the algorithm can be found in the slides, similar to the Russell & Norvig5 algorithm, Elimination- ASK, page 528. The algorithm should be developed for singly connected network, where each node has at most one parent and two children, and events have binary domains. Identify a small version of the network presented in Part 1 - BN1, to demonstrate that the algo- rithm works correctly. For this task, the main class should instantiate the network and provide examples on how the query is performed. Queries should only be of predictive type. An exam- ple for testing composed of a singly connected network with six nodes is provided in studres and can be accessed in the AIspace tool, similar to the sample problem Fire Alarm Belief Network. 3.4 Part 3 - Additional requirements It is strongly recommended that you ensure you have completed the Requirements in part 1 and 2 before attempting any of these additional requirements. You may wish to consider one or two (max) of the following additional tasks: • Define your own problem involving reasoning with evidence and uncertainty. Write down a text description of the problem, thenmodel it using a Bayesian network in the AIspace tool. Make the problem sufficiently complex such that there are 5 nodes and some degree of correlation among variables. Save and submit your network, and use the Query function to investigate changes of probability distribution giving examples of predictive, diagnostic, and profiling tasks. Present these examples and summarise your findings in your report. (300 additional words) • Investigate other algorithms for inference making in BNs, and develop an alternative method or mode advanced method to perform inferences in a BN. 5S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Pearson Education, 3 ed., 2010. 3 • Implement an algorithm for merging BNs such as that of Feng et al, 20146. • Suggest and develop your own advanced functionality, a significant advanced option should consider the design of a bayesian network or inference making in a bayesian network. 4 Code and Report Specifications 4.1 Code Submission and Running For part 1 you are required to submit XML files created with the BN tool from AISpaces. Please name your files as bn1.xml bn2.xml For Part 2, the program must be written in Java and your implementation must compile and run on the School lab Machines. Source code should be placed in a folder A3src and the code should run with the following command: java A3main [any param] Please do not use a library that performs the variable elimination inference algorithm for you in part 2. Libraries are not core to the requirements of this assignment can be used. For part 3, depending on your choices, if a new network is proposed, please submit an .xml file and indicate the name of the network in your report. Otherwise please submit the source code ensuring that it can be run with additional parameters to be distinguished from part 2. Please note that code that does not adhere to these instructions will not be accepted. 4.2 Report You are required to submit a report describing your submission in PDF with the structure and requirements presented in the additional document CS5011_A_Reports. The core sections (Design, Implementation and Evaluation) have a limit of 1500 words in total. The report has an additional section required to explain part 1 (of max 500 additional words). The report should include clear instructions on how to run your code. In the report, ensure that the design of the networks is described and motivated, in addition to the general requirements. 5 Deliverables A single ZIP file must be submitted electronically via MMS by the deadline. Submissions in any other format will be rejected. Your ZIP file should contain: 1. A PDF report as discussed in Section 4.2 2. Depending on the tasks achieved, a set of BNs in XML format and any code implemented, as discussed in Section 4.1 6Feng G., Zhang J., Liao S., A novel method for combining Bayesian networks, theoretical analysis, and its applications. Pattern Recognition, 2014 4 6 Assessment Criteria Marking will follow the guidelines given in the school student handbook. The following issues will be considered: • Achieved requirements • Quality of the solution provided • Examples and testing • Insights and analysis demonstrated in the report Some guideline descriptors for this assignment are given below: • For a mark of 8 or higher: the submission partially achieves part 1, adequately docu- mented or reported. • For a mark of 11 or higher: the submission achieves fully part 1 with good motivation. The report describes clearly what was done, with good style. • For a mark of 14 or higher: the submission achieves fully part 1 with good motivation, and an attempt to part 2. It contains clear and well-structured code and a clear report showing a good level of understanding of bayesian networks. • For a mark of 17: the submission achieves fully part 1 and part 2. It contains clear, well- designed code, together with a clear, insightful and well-written report, showing in-depth understanding of bayesian networks. • Above 17: the submission achieves fully part 1 and part 2, and one or two (max) addi- tional functionalities. The submission should demonstrate unusual clarity of implemen- tation, together with an excellent and well-written report showing evidence of extensive understanding of design and inferencing in bayesian networks. 7 Policies and Guidelines Marking: See the standard mark descriptors in the School Student Handbook https://info.cs.st-andrews.ac.uk/student-handbook/learning-teaching/feedback.html# Mark_-Descriptors Lateness Penalty: The standard penalty for late submission applies (Scheme B: 1 mark per 8 hour period, or part thereof): https://info.cs.st-andrews.ac.uk/student-handbook/learning-teaching/assessment.html# latenesspenalties Good Academic Practice: The University policy on Good Academic Practice applies: https://www.st-andrews.ac.uk/students/rules/academicpractice/ 5 Alice Toniolo (
[email protected]) November 7, 2019 6