ICP-2152: Java Technologies
XML Mini-Project 2

Description

This mini-project is intended to cover the parsing, use, and creation of XML documents. In this mini-project you will:
• create various applications to handle the BBC Weather RSS Feed;
• perform basic parsing and processing of XML data;
• extend your knowledge of Swing UIs;
• become familiar with and use Java’s API for XML Processing (JAX-P).

This Mini-Project is marked out of 100 points, and contributes 15% to your module mark.

Conventions

File > New > Project... are menu items to be selected; they may be right-click menus or the top-level window menus.
Finish, Alt are either buttons or keys that need to be pressed.
App.java is the name of a file, or input that can be used verbatim.
L indicates a task that you will need to complete prior to moving to the next.

XML Processing

XML is short for eXtensible Mark-up Language, and Java has several families of APIs to handle it as part of the base JDK. XML is a free-form, hierarchical data format, similar in appearance to HTML. XML can use DTDs (Document Type Definitions) and Schemas to dictate which tags and data are valid in various parts of the document.

The most elementary operation with XML is to read and understand the data contained within the files. This is known as parsing the document. Java provides two main APIs to parse an XML document: the Simple API for XML (SAX) and the Document Object Model API (DOM). Both are found within the Java API for XML Processing (JAX-P) part of the JDK. DOM parsers and their use are described in Big Java. For more information about SAX parsers see: http://docs.oracle.com/javase/tutorial/jaxp/sax/parsing.html. To provide additional help, there is an example SAX parser provided on Blackboard, SAXParserExample, which requires both the Employee class and employees.xml.

Those of you who know very little about XML should begin by looking at Chapter 22 in Horstmann’s Big Java (4th ed.).
When reading this chapter make sure that you distinguish between XML and the various Java APIs for processing XML documents. Also have a look at the Additional Notes in the Appendix.

Finally, when creating your own XML documents use an IDE or text editor that provides maximum support (e.g. syntax checking/highlighting, formatting, and code completion). Eclipse can help here, and will create XML documents in one of three ways:
1. From a DTD;
2. From an XML Schema;
3. From a Well-Formed Template.

To access these functions, right-click on the folder which should contain the new file and select New > Other... > XML > XML File. This will show the New XML File Wizard. To begin with, simply name your file and choose Finish. This will open a new XML file in the XML Editor, which provides additional support that the Java Source editor and plain file editor won’t.

XML is used on the Internet because of its universal and self-documenting nature. The description of the data is contained in the hierarchy of elements in the document and (if present) in the schema or type definition. This means that programs on different systems, written in different languages, can consume the same data and ‘understand’ it in the same way; whether the data is presented in the same way and for the same purpose is neither here nor there.

Task 1 Consume the BBC Weather RSS Feed (15%)

RSS, short for Really Simple Syndication, is a type of XML data feed. It is popular with news, weather and other current affairs sites, such as blogs, because it allows them to offer their content in an easy-to-consume form. https://www.bbc.co.uk/weather/about/17543675 has details about the RSS Feed service.

To begin, you will need to find the GeoID of a place to find the weather for. To do this, visit https://www.bbc.co.uk/weather and use the search bar at the top to select your desired location, for example ‘Bangor, Gwynedd’.
After clicking the link for your place, you will be taken to the observations page. This page will have a URL similar to https://www.bbc.co.uk/weather/0/2656397. The last set of numbers (2656397 in this case) is the GeoID. You will need to keep that safe for later.

On the RSS Feeds page, the BBC give an example of the URL for the ‘Latest Observations’: https://weather-broker-cdn.api.bbci.co.uk/en/observation/rss/2643123. You would replace the last seven digits (the GeoID) with the one for the location you are interested in. Visiting the URL in Chrome will show the XML code natively; in Firefox either the code will be shown or a download popup will appear, depending on the version. Safari will ask you to open the News app. We are interested in the code, which will look similar to:

    <?xml version="1.0" encoding="UTF-8"?>
    <rss xmlns:atom="http://www.w3.org/2005/Atom"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:georss="http://www.georss.org/georss" version="2.0">
      <channel>
        <title>BBC Weather - Observations for Manchester, GB</title>
        <link>https://www.bbc.co.uk/weather/2643123</link>
        <description>Latest observations for Manchester from BBC Weather, including weather, temperature and wind information</description>
        <language>en</language>
        <copyright>Copyright: (C) British Broadcasting Corporation, see http://www.bbc.co.uk/terms/additional_rss.shtml for more details</copyright>
        <pubDate>Tue, 15 Jan 2019 08:00:00 GMT</pubDate>
        <dc:date>2019-01-15T08:00:00Z</dc:date>
        <dc:language>en</dc:language>
        <dc:rights>Copyright: (C) British Broadcasting Corporation, see http://www.bbc.co.uk/terms/additional_rss.shtml for more details</dc:rights>
        <atom:link href="https://weather-broker-cdn.api.bbci.co.uk/en/observation/rss/2643123" type="application/rss+xml" rel="self"/>
        <item>
          <title>Tuesday - 08:00 GMT: Not available, 7C (44F)</title>
          <link>https://www.bbc.co.uk/weather/2643123</link>
          <description>Temperature: 7C (44F), Wind Direction: South South Westerly, Wind Speed: 4mph, Humidity: 81%, Pressure: 1017mb, Rising, Visibility: --</description>
          <pubDate>Tue, 15 Jan 2019 08:00:00 GMT</pubDate>
          <guid isPermaLink="false">https://www.bbc.co.uk/weather/2643123-2019-01-15T08:00:00.000Z</guid>
          <dc:date>2019-01-15T08:00:00Z</dc:date>
          <georss:point>53.4809 -2.2374</georss:point>
        </item>
      </channel>
    </rss>

Now that you have a URL and content, your task is to write a command-line client to retrieve the document, parse the XML and display the current observation headline (the text of the <title> element inside <item>). The current headline is found in the <title> element body. Your client must be contained within a Maven project, even though you will not need any external dependencies. You may not simply use String processing to extract this element from the retrieved URL.

The client should produce output similar to that below. Notice that the content selected for display belongs to the second occurrence of a <title> element; do not display the content of the first occurrence of this tag.

    200 OK
    Tuesday - 08:00 GMT: Not available, 7C (44F)

You will need to use the following class from the Java Class Library to access the site and recover the XML file.

    java.net.URL

You will also need to use one of the following classes, depending upon your choice of parser.

    org.xml.sax.helpers.DefaultHandler
    javax.xml.parsers.DocumentBuilderFactory

You may notice that, depending on the type of Java XML parser you have chosen to employ, you do not need to explore every element and attribute in the XML document. Think about the differences between the types of parsers available and what that means when dealing with more complex or larger XML documents.

L Once you have this client working, make sure to save the code unaltered so that it can be assessed.
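As a starting point for thinking about the DOM route, here is a minimal sketch, not a model answer. It assumes the headline is the second <title> element in document order, as the task describes. The class name HeadlineSketch and the method name headline are hypothetical, and the main method uses InputStream.readAllBytes, which needs Java 9 or later.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative and not part of the brief.
final class HeadlineSketch {

    // Returns the text of the second <title> element in document order.
    // In the feed this is the title of the <item>, not of the <channel>.
    static String headline(String rssXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse feed", e);
        }
        NodeList titles = doc.getElementsByTagName("title");
        if (titles.getLength() < 2) {
            throw new IllegalArgumentException("feed has no item title");
        }
        return titles.item(1).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Example feed URL taken from the task text.
        URL url = new URL("https://weather-broker-cdn.api.bbci.co.uk/en/observation/rss/2643123");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Prints the HTTP status line, e.g. the code followed by the reason phrase.
        System.out.println(conn.getResponseCode() + " " + conn.getResponseMessage());
        try (InputStream in = conn.getInputStream()) {
            String xml = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            System.out.println(headline(xml));
        }
    }
}
```

A DOM parser builds the whole tree before you can query it, whereas a SAX handler would see each element as it streams past; that difference is the one the task asks you to think about.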
Task 2 Building a Weather App (20%)

Use the WindowBuilder to build a simple Swing-based application to display the information recovered in the previous exercise. The GUI should consist of:
• a JTextField for the URL;
• a JButton to obtain a forecast;
• a JTextArea to present the results.

The application you develop must look as pictured in Figure 1.

Figure 1: The UI needed for Task 2. The bottom section is a JTextArea.

Figure 2: Weather Symbols - From Top Left: Sunny, Cloudy, Light Clouds, Windy, Heavy Rain, Rain Showers/Light Rain, Snow, Lightning

L Once you have this client working, make sure to save the code unaltered so that it can be assessed.

L You will also need to alter the Maven POM to cause the resulting JAR to run your GUI class automatically. This will involve using the Maven Assembly Plugin, and is also known as ‘creating an executable JAR’. You will find instructions on using this plugin at http://maven.apache.org/plugins/maven-assembly-plugin/usage.html.

Task 3 Adding Weather Icons (15%)

Using the text description of the weather condition, generate a BBC-style symbol for that condition and place this symbol onto the frame used for the GUI. The condition is contained in the <title> element; it occurs after the colon symbol and is terminated by the first comma in the text. Thus in the element below the weather condition is “Light Cloud”.

    <title>Friday - 10:00 GMT: Light Cloud, 9C (48F)</title>

Some BBC-style weather symbols are shown in Figure 2. When there is no symbol that matches the description, you will need to make a decision on how best to present the information to the user.

The images can be produced in a number of ways. For instance, you could build a local library of image files by searching the Web. These images can then be used to build instances of the class ImageIcon, which can then be used to construct a JLabel.
Alternatively, you could use 2D graphics to produce your own set of weather symbols. If you decide on the second option, speak with the lecturer to obtain further guidance.

L Once you have this client working, make sure to save the code unaltered so that it can be assessed.

Task 4 Searching for Locations (20%)

The BBC uses the GeoNames ID system. In order to match names of specific locations to IDs (e.g. Bangor, Gwynedd to 2656397), you will need to register with the GeoNames service at http://www.geonames.org/login. Once your account details have been confirmed via e-mail, you will need to enable the ‘search’ web service at http://www.geonames.org/manageaccount. A description of the service can be found at http://www.geonames.org/export/geonames-search.html.

One way to use the GeoNames search service is to supply a URL with search parameters to your web browser. For example, the URL below will return a single search result, in English, for the ‘place’ or text ‘london’ and user ‘eesa03’ (i.e. Steve Marriott):

    http://api.geonames.org/search?q=london&maxRows=1&lang=en&username=eesa03

This search returns:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <geonames style="MEDIUM">
      <totalResultsCount>5699</totalResultsCount>
      <geoname>
        <toponymName>London</toponymName>
        <name>London</name>
        <lat>51.50853</lat>
        <lng>-0.12574</lng>
        <geonameId>2643743</geonameId>
        <countryCode>GB</countryCode>
        <countryName>United Kingdom</countryName>
        <fcl>P</fcl>
        <fcode>PPLC</fcode>
      </geoname>
    </geonames>

Using this web service, amend your weather application so that instead of supplying a lengthy URL the user simply has to supply the name of the location for which a weather forecast is desired. Obviously, this means that you should alter the GUI to accommodate this modification. Once the user has entered the location name, your program should pass it to the appropriate GeoNames service. The service will then return an XML document, which can be parsed so that the geonameId value can be extracted and used to build the required URL for the BBC weather feed. For example, the URL for London would become https://weather-broker-cdn.api.bbci.co.uk/en/observation/rss/2643743.
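The lookup described above can be sketched with a SAX handler, since only one element of the response is needed. This is an illustration under stated assumptions rather than a prescribed design: the class GeoNamesSketch and its three method names are hypothetical, the search parameters simply mirror the example URL, and URLEncoder.encode(String, Charset) requires Java 10 or later.

```java
import java.io.StringReader;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the names are illustrative and not part of the brief.
final class GeoNamesSketch {

    // Builds a search URL shaped like the example in the task text.
    static String searchUrl(String place, String username) {
        return "http://api.geonames.org/search?q="
                + URLEncoder.encode(place, StandardCharsets.UTF_8)
                + "&maxRows=1&lang=en&username="
                + URLEncoder.encode(username, StandardCharsets.UTF_8);
    }

    // Returns the text of the first <geonameId> element, if there is one.
    static Optional<String> firstGeonameId(String xml) {
        StringBuilder id = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inId;  // true while inside the first <geonameId>
            private boolean done;  // true once the first id has been read

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                inId = !done && "geonameId".equals(qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inId) {
                    id.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (inId) {
                    inId = false;
                    done = true;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse GeoNames response", e);
        }
        String value = id.toString().trim();
        return value.isEmpty() ? Optional.empty() : Optional.of(value);
    }

    // Builds the BBC observation feed URL for a GeoNames ID.
    static String feedUrl(String geonameId) {
        return "https://weather-broker-cdn.api.bbci.co.uk/en/observation/rss/" + geonameId;
    }
}
```

Returning an Optional makes the "no match at GeoNames" case explicit, which is one way to drive an error message in the GUI.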
You must remember to use your own username in the search URL supplied to GeoNames. Additionally, the GeoNames service will return IDs for more places than the BBC can return weather forecasts for, although forecasts are not limited to just the UK. When this happens the feed will return an HTTP 404 (Not Found) error. You will need to handle this event gracefully; an error message would be appropriate.

Write a new version of your Weather Application to allow searches using the GeoNames API to find the IDs before showing the weather for that location.

L Once you have this client working, make sure to save the code unaltered so that it can be assessed.

Task 5 Additional Information from the Description (10%)

This exercise involves parsing and extracting the information provided in the description element of the BBC Weather RSS feed. In particular, you will need to recover the six labelled values shown below:

    Temperature: 9C (48F), Wind Direction: South South Westerly, Wind Speed: 19mph, Humidity: 87%, Pressure: 1018mb, Rising, Visibility: Good

Having recovered this information, you will need to find some suitable way of presenting it visually. Marks will be awarded for use of advanced Swing components such as JTables, JTrees and JLists, or further Java2D drawing/image processing where appropriate. This aspect of the mini-project is free-form; however, the assessor will ask questions regarding your design choices. This means that you cannot simply decide to use the most complicated component possible to get good marks. The choice must be appropriate for the information you are trying to display.

L Once you have this client working, make sure to save the code unaltered so that it can be assessed.

Task 6 Serialising Search Data to XML (20%)

This final exercise requires you to serialise all of the search data, and the retrieved GeoNames IDs, back into an XML file.
The file needs to be written to as each search is completed so that the information doesn’t need to be held in memory. You should use the Streaming API for XML (StAX) in Java to act as both a reader (i.e. read from an XML document) and a writer (write to an XML document). You’ll find Oracle’s tutorial for StAX at http://docs.oracle.com/javase/tutorial/jaxp/stax/index.html.

The XML document you are going to create should conform to the basic structure shown below (the elements emphasised in italics must be replaced with the actual values):

    1 2 3 entered search term 4 true/false at GeoNames 5 filled in if found 6 7 ... repeated for each search completed ... 8

The element is repeated each time the user runs another search. You should ensure that the file is updated after each write, so that your code can be tested completely. If the file you have chosen to use already exists, you can erase the contents of the file on each new execution of your program.

You will need to handle the fact that your program is running from a JAR and so will not necessarily have a file system location to use. There is a suitable method in the java.io.File class to assist in creating temporary files safely. You will also need to find a suitable way to inform the user of the location of this file.

Submission

This must be submitted as a .zip file (no other archive formats will be accepted) of your entire project folder. This will include your POM, source files and web files, as well as the compiled and packaged project. This can be exported from Eclipse directly by right-clicking on the Project and selecting Export. Choose the General > Archive File option, then select your project, choose ZIP as the archive type, enter a name/location for the file and click Finish.

Useful Links

In addition to looking at Horstmann, I strongly recommend the following two sites for more information about XML.
http://www.ibm.com/developerworks/xml/
http://www.w3schools.com/DTD/default.asp

Another important source of information is the W3 consortium site, which maintains a section devoted to XML. The home page is listed below:
http://www.w3.org/XML/

Perhaps the most useful area at W3 can be found at
http://www.w3.org/standards/xml/
This link will take you to an authoritative description of XML and the various standards associated with this markup language.

For an interesting article on XML design by Uche Ogbuji see
http://www.ibm.com/developerworks/xml/library/x-eleatt.html

For information about the various Java APIs associated with XML check out the following links:
http://docs.oracle.com/javase/tutorial/jaxp/index.html
http://www.ibm.com/developerworks/java/library/x-jaxp/

Appendix 1: XML Namespaces

Because various XML authors may use the same tags to represent different data (especially since a word will quite often have multiple possible meanings), naming collisions can occur. In order to prevent such collisions, we typically declare our XML tags as belonging to a namespace. For example, imagine the following XML document representing a university, which has many courses and many students:

    1 2 3 4 6 5 John Smith 6 67 7 8 9 5 10 Computing Laboratory 2 11 30 12 13

In the above XML document, there are two collisions: one between id tags and another between name tags.
In the first instance, our id and name tags refer to a student ID and a student name whereas, in the second, we are referring to a course ID and course name. This would likely cause confusion when attempting to validate our XML documents against a DTD, as it would appear (by choosing the same name for these tags) that the id and name tags should store the same information whereas, in reality, this is not what we want. In order to distinguish between these, we can use namespaces e.g.

    1 2 3 4 6 5 John Smith 6 67 7 8 9 5 10 Computing Laboratory 2 11 30 12 13

In the above example, we have used two namespaces in order to ensure that our tags are now unique in meaning: naming collisions no longer occur and we cannot confuse one tag for another. The second line of the XML document, which would normally be used to declare the root element of an XML document, now also declares two namespaces: student and course. We assign each namespace a URL purely in order to ensure that these namespaces are unique. The theory is that, since a particular URL belongs only to us, using our URL as part of our namespace should ensure that our namespace is truly unique. Using the Bangor University domain is probably not the best example, since in theory every student or staff member of the university might make use of this domain in order to define a namespace, so it would probably be better to make use of a more personal domain name should you happen to own one.

Whilst URLs are very commonly used in practice in order to make namespaces unique, we are in fact allowed to use any URI, i.e. we are not limited to URLs. Notice that http://www.bangor.ac.uk/xmlns-student and http://www.bangor.ac.uk/xmlns-course aren’t actually URLs which exist; the URL is included as part of the namespace purely for the purposes of making it unique, for the same reason that Java packages are also typically named after (reversed) domains, e.g. uk.ac.bangor.

Having declared our namespaces, we simply prefix each tag with the namespace to which it should belong.
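To see how a parser tells the prefixed tags apart, the following sketch may help. It is a hypothetical illustration: the sample document is invented for the purpose (only the id and name tags, the student and course prefixes and the two namespace URIs come from the text above; the university, student and course element names are assumptions), and the class name NamespaceSketch is not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of namespace-aware DOM parsing.
final class NamespaceSketch {

    static final String STUDENT_NS = "http://www.bangor.ac.uk/xmlns-student";
    static final String COURSE_NS = "http://www.bangor.ac.uk/xmlns-course";

    // Invented sample document; element names other than id and name are assumptions.
    static final String SAMPLE =
            "<university xmlns:student=\"" + STUDENT_NS + "\""
            + " xmlns:course=\"" + COURSE_NS + "\">"
            + "<student:student>"
            + "<student:id>6</student:id>"
            + "<student:name>John Smith</student:name>"
            + "</student:student>"
            + "<course:course>"
            + "<course:id>5</course:id>"
            + "<course:name>Computing Laboratory 2</course:name>"
            + "</course:course>"
            + "</university>";

    // Returns the text of the first element with the given namespace URI and
    // local name, or null when no such element exists.
    static String firstText(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is off by default and must be enabled explicitly.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagNameNS(namespaceUri, localName);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }
}
```

The lookup is keyed on the namespace URI rather than the prefix, so a document that bound the same URI to a different prefix would be matched in exactly the same way.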