ECE3073 2024
Final Exam - Revision Questions
Before you begin - these questions have been drafted up as extra practice and aim to provide some indication of the exam format and topics. Please keep in mind that:
The time required to complete all these questions is not indicative of the time required to complete the final exam
They may not necessarily cover all examinable topics, so it is always best to revise the course material from Moodle in addition to answering these questions.
Q1: Pipeline Hazards
Question 1.1
Identify and explain the data/control hazards in the code below. Assume you have a classical 5-stage pipelined Nios II/f processor.
start: addi r5, r0, 5
addi r6, r0, 7
sub r4, r5, r6
ble r4, r0, add
sub: addi r1, r0, 5
addi r2, r0, 4
sub r3, r1, r2
add: addi r1, r0, 5
addi r2, r0, 6
add r3, r1, r2
Question 1.2
Discuss what code modifications as well as hazard management strategies can be used to mitigate the identified hazards.
Q2: I/O and Communication
Question 2.1
Explain the concept of I2C communication and its advantages in embedded systems.
Describe the key components of an I2C communication setup, including controller and target devices, SCL (clock) and SDA (data) lines, and addressing schemes.
Describe the concept of parity bits in data communication.
Discuss the role of parity bits in error detection and correction.
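As a point of reference for the parity discussion, here is a minimal sketch in C of even parity over an 8-bit word. The function names even_parity_bit and parity_ok are hypothetical, chosen only for illustration, and the course material may use odd parity or a different word size.

```c
#include <stdint.h>

/* Even parity: the parity bit is chosen so that the data bits plus the
 * parity bit contain an even number of 1s. (Hypothetical helper name.) */
uint8_t even_parity_bit(uint8_t data)
{
    uint8_t parity = 0;
    while (data != 0) {
        parity ^= (uint8_t)(data & 1u);  /* toggle for every 1 bit seen */
        data >>= 1;
    }
    return parity;
}

/* Receiver-side check: returns 1 if the received parity bit matches the
 * parity recomputed from the received data, 0 otherwise. */
int parity_ok(uint8_t data, uint8_t parity_bit)
{
    return even_parity_bit(data) == (parity_bit & 1u);
}
```

Note that a single flipped bit makes parity_ok return 0, but two flipped bits cancel out and go unnoticed, and the check never says which bit was wrong. That is worth keeping in mind when discussing detection versus correction.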
Question 2.2
Explain how I2C allows for devices with different clock speeds to communicate with each other.
Question 2.3
Describe the UART data transfer process, including start/stop bits, data framing, and baud rate.
Discuss the advantages and limitations of UART communication compared to parallel communication methods.
Explain how UART reduces the required clock synchronisation accuracy between endpoints.
Question 2.4
Suppose you had the following hardware setup for the RS232 on the Nios II processor:
UART connected to IRQ 5
When a char is ready to transmit, this will trigger an interrupt for the receiver side
You want to transmit strings
How would you:
Use low-level Nios II macros (e.g. NIOS2_WRITE_STATUS) to set up the interrupts correctly
Set up the interrupt handler to transmit strings
To answer this question, write pseudocode for the interrupt handler, but write out the correct Nios II macros and interrupt masks for the UART where needed. Please refer to the Embedded Peripherals document for more information.
Question 2.5
You are a graduate engineer implementing a high-speed embedded system involving a host processor, LIDAR sensor, and wireless communications module. From the sensor datasheet, you know your LIDAR sensor produces significant RF interference during operation.
Your wireless module supports multiple communications protocols, including a 16-bit parallel port, I2C, and RS-485. Your application requires more than 1 Mbit/s of bandwidth, and you have a 100 kHz clock available. Choose an appropriate communications protocol, and justify your choice.
[Hint: These questions test your general understanding of different communication protocols, along with the implementations of the RS232 done in the workshop]
Q3: Architectures and Processors
Question 3.1
Explain the characteristics of the following ISAs: stack-based, accumulator-based, load-store, and register-memory architectures. Include advantages, disadvantages, and typical use cases for each.
Question 3.2
Provide code examples (in assembly language) for a simple operation (e.g., addition or multiplication) in each of the four ISAs mentioned above. Highlight the differences in instruction formats and operand handling.
Question 3.3
Explain the role and functionality of the following components: ALU, Register File, Control Unit, and Data Memory.
Question 3.4
Explain the concept of pipelining in processor design. How does pipelining improve performance?
Question 3.5
Discuss strategies for optimizing performance in Nios II-based systems. Include techniques such as cache, pipeline, out-of-order instruction scheduling, and parallel execution.
Question 3.6
You are creating a 16x16 array of integers as part of implementing an image processing algorithm, where your image sensor scans left-to-right and top-to-bottom. Your embedded system uses a Nios II/f processor, which has 64-byte cache lines and high latency when accessing main memory. Given that your kernel produces the same results whether it runs over columns or rows, is one access pattern more performant than the other? Explain your choice, with specific reference to cache hits and misses.
[Hint: These are theory questions that require you to compare the differences in architectures and processor design]
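For Question 3.6, the effect of traversal order can be explored with a toy model. The sketch below assumes 4-byte integers, an array aligned to a cache-line boundary, and a deliberately pessimistic cache that holds only one 64-byte line at a time. The function count_misses is hypothetical and does not model the real Nios II/f cache.

```c
#include <stddef.h>
#include <stdint.h>

#define N 16            /* array is N x N integers, as in the question */
#define LINE_BYTES 64   /* cache line size given in the question */

/* Count misses for a toy cache holding exactly one line.
 * by_column == 0 walks the array row by row; otherwise column by column. */
unsigned count_misses(int by_column)
{
    long cached_line = -1;   /* nothing cached initially */
    unsigned misses = 0;

    for (int outer = 0; outer < N; outer++) {
        for (int inner = 0; inner < N; inner++) {
            int row = by_column ? inner : outer;
            int col = by_column ? outer : inner;
            /* byte offset of element [row][col] in a row-major C array */
            size_t offset = ((size_t)row * N + (size_t)col) * sizeof(int32_t);
            long line = (long)(offset / LINE_BYTES);
            if (line != cached_line) {   /* different line: miss and refill */
                misses++;
                cached_line = line;
            }
        }
    }
    return misses;
}
```

Under these assumptions each row occupies exactly one cache line, so comparing count_misses(0) with count_misses(1) shows how strongly the access order affects the miss count. A larger cache would narrow the gap, so state your cache-size assumption when answering.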
Q4: Microarchitecture Performance
Consider three different processors P1, P2 and P3 with the following cycles per instruction (CPI) and clock rate.
Processor    CPI    Clock rate
P1           1.5    3 GHz
P2           1.0    4 GHz
P3           2.2    2.2 GHz
Assuming that all processors implement and execute the same instruction set, answer the following questions. Justify your answers with the corresponding calculations.
Question 4.1
Which processor has the highest performance expressed in instructions per second?
Question 4.2
If each processor executes a program in 10 seconds, compute the number of CPU clock cycles required for this program and the number of instructions in this program for each processor.
Question 4.3
Changes to each processor led to an increase of 20% in their respective CPI. If we want to reduce the execution time of a program from 10 to 7 seconds, what clock rate is required for each processor to achieve this time reduction?
[Hint: This tests your understanding of the different factors that influence CPU execution time as well as the different ways in which CPU performance can be computed]
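The relationships needed for Q4 all follow from CPU time = instruction count x CPI / clock rate. The sketch below writes them as small C helpers with hypothetical function names. The numbers in the usage note are made up and are not the values from the table.

```c
/* Instructions executed per second = clock rate / CPI. */
double instructions_per_second(double clock_hz, double cpi)
{
    return clock_hz / cpi;
}

/* CPU time = instruction count * CPI / clock rate. */
double cpu_time_seconds(double instr_count, double cpi, double clock_hz)
{
    return instr_count * cpi / clock_hz;
}

/* Clock rate needed to run instr_count instructions at the given CPI
 * within target_seconds (the CPU time equation rearranged). */
double required_clock_hz(double instr_count, double cpi, double target_seconds)
{
    return instr_count * cpi / target_seconds;
}
```

For example, a hypothetical 1 GHz processor with a CPI of 2.0 retires 5e8 instructions per second and needs 4 seconds for a program of 2e9 instructions. Halving that time at the same CPI requires a 2 GHz clock.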
Q5: uC/OS-II RTOS Coding
[The extent of coding in this question is not expected in the actual exam - you might have to troubleshoot code but not write code from scratch]
Question 5.1
Complete the following code according to the task. Be as complete as possible when it comes to syntax. Pseudocode will result in capped marks.
// A task polls a PIO as fast as it can and tries to detect a change. When
// there is a change there are two tasks that are run, one of which stores
// in a database, another which updates a display. The polling task has to
// wait for the other tasks to be finished before it can poll again.
// Criteria
// 1 mark for task initialisation
// 1 mark for communication primitives
// 1 mark for correct behaviour
// Fill in the skeleton code below
#define SENSOR_BASE 0x1000
// Function called to store data in database, not to be completed, just used
void store_database(int data);
// Function called to display data, not to be completed, just used
void display_data(int data);
// Task that produces new data after it has been consumed by the other tasks
// if a change is detected when polling.
void data_task(void *pdata)
{
}
// Task that stores the data
void store_task(void *pdata)
{
}
// Task that displays data
void display_task(void *pdata)
{
}
int main()
{
// Setup tasks and Initialise OS Event structures etc.
}
[Hint: To tackle this question, break down the relevant information first and, if you are not confident with coding, write out the pseudocode. Remember to protect any shared variables, and use signalling semaphores where appropriate]
Question 5.2
How have you protected any shared variables between the different tasks? How have you ensured that deadlock has been prevented?
[Hint: This relates to semaphores, and how you have used the semaphore functions to create, pend, and post. The value used to create the semaphore is important!]
Question 5.3
Extend your code so that it is driven by interrupts instead of polling, with display_task pending on a semaphore called sem_signalling_display.
[Hint: Think about what is needed for an ISR to occur, including the Embedded Peripherals Document]
Q6: Nios II Hardware Setup
You are a graduate ECSE student and you are getting bored with your graduate role. In your spare time, you decide to start a new hobby: setting up internet of things (IoT) devices around your property. To give yourself a challenge, you have decided to use an FPGA to set the system up, rather than a microcontroller such as an STM32. Here are the IoT devices you have on hand that can send data to the FPGA:
A 480x360 camera that uses grayscale images
6 photodiode sensors to detect light levels
Moisture sensor to detect the moisture levels within the soil
Some other IoT devices you have are:
Alarm system that can trigger notifications on your phone
Lights around the property that will turn on
Sprinklers to maintain moisture
Assume there are intermediary control signals that can send data back to the GPIO pins of the FPGA, so your FPGA can perform some processing and decide on what outputs to assert later.
Question 6.1
To start designing this system, you have decided to use the Nios II/e processor because you forgot you did not have access to the Monash VPN, so you cannot use anything more complex. How would you go about designing the hardware level to integrate all 3 devices?
[Hint: Talk about Platform Designer, and how you would set up the I/Os with the appropriate configurations]
Question 6.2
You have decided to implement the code using interrupts. To keep the resource usage low for the FPGA, you have decided to use one PIO for both the photodiodes, and the moisture sensor.
However, you realise you only need one photodiode, and the moisture sensor returns an 8-bit value as an input to the FPGA. This byte is a numerical value that indicates the moisture level, but for now you are only concerned with a high moisture level, which you simplify by taking the MSB of this 8-bit input. With this information in mind, indicate what you need to do at the hardware level, and how you would implement interrupts at a low level. Write out the code where necessary, with the correct offsets and macros. Pseudocode is also allowed, but this will result in lower marks.
[Hint: You will need to list out the Nios II macros and the relevant numbers corresponding to your design in the previous sub question]
Question 6.3
You realise that using the MSB of the 8-bit input from the moisture sensor is not enough! You need a more granular indication of the moisture level so you can control the sprinkler system. This means you have to use the top 3 bits instead. How will you expand the PIO, and what else do you need to change at the code level to satisfy this new system?
[Hint: Talk about the changes in the PIO, and what you need to update to enable interrupts for the additional bits in the sensor]
Question 6.4
Now that you have implemented the sensors, you realise you have a new problem: the sprinkler system needs to adjust the amount of water it delivers according to the detected moisture level. Assume that the sprinkler system has an ENABLE input signal which you can control from the FPGA. How will you drive the sprinkler system so that it provides moisture based on the detected moisture level?
[Hint: Think about what you have learnt in the labs, and how you could control the amount of moisture being distributed]
Question 6.5
You decide you earn enough money at your graduate role that you can afford to get a Quartus licence, so you upgrade the Nios II/e processor to the Nios II/f processor. After the transition, you realise that the outputs (i.e. the sprinkler) are no longer being driven properly. What potential issue(s) could the transition have caused, and how would you resolve them?
[Hint: Think about the differences between Nios II/e and Nios II/f - how will your setup need to be updated?]
Question 6.6
[The extent of coding in this question is not expected in the actual exam - you might have to troubleshoot code but not write code from scratch]
For your IoT project, you found and purchased a bootleg humidity sensor from eBay, but after receiving it, you realise that it uses some weird 1-way SPI-like protocol to transfer data. After some thorough testing with an oscilloscope, you come to the following conclusions:
The input pin CLK is the clock input to the device, for some reason, this controls both the rate at which the sensor measures data, and transmits it.
The output pin CTRL is set high by the sensor to indicate that it is currently outputting valid data. If it is low when you are expecting data, the sensor knows its measurement was wrong, and you should discard the reading given.
The 8 output data pins ODATA[7:0] contain the 7-bit sensor reading, but this should be ignored if CTRL is not currently active.
For some ungodly reason, the sensor requires the input MSR pin to be active in order to make a reading, but then MSR must go low on the next clock cycle to allow for the sensor to send data.
If MSR is high, the CTRL pin will never go high.
If MSR stays high for more than 1 clock cycle, the previous measurement will be lost, and the device will overwrite it with a new measurement.
If the sensor receives a clock that is faster than 100 Hz, it will burn out.
In Verilog, write a module to facilitate the data transfer between the sensor and your FPGA. You can simply export the received data out of the module and assume you have a memory handler set up to deal with it. You may not use any IP modules for this task. The header for your Verilog module has been made for you below.
module sensor_interface ( // placeholder name: the original header omits one
input [7:0] ODATA,
input CTRL,
output MSR,
output CLK,
input CLOCK_50, // 50MHz clock
output DATA_REC // Data after being received from the sensor
);
[Hint: Start slow, and understand the main parts before diving deeper into the code. It always helps to write out pseudo code or flow charts first]
Q7: RTOS CPU Utilisation
[The extent of coding in this question is not expected in the actual exam - you might have to troubleshoot code but not write code from scratch]
Consider the following specifications for a (not yet implemented) uC/OS-II function.
This function, labelled OSStatPend(INT16U usageThresh, INT16U time), will delay the task it is called from until the processor has a CPU usage rolling average of less than ‘usageThresh’ for a period of ‘time’ seconds.
For example, OSStatPend(50, 10), will delay the current task until the rolling average for the CPU usage over the last 10 seconds is less than 50%.
Question 7.1
Implement this function in C, and make sure it takes up as little time as possible, as the CPU should be able to execute other code while this task is pending. To implement this, you should calculate the rolling average by maintaining a list of the last ‘time’-usage numbers and checking the mean of this against the input usageThresh.
[Hint: Make assumptions and write them where relevant.]
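The rolling-average bookkeeping described in Question 7.1 can be sketched independently of the RTOS. The following is a minimal sketch, assuming one CPU usage sample (0 to 100) arrives per second and that 'time' never exceeds a fixed bound. The type usage_window_t, both function names, and MAX_WINDOW are hypothetical. In uC/OS-II the samples would presumably come from the statistics task (e.g. OSCPUUsage), with the calling task sleeping between samples via a delay call such as OSTimeDlyHMSM so that other tasks can run.

```c
#include <stdint.h>

#define MAX_WINDOW 60u  /* hypothetical upper bound on 'time', in seconds */

typedef struct {
    uint8_t  samples[MAX_WINDOW]; /* one CPU usage percentage per second */
    uint16_t window;              /* number of samples averaged ('time') */
    uint16_t count;               /* samples stored so far, up to window */
    uint16_t head;                /* index of the oldest sample / next write */
    uint32_t sum;                 /* running sum of the stored samples */
} usage_window_t;

void usage_window_init(usage_window_t *w, uint16_t window)
{
    if (window == 0u) window = 1u;                 /* avoid modulo by zero */
    if (window > MAX_WINDOW) window = MAX_WINDOW;  /* clamp to buffer size */
    w->window = window;
    w->count = 0u;
    w->head = 0u;
    w->sum = 0u;
}

/* Store one new sample. Returns 1 once the window is full and its mean is
 * strictly below thresh, 0 otherwise. Keeping a running sum makes each
 * update O(1) instead of re-summing the whole list. */
int usage_window_push(usage_window_t *w, uint8_t usage, uint16_t thresh)
{
    if (w->count == w->window) {
        w->sum -= w->samples[w->head];  /* drop the oldest sample */
    } else {
        w->count++;
    }
    w->samples[w->head] = usage;
    w->sum += usage;
    w->head = (uint16_t)((w->head + 1u) % w->window);

    /* mean < thresh  <=>  sum < thresh * window (avoids a division) */
    return (w->count == w->window) &&
           (w->sum < (uint32_t)thresh * w->window);
}
```

A pending loop would call usage_window_push once per second and return as soon as it yields 1. Whether to wait for a full window before returning is a design choice you should state as an assumption.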
Question 7.2
Answer the following questions based on your implementation:
What is the maximum number of seconds over which this function can measure the average CPU usage?
Why would it be useless for the ‘time’ input to be in ms instead of seconds, even though it would give the user more control over the processor?
Describe a situation in which this function might be useful.
[Hint: This question invites you to think past the initial implementation into its use cases. You can think about it in terms of the scheduler, or how RTOS works in general]
Q8: RTOS Scheduling
Question 8.1
Identify some of the limitations of existing commercial real-time kernels for the development of different mission- and safety-critical applications.
Show with an example that EDF is no longer an optimal scheduling policy if preemption is not allowed.
Question 8.2
Please refer to the following table for questions 8.2 and 8.3.
Task    Execution time (in ticks)    Period (in ticks)
1       2                            8
2       2                            6
3       1                            3
What is the LCM of the task periods (i.e. the hyperperiod of the task set)?
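The LCM of a set of periods can be computed by folding a pairwise LCM over the list, using the identity lcm(a, b) = a / gcd(a, b) * b. The sketch below uses hypothetical helper names, and the periods in the usage note are made up rather than taken from the table.

```c
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
unsigned gcd(unsigned a, unsigned b)
{
    while (b != 0u) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least common multiple; dividing first keeps the intermediate small.
 * Assumes a and b are both non-zero. */
unsigned lcm(unsigned a, unsigned b)
{
    return a / gcd(a, b) * b;
}

/* Hyperperiod = LCM of all task periods (all periods must be non-zero). */
unsigned hyperperiod(const unsigned *periods, size_t n)
{
    unsigned h = 1u;
    for (size_t i = 0; i < n; i++) {
        h = lcm(h, periods[i]);
    }
    return h;
}
```

For example, hypothetical periods of 4, 10 and 5 ticks give a hyperperiod of 20 ticks, so a schedule table for that task set would need 20 rows before it repeats.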
Question 8.3
Fill out the table according to the LCM, adding more rows where needed. Also, use SimSo to show that all the tasks are schedulable. Add a SimSo output to show your result matches the table.
Tick    Task executing
1
2
3
Question 8.4
Implement the earliest deadline first scheduling for a task set of arbitrary length, specified using the constant value NUM_SCHEDULED_TASKS. You may assume you have C standard library functions available (e.g., qsort).
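One building block for Question 8.4 is ordering the ready tasks by absolute deadline with qsort. The sketch below shows only that step. The struct edf_task_t, its fields, the function names, and the value 3 given to NUM_SCHEDULED_TASKS are all assumptions made for illustration. A full answer would also need release times, remaining execution time, and re-sorting at every scheduling point.

```c
#include <stdlib.h>

#define NUM_SCHEDULED_TASKS 3  /* placeholder value for this sketch */

typedef struct {
    int           id;            /* task identifier */
    unsigned long abs_deadline;  /* absolute deadline, in ticks */
} edf_task_t;

/* qsort comparator: earlier absolute deadline sorts first.
 * The subtraction-free form avoids overflow on large tick values. */
static int by_deadline(const void *a, const void *b)
{
    const edf_task_t *ta = a;
    const edf_task_t *tb = b;
    return (ta->abs_deadline > tb->abs_deadline) -
           (ta->abs_deadline < tb->abs_deadline);
}

/* After this call, tasks[0] is the task EDF would run next. */
void edf_sort(edf_task_t tasks[NUM_SCHEDULED_TASKS])
{
    qsort(tasks, NUM_SCHEDULED_TASKS, sizeof tasks[0], by_deadline);
}
```

Because qsort is not guaranteed to be stable, tasks with equal deadlines may come out in either order. If that matters, add a tie-break on the task id in the comparator.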
Question 8.5
Prove earliest deadline first scheduling is optimal for uniprocessor systems.
Question 8.6
Is earliest deadline first scheduling optimal for simultaneous multiprocessor systems? Explain your answer.
[Hint: This tests your understanding of the EDF scheduler. Some questions may be difficult, so ensure you focus on the ones that you can complete first.]
Q9: Memory System and Cache Organisation
Question 9.1
Explain the different cache organisations. Identify their main advantages and drawbacks.
Question 9.2
Consider the following assembly code running in a processor with a single-level cache. The single-level cache includes a small fully associative instruction cache (I-cache) of size 16 B and uses 8-byte blocks. The instruction length is 4 B. Assume that the registers r1 and r3 are initialised to values 10 and 0 respectively.
LOOP: cmplt r2, r0, r1
beq r2, r0, DONE
addi r1, r1, -1
addi r3, r3, 2
br LOOP
DONE:
What is the hit rate of the instruction cache over all loop iterations?
Assuming a hit access time of 0.66 ns and a memory access time of 70 ns, what is the instruction cache's effective access time (referred to as AMAT in the lectures)?
[Hint: This tests your understanding of basic cache concepts and the implications of a given cache organisation in memory performance]
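For the access-time part of Question 9.2, one common textbook form is AMAT = hit time + miss rate x miss penalty. The sketch below assumes that form. The lectures may instead weight the hit and miss times as h x t_hit + (1 - h) x t_miss, so check which definition applies before using it. The function names and the numbers in the usage note are hypothetical.

```c
/* Hit rate = hits / total accesses (accesses must be non-zero). */
double hit_rate(unsigned long hits, unsigned long accesses)
{
    return (double)hits / (double)accesses;
}

/* AMAT = hit time + miss rate * miss penalty, all times in ns.
 * Assumes every access pays the hit time and misses pay the penalty
 * on top of it. */
double amat_ns(double hit_time_ns, double miss_rate, double miss_penalty_ns)
{
    return hit_time_ns + miss_rate * miss_penalty_ns;
}
```

For example, with a made-up 1 ns hit time, 3 hits out of 4 accesses (a miss rate of 0.25), and an 8 ns miss penalty, this form gives 1 + 0.25 x 8 = 3 ns.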