ECE/CS 472/572 Computer Architecture: Background
Prof. Lizhong Chen

Opening the Box
◼ Capacitive multitouch LCD screen
◼ 3.8 V, 25 watt-hour battery
◼ Computer board

Inside the Processor
◼ Apple A5

What is Computer Architecture?
◼ Computer architecture is the science and art of selecting and interconnecting hardware components to create computers that meet functional, performance, and cost goals.
[Figure: abstraction layers (Application, Algorithm, Programming Language, Operating System, Circuits, Devices, Technology), with "Computer Architecture" labeled alongside]

How a Computer Is Made
https://www.youtube.com/watch?v=UvluuAIiA50

Relative Performance
◼ Define: Performance = 1 / Execution Time
◼ "X is n times faster than Y" means
  \[ \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n \]
◼ Example: time taken to run a program
  ◼ 10 s on A, 15 s on B
  ◼ Execution Time_B / Execution Time_A = 15 s / 10 s = 1.5
  ◼ So A is 1.5 times faster than B (a speedup of 1.5 over B)

Instruction Count and CPI
  \[ \text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} \]
◼ Instruction count for a program
  ◼ Determined by program, ISA, and compiler
◼ Average cycles per instruction (CPI)
  ◼ Determined by CPU hardware
  ◼ Different instructions may have different CPI
  ◼ Average CPI affected by instruction mix

Uniprocessor Performance
◼ Constrained by power, instruction-level parallelism, memory latency
◼ Intel Tick-Tock Model

Embedded processors
◼ Broadcom XLP-II: 20-core
◼ Cavium Octeon: 48-core
◼ Tilera Tile-Gx8072: 72-core

Mobile devices: MPSoCs
◼ CPU, GPU, DSP, etc.

(GP)GPUs
◼ Nvidia Kepler: 192x15 cores
◼ AMD Liverpool: 1152 cores

◼ Intel Westmere-EP: 6-core [1]
◼ IBM Power7+: 8-core [2]
◼ Oracle SPARC T5: 16-core [3]
◼ Intel SCC: 48-core [4]
◼ Intel Xeon Phi: 60-core [5]

Challenge: On-chip communication for parallel computing

[1] http://www.theregister.co.uk/2010/02/03/intel_westmere_ep_preview/
[2] http://www.theregister.co.uk/2012/10/03/ibm_power7_plus_server_launch/
[3] http://www.theregister.co.uk/2012/09/04/oracle_sparc_t5_processor/
[4] http://www.intel.com/pressroom/archive/releases/2009/20091202comp_sm.htm
[5] http://www.scientificcomputing.com/news/2013/02/intel-xeon-phi-coprocessor/

Instruction Set
◼ The repertoire of instructions of a computer
◼ Different computers have different instruction sets
  ◼ But with many aspects in common
◼ Early computers had very simple instruction sets
  ◼ Simplified implementation
◼ Many modern computers also have simple instruction sets
◼ CISC vs. RISC

The MIPS Instruction Set
◼ Large share of embedded core market
  ◼ Applications in consumer electronics, network/storage equipment, cameras, printers, …
◼ Typical of many modern ISAs (see Appendix E)
◼ We will examine two implementations of the MIPS ISA
  ◼ A simplified version
  ◼ A more realistic pipelined version
◼ Simple subset, shows most aspects
  ◼ Memory reference instructions: lw, sw
  ◼ Arithmetic-logical instructions: add, sub, and, or, slt
  ◼ Control transfer instructions: beq, j

MIPS Instruction Examples
◼ C code:
    g = h + A[8];
  ◼ g in $s1, h in $s2, base address of A in $s3
◼ Compiled MIPS code:
  ◼ Index 8 requires offset of 32
    ◼ 4 bytes per word
    lw  $t0, 32($s3)    # load word: offset 32, base register $s3
    add $s1, $s2, $t0

MIPS Instruction Examples
◼ C code:
    A[12] = h + A[8];
  ◼ h in $s2, base address of A in $s3
◼ Compiled MIPS code:
  ◼ Index 8 requires offset of 32; index 12 requires offset of 48
    lw  $t0, 32($s3)    # load word
    add $t1, $s2, $t0
    sw  $t1, 48($s3)    # store word

MIPS Instruction Examples
◼ Conditional
  ◼ Branch to a labeled instruction if a condition is true; otherwise, continue sequentially
  ◼ beq rs, rt, L1
    ◼ if (rs == rt), branch to the instruction labeled L1
◼ Unconditional
  ◼ j L1
    ◼ Unconditional jump to the instruction labeled L1

MIPS R-format Instructions
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
◼ Instruction fields
  ◼ op: operation code (opcode)
  ◼ rs: first source register number
  ◼ rt: second source register number
  ◼ rd: destination register number
  ◼ shamt: shift amount (00000 for now)
  ◼ funct: function code (extends opcode)

R-format Example
    add $t0, $s1, $s2

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
    0       17      18      8       0       32
    000000  10001   10010   01000   00000   100000

  \[ 00000010001100100100000000100000_2 = 02324020_{16} \]

MIPS I-format Instructions
    op      rs      rt      constant or address
    6 bits  5 bits  5 bits  16 bits
◼ Immediate arithmetic
  ◼ addi $s1, $s2, 20
  ◼ rs/rt: source/destination register number
  ◼ Constant: -2^15 to +2^15 - 1
◼ Load/store instructions
  ◼ lw $t0, 32($s3)
  ◼ rs/rt: source/destination register number
  ◼ Address: offset added to base address in rs

MIPS J-format Instructions
    op      address
    6 bits  26 bits
◼ Jump (j) targets could be anywhere in text segment
  ◼ j L1
  ◼ Encode full address in instruction
◼ (Pseudo)Direct jump addressing
  ◼ Target address = PC[31…28] : (address × 4)
  ◼ That is, the upper 4 bits of the PC concatenated with the 26-bit address field shifted left by 2

Branch Addressing
    op      rs      rt      constant or address
    6 bits  5 bits  5 bits  16 bits
◼ Branch instructions specify
  ◼ Opcode, two registers, target address
  ◼ beq rs, rt, L1
◼ Most branch targets are near branch
  ◼ Forward or backward
◼ PC-relative addressing
  ◼ Target address = PC + offset × 4
  ◼ PC already incremented by 4 by this time

Addressing Mode Summary
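
The field layouts and the two target-address rules from the preceding slides can be checked with a few lines of Python. This is only a minimal sketch, not part of the course material: the function names and the `pc_plus_4` parameter are made up for illustration, the `lw` opcode (35) and the register number of `$s3` (19) come from the standard MIPS conventions rather than from the slides, and the jump rule is assumed to take its upper 4 bits from the already-incremented PC.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """Pack the I-format fields (6/5/5/16 bits); imm is stored as 16-bit two's complement."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def branch_target(pc_plus_4, offset):
    """PC-relative addressing: word offset times 4, added to the incremented PC."""
    return (pc_plus_4 + offset * 4) & 0xFFFFFFFF


def jump_target(pc_plus_4, address):
    """Pseudo-direct addressing: PC[31..28] concatenated with (address * 4)."""
    return (pc_plus_4 & 0xF0000000) | ((address & 0x03FFFFFF) << 2)


# add $t0, $s1, $s2  ->  op=0, rs=17, rt=18, rd=8, shamt=0, funct=32
add_word = encode_r(0, 17, 18, 8, 0, 32)
print(f"{add_word:08X}")  # 02324020, matching the R-format example

# lw $t0, 32($s3)  ->  op=35 (assumed from the MIPS opcode table), rs=19, rt=8
lw_word = encode_i(35, 19, 8, 32)
print(f"{lw_word:08X}")  # 8E680020

# A beq at address 0x00400010 with offset -3 branches back to 0x00400008.
print(hex(branch_target(0x00400014, -3)))  # 0x400008
```

Passing the incremented PC explicitly keeps the "PC already incremented by 4" rule visible at the call site instead of hiding it inside the helper.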