Find the word or phrase from the list below that best matches the description in the following questions. Use the numbers to the left of the words in the answer. Each answer should be used only once.

1. virtual worlds
2. desktop computers
3. servers
4. low-end servers
5. supercomputers
6. terabyte
7. petabyte
8. datacenters
9. embedded computers
10. multicore processors
11. VHDL
12. RAM
13. CPU
14. operating system
15. compiler
16. bit
17. instruction
18. assembly language
19. machine language
20. C
21. assembler
22. high-level language
23. system software
24. application software
25. COBOL
26. FORTRAN

1.1.1 Computer used to run large problems and usually accessed via a network ==> 3) servers
1.1.2 10^15 or 2^50 bytes (i.e., 10 to the 15 or 2 to the 50) ==> 7) petabyte
1.1.3 Computer composed of hundreds to thousands of processors and terabytes of memory ==> 5) supercomputers
1.1.4 Today's science fiction application that probably will be available in near future ==> 1) virtual worlds
1.1.5 A kind of memory called random access memory ==> 12) RAM
1.1.6 Part of a computer called central processing unit ==> 13) CPU
1.1.7 Thousands of processors forming a large cluster ==> 8) datacenters
1.1.8 A microprocessor containing several processors in the same chip ==> 10) multicore processors
1.1.9 Desktop computer without a screen or keyboard, usually accessed via a network ==> 4) low-end servers
1.1.10 Currently the largest class of computer (i.e., there are more of these types of computers than any others) that runs one application or one set of related applications ==> 9) embedded computers
1.1.11 Special language used to describe hardware components ==> 11) VHDL
1.1.12 Personal computer delivering good performance to single users at low cost ==> 2) desktop computers
1.1.13 Program that translates statements in high-level language to assembly language ==> 15) compiler
1.1.14 Program that translates symbolic instructions to binary instructions ==> 21) assembler
1.1.15 High-level language for business data processing ==> 25) COBOL
1.1.16 Binary language that the processor can understand ==> 19) machine language
1.1.17 Commands that the processors understand ==> 17) instruction
1.1.18 High-level language for scientific computation ==> 26) FORTRAN
1.1.19 Symbolic representation of machine instructions ==> 18) assembly language
1.1.20 Interface between user's program and hardware providing a variety of services and supervision functions ==> 14) operating system
1.1.21 Software/programs developed by the users ==> 24) application software
1.1.22 Binary digit (value 0 or 1) ==> 16) bit
1.1.23 Software layer between the application software and the hardware that includes the operating system and the compilers ==> 23) system software
1.1.24 High-level language used to write application and system software ==> 20) C
1.1.25 Portable language composed of words and algebraic expressions that must be translated into assembly language before being run on a computer ==> 22) high-level language
1.1.26 10^12 or 2^40 bytes ==> 6) terabyte
Exercise 1.2
1.2.1 For a color display using 8 bits for each of the primary colors (red, green, blue) per pixel, what should be the minimum size in bytes of the frame buffer to store a frame?
8 bits × 3 colors = 24 bits/pixel = 3 bytes/pixel, which this solution rounds up to 4 bytes/pixel.
1280 × 800 pixels = 1,024,000 pixels.
1,024,000 pixels × 4 bytes/pixel = 4,096,000 bytes (approximately 4 Mbytes).
(With exactly 3 bytes/pixel, the strict minimum would be 3,072,000 bytes.)
1.2.2 How many frames could it store, assuming the memory contains no other information?
2 GB = 2000 Mbytes
Number of frames = 2000 Mbytes/4 Mbytes = 500 frames
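The frame-buffer arithmetic in 1.2.1 and 1.2.2 can be checked with a short script. This is a minimal sketch that assumes decimal units (1 Mbyte = 10^6 bytes) and the 1280 × 800 display used above; the names `frame_bytes`, `packed`, `padded`, and `frames` are illustrative only. Note that dividing by the exact 4,096,000-byte frame, rather than the rounded 4 Mbytes, gives 488 whole frames instead of 500.

```python
# Frame-buffer sizing for Exercises 1.2.1 and 1.2.2 (decimal units assumed).
WIDTH, HEIGHT = 1280, 800      # display resolution used in the solution
BITS_PER_PIXEL = 8 * 3         # 8 bits for each of red, green, blue


def frame_bytes(bytes_per_pixel: int) -> int:
    """Bytes needed to hold one frame at the given pixel size."""
    return WIDTH * HEIGHT * bytes_per_pixel


packed = frame_bytes(BITS_PER_PIXEL // 8)  # 3 bytes/pixel, no padding
padded = frame_bytes(4)                    # 4 bytes/pixel, as in the solution

MEMORY_BYTES = 2_000 * 10**6               # 2 GB taken as 2000 Mbytes
frames = MEMORY_BYTES // padded            # whole frames that fit in memory

print(packed, padded, frames)
```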
1.2.3 If a 256 Kbytes file is sent through the Ethernet connection, how long would it take?
1 gigabit network ==> 1 gigabit per second = 125 Mbytes/second.
File size: 256 Kbytes = 0.256 Mbytes.
Time = file size/network speed = 0.256/125 s = 2.048 ms.
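The same transfer time can be computed directly from bits and bits per second, which avoids the intermediate unit conversions. A minimal sketch, assuming decimal units (256 Kbytes = 256,000 bytes) and ignoring protocol overhead; `transfer_time_s` is a hypothetical helper name.

```python
def transfer_time_s(file_bytes: float, link_bits_per_s: float) -> float:
    """Ideal transfer time: total bits divided by the link rate."""
    return file_bytes * 8 / link_bits_per_s


# 256 Kbytes over a 1 gigabit/s link
t = transfer_time_s(256e3, 1e9)
print(f"{t * 1e3:.3f} ms")
```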
Exercise 1.3
1.3.1 Which processor has the highest performance expressed in instructions per second?
P2 has the highest performance.
Performance (instructions/sec) = clock rate/CPI
P1: 2 × 10^9/1.5 = 1.33 × 10^9
P2: 1.5 × 10^9/1.0 = 1.5 × 10^9
P3: 3 × 10^9/2.5 = 1.2 × 10^9
1.3.2 If the processors each execute a program in 10 seconds, find the number of cycles and the number of instructions.
Number of cycles = time × clock rate
Cycles (P1) = 10 × (2 × 10^9) = 20 × 10^9
Cycles (P2) = 10 × (1.5 × 10^9) = 15 × 10^9
Cycles (P3) = 10 × (3 × 10^9) = 30 × 10^9
Time = (Number of instructions × CPI)/clock rate, so Number of instructions = Number of cycles/CPI
Instructions (P1) = 20 × 10^9/1.5 = 13.33 × 10^9
Instructions (P2) = 15 × 10^9/1 = 15 × 10^9
Instructions (P3) = 30 × 10^9/2.5 = 12 × 10^9
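The relations used in 1.3.1 and 1.3.2 can be sketched in a few lines. The clock rates and CPIs below are read off the formulas in 1.3.1; the function and variable names are illustrative assumptions, not part of the exercise.

```python
# (clock rate in Hz, CPI) for each processor, taken from the formulas in 1.3.1
PROCESSORS = {"P1": (2.0e9, 1.5), "P2": (1.5e9, 1.0), "P3": (3.0e9, 2.5)}
RUN_TIME_S = 10.0  # execution time given in 1.3.2


def instructions_per_second(clock_hz: float, cpi: float) -> float:
    return clock_hz / cpi


def cycles(clock_hz: float, time_s: float) -> float:
    return clock_hz * time_s


def instruction_count(clock_hz: float, cpi: float, time_s: float) -> float:
    return cycles(clock_hz, time_s) / cpi


for name, (clock_hz, cpi) in PROCESSORS.items():
    print(
        name,
        f"{instructions_per_second(clock_hz, cpi):.3g} instr/s,",
        f"{cycles(clock_hz, RUN_TIME_S):.3g} cycles,",
        f"{instruction_count(clock_hz, cpi, RUN_TIME_S):.4g} instructions",
    )

fastest = max(PROCESSORS, key=lambda n: instructions_per_second(*PROCESSORS[n]))
```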
1.3.3 We are trying to reduce the time by 30% but this leads to an increase of 20% in the CPI. What clock rate should we have to get this time reduction?
Time_new = Time_old × 0.7 = 7 s
CPI_new = CPI_old × 1.2 => CPI (P1) = 1.8, CPI (P2) = 1.2, CPI (P3) = 3
Time = Number of instructions × CPI/clock rate, so Clock rate = Number of instructions × CPI/Time
Clock rate (P1) = 13.33 × 10^9 × 1.8/7 = 3.43 GHz
Clock rate (P2) = 15 × 10^9 × 1.2/7 = 2.57 GHz
Clock rate (P3) = 12 × 10^9 × 3/7 = 5.14 GHz
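Solving the time equation for the clock rate can be sketched as follows. The instruction counts and CPIs come from 1.3.2; `required_clock_hz` and the dictionary names are hypothetical.

```python
def required_clock_hz(instructions: float, cpi: float, time_s: float) -> float:
    """Rearranged from Time = instructions x CPI / clock rate."""
    return instructions * cpi / time_s


# (instruction count, original CPI) from 1.3.2
PROGRAMS = {"P1": (20e9 / 1.5, 1.5), "P2": (15e9, 1.0), "P3": (12e9, 2.5)}
TARGET_TIME_S = 10.0 * 0.7   # 30% shorter than the original 10 s
CPI_PENALTY = 1.2            # CPI grows by 20%

new_clock = {
    name: required_clock_hz(instr, cpi * CPI_PENALTY, TARGET_TIME_S)
    for name, (instr, cpi) in PROGRAMS.items()
}
for name, hz in new_clock.items():
    print(name, f"{hz / 1e9:.2f} GHz")
```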
1.3.4 Find the IPC (instructions per cycle) for each processor.
IPC = 1/CPI = Number of instructions/(time × clock rate)
IPC (P1) = (20 × 10^9)/(7 × 2 × 10^9) = 1.43
IPC (P2) = (30 × 10^9)/(10 × 1.5 × 10^9) = 2
IPC (P3) = (90 × 10^9)/(9 × 3 × 10^9) = 3.33
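A quick numerical check of the IPC values is sketched below. It assumes the clock rates from 1.3.1 (2 GHz, 1.5 GHz, 3 GHz), which are the rates consistent with the stated results of roughly 1.43, 2, and 3.33; the instruction counts and times are the ones appearing in the formulas above. The name `ipc` is illustrative.

```python
def ipc(instructions: float, time_s: float, clock_hz: float) -> float:
    """Instructions per cycle = instructions / (time x clock rate)."""
    return instructions / (time_s * clock_hz)


# (instructions, execution time in s, clock rate in Hz); clock rates assumed from 1.3.1
RUNS = {
    "P1": (20e9, 7.0, 2.0e9),
    "P2": (30e9, 10.0, 1.5e9),
    "P3": (90e9, 9.0, 3.0e9),
}
results = {name: ipc(*run) for name, run in RUNS.items()}
for name, value in results.items():
    print(name, f"IPC = {value:.2f}")
```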
1.3.5 Find the clock rate for P2 that reduces its execution time to that of P1.
Time_new/Time_old = 7/10 = 0.7.
Clock rate_new = Clock rate_old/0.7 = 1.5 GHz/0.7 = 2.14 GHz
1.3.6 Find the number of instructions for P2 that reduces its execution time to that of P3.
Time_new/Time_old = 9/10 = 0.9.
Instructions_new = Instructions_old × 0.9 = (30 × 10^9) × 0.9 = 27 × 10^9
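Both 1.3.5 and 1.3.6 are proportional scalings: with everything else fixed, execution time is inversely proportional to clock rate and directly proportional to instruction count. A minimal sketch, with hypothetical helper names:

```python
def clock_for_time(clock_old_hz: float, time_old_s: float, time_new_s: float) -> float:
    """Time is inversely proportional to clock rate (CPI, instructions fixed)."""
    return clock_old_hz * time_old_s / time_new_s


def instructions_for_time(instr_old: float, time_old_s: float, time_new_s: float) -> float:
    """Time is proportional to instruction count (CPI, clock rate fixed)."""
    return instr_old * time_new_s / time_old_s


p2_clock = clock_for_time(1.5e9, 10.0, 7.0)          # match P1's 7 s
p2_instr = instructions_for_time(30e9, 10.0, 9.0)    # match P3's 9 s
print(f"{p2_clock / 1e9:.2f} GHz, {p2_instr:.3g} instructions")
```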
Exercise 1.4
1.4.1 Given a program with 10^6 instructions divided into classes as follows: 10% class A, 20% class B, 50% class C and 20% class D, which implementation is faster?
P2 is the faster implementation.
Class A: 10^5 instr. Class B: 2 × 10^5 instr. Class C: 5 × 10^5 instr. Class D: 2 × 10^5 instr.
Time = Number of instructions × CPI/clock rate (times in seconds)
P1:
Time class A = 0.67 × 10^-4
Time class B = 2.67 × 10^-4
Time class C = 10 × 10^-4
Time class D = 5.33 × 10^-4
Total time P1 = 18.67 × 10^-4
P2:
Time class A = 10^-4
Time class B = 2 × 10^-4
Time class C = 5 × 10^-4
Time class D = 3 × 10^-4
Total time P2 = 11 × 10^-4
1.4.2 What is the global (i.e., average) CPI for each implementation?
CPI = time × clock rate/number of instructions
CPI (P1) = (18.67 × 10^-4) × (1.5 × 10^9)/10^6 = 2.8
CPI (P2) = (11 × 10^-4) × (2 × 10^9)/10^6 = 2.2
1.4.3 Find the clock cycles required in both cases.
Clock cycles (P1) = (10^5 × 1) + ((2 × 10^5) × 2) + ((5 × 10^5) × 3) + ((2 × 10^5) × 4) = 28 × 10^5
Clock cycles (P2) = (10^5 × 2) + ((2 × 10^5) × 2) + ((5 × 10^5) × 2) + ((2 × 10^5) × 3) = 22 × 10^5
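Exercises 1.4.1 to 1.4.3 all follow from the per-class cycle counts, so they can be checked together. In this sketch the per-class CPIs and the clock rates (1.5 GHz for P1, 2 GHz for P2) are read off the formulas in 1.4.2 and 1.4.3; the names are illustrative.

```python
# Instruction counts per class for a 10^6-instruction program (10/20/50/20 %)
COUNTS = {"A": 100_000, "B": 200_000, "C": 500_000, "D": 200_000}

# (clock rate in Hz, CPI per class), read off the formulas in 1.4.2 and 1.4.3
IMPLEMENTATIONS = {
    "P1": (1.5e9, {"A": 1, "B": 2, "C": 3, "D": 4}),
    "P2": (2.0e9, {"A": 2, "B": 2, "C": 2, "D": 3}),
}


def total_cycles(cpi_by_class: dict) -> int:
    return sum(COUNTS[c] * cpi_by_class[c] for c in COUNTS)


summary = {}
for name, (clock_hz, cpi_by_class) in IMPLEMENTATIONS.items():
    n_cycles = total_cycles(cpi_by_class)
    summary[name] = {
        "cycles": n_cycles,
        "time_s": n_cycles / clock_hz,
        "cpi": n_cycles / sum(COUNTS.values()),
    }
    print(name, summary[name])
```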
1.4.4 Assuming that arithmetic instructions take 1 cycle, load and store 5 cycles and branch 2 cycles, what is the execution time of the program in a 2 GHz processor?
((500 × 1) + (50 × 5) + (100 × 5) + (50 × 2)) × (0.5 × 10^-9) = 675 ns
1.4.5 Find the CPI for the program.
CPI = time × clock rate/Number of instructions
CPI = ((675 × 10^-9) × (2 × 10^9))/700 = 1.93
1.4.6 If the number of load instructions can be reduced by one-half, what is the speed-up and the CPI?
Time = ((500 × 1) + (50 × 5) + (50 × 5) + (50 × 2)) × (0.5 × 10^-9) = 550 ns
Speed-up = 675 ns/550 ns = 1.23
The program now has 500 + 50 + 50 + 50 = 650 instructions, so
CPI = ((550 × 10^-9) × (2 × 10^9))/650 = 1.69
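The calculations in 1.4.4 to 1.4.6 can be reproduced from the instruction mix. This sketch assumes the mix implied by the formulas (500 arithmetic, 50 store, 100 load, 50 branch); `run_stats` is a hypothetical helper. Dividing the 1100 cycles of the reduced program by its actual 650 instructions gives a CPI of about 1.69, whereas dividing by the original 700 would give 1.57.

```python
CLOCK_HZ = 2e9
CYCLES_PER = {"arith": 1, "store": 5, "load": 5, "branch": 2}


def run_stats(counts: dict) -> tuple:
    """Return (execution time in seconds, average CPI) for an instruction mix."""
    n_cycles = sum(counts[k] * CYCLES_PER[k] for k in counts)
    return n_cycles / CLOCK_HZ, n_cycles / sum(counts.values())


base = {"arith": 500, "store": 50, "load": 100, "branch": 50}
half_loads = dict(base, load=base["load"] // 2)

t_base, cpi_base = run_stats(base)
t_half, cpi_half = run_stats(half_loads)
speedup = t_base / t_half

print(f"{t_base * 1e9:.0f} ns, CPI {cpi_base:.2f}")
print(f"{t_half * 1e9:.0f} ns, CPI {cpi_half:.2f}, speed-up {speedup:.2f}")
```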
Exercise 1.5
1.5.1 Assume that peak performance is defined as the fastest rate that a computer can execute any instruction sequence. What are the peak performances of P1 and P2 expressed in instructions per second?
1.5.2 Suppose the number of instructions executed in a certain program is divided equally among the classes of instructions except for class A, which occurs twice as often as each of the others. Which computer is faster? How much faster is it?
1.5.3 Suppose the number of instructions executed in a certain program is divided equally among the classes of instructions except for class E, which occurs twice as often as each of the others. Which computer is faster? How much faster is it?
1.5.4 Assuming that computes take 1 cycle, load and store instructions take 10 cycles, and branches take 3 cycles, find the execution time of the program on a 3 GHz MIPS processor.
1.5.6 Assuming that computes take 1 cycle, load and store instructions take 2 cycles, and branches take 3 cycles, what is the speed-up of a program if the number of compute instructions can be reduced by one-half?
Exercise 1.6
1.6.1 For the same program, two different compilers are used. The table above shows the execution time of the compiled program. Find the average CPI for the program given that the processor has a clock cycle time of 1 ns.
1.6.2 Assume the average CPI found in 1.6.1, but that the compiled program runs on two different processors. If the execution times on the two processors are the same, how much faster is the clock of the processor running compiler A's code versus the clock of the processor running compiler B's code?
1.6.3 A new compiler is developed that uses only 600 million instructions and has an average CPI of 1.1. What is the speed-up of using this new compiler versus using Compiler A or B on the original processor of 1.6.1?
1.6.4 Assume that peak performance is defined as the fastest rate that a computer can execute any instruction sequence. What is the peak performance of P1 and P2 expressed in instructions per second?
1.6.5 If the number of instructions executed in a certain program is divided equally among the classes of instructions in Problem 2.36.4 except for class A, which occurs twice as often as each of the others, how much faster is P2 than P1?
1.6.6 At what frequency does P2 have the same performance as P1 for the instruction mix given in 1.6.5?
Exercise 1.7
1.7.1 What is the geometric mean of the ratios between consecutive generations for both clock rate and power? (The geometric mean is described in Section 1.7.)
Geometric mean clock rate ratio = (1.28 × 1.56 × 2.64 × 3.03 × 10.00 × 1.80 × 0.74)^(1/7) = 2.15
Geometric mean power ratio = (1.24 × 1.20 × 2.06 × 2.88 × 2.59 × 1.37 × 0.92)^(1/7) = 1.62
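The two geometric means can be verified with the standard library. This is a minimal check that uses the ratios exactly as listed above; `statistics.geometric_mean` is available in Python 3.8 and later.

```python
from statistics import geometric_mean

# Generation-to-generation ratios as listed in the solution to 1.7.1
CLOCK_RATIOS = [1.28, 1.56, 2.64, 3.03, 10.00, 1.80, 0.74]
POWER_RATIOS = [1.24, 1.20, 2.06, 2.88, 2.59, 1.37, 0.92]

clock_gm = geometric_mean(CLOCK_RATIOS)  # n-th root of the product of n ratios
power_gm = geometric_mean(POWER_RATIOS)

print(f"clock rate: {clock_gm:.2f}, power: {power_gm:.2f}")
```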
1.7.2 What is the largest relative change in clock rate and power between generations?
Largest clock rate ratio = 2000 MHz/200 MHz = 10
Largest power ratio = 29.1 W/10.1 W = 2.88
1.7.3 How much larger is the clock rate and power of the last generation with respect to the first generation?
Clock rate: (2.667 × 10^9)/(12.5 × 10^6) = 213.4
Power: 95 W/3.3 W = 28.79
1.7.4 Find the average capacitive loads, assuming a negligible static power consumption.
Capacitive load (in farads) = Power/(Voltage^2 × clock rate)
80286: C = 0.0105 × 10^-6
80386: C = 0.01025 × 10^-6
80486: C = 0.00784 × 10^-6
Pentium: C = 0.00612 × 10^-6
Pentium Pro: C = 0.0133 × 10^-6
Pentium 4 Willamette: C = 0.0122 × 10^-6
Pentium 4 Prescott: C = 0.0183 × 10^-6
Core 2: C = 0.0294 × 10^-6
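The capacitive loads follow from rearranging dynamic power = C × V^2 × f. The sketch below is an illustration only: the voltage, clock rate, and power figures are assumed from the exercise's processor table, which is not reproduced in this document, and `capacitive_load` is a hypothetical helper name.

```python
# (voltage in V, clock rate in Hz, power in W); values ASSUMED from the
# exercise's table, which does not appear in this document.
CHIPS = {
    "80286": (5.0, 12.5e6, 3.3),
    "80386": (5.0, 16e6, 4.1),
    "80486": (5.0, 25e6, 4.9),
    "Pentium": (5.0, 66e6, 10.1),
    "Pentium Pro": (3.3, 200e6, 29.1),
    "Pentium 4 Willamette": (1.75, 2000e6, 75.3),
    "Pentium 4 Prescott": (1.25, 3600e6, 103.0),
    "Core 2": (1.1, 2667e6, 95.0),
}


def capacitive_load(voltage: float, clock_hz: float, power_w: float) -> float:
    """C = P / (V^2 x f), neglecting static power."""
    return power_w / (voltage**2 * clock_hz)


loads = {name: capacitive_load(*spec) for name, spec in CHIPS.items()}
for name, c in loads.items():
    print(f"{name}: C = {c * 1e6:.5f} x 10^-6 F")
```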
1.7.5 Find the largest relative change in voltage between generations.
Pentium Pro/Pentium 4 Willamette = 3.3/1.75 = 1.89
1.7.6 Find the geometric mean of the voltage ratios in the generations since the Pentium.
Pentium Pro/Pentium => 3.3/5 = 0.66
Pentium 4 Willamette/Pentium Pro => 1.75/3.3 = 0.53
Pentium 4 Prescott/Pentium 4 Willamette => 1.25/1.75 = 0.71
Core 2/Pentium 4 Prescott => 1.1/1.25 = 0.88
Geometric mean = (0.66 × 0.53 × 0.71 × 0.88)^(1/4) = 0.68
Exercise 1.8
1.8.1 How much has the capacitive load been reduced between versions if the dynamic power has been reduced by 10%?
Power1 = V1^2 × clock rate1 × C1; Power2 = 0.9 × Power1
C2/C1 = (0.9 × 5^2 × 0.5 × 10^9)/(3.3^2 × 1 × 10^9) = 1.03
1.8.2 By how much has the dynamic power been reduced if the capacitive load does not change?
Power2/Power1 = (V2^2 × clock rate2)/(V1^2 × clock rate1) = (3.3^2 × 1 × 10^9)/(5^2 × 0.5 × 10^9)
Power2/Power1 = 0.87 => Reduction of 13%
1.8.3 Assuming that the capacitive load of version 2 is 80% the capacitive load of version 1, find the voltage for version 2 if the dynamic power of version 2 is reduced by 40% from version 1.
Power2 = V2^2 × 1 × 10^9 × 0.8 × C1 = 0.6 × Power1
Power1 = 5^2 × 0.5 × 10^9 × C1
V2^2 × 1 × 10^9 × 0.8 × C1 = 0.6 × 5^2 × 0.5 × 10^9 × C1
V2 = ((0.6 × 5^2 × 0.5 × 10^9)/(1 × 10^9 × 0.8))^(1/2) = 3.06 V
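The three results in 1.8.1 to 1.8.3 all come from the dynamic power relation Power = C × V^2 × clock rate. A minimal sketch using the version 1 (5 V, 0.5 GHz) and version 2 (3.3 V, 1 GHz) figures that appear in the formulas above; the variable names are illustrative.

```python
import math

V1, F1 = 5.0, 0.5e9   # version 1: voltage (V), clock rate (Hz)
V2, F2 = 3.3, 1.0e9   # version 2

# 1.8.1: Power2 = 0.9 x Power1, solve for C2/C1
cap_ratio = 0.9 * V1**2 * F1 / (V2**2 * F2)

# 1.8.2: same capacitive load, so the C terms cancel
power_ratio = (V2**2 * F2) / (V1**2 * F1)

# 1.8.3: C2 = 0.8 x C1 and Power2 = 0.6 x Power1, solve for the new voltage
v2_new = math.sqrt(0.6 * V1**2 * F1 / (0.8 * F2))

print(f"C2/C1 = {cap_ratio:.2f}, P2/P1 = {power_ratio:.2f}, V2 = {v2_new:.2f} V")
```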
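1.8.4 By what factor does the dynamic power scale?
Power_new = 1 × C_old × (V_old × 2^(-1/4))^2 × clock rate_old × 2^(1/2) = Power_old
The voltage term contributes a factor of 2^(-1/2) and the clock rate a factor of 2^(1/2), so the power scales by a factor of 1.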
1.8.5 Find the scaling of the capacitance per unit area.
1/2^(-1/2) = 2^(1/2)
1.8.6 Using data from Exercise 1.7, find the voltage and clock rate of the Core 2 processor for the next process generation.
Voltage = 1.1 × 2^(-1/4) = 0.92 V
Clock rate = 2.667 × 2^(1/2) = 3.77 GHz
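The scaling factors in 1.8.4 to 1.8.6 can be checked numerically. This sketch assumes, as in the solution above, that per generation the voltage scales by 2^(-1/4), the clock rate by 2^(1/2), and the capacitance by 1; the constant names are illustrative.

```python
VOLTAGE_SCALE = 2 ** -0.25   # voltage factor per process generation
CLOCK_SCALE = 2 ** 0.5       # clock rate factor per process generation
CAPACITANCE_SCALE = 1.0      # capacitance assumed unchanged

# 1.8.4: dynamic power = C x V^2 x f, so the factors multiply accordingly
power_factor = CAPACITANCE_SCALE * VOLTAGE_SCALE**2 * CLOCK_SCALE

# 1.8.6: next-generation Core 2, starting from 1.1 V and 2.667 GHz
next_voltage = 1.1 * VOLTAGE_SCALE
next_clock_ghz = 2.667 * CLOCK_SCALE

print(f"power x{power_factor:.2f}, {next_voltage:.2f} V, {next_clock_ghz:.2f} GHz")
```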