CS 3853 Assignment 2 ISA, Computer Arithmetic and Performance Metrics solved

$35.00

Category: You will receive a download link of the .ZIP file upon Payment

Description

Rate this product

1 Instruction Set Architecture (25 points)

1.1 Instruction Encoding (25 points)

In the RISC ISA, every instruction has a binary encoding with fixed length. This encoding allocates fixedlength fields to represent the opcode and operands. For this problem, consider a simplified RISC ISA with
16-bit instructions. The fields for an instruction are defined in the following table,
Opcode: 4 bits Register Operand 1 (dest/src): 3 Bits Register Operand 2 (src): 3 Bits Immd (src): 6 bits
Table 1: 16-Bit Instruction Encoding Rule

In this ISA, each instruction only has two operands, with one operand used as both source and destination.
For example, if the instruction is “add R1, R2”, the actual operation is R1 = R1 + R2. Note that, the
destination operation must be a register, whereas the other operand can be an immediate value or from
register.

The encoding for opcodes and register operands are given in the following table
Opcode Binary Encoding Register Operands Binary Encoding
add 0000 R1 001
sub 0001 R2 010
mul 0010 R3 011
div 0011 R4 100

Table 2: Binary Encoding for opcodes and register operands
If the second operand is not from a register , put the 111 in the register operand 2 (Reg Op2) field. For
an immediate value, simply put its 6-bit binary representation in the “immd” field.
Given the above ISA definition, please write down the binary encoding for the following instructions:
opcode Reg Op1 Reg Op2 Immd
add R3, R4: 0000 011 100 00 0000
add R3, 7:
sub R1, R2:
mul R4, 26:
div R1, R1:
div R4, R2:
1

2 Performance Metrics and Measurement (45 points)

2.1 Speedups (25 points)

Programmers are debating about what is the best way to write a program. The three options are:
1. Create a standard, sequential C program and run it on a uniprocessor workstation
2. Create a standard, sequential C program and compile it using a parallelizing compiler, and run it on
a 4-way shared memory multiprocessor.
3. Hand-parallelize the problem and run it on a 16-processor shared memory multiprocessor

Consider the table:

Approach Time to Write Program Time to Compile Program Time to Execute Program
1 2 hours 1 second 1 hour
2 2 hours 5 minutes 15 minutes
3 4 hours 5 seconds 5 minutes
(a) What is the execution time speedup of approach 2 compared to approach 1?
(b) What is the execution time speedup of approach 3 compared to approach 1?
(c) If programming, compiling, and execution time are all considered, what is the overall speedup of
approach 2 compared to approach 1?
(d) If programming, compiling, and execution time are all considered, what is the overall speedup of
approach 3 compared to approach 1?
(e) If the program is executed for 1000 times with 1000 different data sets, what is the speedup of approach
3 compared to approach 1? Please consider programming, compiling and all executions.
2

2.2 Amdahl’s Law (20 points)

(Adapted from Computer Architecture: A Quantitative Approach 1.17)
Your company has just bought a new Intel Core i5 dual-core processor and you have been tasked with
optimizing your software for this processor. You will run two applications on this dual core one after another,
but the execution time of the two applications are not equal. Before parallelization, the first application
requires 40 minutes, and the other only 10 minutes. Assume that when you parallelize a portion of the
program, the speedup for the portion is 2.
(a) Given that 40% of the first application is parallelizable, how much speedup would you achieve with
that application if run in isolation?
(b) Given that 99% of the second application is parallelizable, how much speedup would you achieve with
that application if run in isolation?
(c) Given that 40% of the first application is parallelizable, how much overall system speedup (when running
both applications one after another) would you observe if you parallelized it?
(d) Given that 99% of the second application is parallelizable, how much overall system speedup (when
running both applications one after another) would you observe if you parallelized it?
3

3 Computer Arithmetic (30 points)

3.1 Two’s Complement (20 points)

Please give the two’s complement encoding for the following numbers using 5 bits
Decimal Two’s Complement Encoding
1 00001
-1 11111
4
7
11
15
-1
-6
-10
-15

3.2 Floating Point encoding (10 points)

Using the 32-bit floating points encoding discussed in class, please write down the encoding for the following
real numbers. Note that all real numbers are given in binary. For the exponent part, please use a bias of
127.
Sign Exponent Fraction
1.11111 × 10111: 0 1000 0110 111 1100 0000 0000 0000 0000
1.010101 × 100110001:
−1.010101 × 10−0110001:
0.010101 × 100011:
4