We typically use **algorithms** to accomplish complex tasks

Although it is common to execute algorithms on a GPU, a hardware implementation is sometimes needed because of power and performance constraints

*RT methodology* is a design process that describes system operation by a sequence of data transfers and manipulations among **registers** 

This methodology supports sequential execution semantics used by microprocessors to execute a program

Consider an algorithm that computes the sum of 4 numbers, divides by 8 and rounds the result to the nearest integer

```
size = 4;
sum = 0;
for i in (0 to size-1) do
    { sum = sum + a(i); }
```



```
q = sum/8;
r = sum rem 8;
if (r > 3)
  { q = q + 1; }
outp = q;
```

Characteristics of an algorithm:

- Algorithms use **variables**, memory locations with a symbolic addresses, to store intermediate results
- Algorithms are executed sequentially and the order of the steps is important

One approach is to convert **sequential execution** into a **structural data flow**, where the sequence is embedded in the 'flow of data'

This is accomplished by mapping an algorithm into a system of *cascading hardware blocks*, where each block represents a statement in the algorithm

For example, the previous algorithm can be **unrolled** into a data flow diagram

```
sum <= 0;
sum0 <= a(0);
sum1 <= sum0 + a(1);
sum2 <= sum1 + a(2);
sum3 <= sum2 + a(3);
q <= "000" & sum3(8 downto 3);
r <= "00000" & sum3(2 downto 0);
outp <= q + 1 when (r > 3) else q;
```

Note that this is very different from the algorithm -- the circuit is strictly combinational with NO memory elements



The structural data flow model can only be applied to small tasks and is not flexible

**Register Transfer Methodology** introduces hardware that *matches* the variable and sequential execution model

- Registers are used to store intermediate data (model symbolic variables)
- A control path (FSM) is used to specify the order of register operations
- A data path is added to implement the operations (FSMD)

The basic action in RT methodology is the *register transfer operation*:

 $r_{\text{dest}} \leftarrow f(r_{\text{src1}}, r_{\text{src2}}, ..., r_{\text{src3}})$ 

The function *f* uses the contents of the source registers, plus external inputs in some cases

Difference between an algorithm and an RT register is the implicit embedding of *clk* 

- At the rising edge of the clock, the outputs of registers  $r_{src1}$ ,  $r_{src2}$  become available
- The outputs drive the inputs of a combinational circuit that represents f()
- At the **next rising edge** of the clock, the result is stored into  $r_{dest}$

## FSMD

The function f() can be any expression that is representable by a combinational circuit

```
r \leftarrow 1

r \leftarrow r

r0 \leftarrow r1

n \leftarrow n-1

y \leftarrow a \oplus b \oplus c \oplus d

s \leftarrow a^2 + b^2
```

Note that we will continue to use the notation *\_reg* and *\_next* for the current output and next input of a register

The notation

$$r_1 \leftarrow r_1 + r_2$$

is translated as

## FSMD

Be sure to study this carefully because it is heavily used in digital systems

 $r \leftarrow r1 + r2$ 



## **Multiple RT operations**

An algorithm consists of many steps and a *destination* register my be loaded with different values over time

## FSMD

Consider the following sequence of operations



Since  $r_1$  is the destination of multiple operations, we need a MUX to route the proper value to its input

An FSM is used to drive the *control signals* so that the sequence of operations are carried out in the order given

(10/30/19)

#### FSMD and ASMD

An extended ASM chart known as **ASMD** (ASM with datapath) chart can be used to represent the FSMD



State transitions and register updates occur at the same time on the rising edge of the clk

DELAYED STORE: The new value of  $r_1$  is only available when the FSM enters the  $s_2$  state

VHDL Essentials V

## FSMD

NOTE: When a register is NOT being updated with a new value, it is assumed that it maintains its current value, i.e.,

 $r_1 \leftarrow r_1$  These actions are NOT shown in the ASMD/state chart

# Conceptual block diagram of an FSMD





HW/SW Codesign

## FSMD Design Examples Consider a repetitive addition multiplier

```
Basic algorithm: 7*5 = 7+7+7+7+7
 if (a_in=0 or b_in=0) then
     { r = 0; }
 else
     {
    a = a_{in};
    n = b_{in};
    r = 0;
    while (n != 0)
        {
        r = r + a;
        n = n - 1;
     }
     return(r);
```

ECE UNM

(10/30/19)

This code is a better match to an ASMD because ASMD does not have a **loop** construct

```
if (a_in = 0 or b_in = 0) then
       { r = 0; }
    else
       {
       a = a_{in};
       n = b_{in};
       r = 0;
  op: r = r + a;
       n = n - 1;
       if (n = 0) then
          { goto stop; }
       else
          { goto op; }
       }
stop: return(r);
```

To implement this in hardware, we must first define the I/O signals

- *a\_in*, *b\_in*: 8-bit unsigned input
- *clk*, *reset*: 1-bit input
- *start*: 1-bit command input
- *r*: 16-bit unsigned output
- ready: 1-bit status output -- asserted when unit has completed and is ready again

The start and ready signals are added to support sequential operation

- When this unit is embedded in a larger design, and the main system wants to perform multiplication
- It checks *ready*
- If '1', it places inputs on *a\_in* and *b\_in* and asserts the *start* signal

#### VHDL Essentials V

## FSMD Design Examples

The ASMD uses *a*, *n* and *r* data registers to emulate the three variables





With the ASMD chart available, we can refine the original block diagram

We first divide the system into a *data path* and a *control path* 

For the control path, the input signals are *start*, *a\_is\_0*, *b\_is\_0* and *count\_0* -- the first is an external signal, the latter three are status signals from the data path

These signals constitute the inputs to the FSM and are used in the *decision boxes* 

The output of the control path are *ready* and control signals that specify the RT operations of the data path

In this example, we use the state register as the output control signals

Visualizing the data path can be accomplished by doing the following:

- List all RT operations
- Group RT operation according to the destination register
- Add combinational circuit/mux
- Add status circuits

For example

- RT operation with the *r* register
  - $r \leftarrow r$  (in the idle state)
  - $r \leftarrow 0$  (in the load and ab0 states)
  - $r \leftarrow r + a$  (in the op state)
- RT operations with the *n* register
  - $n \leftarrow n$  (in the idle state)
  - $n \leftarrow b_{in}$  (in the load and ab0 state)
  - $n \leftarrow n 1$  (in the op state)
- RT operations with the *a* register
  - $a \leftarrow a$  (in the idle and op states)
  - $a \leftarrow a_{in}$  (in the load and ab0 states)

Note that the **default** operations MUST be included to build the proper data path

Let's consider the circuit associated with the *r* register



The three possible sources, 0, r and r+a are selected using a MUX

The select signals are labeled symbolically with the state names The routing is consistent with what is given on the previous slide

We can repeat this process for the other two registers and combine them

The status signals are implemented using three comparators

HW/SW Codesign

#### VHDL Essentials V

ECE 522

# FSMD Design Examples

The entire (un-optimized) control and data path



ECE UNM

(10/30/19)

```
FSMD Design Examples
    architecture two seq arch of seq mult is
       constant WIDTH: integer := 8;
       type state_type is (idle, ab0, load, op);
        signal state_reg, state_next: state_type;
        signal a req, a next: unsigned(WIDTH-1 downto 0);
       signal n_reg, n_next: unsigned(WIDTH-1 downto 0);
       signal r req, r next: unsigned(2*WIDTH-1 downto 0);
       begin
        -- state and data register
       process(clk, reset)
          begin
           if (reset = '1') then
              state reg <= idle;</pre>
              a reg <= (others => '0');
              n reg <= (others => '0');
              r_reg <= (others => '0');
```

# **Two Segment VHDL Descriptions of FSMDs** elsif (clk'event and clk = '1') then state reg <= state next;</pre> a\_reg <= a\_next;</pre> n\_reg <= n\_next;</pre> r reg <= r next;</pre> end if; end process; -- combinational circuit **process**(start, state\_reg, a\_reg, n\_reg, r\_reg, a\_in, b\_in, n\_next) begin state next <= state req;</pre> a next <= a req; n\_next <= n\_reg;</pre> r next <= r req;</pre> ready <='0';

```
Two Segment VHDL Descriptions of FSMDs
            case state_reg is
               when idle =>
                   if (start = '1') then
                      if (a_in = "00000000" or
                            b in = "0000000") then
                         state_next <= ab0;</pre>
                      else
                         state next <= load;</pre>
                      end if;
                  end if;
                  ready <= '1';
               when ab0 =>
                  a_next <= unsigned(a_in);</pre>
                  n_next <= unsigned(b_in);</pre>
                  r next <= (others => '0');
                   state next <= idle;</pre>
```

HW/SW Codesign

```
Two Segment VHDL Descriptions of FSMDs
               when load =>
                   a_next <= unsigned(a_in);</pre>
                   n_next <= unsigned(b_in);</pre>
                   r next <= (others => '0');
                   state next <= op;</pre>
               when op =>
                   n next <= n reg - 1;
                   r_next <= ("0000000" & a_reg) + r_reg;</pre>
                   if (n next = "00000000") then
                      state next <= idle;</pre>
                   end if;
            end case;
        end process;
        r <= std logic vector(r reg);</pre>
     end two_seg_arch;
```