Tuesday, August 9, 2011

VLSi Design Websites



VLSI INTERVIEW QUESTIONS


FPGA: A Field-Programmable Gate Array (FPGA) is a semiconductor device containing programmable logic components called "logic blocks", and programmable interconnects. Logic blocks can be programmed to perform the function of basic logic gates such as AND, and XOR, or more complex combinational functions such as decoders or mathematical functions. For complete details click here.

ASIC: An application-specific integrated circuit (ASIC) is an integrated circuit designed for a particular use, rather than intended for general-purpose use. Processors, RAM, ROM, etc are examples of ASICs.

FPGA vs ASIC

Speed
ASIC rules out FPGA in terms of speed. As ASIC are designed for a specific application they can be optimized to maximum, hence we can have high speed in ASIC designs. ASIC can have hight speed clocks.

Cost
FPGAs are cost effective for small applications. But when it comes to complex and large volume designs (like 32-bit processors) ASIC products are cheaper.

Size/Area
FPGA are contains lots of LUTs, and routing channels which are connected via bit streams(program). As they are made for general purpose and because of re-usability. They are in-general larger designs than corresponding ASIC design. For example, LUT gives you both registered and non-register output, but if we require only non-registered output, then its a waste of having a extra circuitry. In this way ASIC will be smaller in size.

Power
FPGA designs consume more power than ASIC designs. As explained above the unwanted circuitry results wastage of power. FPGA wont allow us to have better power optimization. When it comes to ASIC designs we can optimize them to the fullest.

Time to Market
FPGA designs will till less time, as the design cycle is small when compared to that of ASIC designs. No need of layouts, masks or other back-end processes. Its very simple: Specifications -- HDL + simulations -- Synthesis -- Place and Route (along with static-analysis) -- Dump code onto FPGA and Verify. When it comes to ASIC we have to do floor planning and also advanced verification. The FPGA design flow eliminates the complex and time-consuming floor planning, place and route, timing analysis, and mask / re-spin stages of the project since the design logic is already synthesized to be placed onto an already verified, characterized FPGA device.

fpga_asic_f.JPG.jpg

Type of Design
ASIC can have mixed-signal designs, or only analog designs. But it is not possible to design them using FPGA chips.

Customization
ASIC has the upper hand when comes to the customization. The device can be fully customized as ASICs will be designed according to a given specification. Just imagine implementing a 32-bit processor on a FPGA!

Prototyping
Because of re-usability of FPGAs, they are used as ASIC prototypes. ASIC design HDL code is first dumped onto a FPGA and tested for accurate results. Once the design is error free then it is taken for further steps. Its clear that FPGA may be needed for designing an ASIC.

Non Recurring Engineering/Expenses
NRE refers to the one-time cost of researching, designing, and testing a new product, which is generally associated with ASICs. No such thing is associated with FPGA. Hence FPGA designs are cost effective.

Simpler Design Cycle
Due to software that handles much of the routing, placement, and timing, FPGA designs have smaller designed cycle than ASICs.

More Predictable Project Cycle
Due to elimination of potential re-spins, wafer capacities, etc. FPGA designs have better project cycle.

Tools
Tools which are used for FPGA designs are relatively cheaper than ASIC designs.

Re-Usability
A single FPGA can be used for various applications, by simply reprogramming it (dumping new HDL code). By definition ASIC are application specific cannot be reused.




1. How do you convert a XOR gate into a buffer and a inverter (Use only one XOR gate for each)?
Answer

xor_buf_inv_f.JPG.jpg


2. Implement an 2-input AND gate using a 2x1 mux.

and_2x1_mux_f.JPG.jpg


3. What is a multiplexer?

A multiplexer is a combinational circuit which selects one of many input signals and directs to the only output.

4. What is a ring counter?

A ring counter is a type of counter composed of a circular shift register. The output of the last shift register is fed to the input of the first register. For example, in a 4-register counter, with initial register values of 1100, the repeating pattern is: 1100, 0110, 0011, 1001, 1100, so on.

5. Compare and Contrast Synchronous and Asynchronous reset.

Synchronous reset logic will synthesize to smaller flip-flops, particularly if the reset is gated with the logic generating the d-input. But in such a case, the combinational logic gate count grows, so the overall gate count savings may not be that significant. The clock works as a filter for small reset glitches; however, if these glitches occur near the active clock edge, the Flip-flop could go metastable. In some designs, the reset must be generated by a set of internal conditions. A synchronous reset is recommended for these types of designs because it will filter the logic equation glitches between clock.
Problem with synchronous resets is that the synthesis tool cannot easily distinguish the reset signal from any other data signal. Synchronous resets may need a pulse stretcher to guarantee a reset pulse width wide enough to ensure reset is present during an active edge of the clock, if you have a gated clock to save power, the clock may be disabled coincident with the assertion of reset. Only an asynchronous reset will work in this situation, as the reset might be removed prior to the resumption of the clock. Designs that are pushing the limit for data path timing, can not afford to have added gates and additional net delays in the data path due to logic inserted to handle synchronous resets.

Asynchronous reset: The major problem with asynchronous resets is the reset release, also called reset removal. Using an asynchronous reset, the designer is guaranteed not to have the reset added to the data path. Another advantage favoring asynchronous resets is that the circuit can be reset with or without a clock present. Ensure that the release of the reset can occur within one clock period else if the release of the reset occurred on or near a clock edge then flip-flops may go into metastable state.

6. What is a Johnson counter?

Johnson counter connects the complement of the output of the last shift register to its input and circulates a stream of ones followed by zeros around the ring. For example, in a 4-register counter, the repeating pattern is: 0000, 1000, 1100, 1110, 1111, 0111, 0011, 0001, so on.

7. An assembly line has 3 fail safe sensors and one emergency shutdown switch.The line should keep moving unless any of the following conditions arise:
(1) If the emergency switch is pressed
(2) If the senor1 and sensor2 are activated at the same time.
(3) If sensor 2 and sensor3 are activated at the same time.
(4) If all the sensors are activated at the same time
Suppose a combinational circuit for above case is to be implemented only with NAND Gates. How many minimum number of 2 input NAND gates are required?

Solve it out!

8. In a 4-bit Johnson counter How many unused states are present?

4-bit Johnson counter: 0000, 1000, 1100, 1110, 1111, 0111, 0011, 0001, 0000.
8 unused states are present.

9. Design a 3 input NAND gate using minimum number of 2 input NAND gates.

3nand_2nand_f.JPG.jpg


10. How can you convert a JK flip-flop to a D flip-flop?

Connect the inverted J input to K input.

jk1_f.JPG.jpg


11. What are the differences between a flip-flop and a latch?

Flip-flops are edge-sensitive devices where as latches are level sensitive devices.
Flip-flops are immune to glitches where are latches are sensitive to glitches.
Latches require less number of gates (and hence less power) than flip-flops.
Latches are faster than flip-flops.

12. What is the difference between Mealy and Moore FSM?

Mealy FSM uses only input actions, i.e. output depends on input and state. The use of a Mealy FSM leads often to a reduction of the number of states.
Moore FSM uses only entry actions, i.e. output depends only on the state. The advantage of the Moore model is a simplification of the behavior.

13. What are various types of state encoding techniques? Explain them.

One-Hot encoding: Each state is represented by a bit flip-flop). If there are four states then it requires four bits (four flip-flops) to represent the current state. The valid state values are 1000, 0100, 0010, and 0001. If the value is 0100, then it means second state is the current state.

One-Cold encoding: Same as one-hot encoding except that '0' is the valid value. If there are four states then it requires four bits (four flip-flops) to represent the current state. The valid state values are 0111, 1011, 1101, and 1110.

Binary encoding: Each state is represented by a binary code. A FSM having '2 power N' states requires only N flip-flops.

Gray encoding: Each state is represented by a Gray code. A FSM having '2 power N' states requires only N flip-flops.

14. Define Clock Skew , Negative Clock Skew, Positive Clock Skew.

Clock skew is a phenomenon in synchronous circuits in which the clock signal (sent from the clock circuit) arrives at different components at different times. This can be caused by many different things, such as wire-interconnect length, temperature variations, variation in intermediate devices, capacitive coupling, material imperfections, and differences in input capacitance on the clock inputs of devices using the clock.
There are two types of clock skew: negative skew and positive skew. Positive skew occurs when the clock reaches the receiving register later than it reaches the register sending data to the receiving register. Negative skew is the opposite: the receiving register gets the clock earlier than the sending register.

15. Give the transistor level circuit of a CMOS NAND gate.

nand_f.JPG.jpg


16. Design a 4-bit comparator circuit.



17. Design a Transmission Gate based XOR. Now, how do you convert it to XNOR (without inverting the output)?



18. Define Metastability.

If there are setup and hold time violations in any sequential circuit, it enters a state where its output is unpredictable, this state is known as metastable state or quasi stable state, at the end of metastable state, the flip-flop settles down to either logic high or logic low. This whole process is known as metastability.

19. Compare and contrast between 1's complement and 2's complement notation.

20. Give the transistor level circuit of CMOS, nMOS, pMOS, and TTL inverter gate.

not_f.JPG.jpg


21. What are set up time and hold time constraints?

Set up time is the amount of time before the clock edge that the input signal needs to be stable to guarantee it is accepted properly on the clock edge.
Hold time is the amount of time after the clock edge that same input signal has to be held before changing it to make sure it is sensed properly at the clock edge.
Whenever there are setup and hold time violations in any flip-flop, it enters a state where its output is unpredictable, which is known as as metastable state or quasi stable state. At the end of metastable state, the flip-flop settles down to either logic high or logic low. This whole process is known as metastability.

22. Give a circuit to divide frequency of clock cycle by two.

divide_2_f.JPG.jpg


23. Design a divide-by-3 sequential circuit with 50% duty circle.



24. Explain different types of adder circuits.



25. Give two ways of converting a two input NAND gate to an inverter.

nand_not2_f.JPG.jpg


26. Draw a Transmission Gate-based D-Latch.



27. Design a FSM which detects the sequence 10101 from a serial line without overlapping.



28. Design a FSM which detects the sequence 10101 from a serial line with overlapping.



29. Give the design of 8x1 multiplexer using 2x1 multiplexers.



30. Design a counter which counts from 1 to 10 ( Resets to 1, after 10 ).



31. Design 2 input AND, OR, and EXOR gates using 2 input NAND gate.

Others_NAND_f.JPG.jpg


32. Design a circuit which doubles the frequency of a given input clock signal.

2xfreq_f.JPG.jpg


33. Implement a D-latch using 2x1 multiplexer(s).

d_latch_2x1Mux_f.JPG.jpg


34. Give the excitation table of a JK flip-flop.



35. Give the Binary, Hexadecimal, BCD, and Excess-3 code for decimal 14.

14:
Binary: 1110
Hexadecimal: E
BCD: 0001 0100
Excess-3: 10001

36. What is race condition?



37. Give 1's and 2's complement of 19.

19: 10011
1's complement: 01100
2's complement: 01101

38. Design a 3:6 decoder.



39. If A*B=C and C*A=B then, what is the Boolean operator * ?

* is Exclusive-OR.

40. Design a 3 bit Gray Counter.



41. Expand the following: PLA, PAL, CPLD, FPGA.

PLA - Programmable Logic Array
PAL - Programmable Array Logic
CPLD - Complex Programmable Logic Device
FPGA - Field-Programmable Gate Array

42. Implement the functions: X = A'BC + ABC + A'B'C' and Y = ABC + AB'C using a PLA.

pla_f.JPG.jpg


43. What are PLA and PAL? Give the differences between them.

Programmable Logic Array is a programmable device used to implement combinational logic circuits. The PLA has a set of programmable AND planes, which link to a set of programmable OR planes, which can then be conditionally complemented to produce an output.
PAL is programmable array logic, like PLA, it also has a wide, programmable AND plane. Unlike a PLA, the OR plane is fixed, limiting the number of terms that can be ORed together.
Due to fixed OR plane PAL allows extra space, which is used for other basic logic devices, such as multiplexers, exclusive-ORs, and latches. Most importantly, clocked elements, typically flip-flops, could be included in PALs. PALs are also extremely fast.

44. What is LUT?

LUT - Look-Up Table. An n-bit look-up table can be implemented with a multiplexer whose select lines are the inputs of the LUT and whose inputs are constants. An n-bit LUT can encode any n-input Boolean function by modeling such functions as truth tables. This is an efficient way of encoding Boolean logic functions, and LUTs with 4-6 bits of input are in fact the key component of modern FPGAs.

45. What is the significance of FPGAs in modern day electronics? (Applications of FPGA.)

  • ASIC prototyping: Due to high cost of ASIC chips, the logic of the application is first verified by dumping HDL code in a FPGA. This helps for faster and cheaper testing. Once the logic is verified then they are made into ASICs.
  • Very useful in applications that can make use of the massive parallelism offered by their architecture. Example: code breaking, in particular brute-force attack, of cryptographic algorithms.
  • FPGAs are sued for computational kernels such as FFT or Convolution instead of a microprocessor.
  • Applications include digital signal processing, software-defined radio, aerospace and defense systems, medical imaging, computer vision, speech recognition, cryptography, bio-informatics, computer hardware emulation and a growing range of other areas.


46. What are the differences between CPLD and FPGA.



47. Compare and contrast FPGA and ASIC digital designing.


48. Give True or False.
(a) CPLD consumes less power per gate when compared to FPGA.
(b) CPLD has more complexity than FPGA
(c) FPGA design is slower than corresponding ASIC design.
(d) FPGA can be used to verify the design before making a ASIC.
(e) PALs have programmable OR plane.
(f) FPGA designs are cheaper than corresponding ASIC, irrespective of design complexity.

(a) False
(b) False
(c) True
(d) True
(e) False
(f) False

49. Arrange the following in the increasing order of their complexity: FPGA,PLA,CPLD,PAL.

Increasing order of complexity: PLA, PAL, CPLD, FPGA.

50. Give the FPGA digital design cycle.

fpga_flow_f.JPG.jpg


51. What is DeMorgan's theorem?

For N variables, DeMorgan’s theorems are expressed in the following formulas:
(ABC..N)' = A' + B' + C' + ... + N' -- The complement of the product is equivalent to the sum of the complements.
(A + B + C + ... + N)' = A'B'C'...N' -- The complement of the sum is equivalent to the product of the complements.
This relationship so induced is called DeMorgan's duality.

52. F'(A, B, C, D) = C'D + ABC' + ABCD + D. Express F in Product of Sum form.

Complementing both sides and applying DeMorgan's Theorem:
F(A, B, C, D) = (C + D')(A' + B' + C)(A' + B' + C' + D')(D')

53. How many squares/cells will be present in the k-map of F(A, B, C)?

F(A, B, C) has three variables/inputs.
Therefore, number of squares/cells in k-map of F = 2(Number of variables) = 23 = 8.

54. Simplify F(A, B, C, D) = S ( 0, 1, 4, 5, 7, 8, 9, 12, 13)

The four variable k-map of the given expression is:

boolean_example2_f.jpg

The grouping is also shown in the diagram. Hence we get,
F(A, B, C, D) = C' + A'BD

55. Simplify F(A, B, C) = S (0, 2, 4, 5, 6) into Product of Sums.

The three variable k-map of the given expression is:

Boolean_example1_2_f.jpg

The 0's are grouped to get the F'.
F' = A'C + BC

Complementing both sides and using DeMorgan's theorem we get F,
F = (A + C')(B' + C')

56. The simplified expression obtained by using k-map method is unique. True or False. Explain your answer.

False. The simplest form obtained is not necessarily unique as grouping can be made in different ways.

57. Give the characteristic tables of RS, JK, D and T flip-flops.

RS flip-flop.
S
R
Q(t+1)
0
0
Q(t)
0
1
0
1
0
1
1
1
?

JK flip-flop
J
K
Q(t+1)
0
0
Q(t)
0
1
0
1
0
1
1
1
Q'(t)

D flip-flop
D
Q(t+1)
0
0
1
1

T flip-flop
T
Q(t+1)
0
Q(t)
1
Q'(t)


58. Give excitation tables of RS, JK, D and T flip-flops.

RS flip-flop.
Q(t)
Q(t+1)
S
R
0
0
0
X
0
1
1
0
1
0
0
1
1
1
X
0

JK flip-flop
Q(t)
Q(t+1)
J
K
0
0
0
X
0
1
1
X
1
0
X
1
1
1
X
0

D flip-flop
Q(t)
Q(t+1)
D
0
0
0
0
1
1
1
0
0
1
1
1

T flip-flop
Q(t)
Q(t+1)
T
0
0
0
0
1
1
1
0
1
1
1
0


59. Design a BCD counter with JK flip-flops



60. Design a counter with the following binary sequence 0, 1, 9, 3, 2, 8, 4 and repeat. Use T flip-flops.
1. What are the differences between a flip-flop and a latch?
Answer

Flip-flops are edge-sensitive devices where as latches are level sensitive devices.
Flip-flops are immune to glitches where are latches are sensitive to glitches.
Latches require less number of gates (and hence less power) than flip-flops.
Latches are faster than flip-flops.

2. What is the difference between Mealy and Moore FSM?

Mealy FSM uses only input actions, i.e. output depends on input and state. The use of a Mealy FSM leads often to a reduction of the number of states.
Moore FSM uses only entry actions, i.e. output depends only on the state. The advantage of the Moore model is a simplification of the behavior.

3. What are various types of state encoding techniques? Explain them.

One-Hot encoding: Each state is represented by a bit flip-flop). If there are four states then it requires four bits (four flip-flops) to represent the current state. The valid state values are 1000, 0100, 0010, and 0001. If the value is 0100, then it means second state is the current state.

One-Cold encoding: Same as one-hot encoding except that '0' is the valid value. If there are four states then it requires four bits (four flip-flops) to represent the current state. The valid state values are 0111, 1011, 1101, and 1110.

Binary encoding: Each state is represented by a binary code. A FSM having '2 power N' states requires only N flip-flops.

Gray encoding: Each state is represented by a Gray code. A FSM having '2 power N' states requires only N flip-flops.

4. Define Clock Skew , Negative Clock Skew, Positive Clock Skew.

Clock skew is a phenomenon in synchronous circuits in which the clock signal (sent from the clock circuit) arrives at different components at different times. This can be caused by many different things, such as wire-interconnect length, temperature variations, variation in intermediate devices, capacitive coupling, material imperfections, and differences in input capacitance on the clock inputs of devices using the clock.
There are two types of clock skew: negative skew and positive skew. Positive skew occurs when the clock reaches the receiving register later than it reaches the register sending data to the receiving register. Negative skew is the opposite: the receiving register gets the clock earlier than the sending register.

5. Give the transistor level circuit of a CMOS NAND gate.

nand_f.JPG.jpg


6. Design a 4-bit comparator circuit.



7. Design a Transmission Gate based XOR. Now, how do you convert it to XNOR (without inverting the output)?



8. Define Metastability.

If there are setup and hold time violations in any sequential circuit, it enters a state where its output is unpredictable, this state is known as metastable state or quasi stable state, at the end of metastable state, the flip-flop settles down to either logic high or logic low. This whole process is known as metastability.

9. Compare and contrast between 1's complement and 2's complement notation.

The only advantage of 1's complement is that it can be calculated easily, just by changing 0's into 1's and 1's into 0's. The 2's complement is calculated in two ways, (i) add 1 to the 1's complement of the number, and (ii) leave all the leading 0s in the least significant positions and keep first 1 unchanged, and then change 0's into 1's and 1's into 0's.

The advantages of 2's complement over 1's complement are:
(i) For subtraction with complements, 2's complement requires only one addition operation, where as for 1's complement requires two addition operations if there is an end carry.
(ii) 1's complement has two arithmetic zeros, all 0's and all 1's.

10. Give the transistor level circuit of CMOS, nMOS, pMOS, and TTL inverter gate.

not_f.JPG.jpg


1. What are set up time and hold time constraints?
Answer

Set up time is the amount of time before the clock edge that the input signal needs to be stable to guarantee it is accepted properly on the clock edge.
Hold time is the amount of time after the clock edge that same input signal has to be held before changing it to make sure it is sensed properly at the clock edge.
Whenever there are setup and hold time violations in any flip-flop, it enters a state where its output is unpredictable, which is known as as metastable state or quasi stable state. At the end of metastable state, the flip-flop settles down to either logic high or logic low. This whole process is known as metastability.

2. Give a circuit to divide frequency of clock cycle by two.

divide_2_f.JPG.jpg

1. What is DeMorgan's theorem?
Answer

2. F'(A, B, C, D) = C'D + ABC' + ABCD + D. Express F in Product of Sum form.

3. How many squares/cells will be present in the k-map of F(A, B, C)?

4. Simplify F(A, B, C, D) = Σ ( 0, 1, 4, 5, 7, 8, 9, 12, 13)

The four variable k-map of the given expression is:

boolean_example2_f.jpg

The grouping is also shown in the diagram. Hence we get,
F(A, B, C, D) = C' + A'BD

5. Simplify F(A, B, C) = Σ (0, 2, 4, 5, 6) into Product of Sums.

The three variable k-map of the given expression is:

Boolean_example1_2_f.jpg

The 0's are grouped to get the F'.
F' = A'C + BC

Complementing both sides and using DeMorgan's theorem we get F,
F = (A + C')(B' + C')

6. The simplified expression obtained by using k-map method is unique. True or False. Explain your answer.

False. The simplest form obtained is not necessarily unique as grouping can be made in different ways.

7. Give the characteristic tables of RS, JK, D and T flip-flops.

RS flip-flop.
S
R
Q(t+1)
0
0
Q(t)
0
1
0
1
0
1
1
1
?

JK flip-flop
J
K
Q(t+1)
0
0
Q(t)
0
1
0
1
0
1
1
1
Q'(t)

D flip-flop
D
Q(t+1)
0
0
1
1

T flip-flop
T
Q(t+1)
0
Q(t)
1
Q'(t)


8. Give excitation tables of RS, JK, D and T flip-flops.

RS flip-flop.
Q(t)
Q(t+1)
S
R
0
0
0
X
0
1
1
0
1
0
0
1
1
1
X
0

JK flip-flop
Q(t)
Q(t+1)
J
K
0
0
0
X
0
1
1
X
1
0
X
1
1
1
X
0

D flip-flop
Q(t)
Q(t+1)
D
0
0
0
0
1
1
1
0
0
1
1
1

T flip-flop
Q(t)
Q(t+1)
T
0
0
0
0
1
1
1
0
1
1
1
0


9. Design a BCD counter with JK flip-flops



10. Design a counter with the following binary sequence 0, 1, 9, 3, 2, 8, 4 and repeat. Use T flip-flops.

>> Introduction
>> Binary Number System
>> Complements
>> 2's Complement vs 1's Complement
>> Binary Logic
>> Logic Gates


Introduction

The fundamental idea of digital systems is to represent data in discrete form (Binary: ones and zeros) and processing that information. Digital systems have led to many scientific and technological advancements. Calculators, computers, are the examples of digital systems, which are widely used for commercial and business data processing. The most important property of a digital system is its ability to follow a sequence of steps to perform a task called program, which does the required data processing. The following diagram shows how a typical digital system will look like.
dld1_f.JPG.jpg

Representing the data in ones and zeros, i.e. in binary system is the root of the digital systems. All the digital system store data in binary format. Hence it is very important to know about binary number system. Which is explained below.

Binary Number System

The binary number system, or base-2 number system, is a number system that represents numeric values using two symbols, usually 0 and 1. The base-2 system is a positional notation with a radix of 2. Owing to its straightforward implementation in digital electronic circuitry using logic gates, the binary system is used internally by all computers. Suppose we need to represent 14 in binary number system.
14 - 01110 - 0x24 + 1x23 + 1x22 + 1x21 + 0x20
similarly,
23 - 10111 - 1x24 + 0x23 + 1x22 + 1x21 + 1x20

Complements

In digital systems, complements are used to simplify the subtraction operation. There are two types of complements they are:
The r's Complement
The (r-1)'s Complement

Given:
  • N a positive number.
  • r base of the number system.
  • n number of digits.
  • m number of digits in fraction part.
The r's complement of N is defined as rn - N for N not equal to 0 and 0 for N=0.

The (r-1)'s Complement of N is defined as rn - rm - N.

Subtraction with r's complement:

The subtraction of two positive numbers (M-N), both are of base r. It is done as follows:
1. Add M to the r's complement of N.
2. Check for an end carry:
(a) If an end carry occurs, ignore it.
(b) If there is no end carry, the negative of the r's complement of the result obtained in step-1 is the required value.

Subtraction with (r-1)'s complement:

The subtraction of two positive numbers (M-N), both are of base r. It is done as follows:
1. Add M to the (r-1)'s complement of N.
2. Check for an end carry:
(a) If an end carry occurs, add 1 to the result obtained in step-1.
(b) If there is no end carry, the negative of the (r-1)'s complement of the result obtained in step-1 is the required value.

For a binary number system the complements are: 2's complement and 1's complement.

2's Complement vs 1's Complement

The only advantage of 1's complement is that it can be calculated easily, just by changing 0s into 1s and 1s into 0s. The 2's complement is calculated in two ways, (i) add 1 to the 1's complement of the number, and (ii) leave all the leading 0s in the least significant positions and keep first 1 unchanged, and then change 0s into 1s and 1s into 0s.

The advantages of 2's complement over 1's complement are:
(i) For subtraction with complements, 2's complement requires only one addition operation, where as for 1's complement requires two addition operations if there is an end carry.
(ii) 1's complement has two arithmetic zeros, all 0s and all 1s.

Binary Logic

Binary logic contains only two discrete values like, 0 or 1, true or false, yes or no, etc. Binary logic is similar to Boolean algebra. It is also called as boolean logic. In boolean algebra there are three basic operations: AND, OR, and NOT.
AND: Given two inputs x, y the expression x.y or simply xy represents "x AND y" and equals to 1 if both x and y are 1, otherwise 0.
OR: Given two inputs x, y the expression x+y represents "x OR y" and equals to 1 if at least one of x and y is 1, otherwise 0.
NOT: Given x, the expression x' represents NOT(x) equals to 1 if x is 0, otherwise 0. NOT(x) is x complement.

Logic Gates

A logic gate performs a logical operation on one or more logic inputs and produces a single logic output. Because the output is also a logic-level value, an output of one logic gate can connect to the input of one or more other logic gates. The logic gate use binary logic or boolean logic. AND, OR, and NOT are the three basic logic gates of digital systems. Their symbols are shown below.

dld2_f.JPG.jpg

AND and OR gates can have more than two inputs. The above diagram shows 2 input AND and OR gates. The truth tables of AND, OR, and NOT logic gates are as follows.

tt1_f.JPG.jpg

The k-map Method

The "Karnaugh Map Method", also known as k-map method, is popularly used to simplify Boolean expressions. The map method is first proposed by Veitch and then modified by Karnaugh, hence it is also known as "Veitch Diagram". The map is a diagram made up of squares (equal to 2 power number of inputs/variables). Each square represents a minterm, hence any Boolean expression can be represented graphically using a k-map.
Boolean1_f.jpg

The above diagram shows two (I), three (II) and four (III) variable k-maps. The number of squares is equal 2 power number of variables. Two adjacent squares will differ only by one variable. The numbers inside the squares are shown for understanding purpose only. The number shown corresponds to a minterm in the the Boolean expression.

Simplification using k-map:
  • Obtain the logic expression in canonical form.
  • Identify all the minterms that produce an output of logic level 1 and place 1 in appropriate k-map cell/square. All others cells must contain a 0.
  • Every square containing 1 must be considered at least once.
  • A square containing 1 can be included in as many groups as desired.
  • There can be isolated 1's, i.e. which cannot be included in any group.
  • A group must be as large as possible. The number of squares in a group must be a power of 2 i.e. 2, 4, 8, ... so on.
  • The map is considered to be folded or spherical, therefore squares at the end of a row or column are treated as adjacent squares.
The simplest Boolean expression contains minimum number of literals in any one in sum of products or products of sum. The simplest form obtained is not necessarily unique as grouping can be made in different ways.

Valid Groups

The following diagram illustrates the valid grouping k-map method.

Boolean_grouping_f.jpg

Simplification: Product of Sums

The above method gives a simplified expression in Sum of Products form. With slight modification to the above method, we can get the simplified expression in Product of Sums form. Group adjacent 0's instead of 1's, which gives us the complement of the function i.e. F'. The complement of obtained F' gives us the required expression F, which is done using the DeMorgan's theorem. See Example-2 below for better understanding.

Examples:

1. Simplify F(A, B, C) = Σ (0, 2, 4, 5, 6).

The three variable k-map of the given expression is:

Boolean_example1_f.jpg

The grouping is also shown in the diagram. Hence we get,
F(A, B, C) = AB' + C'


2. Simplify F(A, B, C) = Σ (0, 2, 4, 5, 6) into Product of Sums.

The three variable k-map of the given expression is:

Boolean_example1_2_f.jpg

The 0's are grouped to get the F'.
F' = A'C + BC

Complementing both sides and using DeMorgan's theorem we get F,
F = (A + C')(B' + C')


3. Simplify F(A, B, C, D) = Σ( 0, 1, 4, 5, 7, 8, 9, 12, 13)

The four variable k-map of the given expression is:

boolean_example2_f.jpg

The grouping is also shown in the diagram. Hence we get,
F(A, B, C, D) = C' + A'BD


Definition

A machine consisting of a set of states, a start state, an input, and a transition function that maps input and current states to a next state. Machine begins in the start state with an input. It changes to new states depending on the transition function. The transition function depends on current states and inputs. The output of the machine depends on input and/or current state.

There are two types of FSMs which are popularly used in the digital design. They are
  • Moore machine
  • Mealy machine
Moore machine

In Moore machine the output depends only on current state.The advantage of the Moore model is a simplification of the behavior.

Mealy machine

In Mealy machine the output depend on both current state and input.The advantage of the Mealy model is that it may lead to reduction of the number of states.

In both models the next state depends on current state and input. Some times designers use mixed models. States will be encoded for representing a particular state.

Representation of a FSM

A FSM can be represented in two forms:
  • Graph Notation
  • State Transition Table
Graph Notation
  • In this representation every state is a node. A node is represented using a circular shape and the state code is written within the circular shape.
  • The state transitions are represented by an edge with arrow head. The tail of the edge shows current state and arrow points to next state, depending on the input and current state. The state transition condition is written on the edge.
  • The initial/start state is sometime represented by a double lined circular shape, or a different colour shade.
The following image shows the way of graph notation of FSM. The codes 00 and 11 are the state codes. 00 is the value of initial/starting/reset state. The machine will start with 00 state. If the machine is reseted then the next state will be 00 state.

fsm1_f.JPG.jpg

State Transition Table

The State Transition Table has the following columns:
  • Current State: Contains current state code
  • Input: Input values of the FSM
  • Next State: Contains the next state code
  • Output: Expected output values
An example of state transition table is shown below.

fsm2_f.JPG.jpg

Mealy FSM

In Mealy machine the output depend on both current state and input.The advantage of the Mealy model is that it may lead to reduction of the number of states.

fsm1_f.JPG_1.jpg

The block diagram of the Mealy FSM is shown above. The output function depends on input also. The current state function updates the current state register (number of bits depends on state encoding used).

mealy_f.JPG.jpg

The above FSM shows an example of a Mealy FSM, the text on the arrow lines show (condition)/(output). 'a' is the input and 'x' is the output.

Moore FSM

In Moore machine the output depends only on current state.The advantage of the Moore model is a simplification of the behavior.

fsm2_f.JPG_1.jpg

The above figure shows the block diagram of a Moore FSM. The output function doesn't depend on input. The current state function updates the current state register.

moore_f.JPG.jpg

The above FSM shows an example of a Moore FSM. 'a' is the input. Inside every circle the text is (State code)/(output). Here there is only one output, in state '11' the output is '1'.

In both the FSMs the reset signal will change the contents of current state register to initial/reset state.

State Encoding

In a FSM design each state is represented by a binary code, which are used to identify the state of the machine. These codes are the possible values of the state register. The process of assigning the binary codes to each state is known as state encoding.
The choice of encoding plays a key role in the FSM design. It influences the complexity, size, power consumption, speed of the design. If the encoding is such that the transitions of flip-flops (of state register) are minimized then the power will be saved. The timing of the machine are often affected by the choice of encoding.
The choice of encoding depends on the type of technology used like ASIC, FPGA, CPLD etc. and also the design specifications.

State encoding techniques

The following are the most common state encoding techniques used.
  • Binary encoding
  • One-hot encoding
  • Gray encoding
In the following explanation assume that there are N number of states in the FSM.
Binary encoding
The code of a state is simply a binary number. The number of bits is equal to log2(N) rounded to next natural number. Suppose N = 6, then the number of bits are 3, and the state codes are:
S0 - 000
S1 - 001
S2 - 010
S3 - 011
S4 - 100
S5 - 101

One-hot encoding
In one-hot encoding only one bit of the state vector is asserted for any given state. All other state bits are zero. Thus if there are N states then N state flip-flops are required. As only one bit remains logic high and rest are logic low, it is called as One-hot encoding. If N = 5, then the number of bits (flip-flops) required are 5, and the state codes are:
S0 - 00001
S1 - 00010
S2 - 00100
S3 - 01000
S4 - 10000

To know more about one-hot encoding click here.

Gray encoding
Gray encoding uses the Gray codes, also known as reflected binary codes, to represent states, where two successive codes differ in only one digit. This helps is reducing the number of transition of the flip-flops outputs. The number of bits is equal to log2(N) rounded to next natural number. If N = 4, then 2 flip-flops are required and the state codes are:
S0 - 00
S1 - 01
S2 - 11
S3 - 10

Designing a FSM is the most common and challenging task for every digital logic designer. One of the key factors for optimizing a FSM design is the choice of state coding, which influences the complexity of the logic functions, the hardware costs of the circuits, timing issues, power usage, etc. There are several options like binary encoding, gray encoding, one-hot encoding, etc. The choice of the designer depends on the factors like technology, design specifications, etc. 









VERILOG STUFF










>> Introduction
>> The VLSI Design Flow
>> Importance of HDLs
>> Verilog HDL
>> Why Verilog ?
>> Digital Design Methods


Introduction

With the advent of VLSI technology and increased usage of digital circuits, designers has to design single chips with millions of transistors. It became almost impossible to verify these circuits of high complexity on breadboard. Hence Computer-aided techniques became critical for verification and design of VLSI digital circuits.As designs got larger and more complex, logic simulation assumed an important role in the design process. Designers could iron
out functional bugs in the architecture before the chip was designed further. All these factors which led to the evolution of Computer-Aided Digital Design, intern led to the emergence of Hardware Description Languages.

Verilog HDL and VHDL are the popular HDLs.Today, Verilog HDL is an accepted IEEE standard. In 1995, the original standard IEEE 1364-1995 was approved. IEEE 1364-2001 is the latest Verilog HDL standard that made significant improvements to the original standard.



The VLSI Design Flow

The VLSI IC circuits design flow is shown in the figure below. The various level of design are numbered and the gray coloured blocks show processes in the design flow.
intro_verilog.JPG.jpg
Specifications comes first, they describe abstractly the functionality, interface, and the architecture of the digital IC circuit to be designed.
  • Behavioral description is then created to analyze the design in terms of functionality, performance, compliance to given standards, and other specifications.
  • RTL description is done using HDLs. This RTL description is simulated to test functionality. From here onwards we need the help of EDA tools.
  • RTL description is then converted to a gate-level net list using logic synthesis tools. A gate-level netlist is a description of the circuit in terms of gates and connections between them, which are made in such a way that they meet the timing, power and area specifications.
  • Finally a physical layout is made, which will be verified and then sent to fabrication.

Importance of HDLs
  • RTL descriptions, independent of specific fabrication technology can be made an verified.
  • functional verification of the design can be done early in the design cycle.
  • Better representation of design due to simplicity of HDLs when compared to gate-level schematics.
  • Modification and optimization of the design became easy with HDLs.
  • Cuts down design cycle time significantly because the chance of a functional bug at a later stage in the design-flow is minimal.
Verilog HDL

Verilog HDL is one of the most used HDLs. It can be used to describe designs at four levels of abstraction:
  1. Algorithmic level.
  2. Register transfer level (RTL).
  3. Gate level.
  4. Switch level (the switches are MOS transistors inside gates).

Why Verilog ?
  • Easy to learn and easy to use, due to its similarity in syntax to that of the C programming language.
  • Different levels of abstraction can be mixed in the same design.
  • Availability of Verilog HDL libraries for post-logic synthesis simulation.
  • Most of the synthesis tools support Verilog HDL.
  • The Programming Language Interface (PLI) is a powerful feature that allows the user to write custom C code to interact with the internal data structures of Verilog. Designers can customize a Verilog HDL simulator to their needs with the PLI.

Digital design methods

Digital design methods are of two types:
  1. Top-down design method : In this design method we first define the top-level block and then we build necessary sub-blocks, which are required to build the top-level block. Then the sub-blocks are divided further into smaller-blocks, and so on. The bottom level blocks are called as leaf cells. By saying bottom level it means that the leaf cell cannot be divided further.
  2. Bottom-up design method : In this design method we first find the bottom leaf cells, and then start building upper sub-blocks and building so on, we reach the top-level block of the design.
In general a combination of both types is used. These types of design methods helps the design architects, logics designers, and circuit designers. Design architects gives specifications to the logic designers, who follow one of the design methods or both. They identify the leaf cells. Circuit designers design those leaf cells, and they try to optimize leaf cells in terms of power, area, and speed. Hence all the design goes parallel and helps finishing the job faster.

Operators

There are three types of operators: unary, binary, and ternary, which have one, two, and three operands respectively.

Unary : Single operand, which precede the operand.
Ex: x = ~y
~ is a unary operator
y is the operand

binary : Comes between two operands.
Ex: x = y || z
|| is a binary operator
y and z are the operands

ternary : Ternary operators have two separate operators that separate three operands.
Ex: p = x ? y : z
? : is a ternary operator
x, y, and z are the operands

List of operators is given here.

Comments

Verilog HDL also have two types of commenting, similar to that of C programming language. // is used for single line commenting and '/*' and '*/' are used for commenting multiple lines which start with /* and end with */.
EX: // single line comment
/* Multiple line
commenting */
/* This is a // LEGAL comment */
/* This is an /* ILLEGAL */ comment */

Whitespace
  • - \b - backspace
  • - \t - tab space
  • - \n - new line
In verilog Whitespace is ignored except when it separates tokens. Whitespace is not ignored in strings. Whitesapces are generally used in writing test benches.

Strings

A string in verilog is same as that of C programming language. It is a sequence of characters enclosed in double quotes. String are treated as sequence of one byte ASCII values, hence they can be of one line only, they cannot be of multiple lines.
Ex: " This is a string "
" This is not treated as
string in verilog HDL "

Identifiers

Identifiers are user-defined words for variables, function names, module names, block names and instance names.Identifiers begin with a letter or underscore and can include any number of letters, digits and underscores. It is not legal to start identifiers with number or the dollar($) symbol in Verilog HDL. Identifiers in Verilog are case-sensitive.

Keywords

Keywords are special words reserved to define the language constructs. In verilog all keywords are in lowercase only. A list of all keywords in Verilog is given below:

always
and
assign
attribute
begin
buf
bufif0
bufif1
case
casex
casez
cmos
deassign
default
defparam
disable
edge
else
end
endattribute
endcase
endfunction
endmodule
endprimitive
endspecify
endtable
endtask
event
for
force
forever
fork
function
highz0
highz1
if
ifnone
initial
inout
input
integer
join
medium
module
large
macromodule
nand
negedge
nmos
nor
not
notif0
notif1
or
output
parameter
pmos
posedge
primitive
pull0
pull1
pulldown
pullup
rcmos
real
realtime
reg
release
repeat
rnmos
rpmos
rtran
rtranif0
rtranif1
scalared
signed
small
specify
specparam
strength
strong0
strong1
supply0
supply1
table
task
time
tran
tranif0
tranif1
tri
tri0
tri1
triand
trior
trireg
unsigned
vectored
wait
wand
weak0
weak1
while
wire
wor
xnor
xor


Verilog keywords also includes compiler directives, system tasks, and functions. Most of the keywords will be explained in the later sections.

Number Specification

Sized Number Specification

Representation: [size]'[base][number]
  • [size] is written only in decimal and specifies the number of bits.
  • [base] could be 'd' or 'D' for decimal, 'h' or 'H' for hexadecimal, 'b' or 'B' for binary, and 'o' or 'O' for octal.
  • [number] The number is specified as consecutive digits. Uppercase letters are legal for number specification (in case of hexadecimal numbers).
Ex: 4'b1111 : 4-bit binary number
16'h1A2F : 16-bit hexadecimal number
32'd1 : 32-bit decimal number
8'o3 : 8-bit octal number

Unsized Number Specification

By default numbers that are specified without a [base] specification are decimal numbers. Numbers that are written without a [size] specification have a default number of bits that is simulator and/or machine specific (generally 32).

Ex: 123 : This is a decimal number
'hc3 : This is a hexadecimal number
Number of bits depends on simulator/machine, generally 32.

x or z values
x - Unknown value.
z - High impedance value
An x or z sets four bits for a number in the hexadecimal base, three bits for a number in the octal base, and one bit for a number in the binary base.

Note: If the most significant bit of a number is 0, x, or z, the number is automatically extended to fill the most significant bits, respectively, with 0, x, or z. This makes it easy to assign x or z to whole vector. If the most significant digit is 1, then it is also zero extended.

Negative Numbers

Representation: -[size]'[base][number]

Ex: -8'd9 : 8-bit negative number stored as 2's complement of 8
-8'sd3 : Used for performing signed integer math
4'd-2 : Illegal

Underscore(_) and question(?) mark
An underscore, "_" is allowed to use anywhere in a number except in the beginning. It is used only to improve readability of numbers and are ignored by Verilog. A question mark "?" is the alternative for z w.r.t. numbers
Ex: 8'b1100_1101 : Underscore improves readability
4'b1??1 : same as 4'b1zz1


Value Set

The Verilog HDL value set consists of four basic values:
  • 0 - represents a logic zero, or a false condition.
  • 1 - represents a logic one, or a true condition.
  • x - represents an unknown logic value.
  • z - represents a high-impedance state.
The values 0 and 1 are logical complements of one another. Almost all of the data types in the Verilog HDL store all four basic values.

Nets
Nets are used to make connections between hardware elements. Nets simply reflect the value at one end(head) to the other end(tail). It means the value they carry is continuously driven by the output of a hardware element to which they are connected to. Nets are generally declared using the keyword wire. The default value of net (wire) is z. If a net has no driver, then its value is z.

Registers
Registers are data storage elements. They hold the value until they are replaced by some other value. Register doesn't need a driver, they can be changed at anytime in a simulation. Registers are generally declared with the keyword reg. Its default value is x. Register data types should not be confused with hardware registers, these are simply variables.

Integers
Integer is a register data type of 32 bits. The only difference of declaring it as integer is that, it becomes a signed value. When you declare it as a 32 bit register (array) it is an unsigned value. It is declared using the keyword integer.

Real Numbers
Real number can be declared using the keyword real. They can be assigned values as follows:
real r_1;

r_1 = 1.234; // Decimal notation.
r_1 = 3e4; // Scientific notation. 

Parameters
Parameters are the constants that can be declared using the keyword parameter. Parameters are in general used for customization of a design. Parameters are declared as follows:

parameter p_1 = 123; // p_1 is a constant with value 123.

Keyword defparam can be used to change a parameter value at module instantiation. Keyword localparam is usedd to declare local parameters, this is used when their value should not be changed.

Vectors
Vectors can be a net or reg data types. They are declared as [high:low] or [low:high], but the left number is always the MSB of the vector.

wire [7:0] v_1; // v_1[7] is the MSB.
reg [0:15] v_2; // v_2[15] is the MSB.

In the above examples: If it is written as v_1[5:2], it is the part of the entire vector which contains 4 bits in order: v_1[5], v_1[4], v_1[3], v_1[2]. Similarly v_2[0:7], means the first half part of the vecotr v_2.
Vector parts can also be specified in a different way:
vector_name[start_bit+:width] : part-select increments from start_bit. In above example: v_2[0:7] is same as v_2[0+:8]. vector_name[start_bit-:width] : part-select decrements from start_bit. In above example: v_1[5:2] is same as v_1[5-:4].

Arrays
Arrays of reg, integer, real, time, and vectors are allowed. Arrays are declared as follows:

reg a_1[0:7];
real a_3[15:0];
wire [0:3] a_4[7:0]; // Array of vector
integer a_5[0:3][6:0]; // Double dimensional array

Strings
Strings are register data types. For storing a character, we need a 8-bit register data type. So if you want to create string variable of length n. The string should be declared as register data type of length n*8.

reg [8*8-1:0] string_1; // string_1 is a string of length 8.

Time Data Type
Time data type is declared using the keyword time. These are generally used to store simulation time. In general it is 64-bit long.

time t_1;
t_1 = $time; // assigns current simulation time to t_1.

There are some other data types, but are considered to be advanced data types, hence they are not discussed here.



A module is the basic building block in Verilog HDL. In general many elements are grouped to form a module, to provide a common functionality, which can be used at many places in the design. Port interface (using input and output ports) helps in providing the necessary functionality to the higher-level blocks. Thus any design modifications at lower level can be easily implemented without affecting the entire design code. The structure of a module is show in the figure below.
module_1.JPG.jpg
Keyword module is used to begin a module and it ends with the keyword endmodule. The syntax is as follows:
module module_name
---
// internals
---
endmodule

Example: D Flip-flop implementation (Try to understand the module structure, ignore unknown constraints/statements).

module D_FlipFlop(q, d, clk, reset);

// Port declarations
output q;
reg q;
input d, clk, reset;

// Internal statements - Logic
always @(posedge reset or poseedge clk)
if (reset)
q < = 1'b0;
else
q < = d;

// endmodule statement
endmodule

Note:
  • Multiple modules can be defined in a single design file with any order.
  • See that the endmodule statement should not written as endmodule; (no ; is used).
  • All components except module, module name, and endmodule are optional.
  • The 5 internal components can come in any order.



Modules communicate with external world using ports. They provide interface to the modules. A module definition contains list of ports. All ports in the list of ports must be declared in the module, ports can be one the following types:
  • Input port, declared using keyword input.
  • Output port, declared using keyword output.
  • Bidirectional port, declared using keyword inout.
All the ports declared are considered to be as wire by default. If a port is intended to be a wire, it is sufficient to declare it as output, input, or inout. If output port holds its value it should be declared as reg type. Ports of type input and inout cannot be declared as reg because reg variables hold values and input ports should not hold values but simply reflect the changes in the external signals they are connected to.

Port Connection Rules
  • Inputs: Always of type net(wire). Externally, they can be connected to reg or net type variable.
  • Outputs: Can be of reg or net type. Externally, they must be connected to a net type variable.
  • Bidirectional ports (inout): Always of type net. Externally, they must be connected to a net type variable.
Note:
  • It is possible to connect internal and external ports of different size. In general you will receive a warning message for width mismatch.
  • There can be unconnected ports in module instances.
Ports can declared in a module in C-language style:

module module_1( input a, input b, output c);
--
// Internals
--
endmodule

If there is an instance of above module, in some other module. Port connections can be made in two types.

Connection by Ordered List:
module_1 instance_name_1 ( A, B, C);
Connecting ports by name:
module_1 instance_name_2 (.a(A), .c(C), .b(B));

In connecting port by name, order is ignored.


Logical Operators

Symbol
Description
#Operators
!
Logical negation
One
||
Logical OR
Two
&&
Logical AND
Two

Relational Operators

Symbol
Description
#Operators
>
Greater than
Two
<
Less than
Two
>=
Greater than or equal to
Two
<=
Less than or equal to
Two

Equality Operators

Symbol
Description
#Operators
==
Equality
Two
!=
Inequality
Two
===
Case equality
Two
!==
Case inequality
Two

Arithmetic Operators

Symbol
Description
#Operators
+
Add
Two
-
Substract
Two
*
Multiply
Two
/
Divide
Two
**
Power
Two
%
Modulus
Two

Bitwise Operators

Symbol
Description
#Operators
~
Bitwise negation
One
&
Bitwise AND
Two
|
Bitwise OR
Two
^
Bitwise XOR
Two
^~ or ~^
Bitwise XNOR
Two

Reduction Operators

Symbol
Description
#Operators
&
Reduction AND
One
~&
Reduction NAND
One
|
Reduction OR
One
~|
Reduction NOR
One
^
Reduction XOR
One
^~ or ~^
Reduction XNOR
One

Shift Operators

Symbol
Description
#Operators
>>
Right shift
Two
<<
Left shift
Two
>>>
Arithmetic right shift
Two
<<<
Arithmetic left shift
Two

Conditional Operators

Symbol
Description
#Operators
?:
Conditional
Two

Replication Operators

Symbol
Description
#Operators
{ { } }
Replication
> One

Concatenation Operators

Symbol
Description
#Operators
{ }
Concatenation
> One

Operator Precedence

operators_1f.JPG.jpg


>> Introduction
>> Gate Primitives
>> Delays
>> Examples


Introduction

In Verilog HDL a module can be defined using various levels of abstraction. There are four levels of abstraction in verilog. They are:
  • Behavioral or algorithmic level: This is the highest level of abstraction. A module can be implemented in terms of the design algorithm. The designer no need to have any knowledge of hardware implementation.
  • Data flow level: In this level the module is designed by specifying the data flow. Designer must how data flows between various registers of the design.
  • Gate level: The module is implemented in terms of logic gates and interconnections between these gates. Designer should know the gate-level diagram of the design.
  • Switch level: This is the lowest level of abstraction. The design is implemented using switches/transistors. Designer requires the knowledge of switch-level implementation details.
Gate-level modeling is virtually the lowest-level of abstraction, because the switch-level abstraction is rarely used. In general, gate-level modeling is used for implementing lowest level modules in a design like, full-adder, multiplexers, etc. Verilog HDL has gate primitives for all basic gates.

Gate Primitives
Gate primitives are predefined in Verilog, which are ready to use. They are instantiated like modules. There are two classes of gate primitives: Multiple input gate primitives and Single input gate primitives.
Multiple input gate primitives include and, nand, or, nor, xor, and xnor. These can have multiple inputs and a single output. They are instantiated as follows:

// Two input AND gate.
and and_1 (out, in0, in1);

// Three input NAND gate.
nand nand_1 (out, in0, in1, in2); 

// Two input OR gate.
or or_1 (out, in0, in1);

// Four input NOR gate.
nor nor_1 (out, in0, in1, in2, in3);

// Five input XOR gate.
xor xor_1 (out, in0, in1, in2, in3, in4);

// Two input XNOR gate.
xnor and_1 (out, in0, in1);

Note that instance name is not mandatory for gate primitive instantiation. The truth tables of multiple input gate primitives are as follows:

tt1f.JPG.jpg

Single input gate primitives include not, buf, notif1, bufif1, notif0, and bufif0. These have a single input and one or more outputs. Gate primitives notif1, bufif1, notif0, and bufif0 have a control signal. The gates propagate if only control signal is asserted, else the output will be high impedance state (z). They are instantiated as follows:

// Inverting gate.
not not_1 (out, in);

// Two output buffer gate.
buf buf_1 (out0, out1, in);

// Single output Inverting gate with active-high control signal.
notif1 notif1_1 (out, in, ctrl);

// Double output buffer gate with active-high control signal.
bufif1 bufif1_1 (out0, out1, in, ctrl);

// Single output Inverting gate with active-low control signal.
notif0 notif0_1 (out, in, ctrl);

// Single output buffer gate with active-low control signal.
bufif0 bufif1_0 (out, in, ctrl);

The truth tables are as follows:

tt2f.JPG.jpg

Array of Instances:
wire [3:0] out, in0, in1; 
and and_array[3:0] (out, in0, in1);

The above statement is equivalent to following bunch of statements:

and and_array0 (out[0], in0[0], in1[0]); 
and and_array1 (out[1], in0[1], in1[1]);
and and_array2 (out[2], in0[2], in1[2]); 
and and_array3 (out[3], in0[3], in1[3]);


Gate Delays:
In Verilog, a designer can specify the gate delays in a gate primitive instance. This helps the designer to get a real time behavior of the logic circuit.
Rise delay: It is equal to the time taken by a gate output transition to 1, from another value 0, x, or z.
Fall delay: It is equal to the time taken by a gate output transition to 0, from another value 1, x, or z.
Turn-off delay: It is equal to the time taken by a gate output transition to high impedance state, from another value 1, x, or z.
  • If the gate output changes to x, the minimum of the three delays is considered.
  • If only one delay is specified, it is used for all delays.
  • If two values are specified, they are considered as rise, and fall delays.
  • If three values are specified, they are considered as rise, fall, and turn-off delays.
  • The default value of all delays is zero.
and #(5) and_1 (out, in0, in1);
// All delay values are 5 time units.

nand #(3,4,5) nand_1 (out, in0, in1);
// rise delay = 3, fall delay = 4, and turn-off delay = 5.

or #(3,4) or_1 (out, in0, in1);
// rise delay = 3, fall delay = 4, and turn-off delay = min(3,4) = 3.

There is another way of specifying delay times in verilog, Min:Typ:Max values for each delay. This helps designer to have a much better real time experience of design simulation, as in real time logic circuits the delays are not constant. The user can choose one of the delay values using +maxdelays, +typdelays, and +mindelays at run time. The typical value is the default value.

and #(4:5:6) and_1 (out, in0, in1);
// For all delay values: Min=4, Typ=5, Max=6.

nand #(3:4:5,4:5:6,5:6:7) nand_1 (out, in0, in1);
// rise delay: Min=3, Typ=4, Max=5, fall delay: Min=4, Typ=5, Max=6, turn-off delay: Min=5, Typ=6, Max=7.

In the above example, if the designer chooses typical values, then rise delay = 4, fall delay = 5, turn-off delay = 6.

Examples:
1. Gate level modeling of a 4x1 multiplexer.

The gate-level circuit diagram of 4x1 mux is shown below. It is used to write a module for 4x1 mux.

4x1_muxf.JPG.jpg

module 4x1_mux (out, in0, in1, in2, in3, s0, s1);

// port declarations
output out; // Output port.
input in0, in1, in2. in3; // Input ports.
input s0, s1; // Input ports: select lines.

// intermediate wires
wire inv0, inv1; // Inverter outputs.
wire a0, a1, a2, a3; // AND gates outputs.

// Inverters.
not not_0 (inv0, s0);
not not_1 (inv1, s1);

// 3-input AND gates.
and and_0 (a0, in0, inv0, inv1);
and and_1 (a1, in1, inv0, s1);
and and_2 (a2, in2, s0, inv1);
and and_3 (a3, in3, s0, s1);

// 4-input OR gate.
or or_0 (out, a0, a1, a2, a3);

endmodule

2. Implementation of a full adder using half adders.

Half adder:

haf.JPG.jpg


module half_adder (sum, carry, in0, in1);

output sum, carry;
input in0, in1;

// 2-input XOR gate.
xor xor_1 (sum, in0, in1);

// 2-input AND gate.
and and_1 (carry, in0, in1);

endmodule

Full adder:
faf.JPG.jpg

module full_adder (sum, c_out, ino, in1, c_in);

output sum, c_out;
input in0, in1, c_in;

wire s0, c0, c1;

// Half adder : port connecting by order.
half_adder ha_0 (s0, c0, in0, in1);

// Half adder : port connecting by name.
half_adder ha_1 (.sum(sum),
                .in0(s0),
                .in1(c_in),
                .carry(c1));

// 2-input XOR gate, to get c_out.
xor xor_1 (c_out, c0, c1);

endmodule


>> Introduction
>> The assign Statement
>> Delays
>> Examples


Introduction

Dataflow modeling is a higher level of abstraction. The designer no need have any knowledge of logic circuit. He should be aware of data flow of the design. The gate level modeling becomes very complex for a VLSI circuit. Hence dataflow modeling became a very important way of implementing the design.
In dataflow modeling most of the design is implemented using continuous assignments, which are used to drive a value onto a net. The continuous assignments are made using the keyword assign.

The assign statement

The assign statement is used to make continuous assignment in the dataflow modeling. The assign statement usage is given below:

assign out = in0 + in1; // in0 + in1 is evaluated and then assigned to out.

Note:
  • The LHS of assign statement must always be a scalar or vector net or a concatenation. It cannot be a register.
  • Continuous statements are always active statements.
  • Registers or nets or function calls can come in the RHS of the assignment.
  • The RHS expression is evaluated whenever one of its operands changes. Then the result is assigned to the LHS.
  • Delays can be specified.
Examples:

assign out[3:0] = in0[3:0] & in1[3:0];

assign {o3, o2, o1, o0} = in0[3:0] | {in1[2:0],in2}; // Use of concatenation.

Implicit Net Declaration:

wire in0, in1;
assign out = in0 ^ in1;

In the above example out is undeclared, but verilog makes an implicit net declaration for out.

Implicit Continuous Assignment:

wire out = in0 ^ in1;

The above line is the implicit continuous assignment. It is same as,

wire out;
assign out = in0 ^ in1;

Delays

There are three types of delays associated with dataflow modeling. They are: Normal/regular assignment delay, implicit continuous assignment delay and net declaration delay.

Normal/regular assignment delay:

assign #10 out = in0 | in1;

If there is any change in the operands in the RHS, then RHS expression will be evaluated after 10 units of time. Lets say that at time t, if there is change in one of the operands in the above example, then the expression is calculated at t+10 units of time. The value of RHS operands present at time t+10 is used to evaluate the expression.

Implicit continuous assignment delay:

wire #10 out = in0 ^ in1;

is same as

wire out;
assign 10 out = in0 ^ in1;

Net declaration delay:

wire #10 out;
assign out = in;

is same as

wire out;
assign #10 out = in;

Examples

1. Implementation of a 2x4 decoder.

module decoder_2x4 (out, in0, in1);

output out[0:3];
input in0, in1;

// Data flow modeling uses logic operators.
assign out[0:3] = { ~in0 & ~in1, in0 & ~in1,
                  ~in0 & in1, in0 & in1 };

endmodule

2. Implementation of a 4x1 multiplexer.

module mux_4x1 (out, in0, in1, in2, in3, s0, s1);

output out;
input in0, in1, in2, in3;
input s0, s1;

assign out = (~s0 & ~s1 & in0)|(s0 & ~s1 & in1)|
             (~s0 & s1 & in2)|(s0 & s1 & in0);

endmodule

3. Implementation of a 8x1 multiplexer using 4x1 multiplexers.

module mux_8x1 (out, in, sel);

output out;
input [7:0] in;
input [2:0] sel;

wire m1, m2;

// Instances of 4x1 multiplexers.
mux_4x1 mux_1 (m1, in[0], in[1], in[2],
               in[3], sel[0], sel[1]);
mux_4x1 mux_2 (m2, in[4], in[5], in[6],
               in[7], sel[0], sel[1]);

assign out = (~sel[2] & m1)|(sel[2] & m2);

endmodule

4. Implementation of a Full adder.

module full_adder (sum, c_out, in0, in1, c_in);

output sum, c_out;
input in0, in1, c_in;

assign { c_out, sum } = in0 + in1 + c_in;

endmodule


>> Introduction
>> The initial Construct
>> The always Construct
>> Procedural Assignments
>> Block Statements
>> Conditional (if-else) Statement
>> Case Statement
>> Loop Statements
>> Examples


Introduction

Behavioral modeling is the highest level of abstraction in the Verilog HDL. The other modeling techniques are relatively detailed. They require some knowledge of how hardware, or hardware signals work. The abstraction in this modeling is as simple as writing the logic in C language. This is a very powerful abstraction technique. All that designer needs is the algorithm of the design, which is the basic information for any design.

Most of the behavioral modeling is done using two important constructs: initial and always. All the other behavioral statements appear only inside these two structured procedure constructs.

The initial Construct

The statements which come under the initial construct constitute the initial block. The initial block is executed only once in the simulation, at time 0. If there is more than one initial block. Then all the initial blocks are executed concurrently. The initial construct is used as follows:
initial
begin
reset = 1'b0;
clk = 1'b1;
end

or

initial
clk = 1'b1;

In the first initial block there are more than one statements hence they are written between begin and end. If there is only one statement then there is no need to put begin and end.

The always Construct

The statements which come under the always construct constitute the always block. The always block starts at time 0, and keeps on executing all the simulation time. It works like a infinite loop. It is generally used to model a functionality that is continuously repeated.

always
#5 clk = ~clk;

initial
clk = 1'b0;

The above code generates a clock signal clk, with a time period of 10 units. The initial blocks initiates the clk value to 0 at time 0. Then after every 5 units of time it toggled, hence we get a time period of 10 units. This is the way in general used to generate a clock signal for use in test benches.

always @(posedge clk, negedge reset)
begin
a = b + c;
    d = 1'b1;
end

In the above example, the always block will be executed whenever there is a positive edge in the clk signal, or there is negative edge in the reset signal. This type of always is generally used in implement a FSM, which has a reset signal.

always @(b,c,d)
begin
    a = ( b + c )*d;
    e = b | c;
end

In the above example, whenever there is a change in b, c, or d the always block will be executed. Here the list b, c, and d is called the sensitivity list.

In the Verilog 2000, we can replace always @(b,c,d) with always @(*), it is equivalent to include all input signals, used in the always block. This is very useful when always blocks is used for implementing the combination logic.

Procedural Assignments

Procedural assignments are used for updating reg, integer, time, real, realtime, and memory data types. The variables will retain their values until updated by another procedural assignment. There is a significant difference between procedural assignments and continuous assignments.
Continuous assignments drive nets and are evaluated and updated whenever an input operand changes value. Where as procedural assignments update the value of variables under the control of the procedural flow constructs that surround them.

The LHS of a procedural assignment could be:
  • reg, integer, real, realtime, or time data type.
  • Bit-select of a reg, integer, or time data type, rest of the bits are untouched.
  • Part-select of a reg, integer, or time data type, rest of the bits are untouched.
  • Memory word.
  • Concatenation of any of the previous four forms can be specified.
When the RHS evaluates to fewer bits than the LHS, then if the right-hand side is signed, it will be sign-extended to the size of the left-hand side.

There are two types of procedural assignments: blocking and non-blocking assignments.

Blocking assignments: A blocking assignment statements are executed in the order they are specified in a sequential block. The execution of next statement begin only after the completion of the present blocking assignments. A blocking assignment will not block the execution of the next statement in a parallel block. The blocking assignments are made using the operator =.

initial
begin
    a = 1;
    b = #5 2;
    c = #2 3;
end

In the above example, a is assigned value 1 at time 0, and b is assigned value 2 at time 5, and c is assigned value 3 at time 7.

Non-blocking assignments: The nonblocking assignment allows assignment scheduling without blocking the procedural flow. The nonblocking assignment statement can be used whenever several variable assignments within the same time step can be made without regard to order or dependence upon each other. Non-blocking assignments are made using the operator <=.
Note: <= is same for less than or equal to operator, so whenever it appears in a expression it is considered to be comparison operator and not as non-blocking assignment.

initial
begin
    a <= 1;
    b <= #5 2;
    c <= #2 3;
end

In the above example, a is assigned value 1 at time 0, and b is assigned value 2 at time 5, and c is assigned value 3 at time 2 (because all the statements execution starts at time 0, as they are non-blocking assignments.

Block Statements

Block statements are used to group two or more statements together, so that they act as one statement. There are two types of blocks:
  • Sequential block.
  • Parallel block.
Sequential block: The sequential block is defined using the keywords begin and end. The procedural statements in sequential block will be executed sequentially in the given order. In sequential block delay values for each statement shall be treated relative to the simulation time of the execution of the previous statement. The control will pass out of the block after the execution of last statement.

Parallel block: The parallel block is defined using the keywords fork and join. The procedural statements in parallel block will be executed concurrently. In parallel block delay values for each statement are considered to be relative to the simulation time of entering the block. The delay control can be used to provide time-ordering for procedural assignments. The control shall pass out of the block after the execution of the last time-ordered statement.

Note that blocks can be nested. The sequential and parallel blocks can be mixed.

Block names: All the blocks can be named, by adding : block_name after the keyword begin or fork. The advantages of naming a block are:
  • It allows to declare local variables, which can be accessed by using hierarchical name referencing.
  • They can be disabled using the disable statement (disable block_name;).

Conditional (if-else) Statement

The condition (if-else) statement is used to make a decision whether a statement is executed or not. The keywords if and else are used to make conditional statement. The conditional statement can appear in the following forms.

if ( condition_1 )
    statement_1;

if ( condition_2 )
    statement_2;
else
    statement_3;

if ( condition_3 )
    statement_4;
else if ( condition_4 )
    statement_5;
else
    statement_6;

if ( condition_5 )
begin
    statement_7;
    statement_8;
end
else
begin
    statement_9;
    statement_10;
end

Conditional (if-else) statement usage is similar to that if-else statement of C programming language, except that parenthesis are replaced by begin and end.

Case Statement

The case statement is a multi-way decision statement that tests whether an expression matches one of the expressions and branches accordingly. Keywords case and endcase are used to make a case statement. The case statement syntax is as follows.

case (expression)
    case_item_1: statement_1;
    case_item_2: statement_2;
    case_item_3: statement_3;
    ...
    ...
    default: default_statement;
endcase

If there are multiple statements under a single match, then they are grouped using begin, and end keywords. The default item is optional.

Case statement with don't cares: casez and casex

casez treats high-impedance values (z) as don't cares. casex treats both high-impedance (z) and unknown (x) values as don't cares. Don't-care values (z values for casez, z and x values for casex) in any bit of either the case expression or the case items shall be treated as don't-care conditions during the comparison, and that bit position shall not be considered. The don't cares are represented using the ? mark.

Loop Statements

There are four types of looping statements in Verilog:

Forever Loop

Forever loop is defined using the keyword forever, which Continuously executes a statement. It terminates when the system task $finish is called. A forever loop can also be ended by using the disable statement.

initial
begin
    clk = 1'b0;
    forever #5 clk = ~clk;
end

In the above example, a clock signal with time period 10 units of time is obtained.

Repeat Loop

Repeat loop is defined using the keyword repeat. The repeat loop block continuously executes the block for a given number of times. The number of times the loop executes can be mention using a constant or an expression. The expression is calculated only once, before the start of loop and not during the execution of the loop. If the expression value turns out to be z or x, then it is treated as zero, and hence loop block is not executed at all.

initial
begin
    a = 10;
    b = 5;
    b <= #10 10;
    i = 0;
    repeat(a*b)
    begin
        $display("repeat in progress");
        #1 i = i + 1;
    end
end

In the above example the loop block is executed only 50 times, and not 100 times. It calculates (a*b) at the beginning, and uses that value only.

While Loop

The while loop is defined using the keyword while. The while loop contains an expression. The loop continues until the expression is true. It terminates when the expression is false. If the calculated value of expression is z or x, it is treated as a false. The value of expression is calculated each time before starting the loop. All the statements (if more than one) are mentioned in blocks which begins and ends with keyword begin and end keywords.

initial
begin
    a = 20;
    i = 0;
    while (i < a)
    begin
    $display("%d",i);
    i = i + 1;
    a = a - 1;
    end
end

In the above example the loop executes for 10 times. ( observe that a is decrementing by one and i is incrementing by one, so loop terminated when both i and a become 10).

For Loop

The For loop is defined using the keyword for. The execution of for loop block is controlled by a three step process, as follows:
  1. Executes an assignment, normally used to initialize a variable that controls the number of times the for block is executed.
  2. Evaluates an expression, if the result is false or z or x, the for-loop shall terminate, and if it is true, the for-loop shall execute its block.
  3. Executes an assignment normally used to modify the value of the loop-control variable and then repeats with second step.
Note that the first step is executed only once.

initial
begin
    a = 20;
    for (i = 0; i < a; i = i + 1, a = a - 1)
    $display("%d",i);
end

The above example produces the same result as the example used to illustrate the functionality of the while loop.

Examples

1. Implementation of a 4x1 multiplexer.

module 4x1_mux (out, in0, in1, in2, in3, s0, s1);

output out;

// out is declared as reg, as default is wire

reg out;

// out is declared as reg, because we will
// do a procedural assignment to it.

input in0, in1, in2, in3, s0, s1;

// always @(*) is equivalent to
// always @( in0, in1, in2, in3, s0, s1 )

always @(*)
begin
  case ({s1,s0})
      2'b00: out = in0;
      2'b01: out = in1;
      2'b10: out = in2;
      2'b11: out = in3;
      default: out = 1'bx;
  endcase
end

endmodule
2. Implementation of a full adder.

module full_adder (sum, c_out, in0, in1, c_in);

output sum, c_out;
reg sum, c_out

input in0, in1, c_in;

always @(*)
  {c_out, sum} = in0 + in1 + c_in;

endmodule

3. Implementation of a 8-bit binary counter.

module ( count, reset, clk );

output [7:0] count;
reg [7:0] count;

input reset, clk;

// consider reset as active low signal

always @( posedge clk, negedge reset)
begin
  if(reset == 1'b0)
      count <= 8'h00;
  else
      count <= count + 8'h01;
end

endmodule

Implementation of a 8-bit counter is a very good example, which explains the advantage of behavioral modeling. Just imagine how difficult it will be implementing a 8-bit counter using gate-level modeling.
In the above example the incrementation occurs on every positive edge of the clock. When count becomes 8'hFF, the next increment will make it 8'h00, hence there is no need of any modulus operator. Reset signal is active low.


>> Introduction
>> Differences
>> Tasks
>> Functions
>> Examples


Introduction

Tasks and functions are introduced in the verilog, to provide the ability to execute common procedures from different places in a description. This helps the designer to break up large behavioral designs into smaller pieces. The designer has to abstract the similar pieces in the description and replace them either functions or tasks. This also improves the readability of the code, and hence easier to debug. Tasks and functions must be defined in a module and are local to the module. Tasks are used when:
  • There are delay, timing, or event control constructs in the code.
  • There is no input.
  • There is zero output or more than one output argument.
Functions are used when:
  • The code executes in zero simulation time.
  • The code provides only one output(return value) and has at least one input.
  • There are no delay, timing, or event control constructs.

Differences

Functions 
Tasks
Can enable another function but not another task.
Can enable other tasks and functions.
Executes in 0 simulation time.
May execute in non-zero simulation time.
Must not contain any delay, event, or timing control statements.
May contain delay, event, or timing control statements.
Must have at least one input argument. They can have more than one input.
May have zero or more arguments of type input, output, or inout.
Functions always return a single value. They cannot have output or inout arguments.
Tasks do not return with a value, but can pass multiple values through output and inout arguments.

Tasks

There are two ways of defining a task. The first way shall begin with the keyword task, followed by the optional keyword automatic, followed by a name for the task, and ending with the keyword endtask. The keyword automatic declares an automatic task that is reentrant with all the task declarations allocated dynamically for each concurrent task entry. Task item declarations can specify the following:
  • Input arguments.
  • Output arguments.
  • Inout arguments.
  • All data types that can be declared in a procedural block
The second way shall begin with the keyword task, followed by a name for the task and a parenthesis which encloses task port list. The port list shall consist of zero or more comma separated ports. The task body shall follow and then the keyword endtask.

In both ways, the port declarations are same. Tasks without the optional keyword automatic are static tasks, with all declared items being statically allocated. These items shall be shared across all uses of the task executing concurrently. Task with the optional keyword automatic are automatic tasks. All items declared inside automatic tasks are allocated dynamically for each invocation. Automatic task items can not be accessed by hierarchical references. Automatic tasks
can be invoked through use of their hierarchical name.

Functions

Functions are mainly used to return a value, which shall be used in an expression. The functions are declared using the keyword function, and definition ends with the keyword endfunction.

If a function is called concurrently from two locations, the results are non-deterministic because both calls operate on the same variable space. The keyword automatic declares a recursive function with all the function declarations allocated dynamically for each recursive call. Automatic function items can not be accessed by hierarchical references. Automatic functions can be invoked through the use of their hierarchical name.

When a function is declared, a register with function name is declared implicitly inside Verilog HDL. The output of a function is passed back by setting the value of that register appropriately.

Examples

1. Simple task example, where task is used to get the address tag and offset of a given address.

module example1_task;

input addr;
wire [31:0] addr;

wire [23:0] addr_tag;
wire [7:0] offset;

task get_tag_and_offset ( addr, tag, offset);

input addr;
output tag, offset;

begin
 tag = addr[31:8];
 offset = addr[7:0];
end
endtask

always @(addr)
begin
 get_tag_and_offset (addr, addr_tag, addr_offset);
end

// other internals of module

endmodule

2. Task example, which uses the global variables of a module. Here task is used to do temperature conversion.

module example2_global;

real t1;
real t2;

// task uses the global variables of the module

task t_convert;
begin
 t2 = (9/5)*(t1+32);
end
endtask

always @(t1)
begin
 t_convert();
end

endmodule





                                                                             MISCHILLANEOUS




A complex programmable logic device (CPLD) is a semiconductor device containing programmable blocks called macro cell, which contains logic implementing disjunctive normal form expressions and more specialized logic operations. CPLD has complexity between that of PALs and FPGAs. It can has up to about 10,000 gates. CPLDs offer very predictable timing characteristics and are therefore ideal for critical control applications.

Applications
  • CPLDs are ideal for critical, high-performance control applications.
  • CPLD can be used for digital designs which perform boot loader functions.
  • CPLD is used to load configuration data for an FPGA from non-volatile memory.
  • CPLD are generally used for small designs, for example, they are used in simple applications such as address decoding.
  • CPLDs are often used in cost-sensitive, battery-operated portable applications, because of its small size and low-power usage.
Architecture

A CPLD contains a bunch of programmable functional blocks (FB) whose inputs and outputs are connected together by a global interconnection matrix. The global interconnection matrix is reconfigurable, so that we can change the connections between the FBs. There will be some I/O blocks which allow us to connect CPLD to external world. The block diagram of architecture of CPLD is shown below.

cpld1_f.JPG.jpg

The programmable functional block typically looks like the one shown below. There will be an array of AND gates which can be programed. The OR gates are fixed. But each manufacturer has their way of building the functional block. A registered output can be obtained by manipulating the feedback signals obtained from the OR ouputs.
cpld2_f.JPG.jpg
CPLD Programming

The design is first coded in HDL (Verilog or VHDL), once the code is validated (simulated and synthesized). During synthesis the target device(CPLD model) is selected, and a technology-mapped net list is generated. The net list can then be fitted to the actual CPLD architecture using a process called place-and-route, usually performed by the CPLD company's proprietary place-and-route software. Then the user will do some verification processes. If every thing is fine, he will use the CPLD, else he will reconfigure it.



Direct memory access (DMA) is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory for reading and/or writing independently of the central processing unit. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel.

Principle of DMA

DMA is an essential feature of all modern computers, as it allows devices to transfer data without subjecting the CPU to a heavy overhead. Otherwise, the CPU would have to copy each piece of data from the source to the destination. This is typically slower than copying normal blocks of memory since access to I/O devices over a peripheral bus is generally slower than normal system RAM. During this time the CPU would be unavailable for any other tasks involving CPU bus access, although it could continue doing any work which did not require bus access.

A DMA transfer essentially copies a block of memory from one device to another. While the CPU initiates the transfer, it does not execute it. For so-called "third party" DMA, as is normally used with the ISA bus, the transfer is performed by a DMA controller which is typically part of the motherboard chipset. More advanced bus designs such as PCI typically use bus mastering DMA, where the device takes control of the bus and performs the transfer itself.

A typical usage of DMA is copying a block of memory from system RAM to or from a buffer on the device. Such an operation does not stall the processor, which as a result can be scheduled to perform other tasks. DMA is essential to high performance embedded systems. It is also essential in providing so-called zero-copy implementations of peripheral device drivers as well as functionalities such as network packet routing, audio playback and streaming video.

DMA Controller

The processing unit which controls the DMA process is known as DMA controller. Typically the job of the DMA controller is to setup a connection between the memory unit and the IO device, with the permission from the microprocessor, so that the data can be transferred with much less processor overhead. The following figure shows a simple example of hardware interface of a DMA controller in a microprocessor based system.
dma1_f.JPG.jpg

Functioning (Follow the timing diagram for better understanding).
Whenever there is a IO request (IOREQ) for memory access from a IO device. The DMA controller sends a Halt signal to microprocessor. Generally halt signal (HALT) is active low. Microprocessor then acknowledges the DMA controller with a bus availability signal (BA). As soon as BA is available, then DMA controller sends an IO acknowledgment to IO device (IOACK) and chip enable (CE - active low) to the memory unit. The read/write control (R/W) signal will be give by the IO device to memory unit. Then the data transfer will begin. When the data transfer is finished, the IO device sends an end of transfer (EOT - active low) signal. Then the DMA controller will stop halting the microprocessor. ABUS and DBUS are address bus and data bus, respectively, they are included just for general information that microprocessor, IO devices, and memory units are connected to the buses, through which data will be transferred.
dma2_f.JPG.jpg





A Field-Programmable Gate Array (FPGA) is a semiconductor device containing programmable logic components called "logic blocks", and programmable interconnects. Logic blocks can be programmed to perform the function of basic logic gates such as AND, and XOR, or more complex combinational functions such as decoders or mathematical functions. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

Applications
  • ASIC prototyping: Due to high cost of ASIC chips, the logic of the application is first verified by dumping HDL code in a FPGA. This helps for faster and cheaper testing. Once the logic is verified then they are made into ASICs.
  • Very useful in applications that can make use of the massive parallelism offered by their architecture. Example: code breaking, in particular brute-force attack, of cryptographic algorithms.
  • FPGAs are sued for computational kernels such as FFT or Convolution instead of a microprocessor.
  • Applications include digital signal processing, software-defined radio, aerospace and defense systems, medical imaging, computer vision, speech recognition, cryptography, bio-informatics, computer hardware emulation and a growing range of other areas.
Architecture

FPGA consists of large number of "configurable logic blocks" (CLBs) and routing channels. Multiple I/O pads may fit into the height of one row or the width of one column in the array. In general all the routing channels have the same width. The block diagram of FPGA architecture is shown below.

fpga_f.JPG.jpg

CLB: The CLB consists of an n-bit look-up table (LUT), a flip-flop and a 2x1 mux. The value n is manufacturer specific. Increase in n value can increase the performance of a FPGA. Typically n is 4. An n-bit lookup table can be implemented with a multiplexer whose select lines are the inputs of the LUT and whose inputs are constants. An n-bit LUT can encode any n-input Boolean function by modeling such functions as truth tables. This is an efficient way of encoding Boolean logic functions, and LUTs with 4-6 bits of input are in fact the key component of modern FPGAs. The block diagram of a CLB is shown below.

fpga1_f.JPG.jpg

Each CLB has n-inputs and only one input, which can be either the registered or the unregistered LUT output. The output is selected using a 2x1 mux. The LUT output is registered using the flip-flop (generally D flip-flop). The clock is given to the flip-flop, using which the output is registered. In general, high fanout signals like clock signals are routed via special-purpose dedicated routing networks, they and other signals are managed separately.

Routing channels are programmed to connect various CLBs. The connecting done according to the design. The CLBs are connected in such a way that logic of the design is achieved.

FPGA Programming

The design is first coded in HDL (Verilog or VHDL), once the code is validated (simulated and synthesized). During synthesis, typically done using tools like Xilinx ISE, FPGA Advantage, etc, a technology-mapped net list is generated. The net list can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated is used to (re)configure the FPGA. Once the FPGA is (re)configured, it is tested. If there are any issues or modifications, the original HDL code will be modified and then entire process is repeated, and FPGA is reconfigured.




Definitions

FPGA: A Field-Programmable Gate Array (FPGA) is a semiconductor device containing programmable logic components called "logic blocks", and programmable interconnects. Logic blocks can be programmed to perform the function of basic logic gates such as AND, and XOR, or more complex combinational functions such as decoders or mathematical functions. For complete details click here.

ASIC: An application-specific integrated circuit (ASIC) is an integrated circuit designed for a particular use, rather than intended for general-purpose use. Processors, RAM, ROM, etc are examples of ASICs.

FPGA vs ASIC

Speed
ASIC rules out FPGA in terms of speed. As ASIC are designed for a specific application they can be optimized to maximum, hence we can have high speed in ASIC designs. ASIC can have hight speed clocks.

Cost
FPGAs are cost effective for small applications. But when it comes to complex and large volume designs (like 32-bit processors) ASIC products are cheaper.

Size/Area
FPGA are contains lots of LUTs, and routing channels which are connected via bit streams(program). As they are made for general purpose and because of re-usability. They are in-general larger designs than corresponding ASIC design. For example, LUT gives you both registered and non-register output, but if we require only non-registered output, then its a waste of having a extra circuitry. In this way ASIC will be smaller in size.

Power
FPGA designs consume more power than ASIC designs. As explained above the unwanted circuitry results wastage of power. FPGA wont allow us to have better power optimization. When it comes to ASIC designs we can optimize them to the fullest.

Time to Market
FPGA designs will till less time, as the design cycle is small when compared to that of ASIC designs. No need of layouts, masks or other back-end processes. Its very simple: Specifications -- HDL + simulations -- Synthesis -- Place and Route (along with static-analysis) -- Dump code onto FPGA and Verify. When it comes to ASIC we have to do floor planning and also advanced verification. The FPGA design flow eliminates the complex and time-consuming floor planning, place and route, timing analysis, and mask / re-spin stages of the project since the design logic is already synthesized to be placed onto an already verified, characterized FPGA device.
fpga_asic_f.JPG.jpg

Type of Design
ASIC can have mixed-signal designs, or only analog designs. But it is not possible to design them using FPGA chips.

Customization
ASIC has the upper hand when comes to the customization. The device can be fully customized as ASICs will be designed according to a given specification. Just imagine implementing a 32-bit processor on a FPGA!

Prototyping
Because of re-usability of FPGAs, they are used as ASIC prototypes. ASIC design HDL code is first dumped onto a FPGA and tested for accurate results. Once the design is error free then it is taken for further steps. Its clear that FPGA may be needed for designing an ASIC.

Non Recurring Engineering/Expenses
NRE refers to the one-time cost of researching, designing, and testing a new product, which is generally associated with ASICs. No such thing is associated with FPGA. Hence FPGA designs are cost effective.

Simpler Design Cycle
Due to software that handles much of the routing, placement, and timing, FPGA designs have smaller designed cycle than ASICs.

More Predictable Project Cycle
Due to elimination of potential re-spins, wafer capacities, etc. FPGA designs have better project cycle.

Tools
Tools which are used for FPGA designs are relatively cheaper than ASIC designs.

Re-Usability
A single FPGA can be used for various applications, by simply reprogramming it (dumping new HDL code). By definition ASIC are application specific cannot be reused.



Designing a FSM is the most common and challenging task for every digital logic designer. One of the key factors for optimizing a FSM design is the choice of state coding, which influences the complexity of the logic functions, the hardware costs of the circuits, timing issues, power usage, etc. There are several options like binary encoding, gray encoding, one-hot encoding, etc. The choice of the designer depends on the factors like technology, design specifications, etc.

One-hot encoding

In one-hot encoding only one bit of the state vector is asserted for any given state. All other state bits are zero. Thus if there are n states then n state flip-flops are required. As only one bit remains logic high and rest are logic low, it is called as One-hot encoding.
Example: If there is a FSM, which has 5 states. Then 5 flip-flops are required to implement the FSM using one-hot encoding. The states will have the following values:
S0 - 10000
S1 - 01000
S2 - 00100
S3 - 00010
S4 - 00001

Advantages
  • State decoding is simplified, since the state bits themselves can be used directly to check whether the FSM is in a particular state or not. Hence additional logic is not required for decoding, this is extremely advantageous when implementing a big FSM.
  • Low switching activity, hence resulting low power consumption, and less prone to glitches.
  • Modifying a design is easier. Adding or deleting a state and changing state transition equations (combinational logic present in FSM) can be done without affecting the rest of the design.
  • Faster than other encoding techniques. Speed is independent of number of states, and depends only on the number of transitions into a particular state.
  • Finding the critical path of the design is easier (static timing analysis).
  • One-hot encoding is particularly advantageous for FPGA implementations. If a big FSM design is implemented using FPGA, regular encoding like binary, gray, etc will use fewer flops for the state vector than one-hot encoding, but additional logic blocks will be required to encode and decode the state. But in FPGA each logic block contains one or more flip-flops (click here to know why?) hence due to presence of encoding and decoding more logics block will be used by regular encoding FSM than one-hot encoding FSM.
Disadvantages
  • The only disadvantage of using one-hot encoding is that it required more flip-flops than the other techniques like binary, gray, etc. The number of flip-flops required grows linearly with number of states. Example: If there is a FSM with 38 states. One-hot encoding requires 38 flip-flops where as other require 6 flip-flops only.



Designing a FSM is the most common and challenging task for every digital logic designer. One of the key factors for optimizing a FSM design is the choice of state coding, which influences the complexity of the logic functions, the hardware costs of the circuits, timing issues, power usage, etc. There are several options like binary encoding, gray encoding, one-hot encoding, etc. The choice of the designer depends on the factors like technology, design specifications, etc.

One-hot encoding

In one-hot encoding only one bit of the state vector is asserted for any given state. All other state bits are zero. Thus if there are n states then n state flip-flops are required. As only one bit remains logic high and rest are logic low, it is called as One-hot encoding.
Example: If there is a FSM, which has 5 states. Then 5 flip-flops are required to implement the FSM using one-hot encoding. The states will have the following values:
S0 - 10000
S1 - 01000
S2 - 00100
S3 - 00010
S4 - 00001

Advantages
  • State decoding is simplified, since the state bits themselves can be used directly to check whether the FSM is in a particular state or not. Hence additional logic is not required for decoding, this is extremely advantageous when implementing a big FSM.
  • Low switching activity, hence resulting low power consumption, and less prone to glitches.
  • Modifying a design is easier. Adding or deleting a state and changing state transition equations (combinational logic present in FSM) can be done without affecting the rest of the design.
  • Faster than other encoding techniques. Speed is independent of number of states, and depends only on the number of transitions into a particular state.
  • Finding the critical path of the design is easier (static timing analysis).
  • One-hot encoding is particularly advantageous for FPGA implementations. If a big FSM design is implemented using FPGA, regular encoding like binary, gray, etc will use fewer flops for the state vector than one-hot encoding, but additional logic blocks will be required to encode and decode the state. But in FPGA each logic block contains one or more flip-flops (click here to know why?) hence due to presence of encoding and decoding more logics block will be used by regular encoding FSM than one-hot encoding FSM.
Disadvantages
  • The only disadvantage of using one-hot encoding is that it required more flip-flops than the other techniques like binary, gray, etc. The number of flip-flops required grows linearly with number of states. Example: If there is a FSM with 38 states. One-hot encoding requires 38 flip-flops where as other require 6 flip-flops only.



Parallel and serial data transmission are most widely used data transfer techniques. Parallel transfer have been the preferred way for transfer data. But with serial data transmission we can achieve high speed and with some other advantages.

In parallel transmission n bits are transfered simultaneously, hence we have to process each bit separately and line up them in an order at the receiver. Hence we have to convert parallel to serial form. This is known as overhead in parallel transmission.

Signal skewing is the another problem with parallel data transmission. In the parallel communication, n bits leave at a time, but may not be received at the receiver at the same time, some may reach late than others. To overcome this problem, receiving end has to synchronize with the transmitter and must wait until all the bits are received. The greater the skew the greater the delay, if delay is increased that effects the speed.

Another problem associated with parallel transmission is crosstalk. When n wires lie parallel to each, the signal in some particular wire may get attenuated or disturbed due the induction, cross coupling etc. As a result error grows significantly, hence extra processing is necessary at the receiver.

Serial communication is full duplex where as parallel communication is half duplex. Which means that, in serial communication we can transmit and receive signal simultaneously, where as in parallel communication we can either transmit or receive the signal. Hence serial data transfer is superior to parallel data transfer.

Practically in computers we can achieve 150MBPS data transfer using serial transmission where as with parallel we can go up to 133MBPS only.

The advantage we get using parallel data transfer is reliability. Serial data transfer is less reliable than parallel data transfer.

Programmable Logic Array

In Digital design, we often use a device to perform multiple applications. The device configuration is changed (reconfigured) by programming it. Such devices are known as programmable devices. It is used to build reconfigurable digital circuits. The following are the popular programmable device
  • PLA - Programmable Logic Array
  • PAL - Programmable Array Logic
  • CPLD - Complex Programmable Logic Device (Click here for more details)
  • FPGA - Field-Programmable Gate Array (Click here for more details)

PLA: Programmable Logic Array is a programmable device used to implement combinational logic circuits. The PLA has a set of programmable AND planes, which link to a set of programmable OR planes, which can then be conditionally complemented to produce an output. This layout allows for a large number of logic functions to be synthesized in the sum of products canonical forms.

Suppose we need to implement the functions: X = A'BC + ABC + A'B'C' and Y = ABC + AB'C. The following figures shows how PLA is configured. The big dots in the diagram are connections. For the first AND gate (left most), A complement, B, and C are connected, which is first minterm of function X. For second AND gate (from left), A, B, and C are connected, which forms ABC. Similarly for A'B'C', and AB'C. Once the minterms are implemented. Now we have to combine them using OR gates to the functions X, and Y.

pla_f.JPG.jpg

One application of a PLA is to implement the control over a data path. It defines various states in an instruction set, and produces the next state (by conditional branching).

Note that the use of the word "Programmable" does not indicate that all PLAs are field-programmable; in fact many are mask-programmed during manufacture in the same manner as a ROM. This is particularly true of PLAs that are embedded in more complex and numerous integrated circuits such as microprocessors. PLAs that can be programmed after manufacture are called FPLA (Field-programmable logic array).



Random Access Memory (RAM) is a type of computer data storage. Its mainly used as main memory of a computer. RAM allows to access the data in any order, i.e random. The word random thus refers to the fact that any piece of data can be returned in a constant time, regardless of its physical location and whether or not it is related to the previous piece of data. You can access any memory cell directly if you know the row and column that intersect at that cell.
    Most of the RAM chips are volatile types of memory, where the information is lost after the power is switched off. There are some non-volatile types such as, ROM, NOR-Flash.

SRAM: Static Random Access Memory
SRAM is static, which doesn't need to be periodically refreshed, as SRAM uses bistable latching circuitry to store each bit. SRAM is volatile memory. Each bit in an SRAM is stored on four transistors that form two cross-coupled inverters. This storage cell has two stable states which are used to denote 0 and 1. Two additional access transistors serve to control the access to a storage cell during read and write operations. A typical SRAM uses six MOSFETs to store each memory bit.
    As SRAM doesnt need to be refreshed, it is faster than other types, but as each cell uses at least 6 transistors it is also very expensive. So in general SRAM is used for faster access memory units of a CPU.

DRAM: Dynamic Random Access Memory
In a DRAM, a transistor and a capacitor are paired to create a memory cell, which represents a single bit of data. The capacitor holds the bit of information. The transistor acts as a switch that lets the control circuitry on the memory chip read the capacitor or change its state. As capacitors leak charge, the information eventually fades unless the capacitor charge is refreshed periodically. Because of this refresh process, it is a dynamic memory.
    The advantage of DRAM is its structure simplicity. As it requires only one transistor and one capacitor per one bit, high density can be achieved. Hence DRAM is cheaper and slower, when compared to SRAM.

Other types of RAM

FPM DRAM: Fast page mode dynamic random access memory was the original form of DRAM. It waits through the entire process of locating a bit of data by column and row and then reading the bit before it starts on the next bit.

EDO DRAM: Extended data-out dynamic random access memory does not wait for all of the processing of the first bit before continuing to the next one. As soon as the address of the first bit is located, EDO DRAM begins looking for the next bit. It is about five percent faster than FPM.

SDRAM: Synchronous dynamic random access memory takes advantage of the burst mode concept to greatly improve performance. It does this by staying on the row containing the requested bit and moving rapidly through the columns, reading each bit as it goes. The idea is that most of the time the data needed by the CPU will be in sequence. SDRAM is about five percent faster than EDO RAM and is the most common form in desktops today.

DDR SDRAM: Double data rate synchronous dynamic RAM is just like SDRAM except that is has higher bandwidth, meaning greater speed.

DDR2 SDRAM: Double data rate two synchronous dynamic RAM. Its primary benefit is the ability to operate the external data bus twice as fast as DDR SDRAM. This is achieved by improved bus signaling, and by operating the memory cells at half the clock rate (one quarter of the data transfer rate), rather than at the clock rate as in the original DDR SRAM.


Every flip-flop has restrictive time regions around the active clock edge in which input should not change. We call them restrictive because any change in the input in this regions the output may be the expected one (*see below). It may be derived from either the old input, the new input, or even in between the two. Here we define, two very important terms in the digital clocking. Setup and Hold time. 
  • The setup time is the interval before the clock where the data must be held stable.
  • The hold time is the interval after the clock where the data must be held stable. Hold time can be negative, which means the data can change slightly before the clock edge and still be properly captured. Most of the current day flip-flops has zero or negative hold time. 

sh1_f.JPG.jpg

In the above figure, the shaded region is the restricted region. The shaded region is divided into two parts by the dashed line. The left hand side part of shaded region is the setup time period and the right hand side part is the hold time period. If the data changes in this region, as shown the figure. The output may, follow the input, or many not follow the input, or may go to metastable state (where output cannot be recognized as either logic low or logic high, the entire process is known as metastability). 

sh2_f.JPG.jpg

The above figure shows the restricted region (shaded region) for a flip-flop whose hold time is negative. The following diagram illustrates the restricted region of a D flip-flop. D is the input, Q is the output, and clock is the clock signal. If D changes in the restricted region, the flip-flop may not behave as expected, means Q is unpredictable. 

setup_hold_f.JPG.jpg

To avoid setup time violations: 
  • The combinational logic between the flip-flops should be optimized to get minimum delay.
  • Redesign the flip-flops to get lesser setup time.
  • Tweak launch flip-flop to have better slew at the clock pin, this will make launch flip-flop to be fast there by helping fixing setup violations.
  • Play with clock skew (useful skews).
To avoid hold time violations: 
  • By adding delays (using buffers).
  • One can add lockup-latches (in cases where the hold time requirement is very huge, basically to avoid data slip).
* may be expected one: which means output is not sure, it may be the one you expect. You can also say "may not be expected one". "may" implies uncertainty. Thanks for the readers for their comments. 



System-on-a-chip (SoC) refers to integrating all components of an electronic system into a single integrated circuit (chip). A SoC can include the integration of:
  • Ready made sub-circuits (IP)
  • One or more microcontroller, microprocessor or DSP core(s)
  • Memory components
  • Sensors
  • Digital, Analog, or Mixed signal components
  • Timing sources, like oscillators and phase-locked loops
  • Voltage regulators and power management circuits
The blocks of SoC are connected by a special bus, such as the AMBA bus. DMA controllers are used for routing the data directly between external interfaces and memory, by-passing the processor core and thereby increasing the data throughput of the SoC. SoC is widely used in the area of embedded systems. SoCs can be fabricated by several technologies, like, Full custom, Standard cell, FPGA, etc. SoC designs are usually power and cost effective, and more reliable than the corresponding multi-chip systems. A programmable SoC is known as PSoC.

Advantages of SoC are:
  • Small size, reduction in chip count
  • Low power consumption
  • Higher reliability
  • Lower memory requirements
  • Greater design freedom
  • Cost effective
Design Flow

SoC consists of both hardware and software( to control SoC components). The aim of SoC design is to develop hardware and software in parallel. SoC design uses pre-qualified hardware, along with their software (drivers) which control them. The hardware blocks are put together using CAD tools; the software modules are integrated using a software development environment. The SoC design is then programmed onto a FPGA, which helps in testing the behavior of SoC. Once SoC design passes the testing it is then sent to the place and route process. Then it will be fabricated. The chips will be completely tested and verified.



Why Reset?

A Reset is required to initialize a hardware design for system operation and to force an ASIC into a known state for simulation.

A reset simply changes the state of the device/design/ASIC to a user/designer defined state. There are two types of reset, what are they? As you can guess them, they are Synchronous reset and Asynchronous reset.

Synchronous Reset

A synchronous reset signal will only affect or reset the state of the flip-flop on the active edge of the clock. The reset signal is applied as is any other input to the state machine.

Advantages:
  • The advantage to this type of topology is that the reset presented to all functional flip-flops is fully synchronous to the clock and will always meet the reset recovery time.
  • Synchronous reset logic will synthesize to smaller flip-flops, particularly if the reset is gated with the logic generating the d-input. But in such a case, the combinational logic gate count grows, so the overall gate count savings may not be that significant.
  • Synchronous resets provide some filtering for the reset signal such that it is not effected by glitches, unless they occur right at the clock edge. A synchronous reset is recommended for some types of designs where the reset is generated by a set of internal conditions. As the clock will filter the logic equation glitches between clock edges.
Disadvantages:
  • The problem in this topology is with reset assertion. If the reset signal is not long enough to be captured at active clock edge (or the clock may be slow to capture the reset signal), it will result in failure of assertion. In such case the design needs a pulse stretcher to guarantee that a reset pulse is wide enough to be present during the active clock edge.
  • Another problem with synchronous resets is that the logic synthesis cannot easily distinguish the reset signal from any other data signal. So proper care has to be taken with logic synthesis, else the reset signal may take the fastest path to the flip-flop input there by making worst case timing hard to meet.
  • In some power saving designs the clocked is gated. In such designed only asynchronous reset will work.
  • Faster designs that are demanding low data path timing, can not afford to have extra gates and additional net delays in the data path due to logic inserted to handle synchronous resets.
Asynchronous Reset

An asynchronous reset will affect or reset the state of the flip-flop asynchronously i.e. no matter what the clock signal is. This is considered as high priority signal and system reset happens as soon as the reset assertion is detected.

Advantages:
  • High speeds can be achieved, as the data path is independent of reset signal.
  • Another advantage favoring asynchronous resets is that the circuit can be reset with or without a clock present.
  • As in synchronous reset, no work around is required for logic synthesis.
Disadvantages:
  • The problem with this type of reset occurs at logic de-assertion rather than at assertion like in synchronous circuits. If the asynchronous reset is released (reset release or reset removal) at or near the active clock edge of a flip-flop, the output of the flip-flop could go metastable.
  • Spurious resets can happen due to reset signal glitches.
Conclusion

Both types of resets have positives and negatives and none of them assure fail-proof design. So there is something called "Asynchronous assertion and Synchronous de-assertion" reset which can be used for best results. (which will be discussed in next post).