Intro: Fundamentals, Transistors, Gates
The Transformation Hierarchy
Traditionally, computer architecture has focused on the interface between software and hardware. We write software that is then executed on hardware, and there needs to be an interface through which that software is translated to run on the hardware. At the underlying level, this software/hardware interface is implemented through micro-architecture.
Starting with the problem statement and aiming to control electrons to behave as we want them to, the above picture shows the typical steps involved. The orange highlight shows where in this hierarchy computer architecture plays a role.
However, as modern systems evolve and the emphasis on speed, timing, efficiency, and power consumption increases, the scope of computer architecture has expanded to span everything from algorithm design down to device design. This means that computer architecture should take into consideration aspects of algorithm design, programming languages, instruction set architecture (ISA), logic, and device design to account for improved efficiency.
To understand what we mean by co-design across the hierarchy, from algos to devices, consider an example where we have customized hardware to run a specific algorithm, say vector multiplication: that would improve performance drastically. The idea is to customize the algo for the HW and customize the HW for the algo.
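As a rough illustration of that idea (my own sketch, not from the lecture), here is the same vector-multiplication algorithm written two ways: a scalar loop, which is the form a simple general-purpose core executes one multiply at a time, and a "batched" version that exposes the parallelism a customized SIMD/vector unit would exploit in a single instruction.

```python
def vec_mul_scalar(a, b):
    """One multiply per step: the shape a simple in-order core executes."""
    out = []
    for i in range(len(a)):
        out.append(a[i] * b[i])
    return out

def vec_mul_batched(a, b, lanes=4):
    """Process `lanes` elements per step, mimicking a vector/SIMD unit."""
    out = []
    for i in range(0, len(a), lanes):
        # On real vector HW this whole slice would be one instruction.
        out.extend(x * y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
    return out

a, b = [1, 2, 3, 4, 5, 6, 7, 8], [8, 7, 6, 5, 4, 3, 2, 1]
assert vec_mul_scalar(a, b) == vec_mul_batched(a, b)
```

The algorithm is unchanged; only its mapping onto the (hypothetical) hardware differs, which is exactly the co-design knob being described.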
In this course we will study something like the below:
Focusing back on the hierarchy, we start with a problem (because that's what computers are for: to solve problems).
The next step is to define an algorithm. An algo is basically a step-by-step procedure that is guaranteed to terminate (finite), where each step is precisely defined (definite) and can be carried out by a computer (computational).
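A classic example that satisfies all three properties (my example, not from the lecture) is Euclid's GCD algorithm:

```python
def gcd(a, b):
    """Euclid's algorithm: finite (b strictly shrinks each iteration),
    definite (each step is a precisely defined remainder operation),
    and computational (a computer can carry it out directly)."""
    while b != 0:
        a, b = b, a % b  # one precisely defined step
    return a

print(gcd(48, 36))  # 12
```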
After algo comes programming. This step involves making a computer process the algo using a language it understands.
Then comes system software (VM, OS, MM). This sits on the ISA and will be dealt with later. As for the ISA - it is like a contract between SW and HW, and is something a programmer assumes the HW to satisfy.
Micro-architecture is an implementation of the ISA. The same ISA can be implemented in a thousand different ways. To attain better performance, it's easier to pick a micro-architecture as per requirement than to modify the ISA - because a lot of SW is sitting on the ISA, and it isn't feasible to modify all of it.
Logic constitutes the building blocks of micro-architecture - the logic gates, etc.
Computer Architecture
"Essentially, computer arch is the art and science of designing computing platforms
to achieve a set of design goals"
Examples of design goals could be:
- best performance on loads X, Y, and Z.
- longest battery life in a low form factor (fits in your pocket), costing less than Rs.20k/-.
- best average performance across all loads at the best performance/cost ratio.
Designing a super comp is different from designing a smartphone, but the underlying principles are similar.
Studying comp arch:
- Enables better systems
- Enables new applications
- Helps us understand why computers work the way they do
Side Note: Quantum computing is one area currently, where we have fancy HW but don't have a friendly SW stack. If we could have something in Quantum Computing similar to what we have for Machine Learning, that'd be a leap.
One can develop:
- better HW if one understands SW.
- better SW if one understands HW.
- better computing systems if one understands both HW and SW.
This course covers the SW/HW interface and micro-architecture, focusing on the tradeoffs and how they affect SW.
What is a Computer
What we will cover
- Logic part
- Combinational logic design
- HW description languages (verilog)
- Sequential logic design
- Timing and verification
- Micro arch
- Micro arch fundamentals
- Single cycle micro archs
- Multi-cycle and micro-programmed archs
- Pipelining
  - Issues in pipelining: dependence handling, state maintenance and recovery, etc.
- Branch prediction
- Out-of-Order execution
  - Superscalar execution
- SW/HW interface
- ISA (MIPS and LC3b as examples)
- Assembly language
- From Algo to Logic
- Memory Technology and Organisation
- Memory Hierarchy
- Caches
- Multi-core caches
- Prefetching
- Virtual Memory
- Processing paradigms
- Data flow at ISA level
- Superscalar execution
- VLIW (Very Long Instruction Words)
- Decoupled Access-Execute
- Systolic arrays
- SIMD (Single instruction multiple data) processing of vectors and arrays
- GPUs
General purpose vs Special purpose systems
GPUs are both general purpose and special purpose. Initially they were special purpose- for graphics. Over time, they became more general purpose.
Transistors
We also cover how logic gates are interconnected to form the larger units needed to construct a computer.
MOS transistors: a combination of conductor (Metal), insulator (Oxide), and semiconductor
These are useful because we can combine many of these to realize simple logic gates. The electrical properties of MOS are below the lowest level of our abstraction (refer section 1.6 and 1.7 from Harris and Harris).
Depending on the technology, high voltage can range from 0.3V to 3V
The above pic shows the general implementation of inverter logic.
Pull-up and pull-down logic ON at the same time means there is a short circuit. Both OFF at the same time means there's a floating output.
MOS transistors are imperfect switches: pMOS passes 1 well but is poor at passing 0; nMOS is good at passing 0 but poor at passing 1. Hence the general implementations use pMOS pull-up and nMOS pull-down networks.
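The pull-up/pull-down behavior above can be sketched as a tiny Python model (my own illustration, not from the lecture). It resolves a static CMOS output, including the short-circuit and floating cases, and builds an inverter and a NAND gate from a pMOS pull-up network and an nMOS pull-down network.

```python
def cmos_gate(pull_up_on, pull_down_on):
    """Resolve the output node of a static CMOS gate.
    pull_up_on:   True if the pMOS network conducts (output tied to VDD).
    pull_down_on: True if the nMOS network conducts (output tied to GND).
    """
    if pull_up_on and pull_down_on:
        return "short"     # VDD-to-GND path: short circuit
    if not pull_up_on and not pull_down_on:
        return "floating"  # output connected to neither rail
    return 1 if pull_up_on else 0

def inverter(a):
    # A pMOS turns ON when its gate is 0; an nMOS turns ON when its gate is 1,
    # so exactly one network conducts for any valid input.
    return cmos_gate(pull_up_on=(a == 0), pull_down_on=(a == 1))

def nand(a, b):
    # Pull-up: two pMOS in parallel (conducts if either input is 0).
    # Pull-down: two nMOS in series (conducts only if both inputs are 1).
    return cmos_gate(pull_up_on=(a == 0 or b == 0),
                     pull_down_on=(a == 1 and b == 1))

assert [inverter(x) for x in (0, 1)] == [1, 0]
assert [nand(a, b) for a in (0, 1) for b in (0, 1)] == [1, 1, 1, 0]
```

Note how the complementary networks guarantee that, for valid inputs, exactly one of pull-up/pull-down is ON, which is why static CMOS avoids both the short and the floating output.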
Side note: Transistors in series are slower than those in parallel, since series resistances add. See sec 1.7.8 from H&H and read about pseudo-nMOS.
Dynamic power consumption: power used to charge capacitance as signals change (0 <--> 1). P = C × V² × f, where f is the charging frequency of the capacitance.
Static power consumption: power used when signals do not change. P = V × I_leakage.
Energy consumption: Power × time.
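Plugging some made-up numbers into these formulas (hypothetical values, chosen only to show the units work out):

```python
# Hypothetical values to exercise the power/energy formulas above.
C = 1e-12        # switched capacitance: 1 pF
V = 1.0          # supply voltage: 1 V
f = 2e9          # switching frequency: 2 GHz
I_leak = 1e-3    # leakage current: 1 mA
t = 1.0          # interval: 1 s

dynamic_power = C * V**2 * f                  # 2e-3 W
static_power = V * I_leak                     # 1e-3 W
energy = (dynamic_power + static_power) * t   # 3e-3 J

print(dynamic_power, static_power, energy)
```

The quadratic dependence on V is why lowering the supply voltage (L63's 0.3V-3V range) is such an effective power lever.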