G4 Architecture


CAD Project: Will Crusoe Choke on Apple?

Architecture Of The G4 Processor

The G4 processor was designed for both portable and desktop computing systems. This had a dramatic effect on its design: a 32-bit architecture (shown in the diagram below) combined with the aptly named 128-bit Velocity Engine. The 32-bit architecture provides 32-bit effective addresses, integer data types of 8, 16, and 32 bits, and floating-point data types of 32 and 64 bits.

G4 Hardware Design Diagram
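The operand widths and the 32-bit effective address described above can be checked with a few lines of Python. This is only a rough sketch: the names `OPERAND_FORMATS`, `operand_bits`, and `effective_address` are made up for illustration, and the wrap-around of an address sum to 32 bits is the usual behaviour of 32-bit address arithmetic rather than something taken from the G4 documentation.

```python
import struct

ADDRESS_BITS = 32                        # from the text: 32-bit effective addresses
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1

# struct format codes for the data types named in the text.
# The ">" prefix selects big-endian layout with standard (unpadded) sizes.
OPERAND_FORMATS = {
    "int8": ">b",
    "int16": ">h",
    "int32": ">i",
    "float32": ">f",
    "float64": ">d",
}


def operand_bits(name):
    """Return the width in bits of one of the operand types above."""
    return struct.calcsize(OPERAND_FORMATS[name]) * 8


def effective_address(base, offset):
    """Add a base and an offset, keeping only the low 32 bits (assumed wrap)."""
    return (base + offset) & ADDRESS_MASK


if __name__ == "__main__":
    for name in OPERAND_FORMATS:
        print(name, operand_bits(name), "bits")
    print(hex(effective_address(0xFFFFFFFC, 8)))
```

The mask in `effective_address` is what keeps the result inside the 4 GB range that a 32-bit effective address can reach.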

The G4 (MPC7400) processor is based on the G3 design. The standard architectural features were carried over from the G3 and improved where needed, so the G4 incorporates all of the features you would expect to find in a modern microprocessor, some of which are listed below. These are combined with a new technology called the Velocity Engine (also known as AltiVec technology), described later.

Some of the standard architectural features incorporated in the G4 are:

Branch processing unit
This unit processes one branch per clock cycle, fetches four instructions, and resolves two speculations. It incorporates a 512-entry branch history table (BHT) and a 64-entry, 4-way set-associative branch target instruction cache (BTIC).
Dispatch unit
Decode unit
Completion unit
This unit tracks instructions and completes a peak of two instructions per cycle. It also has an 8-entry completion buffer.
Fixed-point units (FXUs) that share 32 GPRs for integer operands
Three-stage floating-point unit and a 32-entry FPR file
System unit
Load/store unit
This unit incorporates the usual features, such as one-cycle load and store cache access, effective address generation, zero padding, and sign extension. It also provides internal floating-point conversion, sequencing for load/store multiple instructions, and support for big- and little-endian addressing and all of their variants.
Memory management unit
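To give a feel for what the branch processing unit's tables do, here is a minimal sketch of a 512-entry branch history table. The table sizes come from the list above, but everything else is an assumption made for illustration: the 2-bit saturating counter per entry, the way an address is mapped to an entry, and the names `BranchHistoryTable` and `simulate`. It is not a description of the actual G4 circuitry.

```python
BHT_ENTRIES = 512    # from the text: 512-entry branch history table
BTIC_ENTRIES = 64    # from the text: 64-entry BTIC
BTIC_WAYS = 4        # from the text: 4-way set associative
BTIC_SETS = BTIC_ENTRIES // BTIC_WAYS   # 64 entries / 4 ways = 16 sets


class BranchHistoryTable:
    """Toy BHT with one 2-bit saturating counter per entry (assumed scheme)."""

    def __init__(self, entries=BHT_ENTRIES):
        self.entries = entries
        # Counter values 0-1 predict "not taken", 2-3 predict "taken".
        # Every entry starts at 1 (weakly not taken).
        self.counters = [1] * entries

    def _index(self, address):
        # Assumed indexing: drop the two byte-offset bits of a 4-byte
        # instruction address, then wrap to the table size.
        return (address >> 2) % self.entries

    def predict(self, address):
        """Return True if the branch at this address is predicted taken."""
        return self.counters[self._index(address)] >= 2

    def update(self, address, taken):
        """Move the counter one step towards the actual outcome."""
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def simulate(outcomes, address=0x1000):
    """Count how many outcomes of a single branch are predicted correctly."""
    bht = BranchHistoryTable()
    correct = 0
    for taken in outcomes:
        if bht.predict(address) == taken:
            correct += 1
        bht.update(address, taken)
    return correct


if __name__ == "__main__":
    # A loop branch that is taken nine times and then falls through.
    loop = [True] * 9 + [False]
    print(simulate(loop), "of", len(loop), "predicted correctly")
```

In this model a loop branch is mispredicted only on its first pass and on the final fall-through, which is the kind of behaviour a history table is meant to capture.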

As you would expect from a modern microprocessor, the G4 (MPC7400) uses 4-stage pipelines. The processor is built around a pipelined superscalar PowerPC core capable of issuing three instructions per clock cycle into seven independent execution units (counting the two integer units separately), listed below.

1. Two integer units
2. Double-precision floating-point unit
3. Vector unit
4. Load/store unit
5. System unit
6. Branch processing unit
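The effect of issuing up to three instructions per cycle into these units can be sketched with a very simple model. The issue width and the unit counts are taken from the text above; the rules that instructions issue strictly in program order and that each unit accepts one new instruction per cycle are simplifying assumptions, and `UNITS` and `issue_cycles` are hypothetical names. The real G4 dispatch rules are more involved.

```python
ISSUE_WIDTH = 3   # from the text: three instructions per clock cycle

# Number of units of each kind, following the list above (seven in total).
UNITS = {
    "integer": 2,
    "float": 1,
    "vector": 1,
    "load_store": 1,
    "system": 1,
    "branch": 1,
}


def issue_cycles(program):
    """Return the cycles needed to issue a list of unit names in order.

    Assumptions: in-order issue, at most ISSUE_WIDTH instructions per
    cycle, and each unit takes at most one new instruction per cycle.
    """
    for unit in program:
        if unit not in UNITS:
            raise ValueError(f"unknown unit: {unit}")

    cycles = 0
    position = 0
    while position < len(program):
        free = dict(UNITS)   # every unit is available again each cycle
        issued = 0
        while position < len(program) and issued < ISSUE_WIDTH:
            unit = program[position]
            if free[unit] == 0:
                break        # the unit is taken, wait for the next cycle
            free[unit] -= 1
            issued += 1
            position += 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(issue_cycles(["integer", "integer", "load_store"]))
    print(issue_cycles(["float", "float", "float"]))
```

A mix of instructions that spreads across different units issues in fewer cycles than a run of instructions that all need the same unit, which is why having separate units matters.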

G4 Block Diagram
Apple Block Diagram

The pipelined superscalar core enables the G4 to execute several instructions in parallel, since each execution unit has its own pipeline. Combined with its use of simple instructions, this gives the G4 high efficiency and throughput for most applications. However, as the designers realised, having so many pipelines could cause problems with the consistency of instruction execution. This was overcome in two ways. Rename buffers allow the results of operations to be posted for future use by other instructions. A sophisticated branch processing unit (BPU) receives its instructions during the fetch stage and performs condition register (CR) lookahead operations to try to resolve branches early. Together these make the flow of instructions through the pipelines more consistent. The common pipeline structure of the G4 is summarized in the diagram below.

Common Pipeline For The G4
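The idea behind rename buffers can be illustrated with a short sketch. Each instruction that writes a register is given a fresh buffer, and later instructions that read that register are pointed at the buffer instead. The instruction format, the buffer names such as `rb0`, and the function `rename` are all invented for this example, and the sketch ignores the fact that a real processor has only a limited number of buffers and frees them as instructions complete.

```python
def rename(instructions):
    """Rewrite (dest, src, src, ...) tuples to use rename buffer tags.

    Each destination gets a new tag. Each source is replaced by the tag
    of the most recent earlier write to that register, or left as the
    architectural register name if nothing earlier wrote to it.
    """
    latest = {}      # architectural register -> tag holding its newest value
    renamed = []
    next_buffer = 0
    for dest, *sources in instructions:
        # Look up the sources before recording this instruction's own write.
        source_tags = tuple(latest.get(src, src) for src in sources)
        tag = f"rb{next_buffer}"
        next_buffer += 1
        latest[dest] = tag
        renamed.append((tag, *source_tags))
    return renamed


if __name__ == "__main__":
    program = [
        ("r1", "r2", "r3"),   # r1 = r2 op r3
        ("r4", "r1", "r5"),   # r4 = r1 op r5 (needs the result above)
        ("r1", "r6", "r7"),   # r1 is written again, independently
    ]
    for line in rename(program):
        print(line)
```

After renaming, the second instruction reads the buffer written by the first, while the third writes to a different buffer and so no longer has to wait for the first two, even though all three mention `r1`.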

To control the operation of the processor, the G4 uses a single clock input signal. Combined with its multi-cycle instructions, this gives an architecture that is well suited to processor-intensive tasks and applications.


Website Design Copyright 2000, Iain Gibson.
Last updated: January 30, 2001.