## Chapter 5

## The DEC PDP-8

## Introduction ${ }^{1}$

The PDP-8 is a single-address, 12 -bit-word computer of the second generation. It is designed for task environments with minimum arithmetic computing and small Mp requirements. For example, it can be used to control laboratory devices, such as gas chromotographs or sampling oscilloscopes. Together with special T's, it is programmed to be a laboratory instrument, such as a pulse height analyzer or a spectrum analyzer. These applications are typical of the laboratory and process control requirements for which the machine was designed. As another example, it can serve as a message concentrator by controlling telephone lines to which typewriters and Teletypes are attached. The computer occasionally stands alone as a small-scale general-purpose computer. Most recently it was introduced as a small-scale general-purpose timesharing system, based on work at Carnegie-Mellon University and DEC. It is used as a KT (display) when it has a P (display; '338); this C is discussed in Chap. 25. The PDP-8 has achieved a production status formerly reserved for IBM computers; about 5,000 have been constructed.

PDP-8 differs from the character-oriented 8 -bit computer in Chap. 10 ; it is not unlike the 16 -bit computers, such as the IBM 1800 in Chap. 33. The PDP-8 is typical of several 12-bit computers: the early CDC-160 series (1960), CDC-6600 Peripheral and Control Processor (Chap. 39), the SDS-92, M.I.T. Lincoln Laboratory's Laboratory Instrument Computer LINC (1963), Washington University's Programmed Console (1967), and the SCC 650 (1966).

The PDP-5 (transistor, I963), PDP-8 (1965), PDP-8/S (serial, 1966) and PDP-8/I (integrated circuit, 1968), PDP-8/L (integrated circuit, 1968) constitute a series of computers based on evolving technology. All of these have identical ISP's. Their PMS structures are nearly identical, and all components other than Pc and Mp are compatible throughout the series. The LINC-8-338 PMS structure is presented in Fig. 1. A cost performance tradeoff took place in the PDP-8 (parallel-by-word arithmetic) and PDP-8/S (serial-by-bit arithmetic) implementations. A PDP- $8 / \mathrm{S}$ is one-fifteenth of a PDP-8 at one-half the cost. The performance factors can be attributed to $8 / 1.5$ or 5.3 for Mp speed and a factor of about 3 for logical organization, even though the same 2 -megahertz logic clock is used in both cases. The PDP- 8 is about 6.7 times a PDP-5.
${ }^{1}$ The initials in the title stand for Digital Equipment Corporation Programmed Data Processor.

The ISP of the PDP-8 Pc is about the most trivial in the book. It has only a few data operators, namely, $\leftarrow,+,-$ (negate), $\neg$, $\wedge, / 2, \times 2$, (optional) $\times, /$, and normalize. It operates on words, integers, and boolean vectors. However, there are microcoded instructions, which allow compound instructions to be formed in a single instruction.

The computer is straightforward and illustrates the levels discussed in Chap. I. We can easily look at it from the "top down." The C in PMS notation is

C('PDP-8; technology:transistors; $12 \mathrm{~b} / \mathrm{w}$;
descendants:'PDP-8/S, 'PDP-8/I, 'PDP-8/L;
antecedents: 'PDP-5;
Mp(core; \#0:7; 4096 w ; tc: I. $5 \mu \mathrm{~s} / \mathrm{w}$ );
$\operatorname{Pc}(\operatorname{Mps}(2 \sim 4 \mathrm{w})$;
instruction length: $1 / 2 \mathrm{w}$
address/instruction:1;
operations on data/od:( $\leftarrow,+, \neg, \wedge,-$ (negate $), \times 2$,
/2, +1)
optional operations: $(\times, /$, normalize);
data-types:word, integer, boolean vector;
operations for data access:4);
P(display; '338);
P(c; 'LINC);
S('I/O Bus; I Pc; 64 K );
Ms(disk, 'DECtape, magnetic tape);
T(paper tape, card, analog, cathode-ray tube))

## ISP

The ISP is presented in Appendix I of this chapter (including the optional Extended Arithmetic Element/EAE). The $2^{12}$-word Mp is divided into 32 fixed-length pages of 128 words each. Address calculation is based on references to the first page, Page_0, or to the current page of the Program Counter/PC. The effectiveaddress calculation procedure provides for both direct and indirect reference to either the current page or the first page. This scheme allows a 7 -bit address to specify local page addresses.

A $2^{15}$-word Mp is available on the PDP-8, but addressing greater than $2^{12}$ words is comparatively inefficient. In the extended range, two 3-bit registers, the Program Field and Data Field Registers, select which of the eight $2^{12}$-word blocks are being actively addressed as program and data.

There is an array of eight registers, called the Auto_index registers, which resides in Page_0. This array (Auto_index[0: $\left.11]\langle 0: 7\rangle:=\mathrm{M}\left[10_{8}: 17_{8}\right]\langle 0: 1 \mathrm{I}\rangle\right)$ possesses the useful property that whenever an indirect reference is made to it, a I is first added


```
\({ }^{1} \mathrm{Mp}\) (core; \(1.5 \mathrm{us} / \mathrm{w}\); \(\left.4096 \mathrm{w}:(12+1) \mathrm{b}\right)\)
\({ }^{2}\) ('Memory Bus)
```



```
    antecedents: PDP-5; descendants; POP-85, PDP-81, PDP-L)
\({ }^{4} \mathrm{~S}(\mathrm{II/O}\) Bus; from; Pc; to; 64 K\()\)
\({ }^{5}\) K (1) \(\sim 4\) instructions; M.buffer (1 char~2 w))
```

Fig. 1. DEC LINC-8-338 PMS diagram.
to its contents. (That is, there is a side effect to referencing.) Thus, address integers in the register can select the next member of a vector or string for accessing.

The instruction-set-execution definition can also be presented as a decoding diagram or tree (Fig. 2). Here, each block represents an encoding of bits in the instruction word. A decoding diagram allows one more descriptive dimension than the conventional, linear ISP description, revealing the assignment of bits to the instruction. Figure 2 still requires ISP descriptions for Mp, Mps, the instruction execution, the effective-address calculation, and the interpreter. Diagrams such as Fig. 2 are useful in the ISP
design to determine which instruction numbers are to be assigned to names and operations and instructions which are free to be assigned (or encoded).

There are eight basic instructions encoded by 3 bits, that is op $\langle 0: 2\rangle:=\mathrm{i}\langle 0: 2\rangle$, where instruction/i $\mathbf{i}\langle 0: 11\rangle$. Each of the first six instructions (where $0 \leq$ op $<6$ ) have the 4 address operand determination modes (thus yielding essentially 24 instructions). The first six instructions are:
data transmission: deposit and clear-accumulator/dca two's complement add to the accumulator/tad


Fig. 2. DEC PDP-8 instruction-decoding diagram.
binary arithmetic:
binary boolean:
program control:
two's complement add to the accumulator/tad
and to the accumulator/and
jump/set program counter/jmp jump to subroutine/jms index memory and skip if results are zero/isz

Note that the add instruction, tad, is used for both data transmission and arithmetic.

The subroutine-calling instruction, jms, provides a method for transferring a link to the beginning (or head) of the subroutine. In this way arguments can be accessed indirectly, and a return is executed by a jump indirect instruction to the location storing the returned address. This straightforward subroutine-call mechanism, although inexpensive to implement, requires reentrant and recursive subroutine calls to be interpreted by software, rather than by hardware. A stack, as in the DEC 338 (Chap. 25), would be nicer.

The input_output instruction/iot $(:=o p=6)$ uses the remaining 9 bits of the instruction to specify instructions to input/ output devices. The 6 io_select bits select 1 of 64 devices. The 3 bits, io_pl_bit, io $\square_{\sqsupset} 2_{\hookrightarrow}$ bit, io_p4_bit, command the selected device by conditionally providing three pulses in sequence. The instructions to a typical io device are:

```
io_pl_bit \(\rightarrow\) (IO_skip_flag[io select \(] \rightarrow(\mathrm{PC} \leftarrow \mathrm{PC}+1))\)
    testing a condition of an IO device output to a device input
    from a device
```

```
io_p4_bit \(\rightarrow\) (Output_data[io select] \(\leftarrow A C\) )
io_p2_bit \(\rightarrow\) (AC \(\leftarrow\) Input_data[io select] \()\)
```

There are three microcoded instruction groups selected by $o p=7$. The instruction decoding diagram (Fig. 2) and the 1SP description (Appendix 1 of this chapter) show the microinstructions which can be combined in a single instruction. These instructions are: operate group $1(:=(o p=7) \wedge \neg i\langle 3\rangle)$ for operating on the processor state; operate group $2(:=\langle o p=7) \wedge(i\langle 3,11\rangle=$ $\left.10_{2}\right)$ ) for testing the processor state; and the extended arithmetic element group $\left(:=\left((\mathrm{op}=7) \wedge\left(\mathrm{i}\langle 3,11\rangle=11_{2}\right)\right\rangle\right)$ for multiply, divide, etc. Within each instruction the remaining bits, $\langle 4: 10\rangle$ or $\langle 4: 11\rangle$, are extended instruction (or opcode) bits; that is, the bits are microcoded to select instructions. In this way an instruction is actually programmed (or microcoded). For example, the instruc-
tion set_link $\rightarrow \mathrm{L} \leftarrow 1$ is formed by coding the two microinstructions, clear link, next, complement link.

$$
\begin{aligned}
\text { opr } \_1 \rightarrow & (\mathrm{i}\langle 5\rangle \rightarrow \mathrm{L} \leftarrow 0 ; \text { next } \\
& \mathrm{i}\langle 7\rangle \rightarrow \mathrm{L} \leftarrow \neg \mathrm{~L})
\end{aligned}
$$

Thus, in operate group 1, the instructions clear link, complement link, and set link are formed by coding instruction $\langle 5,7\rangle=10,01$, and 11, respectively. The operate group 2 instruction is used for testing the condition of the Pc state. This instruction uses bits 5, 6 , and 8 to code tests for the accumulator. The AC skip conditions are coded $(0 \sim 7)$ as never, always, $=0, \neq 0,<0, \geq 0, \leq 0$, and $>0$. If all the nonredundant and useful variations in the two operate groups were available as separate instructions in the manner of the first seven (dca, tad, etc.), there would be approximately $7+12\left(\right.$ opr $\left._{\checkmark} 1\right)+10\left(\right.$ opr $\left._{\bullet} 2\right)+6(\mathrm{EAE})=35$ instructions in the PDP-8.

The optional Extended Arithmetic Element/EAE includes additional Multiplier Quotient/MQ and Shift Counter/SC registers and provides the hardwired operations multiply, divide, logical shift left, arithmetic shift, and normalize. The EAE is defined on the last page of Appendix 1.

## The interrupt scheme

External conditions in the input/output devices can request that Pc be interrupted. Interrupts are allowed if (Interrupt_state $=1$ ). A request to interrupt clears Interrupt_state (Interrupt $b_{\square}$ state $\leftarrow 0$ ), and Pc behaves as though a jump to subroutine 0 instruction, jms 0 , had been given. A special iot instruction (instruction $=$ $6001{ }_{8}$ ) followed by a jump to subroutine indirect to 0 instruction (instruction $=5200_{8}$ ) returns Pc to the interruptable state with Interrupt_state $=1$. The program time to save M (processor state $/ \mathrm{ps}$ ) is 6 Mp accesses ( 9 mic oseconds), and the time to restore Mps is 9 Mp accesses ( 13.5 microseconds).

Only one interrupt level is provided in the hardware. If multiple priority levels are desired, programmed polling is required. Most io devices have to interrupt because they do not have a program-controlled enable switch for the interrupt. For multiple devices approximately 3 cycles ( $4.5 \mu \mathrm{~s}$ ) are required to poll each interrupter.

## PMS structure

The PMS structure of the LINC-8-338 consisting of a Pc('LINC), Pc('PDP-8), and P.display('338) is shown in Fig. 1. The PDP-8 is just a single Pc. The Pc('LINC) is a very capable Pc with more
instructions than the main Pc. It is available in the structure to interpret programs written for the C('LINC), a computer developed by M.I.T.'s Lincoln Laboratory as a laboratory instrument computer for biomedical and laboratory applications. Because of the rather limited ISP in Pc, one would hardly expect to find all the components present in Fig. 1 in an actual configuration.

The S between the Mp and the Pc allows eight Mp's. This S is actually $\mathrm{S}\left({ }^{\prime}\right.$ Memory Bus; 8 Mp ; I Pc; ( $P$ requests); time-multiplexed; $1.5 \mu \mathrm{~s} / \mathrm{w})$. Thus the switch makes Mp logically equivalent to a single $\mathbf{M p}(32768 \mathrm{w})$. There are two other L's which are connected to the Pc, excluding the T.console. They are $\mathrm{L}(\mathbf{I} / \mathrm{O}$ Bus) and L('Data Break; Direct Memory Access). These links become switches when we consider the physical structure. Associated with each device is a switch, and the bus links all the devices; the $\mathrm{L}(\mathbf{I} / \mathrm{O}$ Bus) is really an $\mathrm{S}(\mathbf{I} / \mathrm{O}$ Bus). Each time a K connects to it, the $S$ is included in the K. A simplified PMS diagram (Fig. 3) shows the structure and the logical-physical transformation. Thus, the I/O Bus is
$\mathrm{S}($ 'I/O Bus; duplex; bus; time-multiplexed, $1 \mathrm{Pc} ; 64 \mathrm{~K} ; P c$ controlled, $K$ requests; t:4.5 $\mu \mathrm{s} / \mathrm{w}$ )

The $\mathrm{S}(\mathrm{I} / \mathrm{O}$ Bus) is the same for the PDP-5, $8,8 / \mathrm{S}, 8 / \mathrm{I}$, and $8 / \mathrm{L}$. Hence, any K cau be used on any of the above C's. The I/O Bus is the link to the K's for Pc-controlled data transfers. Each word transferred is designated by a Pc instruction. However, the I/O Bus allows a $K$ to request Pc's attention via the interrupt request signal. The Pc polls the K's to find the requesting K if multiple interrupt requests occur. A detailed structure of the Pc-Mp (Fig. 4) shows these L(I/O Bus, 'Data Break) connections to the registers and control in the notation used by DEC. This diagram is essentially a functional block diagram.

The $\mathrm{S}(\mathrm{I} / \mathrm{O}$ Bus) in Fig. 1 is only an abstract representation of


Fig. 3. DEC PDP-8 PMS diagram (simplified).
the structure. Since it is a bus structure, the S can be expanded into L's and simple S's as shown in Fig. 3. The termination of the L in Pc is given in Fig. 3. The corresponding logic at a K is given in Fig. 5 in terms of logic design elements (AND's and OR's). (Fig. 5 also shows the $S(1 / \mathrm{O}$ Bus) structure of Figs. 1 and 3). The operation of $\mathrm{S}(\mathrm{I} / \mathrm{O}$ Bus) shown in Fig. 5 starts when Pc sends a signal to select (or address) a particular K, using the IO_select $\langle 0: 5\rangle$ signals to form a 6-bit code to which $K$ responds. Each K is hardwired to respond to a unique code. The local control, K[j], select signal is then used to form three local commands when ANDed with the three iot command lines from Pc, io_pl_bit, io_p2_bit, and io_p4_bit. Twelve data bits are transmitted either to or from Pc, indirectly under K's control. This is accomplished by using the AND-OR gates in K for data input to Pc , and the AND gate for data input to K . The data lines are connected to AC as shown in Fig. 4. A single skip input is used so that Pe can test a status bit in K. A K communicates to Pc via the interrupt request line. Any K wanting attention simply ORs its request signal into the interrupt request signal. Program polling in Pc then selects the specific interrupter. Normally, the K signal causing an interrupt is also connected to the skip input.

The L('Data Break; Direct Memory Access) provides a direct access path for a P or K to Mp via Pc. The number of access ports to memory can be expanded to eight by using the S('DM0I Data Multiplexer). The S is requested from a P or K . The P or K supplies an Mp address, a read or write access request, and then either accepts or supplies data for the Mp accessed word. In the configuration (Fig. I), P('LINC) and P('338) are connected to S('DM0I) and make requests to Mp for both their instructions and data in the same way as the Pc. The global control of these processor programs is via the $\mathrm{S}($ 'I/O Bus). The Pc issues start and stop commands, initializes their state, and examines their final state when a program in the other P halts or requires assistance.

When a $K$ is connected to L('Data Break) or to S('DM01 Data Multiplexer), the K only accesses Mp for data. The most complex function these K's carry out is the transfer of a complete block of data between the Mp and an Ms or a T , for example, K('DECtape, disk). A special mode, the three-cycle data break, is controlled by Pc so that a K may request the next word from a queue in Mp. In this mode the next word is taken from the queue (block) in Mp, and a counter is reduced each time K makes a request. With this scheme, a word transfer takes three Mp cycles: one to add one to the block count, one to add one to the address pointer, and one to transmit the word.

The DECtape was derived from M.I.T.'s Lincoln Laboratory LINCtape unit. Data are explicitly addressed by blocks (variable


Fig. 4. DEC PDP-8 timing and control-element block diagram.
(Courtesy of Digital Equipment Corporation.)


Fig. 5. DEC PDP-8 S('I/O Bus) logic and PMS diagrams.
but by convention 128 w ). Thus information in a block can be replaced or rewritten at random. This operation is unlike queueaccessed tape (conventional IBM format magnetic tape) in which data can be appended only to the end of a file.

The control for the $T$ (telephone) links 64 Teletypes or typewriters to the Pc. The final K which connects to a line is on a bit-serial basis. Since a telephone line sends and receives informa-
tion serially by bit, there are special input/output instructions in the Pc to sample the line and to convert the sampled bits to coded characters. There are 11 bits transmitted per character (although other codings use $7,7.42,7.5$, and 10 bits per character). Of the 11 bits, there are 3 control, 1 parity, and 7 information bits. The action of the Pc instruction, which is issued $5 \times 11(55)$ times for every character, is to control the line by forming the 7 -bit characters. The instruction is a good example of tradeoff in the hardware/software domain toward almost pure software; the only hardware state associated with a telephone line is a 1-bit register to hold the state of the outgoing line, and a single AND gate to sample the incoming line state. This sampling process requires about 0.3 per cent of $\mathrm{Pc}-\mathrm{Mp}$ capacity per active line (each of $10 \sim 15$ char/s). In general, the PDP-8 hardware controls are minimal-in turn fairly elaborate control programs must be used as part of them.

## Computer levels

In this section we describe all the systems levels in the PDP-8 computer from the top down. The reader should already have a sketchy knowledge of the PDP-8 because the registers and ISP have been exposed. Here, we wish to clarify how it operates. A map of the hierarchy is given in Fig. 6, starting from PMS to ISP and down through logic design to circuit electronics. These description levels are subdivided to provide more organizational detail. For example, the register-transfer level has the more detailed registers, data operators, functional units, and macro logic of the processor, whereas the next logic level below has sequential and combinational networks, and the sequential and combinatorial elements.

It should be apparent that the relationship of the various description levels constitutes a tree structure where the organizationally complex computer is the top node and each descending description level represents increasing detail (or smaller component size), until the final circuit element level is reached. For simplicity, only a few of the many possible paths through the structural description tree are illustrated. For example, the path showing mechanical parts is missing. The path shown proceeds from the PDP-8 computer to the processor and from there to the arithmetic unit or, more specifically, to the AC register of the arithmetic unit. Next, the macro logic implementing the registertransfer operations and functions for the jth bit of the AC is given; the flip-flops and gates needed for this particular implementation are shown. Finally, on the last segment of the path, come the electronic circuits and components of which flip-flops and NAND gates are constructed.


Fig. 6. DEC PDP-8 hierarchy of descriptions.

## Abstract representations

Figure 6 also lists some of the methods used to represent the physical computer abstractly at the different description levels. As mentioned previously, only a small part of the PDP-8 description tree is represented here. The many documents, schematics, diagrams, etc., which constitute the complete representation of even this small computer include logic diagrams, wiring lists, circuit schematics and printed-circuit board layout masks, production description diagrams, production parts lists, testing specifications, programs for testing and diagnosing faults, and manuals for modification, production, maintenance, and use. As the discussion continues down the abstract description tree, the reader will observe that the tree conveniently represents the constituent objects of each level and their interconnection at the next highest level. Each level in the abstract-description tree will be described in order.

## The PMS level

The simplified PMS structure in Fig. 3 has been reduced from Fig. I. The computer is small enough so that the physical delineation of the PMS components, such as K's and S's, is less pronounced than in larger systems. In fact, in the case of the S('Memory Bus, 'I/O Bus), the S's are actually within the K and

Mp, as shown in Fig. 5. The implementation of these switches within the K and Mp was shown in Fig. 5. In Fig. 7 we present a more conventional functional diagram and the equivalent PMS diagram of the computer, with Pc decomposed into K , processor state ( Mps ), and D . The functional diagram has the same components of the characteristic elementary computer model, namely, $\mathrm{K}, \mathrm{D}, \mathrm{M}$, and T (input, output). These figures give a somewhat general idea of what processes can occur in the computer, and how information flows, but it is apparent that at least another level is needed to describe the internal structure and behavior of the Mp and Pc. We should look at these primitives (although still together as a C) at the register-transfer level.

## Programming level (ISP)

The ISP interpretation is given in Appendix I of this chapter and is the specification of the programming machine. In addition, it constrains the physical machine's behavior to have a particular ISP. The ISP has been discussed earlier in the chapter.

## Register-transfer level

The $C$ can also be represented at the register-transfer level by using PMS. Figure 4 (by DEC) shows the register-transfer level;


Fig. 7. DEC PDP-8 function block and PMS diagrams. (a) Processor functional block diagram. (b) Pc PMS diagram.


Fig. 8. DEC PDP-8 register-transfer-level PMS diagram.
only registers, operations, and L's are important at this level. We still lack information about the conditions under which operations are evoked. Figure 8 is a PMS diagram of Pc-Mp registers. Here we show considerably more detail (although we do not bother with electrical pulse voltages and polarities) than in Fig. 4. We declare the Pc state (including the temporary register) within Pc. The figure also gives the permissible data operations, $D$, which are permitted on the registers. It should be clear from this that the logical design level for the registers and the operators can easily be reached. The $K$ logic design cannot be reached until we use the programming level constraints (ISP), thus defining the conditions for evoking the data operators.

The core memory. The Mp structure is given in Fig. 8. A more detailed block diagram which shows the core stack with its twelve
$64 \times 64$ 1-bit core planes is needed. Such a diagram, though still a functional block diagram, takes on some of the aspects of a circuit diagram because a core memory is largely circuit-level details. The Mp (Fig. 9) consists of the component units: the two address decoders (which select 1 each of 64 outputs in the X and Y axis directions of the coincident current memory); selection switches (which transform a coincident logic address into a highcurrent path to switch the magnetic cores); the 12 inhibit drivers (which switch a high current or no current into a plane when either a 0 or 1 is rewritten); 12 sense amplifiers (which take the induced low sense voltage from a selected core from a plane being switched or not switched and transform it into a 1 or 0 ); and the core stack, an array $\mathrm{M}\left[0: 7777_{8}\right]\langle 0: 11\rangle$. Since this is the only time the Mp is mentioned, Fig. 9 also includes the associated circuitlevel hardware needed in the core-memory operation, such as
power supplies, timing, and logic signal level conversion amplifiers. The timing signals are generated within $\mathrm{Pc}(\mathrm{K})$ and are shown together with Pc's clock in Fig. 10.

The process of reading a word from memory is:
1 A 12-bit selection address is established on the MA〈0:11〉 address lines, which is 1 of $10000_{8}$ (or $4096_{10}$ ) unique numbers. The upper 6 bits, $\langle 0: 5\rangle$, select 1 of 64 groups of $Y$ addresses and the lower 6 bits, $\langle 6: 11\rangle$, select 1 of 64 groups of X addresses.

2 The read logic signal is made a 1 .
3 A high-current path flows via the X and Y selection switches. In each of the $\mathbf{X}$ and $Y$ directions $64 \times 12$ cores
have selection current. Only one core in each plane is selected since $\mathrm{Ix}=\mathrm{I} y=\mathrm{I} s w i t c h i n g / 2$, and the current at the selected intersection $=I x+1 y=1$ switching.

4 If a core is switched to 0 (by having Iswitching amperes through it), then a 1 was present and is read at the output of the plane (bit) sense amplifiers. A sense amplifier receives an input from a winding that threads every core of every bit within a core plane [0:77778]. All 12 cores of the selected word are reset to 0 . The sense time at which the sense amplifier is observed is tms (memory strobe), and the strobe in effect creates $\mathrm{MB} \leftarrow \mathrm{M}[\mathrm{MA}]$.

5 The read current is turned off.


Fig. 9. DEC PDP-8 four-wire coincident current (three dimensions) core-memory-logic block diagram.


Fig．10．DEC PDP－8 clock and memory timing diagram．

6 The write and inhibit logic signals are tumed on．The bit inhibit signal is present or not，depending on whether a 0 or I，respectively，is written into a bit．
7 A high－current path flows via the X and Y selection switches，but in an opposite direction to the read case（2 above）．If a $I$ is written，no inhibit current is present，and the net current in the selected core is－Iswitching．If a 0 is written，the current is－Iswitching + （Iswitching／2） and the core remains reset．
8 The inhibit and write logic signals are tumed off，and the memory cycle is completed．

Registers and operations．As Fig． 8 shows，the registers in the Pc cannot be uniquely assigned to a single function．In a minimal machine such as the PDP－8，functional separation is not economi－ cal．Thus there are not completely distinct registers and transfer paths for memory，arithmetic，and program and instruction flow． （This sharing complicates understanding of the machine．）How－ ever，Fig． 8 clarifies the structure considerably by defining all the registers in Pc（including temporaries）．For example，the Memory Buffer／MB is used to hold the word being read from or written to Mp．MB also holds one of the operands for binary operations（for example， $\mathrm{AC} \leftarrow \mathrm{AC} \wedge \mathrm{MB}$ ）． MB is also used as an extension of the Instruction Register／IR during the instruction interpretation．

The additional registers，not in the ISP，are：

Memory Buffer／MB〈0：Il $\rangle$

Memory Address／MA〈0：11〉

Instruction Register／IR〈0：2＞
holds memory data， instruction，and oper－ ands
holds address of word in Mp being accessed holds the value of current instruction being performed

| State＿register $_{3}$ | aternary state register <br> holding the major <br> state of memory cycle <br> being performed |
| :--- | :--- |
| Fetch $/ \mathrm{F}:=($ State＿register $=0)$ | memory cycle to <br> fetch instruction |
| Defer／D／Indirect <br> $:=($ State＿register $=1)$ | memory cycle to get <br> address of operand |
| Execute／E | memory cycle to fetch |
| $:=($ State＿register $=2)$ | （store）operand and <br> execute the instruc－ <br> tion |

Figure 8 has been concerned with the static definition（or declaration）of the information paths，the operations，and state． The ISP interpretation（Appendix l）is the specification for the physical machine＇s behavior．As the temporary hardware registers are added，a more detailed ISP definition could be given in terms of time and temporary registers．Instead，we give a state diagram （Fig．11）to define the actual Pc which is constrained by both the ISP registers，the temporary registers implied by the implementa－ tion，and time．The relationship among the state diagram，the ISP description，and the logic is shown in the hierarchy of Fig．6．In the relationships of the figures，we observe that the ISP definition does not have all the necessary detail for fully defining a physical Pc．The physical Pc is constrained by actual hardware logic and lower－level details even at the circuit level．For example，a core memory is read by a destructive process and requires a temporary register（ MB ）to hold the value being rewritten．This is not repre－ sentable within a single ISP language statement since we define only the nondestructive transfer $\leftarrow$ ，but it can be considered as the two parallel operations $\mathrm{MB} \leftarrow \mathrm{M}[\mathrm{MA}] ; \mathrm{M}[\mathrm{MA}] \leftarrow 0$ ．The problem of explaining rewriting of core using ISP is also difficult， because explicit time is not in the ISP language（although we can define clock events，or at least relative time）．

The state diagram（Fig．11）describes the implementation be－ havior using the registers and register operations（Fig．8）and the temporary registers declared above．

The implementation is fundamentally Mp－timing－based，as we see from both the state diagram and the times when the four clock signals are generated（Fig．I0）．Thus there are three（State＿regis－ ter $=0,1,2) \times 4$（clock），that is， 12 major states，in the implemen－ tation．We use the IR to obtain two more states，F2b and F3b，


Fig. 11. DEC PDP-8 Pc state diagram.
for the description. The State_register values 0 , I, and 2 correspond to fetching, deferring (indirect addressing, i.e., fetching an operand address), and executing (fetching or storing data, then executing) the instruction. The state diagram does not describe the Extended Arithmetic Element/EAE operation, the interrupt state, and the data break states (these add 12 more states). The initialization procedure, including the T.console state diagram, is also not given. One should observe that when t2 occurs at the beginning of the memory cycle, a new State_register value is selected. The State_register value is always held for the remainder of the cycle; i.e., only the sequences $(\mathrm{F} 0 \rightarrow \mathrm{~F} 1 \rightarrow \mathrm{~F} 2 \rightarrow \mathrm{~F} 3$ or $\mathrm{D} 0 \rightarrow \mathrm{D} 1 \rightarrow \mathrm{D} 2 \rightarrow \mathrm{D} 3$ or $\mathrm{E} 0 \rightarrow \mathrm{E} 1 \rightarrow \mathrm{E} 2 \rightarrow \mathrm{E} 3$ ) are permitted.

Figure 8 alludes to $\mathrm{Pc}(\mathrm{K})$, that is, the sequential network used for controlling Pc. The inputs and the present state (including clocks) determine the operations to be issued on the registers.

## Logic design level (registers and data operations)

Proceeding from the register-transfer and ISP descriptions, the next level of detail is the logic module. Typical of the level is the 1-bit logic module for an accumulator bit, $\mathrm{AC}\langle\mathrm{j}\rangle$, illustrated in Fig. 12. The horizontal data inputs in the figure are to the logic module from $\mathrm{AC}\langle\mathrm{j}\rangle, \mathrm{MB}\langle\mathrm{j}\rangle$, IO Bus $\langle\mathrm{j}\rangle$, and Data_switch $\langle\mathrm{j}\rangle$. The vertical control signal inputs command the register operations (i.e., the transfers); they are labeled by their respective ISP operations (for example, $\mathrm{AC} \leftarrow \mathrm{MB} \wedge \mathrm{AC}, \mathrm{AC} \leftarrow \mathrm{AC} \times 2$ \{rotate\}). The sequential network $\mathrm{Pc}(\mathrm{K})$ (Fig. 8) generates these control signal inputs.

## Logic design level (Pc control, Pc(K) sequential network)

The output signals from the $\mathrm{Pc}(\mathrm{K})$ (Fig. 8) can be generated in a straightforward fashion by formulating the boolean expressions


Fig．12．DEC PDP． 8 AC〈J〉 bit register－transfer logic diagram．
＇ $\mathrm{AC}-0:=($
$(\uparrow 1 \wedge(I R=149) \wedge(\neg \mathrm{MB}<3>\wedge \mathrm{MB}<4>\wedge \neg \mathrm{MB}<6>) \wedge($ Stoteuregister $=0)) \vee$ $(+1 \wedge(I R=119) \wedge(M B<3>\wedge \neg M B<11>\wedge M B<4>) \wedge($ Stote＿register $=0)) \vee$ $(t 1 \wedge(I R=119) \wedge(M B<3>\wedge M B<19>\wedge M B<4>) \wedge(\text { Stoteuregister }=0))^{\prime} \vee$ $(+1 \wedge(I R=011) \wedge($ Stoteuregister $=2))$ ）
$(t 1 \wedge\{(($ Stote」register $=0) \wedge(1 R=111) \wedge M B<4>\wedge(M B<3>\vee \neg M B<6>)) \vee$ $(($ Stote＿register $=2) \wedge(I R=0 i 1)))$ Logic equotion for＇$A C-0$


Logic diogrom for $A C=0$
${ }^{1}$ This term is derived from EAE ond is not on the stote diogrom

Fig．13．DEC PDP－8 $\mathrm{Pc}(\mathrm{K}){ }^{\prime} \mathrm{AC} \leftarrow \mathbf{0}$ signal－logic equations and diagram．


Flip－flop circuit


Combinotoriol logic equivolent of flip－flop


Direct set－cleor flip－flop sequentiol logic element
Table of flip－flop input－output

| Outputs 1 | Table of flip－flop input－output |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | 0 | Direct set | Direct cleor | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 0 | 0 | 1 |
| 0 | 1 | 0 | 1 | 0 | 1 |
| 1 | 0 | 0 | 1 | 0 | 1 |
| 0 | 1 | 1 | 0 | 1 | 0 |
| 1 | 0 | ， | 0 | ， | 0 |

${ }^{1}$ Note；This is not on＂ideot＂sequentiol circuit element，becouse there is no deloy in the output．

Fig．14．DEC PDP－8 sequential－element circuit and logic diagrams．


Fig. 15. DEC PDP-8 combinational element circuit and logic diagrams.
directly from the state diagram in Fig. 11. For example, the $\mathrm{AC} \leftarrow 0$ control signal is expressed algebraically and with a combinatorial network in Fig. 13. Obviously these boolean output control signals are functions which include the clock, the State register, and the states of the arithmetic registers (for example, $A=0, L=0$, etc.). The expressions should be factored and minimized so as to reduce the hardware cost of the control for the interpreter. Although we are rather cavalier about $\mathrm{Pc}(\mathrm{K})$, it constitutes about one-half the logic within Pc .

## Circuit level

The final level of description is the circuits which form the logic functions of storage (flip-flops) and gating (NAND gates). Figures 14 and 15 illustrate some of these logic devices in detail.

In Fig. 14 a direct set and direct clear flip-flop, a sequentiallogic element, is described in terms of circuit implementation, combinational logic equivalent, a table of its behavior, and its algebraic behavior. Note that this is not an ideal element, because it has no delay and responds directly and immediately to an input. Some idealized sequential logic elements are used in the PDP-8 (but not illustrated), including the RS (Reset-Set), T(Trigger), JK, and D(Delay). A delay in the flip-flops makes them behave in the same way as the ideal primitives in sequentialcircuit theory. The outputs require a series delay, $\Delta t$, such that, if the inputs change at time $t$, the outputs will not change until $t+\Delta t$. ln fact, the PDP-8 uses capacitor-diode gates at the flipflop inputs to delay the inputs.

Figure 15 illustrates the combinatorial logic elements used in the PDP-8. The circuit selection is limited to the inverter circuit with single or multiple inputs. These are more familiarly called NAND gates or NOR gates, depending on whether one uses positive and/or negative logic-level definitions.

## Conclusion

We could continue to discuss the behavior of the transistor as it is used in these switching-circuit primitives but will leave that to books on semiconductor electronics and physics. lt is hoped that the student has gained a grasp of how to think about the hierarchical decomposition of computers into particular levels of analysis (and synthesis).

## APPENDIX 1 DEC PDP－8 ISP DESCRIPTION

```
                                    Appendix I
DEC PDP-8 ISP Description
Pc State
AC<0: 11>
L Link bit/AC extension for overflow and carmy
PC<0:11>
Run
Interruptsstate
10_pulse\lrcornerl; 10_pulse」2; 10\_pulse」A
Acoumulator
Program Counter
1 when Pc is interpreting instmctions or "muning"
1 when Pe can be intermupted; under programmed control
IO pulses to IO devices
Mp State
Extended memory is not included．
M［0：7777 \(]<0: 11>\)
Page＿ \(\left.\mathrm{O}\left[0: 177_{8}\right]<0: 11\right\rangle:=M\left[0: 177_{8}\right]\langle 0: 11\rangle\)
Autouinde \(\times[0: 7]<0: 11\rangle:=\) Page \(\left.0\left[10_{8}: 17_{8}\right]<0: 11\right\rangle\)
special array of directiy addressed memory registers special array when addressed indirectly，is incremented by 1
Pc Console State
Keys for start，stop，continue，examine（load from memory），and deposit（store in memomy）are not included．
Data switches \(\langle 0: 11\rangle\)
data entered via console
Instmetion Format
instruction／i＜0：11＞
\begin{tabular}{|c|c|c|}
\hline op＜0：2＞ & \(:=1<0: 2\rangle\) & op code \\
\hline indirect＿bit／ib & ：＝i＜3＞ & 0 ，direct； 1 indirect memory reference \\
\hline page＿0＿bit／p & \(:=1<4>\) & 0 selects page 0； 1 selects this page \\
\hline page waddress＜0：6＞ & \(:=1\langle 5: 11\rangle\) & \\
\hline this＿page \(\langle 0: 4>\) & \(:=P C^{\prime}\langle 0: 4\rangle\) & \\
\hline PC＇\(<0\) ：11＞ & \(:=(P C<0: 11>-1)\) & \\
\hline 10＿select＜0：5＞ & \(:=1<3: 8\rangle\) & selects a \(T\) or Ms device \\
\hline iouplubit & \(:=i<11\rangle\) & these 3 bits control the selective generation of -3 volts， \\
\hline  & \(:=\mathrm{i}\langle 10\rangle\) & 0.4 us pulses to I／O devices \\
\hline iosp4」it & ：＝i \(\langle\) ¢ \(\rangle\) & \\
\hline sma & \(:=1<5\rangle\) & \(\mu\) bit for skip on minus \(A C\) ，operate 2 group \\
\hline sza & \(:=1<6>\) & \(\mu\) bit for skip on zero \(A C\) \\
\hline snl & \(:=1<7\rangle\) & it bit for skip on non zero Link \\
\hline
\end{tabular}
Effective Address Calculation Process
\(z<0: 11>:=1\) effective \(\rightarrow i b \rightarrow z^{11}\) ；
ib \(\wedge\left(10_{8} \leq z^{\prime \prime} \leq 17_{8}\right) \rightarrow\left(M\left[z^{\prime \prime}\right] \leftarrow M\left[z^{\prime \prime}\right]+1\right.\) ；next \()\) ；auto indexing
\(\left.i b \rightarrow M\left[z^{\prime \prime}\right]\right)\)
\(z^{\prime}<0: 11>:=\left(-\right.\) ib \(\rightarrow z^{\prime \prime} ;\) ib \(\left.\rightarrow M\left[z^{\prime \prime}\right]\right)\)
\(\left.z^{\prime \prime}<0: 11\right\rangle:=\)（page， \(0_{L} b i t \rightarrow\) this＿pagegpage＿address；direct address
－page＿O＿bit \(\rightarrow\) Oapage＿address）
\(\mu\) microcoded instruction or instruction bit（s）within an instruction
```


## APPENDIX 1 DEC PDP-8 ISP DESCRIPTION (Continued)

```
Instmuction Interpretation Process
    Run ^\neg (Interrupt\_request ^ Interruptustate) }->\mathrm{ (
        instruction &M[PC];PC &PC + 1; next
        instruction_execution);
    Run ^ Interrupt_request ^ Interrupt_state }->\mathrm{ (
        M[0] \leftarrowPC; Interrupt_state }\leftarrow0;PC&&1
Instruction Set and Instmuction Execution Process
    Instruction_execution := (
        and (:= op = 0) ->(AC\leftarrowAC^M[z]);
        tad (:= op = 1) -> (LOAC & LDAC + M[z]);
        isz (:= op = 2) ->(M[z']}\leftarrowM[z] + 1; next
        (M[\mp@subsup{z}{}{\prime}]=0)->(PC&PC+1)):
        dca (:= OP=3)->(M[z]\leftarrowAC; AC \leftarrow0);
        dea (:= op = 3) ->(M[z]\leftarrowAC; AC \leftarrow0);
        jmp (:= OP = 5) ->(PC&z);
        iot (:= op = 6) }->\mathrm{ (
            io_pl_bit }->\mathrm{ 10_pulse山l }\leftarrow1\mathrm{ ; next
            iO_p2_bit }->\mathrm{ 10_pulse_2 }\leftarrow1\mathrm{ ; next
            ioLp4\llcornerbit }->\mathrm{ 10_pulse_4 4 & 1);
        opr (:= op = 7) ->Operate_execution
        )
```

no interrupt interpreter
fetch
execute
intermupt interpreter
logical and
two's complement add
the operate instruction $i$
end Instruction execution

Operate Instruction Set
The microprogramed operate instructions: operate group 2, operate group 2, and extended arithmetic are defined as a separate instmetion set.

Operatecexecution := (
cla $(:=i<4>=1) \rightarrow(A C \leftarrow 0)$;
opral $(:=1<3>=0) \rightarrow($
c11 $(:=1<5=1) \rightarrow(L \leftarrow 0)$; next
clear AC. Common to all operate instmutions.
operate group 1
cma $(:=i\langle 6\rangle=1) \rightarrow(A C \leftarrow \neg A C) ;$
$\mu$ clear link
$\operatorname{cml}(:=i\langle\neg\rangle=1) \rightarrow(L \leftarrow \neg L)$; next
4 complement $A C$
iac $(:=i<11\rangle=1) \rightarrow($ LDAC $\leftarrow$ LDAC +1$)$; next
$\mu$ complement $L$
ral $(:=i\langle 8: 10\rangle=2) \quad u$ increment $A C$
ral $(:=i\langle 8: 10\rangle=2) \rightarrow($ LDAC $\leftarrow \operatorname{LDAC} \times 2$ \{rotate\});
4. rotate left
rti $(:=i<8: 10\rangle=3) \rightarrow\left(\right.$ LIAC $\leftarrow \operatorname{LDAC} \times 2^{2}$ \{rotate $) ;$ \& rotate twice left
$\operatorname{rar}(:=\mathbf{i}\langle 8: 10\rangle=4) \rightarrow($ LDAC $\leftarrow \operatorname{LDAC} / 2$ \{rotate\} $) ; \quad 4$ rotate right
$\operatorname{rtr}(:=i\langle 8: 10\rangle=5) \rightarrow\left(\right.$ LDAC $\leftarrow$ LDAC / $2^{2}\{$ rotate \}) $) ; \quad 4$ rotate twice right
opr $\iota^{2}(:=1<3,11>=10) \rightarrow($
operate group 2
skip condition $\oplus(i<8>=1) \rightarrow(P C \leftarrow P C+1)$; next u $\quad$ uC, $L$ skip test
skip condition $:=(($ sma $\wedge(A C<0)) \vee($ sza $\wedge(A C=0))$ ソ $(s n l \wedge L))$
osr $(:=K<g=1) \rightarrow(A C \leftarrow A C \vee$ Data switches $) ; \quad$ u "or" switches
hit $(:=i<10>=1) \rightarrow($ Run $\leftarrow 0)): \quad$ u halt or stop
FAE $(:=\mathbf{i}\langle 3,11\rangle=11) \rightarrow$ EAF Linstructionvexecution) optional EAE description

## APPENDIX 1 DEC PDP－8 ISP DESCRIPTION（Continued）

```
KT and KMs State
Each K may have any or all of the following registers. There can be up to 64 optional K's.
```

```
Input data[0:77 8]<0:11> 64 input buffers
```

Input data[0:77 8]<0:11> 64 input buffers
Output Nata[0:77 8]<0:11> 64 output buffers
Output Nata[0:77 8]<0:11> 64 output buffers
10_skip~flag[0:77 8] 64 test conditions
10_skip~flag[0:77 8] 64 test conditions
10
10
Interrupt_request := (
Interrupt_request := (
max(10_interrupt_request[0:77 % ]))
Extended Asithmetic Element, EAE (optional)
Provides additional arithmetic instmuctions (or operators) including x, l, normalize, logical shift and arithmetic shift.
EAE State

| $M Q<0: 11>$ | Multiplier Quotient |
| :--- | :--- |
| $S C<0: 4>$ | Shift Counter |

Instruction Format and Data
mds<0:11> multipiier divisor shift data
s<0:4> := mds<7:11> shift count parameter
Instruction Set for EAE
EAE,instruction_execution := (next
mqa (:= i<5>) ->(AC \&AC\veeMQ); MO into AC
sca (:= i<6>) ->(AC \&AC V SC): SC into AC
mql (:= i<7>) ) (MQ \leftarrowAC; AC \leftarrow0); next AC into MO, clear AC
Note only one of nmi, shl, asr, Lsr, muu, or dvi can be given at a time.
i<8:10> = 00\Phi ->;
IO operation
\negmi}->(mds \leftarrowM[PC];PC\leftarrowPC + 1); nex
muy (:= i<8:10> = 2) -> (LロAC口MQ }\leftarrowMQ.x mds; \$C \leftarrow0) multinlw,
dvi (:= i<8:10> = 3) -> (MQ \&-LロACםMO/mds; divide
L\squareAC \leftarrow LםAC\squareMQ mod mds; SC \leftarrowO):
nmi (:= i<8:10> = 4) }->(\mathrm{ ACaMQ \& normalize (ACDMO);
normalize(AC,MO) into SC
SC }\leftarrow\mathrm{ normalize`exponent (ACDMQ));
sh1 (:= i<8:10> = 5) ->(LロAC\squareMQ \leftarrowL\squareAC\squareMQ }\times\mp@subsup{2}{}{5+1}:SC\leftarrow0); shifft left
asr (:= i<8:10> = 6) ->(L\squareAC\squareMQ \leftarrowL\squareAC\squareMQ / 2 2+1}:SC\&0): shift righ
lsr (:= i<8:10> = 7) ->(L\squareAC\squareMQ \leftarrowL\squareAC口MQ / 2 +1}{logical}; Logical shift
SC}\leftarrow0
) end EAE instruction execution

```

\section*{Chapter 8}

\title{
Structural Levels of the PDP-8 \({ }^{1}\)
}

\author{
C. Gordon Bell / Allen Newell / \\ Daniel P. Siewiorek
}

A map of the PDP-8 design hierarchy, hased on the Structural Levels View of Chap. 2, is given in Fig. 1, starting from the PMS structure, to the ISP, and down through logic design to circuit electronics. These description levels are subdivided to provide more organizational details such as registers, data operators, and functional units at the register transfer level.

The relationship of the various description levels constitutes a tree structure, where the organizationally complex computer is the top node and each descending description level represents increasing detail (or smaller component size) until the final circuit element level is reached. For simplicity, only a few of the many possible paths through the structural description tree are illustrat-
\({ }^{1}\) Originally printed in C. G. Bell, J. C. Mudge, and J. E. McNamara, Computer Engineering: A DEC View of Hardware System Design, Digital Press, 1978, pp. 209-228.
ed. For example, the path showing mechanical parts is missing. The descriptive path shown proceeds from the PDP-8 computer to the processor and from there to the arithmetic unit, or more specifically, to the Accumulator ( AC ) register of the arithmetic unit. Next, the logic implementing the register transfer operations and functions for the \(j\) th bit of the Accumulator is given, followed by the flip-flops and gates needed for this particular implementation. Finally, on the last segment of the path, there are the electronic circuits and components from which flip-flops and gates are constructed.

\section*{Abstract Representations}

Figure 1 also lists some of the methods used to represent the physical computer abstractly at the different description levels. As mentioned previously, only a small part of the PDP-8 description tree is represented here. The many documents which constitute the complete representation of even this small computer include logic diagrams, wiring lists, circuit schematics, printed circuit board photo etching masks, production description diagrams, production parts lists, testing specifications, programs for testing and diagnosing faults, and manuals for modification, production, maintenance, and use. As the discussion continues down the abstract description tree, the reader will observe that the tree

\(|x|\) indicetes frgure number of instence
Fig. 1. PDP-8 hierarchy of descriptions.
conveniently represents the constitutent objects of each level and their interconnection at the next highest level.

\section*{The PMS Level}

The PDP - 8 computer in PMS notation is:
```

C[PDP-8 technology:transistors: $12 \mathrm{~b} / \mathrm{w}$ :
descendants:' PDP-8/S, ' PDP-8/I, ' PDP-8/L,
'8/E, '8/F, '8/M, '8/A, 'CMOS-8;
antecedents: ' PDP-5;
Mp[core; \#0:7; 4096 words; te:1.5 $\mu \mathrm{s} /$ word];]
$\mathrm{Pc}(\mathrm{Mps}(2$ to 4 words);
instruction length:I| 2 words;
address/instruction:I;
operations on data:( $=,+$, Not, And, Minus
(negate), Srr I(/2), SIr 1 (×2), +)
optional operations:( $\times, /$, normalize);
data-types:word, integer,Boolean vector;
operations for data access:4);
$\mathrm{P}\left(\right.$ display; $\left.{ }^{1} 338\right)$;
P(c; ' LINC);
S(' I/O Bus; $1 \mathrm{Pc} ; 64 \mathrm{~K}$ );
Ms (disk, ' DECtape, magnetic tape);
T (paper tape, card, analog, cathode-ray tube)

```

As an example of PMS structure, the LINC-8-338 is shown in Fig. 2; it consists of three processors (designated P): Pc('LINC), Pc( ' PDP-8), and P.display(' 338). The LINC processor is a very capable processor with more instructions than the PDP-8 and is available in the structure to interpret programs written for the LINC. Because of the rather limited instruction set being interpreted, one would hardly expect to find all the components present in Fig. 2 in an actual configuration.

The switches ( S ) between the memory and the processor allow eight primary memories \((\mathrm{Mp})\) to be connected. This switch, in PMS called S(' memory Bus; 8 Mp; I Pc; time-multiplexed; I. 5 \(\mu \mathrm{S} / \mathrm{word}\) ), is actually a bus with a transfer rate of 1.5 microseconds per word. The switch makes the eight memory modules logically equivalent to a single 32,768 -word memory module. There are two other connections (a switch and a link) to the processor excluding the console. They are the \(S\left({ }^{\prime} I / O\right.\) Bus) and L('Data Break; Direct Memory Access) for interconnection with peripheral devices. Associated with each device is a switch, and the I/O Bus links all the devices. A simplified PMS diagram (Fig. 3) shows the structure and the logical-physical transformation for the I/O Bus, Memory Bus, and Direct Memory Access link. Thus, the I/O Bus is:

S(' I/O Bus duplex; time-multiplexed; I Pc; 64K;Pc controlled, \(K\) requests; t:4.5 \(\mu \mathrm{s} / \mathrm{w}\) )

The I/O Bus is nearly the same for the PDP-5, 8, 8/S, 8/I, and \(8 / \mathrm{L}\). Hence, any controller can be used on any of the above computers provided there is an appropriate logic level converter (PDP-5, 8, and \(8 / \mathrm{S}\) use negative polarity logic; the \(8 / \mathrm{I}\) and \(8 / \mathrm{L}\), positive logic). The I/O Bus is the link to the controllers for processor-controlled data transfers. Each word transferred is designated by a processor in-out transfer (IOT) instruction. Due to the high cost of hardware in 1965, the PDP-8 I/O Bus protocol was designed to minimize the amount of hardware to interface a. peripheral device. As a result, only a minimal number of control signals were defined with the largest portion of I/O control performed by software.

A detailed structure of the processor and memory (Fig. 4) shows the I/O Bus and Data Break connections to the registers and control in the notation used in the initial PDP-8 reference manual. This diagram is essentially a functional block diagram. The corresponding logic for a controller is given in Fig. 3 in terms of logic design elements (ANDs and ORs). The operation of the I/O Bus starts when the processor sends a control signal and sets the six I/O selection lines (IO.SELECT \(<0: 5>\) ) to specify a particular controller. Each controller is hardwired to respond to its unique 6 -bit code. The local control, \(\mathrm{K}[\mathrm{k}]\), select signal is then used to form three local commands when ANDed with the three IOT command lines from the processor. These command lines are called IO.PULSE. I, IO.PULSE.2, and IO. PULSE.4. Twelve data bits are transmitted either to or from the processor, indirectly under the controller's control. This is accomplished by using the AND/OR gates in the controller for data input to the processor, and the AND gate for data input to the controller. A single skip input is used so that the processor can test a status bit in the controller. A controller communicates back to the processor via the interrupt request line. Any controller wanting attention simply ORs its request signal into the interrupt request signal. Normally, the controller signal causing an interrupt is also connected to the skip input, and skip instructions are used in the software polling that determines the specific interrupting device.
The Data Break input for Direct Memory Access provides a direct access path for a processor or a controller to memory via the processor. The number of access ports to memory can be expanded to eight by using the DM0I Data Multiplexer, a switch. The DM01 port is requested from a processor (e.g., LINC or Model 338 Display Processor) or a controller (e.g.; magnetic tape). A processor or controller supplies a memory address, a read or write access request, and then accepts or supplies data for the accessed word. In the configuration (Fig. 1), Pc['LINC] and P [ ' 338] are connected to the multiplexer and make requests to memory for both their instructions and data in the same way as the PDP-8 processor. The global control of these processor programs is via the processor over the I/O Bus. The processor issues start and stop commands, initializes their state, and examines their


Fig. 2. LiNC-8-338 PMS diagram.


Fig. 3. PDP-8 S('I/O Bus) logic and PMS diagrams.
final state when a program in the other processor halts or requires assistance.

When a controller is connected to the Data Break or to the DM01 Data Multiplexer, it only accesses memory for data. The most complex function these controllers carry out is the transfer of a complete block of data between the memory and a high speed transducer or a secondary memory (e.g., DECtape or disk). A special mode, the Three Cycle Data Break, allows a controller to request the next word from a block in memory.

The DECtape was derived from M.I.T.'s Lincoln Laboratory LINCtape unit. Data were explicitly addressed by blocks (variable but by convention 128 words). Thus, information in a block could be replaced or rewritten at random. This operation was unlike the early standard IBM format magnetic tape in which data could be appended only to the end of a file.

\section*{Programming Level (ISP)}

The ISP of the PDP-8 processor is probably the simplest for a general purpose stored program computer. It operates on 12 -bit words, 12 -bit integers, and 12 -bit Boolean vectors. It has only a few data operators, namely, \(=,+\), minus (negative of), not, and slr I (rotate bits left), srr 1 (rotate bits right), (optional) \(\times, / /\), and normalize. However, there are microcoded instructions, which allow compound instructions to be formed in a single instruction.

The ISP of the basic PDP-8 is presented in Appendix 1. The \(2^{12}\)-word memory (declared MP \([0: 4095]<0: 11>\) ) is divided into 32 fixed-length pages of 128 words each (not shown in the ISPS description). Address calculation is based on references to the first page, page.zero, or to the current page of the Program Counter (PC\Program.Counter). The effective address calculation procedure, called MA in Appendix 1, provides for both direct and indirect reference to either the current page or the first page. This scheme allows a 7 -bit address to specify a local page address.

A \(2^{15}\)-word memory is available on the PDP-8, but addressing more than \(2^{12}\) words is comparatively inefficient. In the extended range, two 3 -bit registers, the Program Field and Data Field registers, select which of the eight \(2^{12}\)-word blocks are being actively addressed as program and data. These are not given in the ISPS description.
There is an array of eight 12 -bit registers, called the auto.index registers, which resides in page.zero. This array (auto.index \([0: 7]<0: 11>:=\mathrm{MP}[\# 10: \# 17]<0: 11>)\) possesses a useful property: whenever an indirect reference is made to it, a 1 is first added to its contents. (That is, there is a side effect to referencing.) Thus, address integers in the register can select the next member of a vector or string for accessing.

The processor state is minimal, consisting of a 12 -bit accumulator (ACXAccumulator \(<0: 11>\) ), an accumulator extension bit called the Link (LLLink), the 12 -bit Program Counter, the GO flip-flop,


Fig. 4. PDP-8 processor block diagram.
and the INTERRUPT.ENABLE bit. The external processor state is composed of console switches and an interrupt request.

The instruction format can also be presented as a decoding diagram or tree (Fig. 5). Here, each block represents an encoding of bits in the instruction word. A decoding diagram allows one more descriptive dimension than the conventional, linear ISPS description, revealing the assignment of hits to the instruction. Figure 5 still requires ISPS descriptions for the memory, the processor state, the effective address calculation, the instruction interpreter, and the execution for each instruction. Diagrams such as Fig. 5 are useful in the ISP design to determine which instruction operation codes are to be assigned to names and operations, and which instructions are free to be assigned (or encoded).

There are eight basic instructions encoded by 3 opcode bits of
the instruction.register, that is, IR \(<0: 2>\). Each of the first memory reference six instructions, where the opcode is less than or equal to 5 , has four addressing modes (direct page.zero, direct current. page, indirect page.zero, and indirect current.page). The first six instructions in the following four categories are:

1 Data transmission
"deposit and clear Accumulator" (DCA). (Note that the add instruction, TAD, is used for both data transmission and arithmetic.)
2 Binary arithmetic
"two's complement add to the Accumulator" (TAD).
3 Binary Boolean "and to the Accumulator" (AND).


Fig. 5. PDP-8 instruction decoding diagram.

4 Program control
"jump/set Program Counter" (JMP); "jump to subroutine" (JMS); "index memory and skip if results are zero" (ISZ).

The subroutine calling instruction, JMS, provides a method for transferring a link to the beginning (or head) of the subroutine. In this way arguments can be accessed indirectly, and a return is executed by a "jump indirect" instruction to the location storing the returned address. This straightforward subroutine call mechanism, although inexpensive to implement, requires reentrant and recursive subroutine calls to be interpreted by software rather than by hardware. A stack for subroutine linkage, as in the PDP-11, would allow the use of read-only memory program segments consisting of pure code. This scheme was adopted in the CMOS-8.

The "in-out transfer" instruction, opcode 6, 1OT uses the remaining nine bits of the instruction to specify instructions to input/output devices. The six io.select bits select 1 of 64 devices. Three conditional pulse commands to the selected device, io.pulse.1, io.pulse.2, and io.pulse.4, are controlled by the 1OT, io.control \(<0: 2>\) operation code bits. The instructions to a typical 1/O device are:
```

1 Testing a Boolean Condition of an 10 device.
1F io.pulse. $1 \Rightarrow$
(1F io.skip.flag[io.select] $\Rightarrow$
$\mathrm{PC}=\mathrm{PC}+1$ )
2 Output data to a device from Accumulator. IF io. pulse. $4 \Rightarrow$
(output.register[io.select] $=$ AC)

```

3 Input data from a device to Accumulator. \(1 F\) io.pulse. \(2 \Rightarrow\)
\((\mathrm{AC}=\) input. register[io.select \(])\)

There are three microcoded instruction groups selected by ( \(\mathrm{IR}<0: 2>\) eqv \#7), called the operate instructions. The instruction decoding diagram (Fig. 5) and the ISP description show the microinstructions which can be combined in a single instruction. These instructions are: operate group \(1((\operatorname{IR}<0: 2>\) eqv \#7) and not \(i b)\) for operating on the processor state; operate group 2 \(((1 R<0: 2>\) eqv \#7) and NOT \(\mathrm{ib}<>\) and \(\mathrm{MB}<11>\) ) for testing the processor state; and the Extended Arithmetic Element group (not included in the 1SP description) (IR \(<0: 2>\) eqv \#7 and \(\mathrm{ib}<>\) and \(\mathrm{MB}<11>\) ) for multiply, divide, etc. Within each instruction the remaining bits, \(\langle 4: 10\rangle\) or \(\langle 4: 11\rangle\), are extended instruction (or opcode) bits: that is, the bits are microcoded to select additional instructions. In this way, an instruction is actually programmed (or microcoded, as it was originally named before "microprogramming" was used extensively). For example, the instruction, "set
link to 1, " is formed by coding the two microinstructions, "clear link" following by "complement link."
```

If (IR $<0: 2>$ eqv \#7) and (group eqv 0$)$ ) $\Rightarrow$
If $\mathrm{MB}<5>\Rightarrow \mathrm{L}=0$; next
If $\mathrm{MB}<7>\Rightarrow \mathrm{L}=\operatorname{not} \mathrm{L}$ )

```

Thus, in operate group 1, the instructions "clear link, complement link, and set link" are formed by coding \(\mathrm{MB}<5,7>=10,01\), and 11 , respectively. The operate group 2 instructions are used for testing the condition of the processor state. These instructions use bits 5, 6, and 8 to code tests for the Accumulator. The AC skip conditions are coded as never, always, AC eql 0 , AC neg 0 , \(A C\) lss \(0, A C\) leq \(0, A C\) geq 0 and \(A C\) gtr 0 . The optional Extended Arithmetic Element (EAE) includes additional Multiplier Quo-

Table 1 PDP-8 Register Transfer Control Signals and Data Break Interface
```

AC/Accumulator, L/Link and combined L, AC LAC
$A C=0 ; A C=\# 7777 ; A C=$ not $A C ; L A C=L A C+1$
$L=0 ; L=1 ; L=\operatorname{not} L ;$
LAC = LAC srr 1; LAC = LAC srr 2; !rotates right
LAC = LAC slr 1; LAC = LAC slr 2; !rotates left
$A C=A C$ or SWITCHES; AC = AC and MB; AC $=10 . B U S$
$A C=A C$ xor $M B ; L A C=$ carry (AC.MB);
(note that previous two commands form: LAC = AC + MB).
MB/Memory.Buffer
$M B=M B+1 ;$
! Increment
$M B=P C ; M B=A C ; M B=M[M A] ; M B=D B . D A T A . \quad$ Set
$M B=0 ;$
MA/Memory.Address
$M A<0: 4>=0 ; M A=P C ; M A=M B ; M A<5: 11>=M A<5: 11>$;
$M A=D B . A D D R E S S$.
PC/Program.Counter
$P C=0 ; P C<0: 4>=0 ; \quad$ !Clear
$\mathrm{PC}=\mathrm{MB} ; \mathrm{PC}<5: 11>=\mathrm{MB}<5: 11>$ !Set
$P C=P C+1 \quad$ Increment
IR/Instruction.Register
$\mathrm{IR}=0 ; \quad$ !Clear
$I R=M[M A]<0: 2>\quad$ !Load
M/Memory[0:4095]<0:11>
$M[M A]=M B \quad$ !write
$M B=M[M A] \quad$ !read
DB/DATA.BREAK interface
DB.DATA<0:11>
I Input to MB
DB.ADDRESS <0:11>
! Input to MA
MB<0:11>
DB.REQUEST ! Control inputs to Pc
DB.DIRECTION
DB.CYCLE.SELECT<0:11>
ADDRESS.ACCEPTED
WORD.COUNT.OK
BREAK.STATE

```


KIMPSI Contains SIATE REGISTER3 RUN INTERRUPTENABLE
- DATA TRANSMISSION FULL OUPLEX - DIAECTEO OATA TRANSMISSIONS
- - CONTROLSIGNALS

Fig. 6. PDP-8 register transfer level PMS dlagram.
tient (MQ) and Shift Counter (SC) registers and provides the hardwired operations, "multiply," "divide," "logical shift left," "arithmetic shift," and "normalize." If all the nonredundant and useful variations in the two operate groups were available as separate instructions in the manner of the first seven (DCA, TAD, etc.), there would be approximately \(7+12\) (group 1) +10 (group 2) \(+6(\) EAE \()=35\) instructions in the PDP-8.

\section*{The Interrupt Scheme}

External conditions in the input/output devices can request that the processor be interrupted. Interrupts are allowed if the processor's interrupt enable flip-flop is set (if INTERRUPT. ENABLE eqv 1). A request to interrupt (i.e., INTERRUPT. REQUEST \(=1\) ) clears the interrupt enable bit (INTERRUPT. ENABLE \(=0\) ), and the processor hehaves as though a "jump to subroutine" 0 instruction (JMS 0) has been executed. A special IOT instruction ( \(\mathrm{MB}<0: 11>\) eql \#6001) followed by a "jump to subroutine indirect" to 0 , and instruction ( \(\mathrm{MB}<0: 11>\) eql \#5220) returns to the processor to the interruptable state with INTERRUPT. ENABLE a 1. The program time to save the processor state is six memory accesses ( 9 microseconds), and the time to restore the state is nine memory accesses ( 13.5 microseconds).

Only one interrupt level is provided in the hardware. If multiple priority levels are desired, programmed polling is
required. Most \(1 / O\) devices have to interrupt because they do not have a program-controlled device interrupt-enable switch. For multiple devices, approximately three cycles ( 4.5 microseconds) are required to poll each interrupter.

\section*{Register Transfer Level}

More detail is required than is provided by either the PMS or ISP levels to describe the internal structure and hehavior of the processor and memory. Figure 4 shows the registers and controllers at a block diagram level, and Fig. 6 gives a more detailed version using PMS notation. Table 1 gives the permissible register transfer operations that the processor's sequential control circuit can give to the PDP-8 registers.

Although electrical pulse voltages and polarities are not shown in Table 1, the operations are presented in considerably more detail than shown in Fig. 4. As Fig. 6 shows, the registers in the processor cannot be uniquely assigned to a single function. In a minimal machine such as the PDP-8, functional separation is not economical. Thus, there are not completely distinct registers and transfer paths for memory, arithmetic, program, and instruction flow. (This sharing complicates understanding of the machine.) However, Fig. 6 clarifies the structure considerably by defining all the registers in the processor (including temporaries and controls). For example, the Memory Buffer (MB \(\backslash\) Memory. Buffer \(<0: 11>\) )


Fig. 7. PDP-8 Pc state diagram.
is used to hold the word being read from or written to memory. The Memory Buffer also holds one of the operands for binary operations (for example, \(\mathrm{AC}=\mathrm{AC}\) and MB). The Memory Buffer is also used as an extension of the Instruction. Register during the instruction interpretation. The additional physical registers, not part of the ISP, are:

\section*{MB \(\backslash\) Mempry. Buffer \(<0: 11>\)}

Holds memory data, instruction, and operands.
MAlMemory.Address \(<0: 1\) I \(>\)
Holds address of word in memory being accessed.
IRUnstruction. Register \(<0: 2>\)

Holds the value of current instruction being performed. state. register \(<0: 1>\)

A ternary state register holding the major state of memory cycle being performed-declared as 2 bits.
F \(\backslash\) Fetch: \(=(\) IF state.register eqv 0)
Memory cycle to fetch instruction.
\(D \backslash D e f e r r e d:=(\) IF state.register eqv 1)
Memory cycle to get address of operand.
E\Execute: \(=(\) IF state. register eqv 2)
Memory cycle to fetch (store) operand and execute the instruction.


Fig. 7. (Continued)

The emphasis in Fig. 6 is on the static definition (or declaration) of the information paths, the operations, and state. The ISP interpretation (Appendix I) is the specification for the machine's behavior as seen by a program.

As the temporary hardware registers are added, a more detailed ISPS definition must be given in terms of time and in terms of temporary and control registers. Instead, a state of diagram (Fig. 7) is given to define the actual processor which is constrained by both the ISP registers, the temporary registers implied by the implementation, and time. The relationship among the state diagram, the ISP description, and the logic is shown in the hierarchy of Fig. I. In the relationships shown in the figures, one can observe that the ISP definition does not have all the necessary detail for fully defining a physical processor. The physical processor is constrained by actual hardware logic and lower level details even at the circuit level. For example, a core memory is read by a destructive process and requires a temporary register (MB) to hold the value being rewritten. This is not represented
within a single ISPS language statement because ISPS defines only the nondestructive transfer; however, it can be considered as the two parallel operations \(\mathrm{MB}=\mathrm{MP}[\mathrm{MA}] ; \mathrm{MP}[\mathrm{MA}]=0\). The explanation of the physical machine, including the rewriting of core using ISPS, is somewhat more tedious than the highest level description shown in Appendix I. For this reason, the state diagram is used (Fig. 7), and the description of the physical machine (in ISPS) is left as an exercise for the reader.
The state diagram (Fig. 7) is fundamentally driven by minor clock cycles as seen from both the state diagram and the times when the four clock signals are generated. Thus, there are 3 (state.register eqv \#0, \#I, \#2) \(\times 4\) (clock) or 12 major states in the implementation. The Instruction. Register is used to obtain two more F2b and F3b, for the description. The state.register values \(0, I\), and 2 correspond to fetching, deferred or indirect addressing (i.e., fetching an operand address), and executing. The state diagram does not describe the Extended Arithmetic Element operation, the interrupt state, or the data break states
(which add 12 more states). The initialization procedure, including the console state diagram, is also not given. One should observe that, at the beginning of the memory cycle, a new state.register value is selected. The state.register value is always held for the remainder of the cycle; i.e., only the sequences F0, F1, F2, F3, or D0, D1, D2, D3, or E0, E1, E2, E3 are permitted.

\section*{Logic Design Level (Registers and Data Operations)}

Proceeding from the register transfer and ISP descriptions, the next level of detail is the logic module. Typical of the level is the l-bit logic module for an accumulator bit, \(\mathrm{AC}<\mathrm{j}>\), illustrated in Fig. 8. The horizontal data inputs in the figure are to the logic module from \(\mathrm{AC}<\mathrm{j}>, \mathrm{MB}<\mathrm{j}>, \mathrm{AC}<\mathrm{j}>\) input from the 10 . Bus. In, and switches \(\langle\mathrm{j}\rangle\). The control signal inputs whose names are identified using the vertical bar (e.g., \(|\mathrm{AC}=0|\) ) command the register operations (i.e., the transfers). They are labeled by their respective \(1 S P\) operations (for example, \(\mathrm{AC}=\mathrm{AC}\) and \(\mathrm{MB}, \mathrm{AC}=\)

AC slr 1, for rotate once left). The sequential state machine, for the processor \(\mathrm{Pc}(\mathrm{K})\), generates these control signal inputs using a combinational circuit as the one shown in Fig. 9.

\section*{Logic Design Level (Pc Control, Pc(K) Sequential State Machine Network)}

The output signals from the processor sequential machine (Fig. 9) can be generated in a straightforward fashion by formulating the Boolean expressions directly from the state diagram in Fig. 7. For example, the \(\mathrm{AC}=0\) control signal is expressed algebraically and with a combinational network in Fig. 9. Obviously, these Boolean output control signals are functions which include the clock, the state.register, and the states of the arithmetic registers (for example, \(\mathrm{AC}=0, \mathrm{~L}=0\), etc.). The expressions should be factored and minimized so as to reduce the hardware cost of the control for the interpreter. Although the sequential controller for the processor is mentioned here only briefly, it constitutes about half the logic within the processor.


Fig. 8. PDP-8 AC \(<j>\) bit logic diagram.


Fig. 9. \(\mathrm{PDP}-8 \mathrm{Pc}(\mathrm{K}) \mathrm{AC}=0\) signal logic equation and dlagram.

\section*{Circuit Level}

The final level of description is the circuits that form the logic functions of storage (flip-flops) and gating (NAND gates). Figures 10 and 11 illustrate some of these logic devices in detail. In Fig. 10 a direct set/direct clear flip-flop (a sequential logic element) is described in terms of circuit implementation, combinational logic equivalent, a state table, and its algebraic behavior. Note that this is not a conventional textbook circuit because it has no output delay and responds directly and immediately to an input. Some conventional sequential logic elements are used in the PDP-8, including RS(Reset-Set), T (Trigger), D (Delay), and JK. A delay in the flip-flops makes them behave in the same way as the "textbook" primitives in sequential circuit theory. The outputs require a series delay, \(\Delta t\), such that, if the inputs change at time, \(t\), the outputs will not change until \(t+\Delta t\). In actuality, the PDP- 8 uses capacitor-diode gates at the flip-flop inputs so that input


Table of Circuit Input-Output


Note this is not an "ideal" sequential circuit element because ihere is no delay in the output
Fig. 10. PDP-8 direct-coupled flip-flop and logic diagram.

(a) Multiple input inverter circuit.

(b) NAND logic element.

(c) NOR logic element.

Table of Circuit Behavior
\begin{tabular}{cccc}
\hline \multicolumn{4}{c}{ Input } \\
\(\mathbf{1}\) & \(\mathbf{2}\) & \(\mathbf{3}\) & Output \\
\hline 0 & 0 & 0 & -3 \\
0 & 0 & -3 & -3 \\
0 & -3 & 0 & -3 \\
0 & -3 & -3 & -3 \\
-3 & 0 & 0 & -3 \\
-3 & 0 & -3 & -3 \\
-3 & -3 & 0 & -3 \\
-3 & -3 & -3 & 0 \\
\hline
\end{tabular}

Table of NAND Behevior
\begin{tabular}{cccc}
\hline \multicolumn{3}{c}{\begin{tabular}{c} 
Input \\
1
\end{tabular}} & \(\mathbf{2}\)
\end{tabular} \(\mathbf{3} \quad\) Output \begin{tabular}{llc} 
\\
\hline 0 & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 0 \\
0 & 1 & 1 \\
1 & 0 & 0 \\
1 & 0 & 1 \\
1 & 1 & 0 \\
1 & 1 & 1
\end{tabular}

Table of NOR Bahavior
\begin{tabular}{cccc}
\hline \multicolumn{3}{c}{ Input } & \\
\(\mathbf{1}\) & \(\mathbf{2}\) & \(\mathbf{3}\) & Output \\
\hline 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\hline
\end{tabular}

Fig. 11. PDP-8 combinational circuit and logic diagram.
changes will not be noticed until after the clock passes. This achieves the same effect.

Figure 11 illustrates the combinational logic elements used in the PDP-8. The circuit selection is limited to the inverter circuit with single or multiple inputs. These are more familiarly called NAND gates or NOR gates, depending on whether one uses positive and/or negative logic level definitions.
The core memory structure is given in Fig. 6. A more detailed block diagram showing the core stack with its twelve \(64 \times 64\) 1-bit core planes is needed. Such a diagram, though still a functional block diagram, takes on some of the aspects of a circuit diagram because a core memory is largely circuit level details. The memory (Fig. 12) consists of the component units: the two address decoders (which select 1 each of 64 outputs in the X and Y axis directions of the coincident current memory); selection switches
(which transform a coincident logic address into a high current path to switch the magnetic cores); the 12 inhibit drivers (which switch a high current or no current into a plane when either a 1 or 0 is rewritten); 12 sense amplifiers (which take the induced low sense voltage from a selected core from a plane being switched or not switched and transform it into a 1 or 0 ); and the core stack, an array \(\mathrm{M}[\# 0: \# 7777]<0: 11>\). Figure 12 also includes the associated circuit level hardware needed in the core memory operation (e.g., power supplies, timing, and logic signal level conversion amplifiers).

The timing signals are generated within the control portion of the processor and are shown together with processor clock in Fig. 13. The process of reading a word from memory is:

1 A 12-bit selection address is established on the MA<0:11> address lines, which is 1 of \#10000 (or 4096) unique


Fig. 12. PDP-8 four-wire coincident current (three dimensions) core memory logic diagram.


Fig. 13. PDP-8 clock and memory timing diagram.
numbers. The upper 6 bits \(<0: 5>\) select I of 64 groups of \(Y\) addresses, and the lower 6 bits <6:II> select I of 64 groups of X addresses.
2 The read logic signal is made a I at time t2.
3 A high current path flows via the X and Y selection switches. In each of the X and Y directions, \(64 \times 12\) cores have selection current (Ix and Iy). Only one core in each plane is selected since \(\mathrm{Ix}=\mathrm{I} y=\) Iswitching/2, and the current at the selected intersection \(=\mathrm{Ix}+\mathrm{I} y=\mathrm{I}\) switching.
4 If a core is switched to 0 (by having Iswitching amperes through it), then a I is present and is read at the output of the plane bit sense amplifiers. A sense amplifier receives an input from a winding that threads every core of every bit within a core plane [\#0:\#7777]. All I2 cores of the selected word are reset to 0 . The time at which the sense amplifier is observed is tms (the memory strobe), which also causes the transfer \(\mathrm{MB}=\mathrm{M}[\mathrm{MA}]\).

5 The read current is turned off by timing in the memory module.

6 The inhibit and write (slightly delayed) logic signals are turned on at time tI. The bit inhibit signal is present or not, depending on whether a 0 or \(I\), respectively, is written into a bit.

7 A high current path flows via the X and Y selection switches, but in an opposite direction to the read case (see item 2). If a 1 is written, no inhibit current is present and the net current in the selected core is -Iswitching. If a 0 is written, the current is -Iswitching +(Iswitching/2) and the core remains reset.
8 The inhibit and write logic signals are turned off at time tmd specified by timing in the memory module, and the memory cycle is completed.

\section*{Device Level}

For a discussion of the behavior of the transistor as it is used in these switching circuit primitives, the reader should consult semiconductor electronics and physics textbooks. It is hoped that the reader has gained a sense of how to think about the hierarchical decomposition of computers into particular levels of analysis (and synthesis) and that the hierarchical approach will be of aid in the reading of Parts 2,3, and 4.

\section*{References}

Bell, Mudge, and McNamara [1978]
```

PDP8 :=
begin

```
! The basic PDP-8 instruction set (without extended arithmetic element)
! is implemented. No \(1 / 0\) devices are included in the description.
! I/O instruction execution is limited to the instructions that
! deal with the internal interrupt enable flags and status.
! Reference: "The DIGITAL Small Computer Handbook". 1967 Edition,
1 Digital Equipment Corporation.
**MP.State**
    MP[\#0:\#7777]<0:11〉, \(\quad\) Main memory (4k words)
    page.zero[\#0:\#177]<0:11>:= MP[\#0:\#177]〈0:11>.
    auto.index[\#0:\#7]<0:11〉 := MP[\#10:\#17]〈0:11>.
    \(M B\langle 0: 11\rangle\)
    ! Memory buffer
**PC.State**
    L〈〉, ! Link bit
    AC〈0:11〉.
    PC<0:11〉 1 Program counter
**External.State**
    switches〈0:11》, \(\quad\) Console data switches
    interrupt.request〈〉 \(\quad\) Any device requesting interrupt
** Implementation. Declarations**
go〈〉. ! 1 when running
interrupt.enableく>. \(\quad 1\) when Pc can be interrupted
1ast.pc〈0:11〉,
skip〈〉
! Skip flag
** Instruction. Format**
```

IR\instruction.register<0:2\rangle.
ib\indirect.bit<> := MB\langle3>.
pb\page.bit<\rangle := MB\langle4\rangle, | 0 = zero page; 1 = current page
pa\page.address<0:6\rangle := MB\langle5:11\rangle.
io.select\langle0:5\rangle := MB\langle3:8\rangle, ! I/O device select
io.pulse\langle0:2\rangle := MB\langle9:11\rangle, ! 1/0 pulse control bits
io.pulse.1〈\rangle := io.pulse\langle0\rangle.
io.pulse.2\langle\rangle := io.pulse\langle1\rangle.
io.pulse.4〈\rangle := io.pulse\langle2\rangle.

```

APPENDIX 1 （Cont＇d．）
```

! Instruction format (continued)
group〈〉 : $=\mathrm{MB}\langle 3\rangle, \quad$ Microinstruction group
CLA $\rangle$ : $:=\mathrm{MB}\langle 4\rangle$. Clear AC
CLL〈〉 : $=$ MB〈5>, ! Clear Link
CMA $\rangle \quad:=$ MB $\langle 6\rangle$, . Complement $A C$
CML <> := MB〈7>, ! Complement Link
RAR $\rangle$ : $:=M B\langle B\rangle$, Rotate right
RAL〈〉 $:=\mathrm{MB}\langle 9\rangle$. $\quad$ Rotate left
RTx<> : $=\mathrm{MB}\langle 10\rangle$, Rotate twice
IAC $\rangle \quad:=$ MB〈I1>, I Increment AC
SMA〈〉 := MB〈5〉, ! Skip on minus AC
$S P A\rangle \quad:=M B\langle 5\rangle$. $\quad$ Skip on positive $A C$
SZAく> : $\quad$ MB〈6>, Skip on zero $A C$
SNA〈〉 : $\quad$ MB〈6>, Skip on AC not zero
SNL〈〉 $:=M B\langle 7\rangle$, Skip on Link not zero
SZL.〈〉 $:=M B\langle 7\rangle$.
is $\rangle \quad:=\mathrm{MB}\langle 8\rangle$, ! Invert skip sense
OSR〈〉 $:=M B\langle 9\rangle$ Logical or $A C$ with switches
HLT〈〉 := MB〈10〉 ! Halt the processor
**Address.Calculation**
MA\effective.memory.address〈0:11〉:=
begin
$M A=100000$ pa next I Zero page
If $\mathrm{pb} \Rightarrow \mathrm{MA}\langle 0: 4\rangle=$ last.pc $\langle 0: 4\rangle$ next : Current page
IF $\mathrm{ib} \Rightarrow$ ! Indirect bit
begin
IF MA〈0:8> eql \#001 $\Rightarrow$ : Auto index
$M P[M A]=M P[M A]+1$ next
$M A=M P[M A] \quad$ ! Indirect address
end
end
**Instruction. Interpretation**
start\{main\} :=
begin
go $=1$ next
run()
end.
run $\backslash$ instruction.interpretation :=
begin
IF go $\Rightarrow$
begin
$M B=M P[P C]:$ last. $\mathrm{pc}=P C$ next
$P C=P C+1$ next
exec() next
IF interrupt.enable and interrupt. request $\Rightarrow$ >
begin
$\operatorname{MP}[0]=P C$ next
$P C=1$
end next
RESTART run
end
end

```

\section*{APPENDIX 1 (Cont'd.)}
```

** Instruction.Execution**\{us\}
exec $\backslash i n s t r u c t i o n . e x e c u t i o n ~:=~$
begin
IR $=M B\langle 0: 2\rangle$ next
IF (IR geq \#3) and (IR leq \#5) $\Rightarrow M A()$ next
IF IR leq \#2 $\Rightarrow M B=M P[M A()]$ next
DECODE IR $\Rightarrow$
begin
\#0 : = AND.: = $A C=A C$ and $M B$.
\#1 := TAD $:=$ L@AC = L@AC + MB.
\#2 : = ISZ := begin
$M B=M B+1$ next
If MB eql $0 \Rightarrow P C=P C+1$
end.
\#3 : = DCA : = begin
$M B=A C$ next
$A C=0$
end.
\#4 := JMS := begin
$M B=P C$ next
$P C=M A+1$
end.
\#5 : = JMP $:=\mathrm{PC}=\mathrm{MA}$,
\#6 : = IOT ().
\#7 : = OPR()
end next
If (IR geq \#2) and (IR leq \#4) $\Rightarrow M P[M A]=M B$
end.
IOT :=
begin
DECODE MB〈3:11〉 $\Rightarrow$
begin
\#001 : = ION : = $\begin{aligned} & \text { begin } \\ & \text { interrupt.enable }=1 \text { next }\end{aligned}$
RESTART run
end.
\#002 : = IOF : = interrupt.enable $=0$. ! Turn interrupt off
otherwise := no.op() I Not implemented
end
end.

```

APPENDIX 1 (Cont'd.)
```

skip.group :=
begin
skip = 0 next
DECODE is $\Rightarrow$
beg in
$0:=$ begin
IF SNL and (L eqv 1) $\Rightarrow$ skip $=1$;
IF SZA and ( $A C$ eql 0 ) $\Rightarrow$ skip $=1$;
IF SMA and $(A C$ LSS 0$) \Rightarrow$ skip $=1$
end,
$1:=$ begin
IF SZL@SNAESPA eqv $0 \Rightarrow$ skip $=1$;
IF SZL and (L eqv 0) $=>$ skip $=1$;
IF SNA and (AC neq 0) $\Rightarrow$ skip $=1$;
IF SPA and (AC geq 0) $\Rightarrow$ skip $=1$
end
end next
IF skip $\Rightarrow P C=P C+1 \quad$ I Skip
end.
opr:= ! Operate Instructions
begin
DECODE group $\Rightarrow$
begin
$0:=$ begin
IF CLA $\Rightarrow A C=0$;
IF CLL $\Rightarrow L=0$ next
IF $C M A \Rightarrow A C=\operatorname{not} A C: \quad$ Complement accumulator
IF CML $\Rightarrow L=\operatorname{not} L$ next $\quad$ Complement link
IF IAC $\Rightarrow L Q A C=L Q A C+1$ next ! Increment accumúlator
DECODE RTx $\Rightarrow$
begin
$0:=$ begin ! Rotate
IF RAL $\Rightarrow$ LOAC $=A C Q L$;
IF RAR $\Rightarrow A C Q L=L Q A C$
end,
1 := begin ! Rotate two places
IF $R A L \Rightarrow L Q A C=L Q A C$ sir 2 :
IF RAR $\Rightarrow$ L@AC $=$ L@AC srr 2
end
end
$1:=\mathrm{DECODE} M B\langle I| \Rightarrow \quad$ : Group 2 and 3
begin
0 : = begin
! Group 2 microinstructions
If HLT $\Rightarrow$ go $=0$;
skip.group() next
IF $C L A \Rightarrow A C=0$ next
If $O S R \Rightarrow A C=A C$ or switches
end.
$1:=$ no.op() ! EAE group
end
end
end
end : End of description

```

\section*{Chapter 15}

\section*{A PDP-8 Implemented from AMD Bit-Sliced Microprocessors}

\author{
Michael Tsao
}

An example of a microprogrammable system based on the Am2910 sequencer and the Am2901 ALU will illustrate design with bit slices. The target machine is the PDP-8 1SP (see Appendix 1 of Chap. 8). This register-transfer (RT) level design of the micromachine is thus optimized toward the basic PDP-8. However, the general principles involved in microprogramming bit slices are illustrated by this example. A major goal of this design is the clarity of implementation, rather than the economy of design.

\section*{Overview}

The basic implementation is a one-stage pipeline as shown in Fig. 1 in Chap. 13. In this micromachine, the pipeline register stores the current microinstruction, which is being executed by the Am2910 Sequencer and the Am2901 ALU. The status information (zero, overflow, etc.) of the ALU operations is stored in the Status Register. In a one-stage pipeline design, conditional branches can be executed only by the microinstruction following the microcycle that has generated the branching status. The Am2910 sequencer is used instead of the Am2909 to simplify the design and to aid understandability. A more cost-effective design might actually result from using the Am2909 sequencer, since the number of microinstruction types used to emulate the PDP-8 is small. The Am2901 ALU is used because it more closely reflects the ISP of the PDP-8.
A timing diagram for a typical microcycle is shown in Fig. 1. The indicated delays are typical values, illustrating the timing requirements rather than actual component performances. On the rising edge of the system clock, the Pipeline Register latches the microinstruction to be executed during this microcycle. The output of the Pipeline Register is valid 15 ns later. After another 15 -ns delay, the Condition Code input to the Am2910 is valid. The microsequencer generates the next microaddress based on the current microinstruction and the Condition Code input. When the microprogram memory output is valid (approximately 130 ns after the rising clock edge), the microcycle can be restarted. Concurrently with the sequencer operation and microword fetch, the Am2901 ALU executes the operations specified by the microword in the Pipeline Register. The output of the ALU is


Fig. 1. One-stage pipeline microcycle timing waveform.
valid prior to the falling edge of the system clock. External registers, such as the Memory Address Register (MAR) and the Status Register, use the falling clock edge to latch results from the ALU output port. In this design, the duty cycle of the system clock does not need to be symmetrical at 50 percent.

\section*{RT-Level Implementation and the Microword Format}

The RT-level implementation of the Am2900/PDP-8 is shown in Fig. 2 for the control part, and in Fig. 3 for the data part. The design can best be explained in conjunction with the microword format shown in Table 1. The ISPS description of the RT-level design is listed in Appendix 2. The following subsections discuss the meaning of each microword field and the associated RT-level components. For each microword field, there are three possible bit sizes: the number of bits normally required for the associated components, the minimum required for this PDP-8 application, and the actual field size used. The position of each field in the microword is defined in the ISPS description. The reason for inserting extra bits is to align the fields on octal boundaries, thus aiding the reading of the encoded microprogram.

\section*{Sequencer Instruction and Address Field}

The Am2910 sequencer normally requires a 4 -bit-wide instruction and a 12 -bit-wide "next address" direct input. The microprogram occupies less than 128 words, requiring only 7 bits of address. Two extra instruction bits and two extra address bits are inserted as 0 s in this design example for octal boundary alignment.


Fig. 2. Micromachine--control and sequencer.

Out of sixteen Am29I0 instructions, only 4 are used in this example: Conditional Jump Subroutine (CJS, \#O1), Conditional Jump (CJP, \#03), Conditional Return from Subroutine (CRTN, \#12), and Continue (CONT, \#16). Therefore, it is theoretically possible to use only 2 bits of information to specify these four actions.


Fig. 3. Micromachine-ALU and data.

Table 1 The Microword Format and Required Bits per Field
\begin{tabular}{|c|c|c|c|}
\hline Bits jer field & Normal & Minimum & Actual (ISP) \\
\hline \multicolumn{4}{|l|}{Micro sequencer control} \\
\hline Microinstruction & 4 & 2 & 6 \\
\hline Next microaddress & 12 & 7 & 9 \\
\hline Condition code select & (6) & 6 & 6 \\
\hline \multicolumn{4}{|l|}{ALU control} \\
\hline \multicolumn{4}{|l|}{ALU instruction} \\
\hline Source & 3 & 3 & 3 \\
\hline Function & 3 & 3 & 3 \\
\hline Destination & 3 & 3 & 3 \\
\hline RAM A port select & 4 & 3 & 3 \\
\hline RAM B port select & 4 & 3 & 3 \\
\hline Direct input select & (2) & 2 & 3 \\
\hline Constant mask select & (3) & 3 & 3 \\
\hline \multicolumn{4}{|l|}{Miscellaneous control signals} \\
\hline Control signal select & (4) & 4 & 6 \\
\hline Total & 48 & 39 & 48 \\
\hline
\end{tabular}

\section*{Condition Code Input Selection}

There is only one condition code (CC) input for the Am2910. The status conditions have to be multiplexed into this input. The assignments for the multiplexer input lines can be found in the ISP description in Appendix 1 (ISPS procedure Condition.Code). Five bits are used to select one out of 32 different input signals. The sixth bit in this field is used to select between the original signal and the complement of the signal. In this manner, the micromachine can branch when the signal is either high or low. When an unconditional microprogram branch is required, a logic 0 can be selected for the CC input.

Each bit froin the Instruction Register (IR, 5 bits) or from the Memory Buffer Register (MBR, I2 bits) can be selected individually. This capability is used for the basic PDP-8 instruction decode, effective address calculation, and the Group 7 microinstruction decode. Random combinational logic is used to generate a single skip.enable signal for the portion of the microprogram that decodes the PDP-8 skip conditions. Interrupt requests are also handled by using combinational logic in a similar manner.

\section*{ALU Operations and the Link Bit}

Three Am2901 ALU chips are cascaded to form the PDP-8 ALU section. The ALU requires a 9 -bit opcode: source, function, and destination. Six bits are used to encode the A port (3 bits) and B port ( 3 bits) select, since only a subset of the sixteen ALU RAM registers is used in this implementation.

The PDP-8 Link bit is constructed from random logic controlled by a set of signals. For economic reasons, random logic is used rather than adding another Am2901 chip. The Link bit does not correspond to any Am290I function, and its control would
have to be separately microprogrammed. Another alternative for the PDP-8 Link bit is to use one of the Am290I RAM registers for storing the value. In this case, additional Link-handling microcode would have to be inserted after each PDP-8 ALU operation, increasing the target instruction execution time.

\section*{Data Input to the \(A L U\)}

There is only one method of writing external data into the Am2901 ALU. It is through the Direct (D) input. In this PDP-8 design, three sources are connected to share the D input: data from the main memory (MBR), constants for ALU operations (the Mask ROM), and data in the switch register (SWITCHES). These three sources are connected by an input bus to the D input port on the ALU. The microword selects which one of the three will be the source during any given microcycle.
The use of a separate ROM to store the constants can be debated. An alternative is to store the constants in the microword. It is wasteful to dedicate a microword bit field to this purpose, since the width of this field must be the same as the ALU width and constants are used infrequently. If the microword fields are multiplexed, we violate the design goal of clarity. Hence, a constant ROM is a good compromise between the two conflicting objectives. One need only store the address of the constant in the microprogram.

\section*{Miscellaneous Control Signals}

The data part of this design requires many miscellaneous control signals. For example, the Link bit uses seven different signals to control its operation. Analysis indicates that only one of these signals needs to be asserted during any given microcycle. The Miscellaneous Control Select field in the microword selects one and only one signal during each microcycle. The selection code is decoded and directed to the associated destinations. The assignments of the signals can be found in the ISPS description.

\section*{The PDP-8 Primary Memory}

The primary memory (MP) for the PDP-8 target machine is assumed to be constructed from "static" semiconductor memory chips. In this type of memory, the output constantly displays the content of the location selected by the address input, unless a write operation is in progress. In this PDP-8 design, the ALU output is connected with the Memory Address Register (MAR) and with the data input port of the MP. When the write enable line of the MP is asserted, the content of the ALU output port is latched into the location selected by the MAR. The Memory Buffer Register (MBR), an ISP implementation pseudoregister, is constantly displaying the content of the location selected by the MAR. For the ISPS simulation, the memory access speed is assumed to be less than one microcycle. One can read the value of

MBR (containing data from MP) two microcycles after a "write" into the MAR.

\section*{The Microprogram}

The encoded microprogram that emulates the PDP-8 basic instruction set is listed in Appendix 2. This program listing is extracted from an ISPS simulator command file used to simulate this microprogrammable machine. The content of the constant ROM (Mask) is defined using the ISPS simulator "set" command, e.g., "set Mask[4]=\#0177." The content of the microprogram store is also defined in this manner. As an example, the instruction fetch cycle is now described. (For readability, the encoded microword is broken into seven fields separated by dashes.)

> set \(u M P[000]=\# 03-010-10-403-12-00-10\)
> !RUN: MAR \(\leftarrow\) LastPC \(\leftarrow P C\), IF PDP8.go \(=0\) goto HALT:

If the PDP-8.go bit is off (Condition code select 10), the microprogram jumps to Halt: (location 010). The content of PC (ALU RAM[1]) is pushed to the ALU output. The value is also latched into LastPC (ALU RAM[2]). Concurrently, the value is latched into the Memory Address Register (MAR) using the control code 10 .
\[
\text { set } u \mathrm{MP}[001]=\# 16-000-00-503-11-2 \mathrm{I}-00!\mathrm{PC} \leftarrow \mathrm{PC}+1
\]

The value \#0001 is selected from the constant Mask ROM (2I). The PC value is selected at the ALU A port, added to the constant, and then latched back into PC.
set \(\mathrm{uMP}[002]=\# 03-040-41-703-05-10-15\)
! IR \(\leftarrow \mathrm{ALU} . \mathrm{Mb} \leftarrow \mathrm{MBR}\), goto Exec:
The content of the Memory Buffer Register (obtained by the MP[MAR] operation) is latched into the ALU.Mb (ALU RAM[5]). In this cycle, the MBR is also latched into the Instruction Register (IR) by the control signal I5. The microprogram jumps to the instruction execution section (location 040, Exec:) by forcing a pass-test condition (4I) into the Am2910 sequencer Condition Code input.
set \(\mathrm{uMP}[004]=\# 03-000-03-74 \mathrm{I}-00-20-10\)
!ENDex: MAR \(\leftarrow 0\), IF no interrupt goto RUN:
When the instruction execution is finished, the microprogram returns to this point. The MAR is set to zero in anticipation of interrupt servicing. The MAR will be reset to the correct PC value by microinstruction \(\mathrm{uMP}[00 \mathrm{I}]\) later on. If the interrupt request is not granted (condition code 03), the microprogram jumps back to RUN: (location 000). Otherwise, the program continues to location uMP[005] to handle the interrupt.

\section*{Implementation and Simulation Results}

The micromachine and the microcode were simulated and tested by the ISPS simulator. The results are presented here.

\section*{Chip Count}

Since the micromachine was not actually built, the chip count is an estimate of the required hardware parts. The goal of this exercise is to identify the inefficient area in terms of the parts count, and to suggest alternative IC chip types that may reduce the parts count. (See Table 2.)

The parts count for this microprogrammed PDP-8 implementation is 35 chips. Of these IC parts, over two-thirds ( 25 chips ) are SSI or MSI types. If IC custom-made parts are available for the Link bit, the Skip-condition generate, and the Pipeline Register, the design can be reduced to 22 chips.

\section*{Target-Machine Instruction Execution Speed}

Two methods of comparing this microprogrammed PDP-8 and a basic PDP-8 are discussed here. By counting the average number of microinstructions executed for a target instruction, one can estimate the execution speed of the emulated PDP-8. Or one can compare the execution speed of the two ISPS simulators.

For each target PDP-8 instruction, the microprogram must execute the following number of microinstructions (Table 3).

On the average, 18 microwords \((4+3+6+5\) or \(4+3+11)\) are needed to do one PDP-8 target instruction. At the manufacturer-recommended microcycle time of 150 ns , and not counting the PDP-8 Mp access time, the microprogram execution speed is \(2.7 \mu\) s per target instruction ( \(150 \mathrm{~ns} \times 18\) ). The Mp access time is usually quoted at \(1.3 \mu \mathrm{~s}\) for PDP-8/E and /M [Bell, Mudge, and McNamara, 1978]. For an average instruction (i.e., indirect memory reference), three memory accesses are required: instruction fetch, pointer to data (one level of indirection), and the actual data fetch. When these are added to the \(2.7-\mu \mathrm{s}\) microprogram execution time, the projected maximum average instruction time is \(6.6 \mu \mathrm{~s}\).

Another method of comparison involves the ISPS simulator. Several PDP-8 benchmark and diagnostic programs were simulated. The CPU times used by each simulator were compared. The microcoded PDP-8 uses approximately 20 times the CPU time used by the basic PDP-8 ISP. Translated into simulation CPU time, the ISP simulator of the micromachine executes approximately 1.5 PDP-8 target instructions for every CPU second on a DEC KL-10 processor.

Table 2 Chip Count for a Microprogrammed PDP-8
Chip count Description

1 Am2910 microsequencer. The advantage of using the Am2910 instead of the Am2909 Sequencer is evident here. The Am2909 requires two chips instead of one Am2910 for this example.

3
5 (estimated)

3
4
3 Constant Mask ROM and associated ALU D input selection control. The Constant Mask uses two ROM chips. The D
Constant Mask ROM and associated ALU D input selection control. The Constant Mask uses two ROM chips. The D
input control uses one 2-to-4 decoder. The source registers for the ALU D input bus are assumed to have build-in tristate drivers.
4 (estimated)
Microstore. The microword width is between 39 bits and 48 bits (see Table 1). In using 8-bit-wide ROM or EPROM parts, six such chips are required. Since the microprogram is less than 128 words ( 7 address bits), many commercially available memory chips can be used here.
\(6 \quad\) Pipeline Register (Pipe). Eight-bit-wide \(D\) flip-flops are assumed here. This register is very expansive in terms of chip count. An alternative would be having a special ROM type that can latch the data in the output buffer. Another alternative is to latch the microaddress instead of the microword. In this second design, the microword fetch and ALU-Sequencer operations are in series rather then in parallel as in the original design. This is a classical costperformance tradeoff.

\section*{Am2901 ALU bit slices. Three slices are used to provide the 12-bit-wide PDP-8 data path.}

Link bit and associated hardware. The link bit in this design is constructed of a \(D\) flip-flop, some tristate drivers, and input multiplexers. SSI implementation of the Link bit requires 14 percent (5 out of 35) of the total chip count. An alternative is to use a custom-made MSI chip for the Link bit. A second alternative is to implement the Link bit in the ALU RAM registers. In this second design, additional microcode will have to be inserted to handle the special cases, degrading the overall performance.
Condition Code input multiplexer. Two 16-to-1 MUXs and two 2-to-1 MUXs.
PDP-8 Skip condition generate. The argument for a custom MSI chip can also be made here.

Other miscellaneous parts.

\section*{Table 3 Average Number of Microinstructions Executed for a Target Instruction}

\section*{Words Description}

4 PDP-8 instruction fetch cycle. Check PDP-8.go, fetch target instruction, increment PC, check interrupt conditions.
3 Instruction decodes. A straightforward binary decision decode tree is implemented in microcode. An alternative is to use the Instruction Decode Mapping ROM capability of the Am2910. The advantage of this alternative is not clear in view of the simple PDP-8 ISP.

6 Effective Address Calculation. Depending on the addressing mode, there are five possibilities
2 words PDP-8 Page 0 address
4 words current page
6 words indirect address, Page 0
8 words indirect address, current page
9 words auto index
On the average, approximately six microinstructions are needed to calculate the PDP-8 effective address (equivalent to the Page 0 indirect address).
5 Memory Reference Instructions. For each target instruction, the microcode fetches data from primary memory, executes the operation, and deposits the result in memory. Depending on the particular target instruction, anywhere between two microinstructions (JMP) and eight microinstructions (ISZ) are needed. On the average, five microinstructions are assumed.
PDP-8 OPR group microinstructions. The decoding and execution of the PDP-8 OPR instructions are highly sequential in nature. Therefore, 11 microinstructions executed is taken as the average.

\section*{Summary}

In this chapter, the design of a microprogrammed PDP-8 was presented. The central component of this micromachine was the AMD bit-sliced microprocessor. Although the design was optimized toward the basic PDP-8 configuration, many issues common to all microprogramming and RT-level hardware designs were illustrated. In simulating the micromachine, the usefulness
of the ISP descriptive language as a design tool was also demonstrated.

\section*{References}

Bell, Mudge, and McNamara [1978].
```

AM08 :~
begin
1 ISP of a POP: amulator using the AM2901 end AM2910 bit slice uPg chips
The AM2901 description is expanded to fit the 12 bits wide POP8 dete path,
The AM2910 sequencer uses 7 address bits to eddress }128\mathrm{ microwords.
(1) AMO8 description conteins the following mejor sections.
(2) declerations for the POP\& target mechine, the sequencer, and the ALU.
(2) implemantation pseudo registers for the ISPS description.
(4) the sequencer condition Code (CC)
(4) the sequencer Condition Code (CC) input mutiplexer.
*-PDPB.State**

| MP［0：4095］（11：0）． | Sasic PDPS 4k memory |
| :---: | :---: |
| switches＜11：0才． | ！Switch Register |
| Lく＞． | ！Link Register |
| interrupt．enable＜＞， | ！Interrupt Enable |
| PDP8．go ${ }^{\text {c }}$ ， | ！RUN Dit for the PDPE terget machine |
| IR（0：4） | ！Instruction register \＆pb \＆ib |
| pb＜＞：$=$ IR $\langle 4\rangle$ ， | ！Pege bit |
|  | Sendirect bit |

**Sequencer.State**
uMP[0:127]<47:0>.
PIPE(47:Q),
ulR<5:0) {= P1PE{47:42>, I AM2910 instruction, padded to B bits
nxtAddr<B:0\rangle := PIPE\langle41:33>, l next micro-eddress field
ALU controls
ALUir<8:0\rangle := PIPE\langle28:18\rangle
src<2:0\rangle := ALUir<8:6>,
fnc<2:0\rangle := ALUir<s:3).
dst(2:0):- ALUir<2:0)
ALUport<5:D> ! = PIPE<17:12>. ALU RAM port select
MaskSel<2:0\rangle:= PIPE < \&:5\rangle
MisCntr〈5:0\rangle := P1PE <5:0>, ! Miscellaneous Control signals
**AlU,State**
mask[0:15] <11:0\rangle.
status<3:0>,
SEQ.ovr<> := stetus<3>
ALU.ovr<> := status<2>
ALU.n<> := status<1>
ALU.z<> := status<0>
Microprogram memory
Pipeline Register
souruction
Source
function
destinetion
Constent mesk ROM
SEO result status
ALU result overflow
ALU result is negetive
l AlU result is zero

```
* Implementation. Registers**
AMD8.go<>.
go bit for the micro machine
\begin{tabular}{|c|c|}
\hline interrupt． MBR \(\langle 11: 0\) ）． & quest（＜）， \\
\hline MB＜0：11＞ & ：\(=\) MBR \(\langle 11: 0\rangle\) \\
\hline group＜＞ & ：＝MB〈3＞． \\
\hline CLAS \({ }^{\text {che }}\) & ：\(=\) MB \(\langle 4\rangle\) ． \\
\hline CLLく〉 & ：\(=\) MB〈 5 ¢ ． \\
\hline CMAく＞ & ：\(=\) MB＜B＞． \\
\hline CMLく＞ & ；\(=\) MB〈 \({ }^{\text {c }}\) ， \\
\hline RARS＞ & ：\(=\) MB \(\langle 8\rangle\) ， \\
\hline RAL＜\({ }^{\text {c }}\) & ：\(=\) MB \(\left\langle 9{ }^{\text {a }}\right.\) ， \\
\hline Rix（\％） & ：\(=\) MB \({ }^{\text {c }}\)（10＞． \\
\hline ［ACく＞ & ：\(=\) M 8 ＜11＞． \\
\hline SMAく＞ & ：\(=\) MB \(\langle 5\rangle\) ， \\
\hline SPAく＞ & ：\(=\) MB \(\langle 5\rangle\) ． \\
\hline S2AC＞ & ：\(=\) MB〈Б〉． \\
\hline SNAC＞ & \(:=M B\langle B\rangle\) ． \\
\hline SNLく〉 &  \\
\hline S2L＜＞ & ；\(=\) MB〈 \({ }^{\text {a }}\) ， \\
\hline isく＞ & ：\(=\) MB＜日＞． \\
\hline OSRく＞ & ：＝MB＜g＞， \\
\hline Heli＜＞ & ：\(=\) MB \(\langle 10\rangle\) ， \\
\hline
\end{tabular}

UMP．out 〈47：0〉，
CCodeく＞．
0 in＜11：0＞，
Memory Buffer Register，output of MP［MAR］ Memory Buffer Regist
pops bit assignments
Microinstruction group
Clear AC
Clear AC
Clear Link
Complement AC
Rotete right
Rotete righ
Rotete left
Rotate twice
Increment AC
Skip on minus AC
Skip on positive AC
Skip on zero AC
Skip on AC not zero
Skip on Link not zero
－Skip on Link zero
Invert skip sense
Logical or AC with switches
Halt the processor
ustore output
candition code input
direct input to ALU
ALU carry in
emp．AM2910＜15：0＞
UMP．addr〈6：0〉：\(=1 \mathrm{emp}\), AM2910〈8：0＞，｜effective uStore eddress
ALU．AM2901《17：0），
ALU．Cout \(\langle 1: 0\rangle\) ；\(=\) Iemp．AM2901＜11：0〉，I AlU result Y output

ALU．msb＜＞\(\quad:=\) lemp．Am2gol＜17＞，I ALU MSB output from RAM shifter

\section*{＊＊AMO8．Ixecution＊＊}
start．AMD8（main）：
begin
Initialize the micro machine
AM08．go \(=1\) ：
get the uMachine going
UMP．out \(=\) UMP［0］；
\[
1 \text { start et uAddro }
\]

Initialize the target machine
force interrupt hending which begins at POPS PC＝1
POP8．go \(=1: \quad 1\) get the target－m
interrupt，enable \(=1\) enable interrupt
interrupt，request \(=1\) next \(\quad\) request interrupt
run．AMDA ：\(\quad\) I one uCycle
begin
P First half of the cycle
P！PE \(=\) uMP，out next I latch uword
uAtU．opr ： 1 ALU operations
OECODE DircSel \(\Rightarrow \quad \mid\) select Direct input to ALU
begin \(\quad\) Din \(=0\) ．
WO \(:=\) Din \(=0\),
H1 \(:=\) Din \(=\) MBR，
\＃2 ：＝Din＝mask［MaskSe1］
＊3 ：Oin＝switches
aECODE MisCntr \(\Rightarrow\)
I set ALU carry in bit
begin \(=\) ALU．Cin \(=1\) ，
OIHERWISt：\(=\) ALU．Cin \(=0\)
end next
I Do AlU computetion．Input instruction．Aport．Bport，
Direct input，carry in，MSD，LSB，and eneble output

（ 0 Al Uport \(\langle 2: 0\rangle\) ），Din，ALU．Cin，L，L，＇0）
Seq．opr ：＝！sequencer operation
Condition．Code（）next
！Do sequncer computetion，input instreution，next address．
l condition code，etc．
Temp．AM2910 \(=\) AM2910（uIR〈3：0）．（ 0000 nxtAddr \(\langle 8: 0\rangle\) ）．
CCode．＇0110）next
end next \(=\) UMP［UMP．addr］：UStore ROM eccess
Second half of the cycle
status〈3：0〉＝Iemp．Am2910〈15＞Temp．Am2901〈14〉 8 Temp．Am2901〈13＞8 Temp．Am2901〈12＞：
Do．Control（）next
If AMDE．go end RESTARI run．AMDE
end．
1 end of run．amos
l end of start．AMDE
＊＊Condition．Code＊＊
Condition，Code ；a I calculate condition code begin
Oecode（＇0 CCsel《4：0））\＆
begit
\(\# 00\)
\(:=\)
cCode \(=\)
h01 ：\(=\) CCode \(=1\) ．
\(40 \Rightarrow\) always pass tes
HOI \(\Rightarrow\) always feil test
\(W 41 \Rightarrow\) lways pass test
02 ：＝CCode＝Skip．Cond（）．I calculate Skip condition
\＃03 ：＝CCode \(=\) interrupt，request AND interrupt．enable．
！check for interrupt handing
W04 ：＝CCode \(=\) status \(\langle 0\rangle, \quad\) ALU．z，ALU output eq1 0
HOS ：＝CCode \(=\) status \(\langle 1\rangle\) ．
\＃08 \(:=\) CCode \(=\) status〈2〉，\(\quad \mid\) ALU．ovr，ALU operet ion overflow
\＃07 ：＝CCode＝status〈3＞，I SEQ．ovr，sequncer stack overflow



\begin{tabular}{l|l}
\(113:\) CCode \(=1 R\langle 0\rangle\). & Mss of \\
\(114:=\) CCode \(=p h\). & page bit
\end{tabular}
\(\begin{array}{ll}\text {＂14 ：}=\text { CCode }=\mathrm{pb}, & \text { page bit } \\ \text {＂15 ：}=\text { CCode }=\mathrm{ib}, & \text { indirect bit }\end{array}\)
\begin{tabular}{|c|c|}
\hline \＃20 ：\(=C C o d e=M \Delta\langle 11\rangle\) ． &  \\
\hline W21 ：\(=\) CCode \(=\) MB 210\(\rangle\) ． & ！RIX．HLT \\
\hline ＊22 ：＝CCode＝M0〈09）， & 1 PAL，OSR \\
\hline  & ！RAR，is \\
\hline ＊24 ：＝CCode＝M8〈07＞ & 1 CMI．SML．SIL \\
\hline \＃25 ：\(=\) CCode＝MB＜06＞． & 1 CMA，STA，SHA \\
\hline \＃26 ：\(=\) CCode＝MO＜OS＞． & ！CLL．SMA．SPA \\
\hline 127 ：＝CCode \(=\) MS＜04） ． & 1 CLA \\
\hline \＃30 ：\(=\) CCode \(=\) M0＜03） ． & ！group \\
\hline \＃31 ：＝CCode＝Mu＜O2）． & \\
\hline \＄32 ：\(=\) CCode \(=\) MB＜01＞ ． & \\
\hline \＃33 ：\(=\) CCode \(=\) MB \(\langle 00\rangle\) ． & ！MBR \(\langle 11\rangle\) ，MSO \\
\hline \＃34 ：\(=\) CCode \(=\mathrm{L}\) ． & －Link bit \\
\hline OTHERWISL ：\(=\) CCOde \(=0\) & \\
\hline end next & 1 end of decode \\
\hline
\end{tabular}
！when \(C C s e l\langle 5\rangle=0\) will pass test if the selected signa？is low．＂o＂ 1 when CCsel〈5〉＝1 will pass test if the selected signal is high，＂I＂ If CCsel〈ら〉 \(\Rightarrow\) CCode \(=\) not CCode

\section*{end．}

1 end of Condition．Code
－＊Skip．Conditions．for．POPG．＊

＊Miscelleneaus．Controls＂．
0．Control ：＊
oecoot Miscntr＝ beg in
HOO \(:=\)
HO．OP（

\(\begin{array}{ll}\text { \＃02 }:=L=A L U .1 s b, & 1 L=01 d \text { LSB } \\ \text { \＃03 }:=L=A L U . m S B, & i=01 d \text { MSB }\end{array}\)
\＃04 \(:=L=0, \quad\) i clear \(L\) ink
\＃00 ：＝L not L．I negate \(t\) ink
＂07 ：＝If CML \(\Rightarrow L=\) not \(L\) ．i conditional negete
＊10：（MAR＝AUU，out next i set MAR
MOR＝MP［MAR］）．i read POPA main memory
（MP［MAR｜\(=\) AI U，out next
MBIZ \(=\) MP［MAR \(])\) ．\(\quad\) write into POPB MP
\＃12 ：\(=\) interrupt．enable \(=0.1\) clear intrrupt enable
13 ：\(=\) interrupl．enable \(=1.1\) set Interrupt enable
\(\begin{array}{ll}\text { \＃14：}=A L U . C \text { in }=1 . & \text { iset Alu carry in } \\ N 15:=1 R=M 0\langle 0: 4\rangle, & \text { get instruction }\end{array}\)
\＃18 ：＝POPB．go \(=0 . \quad\) istop the larget machine
\＃17：AMBE．go \(=0\) ．i stop the micro machine

OTHERWISE ：＝NO．OP（）
end 1 end of Do．Control
t logical end of the Amiss description

ISPS of the AM2910 and AM2901 descriptions Olly the key parts of the description are isted here the incension is to show the necessary modifications on the original ISP，making the descriptton callable from the AMOB description．
：AMO 2910 microsequencer description in procedureal form

beg in
－AM2910．External．State＊
\begin{tabular}{|c|c|}
\hline CCENく＞ & \(:=\) Misc．\(i n\langle 3\rangle\) \\
\hline Ciく） & \(:=\) Misc．in＜2＞ \\
\hline HL．O＜＞ & \(:=\) Misc．in \(\langle 1\rangle\) \\
\hline OE（ \({ }^{\text {d }}\) & ：＝Misc．in＜0＞ \\
\hline Futtく＞． & \\
\hline MAP（ \({ }^{\text {P }}\) & enable＜1＞． \\
\hline PLく＞ & enable＜2＞． \\
\hline VECT＜＞ & \\
\hline
\end{tabular}
enable CC input
carry in to increament uPC
load \(R\) register with \(D\) input enable output
！Stack full flag
Map address enabla flag
Pipeline address enable flag
VECT＜＞：＝enable＜0），｜Vector address enable flag
＊am2910．Operstion．Cycle＊＊（us）
run．AM2910（main）：＝
if not RLO \(\Rightarrow R=0\) next
Y（） \(\mathrm{n}_{6 \times \mathrm{t}}\)
uPC \(=Y+C\)
end．
P Besic operation loop
！Forced（external）load of reg．
：Put out selected address
next
！Increment pc
：end of run．AM2910
1 description of \(Y()\)
end． 1 end of AM290！description

AM2901 ALU description in procedural form
 begin
：＊＊Am2901．External．Stare＊＊
```

out<17:0\rangle,
R3out<<
H0ul(> = Out<\17>.
Coul<> := out<16>,
On4<> := out<15>,
|3<> := 0ut<13>,
FEOLO<> := Out<12>.
Y<11,0\rangle = 0ut<1:0)
M<<<= alu<12\rangle

```
*AM2901. Instruction.Cycle**
    run.AM2901 (main\} : =
        ovg \(=\) feqio \(=\) c.out \(=0\) next
        source() next
        exec() next
        destination() next
        \(\mathrm{F}_{3}=\mathrm{f}\langle 11\rangle\) : \(\mathrm{C}_{\mathrm{n}} 4=\mathrm{c}\). out mext | set final flags
        AM2901 = oul
        end,
        ... ! description of source(). etc..
    end
    end

Qutput carrier
RAM3 output
RAMO oulput
Cerry out
Siverflow
Sign bit out
high if AlU output \(=0\)
oala outputs
－Carry out
＊AM2901．Instruction．Cycle＊＊
run．AM2901（main）：＝
f Initialization
！init flags
set final flags
I push out Ald result
！description of source（），etc．．
1 End of AM2901 description

I end of AMOB description

\section*{APPENDIX 2 SIMULATOR COMMAND FILE FOR AM2900} IMPLEMENTATION OF THE PDP－8
```

---
Simulator command file for the AM2g00 implementation of the POPS
radix octal
! constant Mask ROM. used as input to the ALU Oirect input port
s Mask[0]=H0000
s Mask[1]=\#0001
5. Mask[2]=\#0002
5 Mask[3]=M0010
s Mask[4]="0177
s Mask[5]=W7600
s Mask[7]=\#0171
! the micro program

```

```

instruction fetch and interrupl handing cycle

```

```

effective address calculation

```

```

do MB＝MP［ma（）］，fetch data in Mp pointed by ALU．ma UMP $[032]=\$ 0102041431000000$ IMpMa：cjs．call SUBma：
S UMP[033]=\#1200041703051000
! do MP[ma ]=MB. depposit ALU.mb
uMP[034]=\$8600000401400010
S
instruction execution
UMP[040]=\#0306653431000000
S UMP[041]=\#0305052431000000
s UMP[042 J =H0103241431000000
S UMP[043]=\#0304551431000000
5 UMP\044 =\#0304551431000000
s UMP[044]=W0300441143500000
s UMP[045]=\#0300441103500001

| IExec： |  | ［f $\{1 \mathrm{R}\langle 0\rangle=1$ goto | 1月4567 |
| :---: | :---: | :---: | :---: |
| $!$ |  | ［F 1月〈1〉＝1 goto | ［R23： |
| ！ | cjs． | call MpMa： |  |
| 1 |  | IF 18（2）＝1 goto | IAD ： |
| IANO： |  | $\mathrm{AC}+\mathrm{AC} \&$ AILU．mb． | goto |
| ITAU： |  | LAC＋LAC＋ALU．mb． | goto |

```
iS2 and OCA
    " " \# " \# \#
UMP \(0531=0305551431000000\)
いAP'1051 = = \# 0103241431000000
uMP 05 P \(\mid=1600000503552100\)
LMP P \(052 \mid=1600000503552100\)
IMP| 053\(\}=\$ 1303404431000000\)

WHP \(065 \mathrm{~J}=1 \% 0102041431000000\)
UMP' \(056 \mathrm{~J}=1600000403050000\)
UMPI [057] \(=0303441443000000\)
the other 4 instructions
    " " \# " " \# \# \#
UMP [066 \(\mathrm{J}=10307752431000000\)
UMP \([066]=\$ 0307752431000000\)
UMP \(067 \mathrm{~J}=\# 0102041431000000\)
UMP \(\mid 067]=0102041431000000\)
UMP \(\left.^{2} \mid 070\right\}=0307351431000000\)
IMMP| 070\(\}=\# 0307351431000000\)
UMP 071\(\}=\# 1600000403150000\)
UNP| \(071 \mid=\# 1600000403150000\)
UAP| \(072 \mid=\$ 0303441502412100\)
UAP
UMP \([073]=\$ 0300441403410000\)
group 6 instruction, \(10 T\) s
umpr077] \# "\# \# \# \#"
UMP \([077]=H 0312051431000000\)
UMP \([100]=160005402700\)
UMP \([100]=1600000540502700\)
UMP \([101]=\$ 1600000661002100\)
UMP \(102 \mathrm{~J}=\$ 0310404431000000\)
UMP[103] \(=10300041431000013\)
UMP [103 \(104 \mathrm{~J}=\$ 1600000661002200\)
UMP [105 \(\mathrm{J}=\# 0300404431002200\)
UMP' \([106]=\# 0300441431000012\)
11R67:
110T:
110 T
1月4567: If IR 1 (1)=1 goto [R07:
[1月4567: \(\begin{aligned} & \text { IF IR〈1〉=1 } \\ & \text { cjs. call subMa: }\end{aligned}\)
If \(\mid R\langle 2\rangle=1\) goto JMP:
    ALU. mbePC
    PC+AL.L.mat1. goto MpMaMb:
IJMS
I.JMP

Group 7 operating instructions
UMP[120]=\#03140704310000 \# \#
UMP [120] \(=40314070431000000\)
UMP [121] \(=00312327231000005\)
UMP [ 122 I \(=H 1600000443000000\)
UMA' \([123]=80312525770002007\)
UMP \({ }^{\text {UM }} 125\) 12 120312720431000000
UMP [ 125\(]=\# 0312720431000000\)
do rotates
do rotates \# W NW W W
UMP[ 127\(]=H 03133634310000140\)
UMP \([127]=\$ 0313363410000130\)
UMP \([130]=\$ 03004224310000000\)
UMP \([130]=\# 0300422431000900\)
UMP \(\mid 131]=\# 0300421407000003\)
UMP
UMP| \(132 \mid=H 03000421407000003\)
I
UMM'| 133 I \(=H 0300421405000002\)
UMP \([134]=\$ 0300441405000002\)
\(!!\)

Al U.outt-Q xor \(\$ 0001\)
skip If ALU. \(z=0\)
enable interrupt. goto RUN:
ALU.outa xor \(W 0002\)
If ALU. \(L=0\) goto ENDex:
group 2 micro instructions
N \# \# \# \# \# \# \# \# \#
SUMP \(1401=* 0300400431000000\)
s UMP \((140)=10300480431000000\)

s uMP ; \(42 I=H 1600000401000018\)
s uMP 143\(\}=\$ 0314502431000000\)
S LIMP 1 144 \(\mid=\# 1600008503112100\)
s LIMP| \(144 \mid=1600008503112100\)
s UMP
s
S UMP \({ }^{2} 1451=10314727431000000\)

5 UMP
S UMP \([157]=\# 0300422431000000\)
150
! end of aicrocode
IOPR:
ImGRP 1 :
If
disable interr, goto ENDex:
If GAP \(=1\) goto mGRP2:
skip if CLA=0, cond clear Link
sk ip [F CMA \(=0\). Q - 11717. cond comp Link
AC - not \(A C\)
skip if IAC=0
LAC-LAC+1, carry-out cond comp Link
If \(\mathrm{AAH}=1\) goto Right:
If RAL \(=0\) goto EMDex
\(L A C+L A C+2\). If \(r t=0\) goto ENDex:
LAC+LAC 2, If rt=0 goto
LAC-LAC*2, got.o ENO日X:
LAC+LAC/2. If rt=0 goto ENOex:
\(L A C+L A C / 2\), goto \(r=0\) goto
\(L A C+L A C / 2\). goto
disable interr，goto ENOex：

LAC－LAC＋1，carry－out cond comp Link

If \(\mathrm{Mb}\langle 11\rangle=1\) goto ENOex：
skip if hal \(\ell=0, \quad Y\)
POPB．go \(0, ~ Y+A C\)
POPS．goot Y Y AC
skip if
\(1 \mathrm{C} \cdot \mathrm{PC}+1\)
ler－PC＋1
skip If

ac＋0
1 F osr＝0 goto thoex：
1 F os \(\mathrm{r}=0\) goto tNOex：
\(\mathrm{AC}-\mathrm{AC}\) or SWh，goto ENOex：

\section*{Chapter 46}

\section*{The PDP-8 Family \({ }^{1}\)}

\author{
C. G. Bell / J. E. McNamara
}

Figure \(\mathbf{1}\) depicts the PDP-8 family tree. The family ancestry began with the Laboratory Instrument Computer (LINC) initially built at the MIT Lincoln Laboratory in 1962, which, incidentally, we believe was the earliest personal computer. DEC began manufacturing LINC's in 1965. Eventually a PDP-8 and LINC were combined in a dual processor called the LINC-8.
In 1962, the need arose to produce a replacement for an analog monitoring system as a front end to a reactor control complex. A 12-bit real time control computer, the PDP-5, was constructed.
\({ }^{1}\) Abstracted from C. G. Bell, J. C. Mudge, and J. E. McNamara, Computer Engineering: A DEC View of Harduare Systems Design, Digital Press, Maynard, Mass., 1978, pp. 175-208.

The analog nature of the initial application was addressed by building an analog-to-digital converter into the Accumulator. The concept of an I/O bus was introduced instead of the radial I/O structure of previous DEC designs. The I/O Bus permitted equipment options to be added incrementally from a zero base instead of having the pre-allocated space, wiring, and cable drivers that characterized the radial structure. This lowered the entry cost of the system and simplified the later reconfiguring of machines in the field.

Although the design was optimized around the 4 -Kword memory, the PDP-5 ultimately evolved to 32 -Kword configurations using a memory extension unit. Similarly, although the base machine design did not include built-in multiply and divide functions, these were added later in the form of an Extended Arithmetic Element.

While the PDP-5 had been a reasonably successful computer, it soon became evident that a new machine capable of far greater performance was required. New logic technology promised a substantial speed improvement, and new core memory technology was becoming available that would permit the memory cycle


Fig. 1. PDP-8 family tree.
time to be shortened from 6 microseconds in the PDP-5 to 1.6 microseconds in the new machine. In addition, the cost of logic was now low enough so that the program counter could be moved from the memory to a separate register, substantially reducing instruction execution times. The new machine was called the PDP-8.

The new 12 -bit machine was only half the size of its predecessor, occupying only half a cabinet. The net small size meant that the PDP-8 was the first true minicomputer. It could be placed on top of a lab bench or built into equipment. It was this latter property that was the most important, as it laid the groundwork for the original equipment manufacturer (OEM) purchase of computers to be integrated into total systems sold by the OEM.

Like its predecessor the PDP-5, the PDP-8 was a single-address I2-bit computer designed for task environments with minimum arithmetic computing and small primary memory requirements. Typical of these environments were process control applications and laboratory applications such as controlling pulse height analyzers and spectrum analyzers.

The PDP- 8 was the first of the " 8 Family." A subset, called "Omnibus 8 " machines, is introduced later when the PDP-8/E, PDP-8/M, and PDP-8/A machines are discussed. Finally, computers which implement the PDP-8 instruction set in a single complementary metal oxide semiconductor (CMOS) chip will be referred to as "CMOS-8" based systems.

The PDP-8, which was first shipped in April 1965, and the other 8-Family machines that followed it achieved a production status formerly reserved for IBM computers with about 50,000 machines produced by 1979, excluding the CMOS-8 based computers. During the I5 years that these machines have been produced, logic cost per function has decreased by orders of magnitude, permitting the cost of entire systems to be reduced by a factor of 10 . Thus, the 8 Family offers a rare opportunity to study the effect of technology on implementations of the same instruction set processor from early second generation to late fourth generation.

The PDP-8 was followed in late 1966 by the PDP-8/S, a cost-reduced version. The PDP-8/S was quite small in size, scarcely larger than a file cabinet drawer. It achieved its low cost by implementing the PDP-8 instruction set in serial fashion. This did reduce the cost, but it so radically reduced the performance that the machine was not a good seller.

In 1968, the PDP-8/I was produced, using medium-scale integration (MSI) integrated circuits to implement the PDP-8 instruction set with better performance than the PDP-8, and at two-thirds the price. For those customers wishing a package with less option mounting space but the same performance, the PDP-8/L was introduced later the same year.

The PDP-8/S, PDP-8/I, and PDP-8/L are mentioned only briefly here because their characteristics were basically dictated
by the cost and performance improvements made possible by the emerging integrated circuit technology. The cost and performance figures for these machines are examined in greater detail in Figs. 4 to 8 and Table I.

Shortly after the introduction of the PDP-8/L, it became evident that customers wanted a faster and more expandable machine. The continuing technological trend toward higherdensity logic and some new concepts in packaging made it possible to satisfy both of these requirements but to still produce a new machine that would be cheaper than its predecessor. The new machine was the PDP-8/E. The PDP-8/E incorporated an adapter for interconnecting to PDP-8/I and PDP-8/L I/O devices. In addition, signal converters were available for interconnecting to the older PDP-5, PDP-8, and PDP-8/S I/O devices. Thus it was not neccssary to design a complete new set of options at the time the machine was introduced, and existing customers could upgrade to the new computer without having to buy new peripherals. The reason for using an adapter to connect to existing I/O devices was that the PDP-8/E featured a new unified-bus I/O Bus implementation called the Omnibus.

The Omnibus, which is still in use in the PDP-8/A, has I44 pins, of which 96 are defined as Omnibus signals. The remainder are power and ground. The large number of signals permit a great number of intraprocessor communications links as well as I/O signals to be accommodated. The Omnibus signals can be grouped as follows:

\section*{1 Master timing to all components}

2 Processor state information to the console
3 Processor request to memory for instructions and data
4 Processor to I/O device commands and data transfer
5 I/O device to processor, signaling completion (interrupts)
6 I/O Direct Memory Access control for both direct and Three Cycle Data Break transfers

The approximately 30 signals in groups 4 and 5 provide programmed I/O capability. There are about 50 signals in group 6 to provide the Direct Memory Access capability. These 80 signals are nearly equivalent in quantity and function to the preceding PDP-8 I/O Bus design, making the conversion from Omnibus structure to PDP-8/I and PDP-8/L I/O equipment very simple.

The processor for the PDP-8/E occupied three \(8-\times 10\)-inch boards; 4 Kwords of core memory took up three more boards; a memory shield board, a terminator board, a teleprinter control board, and the console board completed the minimuin system configuration. Thus, a total of ten \(8-\times 10-\) inch boards formed a complete system. The three-board PDP-8/E processor, occupying \(240 \mathrm{in}^{2}\), was in striking contrast to the 100 -board PDP- 5 processor, which occupied 2,100 \(\mathrm{in}^{2}\).

The PDP-8/E implementation was determined by the availability of integrated circuits. Multiplexers, register files, and basic arithmetic logic units performed the basic operations in a straightforward fashion using a simple sequential controller. Microprogrammed control was not feasible because suitable read-only memories were not available. Integrated circuit readonly memories available at that time were too small, holding only about 64 bits.

The PDP-8/E was mounted in a chassis which had space and power to accommodate two blocks of Omnibus slots. Thirty-eight modules could be mounted in the slots, allowing space for the processor and almost 30 peripheral option controllers. Many customers wanted to build the PDP-8/E into small cabinets and have it control only a few things. They found the large chassis and its associated price to be more than they wanted. To reach this market, the PDP-8/M was designed.
THE PDP-8/M was essentially a PDP-8/E cut in half. The cabinet had half the depth of a PDP-8/E, and the power supply was half as big. There were I8 slots available, enough for the basic processor-memory system and about eight options. The processor was the same as that for a PDP-8/E.
By 1975, DEC had been building "hex" size printed circuit boards. The hex boards were \(8 \times 15\) inches, half again as big as the "quad" boards used in the PDP-8/E and PDP-8/M, which were 8 \(\times 10\) inches. The dimensional difference was along the contact side of the board. A hex board had six sets of 36 contacts while the quad hoard had only four sets. Semiconductor memory chips had also become available, so a new machine was designed to utilize the larger boards and new memories to extend the PDP-8/E, PDP-8/M to a new, lower price range. The new machine was the PDP-8/A. The PDP-8/A processor and register transfer diagram is shown in Fig. 2.
The hex modules permitted some of the peripheral controller options that had occupied several boards in the PDP-8/E to fit on a single board in the PDP-8/A. The availability of hex boards and of larger semiconductor read-only memories permitted the PDP-8/A processor to use microprogrammed control and fit onto a single board. It should be noted here that when a logic system occupies more than one board, a lot of space on each board is used by etch runs going to the connectors. This was particularly true of the PDP-8/E and PDP-8/M processor boards, due to the contacts on two edges of the boards. When an option is condensed to a single board, more space becomes available than square inch comparisons would at first indicate because many of the etch lines to the contacts are no longer required.

The first PDP-8/A semiconductor memory took only 48 chips ( 1 Kbit each) to implement 4 Kwords of memory. Memories of 8 Kwords and I6 Kwords were also offered. In 1977, only 96 I6-Kbit chips were needcd to form a 128 -Kword memory. With greater use of semiconductor memory, especially read-only memory, a
scheme was devised and added to the PDP-8/A to permit programs written for read-write memory to be run in read-only memory. The scheme adds a 13th bit to the read-only memory to signify that a particular location is actually a location that is both read and written. When the processor detects the assertion of the 13th bit, the processor uses the other 12 bits to address a location in some read-write memory which holds the variable information. This êffectively provides an indirect memory reference.
In 1976, an option to improve the speed of floating-point computation was added to the PDP-8/A. This option is a single accumulator floating-point processor occupying two hex boards. It supports 3 - or 6 -word floating-point arithmetic (I2-bit exponent and 24 - or 60 -bit fraction) and 2 -word double precision 24 -bit arithmetic. As a completely independent processor with its own instruction set processor, it has its own program counter and eight index registers. The performance, approximately equal to that of an IBM 360 Model 40, provides what is probably the highest performance/cost ratio of any computer.
More Omnibus 8 computers (PDP-8/E, PDP-8/M, PDP-8/A) have been constructed than any of the previous models. The high demand for this model appears to be due to the basic simplicity of the design, together with the ability of the user to easily build rather arbitrary system configurations.
In 1976, Intersil offered the first PDP-8 processor to occupy a single chip, using CMOS technology. (Here we should note that an internal to DEC processor-on-a-chip effort, the PDP-8/B, yielded a design in 1973.) DEC verified that it was a PDP-8 and began to apply it to a product in the fall of 1976. In the meantime, in additon to Intersil, Harris Semiconductor became a second source of chip supply for DEC.
The CMOS-8 processor block diagram is given in Fig. 3. The block diagran looks very much like a conventional PDP-8/E processor design using medium scale integrated circuits. It has a common data path for manipulating the Program Counter (PC), Memory Address (MA), Multiplier-Quotient (MQ), Accumulator (AC), and Temporary (Temp) registers. The Instruction Register (IR), however, does not share the common arithmetic logic unit (ALU). Register transfers, including those to the "outside world," are controlled by a programmable logic array (PLA), as indicated by the dotted lines in the figure.
While the CMOS-8 is the first DEC processor to be built on a single chip, the most interesting thing about it is the systems configurations that it makes possible. It is not only small in size (a single 40 -pin chip), but it also has miniscule power requirements due to its CMOS construction. Thus, some very compact systems can be built using it.
An excellent example of the use of a CMOS-8 as part of a packaged system is the VT78 video terminal. The goals for this terminal were to drastically reduce costs by including the keyboard, cathode ray tube, and processor in a single package the

Table 1 Characteristics of PDP-8 Family Computers
\begin{tabular}{|c|c|c|c|c|}
\hline & PDP-5 & PDP-8 & PDP-8/S & PDP-8/I \\
\hline Project start, first ship & 9/63 & 4/65 & 9/66 & 4/68 \\
\hline Goals & Lowest cost computer, interfaceability & Cost, much greater performance & Cost; tabletop & Better cost, more function than 8 \\
\hline Applications & Process control monitoring; laboratory & + message switch control Lab processing for instruments & Standalone calculator & \[
\begin{aligned}
& \text { Remote job } \\
& \text { entry station, } \\
& \text { TSS/8 }
\end{aligned}
\] \\
\hline Innovations/improvements & I/O bus; ISP & Wire-wrap; producible; low cost bitsample communications controller & Serial implementation & Integrated circuits \\
\hline ```
Processor + 4 Kword
    memory (K$)
``` & 25.8 & 16.2 & 8.79 & 11.6 \\
\hline Same + terminal (K\$) & 27.0 & 18.0 & 9.99 & 12.8 \\
\hline Price/memory word (\$) & 1.83 & 1.83 & 0.73 & 1.46 \\
\hline \[
\begin{aligned}
& \text { Processor }+8 \text { Kword }+ \\
& \text { terminal }+ \text { mass storage }
\end{aligned}
\] & 51.1 & 38.8 & 30.4 & 28.9 \\
\hline Memory cycle time & 6.0 & 1.6 & 8.0 & 1.5 \\
\hline Processor Mwords accessed/s & 0.1 & 0.63 & 0.04 & 0.67 \\
\hline Processor bits accessed/s/\$ & 93 & 466 & 55 & 651 \\
\hline Performance/price improvement (over predecessor) & .... & 5.0 & 0.12 & 11.8 \\
\hline Price improvement & .... & 1.6 & 1.84 & 0.76 \\
\hline Performance improvement & & 6.3 & 0.06 & 16.75 \\
\hline Product life (years) & 3 & 5 & 3 & 3 \\
\hline Programmed I/O bus & 49 & 49 & \(43+\) Bus & 40 \\
\hline DMA I/O bus & 49 & 49 & 49 & 50 \\
\hline
\end{tabular}
size of an ordinary terminal. The CMOS-8 chip and high density RAM chips made this possible. To form a complete, stand-alone computer system that supports five terminals, mass storage was added. Because the mass storage was floppy disks, it was not in the terminal but in a small cabinet. Even without the mass storage, however, the VT78 forms an "intelligent terminal." An intelligent terminal is usually defined to include a computer
whose program can be loaded (usually via a communications line) to take on a variety of characteristics-i.e., it can learn. An intelligent terminal can be used either as part of a network or as a stand-alone computer system. In the former case, the application is determined by the network to which the terminal is attached, but in the latter case, the terminal functions as a desk-top computer running various PDP-8 software.
\begin{tabular}{lllll}
\hline PDP-8/L & PDP-8/E & PDP-8/M & PDP-8/A & VT78 \\
\hline \(11 / 68\) & \(3 / 71\) & \(6 / 72\) & \(1 / 75\) & \begin{tabular}{l} 
Lower cost
\end{tabular} \\
& \begin{tabular}{c} 
Easy to configure; \\
more functions \\
better performance \\
+Business data \\
processing, \\
testing;
\end{tabular} & \begin{tabular}{c} 
Lower cost, \\
limited system
\end{tabular} & \begin{tabular}{c} 
Lower cost \\
higher density
\end{tabular} & \begin{tabular}{c} 
Cost; complete \\
system in a \\
terminal
\end{tabular} \\
& & & \begin{tabular}{c} 
Computer-in-a- \\
desk
\end{tabular} & \begin{tabular}{c} 
Word processing; \\
desk-top com-
\end{tabular} \\
& & & & puter, terminal
\end{tabular}

\section*{Technology, Price, and Performance of the 12-Bit Family}

The PDP-8 has been re-implemented 10 times with new technology, from early second-generation to late fourth-generation, over a period of 15 years. Its implementations have included a minimal minicomputer and a minimal microcomputer. The performance characteristics of these implementations are given in Fig. 4. New
technology can be utilized in the computer industry in three ways: lower cost implementations as constant performance and functionality, higher performance implementations at constant cost, implementation of new basic structures. Of these three ways, the PDP-8 Family has primarily used lower cost implementations of constant performance and functionality.
The points in Fig. 5 are arranged to show the cost trends of


Fig. 2. PDP-8/A processor and register transfer diagram.
three configurations. The first configuration is merely a central processor with 4 Kwords of primary memory. The second configuration adds a console terminal, and the third configuration adds DECtapes or floppy disks for file storage. Note that the basic system represented in the first configuration has declined in price most rapidly: 22 percent per year in the early days and 15 percent per year in recent years. The price of primary memory, on the other hand, has declined at the rate of 19 percent per year, as seen in Fig. 6.

The price and performance trajectories for the PDP-8 family of machines are plotted in Fig. 7, with lines of constant price/ performance separated at factors of 2 . Note that the early implementations had significantly lower performance than the original PDP-8. Memory performance and instruction execution
performance were directly related in all of these machines except the PDP-5 (which kept the Program Counter in primary memory) and the PDP-8/S (which was a serial machine). Thus, with the design emphasis on lowering the cost with each new machine, performance continued to lag behind that of the PDP-8 until higher speed primary memory was available without a cost penalty. Other performance improvements, such as the addition of floating-point hardware or the addition of a cache, are not treated in this comparative analysis.

Figure 8 gives the performance/price ratio for the PDP-8 Family machines. Setting aside the PDP-5 design point, the improvement for the 12 -bit machines has been 22 percent per year.

Rather than try to fit a single exponential to the performance/


Fig. 3. Block diagram of CMOS-8.


Fig. 5. Price of DEC's 12-bit computers versus time (log).


Fig. 4. Performance of DEC's 12-bit computer versus time.


Fig. 6. Price per word of 12-bit memory versus time.


Fig. 7. Price versus performance of DEC's 12-bit computers.


Fig. 8. Bits accessed by the central processor/s/\$ versus time (for 4 K word + processor systems).


Fig. 9. Evolution of PDP-8 family PMS structures.
price data points in Fig. 8, it might be better to try two independent exponentials. The reason for this is that the data points really mark the transition between two generations. The PDP-5 was a mid-second (transistor) generation machine, and the PDP-8 represents a late second generation machine. The PDP-8/I and PDP-8/L were beginning third (integrated circuit) generation designs. These four machines represent a relatively rapid evolution from 1963 to 1968. After the PDP-8/L, the evolution slows
somewhat between 1968 and 1977, as medium-scale integrated circuits continued to be the implementation technology, and the cost of packaging and connecting components continued to be controlled by the relatively wide bus structure.
During their evolution, the DEC 12 -bit computers have significantly changed in physical structure, as can be seen from the block diagrams in Fig. 9. The machines up through the PDP-8/L had a relatively centralized structure with three buses to interface to memory, program-controlled I/O devices, and Direct Memory Access devices. The Omnibus-8 machines bundled these connections together in a simpler physical structure. The CMOS-8 avoids the wide bus problem by moving the bus to lines on a printed circuit board. The number of interconnection signals on the bus is then reduced by roughly a factor of 4 to about 25 signals which can be brought into and out of the chips within the number of pins available.
Figures 4 and 7 illustrate the oscillating price/performance history of the design evolution summarized below:

1 While the PDP-5 was designed to keep price at a minimum, the PDP-8 had additions to improve the performance while not increasing price significantly over that of a slower speed design. The cost per word was modestly higher with the PDP-8 than with the PDP-5, but the PDP-8 had 6 times the performance of a PDP-5. Thus, the PDP-8 crosses three lines of constant price/performance in Fig. 7.
2 The PDP-8/S was an attempt to achieve a minimum price by using serial logic and a minimum price memory design. However, the performance of the PDP-8/S was low.

3 The market pressures created by PDP-8/S performance probably caused the return to the PDP-8 design, but in an integrated circuit implementation, the PDP-8/I.
4 The PDP-8/1 was relatively expensive, so the PDP-8/L was quickly introduced to reduce cost and bring the design into line with market needs and expectations.
5: The PDP-8/E was introduced as a high performance machine that would permit the building of systems larger than those possible with the PDP-8/L.
6 The PDP-8/Mwas a lower cost, smaller cabinet version of the PDP-8/E and was intended to meet the needs of the OEM market.

The design goal of machines subsequent to the PDP-8/M has been primarily one of price reduction. The PDP-8/A was introduced to further reduce cost from the level of the PDP-8/E and PDP-8/M, although some large system configurations are still built with PDP-8/E machines. The CMOS-8 chips represent a substantial cost reduction but also a substantial performance reduction. The CMOS-8 performance is one-third that of a PDP-8/A, so a stand-alone system using a CMOS-8 is less cost-effective than an PDP-8/A when the central processor is used as the only performance criterion. The main reason for using large-scale integration is the reduced cost and smaller package rather than performance. Obviously, the next step is increased performance or more memory, or both more performance and more memory on the same chip.```

