Guide to the TD4 4-bit DIY CPU
Guide to the TD4 4-bit DIY CPU
I bought a cute little 4-bit CPU kit from Aliexpress called the TD4. It has 2 registers, some LEDs, and 16 bytes of program ROM. Quite limited but still very cool and teaches a lot of principles of computer architecture.
The documentation, schematics, and pictures for this CPU are here: https://github.com/wuxx/TD4-4BIT-CPU. It's a little sparse though. I can imagine a student getting overwhelmed. So I thought it would be helpful to write some longer form notes.
Building
We took two sessions just soldering it up. This image plus the schematic were enough instructions such that it wasn't too hard. The directionality of the surface mount diodes gave us pause, but we used a diode tester functionality in our multimeter to figure it out. The tiny green line on the front of the diode is towards the bottom of the board. I believe the line was on the back of the diode was towards the top, but it is soldered down now so I can't check.
The thing that gave us the most trouble was soldering on the USB connector. I'd advise doing this before putting on the IC sockets, because it got in the way of the iron. The middle pins of the USB are unconnected, so you can just blast them if need be. The USB is only for power.
Which chip goes where can be seen by inspecting the schematic for part number and IC number. The notch aligns with the notch printed on the PCB board. It basically worked the first time, minus an intermittent power connection on the USB. Soldering all those diodes sucked.
How it Works
At a high level, here's how it works. The program ROM is a bank of 16 DIP switches. This is the entire size of programs available to you, which makes it tricky to do much.
Pins 5-8 of each DIP switch are an opcode, selecting between ADD, MOV, IN, OUT, JNC, JMP instructions. These bits go through some combinatorial circuit to control:
- What signal enters the addition unit (input operand of instruction)
- Which register latches in the output of the addition unit (the output operand of the instruction)
Bits 1-4 of the DIP switch are the immediate value for the instruction, which is always piped into the adder. The data selector can also select a hard wired zero signal (ground). This is used for example by the MOV A,Im instruction. This MOV instruction is implemented by adding Im to 0 and clocking the result into A.
The order of the immediate and command bits is a bit confusing. Sometimes it feels like they are reversed. The least significant bit is the low labelled one on the DIP switch. For example, the value 3 is entered in as 1100 on the first 4 bits of the DIP switch.
Address Decoder
The address decoder chip IC11 is a demultiplexer that takes in a 4-bit signal and drives one of its 16 output signals low that corresponds to this number. When driven low, current can flow through that DIP bank's diodes if the DIP switch is connected. Otherwise, pullup resistors R21-R26 keep the signal high. IC12 is essentially a buffering inverter circuit for the result of the ROM.
Command Decoder
The command bits are translated by discrete combinatorial logic chips in IC8 and IC10 into not_LOAD0,1,2,3 and SEL_A/SEL_B signals. Really, not_LOAD0,1,2,3 could be called not_LOADA,B,Out,PC since that is what they are hooked up to.
| instruction | bit7-bit4 | bit3-bit0 | C | SEL_B | SEL_A | #LOAD0/A | #LOAD1/B | #LOAD2/IN | #LOAD3/PC |
|---|---|---|---|---|---|---|---|---|---|
| ADD A, Im | 0000 | Im | X | L | L | L | H | H | H |
| MOV A, B | 0001 | 0000 | X | L | H | L | H | H | H |
| IN A | 0010 | 0000 | X | H | L | L | H | H | H |
| MOV A, Im | 0011 | Im | X | H | H | L | H | H | H |
| MOV B, A | 0100 | 0000 | X | L | L | H | L | H | H |
| ADD B, Im | 0101 | Im | X | L | H | H | L | H | H |
| IN B | 0110 | 0000 | X | H | L | H | L | H | H |
| MOV B, Im | 0111 | Im | X | H | H | H | L | H | H |
| OUT B | 1001 | 0000 | X | L | H | H | H | L | H |
| OUT Im | 1011 | Im | X | H | H | H | H | L | H |
| JNC Im | 1110 | Im | L | H | H | H | H | H | L |
| JNC Im | 1110 | Im | H | X | X | H | H | H | H |
| JMP Im | 1111 | Im | X | H | H | H | H | H | L |
Data Selector
SEL_A and SEL_B are wired into the data selector chips IC6 and IC7. They select from 4 possible options:
| A | B | In Operand |
|---|---|---|
| 0 | 0 | A |
| 0 | 1 | B |
| 1 | 0 | In |
| 1 | 1 | Zero |
The Zero signal is hardwired to ground.
Adder
The 4 bits selected output to the Adder chip IC8, which also receives the immediate bits 1-4 from the ROM bank selected. A side note: the immediate bits are read by more instructions than might be apparent from their mnemonic. For example, OUT B actually still adds in the immediate. This functionality could be useful but also confusing. By default, zero out the immediate bits if you don't want them.
Registers
The registers A, B, Out, PC are all counter chips. Only PC has the counting functionality enabled. On the edge of the clock signal, if the not_LOADx signal is low, data will latch in from pins A,B,C,D into the output pins QA,QB,QC,QD. JMPs are implemented by moving values into the PC register, which is fed back into the ROM address decoder.
Carry Flip Flop
An interesting piece is the Carry circuit which has a D flip flop for storing the overflow carry bit from the previous operation. This is needed to implement the JNC "jump if no carry" functionality. This operates by feeding into the command decoder circuit and disabling the LOADPC coming from the instruction if there is a carry.
Note that there appears to be an error on this version of the schematic for the command decoder. Can you find it?
Clock
Down in the bottom right is the clock circuit. The switches change from manual clock to auto clock and also change the clock speed. The clock is an RC controlled multivibrator. I haven't actually seriously analyzed it.
Simple Programs
We ran a couple experimental programs. First it's good to try something very simple. For reference again, this is the instruction set:
| instruction | bit7-bit4 | bit3-bit0 |
|---|---|---|
| ADD A, Im | 0000 | Im |
| MOV A, B | 0001 | 0000 |
| IN A | 0010 | 0000 |
| MOV A, Im | 0011 | Im |
| MOV B, A | 0100 | 0000 |
| ADD B, Im | 0101 | Im |
| IN B | 0110 | 0000 |
| MOV B, Im | 0111 | Im |
| OUT B | 1001 | 0000 |
| OUT Im | 1011 | Im |
| JNC Im | 1110 | Im |
| JMP Im | 1111 | Im |
Output to LED
Try getting this program to work:
OUT 0101
Looking up the opcode for OUT Im on the table it is 1011. This translated to the first DIP switch being in 1010 1101. The first piece is the constant immediate, and the second piece is the reverse of that opcode.
Simple Flashing
Now we can try a simple blinking program with a jump. We output two different constants and then JMP back to instruction 0:
OUT 0001
OUT 0010
JMP 0
Which translates to:
# Im | Com
#------|-----
OUT 0001 # 1000 | 1101
OUT 0010 # 0100 | 1101
JMP 0 # 0000 | 1111
Counting Up
Now we can try counting:
ADD B 1
OUT B
JMP 0
Counting Down
Counting down can be done once you realize that adding 15 is the same as subtracting 1 modulo 16. Requires realizing modular arithmetic:
ADD B 15
OUT B
JMP 0
Counting Up then Down
This is much trickier strangely. We screwed up a lot getting this simple program to work. In our defense we were tired. But remember to update your jump addresses, put things in the right order, manually assemble - it's a lot.
Mistakes we made:
- We fed will ice cream too early.
- We flipped the labels going into
JNCvsJMP. - We had the
OUTin places they couldn't be executed (at the bottom of the program). - We moved
OUTafter the add, destroying the carry forADD A 1. - We realized we should think of
ADD B 1, OUT Bas a single unit. A pseudoinstruction. - And in fact after examining the circuit, these two can be compressed to
OUT B 1becauseOUTactually takes immediates. We lucked out that we zeroed the immediates onMOVandOUT.
Remember the program counter was 0 indexed:
00: ADD B 1 1000 1010
01: OUT B xxxx 1001
02: MOV A B xxxx 1000 # test if at 15
03: ADD A 1 1000 0000 # adding 1 to 15 makes a carry
04: JNC 0 0000 0111 # jump to 0 if no overflow (B != 15)
05: ADD B 15 1111 1010 # Go down 1 by adding 15
06: OUT B xxxx 1001
07: MOV A B xxxx 1000 # test if at 0
08: ADD A 15 1111 0000 # only 0 does not overflow when add 15
09: JNC 0 0000 0111 # jump to ascending at 0 if not overflow (B = 0)
10: JMP 5 0110 1111 # default jump to 5
An alternative possibly buggy version. Using the idea JC ~ JNC;JMP and can we directly test the overflow on B without using A as scratch? Do we miss some numbers though? The ends of the loop are tricky.
00: ADD B 1
01: JNC 3
02: JMP 5
03: OUT B
04: JMP 0
05: MOV B 15
06: ADD B 15
07: JNC 0
08: OUT B
09: JMP 6
Python Assembler and Simulator
Here's a little Python simulator and assembler that Ben and I slammed out. Well, Ben mostly wrote. Maybe using regex like this is a bit goofy. Maybe it's nice? match statements rule. We should return to this and make it better, but it's a start. There is also an assembler in the TD4 repo but I haven't tried using it.
import re
from collections import namedtuple
from types import SimpleNamespace
instruction_regex = re.compile(r"(?P<cmd>ADD|OUT|MOV|JMP|JNC|IN) +(?P<arg1>A|B|\d{1,2})(?: +(?P<arg2>A|B|\d{1,2}))? *(?:#.*)?")
input = """\
ADD B 1
OUT B
MOV A B
ADD A 1
JNC 0
ADD B 15
OUT B
MOV A B
ADD A 15 # this is a comment
JNC 0
JMP 5\
"""
NULL_IM = "0000"
def im_to_str(im):
return f"{int(im):04b}"[::-1]
# Assembler
# We made an error on ADD B not updating something copied from ADD A
program = []
for line in input.split("\n"):
instruction = instruction_regex.fullmatch(line)
match instruction["cmd"], instruction["arg1"], instruction["arg2"]:
case "ADD", "A", im:
program.append(im_to_str(im)+"0000"[::-1])
case "MOV", "A", "B":
program.append(NULL_IM+"0001"[::-1])
case "IN", "A", None:
program.append(NULL_IM+"0010"[::-1])
case "MOV", "A", im:
program.append(im_to_str(im)+"0011"[::-1])
case "MOV", "B", "A":
program.append(NULL_IM+"0100"[::-1])
case "ADD", "B", im:
program.append(im_to_str(im)+"0101"[::-1])
case "IN", "B", None:
program.append(NULL_IM+"0110"[::-1])
case "MOV", "B", im:
program.append(im_to_str(im)+"1001"[::-1])
case "OUT", "B", None:
program.append(NULL_IM+"1001"[::-1])
case "OUT", im, None:
program.append(im_to_str(im)+"1011"[::-1])
case "JNC", im, None:
program.append(im_to_str(im)+"1110"[::-1])
case "JMP", im, None:
program.append(im_to_str(im)+"1111"[::-1])
case _:
raise Exception("Unrecognized command")
print(program)
# Simulator
State = namedtuple("State", "a, b, out, pc, c, inp, program")
def build_starting_state(program):
return State(0,0,0,0,0,0,program)
starting_state = build_starting_state("""\
ADD B 1
OUT B
JMP 0\
""".split("\n"))
def step(state):
a, b, out, pc, c, inp, program = state
instruction = instruction_regex.fullmatch(program[pc])
match instruction["cmd"], instruction["arg1"], instruction["arg2"]:
case "ADD", "A", im:
im_int = int(im)
assert 0 <= im_int < 16
result = a + im_int
return State(result % 16, b, out, (pc+1)%16, result >= 16, inp, program)
case "MOV", "A", "B":
return State(b, b, out, (pc+1)%16, False, inp, program)
case "IN", "A", None:
return State(inp, b, out, (pc+1)%16, False, inp, program)
case "MOV", "A", im:
im_int = int(im)
assert 0 <= im_int < 16
return State(im_int, b, out, (pc+1)%16, False, inp, program)
case "MOV", "B", "A":
return State(a, a, out, (pc+1)%16, False, inp, program)
case "ADD", "B", im:
im_int = int(im)
assert 0 <= im_int < 16
result = b + im_int
return State(a, result % 16, out, (pc+1)%16, result >= 16, inp, program)
case "IN", "B", None:
return State(a, inp, out, (pc+1)%16, False, inp, program)
case "MOV", "B", im:
im_int = int(im)
assert 0 <= im_int < 16
return State(a, im_int, out, (pc+1)%16, False, inp, program)
case "OUT", "B", None:
return State(a, b, b, (pc+1)%16, False, inp, program)
case "OUT", im, None:
im_int = int(im)
assert 0 <= im_int < 16
return State(a, b, im_int, (pc+1)%16, False, inp, program)
case "JNC", im, None:
im_int = int(im)
assert 0 <= im_int < 16
if not c:
return State(a, b, out, im_int, False, inp, program)
else:
return State(a, b, out, (pc+1)%16, False, inp, program)
case "JMP", im, None:
im_int = int(im)
assert 0 <= im_int < 16
return State(a, b, out, im_int, False, inp, program)
case _:
raise Exception("Unrecognized command")
def simulate(state):
while True:
yield state
state = step(state)
# Circuit simulation
def counter(inp, not_load):
a = (inp >> 0) & 1
b = (inp >> 1) & 1
c = (inp >> 2) & 1
d = (inp >> 3) & 1
return a, b, c, d
assert(counter(1,1,1,1,0,1) == 1,1,1,1)
assert(counter(1,1,1,1,1,1) == 0,0,0,0)
assert(counter(1,1,1,1,1,0) == 0,0,0,0)
def multiplexer(c0, c1, c2, c3, a, b):
match a, b:
case False, False:
return c0
case False, True:
return c1
case True, False:
return c2
case True, True:
return c3
def flip_flop(d, not_clr):
clr = not not_clr
if clr:
return False, True
return d, not d
def adder(a0, a1, a2, a3, b0, b1, b2, b3):
a_int = a0 + (a1 << 1) + (a2 << 2) + (a3 << 3)
b_int = b0 + (b1 << 1) + (b2 << 2) + (b3 << 3)
s_int = a_int + b_int
s0 = (s_int >> 0) & 1
s1 = (s_int >> 1) & 1
s2 = (s_int >> 2) & 1
s3 = (s_int >> 3) & 1
carry_out = (s_int >> 4) & 1
return s0, s1, s2, s3, carry_out
# We found a bug in one of the diagrams. It showed
def command_decoder(d4, d5, d6, d7, not_c):
sel_a = d4 or d7
sel_b = d5
not_load0 = d6 or d7
not_load1 = (not d6) or d7
not_load2 = not ((not d6) and d7)
not_load3 = not ((not_c or d4) and d6 and d7)
return not_load0, not_load1, not_load2, not_load3, sel_a, sel_b
def rom():
...
def circuit_step(state):
a, b, out, pc, c, inp, program = state
adder0, adder1, adder2, adder3 = adder()
a_val = counter(adder0, adder1, adder2, adder3, not_load0)
b_val = counter(adder0, adder1, adder2, adder3, not_load0)
out_val = counter(adder0, adder1, adder2, adder3, not_load0)
pc_val = counter(adder0, adder1, adder2, adder3, not_load0)
new_state = State(a_out, b_out, out_out, ...)
Use lark?
prog = """
ADD B 1
"""
{
"ADD": "1000",
"OUT": "1001",
"MOV": "1010",
"JNC": "0111",
"JMP": "1111"
}
def assemble(x):
op = x[0]
match x:
case "MOV":
case "ADD":
case "OUT":
case "JNC":
case "JMP":
case _:
raise ValueError(f"Unknown op {op}")
grammar = """
"ADD"
"MOV"
"JNC"
"OUT"
"JMP"
"JNC"
"@" name -> label
"""
Comments
No comments yet. Start the discussion.