Introduction#
I have recently been studying the CS61C course at Berkeley, where Project 2 involves implementing handwritten digit recognition using RISC-V.
It sounds complicated, but it's actually manageable. The main challenges are efficiently utilizing registers, writing and calling functions in assembly language, manually allocating memory from the stack, and debugging assembly programs using Venus.
In the end, you just need to connect the individual functions together to form an artificial neural network (ANN) that can classify handwritten digits.
RISC-V Calling Conventions#
The most important part of programming in RISC-V is arguably the Calling Convention. It lets each function use registers freely without corrupting values that other functions depend on. Imagine that function A wants to use the s1 register while s1 still holds an important value belonging to function B. If A overwrites it, B will misbehave when it reads s1 later. If we had to check by hand whether every register we want is still needed by some other function, programming would turn into a disaster. This is where the Calling Convention makes our lives easier!
When a function is called, we refer to the calling function as the Caller and the called function as the Callee. They have different responsibilities for saving registers.
Saved Registers (Callee Saved)#
- s0-s11 (saved registers)
- sp (stack pointer)
The Caller may keep values it still needs in these registers, so it expects them to be unchanged when the call returns.
This places requirements on any Callee that wants to use them. It needs to:
- Allocate space on the stack (decrease the sp register)
- Save the s registers it will use on the stack
- Freely use the s registers
- Restore the original s register values from the stack
- Release the space on the stack (increase the sp register)
In this way, the Callee guarantees to the Caller that the values of the s and sp registers remain unchanged before and after calling it, while allowing itself to use these registers freely.
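The five steps above can be sketched as follows. This is a minimal illustration, assuming RV32 (4-byte registers, as simulated by Venus); `square_sum` is a hypothetical function that is not part of the project.

```asm
# Hypothetical leaf function: a0 = a0*a0 + a1*a1, using s0 and s1 as scratch
square_sum:
    # 1. Allocate stack space for two registers
    addi sp, sp, -8
    # 2. Save the s registers we are about to overwrite
    sw s0, 0(sp)
    sw s1, 4(sp)
    # 3. Use the s registers freely
    mul s0, a0, a0
    mul s1, a1, a1
    add a0, s0, s1
    # 4. Restore the original s register values
    lw s0, 0(sp)
    lw s1, 4(sp)
    # 5. Release the stack space
    addi sp, sp, 8
    ret
```

The standard RISC-V ABI additionally asks for sp to stay 16-byte aligned; the sketch ignores that for brevity.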
Volatile Registers (Caller Saved)#
- t0-t6 (temporary registers)
- a0-a7 (arguments & return values)
- ra (return address)
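The Callee has no obligation to preserve these registers and may overwrite them at will. Therefore, if the Caller still needs one of them after the call, it must save the value before calling the function and restore it afterward. In particular, ra is overwritten by every call, so a function that calls other functions has to save its own ra first.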
Structure of a Function#
So according to our Calling Convention, a function is structured as follows:
- Prologue: as the Callee, save the s registers it will use (and ra, if it calls other functions) on the stack.
- Body: do the work. Around each call it makes, as the Caller, save any t or a registers it still needs and restore them after the call.
- Epilogue: restore the s registers, release the stack space, and return to the call site with ret.
It's a simple idea: all Callees are responsible to the Caller, so when a Callee acts as a Caller, it doesn't have to worry about its important registers being tampered with.
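Putting both roles together, here is a minimal sketch of a function that is a Callee and a Caller at the same time. It assumes RV32, and both `sum_helper` and `helper` are hypothetical names: `helper` stands for any function that takes one argument in a0, returns its result in a0, and follows the convention.

```asm
# Hypothetical non-leaf function: returns helper(a0) + helper(a1)
sum_helper:
    # Prologue: save ra (the calls below overwrite it) and the s register we use
    addi sp, sp, -12
    sw ra, 0(sp)
    sw s0, 4(sp)

    mv s0, a1            # s0 survives calls, so park the second argument there
    jal ra, helper       # a0 = helper(first argument)

    # Caller role: the next call will overwrite a0, so save the first result
    sw a0, 8(sp)
    mv a0, s0
    jal ra, helper       # a0 = helper(second argument)
    lw t0, 8(sp)         # restore the first result
    add a0, a0, t0       # a0 = helper(first) + helper(second)

    # Epilogue: restore what the prologue saved, release the stack, return
    lw s0, 4(sp)
    lw ra, 0(sp)
    addi sp, sp, 12
    ret
```

Keeping a value in an s register and spilling a value to the stack are two ways to carry data across a call; the sketch uses one of each only to show both.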
Neural Network#
We are going to write a neural network in RISC-V to recognize digits. In simple terms, a neural network aims to approximate a nonlinear function that maps inputs to outputs. In this project, we already have pre-trained matrices $m_0$ and $m_1$, so we only need to use them for inference. Our input comes from the MNIST dataset, which contains 60,000 training images and 10,000 test images of handwritten digits, each 28x28 pixels.
We need to write the following functions:
- relu: activation function $f(x)=\max(0,x)$
- argmax: returns the index of the maximum element in a vector
- dot: vector dot product
- matmul: matrix multiplication
- read_matrix: read matrix file
- write_matrix: write matrix file
- classify: connect the above functions to link the layers
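Read together, these functions amount to a single forward pass. As a hedged sketch, assuming the usual two-layer layout and writing $x$ for the flattened input image as a column vector (the symbol $x$ and the exact layer order are my assumptions, not spelled out above):

$$
\text{digit} = \operatorname{argmax}\bigl(m_1 \cdot \operatorname{ReLU}(m_0 \cdot x)\bigr)
$$

Here matmul computes each matrix product (and is itself built from dot), relu is applied elementwise to the hidden layer, and argmax picks the index of the largest score.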
At the same time, we need to write test files to verify the correctness of the program. Let's take the vector dot product as an example.
dot.s#
Function: Computes the dot product of two vectors.
Input:
- a0 (int*) Pointer to the first element of v0
- a1 (int*) Pointer to the first element of v1
- a2 (int) Length of the vector
- a3 (int) Stride of v0
- a4 (int) Stride of v1
Return value: a0 (int) Result of the dot product.
Code:
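dot:
    # Exit with an error code if the length or either stride is less than 1
    bge x0, a2, exit_5
    bge x0, a3, exit_6
    bge x0, a4, exit_6
    li t0, 0            # loop counter
    li t4, 0            # dot product accumulator
    # Multiply each stride by 4 to convert it to a byte offset
    slli a3, a3, 2
    slli a4, a4, 2
loop_start:
    beq t0, a2, loop_end
    lw t1, 0(a0)        # current element of v0
    lw t2, 0(a1)        # current element of v1
    mul t3, t1, t2
    add t4, t4, t3
    add a0, a0, a3      # advance v0 by its stride
    add a1, a1, a4      # advance v1 by its stride
    addi t0, t0, 1
    j loop_start
loop_end:
    mv a0, t4
    ret
# Error exits. Assumption: exit2 is a helper from the course's utils.s
# that terminates the program with the exit code held in a1.
exit_5:
    li a1, 5
    jal exit2
exit_6:
    li a1, 6
    jal exit2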
This dot product function does not call any other functions, so it acts solely as a Callee. Because it uses only t and a registers, it never has to save s registers to the stack, which removes the memory accesses a prologue and epilogue would cost.
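To illustrate what the stride arguments do, here is a hypothetical call that walks the second vector two elements at a time. The labels `stride_v0`, `stride_v1` and `stride_test` are made up for this sketch, and the fragment is meant to be dropped into a test's main rather than run on its own.

```asm
.data
stride_v0: .word 1 2 3
stride_v1: .word 1 2 3 4 5      # only indices 0, 2 and 4 are read

.text
stride_test:                    # hypothetical label, not from the project
    la a0, stride_v0
    la a1, stride_v1
    li a2, 3                    # use 3 elements from each vector
    li a3, 1                    # v0 stride 1: reads 1, 2, 3
    li a4, 2                    # v1 stride 2: reads 1, 3, 5
    jal ra, dot                 # a0 = 1*1 + 2*3 + 3*5 = 22
```

This is what lets matmul reuse dot: a row of a row-major matrix is read with stride 1, while a column is read with a stride equal to the number of columns.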
Test code:
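# Assumption: dot.s and the course's utils.s sit next to this test file;
# adjust the paths to your own layout
.import dot.s
.import utils.s
.globl main
# Set vector values for testing
.data
vector0: .word 1 2 3 4 5 6 7 8 9
vector1: .word 1 2 3 4 5 6 7 8 9
.text
# main function for testing
main:
    # Load vector addresses into registers
    la s0, vector0
    la s1, vector1
    # Set vector attributes: length 9, both strides 1
    addi a2, x0, 9
    addi a3, x0, 1
    addi a4, x0, 1
    # Call dot function
    mv a0, s0
    mv a1, s1
    jal ra, dot
    # Print integer result (print_int expects its argument in a1)
    mv a1, a0
    jal ra, print_int
    # Print newline
    li a1, '\n'
    jal ra, print_char
    # Exit
    jal exit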
Test result: the program should print 285, since $1^2 + 2^2 + \dots + 9^2 = 285$.
Final Effect#
By implementing each function one by one and finally putting them all together, we have created a neural network capable of classifying handwritten digits.
When we input this image:
After running the program, we obtained our result:
Summary#
This project, handwritten digit recognition in RISC-V, has improved my ability to write assembly language and also honed my skills in writing test code. It is particularly fulfilling that simple functions like the vector dot product and matrix multiplication are enough to build a working neural network.
Lastly, I want to share my thoughts on Berkeley's grading system. Exams account for 40% of the grade and are spread over three sittings, which assesses learning progress at different stages instead of rewarding cramming in the final week. Four interesting projects (handwritten digit recognition in assembly, designing your own CPU, etc.) account for 40%, while homework and attendance account for 20%. It's diverse and interesting, which is great.
Live in the moment!