CedricXu

CS student / photography enthusiast

[RISCV] Handwritten Digit Recognition

Introduction#

qwii3

I am currently studying CS61C at Berkeley, and Project 2 involves implementing handwritten digit recognition using RISC-V.

It sounds complicated, but it's actually not too bad. The main challenges are using registers efficiently, writing and calling functions in assembly language, manually allocating space on the stack, and debugging assembly programs in Venus.

In the end, all we need to do is connect everything together to create an artificial neural network (ANN) that can classify handwritten digits.

RISC-V Calling Conventions#

t1xa2

When it comes to programming in RISC-V, the most important part is the Calling Convention. It lets each function use registers freely without corrupting another function's data. Suppose function A wants to use register s1 while s1 still holds a value that function B needs. If A overwrites it, B will misbehave when it later reads s1. If we had to check by hand whether every register we wanted was still needed by some other function, programming would be a disaster. This is where the Calling Convention comes in to make our lives better!

During a function call, we refer to the calling function as the Caller and the called function as the Callee. They have different responsibilities when it comes to saving registers.

Saved Registers (Callee Saved)#

sb11b

  • s0-s11 (saved registers)
  • sp (stack pointer)

These registers matter to the Caller because they hold values it still needs, so the Caller expects them to be unchanged when the called function returns.

This places requirements on the Callee, which are as follows:

  1. Allocate space on the stack (decrement sp register)
  2. Save the s registers it needs on the stack
  3. Use the s registers freely
  4. Restore the original values of the s registers from the stack
  5. Free the space on the stack (increment sp register)

This way, the Callee guarantees to the Caller that the values of the s and sp registers will remain unchanged before and after the function call, while allowing itself to freely use these registers.
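The five steps above can be sketched as follows. This is an illustration rather than code from the project: the function name sum_squares is hypothetical, 4-byte words are assumed (RV32, as in Venus), and s0 and s1 are used only to show the save/restore pattern. A leaf function this small could simply use t registers instead.

```asm
# Hypothetical example: sum_squares(n) returns 1*1 + 2*2 + ... + n*n.
# a0 = n on entry, a0 = result on return.
sum_squares:
    # 1. Allocate 8 bytes on the stack (two 4-byte words)
    addi sp, sp, -8
    # 2. Save the s registers this function will overwrite
    sw s0, 0(sp)
    sw s1, 4(sp)

    # 3. Use s0 and s1 freely
    li s0, 0              # s0 = running total
    li s1, 1              # s1 = current i
sum_loop:
    blt a0, s1, sum_done  # stop once i > n
    mul t0, s1, s1        # t0 = i * i
    add s0, s0, t0
    addi s1, s1, 1
    j sum_loop
sum_done:
    mv a0, s0             # the return value goes in a0

    # 4. Restore the original s register values
    lw s0, 0(sp)
    lw s1, 4(sp)
    # 5. Free the stack space
    addi sp, sp, 8
    ret
```

Whatever the Caller had in s0 and s1 before the call is back in place when ret executes, and sp ends at the value it started with.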

Volatile Registers (Caller Saved)#

g71ox

  • t0-t6 (temporary registers)
  • a0-a7 (arguments & return values)
  • ra (return address)

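The Callee is not obligated to preserve these registers; it only guarantees the s registers and sp. If the Caller needs any of their values after the call, it has to save them itself before the call and restore them afterward. In particular, any function that makes a call must save ra first, because the call instruction itself overwrites it.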
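A minimal sketch of the Caller's side, assuming 4-byte words. Both function names (times_two and add_doubled) are hypothetical and exist only to show where the saves and restores go.

```asm
# Hypothetical callee: returns a0 * 2. It overwrites t0, which the
# calling convention allows because t0 is a volatile register.
times_two:
    li t0, 2
    mul a0, a0, t0
    ret

# Hypothetical caller: returns a0 + times_two(a1).
add_doubled:
    mv t0, a0             # t0 holds a value we still need after the call

    # Save the volatile registers we care about before the call
    addi sp, sp, -8
    sw ra, 0(sp)          # the jal below overwrites ra
    sw t0, 4(sp)          # times_two is free to overwrite t0

    mv a0, a1
    jal ra, times_two     # a0 = times_two(a1)

    # Restore them after the call
    lw ra, 0(sp)
    lw t0, 4(sp)
    addi sp, sp, 8

    add a0, t0, a0        # a0 = original a0 + times_two(a1)
    ret
```

Without the save and restore of t0, add_doubled would add 2 (the value times_two left in t0) instead of its original first argument.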

Structure of a Function#

According to our Calling Convention, the structure of a function should be as follows:

xnes0

First, as a Callee, the function saves the s registers it will use. If it calls other functions, then as a Caller it also saves whichever volatile registers it still needs before each call and restores them afterward. At the end, it restores the s registers and jumps back to the location it was called from.

The idea is simple: all Callees are responsible for Callers, so when a Callee becomes a Caller, it doesn't have to worry about its important registers being tampered with.
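As a sketch of that idea, the hypothetical function below is both a Callee and a Caller. Instead of saving t registers around every call, it parks the values it needs in s registers, which each function it calls promises to preserve. The names times_two and sum_doubled are made up for illustration, and 4-byte words are assumed.

```asm
# Hypothetical callee: overwrites t0 and a0, but no s registers.
times_two:
    li t0, 2
    mul a0, a0, t0
    ret

# Hypothetical non-leaf function: returns times_two(a0) + times_two(a1).
sum_doubled:
    # Prologue: as a Callee, save the s registers we will use, plus ra,
    # because the jal instructions below overwrite it.
    addi sp, sp, -12
    sw ra, 0(sp)
    sw s0, 4(sp)
    sw s1, 8(sp)

    mv s0, a1             # s0 = second argument, survives the calls below

    jal ra, times_two     # a0 = times_two(first argument)
    mv s1, a0             # s1 = first result, survives the next call

    mv a0, s0
    jal ra, times_two     # a0 = times_two(second argument)
    add a0, s1, a0        # a0 = sum of the two results

    # Epilogue: restore the saved registers, free the stack, return.
    lw ra, 0(sp)
    lw s0, 4(sp)
    lw s1, 8(sp)
    addi sp, sp, 12
    ret
```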

Neural Networks#

5vldo

We need to write a neural network in RISC-V to recognize digits. Simply put, a neural network approximates a non-linear function that maps inputs to outputs. In this project, we already have pre-trained matrices m0 and m1, so we only need to use them for inference. Our input comes from the MNIST dataset, whose training set consists of 60,000 28x28-pixel images of the handwritten digits 0-9.
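Inference with the two matrices can be written compactly. The formula below is a sketch that assumes the usual two-layer arrangement: multiply by m0, apply ReLU, multiply by m1, then take the argmax. Here x denotes a flattened input image (28 × 28 = 784 pixel values), and the names h, s, and ŷ are introduced only for illustration.

```latex
h = \mathrm{ReLU}(m_0\, x), \qquad
s = m_1\, h, \qquad
\hat{y} = \operatorname*{argmax}_{i}\; s_i
```

Under that assumption, s holds one score per digit and ŷ is the predicted digit.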

We need to write the following functions:

  • relu: activation function f(x) = max(0, x)
  • argmax: returns the index of the maximum element in a vector
  • dot: dot product of two vectors
  • matmul: matrix multiplication
  • read_matrix: read a matrix from a file
  • write_matrix: write a matrix to a file
  • classify: calls the above functions to connect the layers
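Two of the simpler functions in this list can be sketched as leaf functions that use only t and a registers. The argument layout here (a0 = pointer to an int array, a1 = number of elements) is an assumption made for illustration, as is the choice to apply relu in place and to return the first index on ties in argmax. Neither sketch validates its arguments.

```asm
# Sketch of an in-place ReLU over an int array.
# Assumed arguments: a0 = pointer to the array, a1 = number of elements.
relu:
    li t0, 0                  # t0 = loop index
relu_loop:
    bge t0, a1, relu_done     # stop after a1 elements
    lw t1, 0(a0)              # t1 = current element
    bge t1, x0, relu_next     # non-negative values are left unchanged
    sw x0, 0(a0)              # negative values are replaced with 0
relu_next:
    addi a0, a0, 4            # advance to the next 4-byte word
    addi t0, t0, 1
    j relu_loop
relu_done:
    ret

# Sketch of argmax over an int array with at least one element.
# Assumed arguments: a0 = pointer to the array, a1 = number of elements.
# Returns in a0 the index of the largest element (first one on ties).
argmax:
    lw t0, 0(a0)              # t0 = largest value seen so far
    li t1, 0                  # t1 = index of that value
    li t2, 1                  # t2 = loop index, starting at element 1
argmax_loop:
    bge t2, a1, argmax_done
    addi a0, a0, 4
    lw t3, 0(a0)              # t3 = current element
    bge t0, t3, argmax_next   # keep the earlier index unless strictly larger
    mv t0, t3
    mv t1, t2
argmax_next:
    addi t2, t2, 1
    j argmax_loop
argmax_done:
    mv a0, t1
    ret
```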

We also need to write test files to test the correctness of the program. Let's take dot product as an example.

dot.s#

Function: Calculates the dot product of two vectors

Inputs:

  • a0 (int*) Pointer to the first element of vector v0
  • a1 (int*) Pointer to the first element of vector v1
  • a2 (int) Length of the vectors
  • a3 (int) Stride of v0
  • a4 (int) Stride of v1

Returns: a0 (int) The dot product result

Code:

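dot:
    # Validate arguments: the length and both strides must be at least 1
    bge x0, a2, exit_5
    bge x0, a3, exit_6
    bge x0, a4, exit_6

    li t0, 0              # loop counter
    li t4, 0              # dot product accumulator

    # Multiply each stride by 4 to convert it to a byte offset
    slli a3, a3, 2
    slli a4, a4, 2

loop_start:
    beq t0, a2, loop_end

    lw t1, 0(a0)          # t1 = current element of v0
    lw t2, 0(a1)          # t2 = current element of v1

    mul t3, t1, t2
    add t4, t4, t3        # accumulate the product

    add a0, a0, a3        # advance each pointer by its stride
    add a1, a1, a4
    addi t0, t0, 1

    j loop_start

loop_end:
    mv a0, t4             # return the result in a0

    ret

# Error exits. These labels are referenced above but not shown in the
# original listing; this sketch assumes the course's utils.s provides
# exit2, which terminates the program with the exit code in a1.
exit_5:
    li a1, 5
    jal exit2

exit_6:
    li a1, 6
    jal exit2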

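This dot product function does not call any other functions, so it only acts as a Callee. Because it uses only argument (a) and temporary (t) registers, it never modifies an s register or overwrites ra, so there is nothing to save on the stack. Skipping those loads and stores also makes the function a little faster.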

Test code:

# Assumes dot (above) and the print_int, print_char, and exit helpers
# from the course's utils.s are available to this file.

# Set vector values for testing
.data
vector0: .word 1 2 3 4 5 6 7 8 9
vector1: .word 1 2 3 4 5 6 7 8 9

.text
# main function for testing
main:
    # Load vector addresses into registers
    la s0, vector0
    la s1, vector1

    # Set vector attributes: length 9, stride 1 for both vectors
    addi a2, x0, 9
    addi a3, x0, 1
    addi a4, x0, 1

    # Call dot function
    mv a0, s0
    mv a1, s1
    jal ra, dot

    # Print integer result (expected: 1^2 + 2^2 + ... + 9^2 = 285)
    mv a1, a0
    jal ra, print_int

    # Print newline
    li a1, '\n'
    jal ra, print_char

    # Exit
    jal exit

Test result:

ndi5v
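The test above uses a stride of 1 for both vectors. As a further sketch, the file below exercises the stride arguments by taking the dot product of a row vector with the first column of a 3x3 matrix stored row by row, which is the access pattern a matrix multiplication needs. The labels row and matrix are made up for this example. A condensed copy of dot (without the argument checks) is repeated so the file assembles on its own, and printing and exiting use Venus environment calls directly, on the assumption that Venus takes the call number in a0 and its argument in a1.

```asm
.data
row:    .word 1 2 3              # a 1x3 row vector
matrix: .word 1 2 3 4 5 6 7 8 9  # a 3x3 matrix stored row by row

.text
main:
    la a0, row
    la a1, matrix         # column 0 starts at the matrix's first element
    li a2, 3              # three elements in each vector
    li a3, 1              # stride 1: consecutive elements of the row
    li a4, 3              # stride 3: every third element walks down a column
    jal ra, dot           # a0 = 1*1 + 2*4 + 3*7

    mv a1, a0             # Venus environment calls take their argument in a1
    li a0, 1              # environment call 1: print integer
    ecall
    li a0, 10             # environment call 10: exit
    ecall

# Condensed copy of dot from above, without the argument checks.
dot:
    li t0, 0              # loop counter
    li t4, 0              # accumulator
    slli a3, a3, 2        # convert strides to byte offsets
    slli a4, a4, 2
dot_loop:
    beq t0, a2, dot_done
    lw t1, 0(a0)
    lw t2, 0(a1)
    mul t3, t1, t2
    add t4, t4, t3
    add a0, a0, a3
    add a1, a1, a4
    addi t0, t0, 1
    j dot_loop
dot_done:
    mv a0, t4
    ret
```

With these values the column elements read are 1, 4, and 7, so the program should print 30.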

Final Result#

By implementing one function after another and combining them all together, we have created a neural network that can classify handwritten digits.

When we input this image:

j8mfi

and run the program, we get the result:

ltdb0

Conclusion#

This project of implementing handwritten digit recognition using RISC-V has improved my ability to write assembly language and has also honed my skills in writing test code. It's particularly interesting that after implementing simple functions like dot product and matrix multiplication, we can create a powerful neural network. It's very fulfilling.

Lastly, I would like to mention the grading system at Berkeley. Exams account for 40% of the grade and are divided into three parts, which checks learning progress at different stages instead of encouraging cramming right before the final exam. The four projects (handwritten digit recognition in assembly, designing our own CPU, etc.) account for another 40%, and the remaining 20% is for homework, labs, and attendance. It's diverse and interesting, which is great.

Live in the present!

9irzv
