ECE 4750 Section 3: RTL Testing with Python
- Author: Christopher Batten
- Date: September 9, 2022
Table of Contents
- Overview of Testing Strategies
- Ad-Hoc vs. Assertion Testing
- Testing with pytest
- Testing with Test Vectors
- Testing with Stream Sources and Sinks
- Using Functional-Level Models
This discussion section serves as a gentle introduction to the basics of RTL testing using Python. We will start by discussing several different testing strategies including:
- Ad-Hoc vs. Assertion Testing
- Directed vs. Random Testing
- Black-Box vs. White-Box Testing
- Value vs. Delay Testing
- Unit vs. Integration Testing
- Reference Models
After this discussion you should log into the ecelinux servers using the remote access option of your choice and then source the setup script.
% source setup-ece4750.sh
% mkdir -p $HOME/ece4750
% cd $HOME/ece4750
% git clone git@github.com:cornell-ece4750/ece4750-sec03-pymtl sec03
% cd sec03
% TOPDIR=$PWD
% mkdir $TOPDIR/build
Ad-Hoc vs. Assertion Testing
We will start by testing the simple single-cycle multiplier we developed in last week’s discussion section, which does not include any kind of flow control (i.e., no valid/ready signals).
As a reminder, here is the interface for our single-cycle multiplier.
module imul_IntMulScycleV1
(
  input  logic        clk,
  input  logic        reset,
  input  logic [31:0] in0,
  input  logic [31:0] in1,
  output logic [31:0] out
);
Our single-cycle multiplier takes two 32-bit input values and produces a 32-bit output value. Let’s use the same ad-hoc test we used last week to test this multiplier. Start by reviewing the Python test bench located in imul/imul-v1-adhoc-test.py:
from sys import argv

from pymtl3 import *
from pymtl3.passes.backends.verilog import *

from IntMulScycleV1 import IntMulScycleV1

# Get list of input values from command line

in0_values = [ int(x,0) for x in argv[1::2] ]
in1_values = [ int(x,0) for x in argv[2::2] ]

# Create and elaborate the model

model = IntMulScycleV1()
model.elaborate()

# Apply the Verilog import passes and the default pass group

model.apply( VerilogPlaceholderPass() )
model = VerilogTranslationImportPass()( model )
model.apply( DefaultPassGroup(linetrace=True,textwave=True,vcdwave="imul-v1-adhoc-test") )

# Reset simulator

model.sim_reset()

# Apply input values and display output values

for in0_value,in1_value in zip(in0_values,in1_values):

  # Write input value to input port
  model.in0 @= in0_value
  model.in1 @= in1_value
  model.sim_eval_combinational()

  # Tick simulator one cycle
  model.sim_tick()

# Tick simulator three more cycles and print text wave

model.sim_tick()
model.sim_tick()
model.sim_tick()
model.print_textwave()
The test bench gets some input values from the command line, instantiates the design under test, applies some PyMTL3 passes, and then runs a simulation by setting the input values and displaying the output value. Let’s run this ad-hoc test as follows:
% cd $TOPDIR/build
% python ../imul/imul-v1-adhoc-test.py 2 2 3 3
Experiment with different input values. Try large values that result in overflow:
% cd $TOPDIR/build
% python ../imul/imul-v1-adhoc-test.py 70000 70000
In ad-hoc testing, we try different inputs and inspect the output manually to see if the design under test produces the correct result. This “verification by inspection” is error prone and not reproducible. If you later make a change to your design, you will have to take another look at the debug output and/or waveforms to ensure that your design still works. If another member of your group wants to understand your design and verify that it is working, they will also need to inspect the debug output and/or waveforms. Ad-hoc test scripts are usually verbose, which makes them error prone and cumbersome to write. They are also difficult for others to read and understand since, by definition, they are ad-hoc. Finally, ad-hoc testing does not use any kind of standard test output and does not provide support for controlling the amount of test output. While ad-hoc testing might be feasible for very simple designs, it is not a scalable approach for the more complicated designs we will tackle in this course.
The first step to improving our testing strategy is to use assertion testing where we explicitly write assertions that must be true for the test to pass. This way we have made checking for the correct results systematic and automatic. Take a look at the simple Python test bench for assertion testing located in imul/imul-v1-assertion-test.py:
def test_basic():

  ... create and elaborate model ...
  ... apply Verilog import passes and the default pass group ...

  model.sim_reset()

  model.in0 @= 2
  model.in1 @= 2
  model.sim_tick()
  assert model.out == 0

def test_overflow():

  ... create and elaborate model ...
  ... apply Verilog import passes and the default pass group ...

  model.sim_reset()

  model.in0 @= 0x80000001
  model.in1 @= 2
  model.sim_tick()
  assert model.out == 0

test_basic()
test_overflow()
We have structured our assertion testing into a set of test cases. Each test case is implemented as a Python function whose name starts with the test_ prefix. Each test case creates and elaborates the design under test, applies the appropriate passes, and resets the model. The test case then sets the inputs to the model, ticks the simulator, and asserts that the output of the model matches the expected value. We explicitly call both test case functions at the end of the script. Let’s run this assertion test:
% cd $TOPDIR/build
% python ../imul/imul-v1-assertion-test.py
The first test case will fail since we have not specified the correct expected value. Modify the assertion test script to have the correct expected values for both test cases and then rerun the assertion test.
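For reference, the expected values can be worked out directly, since the multiplier keeps only the low 32 bits of the product; the following is just a sketch of the arithmetic, not the course’s reference solution.
# Working out the expected 32-bit results for the two test cases above
print( (2 * 2)          & 0xffffffff )   # 4
print( (0x80000001 * 2) & 0xffffffff )   # 2, since the full product is 0x1_0000_0002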
Testing with pytest
In this course, we will be using the powerful pytest unit testing framework. The pytest framework is popular in the Python programming community and has many features that make it well-suited for test-driven hardware development, including: no-boilerplate testing with the standard assert statement; automatic test discovery; helpful traceback and failing-assertion reporting; standard output capture; sophisticated parameterized testing; test marking for skipping certain tests; distributed testing; and many third-party plugins. More information is available on the pytest website: https://docs.pytest.org
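For example, a pytest test case can be as simple as a plain function whose name starts with the test_ prefix and that uses a bare assert statement. Here is a generic, standalone sketch; the file name example_test.py is hypothetical and is not part of the course repository.
# example_test.py

def test_add_passes():
  assert 1 + 1 == 2   # passes silently

def test_add_fails():
  assert 1 + 1 == 3   # pytest reports this failing assertion with a detailed traceback
Running pytest example_test.py would automatically discover and run both functions.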
It is pretty easy to adapt the assertion test script we already have to make it suitable for use with pytest. Usually we like to keep all of our tests in a dedicated test subdirectory. Take a look at the test script imul/test/IntMulScycleV1a_test.py. It looks exactly like our previous assertion test script with two changes (a sketch of the adapted script follows this list):
- we pass in cmdline_opts to each test case function
- we do not need to explicitly call the test case functions at the bottom of the script
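Here is a sketch of what that adaptation might look like; the actual file may differ in its details, and the model setup is elided just as in the earlier excerpt.
def test_basic( cmdline_opts ):

  ... create and elaborate model, applying passes based on cmdline_opts ...

  model.sim_reset()

  model.in0 @= 2
  model.in1 @= 2
  model.sim_tick()
  assert model.out == 0

def test_overflow( cmdline_opts ):

  ... create and elaborate model, applying passes based on cmdline_opts ...

  model.sim_reset()

  model.in0 @= 0x80000001
  model.in1 @= 2
  model.sim_tick()
  assert model.out == 0

# note: no explicit calls to test_basic()/test_overflow() here; pytest discovers them automatically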
Let’s use pytest to run this test:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV1a_test.py
You can see that pytest has automatically discovered the two test cases; pytest will assume any function that starts with the test_ prefix is a test case. The test cases will fail since we have not specified the correct expected values. We can use the -v command line option to get more verbose output:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV1a_test.py -v
We can then “zoom in” on the first test case using the -k command line option to run just that first test case:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV1a_test.py -v -k test_basic
Then we can use the -s option to display the line trace and the --dump-vcd option to dump the VCD file:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV1a_test.py -v -k test_basic -s --dump-vcd
Modify the test script to have the correct expected values for both test cases and then rerun the test using pytest.
Testing with Test Vectors
Our testing so far requires quite a bit of boilerplate code. Every test case must construct a model, elaborate that model, apply PyMTL3 passes, and reset the simulator. For every cycle, the test case must set the inputs, tick the simulator, and check the outputs. We can use the power of Python to encapsulate much of this common functionality into a library to simplify testing. PyMTL3 provides a run_test_vector_sim function that makes it easy to write this kind of cycle-by-cycle test where we want to explicitly set inputs and check outputs every cycle. Take a look at the test script imul/test/IntMulScycleV1b_test.py.
def test_basic( cmdline_opts ):
  run_test_vector_sim( IntMulScycleV1(), [
    ('in0 in1 out*'),
    [ 2,   2,  '?' ],
    [ 3,   2,   0  ],
    [ 3,   3,   0  ],
    [ 0,   0,   0  ],
  ], cmdline_opts )

def test_overflow( cmdline_opts ):
  run_test_vector_sim( IntMulScycleV1(), [
    ('in0 in1 out*'),
    [ 0x80000001, 2, '?' ],
    [ 0xc0000002, 4,  0  ],
    [ 0x00000000, 0,  0  ],
  ], cmdline_opts )
The run_test_vector_sim function takes three arguments: the design under test, the test vector table, and the command line options. The first row in the test vector table specifies the names of the input and output ports; output ports are indicated by adding a * suffix. The remaining rows in the test vector table specify the inputs and the correct outputs for every cycle. We can indicate that we don’t care about an output on a given cycle with ?. Notice how Python can make things very compact while remaining very readable. Let’s use pytest to run this test:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV1b_test.py -s -v
The test cases will fail since we have not specified the correct expected values. Modify the test script to have the correct expected values for both test cases and then rerun the test using pytest. Use the -v and -s options and notice that the line trace roughly corresponds to the test vector table.
But wait, there is more! We can use the pytest.mark.parametrize decorator to parameterize a single test case over many different sets of parameters. In other words, instead of explicitly defining two test case functions, we can generate two test case functions from a single specification. Take a look at the test script imul/test/IntMulScycleV1c_test.py.
basic_test_vectors = [
  ('in0 in1 out*'),
  [ 2,   2,  '?' ],
  [ 3,   2,   0  ],
  [ 3,   3,   0  ],
  [ 0,   0,   0  ],
]

overflow_test_vectors = [
  ('in0 in1 out*'),
  [ 0x80000001, 2, '?' ],
  [ 0xc0000002, 4,  0  ],
  [ 0x00000000, 0,  0  ],
]

@pytest.mark.parametrize( "test_vectors", [
  basic_test_vectors,
  overflow_test_vectors
])
def test( test_vectors, cmdline_opts ):
  run_test_vector_sim( IntMulScycleV1(), test_vectors, cmdline_opts )
Here we define the test vector tables and then use those test vector tables in the pytest.mark.parametrize decorator. In this specific example it does not save much boilerplate, but we will see in the next section how this is a very powerful way to generate test cases. Modify the test script to have the correct expected values for both test cases and then rerun the test using pytest. Use the -v and -s options and notice that the output is basically the same as if we had explicitly defined two test cases.
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV1c_test.py -s -v
Testing with Stream Sources and Sinks
So far we have been testing a latency-sensitive design: we write the inputs on one cycle, and the result is produced after exactly one cycle. In this course, we will make extensive use of latency-insensitive streaming interfaces. Such interfaces use a val/rdy micro-protocol, which enables the surrounding logic to function correctly regardless of how many cycles a component requires. We can implement a single-cycle multiplier with a latency-insensitive streaming interface; here is its interface:
module imul_IntMulScycleV3
(
  input  logic        clk,
  input  logic        reset,

  input  logic        istream_val,
  output logic        istream_rdy,
  input  logic [63:0] istream_msg,

  output logic        ostream_val,
  input  logic        ostream_rdy,
  output logic [31:0] ostream_msg
);
Testing a latency-sensitive design requires cycle-by-cycle testing, but when testing a latency-insensitive design we can make use of stream sources and sinks to simplify our testing strategy while also robustly testing the flow control. Take a look at the test script imul/test/IntMulScycleV3a_test.py.
class TestHarness( Component ):

  def construct( s, imul, imsgs, omsgs ):

    # Instantiate models
    s.src  = StreamSourceFL( Bits64, imsgs )
    s.sink = StreamSinkFL ( Bits32, omsgs )
    s.imul = imul

    # Connect
    s.src.ostream  //= s.imul.istream
    s.imul.ostream //= s.sink.istream

  def done( s ):
    return s.src.done() and s.sink.done()

  def line_trace( s ):
    return s.src.line_trace() + " > " + s.imul.line_trace() + " > " + s.sink.line_trace()
The test harness composes a stream source, the latency-insensitive single-cycle multiplier, and a stream sink. When constructing the test harness we pass in a list of input messages for the stream source to send to the multiplier, and a list of output messages for the stream sink to check against the messages received from the multiplier. The stream source and sink take care of correctly handling the val/rdy micro-protocol. Here is what a test case now looks like:
def test_basic( cmdline_opts ):
  imsgs = [ mk_imsg(2,2), mk_imsg(3,3) ]
  omsgs = [ mk_omsg(4),   mk_omsg(9)   ]
  th = TestHarness( IntMulScycleV3(), imsgs, omsgs )
  run_sim( th, cmdline_opts, duts=['imul'] )
The test cases look a little different from the previous approach. Instead of creating a test vector table, we now need to create the input and output message lists and pass them into the test harness. We can use the run_sim function to handle applying the PyMTL3 passes and actually ticking the simulator. Let’s use pytest to run this test:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV3a_test.py -s -v
So far we have only been using directed testing, but random testing is of course also very important to help increase our test coverage. Here is a test case that randomly generates input messages and then calculates the correct output messages:
def test_random( cmdline_opts ):
  imsgs = []
  omsgs = []
  for i in range(10):
    a = randint(0,100)
    b = randint(0,100)
    imsgs.extend([ mk_imsg(a,b) ])
    omsgs.extend([ mk_omsg(a*b) ])
  th = TestHarness( IntMulScycleV3(), imsgs, omsgs )
  run_sim( th, cmdline_opts, duts=['imul'] )
You can use arbitrary Python to create a variety of random test sequences. Let’s go ahead and run these random tests:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV3b_test.py -s -v
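As noted above, arbitrary Python can be used to build other kinds of random sequences. For example, here is a hypothetical test case (not part of the provided test scripts) that mixes corner-case operands with fully random 32-bit operands; it assumes the mk_imsg/mk_omsg helpers shown later in this handout, which truncate values to 32 bits.
from random import randint, choice

def test_random_corner( cmdline_opts ):
  corner = [ 0, 1, 2, 0x7fffffff, 0x80000000, 0xffffffff ]
  imsgs  = []
  omsgs  = []
  for i in range(20):
    a = choice( corner ) if i % 2 == 0 else randint( 0, 0xffffffff )
    b = choice( corner ) if i % 2 == 1 else randint( 0, 0xffffffff )
    imsgs.append( mk_imsg( a, b ) )
    omsgs.append( mk_omsg( a * b ) )  # mk_omsg truncates the full product to 32 bits
  th = TestHarness( IntMulScycleV3(), imsgs, omsgs )
  run_sim( th, cmdline_opts, duts=['imul'] )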
In addition to testing the values, we also want to test that the latency-insensitive single-cycle multiplier correctly implements the val/rdy micro-protocol. In other words, we want to make sure that the design under test can handle arbitrary source/sink delays. The stream source and sink components enable setting an initial_delay and an interval_delay to help with this kind of delay testing. Here we set the delay to be three cycles in the stream sink:
def test_random_delay3( cmdline_opts ):
  imsgs = []
  omsgs = []
  for i in range(10):
    a = randint(0,100)
    b = randint(0,100)
    imsgs.extend([ mk_imsg(a,b) ])
    omsgs.extend([ mk_omsg(a*b) ])
  th = TestHarness( IntMulScycleV3(), imsgs, omsgs, 3 )
  run_sim( th, cmdline_opts, duts=['imul'] )
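Notice that the test harness used here takes an extra delay argument compared to the harness shown earlier. Here is a minimal sketch of how such a harness might forward that delay to the stream sink, assuming the initial_delay and interval_delay parameters mentioned above; the actual harness in the repository may differ.
class TestHarness( Component ):

  def construct( s, imul, imsgs, omsgs, delay=0 ):

    # Instantiate models; the sink waits delay cycles before accepting the
    # first message and between subsequent messages
    s.src  = StreamSourceFL( Bits64, imsgs )
    s.sink = StreamSinkFL ( Bits32, omsgs, initial_delay=delay, interval_delay=delay )
    s.imul = imul

    # Connect
    s.src.ostream  //= s.imul.istream
    s.imul.ostream //= s.sink.istream

  def done( s ):
    return s.src.done() and s.sink.done()

  def line_trace( s ):
    return s.src.line_trace() + " > " + s.imul.line_trace() + " > " + s.sink.line_trace()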
Let’s go ahead and run these delay tests:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV3c_test.py -s -v
Carefully compare the line trace to what we saw before without any delays. Finally, we can use a test case table and the pytest.mark.parametrize decorator to further simplify our test code.
#-------------------------------------------------------------------------
# mk_imsg/mk_omsg
#-------------------------------------------------------------------------
# Make input/output msgs, truncate ints to ensure they fit in 32 bits.

def mk_imsg( a, b ):
  return concat( Bits32( a, trunc_int=True ), Bits32( b, trunc_int=True ) )

def mk_omsg( a ):
  return Bits32( a, trunc_int=True )

#-------------------------------------------------------------------------
# test msgs
#-------------------------------------------------------------------------

basic_msgs = [
  mk_imsg(2,2), mk_omsg(4),
  mk_imsg(3,3), mk_omsg(9),
]

overflow_msgs = [
  mk_imsg(0x80000001,2), mk_omsg(2),
  mk_imsg(0xc0000002,4), mk_omsg(8),
]

random_msgs = []
for i in range(10):
  a = randint(0,100)
  b = randint(0,100)
  random_msgs.extend([ mk_imsg(a,b), mk_omsg(a*b) ])
#-------------------------------------------------------------------------
# Test Case Table
#-------------------------------------------------------------------------

test_case_table = mk_test_case_table([
  (                  "msgs           delay" ),
  [ "basic",          basic_msgs,    0      ],
  [ "overflow",       overflow_msgs, 0      ],
  [ "random",         random_msgs,   0      ],
  [ "random_delay1",  random_msgs,   1      ],
  [ "random_delay3",  random_msgs,   3      ],
])
#-------------------------------------------------------------------------
# run tests
#-------------------------------------------------------------------------

@pytest.mark.parametrize( **test_case_table )
def test( test_params, cmdline_opts ):
  imsgs = test_params.msgs[::2]
  omsgs = test_params.msgs[1::2]
  delay = test_params.delay
  th = TestHarness( IntMulScycleV3(), imsgs, omsgs, delay )
  run_sim( th, cmdline_opts, duts=['imul'] )
With a test case table, we can reuse the same input/output messages and simply vary the delays. Let’s try running the tests using this new approach:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulScycleV3d_test.py -s -v
Add a new row to the test case table that reuses the random messages but with a delay of 10. Rerun the test case and look at the line trace to verify the longer delays.
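If it is not clear how the new row should look, here is one way to extend the table; the new entry simply reuses random_msgs with a delay of 10.
test_case_table = mk_test_case_table([
  (                   "msgs           delay" ),
  [ "basic",           basic_msgs,    0      ],
  [ "overflow",        overflow_msgs, 0      ],
  [ "random",          random_msgs,   0      ],
  [ "random_delay1",   random_msgs,   1      ],
  [ "random_delay3",   random_msgs,   3      ],
  [ "random_delay10",  random_msgs,   10     ],
])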
Using Functional-Level Models
One challenge with our testing strategy so far is that when there is a test failure we often don’t know if the issue is an incorrect test case or an incorrect design. It can be useful to have a functional-level (FL) model (also called a golden reference model) of our design. We can then write all of our tests and ensure they pass on the FL model before running those tests on our RTL design. We could also use our FL model in random testing by sending the same inputs to both the FL and RTL models and ensuring the outputs are equal.
FL models can be written in pure Python using PyMTL3. Here is an FL model for our single-cycle multiplier.
class IntMulFL( Component ):

  def construct( s ):

    # Interface

    s.istream = IStreamIfc( Bits64 )
    s.ostream = OStreamIfc( Bits32 )

    # Queue Adapters

    s.istream_q = IStreamDeqAdapterFL( Bits64 )
    s.ostream_q = OStreamEnqAdapterFL( Bits32 )

    s.istream //= s.istream_q.istream
    s.ostream //= s.ostream_q.ostream

    # FL block

    @update_once
    def block():
      if s.istream_q.deq.rdy() and s.ostream_q.enq.rdy():
        msg = s.istream_q.deq()
        s.ostream_q.enq( msg[32:64] * msg[0:32] )
Our FL model has the exact same interface as the RTL model. Normally, an FL model just captures the functional behavior of a design and does not attempt to capture any of its timing behavior. The above FL model uses a combination of adapters and special PyMTL3 modeling constructs to express the function of our multiplier at a high level. We can run all of our tests on our FL model like this:
% cd $TOPDIR/build
% pytest ../imul/test/IntMulFL_test.py -s -v
Since the FL model and RTL model have the exact same interface, it is possible with clever Python programming to reuse the exact same test cases across both models. This means we can get all of our test cases working on the FL model, then directly reuse those test cases on the RTL model, and be relatively confident that the test cases are correct.
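As a rough illustration of the idea, a test case could be parametrized over the model class so the same messages exercise both the FL and RTL models. This is a hypothetical sketch, not necessarily how the course test code is organized, and the duts handling (e.g., for --test-verilog) is omitted and may need to differ between the FL and RTL cases.
import pytest

@pytest.mark.parametrize( "Model", [ IntMulFL, IntMulScycleV3 ] )
def test_basic_fl_vs_rtl( Model, cmdline_opts ):
  imsgs = [ mk_imsg(2,2), mk_imsg(3,3) ]
  omsgs = [ mk_omsg(4),   mk_omsg(9)   ]
  th = TestHarness( Model(), imsgs, omsgs )
  run_sim( th, cmdline_opts )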
Note that you can run all of the tests in the entire project like this:
% cd $TOPDIR/build
% pytest ../imul