Wrapper RAM - Cache-Line-Burst RAM Technical Documentation

Contents

  1. Overview
  2. Module Interface
  3. Memory Organization
  4. Cache-Line Read
  5. Byte-Granular Writes
  6. Programming Interface

Overview

Purpose

The wrapper_ram module is a RAM with 128-bit cache-line reads and byte-granular writes. It also includes a UART programming interface.

File Location

rtl/wrapper/wrapper_ram.sv

RAM Block Diagram

┌──────────────────────────────────────────────────────────────────────────────────┐
│                            WRAPPER_RAM                                           │
├──────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│    ┌───────────────────────────────────────────────────────────────────────────┐ │
│    │                         RAM ARRAY                                         │ │
│    │                                                                           │ │
│    │   ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐                         │ │
│    │   │ Bank 0  │ │ Bank 1  │ │ Bank 2  │ │ Bank 3  │  ← 32-bit banks         │ │
│    │   │(word 0) │ │(word 1) │ │(word 2) │ │(word 3) │                         │ │
│    │   └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘                         │ │
│    │        │           │           │           │                              │ │
│    │        └─────┬─────┴─────┬─────┴─────┬─────┘                              │ │
│    │              │           │           │                                    │ │
│    │              ▼           ▼           ▼                                    │ │
│    │        ┌─────────────────────────────────┐                                │ │
│    │        │   128-bit Cache Line Output     │                                │ │
│    │        │   rdata_o[127:0]                │                                │ │
│    │        └─────────────────────────────────┘                                │ │
│    │                                                                           │ │
│    └───────────────────────────────────────────────────────────────────────────┘ │
│                                                                                  │
│    ┌──────────────┐        ┌──────────────────┐                                  │
│    │  CPU Port    │        │  Programming     │                                  │
│    │  (RW)        │        │  Port (UART)     │                                  │
│    └──────────────┘        └──────────────────┘                                  │
│                                                                                  │
└──────────────────────────────────────────────────────────────────────────────────┘

Module Interface

Parameters

module wrapper_ram
  import level_param::*;
#(
    parameter int CPU_CLK          = 50_000_000,
    parameter int PROG_BAUD_RATE   = 115200,
    parameter string PROGRAM_SEQUENCE = "LEVELTEST",
    parameter int RAM_SIZE_KB      = 1024,         // 1 MB default
    parameter string RAM_INIT_FILE = ""            // Optional hex init file prefix
)

Port Definitions

(
    input  logic         clk_i,
    input  logic         rst_ni,

    // CPU Memory Interface
    input  logic [31:0]  addr_i,        // Byte address (bits [1:0] are ignored)
    input  logic [31:0]  wdata_i,       // Write data (32-bit)
    input  logic [3:0]   wstrb_i,       // Byte strobes
    output logic [127:0] rdata_o,       // Read data (128-bit cache line)
    input  logic         rd_en_i,       // Read enable

    // Programming Interface
    input  logic         ram_prog_rx_i, // UART RX for programming
    output logic         system_reset_o,// System reset during programming
    output logic         prog_mode_led_o// Programming mode LED
);

Memory Organization

Bank Structure

// 4 x 32-bit banks = 128-bit cache line
localparam int WORDS_PER_LINE = 4;
localparam int RAM_DEPTH = (RAM_SIZE_KB * 1024) / (WORDS_PER_LINE * 4);

// Bank memories
logic [31:0] bank0 [RAM_DEPTH];
logic [31:0] bank1 [RAM_DEPTH];
logic [31:0] bank2 [RAM_DEPTH];
logic [31:0] bank3 [RAM_DEPTH];
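
As a quick sanity check of the depth formula, the following Python sketch reproduces RAM_DEPTH for the default RAM_SIZE_KB. It is a host-side calculation, not part of the RTL, and the function name ram_depth is hypothetical.

```python
WORDS_PER_LINE = 4   # four 32-bit banks per 128-bit line
BYTES_PER_WORD = 4


def ram_depth(ram_size_kb: int) -> int:
    """Number of 128-bit lines (entries per bank), mirroring the RAM_DEPTH localparam."""
    return (ram_size_kb * 1024) // (WORDS_PER_LINE * BYTES_PER_WORD)


depth = ram_depth(1024)                # default RAM_SIZE_KB
index_bits = (depth - 1).bit_length()  # address bits needed to index one bank
print(depth, index_bits)               # prints "65536 16"
```

With the default 1 MB size, each bank therefore holds 65536 words, and only 16 bits of the line index are needed to address a bank.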

Address Mapping

┌──────────────────────────────────────────────────────────────────────┐
│                        ADDRESS MAPPING                               │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   Input Address: addr_i[31:0]                                        │
│                                                                      │
│   ┌───────────────────────────────────────────┬──────┬──────┐        │
│   │31                                        4│3    2│1    0│        │
│   │             Line Index                    │ Bank │Unused│        │
│   │               (RAM_DEPTH)                 │Select│      │        │
│   └───────────────────────────────────────────┴──────┴──────┘        │
│                                                                      │
│   Line Index = addr_i[31:4]   (128-bit line selection)               │
│   Bank Select = addr_i[3:2]   (Word within line: 0,1,2,3)            │
│   Byte Offset = addr_i[1:0]   (Byte within word - ignored)           │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

Address Decode

wire [27:0] line_index = addr_i[31:4];
wire [1:0]  word_sel   = addr_i[3:2];
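
The same field split can be checked on the host. The sketch below is a small Python reference model of the decode, assuming addr_i is a 32-bit byte address as in the mapping above. The function name decode and the sample address are illustrative only.

```python
def decode(addr: int) -> tuple[int, int, int]:
    """Split a 32-bit byte address into (line_index, word_sel, byte_offset)."""
    addr &= 0xFFFF_FFFF
    line_index = addr >> 4          # addr_i[31:4]
    word_sel = (addr >> 2) & 0x3    # addr_i[3:2]
    byte_offset = addr & 0x3        # addr_i[1:0], ignored by the RAM
    return line_index, word_sel, byte_offset


print(decode(0x0000_002C))  # prints "(2, 3, 0)": line 2, bank 3, byte 0
```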

Cache-Line Read

128-bit Read

// Read entire cache line (all 4 banks simultaneously)
always_ff @(posedge clk_i) begin
    if (rd_en_i) begin
        rdata_o <= {bank3[line_index],  // [127:96]
                    bank2[line_index],  // [95:64]
                    bank1[line_index],  // [63:32]
                    bank0[line_index]}; // [31:0]
    end
end

Cache Controller Interface

CPU Request: LW addr=0x8000_0014
             line_index = 0x800_0001
             word_sel = 1

wrapper_ram returns:
  rdata_o[127:0] = {word3, word2, word1, word0}

Cache controller extracts:
  word = rdata_o[word_sel*32 +: 32] = rdata_o[63:32] = word1
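
The word extraction performed by the cache controller can be modelled in a few lines of Python. This is only a sketch of the indexed part-select shown above. The helper names pack_line and extract_word are hypothetical, and the word values are arbitrary test data.

```python
def pack_line(word0: int, word1: int, word2: int, word3: int) -> int:
    """Build the 128-bit line {word3, word2, word1, word0} as rdata_o presents it."""
    return (word3 << 96) | (word2 << 64) | (word1 << 32) | word0


def extract_word(rdata: int, word_sel: int) -> int:
    """Equivalent of rdata_o[word_sel*32 +: 32]."""
    return (rdata >> (word_sel * 32)) & 0xFFFF_FFFF


line = pack_line(0x1111_1111, 0x2222_2222, 0x3333_3333, 0x4444_4444)
print(hex(extract_word(line, 1)))  # prints "0x22222222", i.e. word1
```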

Byte-Granular Writes

Write Logic

always_ff @(posedge clk_i) begin
    // CPU write or programming-port write
    if (cpu_we || prog_we) begin
        logic [31:0] write_addr;
        logic [31:0] write_data;
        logic [3:0]  write_strb;
        logic [27:0] write_line;

        write_addr = prog_active ? prog_addr : addr_i;
        write_data = prog_active ? prog_data : wdata_i;
        write_strb = prog_active ? 4'b1111  : wstrb_i;
        // Take the line index from the selected address, so that
        // programming-port writes do not land on the line addr_i points to
        write_line = write_addr[31:4];

        // Bank selection based on word offset
        case (write_addr[3:2])
            2'b00: begin
                if (write_strb[0]) bank0[write_line][7:0]   <= write_data[7:0];
                if (write_strb[1]) bank0[write_line][15:8]  <= write_data[15:8];
                if (write_strb[2]) bank0[write_line][23:16] <= write_data[23:16];
                if (write_strb[3]) bank0[write_line][31:24] <= write_data[31:24];
            end
            2'b01: begin
                if (write_strb[0]) bank1[write_line][7:0]   <= write_data[7:0];
                if (write_strb[1]) bank1[write_line][15:8]  <= write_data[15:8];
                if (write_strb[2]) bank1[write_line][23:16] <= write_data[23:16];
                if (write_strb[3]) bank1[write_line][31:24] <= write_data[31:24];
            end
            2'b10: begin
                if (write_strb[0]) bank2[write_line][7:0]   <= write_data[7:0];
                if (write_strb[1]) bank2[write_line][15:8]  <= write_data[15:8];
                if (write_strb[2]) bank2[write_line][23:16] <= write_data[23:16];
                if (write_strb[3]) bank2[write_line][31:24] <= write_data[31:24];
            end
            2'b11: begin
                if (write_strb[0]) bank3[write_line][7:0]   <= write_data[7:0];
                if (write_strb[1]) bank3[write_line][15:8]  <= write_data[15:8];
                if (write_strb[2]) bank3[write_line][23:16] <= write_data[23:16];
                if (write_strb[3]) bank3[write_line][31:24] <= write_data[31:24];
            end
        endcase
    end
end
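
The per-byte merge that each case branch performs can be expressed compactly as a Python reference model, which may be handy when checking simulation results. This is a sketch rather than the RTL itself, and apply_strobes is a hypothetical helper name.

```python
def apply_strobes(old_word: int, wdata: int, wstrb: int) -> int:
    """Return the bank word after a write: byte lanes with a set strobe take wdata."""
    result = old_word
    for lane in range(4):
        if (wstrb >> lane) & 1:
            mask = 0xFF << (8 * lane)
            result = (result & ~mask) | (wdata & mask)
    return result & 0xFFFF_FFFF


# Update byte lanes 0 and 2 only; lanes 1 and 3 keep their old contents
print(hex(apply_strobes(0xAABB_CCDD, 0x1122_3344, 0b0101)))  # prints "0xaa22cc44"
```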

Store Instruction Examples

SW x1, 0(x2)    // Word store: wstrb=1111, writes to one bank
SH x1, 0(x2)    // Halfword: wstrb=0011 or 1100
SB x1, 0(x2)    // Byte: wstrb=0001, 0010, 0100, or 1000
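
The strobe patterns listed above follow from the store size and the low two address bits. The Python sketch below derives them under two assumptions that the source does not state: stores are naturally aligned, and the CPU has already shifted the store data into the matching byte lanes of wdata_i. The function name store_strobes is hypothetical.

```python
def store_strobes(size_bytes: int, addr: int) -> int:
    """wstrb pattern for a naturally aligned store of 1, 2 or 4 bytes."""
    offset = addr & 0x3
    if size_bytes not in (1, 2, 4) or offset % size_bytes != 0:
        raise ValueError("unsupported size or misaligned store")
    return ((1 << size_bytes) - 1) << offset


print(format(store_strobes(4, 0x100), "04b"))  # SW           -> 1111
print(format(store_strobes(2, 0x102), "04b"))  # SH, upper    -> 1100
print(format(store_strobes(1, 0x101), "04b"))  # SB, byte 1   -> 0010
```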

Programming Interface

RAM Programmer Integration

ram_programmer #(
    .CPU_CLK         (CPU_CLK),
    .PROG_BAUD_RATE  (PROG_BAUD_RATE),
    .PROGRAM_SEQUENCE(PROGRAM_SEQUENCE)
) i_programmer (
    .i_clk         (clk_i),
    .i_rst_n       (rst_ni),
    .i_uart_rx     (ram_prog_rx_i),
    .o_ram_we      (prog_we),
    .o_ram_addr    (prog_addr),
    .o_ram_wdata   (prog_data),
    .o_system_reset(system_reset_o),
    .o_prog_mode_led(prog_mode_led_o)
);

Priority Arbitration

// Programming port takes priority
wire prog_active = !system_reset_o;  // During programming
wire cpu_we      = |wstrb_i && !prog_active;
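
To make the priority rule concrete, here is a small Python model of the write-source selection as documented: while prog_active is set the programmer's address and data are used with all strobes forced on, otherwise the CPU signals pass through. The WriteRequest type and the arbitrate function are hypothetical names introduced for this sketch.

```python
from typing import NamedTuple


class WriteRequest(NamedTuple):
    enable: bool
    addr: int
    data: int
    strb: int


def arbitrate(prog_active: bool, prog_we: bool, prog_addr: int, prog_data: int,
              addr_i: int, wdata_i: int, wstrb_i: int) -> WriteRequest:
    """Select the write source; the programming port wins whenever it is active."""
    if prog_active:
        # CPU writes are masked (cpu_we is forced low), full-word writes only
        return WriteRequest(prog_we, prog_addr, prog_data, 0b1111)
    # cpu_we = |wstrb_i; with all strobes low no byte would be written anyway
    return WriteRequest(wstrb_i != 0, addr_i, wdata_i, wstrb_i)


print(arbitrate(True, True, 0x10, 0xDEADBEEF, 0x20, 0x1, 0b0001))
print(arbitrate(False, False, 0x10, 0xDEADBEEF, 0x20, 0x1, 0b0001))
```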

Initialization

Hex File Loading

initial begin
    if (RAM_INIT_FILE != "") begin
        // Separate init for each of the 4 banks
        $readmemh({RAM_INIT_FILE, "_b0.hex"}, bank0);
        $readmemh({RAM_INIT_FILE, "_b1.hex"}, bank1);
        $readmemh({RAM_INIT_FILE, "_b2.hex"}, bank2);
        $readmemh({RAM_INIT_FILE, "_b3.hex"}, bank3);
    end
end
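
Because each bank is initialized from its own file, a flat firmware image has to be split word by word before it can be loaded. The Python sketch below shows one way to produce the four files, assuming the image is available as a list of 32-bit words in address order starting at line 0. The helper names are hypothetical; only the _b0.hex to _b3.hex suffixes come from the code above.

```python
def split_into_banks(words: list[int]) -> list[list[int]]:
    """Distribute words round-robin: word i goes to bank i % 4, line i // 4."""
    padded = list(words) + [0] * (-len(words) % 4)  # pad the last line with zeros
    return [padded[bank::4] for bank in range(4)]


def write_bank_files(prefix: str, words: list[int]) -> list[str]:
    """Write <prefix>_b0.hex .. <prefix>_b3.hex, one 8-digit hex word per line."""
    paths = []
    for bank, contents in enumerate(split_into_banks(words)):
        path = f"{prefix}_b{bank}.hex"
        with open(path, "w") as f:
            for word in contents:
                f.write(f"{word & 0xFFFF_FFFF:08x}\n")
        paths.append(path)
    return paths


print(split_into_banks([0, 1, 2, 3, 4, 5]))  # prints "[[0, 4], [1, 5], [2, 0], [3, 0]]"
```

Calling write_bank_files("firmware", words) would then create files that match RAM_INIT_FILE = "firmware".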

Verilator Memory Loading

`ifdef VERILATOR
    // Load program via DPI
    import "DPI-C" function void load_program(
        input string filename,
        inout logic [31:0] mem0[],
        inout logic [31:0] mem1[],
        inout logic [31:0] mem2[],
        inout logic [31:0] mem3[]
    );

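    initial begin
        string firmware;
        // $test$plusargs only reports whether a plusarg is present;
        // $value$plusargs is needed to read the +firmware=<path> value
        if ($value$plusargs("firmware=%s", firmware)) begin
            load_program(firmware, bank0, bank1, bank2, bank3);
        end
    end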
`endif

Timing Diagram

Cache Line Read

              β”Œβ”€β” β”Œβ”€β” β”Œβ”€β” β”Œβ”€β” β”Œβ”€β”
clk_i      β”€β”€β”€β”˜ β””β”€β”˜ β””β”€β”˜ β””β”€β”˜ β””β”€β”˜ └───

addr_i     ───── ADDR  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€

rd_en_i        β”Œβ”€β”€β”€β”€β”€β”€β”€β”
           β”€β”€β”€β”€β”˜       └───────────

rdata_o    ─────────────128-bit LINEβ”œ
                       (registered)
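
The registered behaviour in the diagram (data appears after the clock edge that samples rd_en_i, then holds) can be mimicked with a cycle-level Python model. This is a sketch under two assumptions not stated in the source: the output register starts at 0, and the line index wraps to the bank depth. The class name ReadPortModel is hypothetical.

```python
class ReadPortModel:
    """Cycle-level model of the registered 128-bit read port."""

    def __init__(self, depth: int) -> None:
        # Four banks of 32-bit words, as in the bank0..bank3 arrays
        self.banks = [[0] * depth for _ in range(4)]
        self.rdata_o = 0  # assumed initial register value

    def posedge(self, addr_i: int, rd_en_i: bool) -> None:
        """Advance one rising clock edge."""
        if rd_en_i:
            line = (addr_i >> 4) % len(self.banks[0])  # assumed wrap-around
            self.rdata_o = sum(self.banks[b][line] << (32 * b) for b in range(4))
        # With rd_en_i low, rdata_o holds its previous value


ram = ReadPortModel(depth=16)
for b in range(4):
    ram.banks[b][1] = 0x1000_0000 + b    # fill line 1 with recognizable words
ram.posedge(addr_i=0x10, rd_en_i=True)   # the edge samples the address of line 1
print(hex(ram.rdata_o))
```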

Byte Write

              β”Œβ”€β” β”Œβ”€β” β”Œβ”€β” β”Œβ”€β” β”Œβ”€β”
clk_i      β”€β”€β”€β”˜ β””β”€β”˜ β””β”€β”˜ β””β”€β”˜ β””β”€β”˜ └───

addr_i     ───── ADDR  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€

wdata_i    ───── DATA  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€

wstrb_i    ───── 0001  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
               (byte 0 only)

bank0[idx]     ───────── UPDATED β”œβ”€
                       byte 0 only

Performance

Throughput

Operation              Latency   Bandwidth
Cache Line Read        1 cycle   128 bits/cycle
Word Write             1 cycle   32 bits/cycle
Burst Read (4 words)   1 cycle   128 bits/cycle
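
To put the per-cycle figures into absolute terms, the following Python sketch converts them to peak bytes per second at the default CPU_CLK of 50 MHz. It assumes one operation is issued every cycle, which is an idealization, and the function name is hypothetical.

```python
CPU_CLK_HZ = 50_000_000  # default CPU_CLK parameter


def peak_bandwidth_bytes_per_s(bits_per_cycle: int, clk_hz: int = CPU_CLK_HZ) -> int:
    """Peak bandwidth assuming one operation completes every clock cycle."""
    return bits_per_cycle * clk_hz // 8


print(peak_bandwidth_bytes_per_s(128))  # cache-line read: 800000000 (800 MB/s)
print(peak_bandwidth_bytes_per_s(32))   # word write:      200000000 (200 MB/s)
```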

Resource Usage (Typical)

Resource     Usage
BRAM (1MB)   256 x 36Kb BRAM
LUTs         ~500 (address decode)
FFs          ~200 (control logic)
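
The BRAM figure can be reproduced with a short calculation. The Python sketch below assumes each 36Kb block contributes 32 Kb of data storage because its parity bits are unused for 32-bit words. That assumption is not stated in the source, and the function name is hypothetical.

```python
def bram36_count(ram_size_kb: int, data_bits_per_bram: int = 32 * 1024) -> int:
    """Blocks needed for the given RAM size, rounding up to whole blocks."""
    total_bits = ram_size_kb * 1024 * 8
    return -(-total_bits // data_bits_per_bram)  # ceiling division


print(bram36_count(1024))  # prints "256" for the default 1 MB RAM
```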

Summary

The wrapper_ram module provides:

  1. 128-bit Read: a full cache line is returned in a single cycle
  2. Byte-Granular Write: wstrb_i selects which bytes of a word are written
  3. 4 Banks: parallel 32-bit memory banks
  4. Programming: UART-based boot loading
  5. Priority: the programming port takes precedence over the CPU port

This module is the backing memory for the Level cache system.