NetTLP: Setup

Overview

A NetTLP platform is composed of two hosts: adapter host and device host. An adapter host has at least one FPGA-based NetTLP adapter, and a device host performs a software device, which is the substance of the NetTLP adapter on the adapter host. The hosts are connected by a 10Gbps Ethernet link between the NetTLP adapter and a regular 10Gbps NIC on the device host.

Figure 1: A setup at our Lab.

Setup adapter host

Requirement of adapter host

Install NetTLP Adapter

Install a NetTLP adapter to a PCIe slot, and program The adapter via Vivado. The bit file can be found at https://github.com/NetTLP/adapter/releases. After booting the adapter host, you can see a device with vendor ID 3776:8022. This bit file is created with the evaluation version Ethernet-10G MAC license. Therefore the adapter stops after one day. Please reload the bit file to the FPGA board when the adapter stops.

$ lspci -d 3776:8022
1b:00.0 Memory controller: Device 3776:8022

$ sudo lspci -v -d 3776:8022
1b:00.0 Memory controller: Device 3776:8022
	Subsystem: Device 3776:8022
	Flags: fast devsel, NUMA node 0
	Memory at c0100000 (32-bit, non-prefetchable) [disabled] [size=16K]
	Memory at a0000000 (64-bit, prefetchable) [disabled] [size=256M]
	Memory at b0000000 (64-bit, prefetchable) [disabled] [size=256M]
	Capabilities: [40] Power Management version 3
	Capabilities: [60] Express Endpoint, MSI 00
	Capabilities: [9c] MSI-X: Enable- Count=22 Masked-
	Capabilities: [100] Device Serial Number 00-00-00-01-01-00-0a-35

The current NetTLP adapter (v0.17) has three Base Address Registers (BAR). BAR0 is used to store adapter specific information (i.e., MAC address and IP address), and BAR2 is used to store MSI-X table. BAR4 is quite special: Transaction Layer Packets (TLPs) to the BAR are encapsulated in Ethernet, IP, and UDP headers, and transmitted to the device host via the 10Gbps Ethernet Link. In the above example, memory accesses to 0xb0000000-0xbfffffff in the MMIO space are delivered to the device host by Ethernet/IP/UDP encapsulation.

Install NetTLP basic driver

Next, install a NetTLP basic driver found at https://github.com/nettlp/driver. This driver just enables the NetTLP adapter by the Linux way, and it provides a simple messaging API to obtain adapter information (i.e., bus and slot number, and BAR4 address) from the device host.

$ mkdir nettlp
$ cd nettlp
$ git clone https://github.com/nettlp/driver
Cloning into 'driver'...
remote: Enumerating objects: 30, done.
remote: Counting objects: 100% (30/30), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 42 (delta 14), reused 25 (delta 10), pack-reused 12
Unpacking objects: 100% (42/42), done.
$ cd driver
$ ls
Makefile  nettlp_main.c  nettlp_msg.c  nettlp_msg.h

$ make
make -C /lib/modules/4.20.2-tsukumo1-nopti/build M=/home/upa/work/test/nettlp/driver V=0 modules
make[1]: Entering directory '/home/upa/src/linux-4.20.2'
  CC [M]  /home/upa/work/test/nettlp/driver/nettlp_main.o
  CC [M]  /home/upa/work/test/nettlp/driver/nettlp_msg.o
  LD [M]  /home/upa/work/test/nettlp/driver/nettlp.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/upa/work/test/nettlp/driver/nettlp.mod.o
  LD [M]  /home/upa/work/test/nettlp/driver/nettlp.ko
make[1]: Leaving directory '/home/upa/src/linux-4.20.2'
$ ls
Makefile        nettlp.ko     nettlp.o       nettlp_msg.c
Module.symvers	nettlp.mod.c  nettlp_main.c  nettlp_msg.h
modules.order	nettlp.mod.o  nettlp_main.o  nettlp_msg.o

# nettlp.ko is the basic driver.

Note that current basic driver implementation depends on udp_tunnel module. Please load it before insmod nettlp.ko.

$ sudo modprobe udp_tunnel
$ sudo insmod nettlp.ko

If the NetTLP adapter is correctory working, you can see dmesg output like shown below:

[454516.635720] nettlp: nettlp (v0.0.3) is loaded
[454516.636224] nettlp: nettlp_pci_init: register nettlp device 0000:1b:00.0
[454516.636237] nettlp 0000:1b:00.0: enabling device (0100 -> 0102)
[454516.636252] nettlp: BAR0 start  : 0xc0100000
[454516.636253] nettlp: BAR0 end    : 0xc0103fff
[454516.636254] nettlp: BAR0 flags  : 0x40200
[454516.636255] nettlp: BAR0 len    : 16384
[454516.636256] nettlp: BAR2 start  : 0xa0000000
[454516.636257] nettlp: BAR2 end    : 0xafffffff
[454516.636258] nettlp: BAR2 flags  : 0x14220c
[454516.636259] nettlp: BAR2 len    : 268435456
[454516.636385] nettlp: BAR2 mapped : 000000006b3cb24e
[454516.636386] nettlp: BAR4 start  : 0xb0000000
[454516.636387] nettlp: BAR4 end    : 0xbfffffff
[454516.636388] nettlp: BAR4 flags  : 0x14220c
[454516.636389] nettlp: BAR4 len    : 268435456
[454516.636390] nettlp: Allocate BAR4 as p2pdma memory
[454516.656933] 0000:1b:00.0 initialised, 65536 pages in 0ms
[454516.656945] nettlp 0000:1b:00.0: added peer-to-peer DMA memory [mem 0xb0000000-0xbfffffff 64bit pref]
[454516.657128] initialize nettlp message socket
[454516.657184] nettlp: MSIX [0]: Addr=0x0, Data=00000000
[454516.657185] nettlp: MSIX [1]: Addr=0x0, Data=00000000
[454516.657187] nettlp: MSIX [2]: Addr=0x0, Data=00000000
[454516.657188] nettlp: MSIX [3]: Addr=0x0, Data=00000000
[454516.657189] nettlp: MSIX [4]: Addr=0x0, Data=00000000
[454516.657190] nettlp: MSIX [5]: Addr=0x0, Data=00000000
[454516.657191] nettlp: MSIX [6]: Addr=0x0, Data=00000000
[454516.657192] nettlp: MSIX [7]: Addr=0x0, Data=00000000
[454516.657193] nettlp: MSIX [8]: Addr=0x0, Data=00000000
[454516.657194] nettlp: MSIX [9]: Addr=0x0, Data=00000000
[454516.657195] nettlp: MSIX [10]: Addr=0x0, Data=00000000
[454516.657196] nettlp: MSIX [11]: Addr=0x0, Data=00000000
[454516.657197] nettlp: MSIX [12]: Addr=0x0, Data=00000000
[454516.657198] nettlp: MSIX [13]: Addr=0x0, Data=00000000
[454516.657199] nettlp: MSIX [14]: Addr=0x0, Data=00000000
[454516.657200] nettlp: MSIX [15]: Addr=0x0, Data=00000000

Note: NetTLP might conflict with some security mechanisms. If you do not focus on such mechanisms and do not mind, we suggest to add "nopti nospectre_v2 nokaslr" to GRUB_CMDLINE_LINUX in /etc/default/grub, update-grub, and reboot the machine. Note that we have NOT tested NetTLP with iommu yet.

Setup device host

Requirement of device host

Setup connection between device host and NetTLP adapter

Connect the device host to the NetTLP adapter on the adapter host by 10Gbps Ethernet link, like shown in Figure 1. The NetTLP adapter always sends light during board power is on, so that only ip link set dev ethX up is required to bring the link up.

The next step is to setup a static ARP entry. The current implementation of NetTLP adapter (v0.17) does not respond to ARP requests. Thus, the device host needs to have a static ARP entry for the connected NetTLP adapter.

Note: default MAC address and IP address of current NetTLP adapter are 00:11:22:33:44:55 and 192.168.10.1. The NetTLP adapter uses FF:FF:FF:FF:FF:FF and 192.168.10.3 as default destination addresses for encapsulation. They can be changed. These addresses are used for only TLP encapsulation and decapsulation, not used for the network stack of the adapter host. Thus, this address (192.168.10.1 as default) does not respond to any packets except NetTLP packets.

In this setup, we use 192.168.10.3/24 for the device host according to the default IP address of the adapter.

# eth1 is an exmaple name of the 10Gbps port connected to NetTLP adapter

$ sudo ip link set dev eth1 up
$ sudo ip addr add dev eth1 192.168.10.3/24
$ sudo ip nei add 192.168.10.1 dev eth1 lladdr 00:11:22:33:44:55
$ ip nei show 192.168.10.1
192.168.10.1 dev eth1 lladdr 00:11:22:33:44:55 PERMANENT

Compile LibTLP

LibTLP works on the device host. Download and compile it from https://github.com/nettlp/libtlp. The LibTLP repository contains a shared library called libtlp and sample applications such as dma_read, pmem, and tlpperf.

$ mkdir nettlp
$ cd nettlp
$ git clone https://github.com/nettlp/libtlp
Cloning into 'libtlp'...
Username for 'https://github.com': upa
Password for 'https://upa@github.com': 
remote: Enumerating objects: 300, done.
remote: Counting objects: 100% (300/300), done.
remote: Compressing objects: 100% (198/198), done.
remote: Total 300 (delta 184), reused 211 (delta 95), pack-reused 0
Receiving objects: 100% (300/300), 69.79 KiB | 441.00 KiB/s, done.
Resolving deltas: 100% (184/184), done.
$ cd libtlp/
$ ls
Makefile  README.md  apps  include  lib  test
$ make
make[1]: Entering directory '/home/upa/work/test/nettlp/libtlp/lib'
gcc -I./ -I../include -c libtlp.c
ar rcs libtlp.a libtlp.o
make[1]: Leaving directory '/home/upa/work/test/nettlp/libtlp/lib'
make[1]: Entering directory '/home/upa/work/test/nettlp/libtlp/test'
gcc -g -Wall -I../include   -L../lib   test_dma_read.c  -ltlp -o test_dma_read
gcc -g -Wall -I../include   -L../lib   test_dma_write.c  -ltlp -o test_dma_write
gcc -g -Wall -I../include   -L../lib   test_msg_bar4.c  -ltlp -o test_msg_bar4
gcc -g -Wall -I../include   -L../lib   test_msg_msix.c  -ltlp -o test_msg_msix
gcc -g -Wall -I../include   -L../lib   test_msg_devid.c  -ltlp -o test_msg_devid
make[1]: Leaving directory '/home/upa/work/test/nettlp/libtlp/test'
make[1]: Entering directory '/home/upa/work/test/nettlp/libtlp/apps'
gcc -g -Wall -I../include   -L../lib   dma_read.c  -ltlp -lpthread -o dma_read
gcc -g -Wall -I../include   -L../lib   dma_write.c  -ltlp -lpthread -o dma_write
gcc -g -Wall -I../include   -L../lib   pmem.c  -ltlp -lpthread -o pmem
gcc -g -Wall -I../include   -L../lib   process-list.c  -ltlp -lpthread -o process-list
gcc -g -Wall -I../include   -L../lib   pgd-walk.c  -ltlp -lpthread -o pgd-walk
gcc -g -Wall -I../include   -L../lib   codedump.c  -ltlp -lpthread -o codedump
gcc -g -Wall -I../include   -L../lib   tlpperf.c  -ltlp -lpthread -o tlpperf
make[1]: Leaving directory '/home/upa/work/test/nettlp/libtlp/apps'
$ ls apps/
Makefile    dma_read	 pgd-walk    process-list	      tlpperf.c
README.md   dma_read.c	 pgd-walk.c  process-list.c	      util.h
codedump    dma_write	 pmem	     thread_affinity_apple.h
codedump.c  dma_write.c  pmem.c      tlpperf

Next section shows how to play example LibTLP applications on this environment.