Overview
A NetTLP platform is composed of two hosts: adapter host and device host. An adapter host has at least one FPGA-based NetTLP adapter, and a device host performs a software device, which is the substance of the NetTLP adapter on the adapter host. The hosts are connected by a 10Gbps Ethernet link between the NetTLP adapter and a regular 10Gbps NIC on the device host.
Setup adapter host
Requirement of adapter host
- OS: Ubuntu 18.04
- Xilinx KC705 Eval board (EK-K7-KC705-G)
- Xilinx Vivado 19.2
Install NetTLP Adapter
Install a NetTLP adapter to a PCIe slot, and program The adapter via Vivado. The bit file can be found at https://github.com/NetTLP/adapter/releases. After booting the adapter host, you can see a device with vendor ID 3776:8022. This bit file is created with the evaluation version Ethernet-10G MAC license. Therefore the adapter stops after one day. Please reload the bit file to the FPGA board when the adapter stops.
$ lspci -d 3776:8022
1b:00.0 Memory controller: Device 3776:8022
$ sudo lspci -v -d 3776:8022
1b:00.0 Memory controller: Device 3776:8022
Subsystem: Device 3776:8022
Flags: fast devsel, NUMA node 0
Memory at c0100000 (32-bit, non-prefetchable) [disabled] [size=16K]
Memory at a0000000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at b0000000 (64-bit, prefetchable) [disabled] [size=256M]
Capabilities: [40] Power Management version 3
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [9c] MSI-X: Enable- Count=22 Masked-
Capabilities: [100] Device Serial Number 00-00-00-01-01-00-0a-35
The current NetTLP adapter (v0.17) has three Base Address Registers (BAR). BAR0 is used to store adapter specific information (i.e., MAC address and IP address), and BAR2 is used to store MSI-X table. BAR4 is quite special: Transaction Layer Packets (TLPs) to the BAR are encapsulated in Ethernet, IP, and UDP headers, and transmitted to the device host via the 10Gbps Ethernet Link. In the above example, memory accesses to 0xb0000000-0xbfffffff in the MMIO space are delivered to the device host by Ethernet/IP/UDP encapsulation.
Install NetTLP basic driver
Next, install a NetTLP basic driver found at https://github.com/nettlp/driver. This driver just enables the NetTLP adapter by the Linux way, and it provides a simple messaging API to obtain adapter information (i.e., bus and slot number, and BAR4 address) from the device host.
$ mkdir nettlp
$ cd nettlp
$ git clone https://github.com/nettlp/driver
Cloning into 'driver'...
remote: Enumerating objects: 30, done.
remote: Counting objects: 100% (30/30), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 42 (delta 14), reused 25 (delta 10), pack-reused 12
Unpacking objects: 100% (42/42), done.
$ cd driver
$ ls
Makefile nettlp_main.c nettlp_msg.c nettlp_msg.h
$ make
make -C /lib/modules/4.20.2-tsukumo1-nopti/build M=/home/upa/work/test/nettlp/driver V=0 modules
make[1]: Entering directory '/home/upa/src/linux-4.20.2'
CC [M] /home/upa/work/test/nettlp/driver/nettlp_main.o
CC [M] /home/upa/work/test/nettlp/driver/nettlp_msg.o
LD [M] /home/upa/work/test/nettlp/driver/nettlp.o
Building modules, stage 2.
MODPOST 1 modules
CC /home/upa/work/test/nettlp/driver/nettlp.mod.o
LD [M] /home/upa/work/test/nettlp/driver/nettlp.ko
make[1]: Leaving directory '/home/upa/src/linux-4.20.2'
$ ls
Makefile nettlp.ko nettlp.o nettlp_msg.c
Module.symvers nettlp.mod.c nettlp_main.c nettlp_msg.h
modules.order nettlp.mod.o nettlp_main.o nettlp_msg.o
# nettlp.ko is the basic driver.
Note that current basic driver implementation depends on udp_tunnel module. Please load it before insmod nettlp.ko.
$ sudo modprobe udp_tunnel
$ sudo insmod nettlp.ko
If the NetTLP adapter is correctory working, you can see dmesg output like shown below:
[454516.635720] nettlp: nettlp (v0.0.3) is loaded
[454516.636224] nettlp: nettlp_pci_init: register nettlp device 0000:1b:00.0
[454516.636237] nettlp 0000:1b:00.0: enabling device (0100 -> 0102)
[454516.636252] nettlp: BAR0 start : 0xc0100000
[454516.636253] nettlp: BAR0 end : 0xc0103fff
[454516.636254] nettlp: BAR0 flags : 0x40200
[454516.636255] nettlp: BAR0 len : 16384
[454516.636256] nettlp: BAR2 start : 0xa0000000
[454516.636257] nettlp: BAR2 end : 0xafffffff
[454516.636258] nettlp: BAR2 flags : 0x14220c
[454516.636259] nettlp: BAR2 len : 268435456
[454516.636385] nettlp: BAR2 mapped : 000000006b3cb24e
[454516.636386] nettlp: BAR4 start : 0xb0000000
[454516.636387] nettlp: BAR4 end : 0xbfffffff
[454516.636388] nettlp: BAR4 flags : 0x14220c
[454516.636389] nettlp: BAR4 len : 268435456
[454516.636390] nettlp: Allocate BAR4 as p2pdma memory
[454516.656933] 0000:1b:00.0 initialised, 65536 pages in 0ms
[454516.656945] nettlp 0000:1b:00.0: added peer-to-peer DMA memory [mem 0xb0000000-0xbfffffff 64bit pref]
[454516.657128] initialize nettlp message socket
[454516.657184] nettlp: MSIX [0]: Addr=0x0, Data=00000000
[454516.657185] nettlp: MSIX [1]: Addr=0x0, Data=00000000
[454516.657187] nettlp: MSIX [2]: Addr=0x0, Data=00000000
[454516.657188] nettlp: MSIX [3]: Addr=0x0, Data=00000000
[454516.657189] nettlp: MSIX [4]: Addr=0x0, Data=00000000
[454516.657190] nettlp: MSIX [5]: Addr=0x0, Data=00000000
[454516.657191] nettlp: MSIX [6]: Addr=0x0, Data=00000000
[454516.657192] nettlp: MSIX [7]: Addr=0x0, Data=00000000
[454516.657193] nettlp: MSIX [8]: Addr=0x0, Data=00000000
[454516.657194] nettlp: MSIX [9]: Addr=0x0, Data=00000000
[454516.657195] nettlp: MSIX [10]: Addr=0x0, Data=00000000
[454516.657196] nettlp: MSIX [11]: Addr=0x0, Data=00000000
[454516.657197] nettlp: MSIX [12]: Addr=0x0, Data=00000000
[454516.657198] nettlp: MSIX [13]: Addr=0x0, Data=00000000
[454516.657199] nettlp: MSIX [14]: Addr=0x0, Data=00000000
[454516.657200] nettlp: MSIX [15]: Addr=0x0, Data=00000000
Note: NetTLP might conflict with some
security mechanisms. If you do not focus on such mechanisms
and do not mind, we suggest to
add "nopti nospectre_v2 nokaslr"
to GRUB_CMDLINE_LINUX in /etc/default/grub, update-grub, and
reboot the machine. Note that we have NOT tested NetTLP with
iommu yet.
Setup device host
Requirement of device host
- OS: Ubuntu 18.04
- x1 10Gbps Ethernet port
Setup connection between device host and NetTLP adapter
Connect the device host to the NetTLP adapter on the adapter
host by 10Gbps Ethernet link, like shown in Figure 1. The
NetTLP adapter always sends light during board power is on,
so that only ip link set dev ethX
up
is required to bring the link up.
The next step is to setup a static ARP entry. The current implementation of NetTLP adapter (v0.17) does not respond to ARP requests. Thus, the device host needs to have a static ARP entry for the connected NetTLP adapter.
Note: default MAC address and IP address of current NetTLP adapter are 00:11:22:33:44:55 and 192.168.10.1. The NetTLP adapter uses FF:FF:FF:FF:FF:FF and 192.168.10.3 as default destination addresses for encapsulation. They can be changed. These addresses are used for only TLP encapsulation and decapsulation, not used for the network stack of the adapter host. Thus, this address (192.168.10.1 as default) does not respond to any packets except NetTLP packets.
In this setup, we use 192.168.10.3/24 for the device host according to the default IP address of the adapter.
# eth1 is an exmaple name of the 10Gbps port connected to NetTLP adapter
$ sudo ip link set dev eth1 up
$ sudo ip addr add dev eth1 192.168.10.3/24
$ sudo ip nei add 192.168.10.1 dev eth1 lladdr 00:11:22:33:44:55
$ ip nei show 192.168.10.1
192.168.10.1 dev eth1 lladdr 00:11:22:33:44:55 PERMANENT
Compile LibTLP
LibTLP works on the device host. Download and compile it from https://github.com/nettlp/libtlp. The LibTLP repository contains a shared library called libtlp and sample applications such as dma_read, pmem, and tlpperf.
$ mkdir nettlp
$ cd nettlp
$ git clone https://github.com/nettlp/libtlp
Cloning into 'libtlp'...
Username for 'https://github.com': upa
Password for 'https://upa@github.com':
remote: Enumerating objects: 300, done.
remote: Counting objects: 100% (300/300), done.
remote: Compressing objects: 100% (198/198), done.
remote: Total 300 (delta 184), reused 211 (delta 95), pack-reused 0
Receiving objects: 100% (300/300), 69.79 KiB | 441.00 KiB/s, done.
Resolving deltas: 100% (184/184), done.
$ cd libtlp/
$ ls
Makefile README.md apps include lib test
$ make
make[1]: Entering directory '/home/upa/work/test/nettlp/libtlp/lib'
gcc -I./ -I../include -c libtlp.c
ar rcs libtlp.a libtlp.o
make[1]: Leaving directory '/home/upa/work/test/nettlp/libtlp/lib'
make[1]: Entering directory '/home/upa/work/test/nettlp/libtlp/test'
gcc -g -Wall -I../include -L../lib test_dma_read.c -ltlp -o test_dma_read
gcc -g -Wall -I../include -L../lib test_dma_write.c -ltlp -o test_dma_write
gcc -g -Wall -I../include -L../lib test_msg_bar4.c -ltlp -o test_msg_bar4
gcc -g -Wall -I../include -L../lib test_msg_msix.c -ltlp -o test_msg_msix
gcc -g -Wall -I../include -L../lib test_msg_devid.c -ltlp -o test_msg_devid
make[1]: Leaving directory '/home/upa/work/test/nettlp/libtlp/test'
make[1]: Entering directory '/home/upa/work/test/nettlp/libtlp/apps'
gcc -g -Wall -I../include -L../lib dma_read.c -ltlp -lpthread -o dma_read
gcc -g -Wall -I../include -L../lib dma_write.c -ltlp -lpthread -o dma_write
gcc -g -Wall -I../include -L../lib pmem.c -ltlp -lpthread -o pmem
gcc -g -Wall -I../include -L../lib process-list.c -ltlp -lpthread -o process-list
gcc -g -Wall -I../include -L../lib pgd-walk.c -ltlp -lpthread -o pgd-walk
gcc -g -Wall -I../include -L../lib codedump.c -ltlp -lpthread -o codedump
gcc -g -Wall -I../include -L../lib tlpperf.c -ltlp -lpthread -o tlpperf
make[1]: Leaving directory '/home/upa/work/test/nettlp/libtlp/apps'
$ ls apps/
Makefile dma_read pgd-walk process-list tlpperf.c
README.md dma_read.c pgd-walk.c process-list.c util.h
codedump dma_write pmem thread_affinity_apple.h
codedump.c dma_write.c pmem.c tlpperf
Next section shows how to play example LibTLP applications on this environment.