Created new floder and sequence diagram.
Addressed a few comments and added a sequence diagram as requested. Signed-off-by: nvinnakota <[email protected]>
1 parent 4845c1a commit 00efdab
Showing 2 changed files with 143 additions and 0 deletions.
@@ -0,0 +1,143 @@

# Overview
- User space networking is a feature in which the packet path runs from the VM to a dataplane running in user space, such as OVS-DPDK, bypassing the kernel.
- With the vhostuser interface, a KubeVirt VM can use the DPDK virtio functionality and avoid the kernel path for its traffic when DPDK is involved. This enables a fast datapath and improves performance.

## Motivation
- Every KubeVirt VM uses Linux kernel interfaces to connect to its host, to reach external networks, and to communicate within the cluster. Current KubeVirt VMs support only the kernel datapath, so traffic must pass through the kernel of the node hosting the VM.

- As a result, on a host whose DPDK dataplane runs in user space, traffic from a KubeVirt VM passes through the kernel before it reaches the dataplane. This longer path slows down traffic between the application and the dataplane even though both run in user space.

- To improve traffic performance and make use of DPDK, we need a different path that bypasses the kernel and reaches the dataplane directly from user space. This so-called fast path runs entirely within user space and allows fast packet processing.

## Goals
- A KubeVirt VM created with DPDK features should be able to use the fast datapath if the host supports a DPDK dataplane, allowing fast packet processing and better traffic performance.

## Non Goals
- User space networking requires the host to have a DPDK dataplane.
- kubevirtci creates hosts with a kernel-mode dataplane for testing VMs. It needs to be updated to support DPDK.

## Definition of Users
- Any user who has a Kubernetes (k8s) cluster with a DPDK dataplane can use the interface/feature.
- The feature will be supported only if a Multus userspace CNI is in use, for instance the Userspace CNI from Intel.

## User Stories
- As a user/admin, I want to use the user space networking feature when I have a VM that supports DPDK and a host that has a DPDK dataplane and a userspace CNI with Multus support.

## Repos

- kubevirt/kubevirt
  - Introduction of the vhostuser interface type.
- kubevirt/kubevirtci
  - Enable usage of DPDK on the setups created by kubevirtci.

# Design
- The design introduces a vhostuser interface type and the parameters required to support it. An EmptyDir volume (shared-dir) and a DownwardAPI volume (podinfo) are mounted on the virt-launcher pod to support the new interface.
- These mounts allow the virt-launcher pod to create a VM with an additional interface that can be reached through the vhost socket named in the DownwardAPI volume.

- Creating the VMs follows the same process as before, with a few changes highlighted below:
1. **Once the VM spec is created, virt-controller will add two new volumes to the virt-launcher pod.**\
   a. **An EmptyDir volume named `shared-dir`, mounted at `/var/lib/cni/usrcni/`, is used to share the vhostuser socket file between the virt-launcher pod and the DPDK dataplane. This socket file acts as a UNIX socket between the VM interface and the dataplane running in user space.**\
   b. **A DownwardAPI volume named `podinfo` is used to share the vhostuser socket file name with virt-launcher via pod annotations.**
2. **The CNI will mount the shared-dir, i.e. the EmptyDir volume's host path `/var/lib/kubelet/pods/<podID>/volumes/kubernetes.io~empty-dir/shared-dir`. This directory will be deleted in order to remove the vhostuser interface when the VM and/or the virt-launcher pod is deleted.**
3. **The CNI will update the virt-launcher pod annotations with the vhost socket file name and details.**
4. **The virt-launcher reads the DownwardAPI volume, retrieves the vhost socket file name specified in the pod annotations, and uses it to create a UNIX socket in the VM while launching the VM with libvirtd.**
5. **The virt-launcher is modified to skip establishing the bridge-based networking between the VM and the virt-launcher pod (see the KubeVirt Networking section in https://kubevirt.io/2018/KubeVirt-Network-Deep-Dive.html).**
6. **Instead of using the bridge through the launcher pod, the vhostuser interface of the VM will be connected directly to the DPDK dataplane using the vhost socket.**

- With the above process, a KubeVirt VM with a vhostuser interface can be created. All the highlighted steps are changes made as part of this design.

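The two volumes from step 1 could look roughly like the fragment below in the generated virt-launcher pod. This is a minimal sketch, not the actual pod template: the volume names and the `/var/lib/cni/usrcni/` path come from the steps above, while the pod name, container name, image, `/etc/podinfo` mount path, and `annotations` file name are assumptions chosen for illustration.

```yaml
# Illustrative fragment of a virt-launcher pod; not the actual generated template.
apiVersion: v1
kind: Pod
metadata:
  name: virt-launcher-example                      # hypothetical name
spec:
  containers:
    - name: compute                                # illustrative container name
      image: example.invalid/virt-launcher:latest  # placeholder image
      volumeMounts:
        # Directory in which the CNI places the vhostuser socket file.
        - name: shared-dir
          mountPath: /var/lib/cni/usrcni/
        # Pod annotations exposed as a file (mount path is an assumption).
        - name: podinfo
          mountPath: /etc/podinfo
  volumes:
    - name: shared-dir
      emptyDir: {}
    - name: podinfo
      downwardAPI:
        items:
          - path: annotations
            fieldRef:
              fieldPath: metadata.annotations
```

Under these assumptions, virt-launcher would read `/etc/podinfo/annotations` to learn the socket file name and then look for that file under `/var/lib/cni/usrcni/`.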
## API Examples

KubeVirt will always explicitly define the pod interface name for multus-cni. The name will be computed from the VMI spec interface name, which allows multiple connections to the same Multus-provided network.

The vhostuser interface will be defined in the VM spec as shown below:

```yaml
devices:
  interfaces:
    - name: vhost-user-vn-blue
      vhostuser: {}
  useVirtioTransitional: true
networks:
  - name: vhost-user-vn-blue
    multus:
      networkName: vn-blue
```
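For context, the fragment above might sit inside a complete VirtualMachineInstance as sketched below. The `vhostuser` binding is the interface type proposed in this document and does not exist in released KubeVirt APIs. The object name, memory size, and hugepages settings are assumptions: vhost-user backends generally need access to guest memory shared with QEMU, which is commonly provided through hugepages, but this proposal does not state the requirement. Disks and volumes are omitted for brevity.

```yaml
# Illustrative VirtualMachineInstance; `vhostuser` is the interface type proposed here.
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: vmi-vhostuser-example      # hypothetical name
spec:
  domain:
    memory:
      hugepages:
        pageSize: 1Gi              # assumption: hugepage-backed guest memory
    resources:
      requests:
        memory: 2Gi                # placeholder size
    devices:
      interfaces:
        - name: vhost-user-vn-blue
          vhostuser: {}
      useVirtioTransitional: true
  networks:
    - name: vhost-user-vn-blue
      multus:
        networkName: vn-blue
```

The interface and network entries are matched by `name`, and `networkName` refers to a NetworkAttachmentDefinition such as the one shown in the Annex.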
## Scalability
- No scalability issues are expected, as the feature only adds an interface where it is useful.
- VM migration works the same way, since the only change is a new interface, which only requires the host to run a dataplane in user space.
## Update/Rollback Compatibility
- Should have no impact.
## Functional Testing Approach
Functional tests can:
- Create a VM with a vhostuser interface
- Create a VM with multiple interfaces
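A spec for the multi-interface test could combine the default pod network with a vhostuser interface, as in the sketch below. The `masquerade` and `pod` entries are existing KubeVirt bindings, `vhostuser` is the type proposed in this document, and the network names are reused from the API example as placeholders.

```yaml
# Fragment of a VMI spec for a multi-interface functional test (illustrative).
spec:
  domain:
    devices:
      interfaces:
        - name: default
          masquerade: {}           # existing kernel-path binding on the pod network
        - name: vhost-user-vn-blue
          vhostuser: {}            # interface type proposed in this document
  networks:
    - name: default
      pod: {}
    - name: vhost-user-vn-blue
      multus:
        networkName: vn-blue
```

A test built on this spec could check that the guest sees both interfaces and that only the second one is backed by the vhost socket.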
# Implementation Phases
- The design will be implemented in two phases:
  - Add DPDK support in kubevirt/kubevirtci
    - Enable OVS with DPDK
    - Enable an option in gocli for DPDK support
    - Add the Userspace CNI to the setup instead of or along with Calico.
  - Add the vhostuser interface type in kubevirt/kubevirt
    - Add the vhostuser interface type
    - Create a pod template to support the interface
    - Create the appropriate virsh XML elements
    - Add an E2E test that sends traffic between the two VMs created.
## Annex
The NetworkAttachmentDefinition (NAD) can be generic and used only to define the networks. Below is a NAD definition from the Userspace CNI by Intel, based on OVS-DPDK.
```yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: userspace-ovs-net-1
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "userspace",
      "name": "userspace-ovs-net-1",
      "kubeconfig": "/etc/kubernetes/cni/net.d/multus.d/multus.kubeconfig",
      "logFile": "/var/log/userspace-ovs-net-1-cni.log",
      "logLevel": "debug",
      "host": {
        "engine": "ovs-dpdk",
        "iftype": "vhostuser",
        "netType": "bridge",
        "vhost": {
          "mode": "client"
        },
        "bridge": {
          "bridgeName": "br-dpdk0"
        }
      },
      "container": {
        "engine": "ovs-dpdk",
        "iftype": "vhostuser",
        "netType": "interface",
        "vhost": {
          "mode": "server"
        }
      },
      "ipam": {
        "type": "host-local",
        "subnet": "10.56.217.0/24",
        "rangeStart": "10.56.217.131",
        "rangeEnd": "10.56.217.190",
        "routes": [
          {
            "dst": "0.0.0.0/0"
          }
        ],
        "gateway": "10.56.217.1"
      }
    }'
```