Network Preparations

This document describes how an Oxide rack is connected to a broader network. The following is a high-level checklist of things to consider when planning for an Oxide rack.

  • Connect between two and four RJ45-based Ethernet cables to the rack for management, with at least one connection for each switch.

  • The management network requires IPv6.

  • Decide what type of transceiver to use for the data network from the supported transceivers list.

  • Set up management and data network firewall rules.

  • Plan out a set of names and addresses to assign to the rack.

  • Plan a routing strategy between the rack and the broader network.

This guide covers the basics of physical network setup, along with guidance on setting up the management and data networks, to help inform these considerations.

Physical Setup

The Oxide rack has two middle-of-rack switches; one is shown below.

Rack Switches

Each switch has 32 QSFP-compatible ports that can operate at either 40 Gbit/s (QSFP), 100 Gbit/s (QSFP28), or 200 Gbit/s (QSFP56). There are also two RJ45 ports that will be referred to as technician ports. The technician ports are 10/100/1000BASE-T capable.

Note
This guide will describe rack networking in terms of the physical ports on these switches. The same concepts apply to the rack fiber tray by using the port map printed inside the door of the tray.

The following table shows supported QSFP transceiver types.

Supported Optical Transceivers
  Type               Fiber         Strands
  40GBASE-LR4        Single-Mode   2 (LC)
  100GBASE-CWDM4     Single-Mode   2 (LC)
  100GBASE-LR4       Single-Mode   2 (LC)
  100GBASE-SR4       Multi-Mode    8 (MPO/MTP)
  100GBASE-SR-BiDi   Multi-Mode    2 (LC)
  100GBASE-FR1       Single-Mode   2 (LC)
  200GBASE-FR4       Single-Mode   2 (LC)

Management Network

The management network is how the rack is first set up and is accessible through the technician ports. The management network provides access to:

  • A terminal UI program called wicket that allows administrators to:

    • Provide an initial configuration for the rack and bring the system online.

    • Perform out-of-band system updates.

  • A support shell that allows Oxide support engineers to troubleshoot low-level system issues.

The details of using these programs are covered in the Initial Rack Setup Guide. This guide will focus on the network aspects of technician ports.

Addressing

Technician ports send periodic IPv6 SLAAC router advertisements every 30 seconds. Any device plugged into a technician port will receive these advertisements.

Note
Interfaces plugged into technician ports must be configured for IPv6 autoconfiguration (SLAAC).

When the device plugged into the technician port receives a SLAAC advertisement, it will auto-assign an address on the IPv6 network advertised by the rack’s technician port. For example, if the technician port advertises the prefix

fdb1:a840:2504:195::/64

the connected device will assign an address in that space, typically based on the MAC address of the connected interface, following EUI-64 conventions. For example, if the connected interface has a MAC address of 2:8:20:36:5c:8d, the resulting self-assigned IPv6 address on the advertised technician port prefix would be the following.

fdb1:a840:2504:195:8:20ff:fe36:5c8d/64
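The derivation just shown can be reproduced in a few lines of Python using only the standard library. This is a sketch; the eui64_address helper is illustrative, not part of any Oxide tooling.

```python
import ipaddress

def eui64_address(prefix: str, mac: str) -> str:
    """Derive the SLAAC address a device self-assigns from an
    advertised /64 prefix and its interface MAC (EUI-64 rules)."""
    octets = [int(b, 16) for b in mac.split(":")]
    octets[0] ^= 0x02                              # flip the universal/local bit
    iid = octets[:3] + [0xFF, 0xFE] + octets[3:]   # insert ff:fe mid-MAC
    net = ipaddress.IPv6Network(prefix)
    return str(net[int.from_bytes(bytes(iid), "big")])

print(eui64_address("fdb1:a840:2504:195::/64", "2:8:20:36:5c:8d"))
# → fdb1:a840:2504:195:8:20ff:fe36:5c8d
```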

The technician port assigns the first address in this range to itself. So for the example prefix above, you can reach services provided by the rack over the technician port at the following address.

fdb1:a840:2504:195::1

Each technician port advertises a distinct IPv6 /64 prefix.

Firewall Considerations

To access the services provided over the management network, TCP port 22 (SSH) must be open. Both the wicket program and the support shell are accessed over SSH.

Data Network

The data network is accessible through the rack switch QSFP ports. The data network provides connectivity between services and instances running inside the rack and the broader network the rack is a part of. Services running inside the rack include:

  • The Oxide API

  • Per-switch networking daemons such as BGP and BFD.

  • DNS servers that provide name resolution for services running in the rack.

In order for the rack to function correctly, it needs access to a few services on the broader network, including the following.

  • NTP servers

  • Upstream DNS servers

Thinking at Layer 3

An important concept to highlight is that while we refer to the components in the center of the rack as switches, they are really more like routers. There is no common broadcast domain shared between any of the ports. When thinking about how an Oxide rack will integrate with a broader network, think about the switches as L3 edge routers. Any populated port on the switch will need an egress route assigned to it to forward packets into the broader network. Similarly for ingress traffic, the rack will not respond to ARP or NDP requests for any IP pool addresses it has been assigned. The rack switches must be assigned a gateway address that the broader network can use to direct off-subnet traffic into the rack. The switches will of course respond to ARP and NDP requests for gateway addresses assigned to them.

Initial Setup

The way initial communication paths are set up between the broader network and the rack is through an initial configuration file. This configuration file is handled by the wicket setup program as described in the Initial Rack Setup Guide. This guide will focus on the networking details in that initial configuration. We’ll go through the configuration section-by-section and then provide a complete overview at the end.

Broader Network Services

The first part of the initial setup config tells the rack how to access the services it needs on the broader network. Here we are telling the rack that it can use 1.1.1.1 and 9.9.9.9 as upstream DNS servers, and "ntp.acme.com" as a time source.

In the examples that follow, IPv4 is used; however, IPv6 is also supported. The value provided to ntp_servers may be a DNS name or an IP address. For the time being, there is a limit of three DNS servers and three NTP servers. The upstream DNS servers must be recursive resolvers and must be specified as IP addresses. In addition to rack infrastructure, end user instances are given these DNS servers via DHCP options.

dns_servers = [
  "1.1.1.1",
  "9.9.9.9",
]

ntp_servers = [
  "ntp.acme.com",
]
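The constraints above (at most three servers of each type, and DNS servers given as IP literals) can be checked mechanically. A minimal sketch in Python; the validate helper and its error message are illustrative, not part of any Oxide tooling.

```python
import ipaddress

def validate(dns_servers, ntp_servers):
    """Check the documented constraints on upstream server lists:
    at most 3 entries each, and DNS servers must be IP addresses."""
    if len(dns_servers) > 3 or len(ntp_servers) > 3:
        raise ValueError("at most 3 DNS and 3 NTP servers are supported")
    for s in dns_servers:
        ipaddress.ip_address(s)  # raises ValueError if not an IP literal
    # NTP servers may be DNS names or IP addresses, so no format check.

validate(["1.1.1.1", "9.9.9.9"], ["ntp.acme.com"])  # passes silently
```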

Assignment of Names and Numbers to the Rack

The DNS names and IP addresses assigned to the rack from the broader network include the following.

  • A DNS domain with a subdomain for each Silo.

  • A set of IP addresses for routing between rack switches and the broader network.

  • A set of IP addresses for rack-hosted DNS servers.

  • A set of IP addresses for the Oxide API server.

  • A set of IP addresses for end-user instances.

In this example, the DNS name cloud.acme.com is assigned to the rack. The DNS servers for the broader network infrastructure will need to delegate cloud.acme.com to the IP addresses described below and use glue records to forward DNS requests to the rack-hosted DNS servers.

The IP addresses that will be used by the rack to host DNS servers are set as 172.20.26.1 and 172.20.26.2. These are just example addresses, and we’ll generally use addresses from this subnet for the rest of this example. The only limitation on these addresses is that there must be at least two provided. Once the rack control plane is up, these addresses will respond to DNS queries. Critically, you will be able to resolve the rack recovery address via recovery.sys.cloud.acme.com.

The internal-services IP pool provides the rack with a set of addresses to assign to rack-hosted services such as the Oxide API, DNS, etc. Generally speaking, IP pools are a resource that the rack control plane can dynamically allocate IPs from. In this case, we are defining an IP pool for internal services. IP pools are also used for end-user instances and can be defined using the Oxide API once the rack is initialized. These are addresses from the broader network that are assigned to the rack. It’s recommended to assign at least 16 addresses to the rack for high-availability (HA) setups. A minimal HA setup uses the following addresses:

  • 5 addresses for DNS.

  • 3 addresses for the Oxide API.

  • 2 addresses for boundary NTP daemons.

DNS addresses are specified explicitly in configuration. Other address types are allocated dynamically from the provided IP pool. Oxide API addresses are discoverable via the external DNS servers by querying for records of the form:

.sys.cloud.acme.com

On initialization the rack automatically sets up the recovery silo.

external_dns_zone_name = "cloud.acme.com"

external_dns_ips = [
  "172.20.26.1",
  "172.20.26.2",
]

internal_services_ip_pool_ranges = [
  { first = "172.20.26.1", last = "172.20.26.16" }
]
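As a quick sanity check, the example pool can be sized against the minimal HA footprint listed above. The arithmetic mirrors the text; this is a sketch, not Oxide tooling.

```python
import ipaddress

# The internal-services pool from the example configuration.
first = ipaddress.ip_address("172.20.26.1")
last = ipaddress.ip_address("172.20.26.16")
pool_size = int(last) - int(first) + 1   # the range is inclusive

# Minimal HA footprint from the text: 5 DNS + 3 API + 2 boundary NTP.
ha_minimum = 5 + 3 + 2
print(pool_size)  # 16, meeting the recommended minimum of 16 addresses
assert pool_size >= ha_minimum
```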

Rack Switch Configuration with Static Routes

The final bit of initial rack configuration relevant to networking is switch configuration. This configuration sets up the routes and addresses on the rack switches that are needed for services and instances within the rack to communicate with the broader network.

The addresses infra_ip_first and infra_ip_last at the beginning of the configuration define a range of addresses that may be assigned to rack switches. Each address in this range may be assigned exactly once: an attempt to assign the same address to multiple switches, or to multiple ports on the same switch, will result in an error. This constraint may be relaxed in a later release when anycast addresses are supported. The range is inclusive, meaning the first and last addresses are both part of it.

Next, an uplink port is configured for each rack switch. In this example one uplink is configured per switch. However, there is no limit to the number of uplinks that may be configured here.

Each uplink configuration includes the following.

  • routes: static routes for this uplink. In the examples below, a single default route (destination 0.0.0.0/0) points at the upstream router (nexthop) that provides off-subnet communications for the rack.

  • addresses: the IP address and subnet mask, in CIDR format, to assign to this port. These addresses must be pulled from the infra_ip address range.

  • port: specifies which switch port this configuration applies to. The ports on the switch are physically labeled with a number. In this configuration that number is prefixed with "qsfp".

  • uplink_port_speed: the speed of the transceiver module plugged into the QSFP port.

  • uplink_port_fec: the forward error correction mode to be used for the port. This can currently be rs for Reed-Solomon or none.

  • switch: which rack switch this configuration applies to; may be either switch0 or switch1.

[rack_network_config]
infra_ip_first = "172.20.15.21"
infra_ip_last = "172.20.15.22"

[[rack_network_config.ports]]
routes = [{nexthop = "172.20.15.17", destination = "0.0.0.0/0"}]
addresses = ["172.20.15.21/29"]
port = "qsfp0"
uplink_port_speed = "100G"
uplink_port_fec = "rs"
bgp_peers = []
switch = "switch0"

[[rack_network_config.ports]]
routes = [{nexthop = "172.20.15.17", destination = "0.0.0.0/0"}]
addresses = ["172.20.15.22/29"]
port = "qsfp0"
uplink_port_speed = "100G"
uplink_port_fec = "none"
bgp_peers = []
switch = "switch1"

Complete Configuration

The following is all of the above configuration in one place.

#
# Broader network services
#

dns_servers = [
  "1.1.1.1",
  "9.9.9.9",
]

ntp_servers = [
  "ntp.acme.com",
]

#
# Assign names and numbers to the rack
#

external_dns_zone_name = "cloud.acme.com"

external_dns_ips = [
  "172.20.26.1",
  "172.20.26.2",
]

internal_services_ip_pool_ranges = [
  { first = "172.20.26.1", last = "172.20.26.16" }
]

#
# Configure rack switches
#

[rack_network_config]
infra_ip_first = "172.20.15.21"
infra_ip_last = "172.20.15.22"
bgp = []

[[rack_network_config.ports]]
routes = [{nexthop = "172.20.15.17", destination = "0.0.0.0/0"}]
addresses = ["172.20.15.21/29"]
port = "qsfp0"
uplink_port_speed = "100G"
uplink_port_fec = "rs"
bgp_peers = []
switch = "switch0"

[[rack_network_config.ports]]
routes = [{nexthop = "172.20.15.17", destination = "0.0.0.0/0"}]
addresses = ["172.20.15.22/29"]
port = "qsfp0"
uplink_port_speed = "100G"
uplink_port_fec = "none"
bgp_peers = []
switch = "switch1"

Rack Switch Configuration with BGP

Setting up BGP as a part of rack setup requires supplying two types of information.

  1. A set of BGP router configurations must be specified as a part of the rack_network_config.

  2. Each port that peering will take place over must have a BGP peer config for each neighbor.

The BGP router config below configures a router with an autonomous system number of 47. This router will announce the prefix 172.20.26.0/24 to any peers it establishes BGP sessions with.

[[rack_network_config.bgp]]
asn = 47
originate = [ "172.20.26.0/24" ]

The port configurations that follow are a direct translation of the previous static routing configuration to BGP. Here the routes field is empty and the bgp_peers field is filled in. Because each rack switch can run multiple BGP routers on different ASNs, each peer must specify which ASN it belongs to. Each peer configuration also specifies the address of the neighbor it expects to peer with.

Note
The port field of a BGP peer is redundant with the port field of rack_network_config.ports and will be removed in a future release.

[[rack_network_config.ports]]
routes = []
addresses = ["172.20.15.21/29"]
port = "qsfp0"
uplink_port_speed = "100G"
uplink_port_fec = "rs"
bgp_peers = [{asn = 47, addr = "172.20.15.17", port = "qsfp0"}]
switch = "switch0"

[[rack_network_config.ports]]
routes = []
addresses = ["172.20.15.22/29"]
port = "qsfp0"
uplink_port_speed = "100G"
uplink_port_fec = "none"
bgp_peers = [{asn = 47, addr = "172.20.15.17", port = "qsfp0"}]
switch = "switch1"
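Two consistency checks fall out of this structure: every peer's ASN should match a configured BGP router, and the originated prefix should cover the addresses the rack hands out to its services. A sketch in Python using the example values; these checks are illustrative, not Oxide tooling.

```python
import ipaddress

# Values transcribed from the example BGP configuration.
routers = [{"asn": 47, "originate": ["172.20.26.0/24"]}]
peers = [
    {"asn": 47, "addr": "172.20.15.17"},  # switch0, qsfp0
    {"asn": 47, "addr": "172.20.15.17"},  # switch1, qsfp0
]

# Every peer must reference an ASN that has a router configured.
known_asns = {r["asn"] for r in routers}
assert all(p["asn"] in known_asns for p in peers)

# The announced prefix should cover the internal-services pool so the
# broader network learns a route to rack-hosted services.
originated = [ipaddress.ip_network(n) for r in routers for n in r["originate"]]
pool_first = ipaddress.ip_address("172.20.26.1")
pool_last = ipaddress.ip_address("172.20.26.16")
assert any(pool_first in n and pool_last in n for n in originated)
print("BGP config consistent")
```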

Additional configuration such as timeouts, filters, and MD5 authentication keys can be included in the initial setup or added at a later time. A complete list of the supported options can be found in the BGP guide.

Beyond Initial Setup

This guide has primarily focused on network considerations for getting the rack up and running. Once the rack is set up, there are additional considerations for transiting traffic to and from VM instances. The Oxide API provides a set of endpoints for managing IP pools. These IP pools are the same basic abstraction as the internal services IP pool covered above. The only difference is the IP pools that are managed through the Oxide API are used to hand out IP addresses to VM instances. The addresses in these pools need to be routed to the rack, and the rack needs to have egress routes set up pointing at appropriate gateways for the address space covered by the IP pool.

There are no restrictions on what IP ranges can be used in IP pools.

The configuration provided during initial rack setup may be changed later through the Oxide API once the rack is up and running, as may other aspects of the network topology.

Firewall Considerations

The following ports are used by the rack and should be made available on the broader network segment the rack is a part of. The direction "in" identifies traffic to the rack from the broader network, "out" identifies traffic from the rack to the broader network, and "both" indicates bidirectional traffic.

Data Network Firewall Ports
  Port   Protocol      Direction   Usage
  443    TCP / HTTPS   in          Oxide rack API
  53     UDP / DNS     both        Name resolution for rack services (out); rack-provided name resolution (in)
  123    UDP / NTP     both        Network time protocol (NTP) message exchange
  179    TCP / BGP     both        Border gateway protocol (BGP) peering and prefix exchange between the rack and broader network routers
  4784   UDP / BFD     both        Bidirectional forwarding detection (BFD) messaging; the Oxide platform uses BFD for multihop paths as described in RFC 5883
  22     TCP / SSH     in          SSH access to instances; not strictly required for rack functionality but likely needed by end users
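For automation, the table can be captured in a machine-readable form. A sketch in Python; the RULES table transcribes the rows above, and the allowed helper is hypothetical, not part of any Oxide tooling.

```python
# Machine-readable rendering of the firewall table above; handy when
# generating firewall rules for the segment hosting the rack.
RULES = {
    (443, "tcp"): "in",     # Oxide rack API
    (53, "udp"): "both",    # DNS
    (123, "udp"): "both",   # NTP
    (179, "tcp"): "both",   # BGP
    (4784, "udp"): "both",  # BFD multihop
    (22, "tcp"): "in",      # SSH to instances
}

def allowed(port: int, proto: str, direction: str) -> bool:
    """Return True if the table permits traffic on (port, proto)
    in the given direction ('in' or 'out')."""
    rule = RULES.get((port, proto.lower()))
    return rule is not None and rule in (direction, "both")

print(allowed(443, "tcp", "in"))   # True: API traffic into the rack
print(allowed(443, "tcp", "out"))  # False: 443 is inbound-only
```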
