Introduction

Overview

Oxide Computer is an efficient, rack-scale computing system complete with hardware and software. It offers many of the same developer-productivity benefits as hyperscale cloud infrastructure, but at a much lower long-term infrastructure cost. The product is intended for on-premises deployments, and the unit of purchase is a fully integrated rack of systems.

The rack consists of several major components:

  • Up to 32 "Gimlet" Compute Sleds, each featuring:

    • One 225W TDP 64-core AMD Milan CPU

    • Support for up to 1 TiB of DRAM across 16 DIMMs (2 DIMMs per channel)

    • 8-12 front-facing hot-swappable PCIe Gen 4 U.2 storage devices

    • 2 internal M.2 devices

    • 2 ports of 100 GbE networking

    • Root of Trust (RoT) and Service Processor (SP)

  • Two "Sidecar" Rack Switches, featuring:

    • Based on Intel’s Tofino 2 ASIC, which supports 64 ports of 200 GbE

    • 32 front-facing QSFP28 ports for uplink and inter-rack connectivity

    • 32 rear-facing ports that are blind-mated to servers

    • Integrated management network ASIC

    • Root of Trust (RoT) and Service Processor (SP)

    • Termination of presence and power control auxiliary signals

    • Connects to a server over PCIe for Tofino 2 ASIC management

  • An Integrated AC→DC Power Shelf:

    • Able to deliver 15 kW of power to the rack at 54.5 V

    • Supports 6 rectifiers in N+1 or N+N configuration

    • Contains an Oxide Power Shelf Controller (PSC)

      • Provides telemetry and management for the power shelf

      • Root of Trust (RoT) and Service Processor (SP)

  • A custom cabled backplane:

    • Allows all servers to blind-mate all Ethernet, power, and auxiliary signaling

    • With planned support for up to 200GBASE-CR4

  • A fiber patch panel

  • An integrated control plane:

    • Comes with an API, a portal, and various SDKs through which operators and developers can provision compute infrastructure (virtual machines), storage (elastic block storage), and networking (VPC-like networking capabilities); a short provisioning sketch follows this list

    • Provides operators with visibility into major rack components, including software version, status, health, and other useful metrics
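
As a rough sketch of what self-service provisioning through the API could look like, consider the following. The base URL, route shape, authentication scheme, and field names here are assumptions made for illustration, not the authoritative Oxide API reference:

```rust
// Hypothetical sketch: creating a VM through the rack's HTTP API.
// Endpoint path, fields, and token handling are illustrative assumptions.
// (Uses the `reqwest` crate with its `blocking` and `json` features,
// plus `serde_json`.)
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let rack = "https://oxide.example.com"; // hypothetical rack address
    let token = std::env::var("OXIDE_TOKEN")?; // hypothetical API token

    // Request a 4-vCPU, 16 GiB instance in a hypothetical "demo" project.
    let body = json!({
        "name": "web-01",
        "description": "example instance",
        "ncpus": 4,
        "memory": 16u64 * 1024 * 1024 * 1024, // bytes
        "hostname": "web-01"
    });

    let resp = reqwest::blocking::Client::new()
        .post(format!("{rack}/v1/instances?project=demo"))
        .bearer_auth(&token)
        .json(&body)
        .send()?
        .error_for_status()?;

    println!("created: {}", resp.text()?);
    Ok(())
}
```

The same operation is available through the portal and SDKs; the point is that compute, storage, and networking are all driven by one API surface.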

Racks will eventually be connected to form a single pool of infrastructure (from a management and user perspective). Support for this is not yet available but is a key priority on the product roadmap.

Feature Highlights

Device Placement and Cabling

The rack form factor is based on the OCP Open Rack version 3 (ORV3). It uses a cubby design for the physical housing of compute servers. Each cubby consists of two side-by-side bays and occupies 2 Open U (96 mm) in height. These cubbies provide a fixed-pitch mechanical framework for servers, allowing the units to be hot-plugged and unplugged during maintenance.

To support full serviceability from the front (cold aisle), the rack's cable management design includes a trench running under each cubby to route any rear-facing cabling from back to front.

Power Surface and Distribution

For optimal efficiency, power is distributed throughout the rack from a single power shelf via a low-voltage DC bus bar pair, with maximum power consumption kept below 18 kW.
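
As a back-of-the-envelope check on what these figures imply, here is a short sketch using the 15 kW, 54.5 V, and 6-rectifier numbers from the component list above; the calculation itself is just power-law arithmetic, not an Oxide specification:

```rust
// Rough arithmetic implied by the power shelf figures above.
fn main() {
    let shelf_output_w = 15_000.0; // 15 kW delivered to the rack
    let bus_voltage_v = 54.5;      // low-voltage DC bus bar pair

    // Current the bus bar pair must carry at full shelf output: I = P / V.
    let bus_current_a = shelf_output_w / bus_voltage_v;
    println!("bus current at full load: {bus_current_a:.0} A"); // ~275 A

    // In an N+1 configuration, the remaining 5 of 6 rectifiers must be
    // able to carry full load when one fails.
    let per_rectifier_w = shelf_output_w / 5.0;
    println!("minimum per-rectifier rating (N+1): {per_rectifier_w:.0} W"); // 3000 W
}
```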

The out-of-band management interface for the power shelf is an RJ-45 Ethernet port on each of the two switches, allowing access from the network. The ports, located on the front panel, connect to the uplinks over Cat 6 cables running 1000BASE-T (or equivalent).

Compute Servers

Each compute server, also known as a “sled”, uses a single AMD Milan processor package. The processor choice is based on factors such as performance, power and thermal characteristics, price, and susceptibility to known security vulnerabilities, as well as other architectural and operational considerations. The choice of NVMe storage and NIC is likewise based on rigorous evaluation of many of the same factors.

Every Oxide system board includes a hardware root of trust and an embedded service processor in place of the baseboard management controller (BMC) design found in traditional servers.

Backplane

The Identity and Presence Backplane (IPB) comprises a series of small PCBAs and cabling that allow each server to identify its location within the rack, and allow a central management server to determine the number of cubbies in the rack and the current population status of each.
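
As a small illustration of the kind of information the IPB exposes, the sketch below models cubby occupancy as the control plane might see it; the type and field names are invented for this sketch, not Oxide's:

```rust
// Illustrative model of IPB-derived rack topology: which cubbies exist
// and whether each of their two bays is populated.
#[derive(Clone, Copy, PartialEq)]
enum Bay {
    Empty,
    Populated,
}

struct Cubby {
    index: u8,      // cubby position in the rack, learned via the IPB
    bays: [Bay; 2], // two side-by-side bays per cubby
}

fn sleds_present(cubbies: &[Cubby]) -> usize {
    cubbies
        .iter()
        .flat_map(|c| c.bays.iter())
        .filter(|&&b| b == Bay::Populated)
        .count()
}

fn main() {
    let rack = [
        Cubby { index: 0, bays: [Bay::Populated, Bay::Populated] },
        Cubby { index: 1, bays: [Bay::Populated, Bay::Empty] },
    ];
    println!("{} sleds present", sleds_present(&rack));
}
```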

Virtual Machine and Network Management

VM instance and networking features are designed to provide a self-service experience similar to that of public clouds. Virtual machines support unicast traffic, with emulated broadcast/multicast for ARP/NDP. Useful defaults are provided for getting started quickly, while full customization is available for more advanced workload deployments.

Each project is given its own notion of a Virtual Private Cloud (VPC) for address control, isolation, and traffic control. Each VPC is built on top of Geneve, which provides UDP encapsulation with custom headers.
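
To make the encapsulation concrete, here is a minimal construction of the 8-byte Geneve base header defined in RFC 8926. This shows the wire format the RFC specifies, not Oxide's actual implementation; the 24-bit VNI is what keeps one VPC's traffic isolated from another's:

```rust
// Minimal construction of the 8-byte Geneve base header (RFC 8926).
fn geneve_header(vni: u32, protocol: u16) -> [u8; 8] {
    let mut h = [0u8; 8];
    h[0] = 0; // version = 0 (2 bits), option length = 0 (6 bits)
    h[1] = 0; // control and critical flags clear
    h[2..4].copy_from_slice(&protocol.to_be_bytes()); // inner protocol type
    h[4..7].copy_from_slice(&vni.to_be_bytes()[1..4]); // 24-bit VNI
    h[7] = 0; // reserved
    h
}

fn main() {
    // Geneve runs over UDP destination port 6081; protocol type 0x6558
    // marks an encapsulated Ethernet frame.
    let hdr = geneve_header(0x00ABCD, 0x6558);
    println!("{hdr:02x?}");
}
```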

Oxide Rack uses Delay Driven Multipath (DDM) for routing internally but interfaces seamlessly with common networking gear. Boundary services leverage the Tofino 2 programmable ASIC and P4 functionality: packets are encapsulated on the way into the rack and decapsulated on the way out. This allows the rack to adapt to different network environments.

The Oxide Packet Transformation Engine (OPTE) sits between virtual machines and physical interfaces, providing core functionality such as firewalling, routing, NAT, encapsulation, and decapsulation. Each sled connects to both switches in the rack, with multipath routing providing high availability.
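
The sketch below illustrates the layered-transformation idea in miniature: a packet passes through an ordered set of layers, each of which may rewrite or drop it. The layer names, types, and rules are invented for illustration and are not OPTE's actual API:

```rust
// Conceptual sketch of a layered packet-transformation pipeline, in the
// spirit of what OPTE does between a VM's virtual NIC and the physical port.
struct Packet {
    src: [u8; 4], // IPv4 source address
    dst: [u8; 4], // IPv4 destination address
}

enum Verdict {
    Allow(Packet),
    Drop,
}

trait Layer {
    fn process(&self, pkt: Packet) -> Verdict;
}

struct Firewall;
struct Nat {
    public: [u8; 4],
}

impl Layer for Firewall {
    fn process(&self, pkt: Packet) -> Verdict {
        // Toy rule: drop anything destined for 10.0.0.0/8.
        if pkt.dst[0] == 10 { Verdict::Drop } else { Verdict::Allow(pkt) }
    }
}

impl Layer for Nat {
    fn process(&self, mut pkt: Packet) -> Verdict {
        pkt.src = self.public; // rewrite the source to a public address
        Verdict::Allow(pkt)
    }
}

// Run the packet through each layer in order, stopping on a drop.
fn run(layers: &[&dyn Layer], mut pkt: Packet) -> Verdict {
    for layer in layers {
        match layer.process(pkt) {
            Verdict::Allow(p) => pkt = p,
            Verdict::Drop => return Verdict::Drop,
        }
    }
    Verdict::Allow(pkt)
}

fn main() {
    let layers: [&dyn Layer; 2] = [&Firewall, &Nat { public: [198, 51, 100, 7] }];
    let pkt = Packet { src: [172, 30, 0, 5], dst: [93, 184, 216, 34] };
    match run(&layers, pkt) {
        Verdict::Allow(p) => println!("egress with src {:?}", p.src),
        Verdict::Drop => println!("dropped"),
    }
}
```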
