Important Notes

  1. IP pools have been reworked for greater flexibility and can now be managed through the web console UI.

    1. In v5, an IP pool was either fleet-scoped (i.e., available to users in all silos) or silo-scoped (i.e., available in one silo). In v6, an IP pool can be linked to any number of silos. This enables configurations that were not possible in v5, such as an IP pool shared by silos A and B but not C.

    2. There is no longer a concept of a fleet-scoped pool: users can only allocate IP addresses from pools explicitly linked to their silo. The behavior of a fleet-scoped pool can be recreated in the new model by linking a pool to every silo individually.

    3. The software update process will set up links between existing pools and silos in a way that preserves v5 behavior:

      1. Formerly fleet-scoped pools will be linked to every silo.

      2. If a formerly fleet-scoped pool was marked “default” for the fleet, it will continue to be the default for each silo unless that silo had its own default pool overriding the fleet-level default.

    4. After setting up a new silo, you will need to link an IP pool to it before users can allocate external IPs.

    5. Please review the updated IP Pool Management guide and API docs and update any API client that manipulates IP pools.

  2. The Oxide CLI, Go SDK, and Terraform Provider have been updated for floating IP attach/detach support, the IP pool changes mentioned above, and other API enhancements. Please ensure you obtain the latest versions.

  3. The NewClient function in the Go SDK no longer requires a user agent and now takes a Config struct instead. You can find more information about this and other changes in the Go SDK changelog.

  4. The latest release of the Oxide CLI produces JSON-formatted output.
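The upgrade rules for IP pools described in note 1 can be illustrated with a small, self-contained model. This is only a sketch of the stated behavior, not the actual migration code: the `V5Pool` class, the `migrate_links` function, and the pool and silo names are all hypothetical, and it assumes that a silo-scoped pool keeps its single silo link and default flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class V5Pool:
    name: str
    silo: Optional[str] = None  # None models a fleet-scoped pool in v5
    is_default: bool = False


def migrate_links(pools, silos):
    """Return {silo: {pool_name: is_default}} following the described rules."""
    links = {silo: {} for silo in silos}
    # Assumption: a silo-scoped pool stays linked to its one silo unchanged.
    for pool in pools:
        if pool.silo is not None:
            links[pool.silo][pool.name] = pool.is_default
    # A formerly fleet-scoped pool is linked to every silo. A fleet default
    # stays the default only where the silo has no default of its own.
    for pool in pools:
        if pool.silo is None:
            for silo in silos:
                silo_has_default = any(links[silo].values())
                links[silo][pool.name] = pool.is_default and not silo_has_default
    return links


pools = [
    V5Pool("fleet-pool", silo=None, is_default=True),
    V5Pool("eng-pool", silo="eng", is_default=True),
]
links = migrate_links(pools, ["eng", "ops"])
print(links["eng"])  # {'eng-pool': True, 'fleet-pool': False}
print(links["ops"])  # {'fleet-pool': True}
```

In this example the "eng" silo keeps its own default, while "ops" inherits the former fleet default, which matches the override rule in note 1.3.2.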

Installation

Oxide Computer Model 0 must be installed and configured under the guidance of Oxide technicians. This requirement may change in future releases.

Upgrade Compatibility

Upgrade from version 5 is supported. We recommend shutting down all running instances on the rack before the software update commences.

All existing setup and data (e.g., projects, users, instances) should remain intact after the software update.

New Features

This release comes with two new features for rack switch failover support, security fixes, and other minor enhancements.

Fault-Tolerant Multipath Connectivity

Each of the two Oxide rack switches (aka “sidecars”) can now be connected to two or more uplinks, allowing rack connectivity to remain available should a single physical switch or uplink fail. For static multipath routing configurations, this is made possible through Bidirectional Forwarding Detection (BFD), provided in this release. New tunnel routing capabilities within the rack’s internal network ensure packets leaving the rack are always routed to a rack switch with sufficient connectivity to forward packets to their final destination. For BGP routing configurations, tunnel routes adapt to changes in BGP routing tables. These features work together to eliminate single points of failure by automatically detecting connectivity issues and redirecting network traffic to functional network paths. (Note: BFD verifies IP connectivity between the source and destination by actively sending control packets and/or passively responding to control packets from neighboring devices.)
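To make the failure-detection idea concrete, the sketch below models BFD-style liveness as generally described in RFC 5880: a peer is considered down once no control packet has arrived within the detection time (detect multiplier times the negotiated interval). It is an illustration only and does not reflect how the rack implements BFD. The `BfdPeer` class, the `usable_uplinks` helper, the uplink names, and the timer values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BfdPeer:
    name: str
    rx_interval_ms: int   # negotiated interval between control packets
    detect_mult: int      # number of missed intervals tolerated
    last_rx_ms: int = 0   # timestamp of the last control packet received

    def detection_time_ms(self) -> int:
        return self.detect_mult * self.rx_interval_ms

    def is_up(self, now_ms: int) -> bool:
        # Down once the silence exceeds or equals the detection time.
        return (now_ms - self.last_rx_ms) < self.detection_time_ms()


def usable_uplinks(peers, now_ms):
    """Return the names of peers that still count as reachable."""
    return [p.name for p in peers if p.is_up(now_ms)]


peers = [
    BfdPeer("uplink-a", rx_interval_ms=300, detect_mult=3, last_rx_ms=10_000),
    BfdPeer("uplink-b", rx_interval_ms=300, detect_mult=3, last_rx_ms=8_000),
]
print(usable_uplinks(peers, now_ms=10_500))  # ['uplink-a']
```

Here "uplink-b" has been silent for 2,500 ms against a 900 ms detection time, so only "uplink-a" remains eligible to carry traffic.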

Floating IP address attach/detach

The floating IP feature introduced in v5 allows a consistent IP address to be allocated to a new instance and de-allocated upon instance termination. The feature has been further enhanced in v6 to allow floating IPs to be allocated to, or de-allocated from, running instances. In other words, you can move a floating IP from one instance to another on the fly. See the latest Guest Networking Guide for more information.
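The attach/detach behavior can be pictured with a minimal state model. This is not the Oxide API or SDK: the `FloatingIp` class, the `move` helper, and the instance names are hypothetical, and the sketch assumes a floating IP is attached to at most one instance at a time.

```python
class FloatingIp:
    def __init__(self, name, address):
        self.name = name
        self.address = address
        self.instance = None  # name of the instance currently holding the IP

    def attach(self, instance):
        # Assumption: an IP must be detached before it can be attached again.
        if self.instance is not None:
            raise RuntimeError(
                f"{self.name} is already attached to {self.instance}"
            )
        self.instance = instance

    def detach(self):
        self.instance = None


def move(fip, new_instance):
    """Move a floating IP to another instance: detach, then attach."""
    fip.detach()
    fip.attach(new_instance)


fip = FloatingIp("web-ip", "203.0.113.10")  # documentation-range address
fip.attach("web-1")
move(fip, "web-2")
print(fip.instance)  # web-2
```

The address stays the same throughout, which is what lets clients keep using one IP while the backing instance changes.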

Web console improvements

  • Manage IP pools: add/remove IP ranges, link/unlink silos, set default pool (console#1910)

  • Select from list of SSH keys on instance create form (console#1867)

  • Firewall rules table includes priority and direction, is sorted by priority (console#1887)

  • Add user data (e.g., cloud-init config YAML) field under Advanced on instance create form (console#800)

  • Show external IPs at top of instance page (console#1882)

Bug fixes and minor enhancements

  • TCP state machine race condition could leave Nexus API or guest instance TCP connections in the wrong state (opte#442)

  • Outbound TCP flow occasionally hung as old TCP flows were in FLOW_WAIT, blocking port reuse (opte#436)

  • Instances could not transition to failed state when their propolis zones crashed or were purged (omicron#4709)

  • API

    • Select the SSH keys to inject into instances at create time (omicron#3056)

    • Users can list IP pools available to them (omicron#2148)

    • Project deletion did not enforce the removal of the project’s floating IP addresses (omicron#4854)

    • Instance create requests with hostnames not conforming to RFC 1035 are now prohibited (omicron#4938)

  • Web console

    • Enhance number field and use it more consistently (console#1926)

    • Fix y-axis units for large numbers on instance disk metrics charts (console#1916)

    • Clickable styling (underline and hover) on links in tables (console#1899)

    • Handle empty IP address in network interface create form (console#1854)

    • Instance networking config moved under Advanced/Networking on instance create (console#800)

    • Don’t allow editing on the instance create form while submit is in progress (console#1893)

    • Relative times (e.g., “7d ago”) have a tooltip showing absolute time (console#1879)

    • Increase page size on tables from 10 to 25 (console#1878)

    • Pressing enter while adding target, host, or port to firewall rule should not submit form (console#1919)

  • Reliability improvements

    • Improved network device driver error handling (propolis#583)

    • Stop checking the disk parent image/snapshot if the scrub is done (crucible#1093)

  • Storage performance improvements (crucible-PR#1058, crucible-PR#1066, crucible-PR#1089, crucible-PR#1094, crucible-PR#1107)

  • Control plane datastores now use ZFS datasets in encrypted mode (omicron-PR#4853)
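Because instance create requests with hostnames that do not conform to RFC 1035 are now rejected (omicron#4938), it may help to pre-check hostnames on the client side. The sketch below applies the label grammar from RFC 1035 section 2.3.1. The function name `is_rfc1035_hostname`, the acceptance of dotted names, and the 253-character overall limit are assumptions, so the control plane's exact rules may differ.

```python
import re

# RFC 1035 section 2.3.1: a label starts with a letter, ends with a letter or
# digit, and contains only letters, digits, and hyphens (at most 63 characters).
_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_rfc1035_hostname(hostname: str) -> bool:
    """Return True if every dot-separated label matches the RFC 1035 grammar."""
    # Assumption: 253 characters is used as the overall textual length limit.
    if not hostname or len(hostname) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in hostname.split("."))


for name in ["web-1", "db.internal", "1web", "web_1", "web-"]:
    print(name, is_rfc1035_hostname(name))
```

Under these rules, "web-1" and "db.internal" pass, while "1web" (leading digit), "web_1" (underscore), and "web-" (trailing hyphen) fail.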

Known Behavior and Limitations

End-user features

| Feature Area | Known Issue/Limitation | Issue Number |
| --- | --- | --- |
| Image/snapshot management | Disks in importing_from_bulk_writes state cannot be deleted directly. The procedures to unstick a canceled disk import are not obvious to CLI users. | omicron#2987 |
| Image/snapshot management | Image upload sometimes stalls with HTTP/2 on Firefox. | omicron#3559 |
| Image/snapshot management | Unable to create snapshots for disks attached to stopped instances. As a workaround, users can detach a disk temporarily for snapshotting and re-attach it to the instance afterwards. | omicron#3289 |
| Image/snapshot management | The ability to modify image metadata is not available at this time. | omicron#2800 |
| Instance orchestration | Instance or disk provisioning requests may fail due to unhandled sled or storage failure on rare occasions. Users can retry the requests to work around the failures. | omicron#4259, omicron#4331 |
| Instance orchestration | Disk volume backend repair may fail to complete under heavy large-write workloads, preventing instances from starting or stopping. | crucible#837 |
| Instance orchestration | Instance hostname validation has been strengthened. Instances with a now-invalid hostname will fail to start, though they can still be listed and viewed. If the disks attached to them are valuable, they may be detached from the invalid instances and re-attached to a new instance. The invalid instance may be deleted at that time. | omicron-PR#4938 |
| Telemetry | Guest VM vCPU and memory metrics are unavailable at this time. | - |
| VPC and routing | Inter-subnet traffic routing is not available by default. Router and routing rules will be supported in future releases. | omicron#2232 |
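The retry workaround noted above for provisioning requests (omicron#4259, omicron#4331) could be wrapped in a small helper like the one below. This is a generic sketch, not part of any Oxide SDK: `retry`, its parameters, and the `flaky_create` stand-in are hypothetical, and real code should catch only the errors known to be transient rather than every exception.

```python
import time


def retry(operation, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            # Re-raise on the final attempt so the caller sees the failure.
            if attempt == attempts:
                raise
            # Exponential backoff: base, 2x base, 4x base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_create():
    # Stand-in for a provisioning request that fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient provisioning failure")
    return "created"


print(retry(flaky_create, sleep=lambda s: None))  # created
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.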

Operator features

| Feature Area | Known Issue/Limitation | Issue Number |
| --- | --- | --- |
| Access control | Device tokens do not expire. | omicron#2302 |
| Control plane | Sled and physical storage availability status is not yet available in the inventory UI and API. | omicron#2035 |
| Control plane | When sleds attached to the switches are restarted outside of rack cold-start, a full rack power cycle may be required to re-propagate sled NAT configurations. | omicron#3631 |
| Control plane | Operator-driven software update is currently unavailable. All updates need to be performed by Oxide technicians. | - |
| Control plane | Operator-driven instance migration across sleds is currently unavailable. Instance migrations need to be performed by Oxide technicians. | - |
| Telemetry | Hardware metrics such as temperatures, fan speeds, and power consumption are not exposed to the control plane at this time. | - |
| User management | User offboarding from the rack is not supported at this time. Apart from updating the identity provider to remove obsolete users from the relevant groups, operators will need to remove any IAM roles granted directly to those users in silos and projects. | omicron#2587 |