Important Notes

  1. The instance_create API endpoint now returns a success response as soon as orchestration is finished. Hypervisor-level setup and booting now run asynchronously to avoid timeouts (e.g., when there are many concurrent large instance requests). When provisioning instances, poll the instance state and connect to an instance only after it has transitioned to the running state.

  2. The Oxide CLI, Go SDK, and Terraform Provider have been updated to support API enhancements such as IP pool utilization (described under New Features). Please ensure you obtain the latest versions of these clients.
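The polling behavior described in the first note can be sketched as follows. This is a minimal illustration, not Oxide client code: `get_state` is a hypothetical callable that you would back with your own client call to fetch the instance state, and the terminal state names `failed` and `destroyed` are assumptions. Only the `running` state is taken from the note itself.

```python
import time
from typing import Callable


def wait_until_running(
    get_state: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until get_state() reports "running" or the timeout expires.

    get_state is a hypothetical hook: wire it to whatever client call
    returns the current state of the instance.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == "running":
            return
        # Assumed terminal states; stop polling rather than wait out the timeout.
        if state in ("failed", "destroyed"):
            raise RuntimeError(f"instance entered terminal state: {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"instance still in state {state!r} after {timeout_s}s")
        sleep(interval_s)


# Demo with a fake state source standing in for a real API call.
fake_states = iter(["creating", "starting", "running"])
wait_until_running(lambda: next(fake_states), interval_s=0.0)
print("instance is running; safe to connect")
```

Only open SSH or application connections after the function returns. A timeout or terminal state surfaces as an exception so provisioning scripts fail loudly instead of connecting too early.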

Installation

Oxide Computer Model 0 must be installed and configured under the guidance of Oxide technicians. This requirement may change in future releases.

Upgrade Compatibility

Upgrade from version 6 is supported. We recommend shutting down all running instances on the rack before the software update commences.

All existing setup and data (e.g., projects, users, instances) will remain intact after the software update.

New Features

Floating IPs in web console

The web console now supports creating and deleting floating IP addresses, as well as attaching them to and detaching them from instances.

IP pool utilization in API and web console

The IP pool management page in the web console now includes real-time utilization data, giving fleet administrators visibility into the number of external IP addresses currently allocated in each pool. See also the ip_pool_utilization_view API endpoint.

[Figure: IP Pool Utilization]
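As a rough illustration of how utilization data might be consumed, the sketch below computes a percentage from an allocated count and a capacity and flags pools that are nearly full. The field names `allocated` and `capacity`, the pool names, and the sample numbers are all assumptions made for this example. Consult the ip_pool_utilization_view API reference for the actual response shape.

```python
from typing import Dict, List


def utilization_percent(allocated: int, capacity: int) -> float:
    """Return the share of a pool's addresses that are allocated, in percent."""
    if capacity <= 0:
        return 0.0  # an empty pool has nothing to utilize
    return 100.0 * allocated / capacity


def pools_above(pools: Dict[str, Dict[str, int]], threshold_percent: float) -> List[str]:
    """List pool names whose utilization meets or exceeds the threshold."""
    return sorted(
        name
        for name, counts in pools.items()
        if utilization_percent(counts["allocated"], counts["capacity"]) >= threshold_percent
    )


# Hypothetical sample data; a real API response may be shaped differently.
sample_pools = {
    "default": {"allocated": 12, "capacity": 256},
    "dmz": {"allocated": 30, "capacity": 32},
}

for pool_name in pools_above(sample_pools, 90.0):
    print(f"pool {pool_name} is nearly exhausted")
```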

Bug fixes and minor enhancements

  • New API endpoint floating_ip_update for updating floating IP name and description (omicron#5016)

  • New API endpoint networking_bgp_message_history for retrieving BGP message history (maghemite-PR#179)

  • Fix capacity and utilization showing "undefined" in some cases (console#1954)

  • Reduced the Crucible worker thread tunable to avoid hitting the kernel limit, which had left instances stuck in the stopping state under heavy disk I/O (crucible#1184)

  • Fixes to Crucible disk repair reliability (crucible#1146, crucible#1155)

  • Additional validations to prevent cross-project floating IP attach (omicron-PR#5177)

  • Graceful transition to/from BFD-based networks (maghemite-PR#174)

  • ClickHouse upgrade from v22.8.9.24 to v23.8.7.24 (omicron-PR#5127)

  • "Probes" experimental API. Probes are instance-like objects used for emulating instance and network interface lifecycle events. They consume IP addresses but do not take up any compute or storage resources. Oxide technicians may use probes from time to time for instrumentation purposes, with the rack operator’s permission. (omicron-PR#4585)

Firmware update

Known Behavior and Limitations

End-user features

| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Image/snapshot management | Disks in importing_from_bulk_writes state cannot be deleted directly. The procedures to unstick a canceled disk import are not obvious to CLI users. | omicron#2987 |
| Image/snapshot management | Image upload sometimes stalls with HTTP/2 on Firefox. | omicron#3559 |
| Image/snapshot management | Snapshots cannot be created for disks attached to stopped instances. As a workaround, users can temporarily detach a disk to snapshot it and re-attach it to the instance afterwards. | omicron#3289 |
| Image/snapshot management | The ability to modify image metadata is not available at this time. | omicron#2800 |
| Instance orchestration | Instance or disk provisioning requests may fail on rare occasions due to unhandled sled or storage failures. Users can retry the requests to work around the failures. | omicron#4259, omicron#4331 |
| Instance orchestration | Disk volume backend repair may fail to complete under heavy, large-write workloads, preventing instances from starting or stopping. | crucible#837 |
| Instance orchestration | Instance hostname validation has been strengthened. Instances with a now-invalid hostname will fail to start, though they can still be listed and viewed. If the disks attached to such an instance are valuable, they may be detached from it and re-attached to a new instance. The invalid instance may be deleted at that time. | omicron-PR#4938 |
| Telemetry | VM instance memory utilization and network throughput metrics are unavailable at this time. | - |
| VPC and routing | Inter-subnet traffic routing is not available by default. Routers and routing rules will be supported in future releases. | omicron#2232 |
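The retry workaround for failed instance or disk provisioning requests listed above could be automated along these lines. This is a generic sketch of retry with exponential backoff and jitter, not part of any Oxide client: `request` is a hypothetical callable wrapping your own provisioning call, and catching every `Exception` is a simplification that real code should narrow to errors known to be transient.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    request: Callable[[], T],
    attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() until it succeeds or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return request()
        except Exception:  # simplification: narrow this to transient errors
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last failure
            # Exponential backoff capped at max_delay_s, with full jitter.
            delay = min(max_delay_s, base_delay_s * 2 ** attempt)
            sleep(random.uniform(0.0, delay))
    raise AssertionError("unreachable")
```

Jitter spreads retries from many concurrent clients over time, which matters when the original failures coincide with a burst of provisioning requests.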

Operator features

| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Access control | Device tokens do not expire. | omicron#2302 |
| Control plane | Sled and physical storage availability status is not yet available in the inventory UI or API. | omicron#2035 |
| Control plane | When a sled is rebooted outside of the maintenance settings, new instances on the sled may be unable to reach existing instances on other sleds until those instances have been restarted. | omicron#5214 |
| Control plane | Operator-driven software update is currently unavailable. All updates need to be performed by Oxide technicians. | - |
| Control plane | Operator-driven instance migration across sleds is currently unavailable. Instance migrations need to be performed by Oxide technicians. | - |
| Telemetry | Hardware metrics such as temperatures, fan speeds, and power consumption are not exposed to the control plane at this time. | - |
| User management | User offboarding from the rack is not supported at this time. Apart from updating the identity provider to remove obsolete users from the relevant groups, operators will need to remove any IAM roles granted directly to those users in silos and projects. | omicron#2587 |
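For the manual offboarding step described above, the sketch below shows one way to strip a departed user's direct role grants from a policy document before writing it back. The policy shape used here (`role_assignments` entries carrying `identity_type`, `identity_id`, and `role_name`) and the `silo_user` type value are assumptions for illustration. Verify them against the policy returned by your silo or project before relying on this.

```python
from typing import Any, Dict


def strip_user_roles(policy: Dict[str, Any], user_id: str) -> Dict[str, Any]:
    """Return a copy of policy without role assignments granted directly to user_id.

    Group-based assignments are left alone; those are handled by removing
    the user from the group in the identity provider.
    """
    kept = [
        assignment
        for assignment in policy.get("role_assignments", [])
        if not (
            assignment.get("identity_type") == "silo_user"
            and assignment.get("identity_id") == user_id
        )
    ]
    return {**policy, "role_assignments": kept}


# Hypothetical policy document; the IDs are placeholders.
example_policy = {
    "role_assignments": [
        {"identity_type": "silo_user", "identity_id": "user-a", "role_name": "admin"},
        {"identity_type": "silo_group", "identity_id": "group-1", "role_name": "viewer"},
        {"identity_type": "silo_user", "identity_id": "user-b", "role_name": "collaborator"},
    ]
}

cleaned = strip_user_roles(example_policy, "user-a")
print(len(cleaned["role_assignments"]), "assignments remain")
```

The function returns a new dictionary and leaves its input untouched, so the original policy can be kept for review or rollback. The same filter would need to be applied to each silo and project policy in which the user holds a direct grant.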