Important notes
The Oxide CLI, Go SDK, and Terraform Provider have been updated for API enhancements described under New features. Please be sure to upgrade.
System requirements
Please refer to v1.0.0 release notes.
Installation
Oxide Computer Model 0 must be installed and configured under the guidance of Oxide technicians. This requirement may change in future releases.
Upgrade compatibility
Upgrade from version 19, 19.1, 19.2, 19.3 or 19.4 is supported. We recommend
shutting down all running instances on the rack before software update commences.
Any instances that aren’t stopped for software update are transitioned to the failed
state when the control plane comes up. They can be configured to restart automatically
with an auto-restart policy, or they can be started manually by the user.
All existing setup and data (e.g., projects, users, instances) remain intact after the software update.
New features
Jumbo frame support for external networking
Instance external networking now supports the use of jumbo frames, which raise the instance’s primary interface MTU from 1500 to 8500 bytes. The higher MTU can substantially improve throughput for flows that send large messages, provided that the network upstream of the rack has end-to-end jumbo frame support.
Please refer to the Jumbo Frames guide for details about jumbo frame settings and workload configuration recommendations.
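As a rough illustration of why the larger MTU helps flows that send large messages, the sketch below counts how many TCP segments a 1 MiB message needs at each MTU. It assumes IPv4 and TCP headers of 20 bytes each with no options. The function name and the message size are illustrative only, and real flows will differ depending on header options and encapsulation.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def segments_needed(message_bytes: int, mtu: int) -> int:
    """Return the number of TCP segments needed to carry message_bytes."""
    mss = mtu - IPV4_HEADER - TCP_HEADER  # payload bytes per segment
    return math.ceil(message_bytes / mss)


message = 1024 * 1024  # 1 MiB, an arbitrary example size
for mtu in (1500, 8500):
    print(f"MTU {mtu}: {segments_needed(message, mtu)} segments")
```

Under these assumptions the message takes 719 segments at MTU 1500 and 124 at MTU 8500, so per-packet processing drops by a factor of about 5.8.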
Network performance improvements
Besides jumbo frame support, this release includes other improvements that enable higher intra- and inter-VPC network throughput between VM instances:
viona IPv6 TCP segmentation offload support (illumos#18086)
Exclude viona worker LWPs from core-pinning in large vCPU instances (propolis#1146)
System update validation
The update system now checks for abnormal conditions before and after an update and prompts the operator to contact Oxide Support if needed (omicron#10271). See the System Update guide for more information.
Tunable for telemetry retention period
System telemetry stored in the ClickHouse database is retained for 30 days. The retention period is now adjustable through Oxide Support and will be configurable by the fleet admin in a future release. (omicron#10366)
Storage lifecycle management improvements
Local disk cleanup on expunged disks/sleds: Local disks backed by expunged physical disks or sleds could sometimes become undeletable if the sled or disk in question no longer responds to volume backend termination. Cleanup now accounts for expungement, so these disks can be deleted. (omicron#10222, omicron#10257)
Expunged disks re-adoption: Prior to this release, expunged SSDs were assumed to be unrecoverable, and could only be replaced with new disks. The limitation has been removed to allow physical disks to be re-adopted if they have been repaired in software (e.g., firmware update). More information about this can be found in the updated Hardware Maintenance guide and omicron#10221.
Web console
The console changes in this release are all about UX polish. A few examples: new users didn’t know where to edit firewall rules for a given instance; now the instance networking page links to the firewall rules page under the relevant VPC. We also added support for ICMPv6 filters in firewall rules. Comboboxes are now much faster when there are many items. See the full list of changes below.
Full console changelog
Support ICMPv6 filters in firewall rules (console#3212)
View IP pool details in a sidebar (console#3158)
Show server errors inline in modals instead of toasts, including image upload errors (console#3192, console#3227)
Combobox improvements: fix value editing, virtualize long lists, remove dropdown dead space (console#3217, console#3221, console#3186)
Clearer copy in all confirmation modals (console#3205)
Show "contact support" message on update status page when the API says to (console#3226, console#3238)
Link directly to firewall rules from the instance networking tab (console#3216)
Improve subnet pool member and IP range validation (console#3188)
Many small UI fixes (console#3210, console#3204, console#3200, console#3187, console#3181, console#3215)
Other bug fixes and enhancements
Compress external API responses by up to 90% with gzip (omicron#10341, dropshot#1448)
Support bundles include reason_for_creation (omicron#10240)
Add OxQL metrics for ZFS pool and dataset usage (omicron#10453)
Work around slow thread renaming in ClickHouse to reduce CPU usage (omicron#10431)
Remove oximeter schema cache to support database endpoint mobility (omicron#10292)
Prevent concurrent instance-start operations from making multiple sled reservations (omicron#10479)
Mark running instances failed before starting a planned sled update to hasten instance restart on other available sleds (omicron#10334)
Allow instances to transition to stopped state when one switch is down (omicron#10389)
Nexus startup is independent of IP allowlist plumbing (omicron#10305)
Improve NTP time sync checks to avoid starting CockroachDB prematurely (omicron#7668)
Ensure firewall rule update validation happens before DB write (omicron#10563)
BFD status no longer returns 404 when only one switch is available (omicron#9979)
Ensure stable DUID values are used in DHCPv6 exchanges (dendrite#267)
Fix Intel CPUID leaf 4 cache topology for SMT (propolis#1002)
Add amd_turin_v2 CPU platform for glibc 2.34-2.36 compatibility (omicron#10560)
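To give a feel for the gzip response compression listed above, the sketch below compresses a repetitive JSON body with Python's standard gzip module and reports the size reduction. The payload is hypothetical and is not a real Oxide API schema; the ratio you see depends entirely on the content, so treat the output as illustrative rather than as a measurement of the API.

```python
import gzip
import json

# Hypothetical payload shaped like a paginated list response (not a real schema).
items = [{"id": i, "name": f"instance-{i}", "state": "running"} for i in range(500)]
body = json.dumps({"items": items}).encode("utf-8")

# Compress the body the way a server would for "Content-Encoding: gzip".
compressed = gzip.compress(body)
ratio = 1 - len(compressed) / len(body)
print(f"{len(body)} -> {len(compressed)} bytes ({ratio:.0%} smaller)")

# A client that requested gzip must decompress before parsing the JSON.
decoded = json.loads(gzip.decompress(compressed))
print(len(decoded["items"]), "items recovered")
```

Most HTTP client libraries handle the decompression step transparently once the client advertises gzip in the Accept-Encoding request header.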
Patches
None
Firmware update
AMD Turin microcode update to B002162/B101059
Storage firmware updates for Western Digital SN861 series NVMe drives (not applicable to all hardware configurations)
Helios update from v2.0 to v3.0 (helios#244)
Known behavior and limitations
End-user features
| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Disk/image management | Disks in | |
| Disk/image management | Disk rejected by guest OS due to duplicate NVMe device names. The issue is caused by a 20-character limit in applying the disk name to the device serial number. See the Troubleshooting guide for more information. | - |
| Disk/image management | The ability to modify image metadata is not available at this time. | |
| Instance orchestration | Unable to start an instance that has a disk replica on a sled being updated. | |
| Instance orchestration | Instance start API frequently times out when attached to local disks. | |
| Instance orchestration | New instances cannot be created when the total number of NAT entries (private-to-external IP mappings) in the system exceeds 1024. | |
| Instance performance | The | |
| Instance performance | Linux guests unable to capture hardware events using | |
| VPC internet gateway | Changing a silo’s default IP pool causes some instances to lose their outbound internet access. This is due to a mismatch between the pool containing the instances' external IPs (which are allocated from the new default pool) and the pool attached to the system-created internet gateways (which are linked to the old pool at creation time). Please see the Troubleshooting guide for possible options for restoring instance outbound connectivity. | |
| VPC routing | Subnet update clears custom router ID when the field is left out of the request body. | |
| VPC routing | Network interface update clears transit IPs when the field is left out of the request body. | - |
| Telemetry | VM instance memory utilization and VPC network/firewall metrics are unavailable at this time. | - |
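The duplicate NVMe device name limitation in the table above can be checked for ahead of time. The sketch below groups disk names that would collide, assuming the serial number is derived from the first 20 characters of the disk name; that truncation rule is an assumption for illustration, and the helper name and sample disk names are hypothetical.

```python
from collections import defaultdict

SERIAL_LIMIT = 20  # character limit described in the table above


def colliding_disk_names(names):
    """Group disk names that share the same first SERIAL_LIMIT characters.

    Assumes the serial is the disk name truncated to SERIAL_LIMIT characters.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name[:SERIAL_LIMIT]].append(name)
    # Keep only prefixes shared by more than one disk.
    return {prefix: group for prefix, group in groups.items() if len(group) > 1}


disks = ["postgres-primary-data-01", "postgres-primary-data-02", "boot"]
print(colliding_disk_names(disks))
```

Here the two data disks share the prefix postgres-primary-dat, so they would be flagged; putting the distinguishing part of the name within the first 20 characters avoids the collision under this assumption.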
Operator features
| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Silo management | The ability to modify silo and IDP metadata is not available at this time. | |
| System management | Real-time availability status for sleds and physical storage is not yet shown in the inventory API or UI. | |
| System management | Operator-driven instance migration across sleds is currently unavailable. | - |
| System management | Some running instances transitioned to the "stopped" state after update. | |