Important Notes
The Oxide CLI, Go SDK, and Terraform Provider have been updated for API enhancements described under New Features. Please be sure to upgrade.
System Requirements
Please refer to v1.0.0 release notes.
Installation
Oxide Computer Model 0 must be installed and configured under the guidance of Oxide technicians. This requirement may change in future releases.
Upgrade Compatibility
Upgrade from version 18 or 18.1 is supported. We recommend shutting down all running instances on the rack before the software update commences. Any instances that aren’t stopped for the software update are transitioned to the failed state when the control plane comes up. They can be configured to start automatically with an auto-restart policy, or they can be started manually by the user.
All existing setup and data (e.g., projects, users, instances) remain intact after the software update.
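The post-update instance behavior described above can be sketched as a small decision function. This is an illustrative model only; the policy names and state names here are assumptions, not the exact API values.

```python
# Illustrative model of instance state after a software update.
# State and policy names are assumptions for illustration; consult the
# instance API documentation for the authoritative values.
def state_after_update(running: bool, auto_restart: bool) -> str:
    if not running:
        return "stopped"      # cleanly stopped instances stay stopped
    # Instances left running are marked failed when the control plane
    # comes back up...
    if auto_restart:
        return "restarting"   # ...and restart automatically if policy allows
    return "failed"           # ...otherwise they await a manual start

assert state_after_update(running=True, auto_restart=False) == "failed"
```

This makes the recommendation concrete: stopping instances before the update is the only path that avoids the failed state entirely.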
New Features
Enhanced trust quorum
Trust Quorum is a platform attestation feature of the Oxide rack to protect sensitive data on the rack from both casual theft and casual physical attacks inside the data center. A quorum of servers inside a rack cooperate to securely construct a shared secret that allows rack storage to be unlocked and rack data to be decrypted when certain conditions are met.
The trust quorum mechanism in this release has been enhanced to allow rack secrets to be re-computed after server sled replacement. This removes a previous restriction that limited the number of replaceable "original" sleds (the sleds that hold shares of the rack secrets during rack setup).
The enhanced trust quorum also gates control plane and underlay network access at runtime via remote attestation. Sled-to-sled communication takes place over a mutual TLS connection using PKI baked into the RoT of a sled that ties back to a root manufacturing key. During rack initialization, cold boot, and sled addition, attestation verifies that the expected software is running. If certificate validation or attestation fails at either end of the connection handshake, rack secret shares will not be exchanged.
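The quorum construction above depends on a k-of-n secret-sharing scheme: any quorum of shares can reconstruct the rack secret, while fewer reveal nothing. As a didactic sketch only (this is not Oxide's actual trust quorum implementation), here is Shamir's scheme over a prime field:

```python
# Didactic k-of-n secret sharing (Shamir's scheme over a prime field).
# Illustrates the quorum idea only; NOT Oxide's trust quorum implementation.
import random

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus

def split_secret(secret: int, k: int, n: int):
    """Split `secret` into n shares; any k of them can reconstruct it."""
    coeffs = [secret] + [random.randrange(1, PRIME) for _ in range(k - 1)]
    def f(x):  # evaluate the degree-(k-1) polynomial via Horner's rule
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % PRIME
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x=0 recovers the polynomial's constant term."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = split_secret(123456789, k=3, n=5)
assert reconstruct(shares[:3]) == 123456789   # any 3 of the 5 shares suffice
assert reconstruct(shares[1:4]) == 123456789
```

The sled-replacement enhancement corresponds to re-running the sharing step with a fresh polynomial once the quorum reconstructs the secret, so new sleds receive shares without any dependence on the original share holders.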
Read-only distributed disks
Distributed disks may now be created as read-only. Read-only disks are attached to VM instances as read-only block devices and may not be modified by the guest. See the Disks and Snapshots guide for more information on creating and using read-only disks.
Audit log retention period
To prevent the audit log database table from growing unbounded, a 90-day retention period is now in place. Log entries outside of the retention period are permanently removed from the system by a background task. If you rely on retaining the audit log for a longer period, use the API to fetch the audit log on a regular interval. See the Audit Log guide for more detail.
Note that the thousands of disk_bulk_write_import API calls made during disk import are now excluded from the audit log to reduce noise (omicron#10045). The other calls made as part of disk import are still logged.
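A minimal sketch of the fetch-on-interval approach: each run requests the window between where the previous run left off and now, clamped to the 90-day retention cutoff since older entries have already been purged. No API calls are made here; the actual client calls and parameters would come from the Audit Log API.

```python
# Sketch of the export window calculation for a periodic audit log fetcher.
# The surrounding fetch/client code is assumed; see the Audit Log guide.
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # entries older than this are purged

def window_to_fetch(last_fetched: datetime, now: datetime):
    """Return the (start, end) timestamps to request on this run.

    Start where the previous run left off, but never earlier than the
    retention cutoff: entries before it no longer exist on the rack.
    """
    cutoff = now - RETENTION
    return max(last_fetched, cutoff), now

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
start, end = window_to_fetch(now - timedelta(days=120), now)
assert start == now - RETENTION  # the first 30 days are already gone
```

Running the exporter well inside the 90-day window (for example daily) keeps `last_fetched` ahead of the cutoff, so no entries are lost.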
Web console
In this release we’ve added a light UI theme, support for managing subnet pools and external subnets, a fleet-level access control page highlighting both explicit role assignments and fleet roles mapped from silo roles, and many other small improvements.
Full console changelog
Light mode (console#3061, console#3116, console#3162, console#3138, console#3127, console#3093)
External subnets and subnet pools (console#3039, console#3146, console#3165)
Add many more actions to "Jump To" dialog (console#3129, console#3131, console#3132)
Add Fleet Access page (console#3095)
Dual-stack ephemeral IPs on instance create and detail (console#3057, console#3101)
Read-only disks re-enabled (console#3155)
Rename "subnets" to "VPC subnets" throughout the UI (console#3147)
Client-side max vCPUs per instance bumped to 254 (console#3108)
Nicer handling of expected errors in the browser console (console#3099)
Many small UI fixes (console#3167, console#3134, console#3133, console#3125, console#3122, console#3111, console#3049, console#3102, console#3091, console#3085, console#3082, console#3079)
Bug fixes and other enhancements
Improved VM network fan-in throughput by splitting viona packet chains over vrings (illumos#17926)
Fix sporadic network connectivity issues on Windows instances (propolis#1048)
Bump up maximum vCPU per instance to 254 in the web console (console#3108)
Expose sled physical CPU usage in the OxQL metric sled_cpu:cpu_nsec (omicron#9560, omicron#9968)
Add support for ICMPv6 protocol filter in firewall rules (omicron#10101)
Add background task to clean up expired web sessions (omicron#10009)
Fix dual-stack instances being created with only IPv4 connectivity when the VPC subnet had no prior IPv6 addresses (omicron#9880)
Fix VNIC creation error handling when subnet address allocation fails (omicron#9885)
Fix external subnet creation error handling and enforce pool-silo linking (omicron#9892)
Drop the built-in Alpine image from disk image source options (omicron#9919)
Enable users with limited collaborator role to use external subnets (omicron#9920)
Fix various issues around static route configuration (omicron#9907, dendrite#225)
Reinstate sled-agent periodic polling for hardware changes (omicron#9975)
Fix a race condition during disk expungement that could cause the blueprint executor to fail perpetually (omicron#10025)
Remove local storage parent dataset’s reservation to avoid running out of space (omicron#10126)
Fix local disk deletion consuming all available storage space on the sled, which could crash control plane services (omicron#10035)
Reinstate metrics collection for crucible disks (propolis#1073)
Patches
19.1
Fix local volume blocksize mismatch if system crashes during zvol creation (stlouis#961)
VMM: virtio socket should wait for device readiness before memory check (propolis#1110)
VMM: do not overflow when calculating region end in lookup (propolis#1109)
Console: instance create form automatically selects new SSH key (console#3173)
Console: fix dark theme flash and image progress bar color (console#3176, console#3180)
SP ereports should only include SPs with ignition detected (omicron#10227)
Firmware update
AMD Turin 1.0.0.a
Known Behavior and Limitations
End-user features
| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Disk/image management | Disks in | |
| Disk/image management | Disk rejected by guest OS due to duplicate NVMe device names. The issue is caused by a 20-character limit when applying the disk name to the device serial number. See the Troubleshooting guide for more information. | - |
| Disk/image management | The ability to modify image metadata is not available at this time. | |
| Disk/image management | Unable to delete local disks on expunged sleds. | |
| Instance orchestration | Unable to start an instance that has a disk replica on a sled being updated. | |
| Instance orchestration | The instance start API frequently times out when the instance is attached to local disks. | |
| Instance orchestration | New instances cannot be created when the total number of NAT entries (private-to-external IP mappings) in the system exceeds 1024. | |
| Instance performance | The | |
| Instance performance | Linux guests are unable to capture hardware events using | |
| VPC internet gateway | Changing a silo’s default IP pool causes some instances to lose their outbound internet access. This is due to a mismatch between the pool containing the instances' external IPs (which are allocated from the new default pool) and the pool attached to the system-created internet gateways (which are linked to the old pool at creation time). Please see the Troubleshooting guide for possible options for restoring instance outbound connectivity. | |
| VPC routing | Subnet update clears the custom router ID when the field is left out of the request body. | |
| VPC routing | Network interface update clears transit IPs when the field is left out of the request body. | - |
| Telemetry | VM instance memory utilization and VPC network/firewall metrics are unavailable at this time. | - |
Operator features
| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Silo management | The ability to modify silo and IDP metadata is not available at this time. | |
| System management | Real-time sled and physical storage availability status is not yet available in the inventory UI and API. | |
| System management | Operator-driven instance migration across sleds is currently unavailable. | - |
| System management | Some running instances transitioned to the "stopped" state after update. | |