Important Notes
The Oxide CLI, Go SDK, and Terraform Provider have been updated for API enhancements described under New Features. Please be sure to upgrade.
System Requirements
Please refer to v1.0.0 release notes.
Installation
Oxide Computer Model 0 must be installed and configured under the guidance of Oxide technicians. This requirement may change in future releases.
Upgrade Compatibility
Upgrade from version 18 or 18.1 is supported. We recommend shutting down all running instances on the rack before the software update commences. Any instances that aren’t stopped for the software update are transitioned to the failed state when the control plane comes up. They can be configured to start automatically with an auto-restart policy, or they can be started manually by the user.
All existing setup and data (e.g., projects, users, instances) remain intact after the software update.
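The post-update instance behavior described above can be sketched as a small decision function. This is an illustrative model only; the policy names and state names here are assumptions, not the exact API values.

```python
# Illustrative model of instance state after a software update.
# State and policy names are assumptions for illustration; consult the
# instance API documentation for the authoritative values.
def state_after_update(running: bool, auto_restart: bool) -> str:
    if not running:
        return "stopped"      # cleanly stopped instances stay stopped
    # Instances left running are marked failed when the control plane
    # comes back up...
    if auto_restart:
        return "restarting"   # ...and restart automatically if policy allows
    return "failed"           # ...otherwise they await a manual start

assert state_after_update(running=True, auto_restart=False) == "failed"
```

This makes the recommendation concrete: stopping instances before the update is the only path that avoids the failed state entirely.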
New Features
Enhanced trust quorum
Trust Quorum is a platform attestation feature of the Oxide rack to protect sensitive data on the rack from both casual theft and casual physical attacks inside the data center. A quorum of servers inside a rack cooperate to securely construct a shared secret that allows rack storage to be unlocked and rack data to be decrypted when certain conditions are met.
The trust quorum mechanism in this release has been enhanced to allow rack secrets to be re-computed after server sled replacement. This removes a previous restriction that limited the number of replaceable "original" sleds (the sleds that hold shares of the rack secrets during rack setup).
The enhanced trust quorum also gates control plane and underlay network access at runtime via remote attestation. Sled-to-sled communication takes place over a mutual TLS connection using PKI baked into the RoT of a sled that ties back to a root manufacturing key. During rack initialization, cold boot, and sled addition, attestation verifies that the expected software is running. If certificate validation or attestation fails at either end of the connection handshake, rack secret shares will not be exchanged.
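The quorum construction above depends on a k-of-n secret-sharing scheme: any quorum of shares can reconstruct the rack secret, while fewer reveal nothing. As a didactic sketch only (this is not Oxide's actual trust quorum implementation), here is Shamir's scheme over a prime field:

```python
# Didactic k-of-n secret sharing (Shamir's scheme over a prime field).
# Illustrates the quorum idea only; NOT Oxide's trust quorum implementation.
import random

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus

def split_secret(secret: int, k: int, n: int):
    """Split `secret` into n shares; any k of them can reconstruct it."""
    coeffs = [secret] + [random.randrange(1, PRIME) for _ in range(k - 1)]
    def f(x):  # evaluate the degree-(k-1) polynomial via Horner's rule
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % PRIME
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x=0 recovers the polynomial's constant term."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = split_secret(123456789, k=3, n=5)
assert reconstruct(shares[:3]) == 123456789   # any 3 of the 5 shares suffice
assert reconstruct(shares[1:4]) == 123456789
```

The sled-replacement enhancement corresponds to re-running the sharing step with a fresh polynomial once the quorum reconstructs the secret, so new sleds receive shares without any dependence on the original share holders.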
Read-only distributed disks
Distributed disks may now be created as read-only. Read-only disks are attached to VM instances as read-only block devices and may not be modified by the guest. See the Disks and Snapshots guide for more information on creating and using read-only disks.
Audit log retention period
To prevent the audit log database table from growing unbounded, a 90-day retention period is now in place. Log entries outside of the retention period are permanently removed from the system by a background task. If you rely on retaining the audit log for a longer period, use the API to fetch the audit log on a regular interval. See the Audit Log guide for more detail.
Note that the thousands of disk_bulk_write_import API calls made during disk import are now excluded from the audit log to reduce noise (omicron#10045). The other calls made as part of disk import are still logged.
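A minimal sketch of the fetch-on-interval approach: each run requests the window between where the previous run left off and now, clamped to the 90-day retention cutoff since older entries have already been purged. No API calls are made here; the actual client calls and parameters would come from the Audit Log API.

```python
# Sketch of the export window calculation for a periodic audit log fetcher.
# The surrounding fetch/client code is assumed; see the Audit Log guide.
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # entries older than this are purged

def window_to_fetch(last_fetched: datetime, now: datetime):
    """Return the (start, end) timestamps to request on this run.

    Start where the previous run left off, but never earlier than the
    retention cutoff: entries before it no longer exist on the rack.
    """
    cutoff = now - RETENTION
    return max(last_fetched, cutoff), now

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
start, end = window_to_fetch(now - timedelta(days=120), now)
assert start == now - RETENTION  # the first 30 days are already gone
```

Running the exporter well inside the 90-day window (for example daily) keeps `last_fetched` ahead of the cutoff, so no entries are lost.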
Web console
In this release we’ve added a light UI theme, support for managing subnet pools and external subnets, a fleet-level access control page highlighting both explicit role assignments and fleet roles mapped from silo roles, and many other small improvements.
Full console changelog
Light mode (console#3061, console#3116, console#3162, console#3138, console#3127, console#3093)
External subnets and subnet pools (console#3039, console#3146, console#3165)
Add many more actions to "Jump To" dialog (console#3129, console#3131, console#3132)
Add Fleet Access page (console#3095)
Dual-stack ephemeral IPs on instance create and detail (console#3057, console#3101)
Read-only disks re-enabled (console#3155)
Rename "subnets" to "VPC subnets" throughout the UI (console#3147)
Client-side max vCPUs per instance bumped to 254 (console#3108)
Nicer handling of expected errors in the browser console (console#3099)
Many small UI fixes (console#3167, console#3134, console#3133, console#3125, console#3122, console#3111, console#3049, console#3102, console#3091, console#3085, console#3082, console#3079)
Bug fixes and other enhancements
Improved VM network fan-in throughput by splitting viona packet chains over vrings (illumos#17926)
Fix sporadic network connectivity issues on Windows instances (propolis#1048)
Bump up maximum vCPU per instance to 254 in the web console (console#3108)
Expose sled physical CPU usage in the OxQL metric sled_cpu:cpu_nsec (omicron#9560, omicron#9968)
Add support for ICMPv6 protocol filter in firewall rules (omicron#10101)
Add background task to clean up expired web sessions (omicron#10009)
Fix dual-stack instances being created with only IPv4 connectivity when the VPC subnet had no prior IPv6 addresses (omicron#9880)
Fix VNIC creation error handling when subnet address allocation fails (omicron#9885)
Fix external subnet creation error handling and enforce pool-silo linking (omicron#9892)
Drop the built-in Alpine image from disk image source options (omicron#9919)
Enable users with limited collaborator role to use external subnets (omicron#9920)
Fix various issues around static route configuration (omicron#9907, dendrite#225)
Reinstate sled-agent periodic polling for hardware changes (omicron#9975)
Fix a race condition during disk expungement that could cause the blueprint executor to fail perpetually (omicron#10025)
Remove local storage parent dataset’s reservation to avoid running out of space (omicron#10126)
Fix local disk deletion consuming all available storage space on the sled, which could crash control plane services (omicron#10035)
Reinstate metrics collection for crucible disks (propolis#1073)
Patches
19.1
Fix local volume blocksize mismatch if system crashes during zvol creation (stlouis#961)
VMM: virtio socket should wait for device readiness before memory check (propolis#1110)
VMM: do not overflow when calculating region end in lookup (propolis#1109)
Console: instance create form automatically selects new SSH key (console#3173)
Console: fix dark theme flash and image progress bar color (console#3176, console#3180)
SP ereports should only include SPs with ignition detected (omicron#10227)
Firmware update
AMD Turin 1.0.0.a
Known Behavior and Limitations
End-user features
| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Disk/image management | Disks in | |
| Disk/image management | Disk rejected by guest OS due to duplicate NVMe device names. The issue is caused by a 20-character limit when applying the disk name to the device serial number. See the Troubleshooting guide for more information. | - |
| Disk/image management | The ability to modify image metadata is not available at this time. | |
| Disk/image management | Unable to delete local disks on expunged sleds. | |
| Instance orchestration | Unable to start an instance that has a disk replica on a sled being updated. | |
| Instance orchestration | The instance start API frequently times out when the instance is attached to local disks. | |
| Instance orchestration | New instances cannot be created when the total number of NAT entries (private-to-external IP mappings) in the system exceeds 1024. | |
| Instance performance | The | |
| Instance performance | Linux guests are unable to capture hardware events using | |
| VPC internet gateway | Changing a silo’s default IP pool causes some instances to lose their outbound internet access. This is due to a mismatch between the pool containing the instances' external IPs (which are allocated from the new default pool) and the pool attached to the system-created internet gateways (which are linked to the old pool at creation time). Please see the Troubleshooting guide for possible options for restoring instance outbound connectivity. | |
| VPC routing | Subnet update clears the custom router ID when the field is left out of the request body. | |
| VPC routing | Network interface update clears transit IPs when the field is left out of the request body. | - |
| Telemetry | VM instance memory utilization and VPC network/firewall metrics are unavailable at this time. | - |
Operator features
| Feature Area | Known Issue/Limitation | Issue Number |
|---|---|---|
| Silo management | The ability to modify silo and IDP metadata is not available at this time. | |
| System management | Real-time sled and physical storage availability status is not yet available in the inventory UI and API. | |
| System management | Operator-driven instance migration across sleds is currently unavailable. | - |
| System management | Some running instances transitioned to the "stopped" state after update. | |