Important Notes
This is a patch release aimed at improving fault tolerance. A factory reset will be required to make use of the new features.
System Requirements
Please refer to v1.0.0 release notes.
Installation
Oxide Computer Model 0 must be installed and configured under the guidance of Oxide technicians. This requirement may change in future releases.
Upgrade Compatibility
Upgrade from version 1.0.0 is not supported.
New Features
This release delivers a number of fault-tolerance improvements. There are no new user-facing features.
New bootstore for persistent rack initialization data
Improved handling of network and service configurations during rack, sled, and service restart
Encryption key generation hardening
Improved storage space management for system log and dump files
Bug fixes:
I2C temperature error handling for U.2s could be improved. (hubris-pr#1465)
IPv6 RIP router was enabled by default in general deployments. (omicron-pr#3736)
Identity provider descriptor endpoint was not resolvable in Nexus. (omicron#3724)
Sled-agent leaked contracts when executing commands from non-global zones. (omicron#3753)
Pantry service was not deployed as a cluster. (omicron#3609)
Backplane ports were not defaulted to RS FEC. (omicron-pr#3714)
ipadm did not allow creation of point to point links. (illumos#15806)
TSC sync produced unreliable results with caches disabled. (illumos#15810)
bhyve could take more care around VM_MAXCPU. (illumos#15812)
fp_lwp_init allocated under p_lock, leading to deadlock under memory pressure. (stlouis#463)
Firmware update:
Chelsio cxgbe firmware is updated from version 1.27.1.0 to 1.27.4.0. (illumos#15804)
AMD microcode is updated from version 20230414 to 20230719. (illumos#15811)
Known Behavior and Limitations
End-user features
Feature Area | Known Issue/Limitation | Issue Number |
---|---|---|
Firewall rules | Firewall rules using VPC as target should allow/deny traffic based on an instance’s private IP only and not apply the rules against the instance’s public IP. As a workaround, use subnet as target to permit only intra-subnet traffic without allowing inbound traffic from other IP addresses on the same public network as the instance. | |
Image/snapshot management | Image upload sometimes stalls with HTTP/2. | |
Image/snapshot management | Unable to create snapshots for disks attached to stopped instances. | |
Image/snapshot management | Spurious errors after snapshot or disk deletion has been completed successfully. | |
Image/snapshot management | The ability to delete images is not available at this time. | |
Image/snapshot management | The ability to modify image metadata is not available at this time. | |
Instance orchestration | The maximum instance size is currently limited to 32 vCPUs, 64 GiB of memory, and up to seven disks of 1023 GiB each. | |
Instance orchestration | The ability to select which SSH keys are passed to a new instance is not available at this time. | |
Instance orchestration | Concurrent instance provisioning requests (e.g., as typically happens with programmatic orchestration such as Terraform) may return 500 errors. Users can reduce the concurrency level to avoid the error or retry the failed requests. | |
Instance orchestration | Instance or disk provisioning requests may fail due to unhandled sled or storage failure on rare occasions. Users can retry the requests to work around the failures. | |
Instance orchestration | After a rack power cycle, an instance currently requires an extra stop-start cycle after booting to regain network connectivity. | |
Telemetry | Guest VM CPU and memory metrics are unavailable at this time. | - |
VPC and routing | Inter-subnet traffic routing is not available by default. Router and routing rules will be supported in future releases. | |
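
The instance orchestration rows above suggest retrying provisioning requests that fail with 500 errors. The following is a minimal Python sketch of that workaround, not an Oxide-provided client: the helper name `retry_on_server_error`, its parameters, and the `send_request` callable (assumed to return an HTTP status code and a response body) are all hypothetical. Wire `send_request` to whatever API client or tooling you actually use.

```python
import random
import time
from typing import Callable, Tuple


def retry_on_server_error(
    send_request: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send_request until it returns a non-5xx status or attempts run out.

    send_request is a hypothetical callable returning (http_status, body).
    The last response is returned either way, so the caller can inspect it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status, body = send_request()
    for attempt in range(1, max_attempts):
        if status < 500:
            break  # success or a client error; retrying will not help
        # Exponential backoff with jitter so that concurrent callers
        # do not all retry at the same instant.
        delay = base_delay * (2 ** (attempt - 1))
        sleep(delay + random.uniform(0, delay))
        status, body = send_request()
    return status, body
```

Only 5xx responses are retried, because 4xx responses indicate a problem with the request itself. Lowering the parallelism of the orchestration tool, the other workaround named in the table, can be combined with this approach.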
Operator features
Feature Area | Known Issue/Limitation | Issue Number |
---|---|---|
Access control | Device tokens do not expire. | |
Control plane | Sled and physical storage availability statuses are not yet available in the inventory UI and API. | |
Control plane | When switch zones are bounced outside of rack cold-start, a full rack power cycle is required to re-propagate sled NAT configurations. | |
Control plane | Operator-driven software update is currently unavailable. All updates need to be performed by Oxide technicians. | - |
Control plane | Operator-driven instance migration across sleds is currently unavailable. Instance migrations need to be performed by Oxide technicians. | - |
Network management | Public IP addresses used for VM instances are currently assigned from a single pool named “default”. End users cannot see the names of other IP pools. The ability to set up and query per-project IP pools will be available in a future release. | |
Network management | Routing between the rack and on-premises L2 networks is currently restricted to static routes. The use of Border Gateway Protocol (BGP) for dynamic route configuration will be supported in upcoming releases. | |
Telemetry | Hardware metrics such as temperatures, fan speeds, and power consumption are not exposed to the control plane at this time. | - |
User management | User offboarding from the rack is not supported at this time. Apart from updating the identity provider to remove obsolete users from the relevant groups, operators will need to remove any IAM roles granted directly to those users in silos and projects. | |