Unencrypted Control Plane datastores in Oxide software

advisories

The ZFS datasets which contain the Oxide Control Plane datastores, including CockroachDB, ClickHouse timeseries database and Crucible file systems, did not have the encryption setting correctly configured prior to version 6 of the Oxide software.

The issue allows attackers who have physical access to the Oxide Rack to remove physical disks and potentially obtain the Control Plane data by examining the content of unencrypted datasets in the disks, locating one of the CockroachDB datasets, and reading the database tables with software capable of parsing the raw data. The information may in turn be used for further exploits such as:

  • Stealing another user’s device token and impersonating the user to access their VM instances

  • Using the Crucible volume encryption keys to decrypt the virtual disk data

This issue is fixed in version 6 of the Oxide software; we recommend customers upgrade as soon as possible.

Revision History
Revision  Date (YYYYMMDD)  Changes
1.2       20241212         Add CVE identifier and CVSS vector
1.1       20240409         Indicate affected versions and fix availability in summary
1.0       20240325         Add "Technical Background" section
0.1       20240118         Initial Release

Impacted Products

Oxide software releases version 5 and earlier.

Impact

An attacker who has physical access to the Oxide Rack and knowledge of the Control Plane databases may be able to access the data on the physical storage devices. They may be able to view and modify system data such as user device tokens, rack configurations, and VM instance and disk metadata. They may also be able to examine virtual disk data by decrypting it with the application-level encryption keys stored in CockroachDB. The issue can manifest as unauthorized use of VM instances and unauthorized access to sensitive data stored on the disks.

Oxide calculates a CVSS v3.1 base score of 5.7 (Medium). Vector: CVSS:3.1/AV:P/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:N
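The base score can be re-derived from the vector string. The following is a minimal sketch of the CVSS v3.1 base-score equations, using the metric weights published in the CVSS v3.1 specification. The function and variable names are illustrative assumptions and are not part of any Oxide tooling.

```python
import math

# Metric weights from the public CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined by the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    # Skip the 'CVSS:3.1' prefix and map each metric name to its value.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    exploitability = (
        8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    )
    iss = 1 - (
        (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))
```

Calling `base_score("CVSS:3.1/AV:P/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:N")` reproduces the 5.7 stated above. The physical attack vector (AV:P) and high attack complexity (AC:H) keep the score in the Medium range despite high confidentiality and integrity impact.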

Action Required

Update the Oxide software to version 6.

Mitigations

There are no mitigations available besides enforcing controls against unauthorized physical access to the rack.

Technical Background

Oxide provides encryption at rest for both customer data and Oxide control plane data. The goal of such encryption is to prevent offline attacks, such that a casual attacker with physical access to the rack cannot steal a subset of disks or sleds and recover sensitive information from them. Unique encryption keys per U.2 drive are derived from a shared rack secret, which is computed at each sled via an implementation of Shamir secret sharing. In order to reconstruct the rack secret and derive encryption keys, at least a threshold, t, of sleds must participate in an online distributed algorithm. During installation, t = N/2 + 1, where N is the number of sleds in a rack. Concretely, for a 16 sled rack, t = 9, whereas for a 32 sled rack, t = 17. If fewer than t sleds can participate, then nothing can be learned about the rack secret and no encryption keys can be derived. The rack secret is split into distinct key shares, which are stored on the M.2 drives of each of the sleds and are not immediately available to steal without taking a sled out of the rack. At least t M.2 drives would have to be stolen in order for an attacker to reconstruct the rack secret, derive encryption keys, and then access data on stolen U.2 drives. This is sufficient to prevent the casual theft of a few sleds or drives from being useful to an attacker.
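The threshold property can be illustrated with a toy Shamir secret sharing scheme over a prime field. This is a sketch for intuition only: it is not Oxide's implementation, the field size and all function names are assumptions made for the example, and it omits the online distributed protocol and key derivation entirely.

```python
import secrets

# Toy field modulus (the Mersenne prime 2^127 - 1); an assumption for the example.
PRIME = 2**127 - 1


def threshold(n_sleds):
    """t = N/2 + 1 using integer division, as described in the text."""
    return n_sleds // 2 + 1


def split_secret(secret, n, t):
    """Split `secret` into n shares; any t of them can reconstruct it."""
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `shares = split_secret(s, 16, threshold(16))`, any 9 of the 16 shares passed to `reconstruct` return `s`. Interpolating fewer than 9 shares yields an unrelated field element, which is why stealing a few M.2 drives reveals nothing about the rack secret.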

Inside an Oxide rack, customer and Oxide control plane data are stored inside ZFS datasets. Each ZFS dataset has a path, which dictates its location in a hierarchy; this path corresponds to the filesystem mount path, but is distinct from it. There are up to 10 U.2 disks on each sled in the rack, and each of these has an encrypted root dataset at path oxp_<UUID>/crypt, where <UUID> is distinct per U.2 disk. ZFS allows child datasets to inherit encryption, such that if the child dataset resides under the path of an encrypted filesystem, the child dataset is also encrypted using the same key. By default, encryption is inherited, and so a dataset such as oxp_<UUID>/crypt/zone/cockroachdb would be encrypted. However, since we only encrypt under our crypt root, a dataset not under that hierarchy, such as oxp_<UUID>/cockroachdb, will not be encrypted via ZFS dataset encryption. Keys used for ZFS encryption are derived from the Oxide rack secret as discussed above.
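The path rule above can be expressed as a small predicate. This sketch only models the default inheritance described in the text, based on the dataset path. The function name is hypothetical, and a real check would need to query ZFS itself rather than trust the path.

```python
def inherits_crypt_encryption(dataset):
    """Return True if the dataset path lies at or under a pool's `crypt` root.

    Assumes the layout described in the text: the pool is named oxp_<UUID>
    and the encrypted root is its direct child named `crypt`.
    """
    parts = dataset.split("/")
    return len(parts) >= 2 and parts[0].startswith("oxp_") and parts[1] == "crypt"
```

For example, `inherits_crypt_encryption("oxp_<UUID>/crypt/zone/cockroachdb")` is true, while `inherits_crypt_encryption("oxp_<UUID>/cockroachdb")` is false, which is the situation the affected persistent datasets were in.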

We have two primary types of datasets: ephemeral "zone" datasets, which act as the root filesystems of illumos zones and may be recreated across reboots, and persistent "data" datasets, which store data associated with the zone, including database state. On launch of a zone, the "data" datasets get mounted into the zone and are used to store customer and Oxide control plane specific data. While the ephemeral zone datasets had paths under our crypt root (oxp_<UUID>/crypt) and therefore inherited the encryption property, our persistent data datasets were created outside of the crypt hierarchy, and therefore were not encrypted. Unfortunately, persistent datasets contain the most critical data of the Oxide rack, such as CockroachDB data, debug data, ClickHouse data, and DNS data. None of this data was actually encrypted because all of these datasets resided outside the crypt hierarchy. It should be noted that while Crucible datasets live outside the crypt hierarchy, they are encrypted with a different, application-level mechanism and remained encrypted. Unfortunately, the Crucible encryption keys were stored in CockroachDB datasets, which were unencrypted.

By upgrading to at least Oxide release version 6, existing deployments will have their unencrypted datasets migrated into encrypted datasets at boot time, before usage. New deployments will only create encrypted datasets. As an additional safety measure, any time a dataset is mounted into a zone, the zone will check to see if it is encrypted and error if it is not. This allows fail-fast identification of problems during both development and customer usage.
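The fail-fast check at mount time can be sketched as follows. This is not Oxide's code: the names are hypothetical, and the sketch assumes the caller has already obtained the value of the ZFS `encryption` property for the dataset (for instance as printed by `zfs get -H -o value encryption <dataset>`), where `off` means the dataset is not encrypted. Treating an empty or `-` value as a failure is a conservative assumption of this example.

```python
class UnencryptedDatasetError(Exception):
    """Raised when a dataset about to be mounted is not encrypted."""


def ensure_encrypted(dataset, encryption_value):
    """Return the encryption algorithm name, or raise if there is none.

    `encryption_value` is the raw value of the ZFS `encryption` property.
    """
    value = encryption_value.strip()
    # "off" means no encryption; empty or "-" is rejected conservatively
    # (an assumption of this sketch, not documented Oxide behavior).
    if value in ("", "off", "-"):
        raise UnencryptedDatasetError(f"dataset {dataset} is not encrypted")
    return value
```

Raising an error before the dataset is used, rather than logging and continuing, is what makes a misconfiguration visible immediately instead of after data has been written in the clear.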

Additional Information

There is no additional information at this time.

Jan 18 2024

SSRF in Oxide software

A server-side request forgery (SSRF) issue exists in version 4 and older of the Oxide software. This issue allows authenticated users with the collaborator role on a project or silo to send HTTP HEAD and GET requests to arbitrary URLs on the underlay network, the primary physical rack-internal IPv6 network where services internal to the system communicate.

This issue allows such users to send arbitrary queries to the metrics service if the internal underlay network address for the ClickHouse service is known. The responses to those queries cannot be read, but such users can still tamper with metrics data and prevent metrics APIs from working.

This issue also potentially allows users to read from a limited set of debugging endpoints served by CockroachDB if the internal underlay network address is known and the endpoint meets certain requirements. The set of requirements necessary for information disclosure rules out nearly all HTTP endpoints served by CockroachDB, but Oxide is unable to rule out all endpoints at this time.

This issue is fixed in version 5 of the Oxide software; we recommend customers upgrade as soon as possible.

Revision History
Revision  Date (YYYYMMDD)  Changes
1.0       20231215         Initial Release

Impacted Products

Oxide software releases version 4 and earlier.

Impact

An authenticated user with the collaborator role on a project or silo can trigger HTTP HEAD and GET requests to arbitrary URLs on the underlay network. This can manifest as a loss of integrity and availability of metrics data, and can possibly manifest as an information disclosure of debugging information for the primary database.

Oxide calculates a CVSS v3.1 base score of 4.9 (Medium). Vector: CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:C/C:N/I:L/A:L

Action Required

Update the Oxide software to version 5.

Mitigations

There are no mitigations available.

Technical Background

In version 4 and older of the Oxide software, a user with the collaborator role on a project or silo could create an image with the url source via the image_create endpoint, or import blocks from a URL into a disk via the disk_import_blocks_from_url endpoint. These were used during development prior to the ability to create an image from a snapshot and to bulk-write disk blocks via the disk_bulk_write_import endpoint. Version 5 of the Oxide software removes the url image source and the disk_import_blocks_from_url endpoint.

When a user creates an image with the url source, Nexus (the Oxide API server) makes an HTTP HEAD request to the user-provided URL to query the Content-Length of the response to determine the image size. If the response is successful (not HTTP 4xx or 5xx), and contains a Content-Length header that is divisible by the block size (any of 512, 2048, or 4096), Nexus considers the URL a valid image source, and the image is created. A user can then create a disk from the image and issue reads to it from an instance, which cause Propolis (the hypervisor userspace) to make HTTP GET requests with the Range header set; if the response is successful and has the requested Content-Length based on the Range header, that data is sent to the instance to complete the read operation.
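The acceptance rule and the subsequent ranged reads can be sketched as below. The function names are hypothetical and this is not the Nexus or Propolis code. The mapping of one block read to one `Range` request is also an assumption made for illustration; the text only states that GET requests carry a Range header.

```python
VALID_BLOCK_SIZES = (512, 2048, 4096)


def is_valid_image_source(status, content_length, block_size):
    """Apply the HEAD-response rule described above.

    `content_length` is the parsed Content-Length header, or None if the
    response did not include one.
    """
    if block_size not in VALID_BLOCK_SIZES:
        raise ValueError(f"unsupported block size: {block_size}")
    if status >= 400:  # HTTP 4xx or 5xx is not a successful response
        return False
    if content_length is None:
        return False
    return content_length % block_size == 0


def range_header_for_block(block_index, block_size):
    """Build the Range header value covering a single block (illustrative)."""
    start = block_index * block_size
    end = start + block_size - 1  # HTTP byte ranges are inclusive
    return f"bytes={start}-{end}"
```

Under this rule, an endpoint that answers HEAD without a Content-Length header, or with a length that is not a multiple of the chosen block size, is never accepted as an image source, so its response body cannot be read through a disk.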

The above process is similar for the disk_import_blocks_from_url endpoint, except that the Crucible pantry (part of the storage subsystem that allows modification of disks while no instance is running) makes the HTTP HEAD request, and that the Crucible pantry immediately makes HTTP GET requests with the Range header set until all data is read or an error condition occurs. The same requirements are necessary for the operation to succeed; if successful, the user can attach the disk to an instance and issue reads, which do not create additional HTTP requests to the URL. If the operation was unsuccessful, any partially-read data is available on the disk.

Because Nexus, Propolis, and the Crucible pantry are on the underlay network, they can make HTTP requests to any service on that network. This allows a user to send HTTP HEAD requests to arbitrary URLs on the underlay network; if the response to a HEAD request is successful and contains a Content-Length header divisible by the block size, and responses to GET requests with the Range header have a body with the requested length divisible by the block size, the response data can be viewed by the user.

Additionally, Nexus has access to networks outside the system, to allow users to access it and to allow it to communicate with SAML providers. This allows a user to send HTTP HEAD requests to arbitrary URLs on networks routable from the system. These requests are performed without DNS resolution, so a user must specify an IP address. GET requests routed outside the system via this issue are not possible, as services making GET requests only run on the underlay network.

HTTP services on the underlay network are described in the following sections.

ClickHouse

ClickHouse is used by the Oxide software to store metrics. It offers a SQL-like query interface over HTTP, allowing queries to be passed in via URL query parameters or a POST request body.

The query interface responds to HEAD requests, but with no Content-Length header, so reading from the query interface is not possible. HEAD requests to this interface can run arbitrary SQL queries; thus it is possible for a user with knowledge of the underlay network address of the HTTP interface to tamper with data stored in ClickHouse, either by modifying or deleting metric data or by preventing the storage and querying of metrics altogether.

The rest of the system’s availability is not affected if either ClickHouse or the metrics system is unavailable.

CockroachDB

CockroachDB is used by the Oxide software to store all data except metrics and data stored on disks. Its primary interface is a PostgreSQL-compatible protocol, but by default it starts an HTTP console on the same IP address. Prior to version 5 of the Oxide software, this console was accessible via the underlay network.

Most endpoints served by the CockroachDB console respond 405 Method Not Allowed to HEAD requests, or require a valid session cookie. The endpoints that successfully respond to HEAD requests provide a very limited subset of database metrics and health information, assets for the console UI, and debug endpoints that we do not expect can provide database contents via GET requests alone. For data to be read from any of these endpoints, the Content-Length must be divisible by a valid user-selected block size (512, 2048, or 4096), and the endpoint must support Range requests.

Oxide has evaluated the HTTP endpoints served by CockroachDB and could not identify any which meet these requirements, and we are not confident that this issue could successfully be used for information disclosure; however, we cannot completely rule out the possibility at this time.

Oxide-authored HTTP services

The Oxide software consists of many HTTP services authored by Oxide. These services use a common HTTP library named Dropshot. Dropshot does not automatically handle HTTP HEAD requests, and no endpoints provided by any Oxide-authored HTTP services handle HEAD requests. There is no impact to these services.

Additional Information

There is no additional information at this time.

Dec 15 2023

AMD RAS Poisoning (Inception)

Researchers from ETH Zurich have discovered a novel Spectre V2 variant that impacts AMD Zen 1 through Zen 4 CPUs via the CPU’s return address predictor. Researchers have called this attack 'Inception'. AMD calls this Speculative Return Stack Overflow and has titled this AMD-SN-7005 Return Address Predictor Security Notice.

A given hardware thread may poison the branch predictors in the CPU core and use that to trick the CPU into accessing information in another context. Oxide Gimlet servers are impacted by this vulnerability. Oxide is reviewing the guidance and expected performance impact from AMD and is awaiting publication of the research paper, expected on Aug 9, 2023, so that we can further analyze this vulnerability and provide additional guidance.

Revision History
Revision  Date (YYYYMMDD)  Changes
1.0       20230808         Initial Release

Impacted Products

This impacts all Gimlet Compute Sleds.

Impact

An attacker in a virtual machine may be able to use a cache-timing side channel attack to read privileged information from the hypervisor or from the guest operating system’s kernel that is executing on the same thread or core as the attacker.

Action Required

There is no immediate action required for Oxide systems. We will provide additional updates as we better understand this situation. Any mitigations will likely be part of the next software update. Oxide is evaluating the isolation of indirect branch predictors as well as reviewing what is required to ensure that guest virtual machines will be able to properly understand this based on information that will be released by the Linux kernel project and Microsoft.

The required microcode revision for Oxide’s Gimlet systems (0x0a0011d1) was already a part of Oxide Software Release 1.0.1.

Mitigations

Reducing the number of untrusted workloads may reduce the chance of this attack being executed. The primary means of exploitation in Oxide products is through hardware virtual machine guests that are created through the API.

Technical Background

As part of optimizing common function calls and branches in the processor, the hardware maintains what it calls a Return Address Stack. When a processor executes the call instruction, it pushes the return address onto the operating system stack and makes a note of it inside the processor itself. The processor-internal structure is called the Return Address Stack (RAS). This is used because most subroutines/function calls will return to whence they came, allowing the CPU to optimize this case. The RAS contains the full address of the target. Only addresses that are valid in the current address space (based on the %cr3 register) can enter the RAS.
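The behavior of a return address stack can be modeled in a few lines. This is a conceptual toy, not a description of AMD's hardware: the depth and the overflow policy are assumptions chosen for the example.

```python
class ReturnAddressStack:
    """Toy model of a per-thread return address predictor."""

    def __init__(self, depth=32):
        self.depth = depth  # assumed capacity; real depths are CPU-specific
        self.entries = []

    def on_call(self, return_address):
        """Record the return address when a call instruction executes."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # on overflow, the oldest entry is lost
        self.entries.append(return_address)

    def predict_return(self):
        """Predict the target of a return; None if nothing is recorded."""
        return self.entries.pop() if self.entries else None
```

As long as calls and returns are balanced, each `predict_return` yields the address recorded by the matching `on_call`. A prediction is only wrong when the structure has been disturbed, which is the condition the attack described below sets out to create.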

In addition to the RAS, the CPU employs what is known as a branch target buffer (BTB). The CPU caches the location that a branch instruction will likely go to as part of its speculative execution engine. Put differently, when the CPU is scanning ahead and sees a branch instruction, it will guess on what path the program will jump to and speculatively execute assuming that is correct. It will then later come back and determine whether or not that is actually correct.

The AMD BTB design found in Zen family CPUs does not include the full instruction pointer; instead, it includes a subset of the address bits. This means that an entry in the BTB may alias several different virtual addresses in the CPU. Due to the aliasing, it is possible to cause confusion inside the RAS and cause the CPU to mispredict a return instruction.
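Aliasing follows directly from indexing a table by only part of an address. The toy model below uses the low 12 bits as the tag, which is purely an assumption for illustration; the actual bits and any hashing used by Zen CPUs are not described in this advisory.

```python
TAG_BITS = 12  # hypothetical; real hardware uses a different, undocumented function


def btb_tag(address, tag_bits=TAG_BITS):
    """Reduce a full virtual address to the subset of bits kept by the toy BTB."""
    return address & ((1 << tag_bits) - 1)


class BranchTargetBuffer:
    """Toy BTB keyed by a partial address, so distinct addresses can collide."""

    def __init__(self):
        self.entries = {}

    def train(self, branch_address, target):
        self.entries[btb_tag(branch_address)] = target

    def predict(self, branch_address):
        return self.entries.get(btb_tag(branch_address))
```

In this model, training a branch at `0x7f0000000abc` also determines the prediction for a branch at `0xffff800012345abc`, because both addresses reduce to the same tag. One context can therefore influence predictions made for an unrelated address in another context.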

The mispredicted return, when combined with a suitable gadget, can then cause the CPU to leave a side effect in the cache that may be observable. The RAS exists on a per-thread basis. That is, each hardware thread has an independent return address stack. However, depending on the CPU configuration, the BTB may be shared between hardware threads in the core. This leads to some theoretical difficulties in performing the attack. We expect this to become clearer when the research paper is made available.

AMD is releasing a microcode update for AMD Zen 3 and Zen 4 systems that will change the behavior of the Indirect Branch Prediction Barrier (IBPB) MSR, which is used to flush out older indirect branch predictions. Microcode updates are not required for AMD Zen 1 and 2 systems. Existing CPU microcode on Zen 1 and 2 systems performs all necessary flushes on an IBPB. According to AMD, proper use of IBPB, when combined with Single Thread Indirect Branch Predictors (STIBP), can significantly mitigate this attack.

Aug 8 2023