Power Fault Detection and Management
The rectifiers in the Oxide rack can detect power faults and power supply changes in different parts of the system, proactively charge and discharge power supply rails, and transition between power states for safe rack shutdown and cold start. No custom configuration is required, and no data loss is expected as a result of a power outage.
The remote monitoring unit (RMU) in the Power Shelf Controller (PSC) is designed to monitor rack-level power consumption and PSU state of health and to expose that information to the control plane. These capabilities will enable more advanced power management and monitoring features in future releases of the Oxide rack.
Understanding Status and Fault LEDs
All LEDs in the system are monochrome LEDs with three possible modes:
Mode | Description |
---|---|
Solid On | device or component is functioning properly |
Solid Off | device is not present, incorrectly inserted, or so mechanically broken that it cannot function |
Blinking | device needs attention or is being worked on |
A common cause of a “Solid Off” LED is a hot-serviceable component, such as a server sled, optical transceiver, or U.2 NVMe device, that is not properly inserted and therefore cannot function.
Blinking is a combined service and fault indicator that signals which device is to be operated on. Here is an example of how the LED signal works in the case of an SSD replacement:
If the LED is “Solid On” afterwards, the device is operational and the service has been completed.
If the LED remains “Solid Off”, the SSD is either not fully inserted or so broken that it cannot be powered on or detected at all.
If a new fault arises, the LED will blink again.
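The LED outcomes after a replacement can be read as a simple three-way decision. The sketch below is illustrative only: `LedMode` and `interpret_after_service` are hypothetical names, not part of any Oxide software, and the returned strings paraphrase the mode table and the steps above.

```python
from enum import Enum


class LedMode(Enum):
    """The three modes a monochrome status LED can show (hypothetical type)."""
    SOLID_ON = "Solid On"
    SOLID_OFF = "Solid Off"
    BLINKING = "Blinking"


def interpret_after_service(mode: LedMode) -> str:
    """Return a reading of the LED once a replacement SSD has been inserted.

    Hypothetical helper: the wording paraphrases the documentation and does
    not come from an Oxide API.
    """
    if mode is LedMode.SOLID_ON:
        return "operational: service complete"
    if mode is LedMode.SOLID_OFF:
        return "not fully inserted, or too broken to be powered on or detected"
    # Blinking is the only remaining mode.
    return "fault: device needs attention"


if __name__ == "__main__":
    for mode in LedMode:
        print(f"{mode.value}: {interpret_after_service(mode)}")
```

Modeling the modes as an enumeration keeps the mapping exhaustive: any LED reading has to be one of the three documented modes.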
Hardware Maintenance
In the initial version of the Oxide rack, all field-replaceable units (FRUs) are serviced through an Oxide technical support engagement. Commonly serviced units are designed to be hot-pluggable. In future releases, certain FRUs will become customer-replaceable units (CRUs).
Here is a list of FRUs serviceable by Oxide:
Encompassing FRU | Component FRU | Hot-pluggable? |
---|---|---|
Gimlet | - | Yes |
Gimlet | Gimlet U.2 Drive/Carrier | Yes |
Gimlet | Gimlet Front Panel/Drive Cage | No |
Gimlet | Air Shroud | No |
Gimlet | DIMMs | No |
Gimlet | M.2 Device | No |
Gimlet | M.2 Heatsink | No |
Gimlet | CPU | No |
Gimlet | CPU Heatsink | No |
Gimlet | Gimlet Individual Fans | No |
Gimlet | Gimlet Storage Midplane (“Sharkfin”) | No |
Sidecar | - | Yes |
Sidecar | Sidecar Optical Transceiver (Single) | Yes |
Sidecar | Sidecar Rear Fan | Yes |
Sidecar | 4:1 Squid Cables | Yes |
Sidecar | Sidecar to Gimlet PCIe Cable | Yes |
Sidecar | PSC to Sidecar Cables | Yes |
Sidecar | Sidecar to Sidecar Aux Cable | Yes |
Sidecar | Sidecar Internal Cables | No |
Power Shelf | - | No |
Power Shelf | Power Shelf Rectifier | Yes |
Power Shelf | Power Shelf Controller | Yes |
Power Shelf | Fiber Optic Cables | Yes |
Power Shelf | Power Shelf Adapter Kit | No |
Power Shelf | Power Shelf Bus Bar Connector | No |
Power Shelf | Power Shelf Whip Adapters | No |
Power Shelf | Backplane 4:1 Squid Cables | No |
Power Shelf | Sidecar to Gimlet PCIe Cable | No |
Power Shelf | Core Rack | No |
Power Shelf | Bus Bar | No |
Power Shelf | Front and Rear Doors | Yes |
Power Shelf | Side Panels | Yes |
Power Shelf | Gimlet Cubby | No |
Power Shelf | Gimlet Blank | Yes |
Seismic Kit | - | Yes |
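For planning or tooling purposes, the table above can be treated as a lookup keyed by the encompassing FRU and the component FRU. The sketch below copies only a handful of rows from the table. The `HOT_PLUGGABLE` mapping and the `is_hot_pluggable` function are hypothetical names used for illustration, not an Oxide interface.

```python
# A few rows copied from the FRU table; the dictionary layout is an
# assumption made for this example, not an Oxide data format.
HOT_PLUGGABLE = {
    ("Gimlet", "Gimlet U.2 Drive/Carrier"): True,
    ("Gimlet", "DIMMs"): False,
    ("Gimlet", "CPU"): False,
    ("Sidecar", "Sidecar Rear Fan"): True,
    ("Sidecar", "Sidecar Internal Cables"): False,
    ("Power Shelf", "Power Shelf Rectifier"): True,
}


def is_hot_pluggable(encompassing_fru: str, component_fru: str) -> bool:
    """Look up whether a component is listed as hot-pluggable.

    Raises KeyError for components that are not in the excerpt, so a
    missing row is never silently treated as "not hot-pluggable".
    """
    return HOT_PLUGGABLE[(encompassing_fru, component_fru)]


if __name__ == "__main__":
    print(is_hot_pluggable("Gimlet", "Gimlet U.2 Drive/Carrier"))
    print(is_hot_pluggable("Sidecar", "Sidecar Internal Cables"))
```

Raising an error for unknown components is a deliberate choice in this sketch: guessing a default in either direction could lead to an unsafe service action.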