
Troubleshooting LSI RAID issues in Proxmox with CLI

Managing LSI RAID in Proxmox: Using CLI Tools for Optimal Control

I had a Proxmox host with an LSI-based controller where a pulled drive did not auto-rebuild when I reinserted it. The kernel had loaded the megaraid_sas driver, but no vendor CLI tools were installed. After installing the vendor utilities, I used them to inspect controller settings, start a rebuild and confirm drive health. The steps below are what worked for me with Proxmox LSI RAID.

What you see

Symptoms are blunt and repeatable. The controller reports the array as degraded, while the OS still sees the disks the controller presents. Example user reports include: “it wont autorebuild when removing a drive and then a few moments later reinsert it” and front-panel drive LEDs not lighting despite the drive being part of a RAID array.

Look in these places for concrete signs.

  • Proxmox GUI: VM or storage warnings, datastore shown as degraded.
  • dmesg/syslog: look for the megaraid_sas driver messages or disk errors.
  • Controller CLI (when available): shows logical drive as degraded, missing, or foreign.

Useful quick checks

  • Check kernel driver and controller presence:
    • Command: lsmod | grep megaraid_sas
    • Expected: a line with “megaraid_sas”
    • If absent: the driver is not loaded.
    • Command: lspci -nnk | grep -A3 -i raid
    • Expected: an LSI/SAS controller entry with “Kernel driver in use: megaraid_sas”
  • Check logs:
    • Command: dmesg | grep -i megaraid
    • Look for firmware events, device resets or failed commands.
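
The driver check above can be scripted so it also works on saved output. This is a minimal sketch: kernel_driver_in_use is a hypothetical helper name, and the parsing assumes the “Kernel driver in use:” line that lspci -k prints under each device and a device description containing the word “RAID”.

```shell
#!/bin/sh
# Print the kernel driver bound to each PCI device whose description mentions RAID.
# Reads `lspci -nnk` output on stdin, so it can be tried on saved output too.
# kernel_driver_in_use is a hypothetical helper name, not part of any package.
kernel_driver_in_use() {
    awk '
        # Device header lines start with the bus address (hex digits).
        /^[0-9a-f]/ { is_raid = (tolower($0) ~ /raid/) }
        # Detail lines are indented below their header.
        is_raid && /Kernel driver in use:/ { print $NF }
    '
}

# Usage on the host:
#   lspci -nnk | kernel_driver_in_use   # this article expects megaraid_sas
#   lsmod | grep megaraid_sas           # is the module loaded?
#   dmesg | grep -i megaraid            # firmware events, resets, failed commands
```

A controller flashed to HBA mode may be listed without “RAID” in its description, in which case this filter prints nothing and the raw lspci output is the better guide.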

If the disks are visible to the OS but the controller reports a degraded array, the controller configuration likely governs rebuild behaviour.

Where it happens

This is hardware RAID under Proxmox, not mdadm. Proxmox LSI RAID typically uses the megaraid_sas kernel module to present the adapter and its logical drives to the OS. The RAID state and rebuild policy live in the controller firmware. That means the OS can show disks, but the controller may refuse to auto-rebuild depending on its settings.

Controller settings to check

  • Auto-rebuild policy: on some LSI firmware this can be disabled.
  • Hot-spare assignment and drive state: a drive can be marked as “unconfigured good”, “foreign”, or “online”.
  • Migration and patrol read settings: these affect rebuild timing and LED behaviour.

CLI tool availability

  • Proxmox does not always ship vendor CLI tools like storcli or MegaCLI by default. If they are missing, install the vendor package from Broadcom/LSI or copy it to the host if offline. The kernel driver can be present while utilities are not.
  • storcli and MegaCLI expose the controller state and let you start or monitor rebuilds from the OS rather than the BIOS.

Reference: a discussion of missing CLI tools and the megaraid_sas driver on a fresh Proxmox install is available here: https://reddit.com/r/Proxmox/comments/1pnw2cv/missing_clitools_for_lsi_raidadapters/ and Broadcom’s support pages list StorCLI as the vendor tool for MegaRAID controllers: https://www.broadcom.com/support/download-search

Find the cause

I follow a short checklist that narrows the problem to controller policy, physical fault, or missing tools.

1) Map host devices to controller objects

  • Command: storcli /c0 show
    • Expected: controller present and firmware version.
    • If storcli is missing, use: megacli -AdpAllInfo -aAll or check /proc/scsi/scsi for device mapping.
  • Command: storcli /c0/vall show
    • Expected: logical volumes listed with their state (storcli abbreviates Optimal as “Optl” and Degraded as “Dgrd”).
  • If vendor tools are not present, you will only see SCSI block devices in the OS. That can hide controller-level flags like “foreign” or “no rebuild”.
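
To pick out only the logical drives that need attention, the table from storcli /c0/vall show can be filtered. A minimal sketch, assuming the usual VD LIST layout where the first three columns are DG/VD, TYPE and State; degraded_vds is a hypothetical helper name.

```shell
#!/bin/sh
# List logical drives whose state is anything other than Optl (Optimal).
# Reads the text output of `storcli /c0/vall show` on stdin.
# degraded_vds is a hypothetical helper; the column order (DG/VD TYPE State ...)
# is an assumption about the VD LIST table and may differ between versions.
degraded_vds() {
    awk '$1 ~ /^[0-9]+\/[0-9]+$/ && $3 != "Optl" { print $1, $2, $3 }'
}

# Usage on the host:
#   storcli /c0/vall show | degraded_vds
```

No output means every logical drive on the controller reported Optl.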

2) Check physical drive state

  • Command: storcli /c0/eall/sall show
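    • Expected: each slot shows state “Onln” (Online), or “UGood” (Unconfigured Good) if not assigned.
  • Alternative: smartctl -a -d megaraid,N /dev/sdX to confirm SMART health of an individual drive. N is the drive’s device ID (the DID column in storcli); behind a MegaRAID controller, plain smartctl -a /dev/sdX queries the logical drive rather than the physical disk.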
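
The same filtering idea works for physical drives. A minimal sketch, assuming the usual PD LIST layout where the first three columns are EID:Slt, DID and State; pds_not_online is a hypothetical helper name. It also prints the DID, which is the number smartctl expects after -d megaraid,.

```shell
#!/bin/sh
# List physical drives whose state is not Onln (Online).
# Reads the text output of `storcli /c0/eall/sall show` on stdin.
# pds_not_online is a hypothetical helper; the column order (EID:Slt DID State ...)
# is an assumption about the PD LIST table and may differ between versions.
# Prints: enclosure:slot, device ID, state.
pds_not_online() {
    awk '$1 ~ /^[0-9]*:[0-9]+$/ && $3 != "Onln" { print $1, $2, $3 }'
}

# Usage on the host:
#   storcli /c0/eall/sall show | pds_not_online
```

Hot spares and unassigned drives also show up here, since they are legitimately not Onln; the point is to see every slot that is not an active array member.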

3) Inspect driver and kernel logs

  • Command: dmesg | grep -iE 'megaraid|raid|error|failed'
    • Expected: occasional info messages for normal operation.
    • Problem indicators: repeated firmware events, I/O errors to a physical disk, or controller timeouts.

4) Check BIOS/firmware settings

  • Boot into the controller BIOS (Ctrl+R or Ctrl+H on many LSI-based boards, or the vendor key) and verify:
    • Auto-rebuild enabled.
    • Foreign configuration handling (usually “Import” vs “Clear”).
    • Hot-spare behaviour and global hot-spare settings.

Root causes I’ve seen

  • Auto-rebuild disabled in firmware.
  • Drive reinserted marked as “Foreign configuration” and not automatically imported.
  • Drive marked “Unconfigured Good” but not added to the VD due to rebuild policy.
  • Missing CLI tools, so I could not view or change controller flags from the OS.

Fix

Get the vendor CLI on the host, inspect the controller, and change the policy or force a rebuild where appropriate. Be deliberate. Hardware RAID commands can be destructive if used incorrectly.

Install the CLI tools

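  • If online, download storcli from Broadcom for your controller and install it. The download usually bundles RPM and Debian packages; Proxmox is Debian-based, so install the .deb with dpkg -i. If offline, download on another machine and scp the package to the host. The package commonly installs the binary as /opt/MegaRAID/storcli/storcli64, which is not on PATH by default.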
  • Verify installation:
    • Command: storcli /c0 show
    • Expected: controller information.
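
If storcli /c0 show fails with “command not found” after installation, the binary is probably just not on PATH. A minimal sketch for locating it: find_storcli and the STORCLI_DIR variable are hypothetical names, and /opt/MegaRAID/storcli is assumed as the install directory because that is where the Broadcom package commonly puts storcli64.

```shell
#!/bin/sh
# Locate the StorCLI binary and print its path.
# find_storcli and STORCLI_DIR are hypothetical names used for illustration.
# /opt/MegaRAID/storcli is an assumed default install directory.
find_storcli() {
    dir=${STORCLI_DIR:-/opt/MegaRAID/storcli}
    # Try the install directory first, then whatever is on PATH.
    for candidate in "$dir/storcli64" "$dir/storcli" storcli64 storcli; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    echo "storcli not found; install the Broadcom package first" >&2
    return 1
}

# Usage on the host:
#   "$(find_storcli)" /c0 show
#   ln -s "$(find_storcli)" /usr/local/sbin/storcli   # optional: put it on PATH
```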

Common corrective actions with storcli / MegaCLI

  • Check the rebuild setting and enable auto-rebuild if disabled.
    • storcli example: storcli /c0 show autorebuild, then storcli /c0 set autorebuild=on (exact subcommands vary by version; consult the StorCLI guide on Broadcom).
  • If the drive is “Unconfigured Good”, put it back into the array or make it a spare:
    • Example: storcli /c0/eE/sS insert dg=D array=A row=R (E enclosure, S slot; D, A and R come from the missing-drive entry in storcli /c0/dall show), or storcli /c0/eE/sS add hotsparedrive to assign it as a hot spare.
  • Force a rebuild when safe:
    • Example commands vary with firmware. If uncertain, import the foreign config first:
    • storcli /c0/fall show
    • storcli /c0/fall import
    • Then start the rebuild if the controller does not auto-start it: storcli /c0/eE/sS start rebuild
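
Because these commands change controller state, it helps to print the sequence for review before running anything. A minimal dry-run sketch: rebuild_plan is a hypothetical helper that only echoes commands, and the storcli object paths (/cN/fall, /cN/eE/sS) follow the StorCLI reference syntax as I understand it, so check them against the guide for your version.

```shell
#!/bin/sh
# Print (do not run) the storcli commands for a foreign import plus rebuild.
# rebuild_plan is a hypothetical helper; arguments are controller, enclosure, slot.
# Nothing here touches the controller: copy the lines out once you have reviewed them.
rebuild_plan() {
    if [ "$#" -ne 3 ]; then
        echo "usage: rebuild_plan CONTROLLER ENCLOSURE SLOT" >&2
        return 2
    fi
    ctl=$1 enc=$2 slot=$3
    echo "storcli /c$ctl/fall show"                   # 1. is there a foreign config?
    echo "storcli /c$ctl/fall import"                 # 2. import it if it is yours
    echo "storcli /c$ctl/e$enc/s$slot show"           # 3. confirm the drive state
    echo "storcli /c$ctl/e$enc/s$slot start rebuild"  # 4. start the rebuild
    echo "storcli /c$ctl/e$enc/s$slot show rebuild"   # 5. check progress
    # If step 3 shows UGood and the drive is not part of the array, it has to be
    # inserted into the missing row (insert dg=... array=... row=...) before step 4.
}

# Usage:
#   rebuild_plan 0 252 1
```

Keeping the plan as plain text makes it easy to paste into a change ticket or runbook before touching the array.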

If you only have MegaCLI

  • Use:
    • megacli -LDInfo -Lall -aAll to list logical drives.
    • megacli -PDList -aAll to list physical drives.
    • megacli -CfgForeign -Scan -aAll to see foreign configurations.
    • megacli -CfgForeign -Import -aAll to import a foreign configuration and megacli -PDRbld -Start -PhysDrv [E:S] -aN to start a rebuild; the syntax is older and less friendly than storcli.

Rebuilding a logical drive

  • Rebuilds occur on the controller. Start only when the physical drive is confirmed healthy.
  • Verify the drive is not failing at SMART level before rebuild.

If the controller refuses to rebuild

  • Check for firmware bugs and consider updating firmware if vendor notes match your behaviour.
  • If drive LEDs do not light but controller marks the drive online, check backplane cabling and LED connectors.

Check it’s fixed

Verification must be explicit. I use the vendor CLI and host tools together.

1) Confirm logical drive state

  • Command: storcli /c0/vall show
    • Expected: state moves from Dgrd (Degraded) to Optl (Optimal) once the rebuild completes.
  • Alternative: megacli -LDInfo -Lall -aAll

2) Monitor rebuild progress

  • Command: storcli /c0/eall/sall show rebuild (rebuild progress is reported per physical drive, not per logical drive)
    • Expected: progress percentage increments until complete.
  • Also watch dmesg for repeated errors while rebuilding.
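
For unattended monitoring, the progress table can be reduced to just the drives that are actually rebuilding. A minimal sketch, assuming the usual layout where the first column is the drive ID (such as /c0/e252/s1) and the second is the progress percentage, with “-” for drives that are not rebuilding; rebuild_progress is a hypothetical helper name.

```shell
#!/bin/sh
# Print "drive-id percent" for every drive with a rebuild in progress.
# Reads the text output of `storcli /c0/eall/sall show rebuild` on stdin.
# rebuild_progress is a hypothetical helper; the column order
# (Drive-ID Progress% Status ...) is an assumption about the output table.
rebuild_progress() {
    awk '$1 ~ /^\/c[0-9]+(\/e[0-9]+)?\/s[0-9]+$/ && $2 ~ /^[0-9]+$/ { print $1, $2 }'
}

# Usage on the host, polling once a minute:
#   while sleep 60; do storcli /c0/eall/sall show rebuild | rebuild_progress; done
```

An empty result means no drive is rebuilding, which is either success or a rebuild that never started; the logical drive state tells you which.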

3) Validate drive health

  • Command: smartctl -a -d megaraid,N /dev/sdX (N is the drive’s device ID from the DID column in storcli)
    • Expected: SMART PASSED, no reallocated sectors, no pending sectors.
  • Check that the front-panel LEDs reflect activity if that is required for monitoring.
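
The two sector counters mentioned above can be pulled out of the SMART attribute table directly. A minimal sketch for SATA drives, assuming the standard ten-column smartctl -A attribute table with the raw value in the last column; smart_sector_counts is a hypothetical helper name, and SAS drives report health in a different format that this does not parse.

```shell
#!/bin/sh
# Print the raw values of the reallocated and pending sector attributes.
# Reads `smartctl -A` output on stdin (ATA/SATA attribute table).
# smart_sector_counts is a hypothetical helper name.
# Assumes the attribute name is column 2 and the raw value is column 10.
smart_sector_counts() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" { print $2, $10 }'
}

# Usage on the host (8 is an example device ID taken from the storcli DID column):
#   smartctl -A -d megaraid,8 /dev/sda | smart_sector_counts
```

Non-zero values on a freshly rebuilt member are a reason to replace the drive rather than trust it.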

4) Lock the settings you changed

  • If you enabled auto-rebuild or changed foreign import behaviour, note the controller defaults and document the change in your runbook.
  • Reboot to BIOS and confirm the setting persisted if the controller stores the setting in NVRAM.

Final checks

  • Run a short scrub/patrol if supported.
  • Run a quick I/O test or boot a VM that uses that storage and watch for I/O errors.

Takeaways

  • Proxmox LSI RAID relies on the controller. Install storcli or MegaCLI to see and change controller-level behaviour.
  • Map OS block devices to controller slots before acting.
  • If a drive reinsertion did not trigger a rebuild, check the firmware rebuild policy and foreign config handling.
  • Use SMART to confirm the physical drive is healthy before forcing a rebuild.