Help with ZFS Array

Lem453@lemmy.ca · edit-2 2 months ago

hendrik@palaver.p3x.de · edit-2 2 months ago

I don’t know anything about ZFS, but in the future you might want to address the drives by /dev/disk/by-uuid/… or by-id and not by /dev/nvme…

Shdwdrgn@mander.xyz · 2 months ago

That is definitely true of zfs as well. In fact I have never seen a guide which suggests anything other than using the names found under /dev/disk/by-id/ or /dev/disk/by-uuid/, and that is to prevent this very problem. If the proper convention is used then you can plug the drives in through any available interface, in any order, and zfs will easily re-assemble the pool at boot.

So now this raises the question… is proxmox using some insane configuration that creates drive clusters from whatever names the drives happen to boot up with???
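
If you want to see which persistent name maps to which kernel device right now, something like this rough sketch should do it. The function name list_persistent_names is made up for illustration, and it assumes a readlink that supports -f (GNU coreutils and busybox both do).

```shell
# List persistent names and the kernel device each one currently points to.
# The directory argument is optional and defaults to /dev/disk/by-id.
# (list_persistent_names is a hypothetical helper, not part of any tool.)
list_persistent_names() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Calling list_persistent_names with no argument reads /dev/disk/by-id. The names on the left are the ones that stay the same across reboots, while the targets on the right are the ones that can move around.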

Lem453@lemmy.ca · 2 months ago

Is there a way to change this on an existing zpool?

qupada@fedia.io · 2 months ago

Generally, you just need to export the pool with zpool export zfspool1, then import again with zpool import -d /dev/disk/by-id zfspool1.

I believe it should stick after that.

Whether that will apply in its current degraded state I couldn’t say.
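
Put together as a small wrapper, the two commands above might look like the sketch below. The function name reimport_by_id and the ZPOOL override variable are my own additions for illustration; only the zpool export, zpool import -d and zpool status calls are real commands. Untested against a degraded pool, so treat it as a starting point.

```shell
# Export a pool and re-import it using the names under /dev/disk/by-id.
# Set ZPOOL=echo to preview the commands without touching the pool.
# (reimport_by_id and ZPOOL are hypothetical names, not part of ZFS.)
reimport_by_id() {
    pool="$1"
    [ -n "$pool" ] || { echo "usage: reimport_by_id POOLNAME" >&2; return 1; }
    # Export first; stop if it fails (for example, because the pool is busy).
    "${ZPOOL:-zpool}" export "$pool" || return 1
    # Re-import, searching only /dev/disk/by-id for member devices.
    "${ZPOOL:-zpool}" import -d /dev/disk/by-id "$pool" || return 1
    # Show the result so the new device names can be checked.
    "${ZPOOL:-zpool}" status "$pool"
}
```

Running it as reimport_by_id zfspool1 does the real thing. Setting ZPOOL=echo beforehand only prints the three commands, which is a cheap way to double-check the pool name before exporting anything.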

Lem453@lemmy.ca · edit-2 2 months ago

Thanks, this worked. I made the ZFS array in the proxmox GUI and it used the nvmeX names by default. Interestingly, when I ran the export, nothing seemed to happen. I then tried zpool import and it said there were no pools available to import, but when I ran zpool status it showed the array up and working, with all 4 drives healthy and now listed by device ID. Odd, but it seems to be working correctly now.

root@pve:~# zpool status
  pool: zfspool1
 state: ONLINE
  scan: resilvered 8.15G in 00:00:21 with 0 errors on Thu Nov  7 12:51:45 2024
config:

		NAME                                                                                 STATE     READ WRITE CKSUM
		zfspool1                                                                             ONLINE       0     0     0
		  raidz1-0                                                                           ONLINE       0     0     0
			nvme-eui.000000000000000100a07519e22028d6-part1                                  ONLINE       0     0     0
			nvme-nvme.c0a9-313932384532313335343130-435431303030503153534438-00000001-part1  ONLINE       0     0     0
			nvme-eui.000000000000000100a07519e21fffff-part1                                  ONLINE       0     0     0
			nvme-eui.000000000000000100a07519e21e4b6a-part1                                  ONLINE       0     0     0

errors: No known data errors
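
In case it helps anyone else check their own pool, here is a rough filter that reads zpool status output and prints any member that still looks like a kernel-assigned name. The function name find_kernel_named_vdevs is made up, and the name patterns are an assumption that only covers NVMe and sd-style disks.

```shell
# Read `zpool status` text on stdin and print any vdev whose name looks like
# a kernel-assigned device (nvme0n1p1, sda1, ...) rather than a persistent ID.
# (find_kernel_named_vdevs is a hypothetical helper; the patterns are a guess
# that covers only NVMe and sd-style names.)
find_kernel_named_vdevs() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ {
        name = $1
        sub(/^.*\//, "", name)   # drop any leading directory
        if (name ~ /^nvme[0-9]+n[0-9]+(p[0-9]+)?$/ || name ~ /^sd[a-z]+[0-9]*$/)
            print name
    }'
}
```

Used as zpool status zfspool1 | find_kernel_named_vdevs, it should print nothing for the output above, since every member starts with nvme-eui. or nvme-nvme. rather than a bare nvmeXnY name.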

hendrik@palaver.p3x.de · 2 months ago

Strange. Okay, hope that spares you from similar troubles in the future.

Shdwdrgn@mander.xyz · edit-2 2 months ago

OP – if your array is in good condition (and it looks like it is) you have an option to replace drives one by one, but this will take some time (probably over a period of days). The idea is to remove a disk from the pool by its old name, then re-add the disk under the corrected name, wait for the pool to rebuild, then do the process again with the next drive. Double-check, but I think this is the proper procedure…

zpool offline poolname /dev/nvme1n1p1

zpool replace poolname /dev/nvme1n1p1 /dev/disk/by-id/drivename

Check zpool status to confirm when the drive is done rebuilding under the new name, then move on to the next drive. This is the process I use when replacing a failed drive in a pool, and since that one drive is technically in a failed state right now, this same process should work for you to transfer over to the safe names. Keep in mind that this will probably put a lot of strain on your drives since the contents have to be rebuilt (although there is a small possibility zfs may recognize the drive contents and just start working immediately?), so be prepared in case a drive does actually fail during the process.

Lem453@lemmy.ca · 2 months ago

Thanks for this! Luckily the above suggestion to export and import worked right away so this was not needed.

Shdwdrgn@mander.xyz · 2 months ago

Yeah I figured there would be multiple answers for you. Just keep in mind that you DO want to get it fixed at some point to use the disk id instead of the local device name. That will allow you to change hardware or move the whole array to another computer.

Lem453@lemmy.ca · 2 months ago

Thanks! I got it set up by IDs now. I originally set it up via the proxmox GUI and it defaulted to NVMe names.

Possibly linux@lemmy.zip · 2 months ago

I don’t believe this is the case

hendrik@palaver.p3x.de · 2 months ago

Care to explain?

Possibly linux@lemmy.zip · edit-2 1 month ago

I believe ZFS is smart enough to automatically find the disk on the system as it looks at all the other information like the disk id. It shouldn’t just lose a drive.

zpool just shows the original path of the disk when it was added. Behind the scenes ZFS knows your drives. There is a chance I am totally wrong about this.

What is the output of lsblk? Any missing drives?

hendrik@palaver.p3x.de · 1 month ago

Fair enough. Judging by OP’s later comments, the pool is online again.