My troubleshooting ability in Linux is not impressive, just so you know. On the other hand, I can follow instructions very well. I have a Linux server with Linux RAID. It worked well with no problems for about half a year, but then I had a power failure and have been getting the same problem ever since. After rebuilding the RAID all my files are still there, so that’s a good thing. But when I reboot the server, the RAID device md0 is gone.
pi@pinas:~ $ cat /proc/mdstat
Personalities :
unused devices: <none>
pi@pinas:~ $ ls /dev/md*
pi@pinas:~ $
I found an issue here that seems to have helped other people, but I tried it and it did not help. I also looked at several other sites, all saying similar things. I used Webmin to create the RAID, and the mdadm.conf “looks” OK. I don’t know if I am searching the internet for the right things or even if I am looking in the right log files. Does anyone have any ideas?
Thanks in advance.
***Edit 1
root@pinas:/home/pi# service mdadm start
Failed to start mdadm.service: Unit mdadm.service is masked.
I am wondering if the mdadm service is even running. The process is not currently active on the system and I have no idea how to tell if it is configured to start on boot, how to start it or configure it to start on boot.
***Edit 2
systemctl list-unit-files
mdadm-grow-continue@.service static
mdadm-last-resort@.service static
mdadm-waitidle.service masked
mdadm.service masked
mdmon@.service static
mdmonitor.service static
I found this. I don’t know whether it is bad, but it looks suspicious. Is this how it should be? None of them say enabled, and I would think they should. Any ideas?
***Edit 3
systemctl list-unit-files
mdadm-grow-continue@.service static
mdadm-last-resort@.service static
mdadm-waitidle.service masked
mdadm.service masked
mdmon@.service static
mdmonitor.service static
dpkg-reconfigure mdadm
update-initramfs: deferring update (trigger activated)
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
Processing triggers for initramfs-tools (0.120+deb8u3) ...
I ran the commands that @telcoM suggested and this is the output. I also tried doing a reinstall then those commands, but still the output is the same.
I have tried looking at several other similarly named threads on the net, but so far I have not found anything that seems helpful. I think the problem is related to the service not starting on boot, but I am not experienced enough with Linux services to know how to fix it. @roaima suggested that it was a problem with initramfs, but I do not know how to check or correct that. Does anyone have any ideas?
***Edit 4
CREATE owner=root group=disk mode=0660 auto=yes
HOMEHOST <system>
MAILADDR root
ARRAY /dev/md/0 metadata=1.2 UUID=d3434dfc:2fb4792e:0b64f806:67e35ee3 name=raspberrypi:0
ARRAY /dev/md/0 metadata=1.2 UUID=40fb937f:870c7c13:46774666:87445bc5 name=pinas:0
Here is the content of my mdadm.conf file, which is interesting because the first listed array does not have the right name…
Answers:
Method 1
This recipe worked for me after I had the same issue. I looked all over the net for an answer and finally came across this question, which still did not help.
The problem as I see it is multifold.
- mdadm reassigns the device file from /dev/md0 to something like /dev/md127 on the next reboot, so you cannot just use the device file in the fstab. I ended up using the UUID of the created filesystem.
- Almost all the RAID setup tutorials on the web show the RAID device being created from the whole-disk device files, like this:
mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
Instead I used the partition device files, like this:
mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
The second form requires proper creation of partitions on each disk using gdisk or fdisk. I used gdisk and assigned the partition type fd00, which is a Linux RAID partition.
- There’s lots of talk about needing to update /etc/mdadm/mdadm.conf. This is wrong. I purposely deleted that file. It’s not needed. (See below.)
That’s really all there is to it. Full recipe follows…
Partition each drive with one partition of type fd00, Linux RAID:
root@teamelchan:~# gdisk /dev/sda
Command (? for help): n
Partition number (1-128, default 1):
First sector (2048-3907029134, default = 2048) or {+-}size{KMGTP}:
Last sector (2048-3907029134, default = 3907029134) or {+-}size{KMGTP}:
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300): fd00
Changed type of partition to 'Linux RAID'
Command (? for help): p
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Model: ST2000DM001-1ER1
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): F81E265F-2D02-864D-AF62-CEA1471CFF39
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 2048, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 0 sectors (0 bytes)
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      3907029134   1.8 TiB     FD00  Linux RAID
Command (? for help): w
Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!
Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/sda.
The operation has completed successfully.
Now you should see both the disk devices and partition devices in /dev
root@teamelchan:~# ls /dev/sd[a-d]*
/dev/sda /dev/sda1 /dev/sdb /dev/sdb1 /dev/sdc /dev/sdc1 /dev/sdd /dev/sdd1
Now create the RAID of your choice with mdadm using the partition device files, not the disk devices
root@teamelchan:~# mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: chunk size defaults to 512K
mdadm: /dev/sda1 appears to contain an ext2fs file system
       size=471724032K mtime=Sun Nov 18 19:42:02 2018
mdadm: /dev/sda1 appears to be part of a raid array:
       level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
mdadm: /dev/sdb1 appears to be part of a raid array:
       level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
mdadm: /dev/sdc1 appears to be part of a raid array:
       level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
mdadm: /dev/sdd1 appears to contain an ext2fs file system
       size=2930265540K mtime=Sun Nov 18 23:58:02 2018
mdadm: /dev/sdd1 appears to be part of a raid array:
       level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
Now check in /dev/disk to see if there’s any UUID associated with your new /dev/md0 RAID.
There should be none.
root@teamelchan:~# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Nov 22 04:24 4777-FB10 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Nov 22 04:24 D616BDCE16BDAFBB -> ../../sde1
lrwxrwxrwx 1 root root 10 Nov 22 04:24 e79571b6-eb75-11e8-acb0-e0d55e117fa5 -> ../../sdf2
Make the new filesystem, and after that you should now have a UUID with /dev/md0
root@teamelchan:~# mkfs.ext4 -F /dev/md0
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 2685945088 4k blocks and 335745024 inodes
Filesystem UUID: 7bd945b4-ded9-4ef0-a075-be4c7ea246fb
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848, 512000000, 550731776, 644972544, 1934917632,
        2560000000
Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done
Voila, there it is.
root@teamelchan:~# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Nov 22 04:24 4777-FB10 -> ../../sdf1
lrwxrwxrwx 1 root root 9 Nov 22 04:43 7bd945b4-ded9-4ef0-a075-be4c7ea246fb -> ../../md0
lrwxrwxrwx 1 root root 10 Nov 22 04:24 D616BDCE16BDAFBB -> ../../sde1
lrwxrwxrwx 1 root root 10 Nov 22 04:24 e79571b6-eb75-11e8-acb0-e0d55e117fa5 -> ../../sdf2
Modify your /etc/fstab and add the mount for your new RAID
Be sure to use the UUID, and not the device file.
root@teamelchan:~# cat /etc/fstab
UUID=e79571b6-eb75-11e8-acb0-e0d55e117fa5 / ext4 defaults 0 0
UUID=4777-FB10 /boot/efi vfat defaults 0 0
/swap.img none swap sw 0 0
UUID=7bd945b4-ded9-4ef0-a075-be4c7ea246fb /md0/tweets ext4 auto 0 0
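As a minimal sketch of the UUID-based entry, the helper below formats an fstab line from a UUID, mount point and filesystem type. The function name fstab_line is made up for illustration; the UUID and mount point are the ones from this answer, and the commented blkid call assumes the array is currently visible as /dev/md0.

```shell
#!/usr/bin/env bash
# Build an /etc/fstab entry keyed on the filesystem UUID rather than /dev/mdX.
# Arguments: UUID, mount point, filesystem type, mount options (default: "defaults").
# fstab_line is a hypothetical helper name, not part of any package.
fstab_line() {
    printf 'UUID=%s %s %s %s 0 0\n' "$1" "$2" "$3" "${4:-defaults}"
}

# Values copied from the mkfs.ext4 output and the fstab in this answer.
fstab_line 7bd945b4-ded9-4ef0-a075-be4c7ea246fb /md0/tweets ext4 auto

# On a live system the UUID can be read with blkid instead of copied by hand:
#   fstab_line "$(blkid -s UUID -o value /dev/md0)" /md0/tweets ext4 auto
```

The helper only prints the line; review it before appending it to /etc/fstab yourself.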
Look, there is no /etc/mdadm/mdadm.conf here.
It’s not needed.
root@teamelchan:~# ls -l /etc/mdadm
total 0
Reboot
root@teamelchan:~# reboot
Connection to 192.168.0.131 closed by remote host.
Connection to 192.168.0.131 closed.
The RAID is mounted, but mdadm has renamed the device file from md0 to md127
Good thing we used the UUID and not the actual device file.
root@teamelchan:~# df /md0/tweets
Filesystem       1K-blocks  Used   Available Use% Mounted on
/dev/md127     10658016696 73660 10120737636   1% /md0/tweets
Look, md0 is gone from /dev:
root@teamelchan:~# ls /dev/md*
/dev/md127

/dev/md:
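Since the device node can change between boots, it can help to ask which node a filesystem UUID currently resolves to. This is a small sketch, not part of the original recipe; device_for_uuid is a hypothetical helper name, and the second argument exists only so the function can be tried against a scratch directory instead of /dev/disk/by-uuid.

```shell
#!/usr/bin/env bash
# Print the device node that a filesystem UUID currently points to.
# $1 = filesystem UUID
# $2 = directory holding the by-uuid symlinks (defaults to /dev/disk/by-uuid)
# device_for_uuid is a hypothetical helper name.
device_for_uuid() {
    readlink -f "${2:-/dev/disk/by-uuid}/$1"
}

# Usage on the machine above (the UUID comes from the mkfs.ext4 output):
#   device_for_uuid 7bd945b4-ded9-4ef0-a075-be4c7ea246fb
# Or ask which device backs the mount point directly:
#   findmnt -n -o SOURCE /md0/tweets
```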
That’s it. Now I’m enjoying my 10 Terabyte RAID0 that operates at over 600 MB/sec
root@teamelchan:~# hdparm -tT /dev/md127
/dev/md127:
 Timing cached reads:   26176 MB in  1.99 seconds = 13137.47 MB/sec
 Timing buffered disk reads: 1878 MB in  3.00 seconds = 625.13 MB/sec
Method 2
Your /proc/mdstat indicates that none of the RAID personalities (i.e. RAID1, RAID5, etc.) have been loaded, so no attempt is made to even try activating a RAID set.
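The check described above can be sketched as a small filter over /proc/mdstat text. The function name mdstat_personalities is an assumption for illustration; the first sample is the output from the question, and the second sample line shows what a system with RAID modules loaded typically prints.

```shell
#!/usr/bin/env bash
# Print whatever follows "Personalities :" in /proc/mdstat-style input.
# An empty result means no RAID personality (raid0, raid1, ...) is loaded.
# mdstat_personalities is a hypothetical helper name.
mdstat_personalities() {
    awk -F':' '/^Personalities/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim surrounding whitespace
        print $2
    }'
}

# Sample 1: the output posted in the question (nothing loaded).
printf 'Personalities :\nunused devices: <none>\n' | mdstat_personalities

# Sample 2: an illustrative line from a system with RAID modules loaded.
printf 'Personalities : [raid1] [raid0]\n' | mdstat_personalities

# On a real system:
#   mdstat_personalities < /proc/mdstat
```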
Failed to start mdadm.service: Unit mdadm.service is masked.
This message indicates mdadm.service has been disabled in the strongest possible way: no explicit attempt will be made to start the service, and even if something else depends on this service, it won’t be started.
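To see at a glance which units are masked, the unit listing can be filtered on its state column. This is a sketch only; masked_units is a hypothetical helper name, and the sample rows are taken from the unit listing in the question.

```shell
#!/usr/bin/env bash
# Print the names of units whose state column is "masked".
# Expects "UNIT STATE ..." rows on stdin, as printed by systemctl list-unit-files.
# masked_units is a hypothetical helper name.
masked_units() {
    awk '$2 == "masked" { print $1 }'
}

# Sample rows from the question.
sample_units='mdadm-grow-continue@.service static
mdadm-last-resort@.service static
mdadm-waitidle.service masked
mdadm.service masked
mdmon@.service static
mdmonitor.service static'

printf '%s\n' "$sample_units" | masked_units

# On a real system:
#   systemctl list-unit-files --no-legend | masked_units
# A masked unit can be unmasked with "systemctl unmask <unit>", but check first
# whether your distribution masks it on purpose.
```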
As in the question linked by roaima, try running these commands as root:
dpkg-reconfigure mdadm # Choose "all" disks to start at boot
update-initramfs -u # Updates the existing initramfs
The first will reconfigure the mdadm package and should detect all the RAID sets and let you choose which RAID sets to auto-activate at boot: usually “all” is a good answer. This should also take care of the mdadm.service being masked, if I’ve understood correctly.
Once that is done, the second command will update your initramfs, so that the updated configuration files are copied into the initramfs too, and the scripts that run in the earliest phases of boot will know that there is a RAID set that should be activated.
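One way to check whether the mdadm pieces actually made it into the initramfs is to filter its file listing. This is a rough sketch under assumptions: it presumes a Debian-style system where initramfs-tools provides lsinitramfs and the image is at /boot/initrd.img-<kernel version>; initramfs_mdadm_entries is a hypothetical helper name, and the sample listing is illustrative rather than captured from a real image.

```shell
#!/usr/bin/env bash
# Keep only paths whose last component is "mdadm" or "mdadm.conf".
# Expects one path per line on stdin (for example, the output of lsinitramfs).
# initramfs_mdadm_entries is a hypothetical helper name.
initramfs_mdadm_entries() {
    grep -E '(^|/)(mdadm|mdadm\.conf)$' || true   # no match is not an error here
}

# Illustrative listing; real paths may differ between releases.
printf 'etc/mdadm/mdadm.conf\nsbin/mdadm\nbin/busybox\n' | initramfs_mdadm_entries

# On a real system (assumed image location):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | initramfs_mdadm_entries
```

If the real listing comes back empty, the initramfs was probably built without the mdadm configuration, which would fit the symptom of the array not being assembled at boot.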
Method 3
At least part of the problem is that you have two definitions for the same RAID device /dev/md/0 in your mdadm conf. You need to fix that first.
Then get your array running, and finally you can follow the instructions from “New RAID array will not auto assemble, leads to boot problems”:
dpkg-reconfigure mdadm # Choose "all" disks to start at boot
update-initramfs -u # Updates the existing initramfs
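A minimal sketch for spotting the duplicate definitions this answer describes: count how often each device name appears on an ARRAY line. The function name find_duplicate_arrays is made up for illustration; the sample input is the pair of ARRAY lines from the question.

```shell
#!/usr/bin/env bash
# Print every device name that appears on more than one ARRAY line,
# followed by the number of times it was defined.
# Reads mdadm.conf-style text from stdin so it can be tried on a copy first.
# find_duplicate_arrays is a hypothetical helper name.
find_duplicate_arrays() {
    awk '$1 == "ARRAY" { count[$2]++ }
         END { for (dev in count) if (count[dev] > 1) print dev, count[dev] }'
}

# The two ARRAY lines from the question.
sample_conf='ARRAY /dev/md/0 metadata=1.2 UUID=d3434dfc:2fb4792e:0b64f806:67e35ee3 name=raspberrypi:0
ARRAY /dev/md/0 metadata=1.2 UUID=40fb937f:870c7c13:46774666:87445bc5 name=pinas:0'

printf '%s\n' "$sample_conf" | find_duplicate_arrays

# On a real system:
#   find_duplicate_arrays < /etc/mdadm/mdadm.conf
```

Comparing the UUIDs against what mdadm --detail --scan reports for the running array should show which of the two lines is stale.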
Method 4
The service was not starting because it was masked; here is how I found out how to unmask it. The next problem was that the mdadm-raid service was not starting the RAID; this was how I got the RAID to start on boot (look down to “Mon Jul 31, 2017 7:49 pm” to find the relevant post). This may not be the best solution, but after 10 reboots the RAID is still starting every time. I do appreciate the efforts of the people who have tried to answer this thread. Now I just need to sort out the other services provided, but that is a problem for another day.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.