Consider the following kern.log snippet:
ata4.00: failed command: WRITE FPDMA QUEUED
ata4.00: cmd 61/00:78:40:1e:6c/04:00:f0:00:00/40 tag 15 ncq 524288 out
res 41/04:00:00:00:00/04:00:00:00:00/00 Emask 0x1 (device error)
ata4.00: status: { DRDY ERR }
ata4.00: error: { ABRT }
ata4: hard resetting link
ata4: nv: skipping hardreset on occupied port
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
How can I identify which hard drive the kernel actually means when it talks about ata4.00?
How can I find the corresponding /dev/sdY device name?
Answers:
Method 1
You can find the corresponding /dev/sdY device by traversing the /sys tree:
$ find /sys/devices | grep '/ata[0-9]\+/.*/block/s[^/]\+$' \
    | sed 's@^.\+/\(ata[0-9]\+\)/.\+/block/\(.\+\)$@\1 => /dev/\2@'
With a more efficient /sys traversal (cf. lsata.sh):
$ echo /sys/class/ata_port/ata*/../../host*/target*/*/block/s* | tr ' ' '\n' \
    | awk -F/ '{printf("%s => /dev/%s\n", $5, $NF)}'
Example output from a two-disk system:
ata1 => /dev/sda
ata2 => /dev/sdb
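If only one disk is of interest, the same relationship can be read in the other direction, starting from the disk. The following is a minimal sketch, assuming GNU readlink (for -f) and that the resolved /sys/block/sdX path contains an ataN component as in the listings above. The function name ata_port_for_disk and its optional sysfs-root argument are made up for illustration.

```shell
# Print the ATA port (e.g. ata4) that a disk is attached to.
# $1: disk name such as sda
# $2: sysfs root (hypothetical parameter, defaults to /sys)
ata_port_for_disk() {
    path=$(readlink -f "${2:-/sys}/block/$1") || return 1
    # Keep only the ataN path component; prints nothing when the
    # resolved path has no such component (e.g. a USB disk).
    printf '%s\n' "$path" | sed -n 's@.*/\(ata[0-9][0-9]*\)/.*@\1@p'
}
```

Usage would be something like: ata_port_for_disk sda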
Then, to reliably identify the actual hardware, you need to map /dev/sdY to the serial number, e.g.:
$ ls /dev/disk/by-id -l | grep 'ata.*sd[a-zA-Z]$'
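The by-id names are plain symlinks to the device nodes, so the lookup for a single disk can also be scripted. This is only a sketch: it assumes the usual udev layout where each link under /dev/disk/by-id points at ../../sdX, and the function name ids_for_disk and its optional directory argument are hypothetical.

```shell
# Print the ata-* names in by-id whose symlink points at the given disk.
# $1: disk name such as sda (partitions like sda1 are not matched)
# $2: directory to scan (hypothetical parameter, defaults to /dev/disk/by-id)
ids_for_disk() {
    dev=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/ata-*; do
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        # Compare only the last path component of the link target.
        if [ "${target##*/}" = "$dev" ]; then
            printf '%s\n' "${link##*/}"
        fi
    done
}
```

The printed name normally ends in the drive's serial number, which can be compared with the label on the physical drive.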
lsscsi
The lsscsi utility can also be used to derive the mapping:
$ lsscsi | sed 's@^\[\([^:]\+\).\+\(/dev/.\+\)$@\1,\2@' \
    | awk -F, '{ printf("ata%d => %s\n", $1+1, $2) }'
Note that the relevant lsscsi enumeration starts from 0 while the ata enumeration starts from 1, hence the +1 in the awk expression.
Syslog
If nothing else works, one can look at the syslog/journal to derive the mapping.
The /dev/sdY devices are created in the same order as the ataX identifiers are enumerated in the kern.log, ignoring non-disk devices (ATAPI) and links with nothing connected.
Thus, the following command displays the mapping:
$ grep '^May 28 2' /var/log/kern.log.0 |
    grep 'ata[0-9]\+.[0-9][0-9]: ATA-' |
    sed 's/^.*] ata//' |
    sort -n | sed 's/:.*//' |
    awk ' { a="ata" $1; printf("%10s is /dev/sd%c\n", a, 96+NR); }'
ata1.00 is /dev/sda
ata3.00 is /dev/sdb
ata5.00 is /dev/sdc
ata7.00 is /dev/sdd
ata8.00 is /dev/sde
ata10.00 is /dev/sdf
(Note that ata4 is not displayed because the above log messages are from another system.)
I am using /var/log/kern.log.0 and not /var/log/kern.log because the boot messages are already rotated. I grep for May 28 2 because this was the last boot time and I want to ignore previous messages.
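Where the boot messages are in the journal or the kernel ring buffer rather than in a rotated kern.log, the same idea can be applied to standard input. A sketch under the same assumption as above (disks receive their sdX letters in ataX order, which may not hold when other kinds of disks are present); the function name ata_log_to_sd is hypothetical.

```shell
# Read kernel log lines on stdin and print "ataX.YY is /dev/sdZ" lines.
# Only "ATA-" identification lines are used, so ATAPI devices are skipped.
ata_log_to_sd() {
    grep 'ata[0-9][0-9]*\.[0-9][0-9]: ATA-' |
        # Reduce each line to the numeric part, e.g. "5.00".
        sed 's/^.*ata\([0-9][0-9]*\.[0-9][0-9]\): ATA-.*/\1/' |
        LC_ALL=C sort -n |
        # 97 is the character code of "a": first disk -> sda, second -> sdb, ...
        awk '{ printf("ata%s is /dev/sd%c\n", $1, 96 + NR) }'
}
```

It could be fed with, for example, journalctl -k -b | ata_log_to_sd or dmesg | ata_log_to_sd.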
To verify the mapping, you can check the output of:
$ grep '^May 28 2' /var/log/kern.log.0 | grep 'ata[0-9]\+.[0-9][0-9]: ATA-'
May 28 20:43:26 hn kernel: [ 1.260488] ata1.00: ATA-7: SAMSUNG SV0802N, max UDMA/100
May 28 20:43:26 hn kernel: [ 1.676400] ata5.00: ATA-5: ST380021A, 3.19, max UDMA/10
[..]
And you can compare this output with hdparm output, e.g.:
$ hdparm -i /dev/sda
/dev/sda:
Model=SAMSUNG SV0802N [..]
(using Kernel 2.6.32-31)
Method 2
Here’s my version, modified from above. Since I don’t know the exact date the system was booted (for testing this it was 27 days ago), and I don’t know which kern.log contains the data I need (some may be gzipped on my system), I use uptime and date to calculate an approximate system boot date (to the day, anyway), then use zgrep to search through all available kern.log files.
I also slightly modified the second grep statement so that it now shows ATAPI CD/DVD drives as well as ATA-* drives.
It could still use refinement (e.g. for the case where system uptime is greater than a year), but should work OK for now.
#!/bin/bash
uptime=$(uptime | awk -F' ' '{ print $3" "$4 }' | sed s/,//)
date=$(date -d "$uptime ago" | awk '{print $2" "$3 }')
zgrep "$date" /var/log/kern.log* |
    grep 'ata[0-9]\+.[0-9][0-9]: ATA' |
    sed 's/^.*] ata//' |
    sort -n | sed 's/:.*//' |
    awk ' { a="ata" $1; printf("%10s is /dev/sd%c\n", a, 96+NR); }'
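Parsing the text that uptime prints depends on its wording, which changes with how long the system has been up. As a hedged alternative, the boot day can be computed from /proc/uptime. This sketch assumes Linux and GNU date (for -d @SECONDS); the function name boot_day and its two optional parameters are illustrative only.

```shell
# Print the boot day as "Mon DD", the form used by traditional syslog
# timestamps (single-digit days are space-padded, e.g. "Sep  8").
# $1: current time in epoch seconds (hypothetical parameter, defaults to now)
# $2: uptime in whole seconds (hypothetical parameter, defaults to /proc/uptime)
boot_day() {
    now=${1:-$(date +%s)}
    up=${2:-$(cut -d. -f1 /proc/uptime)}
    LC_ALL=C date -d "@$((now - up))" '+%b %e'
}
```

The result could replace the date variable in the script, e.g. zgrep "$(boot_day)" /var/log/kern.log*. Like the original, it matches on month and day only, so logs that span more than a year can still produce stale matches.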
Method 3
I just had this same problem and found another solution that one might like.
The lsscsi tool lists SCSI devices (or hosts) and their attributes.
With lsscsi one gets the SCSI host number and the device name.
Looks like this:
$ lsscsi --long
[0:0:1:0] cd/dvd MATSHITA DVD-ROM UJDA780 1.50 /dev/sr0
  state=running queue_depth=1 scsi_level=6 type=5 device_blocked=0 timeout=30
[2:0:0:0] disk ATA WDC WD3000FYYZ-0 01.0 /dev/sda
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
[3:0:0:0] disk ATA WDC WD1002FBYS-0 03.0 /dev/sdb
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
[4:0:0:0] disk ATA WDC WD1002FBYS-0 03.0 /dev/sdc
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
[5:0:0:0] disk ATA WDC WD1002FBYS-0 03.0 /dev/sdd
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
[6:0:0:0] disk ATA WDC WD3000FYYZ-0 01.0 /dev/sde
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
[7:0:0:0] disk ATA WDC WD1002FBYS-0 03.0 /dev/sdf
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
On Ubuntu one can install lsscsi simply with
$ sudo apt-get install lsscsi
Method 4
None of the above answers worked for me, and the lsscsi approach actually yielded the wrong answer, due to discrepancies between SCSI bus numbers and ATA numbers. On a 21-disk system, I had many syslog reports about problems with ATA18 (HSM violations). Which disk was causing these errors? Some were usb drives, which made things considerably more confusing. I needed an accounting of how each and every SCSI drive is attached to the system, and I wrote the script below that yields tabular listings for all SCSI disks (/dev/s[dr]?) regardless of whether ATA or USB.
Then, with all disk drives fully accounted-for, I was surprised to see that my ATA errors had nothing to do with any of my disk drives. I had been asking the wrong question, and I think others might easily fall into the same trap, which is why I mention it here. I then used a second approach that identified the hardware that was generating the HSM violation messages, also detailed in the documentation appearing in the script below.
#!/bin/bash
## This script lists the ata and usb bus numbers, as well as the
## overall "host" numbers, of each scsi disk. The same information
## appears formatted four ways, redundantly, for ease of lookup by (1)
## device lettername, (2) ata bus, (3) usb bus, or (4) overall "host"
## number.
#######################################################
## Q: What if you're looking for an ATA bus number, e.g. ata18, that
## isn't listed by this script?
## (1) Well, it's probably not a SCSI disk, at least not one that's
## operating.
## (2) Somewhere in /sys you can find a mapping from the ATA bus
## number to some overall host number, such as host17. For example,
## if you're looking for ata18, you can use a find command...
## find /sys -type l -exec bash -c 'link=`readlink "$0"`; if [[ "$link" =~ /ata18/ ]] ; then echo $link ; fi' {} \;
## ...which, after some delay, might yield output something like this:
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/ata_port/ata18
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/target17:0:0/17:0:0:0/scsi_generic/sg5
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/link18/dev18.0/ata_device/dev18.0
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/scsi_host/host17
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/link18/ata_link/link18
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/target17:0:0/17:0:0:0/bsg/17:0:0:0
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/target17:0:0/17:0:0:0/scsi_device/17:0:0:0
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/target17:0:0/17:0:0:0/scsi_generic/sg5
## ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/target17:0:0/17:0:0:0/bsg/17:0:0:0
## ../../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/target17:0:0/17:0:0:0
## ../../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17
## ../../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/ata18/host17/target17:0:0
## Then you might notice the "/host17/" or "scsi_device/17:0:0:0"
## in the above output lines, and look in the output of...
## lshw
## .. for "scsi17" or "17:0" or such, and discover, somewhere in it ...
## ...
## *-scsi:5
## physical id: 8
## logical name: scsi17
## capabilities: emulated
## *-processor UNCLAIMED
## description: SCSI Processor
## product: 91xx Config
## vendor: Marvell
## physical id: 0.0.0
## bus info: scsi@17:0.0.0
## version: 1.01
## capabilities: removable
## configuration: ansiversion=5
## ...
## ...thus learning that ata18 corresponds to an unclaimed device (but
## not actually a disk). Q.E.D.
## P.S. the lsscsi command yields the following, which might lead
## one to think that the problem was being caused by a CD-ROM drive
## (SCSI18:0) rather than emanating from the Marvell (SCSI17:0):
## [17:0:0:0] process Marvell 91xx Config 1.01 -
## [18:0:0:0] cd/dvd HL-DT-ST DVDRAM GH22NS90 HN00 /dev/sr0
## ... but ATA != SCSI, and 17 != 18. The CD/DVD drive was ATA19,
## actually. You can still use lsscsi, but
## bear in mind that what you're seeing in the left column
## is *not* ATA numbers but rather SCSI bus numbers, and the two
## are not to be confused.
#######################################################
blockDevsDir=/sys/dev/block
declare -A scsiDevLetters
declare -A hostNumbers
declare -A ataNumbers
declare -A usbNumbers
# Capture group 1 is the device name without the leading "s", e.g. "da" or "r0"
scsiDevLetterRE='/s(d[a-z]|r[0-9])$'
hostNumberRE='/host([0-9]+)/'
ataNumberRE='/ata([0-9]+)/'
usbNumberRE='/usb([0-9]+)/'
cd "$blockDevsDir"
for busid in `ls -1` ; do
    linkval=`readlink "$busid" `
    if [[ "$linkval" =~ $scsiDevLetterRE ]] ; then
        scsiDevLetter="${BASH_REMATCH[1]}"
        if [[ "$linkval" =~ $hostNumberRE ]] ; then
            hostNumber="${BASH_REMATCH[1]}"
            if [[ "$linkval" =~ $ataNumberRE ]] ; then
                ataNumber="${BASH_REMATCH[1]}"
                scsiDevLetters[$scsiDevLetter]=`printf 'ata%-2.2s host%-2.2s' "${ataNumber}" "${hostNumber}"`
                hostNumbers[${hostNumber}]=`printf '/dev/s%s ata%-2.2s' "${scsiDevLetter}" "${ataNumber}"`
                ataNumbers[${ataNumber}]=`printf '/dev/s%s host%-2.2s' "${scsiDevLetter}" "${hostNumber}"`
            elif [[ "$linkval" =~ $usbNumberRE ]] ; then
                usbNumber="${BASH_REMATCH[1]}"
                scsiDevLetters[$scsiDevLetter]=`printf 'usb%-2.2s host%-2.2s' "${usbNumber}" "${hostNumber}"`
                hostNumbers[${hostNumber}]=`printf '/dev/s%s usb%-2.2s' "${scsiDevLetter}" "${usbNumber}"`
                existingUsbValue="${usbNumbers[${usbNumber}]}"
                addedUsbValue=`printf '/dev/s%s host%-2.2s' "${scsiDevLetter}" "${hostNumber}"`
                if [ -n "$existingUsbValue" ] ; then
                    usbNumbers[${usbNumber}]="$existingUsbValue | $addedUsbValue"
                else
                    usbNumbers[${usbNumber}]="$addedUsbValue"
                fi
            else
                echo "Neither ata nor usb: /dev/s${scsiDevLetter} (host${hostNumber}) !"
            fi
        else
            echo "No host number for /dev/s${scsiDevLetter}"
        fi
    fi
done
echo '/dev/sd?'
echo '--------'
for scsiDevLetter in `echo "${!scsiDevLetters[*]}" | tr ' ' '\n' | sort` ; do
    echo "/dev/s${scsiDevLetter} ${scsiDevLetters[$scsiDevLetter]}"
done
echo
echo 'ataNN'
echo '-----'
for ataNumber in `echo "${!ataNumbers[*]}" | tr ' ' '\n' | sort -n` ; do
    printf 'ata%-2.2s %s\n' "$ataNumber" "${ataNumbers[$ataNumber]}"
done
echo
echo 'usbNN'
echo '-----'
for usbNumber in `echo "${!usbNumbers[*]}" | tr ' ' '\n' | sort -n` ; do
    printf 'usb%-2.2s %s\n' "$usbNumber" "${usbNumbers[$usbNumber]}"
done
echo
echo 'hostNN'
echo '------'
for hostNumber in `echo "${!hostNumbers[*]}" | tr ' ' '\n' | sort -n` ; do
    printf 'host%-2.2s %s\n' "$hostNumber" "${hostNumbers[$hostNumber]}"
done
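The ataNN to hostNN lookup that the script's comments perform with a slow find over all of /sys can probably be shortened by starting from /sys/class/ata_port, the same entry point Method 1 uses. This is a sketch only: it assumes the sysfs layout shown in the find output above (the hostNN directory sits directly under the ataNN directory), and the function name scsi_host_for_ata and its optional sysfs-root argument are hypothetical.

```shell
# Print the SCSI host name(s) (e.g. host17) found under a given ATA port.
# $1: ATA port name such as ata18
# $2: sysfs root (hypothetical parameter, defaults to /sys)
scsi_host_for_ata() {
    port=$1
    root=${2:-/sys}
    # The class entry is a symlink to .../ataNN/ata_port/ataNN; cd -P follows
    # it physically, so ../.. ends up in the .../ataNN device directory.
    dir=$(cd -P "$root/class/ata_port/$port/../.." 2>/dev/null && pwd -P) || return 1
    for h in "$dir"/host*; do
        if [ -d "$h" ]; then
            printf '%s\n' "${h##*/}"
        fi
    done
}
```

For the example in the comments, scsi_host_for_ata ata18 would be expected to print host17, which can then be looked up in the lshw or lsscsi output.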
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.