smartmontools

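Smartmontools is a hard drive monitoring toolkit built on SMART (Self-Monitoring, Analysis and Reporting Technology), the self-diagnostic system built into most modern drives. It provides two programs: smartctl, for querying a drive on demand, and smartd, a daemon that checks drives periodically.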

##Install

sudo aptitude install smartmontools

Syntax

smartctl (options) (parameters)

Options

-i <hard disk> Display the identification information of the device
-a <hard disk> Display all SMART information of the device
-H <hard disk> Display the device health status
-A <hard disk> Display the device's vendor-specific SMART attributes and values

Parameters

Hard disk device: the hard disk to inspect (use fdisk -l to list the available disks and their partitions)

~ sudo fdisk -l
Device Start End Sector Size Type
/dev/sda1 2048 1050623 1048576 512M EFI system
/dev/sda2 1050624 976771071 975720448 465.3G Linux file system
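
Note that fdisk -l lists partitions (/dev/sda1, /dev/sda2), whereas SMART data belongs to the drive as a whole (/dev/sda). As a minimal sketch, assuming the common Linux naming schemes (sdXN, nvmeXnYpZ, mmcblkXpY), the parent disk can be derived from a partition name as follows. The function name parent_disk is hypothetical and not part of smartmontools.

```shell
# Hypothetical helper: map a partition path to its parent disk path.
# Assumes the usual Linux device naming; other schemes are not handled.
parent_disk() {
  case "$1" in
    # NVMe and MMC partitions end in "p<number>", e.g. /dev/nvme0n1p2
    /dev/nvme*p[0-9]*|/dev/mmcblk*p[0-9]*)
      printf '%s\n' "${1%p[0-9]*}" ;;
    # NVMe and MMC whole disks already end in a digit; leave them alone
    /dev/nvme*|/dev/mmcblk*)
      printf '%s\n' "$1" ;;
    # SCSI/SATA style: strip the trailing partition number, e.g. /dev/sda1
    *)
      printf '%s\n' "$1" | sed 's/[0-9]*$//' ;;
  esac
}
```

The result could then be passed to smartctl, for example sudo smartctl -i "$(parent_disk /dev/sda1)".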

Example

Check the health status of the /dev/sda hard disk. SMART is a property of the whole drive, so pass the disk (/dev/sda) rather than one of its partitions (/dev/sda1). The "-s on" flag enables SMART on the specified device; omit it if SMART is already enabled. (PASSED means the drive considers itself healthy; FAILED means a failure is imminent, so back up the important data on this disk right away)

~ sudo smartctl -s on -H /dev/sda

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
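
For use in scripts, the result can be pulled out of this output. The following is a minimal sketch that assumes the ATA-style result line shown above (other device types may word the line differently). The function name health_result is hypothetical.

```shell
# Hypothetical helper: read "smartctl -H" output on stdin and print
# only the result word from the overall-health line (e.g. PASSED).
health_result() {
  sed -n 's/^SMART overall-health self-assessment test result: *//p'
}
```

A possible use would be sudo smartctl -H /dev/sda | health_result, comparing the printed word against PASSED.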

View the vendor-specific attributes and values of the /dev/sda hard disk (for example, Power_On_Hours shows that the drive has been powered on for 18195 hours)

~ sudo smartctl -A /dev/sda

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
   3 Spin_Up_Time 0x0023 100 100 002 Pre-fail Always - 1326
   4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 3752
   9 Power_On_Hours 0x0032 055 055 000 Old_age Always - 18195
  10 Spin_Retry_Count 0x0033 174 100 030 Pre-fail Always - 0
  12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 3118
183 Runtime_Bad_Block 0x0032 100 100 001 Old_age Always - 0
184 End-to-End_Error 0x0033 100 100 097 Pre-fail Always - 0
185 Unknown_Attribute 0x0032 100 100 001 Old_age Always - 65535
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 134
188 Command_Timeout 0x0032 100 098 000 Old_age Always - 48
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 2850
192 Power-Off_Retract_Count 0x0022 100 100 000 Old_age Always - 32047593
193 Load_Cycle_Count 0x0032 095 095 000 Old_age Always - 51738
194 Temperature_Celsius 0x0022 060 055 040 Old_age Always - 40 (Min/Max 16/44)
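
To read a single attribute from a table like the one above, the RAW_VALUE column can be selected by attribute name. This is a minimal sketch that assumes the ten-column ATA layout shown here, where ATTRIBUTE_NAME is the second field and RAW_VALUE starts at the tenth. The function name smart_raw_value is hypothetical.

```shell
# Hypothetical helper: read "smartctl -A" output on stdin and print the
# first word of RAW_VALUE (field 10) for the attribute named in $1.
smart_raw_value() {
  awk -v name="$1" '$2 == name { print $10 }'
}
```

For example, sudo smartctl -A /dev/sda | smart_raw_value Power_On_Hours would print the power-on hours as a bare number.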

Run checks at a fixed interval and send notifications of the results

First, edit the smartmontools defaults file (/etc/default/smartmontools) so that smartd starts at system startup, and set the check interval in seconds (e.g. 7200 = 2 hours)

start_smartd=yes
smartd_opts="--interval=7200"

Next, edit the smartd configuration file (/etc/smartd.conf) and add the following line.

/dev/sda -m [email protected] -M test

Option description

-m Send the warning report to the given email address. This can be a local system user such as root, or an external address like [email protected] if the server has been configured to send email outside the system.
-M Select the type of email report:
   once: Send only one warning email for each type of disk problem detected.
   daily: Send an additional reminder email once per day for each type of disk problem detected.
   diminishing: Send additional reminder emails for each type of problem detected, first after one day, then after two days, then after four days, and so on. Each interval is twice as long as the previous one.
   test: Send a single test email as soon as smartd starts.
   exec PATH: Run the executable at PATH instead of the default mail command. PATH must point to an executable binary or script, which lets you perform any desired action when a problem is detected (flashing the console, shutting down the system, etc.).
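
As an illustration of exec PATH, here is a minimal sketch of what such a script might contain. When smartd runs the executable, it exports details of the problem in environment variables such as SMARTD_DEVICE, SMARTD_FAILTYPE and SMARTD_MESSAGE (see the smartd.conf manual page). The function name format_alert and the log file path in the comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical handler logic for a script referenced by "-M exec PATH".
# Builds one log line from the variables smartd sets for the script.
# An optional first argument overrides the timestamp.
format_alert() {
  printf '%s device=%s type=%s message=%s\n' \
    "${1:-$(date '+%Y-%m-%dT%H:%M:%S')}" \
    "${SMARTD_DEVICE:-unknown}" \
    "${SMARTD_FAILTYPE:-unknown}" \
    "${SMARTD_MESSAGE:-none}"
}

# A real script might then append the line to a log file (assumed path):
#   format_alert >> /var/log/smartd-alerts.log
```

Keeping the formatting in a function makes it possible to try the script by hand, setting the variables yourself, before wiring it into /etc/smartd.conf.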

Save changes and restart smartd.