Detect I/O failures on Tails partition(s)
Example logs of errors when trying to create a Persistent Storage, caused by hardware problems:
(from #5856 (comment 226495))
kernel: sd 6:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=31s
kernel: sd 6:0:0:0: [sdb] tag#0 Sense Key : Unit Attention [current]
kernel: sd 6:0:0:0: [sdb] tag#0 Add. Sense: Not ready to ready change, medium may have changed
kernel: sd 6:0:0:0: [sdb] tag#0 CDB: Write(10) 2a 00 04 cc 80 00 00 00 40 00
kernel: I/O error, dev sdb, sector 80510976 op 0x1:(WRITE) flags 0x800 phys_seg 8 prio class 2
kernel: Buffer I/O error on dev dm-0, logical block 7962624, lost async page write
(from !1427 (comment 228649))
Feb 21 12:03:13 amnesia kernel: JBD2: I/O error when updating journal superblock for dm-0-8.
Feb 21 12:03:13 amnesia kernel: EXT4-fs error (device dm-0): ext4_journal_check_start:83: comm python3: Detected aborted journal
Feb 21 12:03:13 amnesia kernel: sd 10:0:0:0: [sdc] tag#0 device offline or changed
Feb 21 12:03:13 amnesia kernel: I/O error, dev sdc, sector 16809984 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 2
Feb 21 12:03:13 amnesia kernel: Buffer I/O error on dev dm-0, logical block 0, lost sync page write
Feb 21 12:03:13 amnesia kernel: EXT4-fs (dm-0): I/O error while writing superblock
Feb 21 12:03:13 amnesia kernel: EXT4-fs (dm-0): Remounting filesystem read-only
(from #5856 (comment 228671))
kernel: critical target error, dev sdc, sector 21266432 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
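For reviewers who want to experiment with the sample lines above, here is a minimal sketch of how such messages could be matched and the device name extracted. The three regular expressions and the `failing_device` helper are hypothetical, modelled only on the logs quoted here; they are not the patterns used by this merge request.

```python
import re

# Hypothetical patterns, derived only from the sample log lines above.
# They are NOT the patterns used by the actual implementation.
DEVICE_ERROR = re.compile(
    r"(?:I/O error|critical target error), dev (?P<dev>\w+), sector (?P<sector>\d+)"
)
BUFFER_ERROR = re.compile(r"Buffer I/O error on dev (?P<dev>[\w-]+),")
EXT4_ERROR = re.compile(r"EXT4-fs error \(device (?P<dev>[\w-]+)\)")


def failing_device(line):
    """Return the device named in a kernel I/O error line, or None."""
    for pattern in (DEVICE_ERROR, BUFFER_ERROR, EXT4_ERROR):
        match = pattern.search(line)
        if match:
            return match.group("dev")
    return None


if __name__ == "__main__":
    sample = "kernel: critical target error, dev sdc, sector 21266432 op 0x1:(WRITE)"
    print(failing_device(sample))  # prints "sdc"
```

Note that the block device (`sdb`, `sdc`) and the device-mapper name (`dm-0`) appear in different messages, so a real detector would probably need to map one to the other before comparing against the boot device.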
Run the following Python script as root to trigger a fake I/O error (keep only one of the three journal.send calls):
from systemd import journal
# Does NOT trigger _update_patterns; creates /var/lib/live/tails.disk.ioerrors
journal.send("SQUASHFS error: A fake error.", SYSLOG_IDENTIFIER="kernel", PRIORITY=journal.LOG_ERR)
# Triggers _update_patterns; does NOT create /var/lib/live/tails.disk.ioerrors
journal.send("A fake I/O error.", SYSLOG_IDENTIFIER="kernel", PRIORITY=journal.LOG_ERR)
# Triggers _update_patterns; creates /var/lib/live/tails.disk.ioerrors
journal.send("EXT4-fs error (device dm-0)", SYSLOG_IDENTIFIER="kernel", PRIORITY=journal.LOG_ERR)
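After sending one of the fake messages, the presence of the flag file can be checked with a small helper like the one below. This is only a sketch: the path is the one named in the comments above, and `flag_status` is a hypothetical helper, not part of this merge request.

```python
from pathlib import Path

# Path taken from the comments in the script above.
FLAG_FILE = Path("/var/lib/live/tails.disk.ioerrors")


def flag_status(path=FLAG_FILE):
    """Return "present" if the I/O error flag file exists, else "absent"."""
    return "present" if Path(path).exists() else "absent"


if __name__ == "__main__":
    print(f"{FLAG_FILE}: {flag_status()}")
```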
Manual tests
- start ISO in VM, trigger fake I/O errors.
  - Check that boot_device is correct
- start USB image in VM, trigger fake I/O errors.
  - Check that boot_device is correct
- start USB image with tps activated in VM, trigger fake I/O errors.
  - Check that boot_device is correct
Skipped tests, as we don't have the faulty hardware
- tests with real faulty USB sticks
  - Check that boot_device is correct
- tests with real faulty ISOs
  - Check that boot_device is correct
Tests that may be nice, but out of scope
The approach mentioned by boyska in !1427 (comment 229941) would give us a way to test these cases, but I think that is more a case for #15451 (closed).
- start from USB stick without tps; an I/O error happens while creating the tps
  - Check if we detect it (unclear if udisks already knows the device name)
- boot with tps created before; an I/O error happens while mounting the tps
  - Check if the error is detected (unclear if udisks already knows the device name)
Closes #5856 (closed)