Recover from buggy HTML
On my sid system, check-mirrors.rb fails on one mirror whose web page has invalid HTML:

    $ ./check-mirrors.rb --channel ${DIST:?} --allow-multiple --debug --fast tails-amd64-${VERSION:?} --url-prefix https://ftp.nluug.nl/os/Linux/distr/tails/tails/
    version: tails-amd64-4.15
    fetch: https://tails.boum.org/inc/trace
    trace: 1611584352
    mirror: https://ftp.nluug.nl/os/Linux/distr/tails/tails/
    fetch: https://ftp.nluug.nl/os/Linux/distr/tails/tails//project/trace
    trace: 1611584352
    fetch: https://ftp.nluug.nl/os/Linux/distr/tails/tails//stable/
    Traceback (most recent call last):
    7: from ./check-mirrors.rb:447:in `<main>'
    6: from ./check-mirrors.rb:447:in `each'
    5: from ./check-mirrors.rb:465:in `block in <main>'
    4: from ./check-mirrors.rb:200:in `check_versions'
    3: from ./check-mirrors.rb:187:in `scan_for_links'
    2: from /usr/lib/ruby/vendor_ruby/nokogiri/html.rb:16:in `HTML'
    1: from /usr/lib/ruby/vendor_ruby/nokogiri/html/document.rb:215:in `parse'
    /usr/lib/ruby/vendor_ruby/nokogiri/html/document.rb:215:in `read_memory': Parser without recover option encountered error or warning: 30:8: ERROR: Opening and ending tag mismatch: font and b (Nokogiri::XML::SyntaxError)
    zsh: exit 1 ./check-mirrors.rb --channel ${DIST:?} --allow-multiple --debug --fast

To work around this, enable the Nokogiri RECOVER parser option, whose documentation reads "Recover from errors":

    https://nokogiri.org/rdoc/Nokogiri/XML/ParseOptions.html