Commit ff604e55 authored by anonym

Make our uBlock Origin database dump even more diff-friendly.

There were some super long lines listing different things separated
with "|" and "," so let's split on those characters too.
parent abf0a12b
@@ -154,10 +154,18 @@ The patterns+settings file is stored as a converted sqlite text dump in
from this Tor Browser instance into the root of Tails' Git repo and
run the following command:
# For posterity: the general idea is to introduce \r\n as a
# token where we have made a line break to make the dump more
# diff-friendly (and, hence, Git-friendly). The most complex
# expression is the one done with perl, where we employ
# negative lookahead. What it means, is: replace single
# occurrences of | except when followed by \\n.
echo '.dump' | sqlite3 ublock0.sqlite | \
grep -v "cached_asset_content://cache://compiled-" | \
awk '!/^INSERT/; /^INSERT/ {print $0 | "sort -n"}' | \
sed 's_\\n_\\n\r\n_g' | \
sed 's_,_,\r\n_g' | \
perl -pi -e 's/([^|])\|((?!\||\\n).)/\1\|\r\n\2/g' \
sed "/^INSERT INTO \"settings\" VALUES('\(remoteBlacklists\|cached_asset_entries\)'/"'s_,_,\r\n_g' > \
