Using `mkvtoolnix` and `GNU Parallel` to Whip Some .ASS
I recently needed to work some `mkvmerge`/`mkvpropedit` magic for some
anime. It’s been a long time since the days when I started watching
Bleach with my friends by sshing to one laptop from another and
using mplayer in a terminal to play the show, and the state of mkv
support has come a very long way since then.
These days I prefer playing all my media through Plex, which is just hard to describe in all its towering greatness. The show that I wanted to watch, though, had its ASS subtitles and accompanying fonts broken out from its MKV files for reasons unknown to me. Because of this, and the amazing malleability of MKV, I decided to roll up my sleeves and fix the problem myself.
shopt -s extglob
parallel -q --tagstring {/.} --line-buffer mkvmerge \
-o {.}_merged.mkv {} --language 0:eng {.}.en.ass \
--attachment-description '' \
--attachment-mime-type application/x-truetype-font \
--attach-file {//}/'fonts/A-OTF-FutoMinA101Pro-Bold.otf' \
… 147 similar lines …
::: Season*/!(*_merged).mkv
I’m writing this up at all for a few reasons:
- Often what I’m doing with GNU Parallel is either complex enough that it warrants an actual shell function or script, or so simple that it more or less amounts to slapping one of the arguments onto the end. Because of this I’ve rarely explored `parallel`’s native expansion support, which this task gave me an opportunity to do.
- The task was also complex enough that I learned about `parallel`’s double expansion, which I’m sure has been the source of many frustrating missteps in the past now that I know it’s there. Essentially bash will process the arguments to `parallel` first, obeying all the normal rules, and then pass them to the `exec` of `parallel`; then `parallel` will pass them again to a subshell, which will parse them again through all the normal rules. Keeping this all straight in your head is not easy, but I think the essential rule is probably something like “If you actually notice the double expansion you should write a function/script for the behavior, and otherwise you should probably be running with `-q`”, or something similar but better worded than that.
- It’s fun to mess with mkv files. ¯\_(ツ)_/¯
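The double parse is easier to see in isolation. Here is a small sketch that emulates the second pass with `bash -c` instead of running GNU Parallel itself; the file name is hypothetical, and the `printf '%q'` step only approximates the kind of protection `-q` gives the command’s arguments.

```bash
#!/usr/bin/env bash
# Emulates the two parsing passes with `bash -c`; GNU Parallel is not run here.

file='Season 1/Episode 01.mkv'   # hypothetical path containing spaces

# Pass 1 (this shell) expands $file into the command text. Pass 2 (the inner
# shell) parses that text again, so the unprotected spaces split the path.
split_count=$(bash -c "set -- $file; echo \$#")

# Re-quoting the word before the second pass keeps it intact, which is
# roughly the job -q does for the command's arguments.
protected=$(printf '%q' "$file")
kept_count=$(bash -c "set -- $protected; echo \$#")

echo "unprotected: $split_count words"
echo "protected:   $kept_count word"
```

The first count comes out as 3 and the second as 1: the same string is one argument or three depending on whether it was protected before the second parse.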
To review the script:
- We’re setting `extglob` because we want to be able to process only `mkv` files that haven’t been `_merged` yet.
- We’re invoking `parallel` with `-q` because we don’t want the subshell to word split our carefully constructed arguments again.
- We’re using `--tagstring {/.} --line-buffer` because we want to see our job output live so we know it’s working (it takes a bit to make these changes to the matroska files). It’s not directly documented AFAICT that `--tagstring` supports GNU Parallel expansion, but I took a chance on it and was pleasantly surprised. This particular expansion is the basename-sans-extension version, which seemed like a sensible tag for the logged lines. This feature alone is worth using `parallel` for when you’re really doing complex parallel work. It’s a really efficient way to provide active log lines that are still comprehensible later on.
- `--line-buffer` just makes sure that you always get a full line from a job rather than the lines from the various jobs mixing mid-line.
- `-o {.}_merged.mkv` is a nice way to express the bash-ism of `${x%.*}_merged.mkv`.
- `--language 0:eng {.}.en.ass` is a bit speculative, as it’s what I think I should have run, but I didn’t initially since I was working off an example that didn’t include it. Nevertheless I think I’m interpreting the docs correctly. Originally I just didn’t include the `--language 0:eng` bit, so the subtitles were attached as an unknown language and I then had to go back through and correct that with an `mkvpropedit` run.
- Then I used a bit of Emacs Keyboard Macro magic to transform the Dired fonts buffer into a series of attachment arguments so that the ASS subtitle track could be properly rendered. The `parallel` expansion of `{//}` there was also something I hadn’t seen before, and is how I managed to get away with running this from the root of the seasons directory rather than needing to process the seasons one at a time. That one expands to the dirname of the input line, or as a bash-ism `${d%/*}`.
- Finally we’re taking arguments from the `extglob` pattern that matches all the `Season`s’ mkv files excluding the `_merged` files, which you can see are what the `parallel` tasks are actually producing.
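To sanity-check the glob and the replacement strings without touching any real files, a dry run along these lines can help. It is a sketch under assumptions: the directory tree and episode names are hypothetical, and plain bash parameter expansions stand in for `{.}`, `{/.}` and `{//}` rather than `parallel` doing the substitution.

```bash
#!/usr/bin/env bash
# Dry run: prints what each job would receive, using bash expansions in
# place of GNU Parallel's replacement strings. Nothing is merged.
shopt -s extglob   # has to be on before the !(...) pattern below is parsed

# Throwaway tree with hypothetical file names.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/Season 1"
touch "$workdir/Season 1/ep01.mkv" \
      "$workdir/Season 1/ep01_merged.mkv" \
      "$workdir/Season 1/ep02.mkv"
cd "$workdir" || exit 1

# Same glob as the script: every mkv except the *_merged.mkv outputs.
inputs=(Season*/!(*_merged).mkv)

for f in "${inputs[@]}"; do
  no_ext=${f%.*}       # like {.}   input path without its extension
  tag=${no_ext##*/}    # like {/.}  basename without extension
  dir=${f%/*}          # like {//}  directory of the input
  printf '%s\tout=%s_merged.mkv subs=%s.en.ass fonts=%s/fonts\n' \
    "$tag" "$no_ext" "$no_ext" "$dir"
done
```

Only `ep01.mkv` and `ep02.mkv` are picked up, so rerunning the real command after a partial run would not feed the `_merged` outputs back in as inputs.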
With all this I was able to completely saturate my wired connection to my Synology and efficiently process the entire show. I love the smell of burning silicon in the morning.
I hope this little foray into parallel/mkvtoolnix land teaches you
something like it taught me.
Happy scripting!