1---
  2title: "Replacing YouTube & Invidious"
  3description: "Simple script I created to download YouTube videos"
  4author: Amolith
  5cover: /assets/pngs/download-video.png
  6date: 2020-08-03T05:43:37-04:00
  7draft: false
  8toc: true
  9categories:
 10  - Technology
 11tags:
 12  - YouTube
 13  - Invidious
 14  - youtube-dl
 15  - Media
 16  - Automation
 17---
 18
 19Omar Roth, the developer of Invidious, recently wrote a blog post about
 20*[Stepping away from open
 21source.](https://omar.yt/posts/stepping-away-from-open-source)* While I
 22never used the official instance, I thought this was a good opportunity
 23to create a tool that downloads videos from YouTubers I'm subscribed to
 24so I can watch them offline in whatever manner I prefer.
 25
 26To that end, [youtube-dl](https://github.com/ytdl-org/youtube-dl) is by
 27far the most reliable and versatile option. Having been around since
 28before 2008[^1], I don't think the project is going anywhere.
 29[MPV](https://mpv.io/) is my media player of choice and it relies on
 30youtube-dl for watching online content from Twitch to YouTube to PornHub
 31to [much
 32more.](https://ytdl-org.github.io/youtube-dl/supportedsites.html)
 33
 34Conveniently, youtube-dl comes with all of the tools and flags needed
 35for exactly this purpose so scripting and automating it is incredibly
 36easy. Taking a look at `man youtube-dl` reveals a *plethora* of options
 37but only a few are necessary.
 38
 39## Preventing duplicates
 40The main thing when downloading an entire channel is ensuring the same
 41video isn't downloaded more than once. `--download-archive` will save the
 42IDs of all the videos downloaded to prevent them from being downloaded
 43again. All that's required is a file path at which to store the list.
 44
 45```bash
 46--download-archive .archives/<channel>.txt
 47```
 48
 49## Selecting quality
 50By default, youtube-dl downloads the *single* file with the best quality
 51audio and video so it doesn't need to mux[^2] them together after.
 52However, I prefer to have the highest possibly quality and don't mind
 53waiting a few seconds longer for FFmpeg to combine audio and video
 54files. `--format` lets the user decide which format they prefer and
 55whether they want to focus on storage efficiency or quality. The basic
 56options are `best`, `worst`, `bestvideo`, and `bestaudio`. These will
 57download a *single* file that is the highest or lowest quality of both
 58video/audio or video-only/audio-only. There are a lot of other options
 59for more fine-grained control over what to look for but, as I mentioned,
 60I want the highest video and the highest quality audio so I use
 61`bestvideo+bestaudio`. After downloading both of those files, they will
 62be muxed together.
 63
 64```bash
 65--format bestvideo+bestaudio
 66```
 67
 68## Embedding subtitles
 69I enjoy subtitles but I know many people don't so this section can
 70certainly be ignored.
 71
 72There are a few options for fetching and embedding subtitles. To get
 73"real" subtitles that someone transcribed manually, use `--write-sub`.
 74For YouTube's auto-generated subtitles that may or may not be horribly
 75inaccurate, use `--write-auto-sub`. For selecting the language,
 76`--sub-lang <language-code>`, for embedding them, `--embed-subs`, and
 77for selecting the format, `--sub-format <desired-format>`. I like using
 78SRT files so I have that first with `best` beside it like so:
 79`--sub-format srt/best`. SRT is preferred but, if unavailable, whatever
 80other highest-quality format will be used and embedded instead.
 81
 82```bash
 83--write-sub --write-auto-sub --sub-format srt/best --sub-lang en --embed-subs
 84```
 85
 86## Limiting downloads
 87When switching to this method, the initial download will pull *all* of a
 88channel's videos. I certainly don't want this so they should be limited
 89in some way. `--dateafter` and `--playlist-end` serve very nicely. The
 90former will only download videos published *after* a certain date and
 91the latter will only download X number of videos.
 92
 93```bash
 94--dateafter 20200801 --playlist-end 5
 95```
 96
 97**EDIT:** A reader sent me an email with this improvement to the archive
 98functionality. Rather than checking the last five days of videos to see
 99if they've already been downloaded, this snippet will check when the
100archive file was last edited and use that as the `--dateafter`
101parameter, making the script a bit more efficient.
102
103``` bash
104AFTER=$(date -r .archives/"$1".txt +%Y%m%d 2>/dev/null || date +%Y%m%d)
105--dateafter $AFTER --playlist-end 5
106```
107
108## Naming format
109The final parameter to look at is how to name the files once they're
110downloaded. `--output` provides templating functionality and there are a
111*lot* of options. For this use, an acceptable template might be
112something like `Channel/Title.ext`. In youtube-dl's templating format,
113that's `"%(uploader)s/%(title)s.%(ext)s"`.
114
115```bash
116--output "%(uploader)s/%(title)s.%(ext)s"
117```
118
119## Getting notifications
120I don't yet have a good method for getting notifications when there are
121*new* videos but there is a simple way to get notified when the script
122is finished running. `notify-send` is one of the easiest and has pretty
123simple syntax as well: the first string is the notification summary and
124the second is a longer description. You can optionally pass an icon name
125to make it look a little better.
126
127```bash
128notify-send -i video-x-generic "Downloads finished" "Check the YouTube folder for new videos"
129```
130
131For some reason, I don't get icons when the generic name is specified
132but I know the command will work on most systems. On mine, I have to
133pass the path to the icon file I want: `-i
134/usr/share/icons/Suru++-Dark/apps/64/video.svg`
135
136## Writing the script
137I want to store the videos in `~/Videos/YouTube` and I want the archive
138records stored in `.archives` so the first line (after the shebang[^3])
139creates those directories if they don't already exist and the second
140enters the `YouTube` folder.
141
142```bash
143mkdir -p "$HOME/Videos/YouTube/.archives"
144cd "$HOME/Videos/YouTube"
145```
146
147From here, a way to reuse the youtube-dl command is necessary so
148parameters can be changed in one place and they'll apply to all
149channels. Functions are intended for exactly this purpose and are
150formatted like so:
151
152```bash
153functionName () {
154    # Code here
155}
156```
157
158I've named the function `dl` so mine looks like this:
159```bash
160dl () {
161    AFTER=$(date -r .archives/"$1".txt +%Y%m%d 2>/dev/null ||   \
162        date +%Y%m%d)
163
164    youtube-dl --download-archive .archives/"$1".txt -f         \
165        bestvideo+bestaudio --dateafter 20200801 --write-sub    \
166        --write-auto-sub --sub-format srt/best --sub-lang en    \
167        --embed-subs -o "%(uploader)s/%(title)s.%(ext)s"        \
168        --playlist-end 5 "$2"
169    sleep 5
170}
171```
172
173Because it's in a function that's called repeatedly, the `AFTER`
174variable will be reevaluated each time using a different archive file to
175ensure no videos are missed. The backslashes at the end (`\`) tell bash
176that it's a single command spanning multiple lines. At the bottom,
177`sleep` just waits 5 seconds before downloading the next channel. It's
178unlikely that YouTube will ratelimit a residential address for this but
179it is still possible. Waiting a bit before continuing reduces the
180likelihood further.
181
182Note the use of `"$1"` and `"$2"` in the archive path and at the very
183end of the youtube-dl command. This lets the user define what the
184archive file should be named and what channel to download videos from. A
185line using the function would be something like:
186
187```bash
188dl linustechtips https://www.youtube.com/user/LinusTechTips
189```
190
191The result would be this directory structure:
192```text
193YouTube
194|-- .archives
195|   \-- linustechtips.txt
196\-- Linus Tech Tips
197    \-- They still make MP3 players.mkv
198```
199
200The last line is the notification command:
201```bash
202notify-send -i video-x-generic "Downloads finished" "Check the YouTube folder for new videos"
203```
204
205## Finished script
206This the script I have in use right now.
207
208```bash
209#!/bin/bash
210
211mkdir -p "$HOME/Videos/YouTube/.archives"
212cd "$HOME/Videos/YouTube"
213
214
215dl () {
216    AFTER=$(date -r .archives/"$1".txt +%Y%m%d 2>/dev/null ||   \
217        date +%Y%m%d)
218
219    youtube-dl --download-archive .archives/"$1".txt -f         \
220        bestvideo+bestaudio --dateafter $AFTER --write-sub      \
221        --write-auto-sub --sub-format srt/best --sub-lang en    \
222        --embed-subs -o "%(uploader)s/%(title)s.%(ext)s"        \
223        --playlist-end 5 "$2"
224    sleep 5
225}
226
227dl vsauce               https://www.youtube.com/user/Vsauce
228dl avikaplan            https://www.youtube.com/user/AviKaplanMusic
229dl robscallon           https://www.youtube.com/user/robs70986987
230dl logosbynick          https://www.youtube.com/channel/UCEQXp_fcqwPcqrzNtWJ1w9w
231dl andrewhuang          https://www.youtube.com/user/songstowearpantsto
232dl brandonacker         https://www.youtube.com/user/brandonacker
233dl lastweektonight      https://www.youtube.com/user/LastWeekTonight
234dl bingingwithbabish    https://www.youtube.com/user/bgfilms
235
236notify-send --icon /usr/share/icons/Suru++-Dark/apps/64/video.svg "Downloads finished" "Check the YouTube folder for new videos"
237```
238
239## Automation
240This is a very simple process.
2411. Store your script wherever you want but take note of the directory.
2422. Run `crontab -e`
243   * If you don't already have a cron utility installed, try `cronie`.
244     It should be in most repos.
2454. Paste and edit: `0 */6 * * * /home/user/path/to/script.sh`
2465. Save
2476. Exit
2487. ???
2498. Profit
250
251The pasted line runs the script every 6th hour of every day, every week,
252every month, and every year. To change the frequency just run `crontab
253-e`, edit the line, and save. [Crontab
254Generator](https://crontab-generator.org/) or [Crontab
255Guru](https://crontab.guru/) might be useful if the syntax is confusing.
256
257Have fun!
258
259## Edits
2601. Synchronised `--dateafter` parameter with last modified timestamp of the
261   archive file — thank you Dominic!
262
263[^1]: The [first commit](https://github.com/ytdl-org/youtube-dl/commit/4fa74b5252a23c2890ddee52b8ee5811b5bb2987) for the current youtube-dl repository was made on
26421 July 2008 and it references "the new youtube-dl", which suggests
265versions prior.
266[^2]: In the digital media world, muxing is process of combining two or
267more files into one. This might be a video track, multiple audio tracks,
268and/or subtitle tracks.
269[^3]: A *shebang* is a sequence of characters that tell your program
270loader what interpreter to use. `#!/bin/bash` is what you would use if
271it's a bash script. To use a Python interpreter, `#!/usr/bin/env python`
272will use the program search path to find the appropriate executable.