1---
2title: "Replacing YouTube & Invidious"
3description: "Simple script I created to download YouTube videos"
4author: Amolith
5cover: /assets/pngs/download-video.png
6date: 2020-08-03T05:43:37-04:00
7draft: false
8toc: true
9categories:
10 - Technology
11tags:
12 - YouTube
13 - Invidious
14 - youtube-dl
15 - Media
16 - Automation
17---
18
19Omar Roth, the developer of Invidious, recently wrote a blog post about
20*[Stepping away from open
21source.](https://omar.yt/posts/stepping-away-from-open-source)* While I
22never used the official instance, I thought this was a good opportunity
23to create a tool that downloads videos from YouTubers I'm subscribed to
24so I can watch them offline in whatever manner I prefer.
25
26To that end, [youtube-dl](https://github.com/ytdl-org/youtube-dl) is by
27far the most reliable and versatile option. Having been around since
28before 2008[^1], I don't think the project is going anywhere.
29[MPV](https://mpv.io/) is my media player of choice and it relies on
30youtube-dl for watching online content from Twitch to YouTube to PornHub
31to [much
32more.](https://ytdl-org.github.io/youtube-dl/supportedsites.html)
33
34Conveniently, youtube-dl comes with all of the tools and flags needed
35for exactly this purpose so scripting and automating it is incredibly
36easy. Taking a look at `man youtube-dl` reveals a *plethora* of options
37but only a few are necessary.
38
39## Preventing duplicates
40The main thing when downloading an entire channel is ensuring the same
41video isn't downloaded more than once. `--download-archive` will save the
42IDs of all the videos downloaded to prevent them from being downloaded
43again. All that's required is a file path at which to store the list.
44
45```bash
46--download-archive .archives/<channel>.txt
47```
48
49## Selecting quality
50By default, youtube-dl downloads the *single* file with the best quality
51audio and video so it doesn't need to mux[^2] them together after.
52However, I prefer to have the highest possibly quality and don't mind
53waiting a few seconds longer for FFmpeg to combine audio and video
54files. `--format` lets the user decide which format they prefer and
55whether they want to focus on storage efficiency or quality. The basic
56options are `best`, `worst`, `bestvideo`, and `bestaudio`. These will
57download a *single* file that is the highest or lowest quality of both
58video/audio or video-only/audio-only. There are a lot of other options
59for more fine-grained control over what to look for but, as I mentioned,
60I want the highest video and the highest quality audio so I use
61`bestvideo+bestaudio`. After downloading both of those files, they will
62be muxed together.
63
64```bash
65--format bestvideo+bestaudio
66```
67
68## Embedding subtitles
69I enjoy subtitles but I know many people don't so this section can
70certainly be ignored.
71
72There are a few options for fetching and embedding subtitles. To get
73"real" subtitles that someone transcribed manually, use `--write-sub`.
74For YouTube's auto-generated subtitles that may or may not be horribly
75inaccurate, use `--write-auto-sub`. For selecting the language,
76`--sub-lang <language-code>`, for embedding them, `--embed-subs`, and
77for selecting the format, `--sub-format <desired-format>`. I like using
78SRT files so I have that first with `best` beside it like so:
79`--sub-format srt/best`. SRT is preferred but, if unavailable, whatever
80other highest-quality format will be used and embedded instead.
81
82```bash
83--write-sub --write-auto-sub --sub-format srt/best --sub-lang en --embed-subs
84```
85
86## Limiting downloads
87When switching to this method, the initial download will pull *all* of a
88channel's videos. I certainly don't want this so they should be limited
89in some way. `--dateafter` and `--playlist-end` serve very nicely. The
90former will only download videos published *after* a certain date and
91the latter will only download X number of videos.
92
93```bash
94--dateafter 20200801 --playlist-end 5
95```
96
97**EDIT:** A reader sent me an email with this improvement to the archive
98functionality. Rather than checking the last five days of videos to see
99if they've already been downloaded, this snippet will check when the
100archive file was last edited and use that as the `--dateafter`
101parameter, making the script a bit more efficient.
102
103``` bash
104AFTER=$(date -r .archives/"$1".txt +%Y%m%d 2>/dev/null || date +%Y%m%d)
105--dateafter $AFTER --playlist-end 5
106```
107
108## Naming format
109The final parameter to look at is how to name the files once they're
110downloaded. `--output` provides templating functionality and there are a
111*lot* of options. For this use, an acceptable template might be
112something like `Channel/Title.ext`. In youtube-dl's templating format,
113that's `"%(uploader)s/%(title)s.%(ext)s"`.
114
115```bash
116--output "%(uploader)s/%(title)s.%(ext)s"
117```
118
119## Getting notifications
120I don't yet have a good method for getting notifications when there are
121*new* videos but there is a simple way to get notified when the script
122is finished running. `notify-send` is one of the easiest and has pretty
123simple syntax as well: the first string is the notification summary and
124the second is a longer description. You can optionally pass an icon name
125to make it look a little better.
126
127```bash
128notify-send -i video-x-generic "Downloads finished" "Check the YouTube folder for new videos"
129```
130
131For some reason, I don't get icons when the generic name is specified
132but I know the command will work on most systems. On mine, I have to
133pass the path to the icon file I want: `-i
134/usr/share/icons/Suru++-Dark/apps/64/video.svg`
135
136## Writing the script
137I want to store the videos in `~/Videos/YouTube` and I want the archive
138records stored in `.archives` so the first line (after the shebang[^3])
139creates those directories if they don't already exist and the second
140enters the `YouTube` folder.
141
142```bash
143mkdir -p "$HOME/Videos/YouTube/.archives"
144cd "$HOME/Videos/YouTube"
145```
146
147From here, a way to reuse the youtube-dl command is necessary so
148parameters can be changed in one place and they'll apply to all
149channels. Functions are intended for exactly this purpose and are
150formatted like so:
151
152```bash
153functionName () {
154 # Code here
155}
156```
157
158I've named the function `dl` so mine looks like this:
159```bash
160dl () {
161 AFTER=$(date -r .archives/"$1".txt +%Y%m%d 2>/dev/null || \
162 date +%Y%m%d)
163
164 youtube-dl --download-archive .archives/"$1".txt -f \
165 bestvideo+bestaudio --dateafter 20200801 --write-sub \
166 --write-auto-sub --sub-format srt/best --sub-lang en \
167 --embed-subs -o "%(uploader)s/%(title)s.%(ext)s" \
168 --playlist-end 5 "$2"
169 sleep 5
170}
171```
172
173Because it's in a function that's called repeatedly, the `AFTER`
174variable will be reevaluated each time using a different archive file to
175ensure no videos are missed. The backslashes at the end (`\`) tell bash
176that it's a single command spanning multiple lines. At the bottom,
177`sleep` just waits 5 seconds before downloading the next channel. It's
178unlikely that YouTube will ratelimit a residential address for this but
179it is still possible. Waiting a bit before continuing reduces the
180likelihood further.
181
182Note the use of `"$1"` and `"$2"` in the archive path and at the very
183end of the youtube-dl command. This lets the user define what the
184archive file should be named and what channel to download videos from. A
185line using the function would be something like:
186
187```bash
188dl linustechtips https://www.youtube.com/user/LinusTechTips
189```
190
191The result would be this directory structure:
192```text
193YouTube
194|-- .archives
195| \-- linustechtips.txt
196\-- Linus Tech Tips
197 \-- They still make MP3 players.mkv
198```
199
200The last line is the notification command:
201```bash
202notify-send -i video-x-generic "Downloads finished" "Check the YouTube folder for new videos"
203```
204
205## Finished script
206This the script I have in use right now.
207
208```bash
209#!/bin/bash
210
211mkdir -p "$HOME/Videos/YouTube/.archives"
212cd "$HOME/Videos/YouTube"
213
214
215dl () {
216 AFTER=$(date -r .archives/"$1".txt +%Y%m%d 2>/dev/null || \
217 date +%Y%m%d)
218
219 youtube-dl --download-archive .archives/"$1".txt -f \
220 bestvideo+bestaudio --dateafter $AFTER --write-sub \
221 --write-auto-sub --sub-format srt/best --sub-lang en \
222 --embed-subs -o "%(uploader)s/%(title)s.%(ext)s" \
223 --playlist-end 5 "$2"
224 sleep 5
225}
226
227dl vsauce https://www.youtube.com/user/Vsauce
228dl avikaplan https://www.youtube.com/user/AviKaplanMusic
229dl robscallon https://www.youtube.com/user/robs70986987
230dl logosbynick https://www.youtube.com/channel/UCEQXp_fcqwPcqrzNtWJ1w9w
231dl andrewhuang https://www.youtube.com/user/songstowearpantsto
232dl brandonacker https://www.youtube.com/user/brandonacker
233dl lastweektonight https://www.youtube.com/user/LastWeekTonight
234dl bingingwithbabish https://www.youtube.com/user/bgfilms
235
236notify-send --icon /usr/share/icons/Suru++-Dark/apps/64/video.svg "Downloads finished" "Check the YouTube folder for new videos"
237```
238
239## Automation
240This is a very simple process.
2411. Store your script wherever you want but take note of the directory.
2422. Run `crontab -e`
243 * If you don't already have a cron utility installed, try `cronie`.
244 It should be in most repos.
2454. Paste and edit: `0 */6 * * * /home/user/path/to/script.sh`
2465. Save
2476. Exit
2487. ???
2498. Profit
250
251The pasted line runs the script every 6th hour of every day, every week,
252every month, and every year. To change the frequency just run `crontab
253-e`, edit the line, and save. [Crontab
254Generator](https://crontab-generator.org/) or [Crontab
255Guru](https://crontab.guru/) might be useful if the syntax is confusing.
256
257Have fun!
258
259## Edits
2601. Synchronised `--dateafter` parameter with last modified timestamp of the
261 archive file — thank you Dominic!
262
263[^1]: The [first commit](https://github.com/ytdl-org/youtube-dl/commit/4fa74b5252a23c2890ddee52b8ee5811b5bb2987) for the current youtube-dl repository was made on
26421 July 2008 and it references "the new youtube-dl", which suggests
265versions prior.
266[^2]: In the digital media world, muxing is process of combining two or
267more files into one. This might be a video track, multiple audio tracks,
268and/or subtitle tracks.
269[^3]: A *shebang* is a sequence of characters that tell your program
270loader what interpreter to use. `#!/bin/bash` is what you would use if
271it's a bash script. To use a Python interpreter, `#!/usr/bin/env python`
272will use the program search path to find the appropriate executable.