path: root/youtube_dl/utils.py
Age | Commit message | Author
2023-07-29 | [utils] Rework URL path munging for ., .. components | dirkf
* move processing to YoutubeDLHandler
* also process `Location` header for redirect
* use tests from https://github.com/yt-dlp/yt-dlp/pull/7662
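The `.` and `..` handling named in this commit can be illustrated with a minimal sketch modelled on the dot-segment removal of RFC 3986, section 5.2.4. The function name `remove_dot_segments` is hypothetical and is not taken from `utils.py`; the project's own code may differ in detail.

```python
def remove_dot_segments(path):
    """Resolve '.' and '..' components in a URL path (hypothetical helper)."""
    segments = path.split('/')
    output = []
    for seg in segments:
        if seg == '.':
            # current directory: nothing to do
            continue
        if seg == '..':
            # go up one level, but never above the root of an absolute path
            if len(output) > 1 or (output and output[0] != ''):
                output.pop()
            continue
        output.append(seg)
    # a trailing '.' or '..' leaves a directory, so keep the final slash
    if segments[-1] in ('.', '..'):
        output.append('')
    return '/'.join(output)


print(remove_dot_segments('/a/b/c/./../../g'))
```

A redirect target taken from a `Location` header would be passed through the same step before the request is re-issued.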
2023-07-29 | [utils] Rework decoding of `Content-Encoding`s | dirkf
* support nested encodings
* support optional `br` encoding, if brotli package is installed
* support optional 'compress' encoding, if ncompress package is installed
* response `Content-Encoding` has only unprocessed encodings, or removed
* response `Content-Length` is decoded length (usable for filesize metadata)
* use zlib for both deflate and gzip decompression
* some elements taken from yt-dlp: thx especially coletdjnz
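As a rough illustration of the nested-encoding behaviour listed above, the sketch below undoes the encodings of a `Content-Encoding` value from right to left, using only `zlib` for both gzip and deflate. The name `decode_body` and its return shape are assumptions made for this example, not the handler's real interface; `br` and `compress` are left unprocessed here because they need optional packages.

```python
import gzip
import zlib


def decode_body(data, content_encoding):
    """Return (decoded_bytes, encodings_left_unprocessed) - hypothetical helper."""
    encodings = [e.strip().lower() for e in content_encoding.split(',') if e.strip()]
    # encodings are listed in the order applied, so undo the last one first
    while encodings:
        enc = encodings[-1]
        if enc in ('gzip', 'x-gzip'):
            # wbits of MAX_WBITS | 16 tells zlib to expect a gzip wrapper
            data = zlib.decompress(data, zlib.MAX_WBITS | 16)
        elif enc == 'deflate':
            try:
                data = zlib.decompress(data)  # zlib-wrapped stream
            except zlib.error:
                # some servers send a raw deflate stream instead
                data = zlib.decompress(data, -zlib.MAX_WBITS)
        elif enc != 'identity':
            break  # unsupported here: stop and report what is left
        encodings.pop()
    return data, encodings


body = gzip.compress(zlib.compress(b'hello world'))
decoded, remaining = decode_body(body, 'deflate, gzip')
print(decoded, remaining, len(decoded))
```

The length of `decoded` is what a rewritten `Content-Length` would carry, and `remaining` is what would stay in `Content-Encoding`.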
2023-07-25 | [utils] Fix update_Request() with empty data (not None) | dirkf
2023-07-20 | [utils] Remove stray undocumented Host header in redirect (fix 46fde7c) | dirkf
2023-07-19 | [utils] Fix broken Py 3.11+ compat in `traverse_obj()` | dirkf
* inspect.getargspec is missing despite doc claiming backward compat
* replace with emulation of `Signature.bind()`
2023-07-19 | [utils] Minor updates (merge_dicts, T) | dirkf
A couple of mods to ease yt-dlp back-ports:
* add kwargs to merge_dicts: `unblank=True` (disallow empty string), `rev=False` (reverse the merge list)
* add `T(x)` shortcut for `{x}`, unsupported in Py2.6
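One possible reading of the two additions in this commit is sketched below. `merge_dicts_sketch` is a hypothetical stand-in written for this example (Python 3 only, first non-`None` value wins); the real `merge_dicts` may treat empty strings differently. `T` is shown as a plain wrapper that builds a set without set-literal syntax.

```python
def merge_dicts_sketch(*dicts, unblank=True, rev=False):
    """Merge dicts, keeping the first acceptable value per key (assumed semantics)."""
    merged = {}
    for d in (reversed(dicts) if rev else dicts):
        for key, value in d.items():
            if value is None:
                continue
            if unblank and value == '':
                # assumed meaning of unblank=True: an empty string is not a valid value
                continue
            merged.setdefault(key, value)
    return merged


def T(*x):
    """Shortcut for a set literal, for interpreters without the {x} syntax."""
    return set(x)


first = {'title': '', 'id': 1}
second = {'title': 'Clip', 'id': 2}
print(merge_dicts_sketch(first, second))
print(merge_dicts_sketch(first, second, rev=True))
```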
2023-07-19 | [utils] Improve js_to_json, align with yt-dlp | dirkf
* support variable substitution, from https://github.com/yt-dlp/yt-dlp/pull/#521 etc, thanks ChillingPepper, Grub4k, pukkandan
* improve escape handling, from https://github.com/yt-dlp/yt-dlp/pull/#521 thanks Grub4k
* support template strings from https://github.com/yt-dlp/yt-dlp/pull/6623 thanks Grub4k
* add limited `!` evaluation (eg, !!0 -> false, see tests)
2023-07-19 | [utils] Align traverse_obj() with yt-dlp | dirkf
Thanks Grub4k for these:
* traverse `Iterable`s, from https://github.com/yt-dlp/yt-dlp/pull/6902, etc
* traverse `set` key for transformations/filters, `re.Match` group names, from https://github.com/yt-dlp/yt-dlp/commit/776995bc109c5cd1aa56b684fada2ce718a386ec, etc
* traverse `re.Match`es, from https://github.com/yt-dlp/yt-dlp/pull/5174
* always return list when branching, from https://github.com/yt-dlp/yt-dlp/pull/5170
2023-07-18 | [test] Fixes for old Pythons | dirkf
2023-07-18 | [utils] `YoutubeDLCookieJar`: Add `get_cookie_header` and `get_cookies_for_url` methods | bashonly
2023-07-18 | [core] Remove `Cookie` header on redirect to prevent leaks | dirkf
Adapted from yt-dlp/yt-dlp-ghsa-v8mc-9377-rwjj/pull/1/commits/101caac
Thx coletdjnz
2023-07-18 | [core] Update redirect handling from yt-dlp | dirkf
* Thx coletdjnz: https://github.com/yt-dlp/yt-dlp/pull/7094
* add test that redirected `POST` loses its `Content-Type`
2023-07-18 | [utils] Add {expected_type} and Iterable support to traverse_obj() | dirkf
2023-07-05 | [Misc] Fixes for 2.6 compatibility | dirkf
2023-05-11 | [utils] Fix `compiled_regex_type` in 249f2b6 | dirkf
2023-04-23 | [YouTube] Support Releases tab | dirkf
2023-04-05 | [devscripts] Improve hack to convert command-line options to API options | dirkf
* define equality for DateRange
* don't show default DateRange
2023-03-19 | [utils] Ensure `allow_types` for `variadic()` is a tuple | dirkf
2023-02-20 | Escape URLs in `sanitized_Request`, not `sanitize_url` | pukkandan
d2558234cf5dd12d6896eed5427b7dcdb3ab7b5a added escaping of URLs while sanitizing. However, `sanitize_url` may not always receive an actual URL. Eg: When using `youtube-dl "search query" --default-search ytsearch`, `search query` gets escaped to `search%20query` before being prefixed with `ytsearch:`, which is not the intended behavior. So the escaping is moved to `sanitized_Request` instead.
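The ordering problem described in this message can be reproduced with a small sketch. `escape_url_sketch` is a hypothetical stand-in for the project's escaping step, built on the standard library's `urllib.parse.quote`; the set of characters kept unescaped is an assumption chosen for the example.

```python
from urllib.parse import quote


def escape_url_sketch(url):
    """Percent-encode characters that may not appear raw in a URL (illustrative only)."""
    # keep reserved characters and existing %-escapes untouched
    return quote(url, safe="%/:=&?~#+!$,;'@()*[]")


query = 'search query'

# Escaping too early: the search term is mangled before the prefix is added.
too_early = 'ytsearch:' + escape_url_sketch(query)

# Escaping only when a request is built: the search term stays intact,
# and a real URL is still escaped at the point it is fetched.
prefixed = 'ytsearch:' + query
request_url = escape_url_sketch('https://example.com/watch?q=a b')

print(too_early)
print(prefixed)
print(request_url)
```

The first value shows the unintended `search%20query`; the last shows that escaping at request time still produces a usable URL.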
2023-02-13 | [utils] Add parse_qs, update_url | dirkf
[skip ci]
2023-02-13 | [YouTube] Bypass age-gating for certain restricted videos | dirkf
* Use TVHTML5_SIMPLY_EMBEDDED_PLAYER client
* Also add and fix tests
* Introduce and use new utility function `update_url()`
2022-11-03 | [utils] Backport traverse_obj (etc) from yt-dlp (#31156) | Andrei Lebedev
* Backport traverse_obj and closely related function from yt-dlp (code by pukkandan)
* Backport LazyList, variadic(), try_call (code by pukkandan)
* Recast using yt-dlp's newer traverse_obj() implementation and tests (code by grub4k)
* Add tests for Unicode case folding support matching Py3.5+ (requires f102e3d)
* Improve/add tests for variadic, try_call, join_nonempty
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-11 | [utils] Sanitize look-alike Unicode glyphs in non-ID filename fields when --restrict-filenames | dirkf
Implements https://github.com/ytdl-org/youtube-dl/issues/31216#issuecomment-1236102822, which has a test.
2022-08-21 | [utils] Ensure RFC3986 encoding result is unicode | dirkf
2022-08-14 | [jsinterp] Overhaul JSInterp to handle new YT players 4c3f79c5, 324f67b9 (#31170) | dirkf
* back-port from yt-dlp 8f53dc44a0cc1c2d98c35740b9293462c080f5d0, thanks pukkandan
* also support void, improve <</>> precedence, improve expressions in comma-list
* add more tests
2022-06-10 | [utils, etc] Kill child processes when yt-dl is killed | pukkandan
* derived from PR #26592, closes #26592
Authored by: Unrud
2022-06-06 | [utils] Escape URL while sanitizing | pukkandan
Closes #31008, #yt-dlp/263
While this fixes the issue in question, it does not try to address the root cause of the problem.
Refer: 915f911e365736227e134ad654601443dbfd7ccb, f5fa042c82300218a2d07b95dd6b9c0756745db3
2022-05-28 | [utils] Enable ALPN in HTTPS to satisfy broken servers | dirkf
See https://github.com/yt-dlp/yt-dlp/issues/3878
2021-04-17 | [utils] PEP 8 | Sergey M․
2021-04-17 | [utils] Add support for experimental HTTP response status code 308 Permanent Redirect (refs #27877, refs #28768) | Sergey M․
2021-01-04 | [utils] add a function to clean podcast URLs | Remita Amine
2020-12-30 | [utils] accept only supported protocols in url_or_none | Remita Amine
2020-11-21 | Fix typos (#27084) | Josh Soref
* spelling: authorization
* spelling: brightcove
* spelling: creation
* spelling: exceeded
* spelling: exception
* spelling: extension
* spelling: extracting
* spelling: extraction
* spelling: frontline
* spelling: improve
* spelling: length
* spelling: listsubtitles
* spelling: multimedia
* spelling: obfuscated
* spelling: partitioning
* spelling: playlist
* spelling: playlists
* spelling: restriction
* spelling: services
* spelling: split
* spelling: srmediathek
* spelling: support
* spelling: thumbnail
* spelling: verification
* spelling: whitespaces
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-11-17 | [utils] Skip ! prefixed code in js_to_json | Sergey M․
2020-10-18 | [utils] Don't attempt to coerce JS strings to numbers in js_to_json (#26851) | Kevin O'Connor
The current logic in `js_to_json` tries to rewrite octal/hex numbers to decimal. However, by the time that logic runs, the `"` or `'` have already been trimmed off. This causes what were originally strings that happen to look like octal/hex numbers to be rewritten to decimal and returned as a number rather than a string. In practice something like:
```js
{
    "0x40": "foo",
    "040": "bar",
}
```
would get rewritten as:
```json
{
    64: "foo",
    32: "bar"
}
```
This is problematic since this isn't valid JSON, as you cannot have non-string keys.
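The failure mode described in this message can be checked with the standard `json` module alone. The helper `is_valid_json` is introduced only for this illustration, and the two strings mirror the example from the commit message (without the trailing comma).

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON (helper for this illustration only)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Keys stay quoted: valid JSON, and the keys remain strings.
kept_as_strings = '{"0x40": "foo", "040": "bar"}'

# Keys coerced to numbers: a JSON parser rejects bare numeric keys.
coerced_to_numbers = '{64: "foo", 32: "bar"}'

print(is_valid_json(kept_as_strings))
print(is_valid_json(coerced_to_numbers))

# The decimal values come from reading the original keys as hex and octal.
print(int('0x40', 16), int('040', 8))
```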
2020-09-06 | [utils] Recognize wav mimetype (closes #26463) | Sergey M․
2020-05-20 | [utils] Fix file permissions in write_json_file (closes #12471) (#25122) | Rob
2020-05-05 | [utils] Improve cookie files support | Sergey M․
+ Add support for UTF-8 in cookie files
* Skip malformed cookie file entries instead of crashing (invalid entry len, invalid expires at)
2020-03-10 | [utils] Add reference to cookie file format | Sergey M․
2020-03-10 | Revert "[utils] Add support for cookies with spaces used instead of tabs" | Sergey M․
According to [1] TABs must be used as separators between fields. Files produced by some tools with spaces as separators are considered malformed.
1. https://curl.haxx.se/docs/http-cookies.html
This reverts commit cff99c91d150df2a4e21962a3ca8d4ae94533b8c.
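To make the TAB requirement concrete, here is a minimal sketch of splitting one line of a Netscape-format cookie file into its seven fields, as laid out in the curl document cited above. `parse_cookie_line` is a hypothetical helper, not the project's cookie jar code; it returns `None` for malformed entries instead of raising.

```python
def parse_cookie_line(line):
    """Parse one cookie-file line into a dict, or return None if it is not an entry."""
    line = line.rstrip('\r\n')
    # curl marks HttpOnly cookies with this prefix; it is not a comment
    if line.startswith('#HttpOnly_'):
        line = line[len('#HttpOnly_'):]
    if not line.strip() or line.startswith('#'):
        return None
    fields = line.split('\t')  # TAB is the only accepted separator
    if len(fields) != 7:
        return None  # e.g. a file written with spaces instead of TABs
    domain, subdomains, path, secure, expires, name, value = fields
    if expires and not expires.isdigit():
        return None  # invalid expiry timestamp
    return {
        'domain': domain,
        'include_subdomains': subdomains == 'TRUE',
        'path': path,
        'secure': secure == 'TRUE',
        'expires': int(expires) if expires else None,
        'name': name,
        'value': value,
    }


good = '.example.com\tTRUE\t/\tFALSE\t1893456000\tsid\tabc123'
bad = ' '.join(good.split('\t'))  # same fields, separated by spaces
print(parse_cookie_line(good))
print(parse_cookie_line(bad))
```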
2020-03-08 | [utils] Add support for cookies with spaces used instead of tabs | Sergey M․
2020-02-29 | [YoutubeDL] Force redirect URL to unicode on python 2 | Sergey M․
2019-12-15 | [utils] Improve str_to_int | Sergey M․
2019-11-29 | [utils] handle int values passed to str_to_int | Remita Amine
2019-11-27 | [utils] Add generic caesar cipher and rot47 | Sergey M․
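For readers unfamiliar with the two ciphers named in this commit, the following is a small sketch of a generic Caesar shift over an arbitrary alphabet, with ROT47 expressed as a shift of 47 over the 94 printable ASCII characters from `!` to `~`. The signatures are assumptions for this example and need not match those in `utils.py`.

```python
import string


def caesar(s, alphabet, shift):
    """Shift every character of s that occurs in alphabet; leave the rest as-is."""
    n = len(alphabet)
    return ''.join(
        alphabet[(alphabet.index(c) + shift) % n] if c in alphabet else c
        for c in s)


# printable ASCII without the space character: 94 symbols
ROT47_ALPHABET = ''.join(chr(i) for i in range(33, 127))


def rot47(s):
    """ROT47 is its own inverse because 47 is half of 94."""
    return caesar(s, ROT47_ALPHABET, 47)


print(caesar('xyz', string.ascii_lowercase, 3))
print(rot47('Hello'))
print(rot47(rot47('Hello')))
```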
2019-11-27 | [utils] Handle rd-suffixed day parts in unified_strdate (#23199) | InfernalUnderling
2019-10-29 | [utils] Actualize major IPv4 address blocks per country | Sergey M․
2019-10-18 | [utils] Improve subtitles_filename (closes #22753) | Sergey M․
2019-06-29 | [utils] Introduce random_user_agent and use as default User-Agent (closes #21546) | Sergey M․
2019-06-14 | [utils] Restrict parse_codecs and add theora as known vcodec (#21381) | Sergey M․