path: root/youtube_dl/extractor
Age | Commit message | Author
2023-02-20 | Escape URLs in `sanitized_Request`, not `sanitize_url` | pukkandan
d2558234cf5dd12d6896eed5427b7dcdb3ab7b5a added escaping of URLs while sanitizing. However, `sanitize_url` may not always receive an actual URL. E.g., when using `youtube-dl "search query" --default-search ytsearch`, `search query` gets escaped to `search%20query` before being prefixed with `ytsearch:`, which is not the intended behavior. So the escaping is moved to `sanitized_Request` instead.
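The ordering problem described above can be illustrated with a minimal sketch. It is not the project's code: `escape_url`, `sanitize_url_sketch`, `build_request_url` and the `SAFE_CHARS` set are hypothetical stand-ins written against Python 3's `urllib.parse`, and the sanitizing step is reduced to a single scheme fix-up.

```python
from urllib.parse import quote

# Characters left untouched when escaping: URL delimiters plus '%', so that
# existing percent-escapes are not encoded twice (an illustrative choice).
SAFE_CHARS = "%/:?#[]@!$&'()*+,;=~"


def escape_url(url):
    """Percent-encode characters that may not appear literally in a URL."""
    return quote(url, safe=SAFE_CHARS)


def sanitize_url_sketch(url):
    """Hypothetical, simplified sanitizer: tidy the string, but do NOT escape."""
    if url.startswith('//'):
        return 'http:' + url
    return url


def build_request_url(url):
    """Escape only at the point where a real request is about to be made."""
    return escape_url(sanitize_url_sketch(url))


# A search term passes through the sanitizer unchanged ...
search_term = 'ytsearch:' + sanitize_url_sketch('search query')
# ... while a genuine URL is escaped when the request is built.
request_url = build_request_url('//example.com/some path')
```

Keeping the escaping out of the sanitizer means that strings which are not URLs, such as a search term awaiting its `ytsearch:` prefix, are never percent-encoded by accident.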
2023-02-20 | [Vimeo] Fix e19ec52 for tween-age Pythons | df
* a check in older Pythons in the 2.7 and earlier, 3.3, 3.4 series caused "sre_constants.error: nothing to repeat"
* satisfy the check by avoiding nested qualifiers that can match empty string
Resolves #31597
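As a rough illustration of the rewrite described above, the sketch below pairs a pattern whose repeated group can match the empty string with an alternative that accepts the same strings without such nesting. Both patterns are hypothetical examples, not the Vimeo extractor's actual expressions; the claim that older interpreters reject the nested shape comes from the commit message, and the snippet itself assumes Python 3.4 or later for `re.fullmatch`.

```python
import re

# Hypothetical pattern of the problematic shape: the group repeated by the
# outer '*' can itself match the empty string.
NESTED = r'(?:\s*\w*)*'

# Equivalent rewrite: each alternative must consume at least one character,
# so the repeated group can never match empty.
FLAT = r'(?:\s+|\w+)*'


def same_verdict(text):
    """Check that both patterns accept or reject `text` identically."""
    return bool(re.fullmatch(NESTED, text)) == bool(re.fullmatch(FLAT, text))


samples = ['', 'abc', 'a b  c', 'a-b', ' !']
results = [same_verdict(s) for s in samples]
```

On a current Python both patterns compile, so the comparison only shows that the flattened form does not change what is matched.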
2023-02-17 | [YouTube] Avoid crash if uploader_id extraction fails | dirkf
See #31530.
2023-02-14 | [InfoExtractor] Handle unquoted values in OpenGraph searches | dirkf
2023-02-13 | [StreamsbIE] Add extractor for streamsb.com (viewsb.com) (#31517) | fonkap
* Add extractor for streamsb.com (viewsb.com)
* make data url using app.js version
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 | [KommunetvIE] Add extractor for kommunetv.no (#31516) | fonkap
* Add extractor for kommunetv.no
* Using utils.update_url instead of regex
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 | [FileMoonIE] Add extractor for filemoon.sx (#31515) | fonkap
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 | [rbgtum] Add new extractor (#31305) | Valentin Metz
* [rbgtum] Add new extractor
* Small update, force CI
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 | [YouTube] Fix tests | dirkf
2023-02-13 | [YouTube] Refresh compat/utils usage | dirkf
* import parse_qs()
* import parse_qs in lazy_extractors (clears old TODO)
* clean up old compiled lazy_extractors for Py2
* use update_url()
2023-02-13 | [YouTube] Add `signatureTimestamp` for age-gate bypass | dirkf
2023-02-13 | [YouTube] Bypass age-gating for certain restricted videos | dirkf
* Use TVHTML5_SIMPLY_EMBEDDED_PLAYER client
* Also add and fix tests
* Introduce and use new utility function `update_url()`
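The log does not show how `update_url()` is implemented. As a rough idea of what a URL-component replacement helper can look like, here is a sketch using only Python 3's standard library; `replace_url_parts` is a hypothetical name, and its signature is an assumption rather than the project's API.

```python
from urllib.parse import urlparse, urlunparse


def replace_url_parts(url, **parts):
    """Return `url` with the named components replaced.

    Accepted keywords are the field names of urllib's ParseResult:
    scheme, netloc, path, params, query, fragment.
    """
    return urlunparse(urlparse(url)._replace(**parts))


# Rewrite a watch-style URL into an embed-style URL without any regex.
embed_url = replace_url_parts(
    'https://www.example.com/watch?v=abc123#t=10',
    path='/embed/abc123', query='', fragment='')
```

Working on parsed components instead of applying regular expressions to the raw string avoids accidental matches in other parts of the URL.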
2023-02-12 | [Vimeo] Support /user{video_id}/{slug} URL format | dirkf
2023-02-12 | [Vimeo] Fix `Unable to extract info section` redux | dirkf
* as reported in yt-dlp/yt-dlp#6149
* also allow newline in target JSON object
2023-02-12 | [IGN] Overhaul extractor to avoid URL redirection loop | dirkf
Consequently/also:
* centralise video data extraction
* detect 404 and 503 expected errors
* handle the test video in IGNVideo
* handle two additional page formats for the tests in IGNArticle
2023-02-03 | [ITV] Overhaul ITV extractor (#30266) | dirkf
* support ITVX URLs (thanks Vangelis66)
* support legacy ITV Hub URLs
* include extraction fix 4c57dd2 from sleaux-meaux 3 May 2021
* include extraction fix 6fbcc16, fix by staubichsauger & pukkandan
* work-around duration parsing pending fix to utils.parse_duration
* apply default vanilla UA for pages and media to avoid site blocking
* also detect and report `Episode not found` instead of generic 404
* rework ITVBTCCIE with geo-block detection, best effort geo-restriction handling, news article support
* fix tests
2023-02-02 | [myvideoge] Add new extractor (#31360) | dirkf
NB download tests on CI servers blocked
Co-authored-by: Alfonso Solbes <fonk666@gmail.com>
2023-02-02 | [xhamster] add support for new domain xhvid.com (#31370) | afterdelight
2023-02-02 | [FIFA] Back-port extractor from yt-dlp (#31385) | dirkf
2023-02-02 | [Blerp] Add new extractor (#31398) | Epsilonator
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 | [YouTube] Fix not finding videos listed under a channel's "shorts" subpage. (#31409) | zhangeric-15
Resolves #31336
Co-authored-by: Jouni Järvinen <rautamiekka@users.noreply.github.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 | [Callin] Add new extractor (#31414) | Ruowang Sun
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 | [pr0gramm] implement InfoExtractor, Resolves #31433 (#31434) | Leon Etienne
* [pr0gramm] implement infoextractor
* [pr0gramm] remove misplaced comment, uncapture regex-group
* [pr0gramm]: specify utf-8 coding
* [pr0gramm]: add trailing comma to lists for maintainability
* [pr0gramm]: ie only sets upload_date attribute
* [pr0gramm]: add video_id to title
* [pr0gramm]: more forgiving _valid_url regex
* [pr0gramm]: add uploader to title, if set
* Discriminate URL pattern
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 | [cammodels] fix and improve extractor (#31453) | JChris246
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 | [americastestkitchen] Add support for downloading entire series (#31493) | Brian Marks
Also
* support new sites and URL patterns
* back-port from yt-dlp
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-11-13 | [generic] Improve KVS (etc) extraction | dirkf
2022-11-13 | [generic] Improve KVS (etc) extraction | dirkf
* detect kt_player('kt_player', 'https://.../kt_player.swf?v=5...
* detect age limit if 18 USC 2257 is mentioned
* test with shooshtime.com
Partially resolves #31332.
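The two detection steps listed above could be sketched as follows. This is an illustrative guess, not the GenericIE code: the regular expressions, the `probe_page` helper, the sample page, and the decision to report an age limit of 18 are all assumptions made for the example.

```python
import re

# Hypothetical pattern for a kt_player(...) call whose second argument is a
# single-quoted URL of the player's .swf file carrying a version parameter.
KT_PLAYER_RE = re.compile(
    r"kt_player\s*\(\s*'kt_player'\s*,\s*"
    r"'(?P<swf>[^']*kt_player\.swf\?v=(?P<major>\d+)[^']*)'")

# Hypothetical pattern for a mention of the 18 USC 2257 statement.
USC_2257_RE = re.compile(r'18\s+U\.?S\.?C\.?\s+2257', re.IGNORECASE)


def probe_page(html):
    """Return basic player info if the page embeds kt_player, else None."""
    m = KT_PLAYER_RE.search(html)
    if not m:
        return None
    return {
        'player_major_version': int(m.group('major')),
        # Assumed mapping: a 2257 notice is taken to imply adult-only content.
        'age_limit': 18 if USC_2257_RE.search(html) else None,
    }


# Invented sample markup, used only to exercise the sketch.
sample = (
    "<script>kt_player('kt_player', "
    "'https://cdn.example.com/player/kt_player.swf?v=5.1.1', "
    "'600', '400', {});</script>\n"
    '<a href="/2257/">18 U.S.C. 2257 Record-Keeping Requirements</a>'
)
info = probe_page(sample)
```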
2022-11-13 | Added ThisVid.com support (#29187) | FraFraFra-LongD
* add ThisVidIE, ThisVidMemberIE, ThisVidPlaylistIE
* redirect embed to main page for more metadata
* use KVS extraction newly added to GenericIE and remove duplicate tests
* also add MrDeepFake etc compat to GenericIE (closes #22390)
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-11-12 | [generic] Add KVS player extraction | dirkf
2022-11-11 | [common:jwplayer] Improve jwplayer extraction and parsing (#31000) | dirkf
* don't crash parser if jwplayer_data is invalid (empty, or no formats)
* use `label` in `sources[n]` as `format_id`
* relax `jwplayer().setup(...)` RE (also rework PR #27274 enhancement)
* detect more manifest formats in _parse_jwplayer_formats() (from PR #29596)
* improve metadata extraction (from PR #25433)
* remember URLs in a set
* use parse_resolution() in format
* extract filesize in format (from yt-dlp)
Co-authored-by: kikuyan <kikuyan@users.noreply.github.com>
Co-authored-by: martin54 <martin54@users.noreply.github.com>
2022-11-09 | [PeekVids, PlayVids] Add new extractor (#29765) | Moises Lima
* Merge back-port from yt-dlp
* Merge features from PR #29798
* Improve metadata extraction
Co-authored-by: dirkf <fieldhouse@gmx.net>
Co-authored by: AXDOOMER
2022-11-04 | [extractor/ceskatelevize] Back-port extractor from yt-dlp, etc (#30713) | dirkf
* back-port extractor, removing CeskaTelevizePoradyIE
* follow redirect URL
* support liveBroadcast and videobonusDetail in __NEXT__ data
* return single video for singleton playlist
* fix/add tests
2022-10-30 | [netease] Support urls shared from mobile app (#31304) | Xie Yanbo
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-30 | [netease] Improve error handling (#31303) | Xie Yanbo
* add warnings for users outside of China
* skip empty song urls
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-27 | [Vimeo] Update variable name in hydration JSON pattern | dirkf
Fixes #31311
2022-10-20 | [BongaCams] Support new .net domain | dirkf
Resolves #31262.
2022-10-18 | Fix ADN extractor (#31275) | ache
* Rename Anime Digital Network to Animation Digital Network, animationdigitalnetwork.fr
* Update the test to an available video
* Update the decoding key of subtitles
* Keep the support of old URLs
* Add a test to match the old URL
* Reduce redundancy of the URL name
* Fix md5 ^^"
* Fix undefined _BASE
* Process HTTP error text (eg geo-block) correctly and uniformly in Py3, Py2
* Skip test for CI since geo-blocked
Signed-off-by: ache <ache@ache.one>
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-13 | [ManyVids] Support new single-page app structure | dirkf
2022-10-13 | [ManyVids] Support new single-page app structure | dirkf
See https://github.com/yt-dlp/yt-dlp/issues/5210#issuecomment-1276919962.
2022-10-12 | [Motherless] Pull from yt-dlp, etc | dirkf
* use username field
* loosen regexes
* warn on page count 0 in group
* avoid reloading group page 1
Closes #29626
2022-10-11 | [netease] Get netease music download url through player api (#31235) | Xie Yanbo
* remove unplayable song from test
* compatible with python 2
* using standard User_Agent, fix imports
* use hash instead of long description
* fix lint
* fix hash
2022-10-11 | [Common:JWPlayer] Fix x1000 scaling error | dirkf
See https://github.com/yt-dlp/yt-dlp/issues/5106#issuecomment-1264625161
2022-10-11 | [ZDF] Overhaul ZDF extractors | dirkf
* pull some yt-dlp changes into ZDFBaseIE._extract_format()
* add test cases from yt-dlp to ZDFIE
* fix crash in ZDFIE._extract_mobile() when object had no `formitaeten`
* improve title extraction in ZDFChannelIE (remove trailing station ident)
* avoid extracting non-video playlist items (fixes #31149)
2022-10-10 | [motherless] Fixed the broken uploader_id in the extractor (#31243) | Xiyue
* Fixed the broken uploader_id in the extractor.
* Make uploader_id RE looser
* Fix uploader_id in test Motherless_3
* Fix group pagination
* # coding: utf-8
Co-authored-by: Andy Xuming <xuminic@gmail.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-10 | [manyvids] Improve extraction (#31172) | dirkf
* extract all formats from page
* extract description, uploader, views, likes
* downrate previews
* fix tests
* use txt_or_none()
2022-10-10 | [NRK] Remove explicit Accept-Encoding header that invites Brotli | dirkf
Fixes #31285
2022-10-04 | [Telegraaf] Use mobile GraphQL API endpoint | coletdjnz
Workaround for Cloudflare 403
Fixes https://github.com/yt-dlp/yt-dlp/issues/5000
Authored by: coletdjnz
2022-08-25 | [YouTube] Improve error check for n-sig processing | dirkf
2022-08-19 | [infoq] Avoid crash if the page has no `mp3Form` | gudata
* proposed fix for issue #31131, aligns with yt-dlp
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-08-19 | [uktvplay] Support domain without .uktv | dirkf