path: root/youtube_dl/extractor/common.py
Age | Commit message | Author
2014-01-22 | Merge branch 'youtube-dash-manifest' | Philipp Hagemeister
Conflicts: youtube_dl/extractor/youtube.py
2014-01-21 | [extractor/common] Clarify when and when not we generate the filename | Philipp Hagemeister
2014-01-21 | Deal with implicitly UTF-16 decoded webpages | Philipp Hagemeister
These webpages don't specify an encoding and rely on the BOM
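A BOM-based fallback of the kind this commit describes could look roughly like the sketch below. It is not taken from the commit itself: the helper name `guess_encoding_from_bom` and the UTF-8 default are assumptions made for illustration.

```python
import codecs


def guess_encoding_from_bom(content, default="utf-8"):
    """Guess the encoding of raw page bytes from a leading byte order mark.

    Hypothetical helper: the name and the default are illustrative only.
    """
    # Both UTF-16 BOMs map to the "utf-16" codec, which reads the BOM to
    # pick the byte order and strips it while decoding.
    if content.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return default


# A page served as UTF-16 with no declared encoding, only a BOM.
page = codecs.BOM_UTF16_LE + u"<html>caf\u00e9</html>".encode("utf-16-le")
text = page.decode(guess_encoding_from_bom(page))
```

Pages without a BOM fall through to the default, so the check only changes behaviour for the undeclared UTF-16 case.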
2014-01-19 | [youtube] Download DASH manifest | Philipp Hagemeister
If given, download and parse the DASH manifest file, in order to get ultra-HQ formats. Fixes #2166
2014-01-17 | [extractor/common] Limit --write-pages filename to 200 chars | Philipp Hagemeister
This avoids problems with very long URLs.
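One way to cap a dump filename at 200 characters while keeping long URLs distinguishable is to replace the cut-off tail with a hash. The sketch below is an assumption about how this could be done, not the code from the commit: `dump_filename`, the `.dump` suffix and the sanitising rule are all hypothetical.

```python
import hashlib
import re

MAX_BASENAME_LEN = 200  # the limit named in the commit message


def dump_filename(video_id, url, max_len=MAX_BASENAME_LEN):
    """Build a filesystem-friendly dump name (hypothetical helper)."""
    # Replace anything that is not clearly safe in a filename.
    base = re.sub(r"[^A-Za-z0-9._-]", "_", "%s_%s" % (video_id, url))
    if len(base) > max_len:
        # Keep names unique after truncation by appending a digest of
        # the full, untruncated name.
        suffix = "___" + hashlib.md5(base.encode("utf-8")).hexdigest()
        base = base[:max_len - len(suffix)] + suffix
    return base + ".dump"
```

Short URLs pass through unchanged apart from sanitising; only names above the limit are truncated.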
2014-01-07 | [pornhub] Use centralized sorting | Philipp Hagemeister
2014-01-07 | [khanacademy] Add support (Fixes #2066) | Philipp Hagemeister
2014-01-06 | [orf] Use new extraction method (Fixes #2057) | Philipp Hagemeister
2014-01-03 | [jpopsuki] Simplify | Philipp Hagemeister
2014-01-01 | [wistia] Prefer original video format above all others | Philipp Hagemeister
We could also set up a formula which would weigh filesize/bitrate and vcodec/acodec (say, 1GB h264 < 3 GB MPEG2 < 2 GB h264), but that would get really messy real soon.
2013-12-26 | Document that format_id field should be present | Philipp Hagemeister
2013-12-25 | [yahoo] Use centralized sorting, and add tbr field | Philipp Hagemeister
2013-12-24 | [zdf] Use centralized sorting | Philipp Hagemeister
2013-12-24 | [spiegel] Use centralized sorting | Philipp Hagemeister
2013-12-24 | Add temporary _sort_formats helper function | Philipp Hagemeister
2013-12-24 | Add a resolution field and improve general --list-formats output | Philipp Hagemeister
2013-12-23 | [myvideo] Use RTMP instead of RTMPT (Fixes #2032) | Philipp Hagemeister
2013-12-23 | [bliptv] Remove support for direct downloads | Philipp Hagemeister
This is now handled by the generic IE
2013-12-20 | [aparat] Add support (Fixes #2012) | Philipp Hagemeister
2013-12-19 | [generic] Detect ooyala videos (fixes #2013) | Jaime Marquínez Ferrándiz
2013-12-17 | [youtube] Do not warn for videos with allow_rating=0 | Philipp Hagemeister
This fixes #1982
Test video: http://www.youtube.com/watch?v=gi2uH3YxohU
2013-12-16 | _search_regex's "isatty" call fails with Py2exe's | Itay Brandes
_search_regex calls the sys.stderr.isatty() function on Unix systems. Py2exe uses a custom Stderr() stream that has no `isatty()` method, which leads to a crash. This is easily fixed by checking that it's a Unix system first.
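The fix described here relies on checking the platform before touching the stream. A minimal sketch of that ordering follows; the function name `stream_is_unix_tty` is hypothetical, and the extra `getattr` guard goes beyond what the commit message describes.

```python
import os
import sys


def stream_is_unix_tty(stream=None):
    """Return True only for a terminal stream on a non-Windows system."""
    stream = sys.stderr if stream is None else stream
    # Platform check first: on Windows the function returns before
    # isatty() is ever looked up on a replacement stream.
    if os.name == "nt":
        return False
    # Assumption of this sketch: also tolerate streams without isatty().
    isatty = getattr(stream, "isatty", None)
    return bool(isatty is not None and isatty())
```

Because the platform test short-circuits, a stream object that lacks `isatty()` is never asked for it on Windows.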
2013-12-16 | Reorder info_dict documentation | Philipp Hagemeister
2013-12-16 | Document duration field | Philipp Hagemeister
2013-12-10 | [mtv] Fixup incorrectly encoded XML documents | Philipp Hagemeister
2013-12-09 | Add fatal=False parameter to _download_* functions. | Philipp Hagemeister
This allows us to simplify the calls in the youtube extractor even further.
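The general shape of a `fatal` switch on a download helper can be sketched as follows. This is an illustration, not the `_download_*` code: `download_page`, the injected `fetch` callable, the `DownloadError` class and the `False` return value for non-fatal failures are all assumptions of the sketch.

```python
import sys


class DownloadError(Exception):
    """Stand-in error type for this sketch."""


def download_page(fetch, url, fatal=True):
    """Call fetch(url); on failure either raise or warn and return False."""
    try:
        return fetch(url)
    except IOError as err:
        message = "unable to download %s: %s" % (url, err)
        if fatal:
            raise DownloadError(message)
        # Non-fatal: report the problem and let the caller carry on.
        sys.stderr.write("WARNING: %s\n" % message)
        return False
```

A caller that can live without an optional page then writes a single call with `fatal=False` and tests the result, instead of wrapping every call in its own try/except.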
2013-12-05 | [9gag] Like/dislike count (#1895) | Philipp Hagemeister
2013-12-02 | [smotri] Simplify | Philipp Hagemeister
2013-11-28 | [zdf] Use _download_xml | Philipp Hagemeister
2013-11-25 | Merge branch 'opener-to-ydl' | Philipp Hagemeister
2013-11-25 | Remove quality_name field and improve zdf extractor | Philipp Hagemeister
2013-11-25 | [zdf/common] Use API in ZDF extractor. | Philipp Hagemeister
This also comes with a lot of extra format fields
Fixes #1518
2013-11-24 | Merge branch 'master' into opener-to-ydl | Philipp Hagemeister
2013-11-24 | [collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring (fixes #1822) | Jaime Marquínez Ferrándiz
Uses a new helper method in InfoExtractor: _download_xml
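The idea of encoding the text before handing it to `xml.etree.ElementTree.fromstring` can be sketched as below. The function name `parse_xml_text` is hypothetical and this is not the body of `_download_xml`; the sketch also assumes that any encoding declaration in the document names UTF-8.

```python
import xml.etree.ElementTree as ET


def parse_xml_text(xml_text):
    """Parse already-decoded XML text by passing UTF-8 bytes to the parser."""
    # Handing the parser bytes avoids implicit text-to-bytes conversion
    # problems with non-ASCII characters.
    return ET.fromstring(xml_text.encode("utf-8"))


doc = parse_xml_text(u"<video><title>Caf\u00e9</title></video>")
title = doc.find("title").text
```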
2013-11-22 | Match --download-archive during playlist processing (Fixes #1745) | Philipp Hagemeister
2013-11-22 | Move the opener to the YoutubeDL object. | Philipp Hagemeister
This is the first step towards being able to just import youtube_dl and start using it. Apart from removing global state, this would fix problems like #1805.
2013-11-20 | Add support for tou.tv (Fixes #1792) | Philipp Hagemeister
2013-11-16 | Add automatic generation of format note based on bitrate and codecs | Philipp Hagemeister
2013-11-15 | Don't accept '>' inside the content attribute in OpenGraph regexes | Jaime Marquínez Ferrándiz
2013-11-15 | Improve the OpenGraph regex | Jaime Marquínez Ferrándiz
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
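The two-regex approach from these commits can be illustrated with the sketch below: one pattern for `property` before `content`, one for the reverse order, and no `>` allowed between or inside the attributes. The patterns and the names `og_regexes` and `og_search` are written for this illustration and may differ from the ones in common.py.

```python
import re


def og_regexes(prop):
    """Return patterns for both attribute orders of an og: meta tag."""
    # The content value may not contain '>' or its own quote character.
    content_re = r'content=(?:"([^">]+)"|\'([^\'>]+)\')'
    property_re = r'property=[\'"]og:%s[\'"]' % re.escape(prop)
    # [^>]+? keeps the whole match inside a single tag.
    template = r'<meta[^>]+?%s[^>]+?%s'
    return [
        template % (property_re, content_re),  # property first
        template % (content_re, property_re),  # content first
    ]


def og_search(prop, html):
    """Return the og:<prop> content, or None if the page has none."""
    for pattern in og_regexes(prop):
        match = re.search(pattern, html)
        if match:
            # Exactly one of the two quote-style groups is set.
            return next(g for g in match.groups() if g is not None)
    return None
```

Returning `None` for a missing property is a choice of this sketch; it lets a caller treat OpenGraph metadata as optional.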
2013-11-12 | [common] Simplify og_search_property | Philipp Hagemeister
2013-11-05 | Fix AssertionError when og property not found | Marcin Cieślak
On tvp.pl some webpages contain OpenGraph metadata and some don't. If og property is not found, _og_search_description fails with

WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
Traceback (most recent call last):
  File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
    youtube_dl.main()
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
    _real_main(argv)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
    videos = self.extract_info(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
    ie_result = ie.extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
    return self._real_extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
    info['description'] = self._og_search_description(webpage)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
    return self._og_search_property('description', html, fatal=False, **kargs)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
    return unescapeHTML(escaped)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
    assert type(s) == type(u'')
AssertionError

The patch allows me to use:

    try:
        info['description'] = self._og_search_description(webpage)
        info['thumbnail'] = self._og_search_thumbnail(webpage)
    except RegexNotFoundError:
        pass
2013-11-03 | Add the 'webpage_url' field to info_dict | Jaime Marquínez Ferrándiz
The URL of the video page; it must allow reproducing the result. YoutubeDL sets it automatically if it is missing.
2013-10-30 | Remove superfluous space | Philipp Hagemeister
2013-10-28 | Merge remote-tracking branch 'origin/master' | Philipp Hagemeister
2013-10-28 | New debug option --write-pages | Philipp Hagemeister
2013-10-28 | [Instagram] get the non-https link, as they are serving Akamai cert from a instagram.com domain | Filippo Valsorda
2013-10-23 | [vimeo] Fix pro videos and player.vimeo.com urls | Jaime Marquínez Ferrándiz
The old process can still be used for those videos. Added RegexNotFoundError, which is raised by _search_regex if it can't extract the info.
2013-10-21 | The 'format' field now defaults to '{format_id} - {width}x{height}{format_note}' | Jaime Marquínez Ferrándiz
Following the YoutubeIE format. The 'format_note' gives additional info about the format, for example '3D' or 'DASH video'.
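As an illustration of the default described in this commit, the sketch below fills the template from an info dictionary. The function name `default_format` is hypothetical, and rendering the note as ` (note)` in parentheses is an assumption; the commit message only gives the template.

```python
def default_format(info):
    """Return info['format'] if set, else build it from the template."""
    if info.get("format") is not None:
        return info["format"]
    note = info.get("format_note")
    return "{format_id} - {width}x{height}{format_note}".format(
        format_id=info["format_id"],
        width=info["width"],
        height=info["height"],
        # Assumption: a note, when present, is appended in parentheses.
        format_note=" ({0})".format(note) if note else "",
    )


label = default_format(
    {"format_id": "137", "width": 1920, "height": 1080, "format_note": "DASH video"}
)
```

An extractor that sets `format` explicitly keeps its own value; the template applies only when the field is missing.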
2013-10-18 | fix typos | Philipp Hagemeister