path: root/youtube_dl/extractor/common.py
Age  Commit message  Author
2013-11-28  [zdf] Use _download_xml  Philipp Hagemeister
2013-11-25  Merge branch 'opener-to-ydl'  Philipp Hagemeister
2013-11-25  Remove quality_name field and improve zdf extractor  Philipp Hagemeister
2013-11-25  [zdf/common] Use API in ZDF extractor.  Philipp Hagemeister
This also comes with a lot of extra format fields. Fixes #1518
2013-11-24  Merge branch 'master' into opener-to-ydl  Philipp Hagemeister
2013-11-24  [collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring (fixes #1822)  Jaime Marquínez Ferrándiz
Uses a new helper method in InfoExtractor: _download_xml
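The idea of encoding XML text before parsing can be sketched as follows. This is a minimal illustration, not the project's `_download_xml` helper: the function name `parse_xml_text` and the sample document are assumptions, and the download step is left out entirely.

```python
import xml.etree.ElementTree as ET


def parse_xml_text(xml_text):
    """Parse XML held in a text string (hypothetical helper).

    The text is encoded to UTF-8 bytes first, so the parser receives
    bytes rather than a unicode string.
    """
    return ET.fromstring(xml_text.encode('utf-8'))


# Sample document with non-ASCII characters in an attribute value.
doc = parse_xml_text('<videos><video title="Ünïcode"/></videos>')
title = doc.find('video').get('title')
```

The sample has no XML declaration; a document that declares a different encoding would need that declaration to agree with the bytes handed to the parser.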
2013-11-22  Match --download-archive during playlist processing (Fixes #1745)  Philipp Hagemeister
2013-11-22  Move the opener to the YoutubeDL object.  Philipp Hagemeister
This is the first step towards being able to just import youtube_dl and start using it. Apart from removing global state, this would fix problems like #1805.
2013-11-20  Add support for tou.tv (Fixes #1792)  Philipp Hagemeister
2013-11-16  Add automatic generation of format note based on bitrate and codecs  Philipp Hagemeister
2013-11-15  Don't accept '>' inside the content attribute in OpenGraph regexes  Jaime Marquínez Ferrándiz
2013-11-15  Improve the OpenGraph regex  Jaime Marquínez Ferrándiz
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
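The two-regex approach described in this commit can be sketched roughly as below. The patterns and the function name `og_search_property` here are illustrative assumptions and are not copied from common.py; the real expressions may differ.

```python
import re


def og_search_property(prop, html):
    """Return the content of an OpenGraph meta tag, or None (sketch only)."""
    # content="..." or content='...'; '>' is rejected inside the value.
    content_re = r'content=(?:"([^>"]+)"|\'([^>\']+)\')'
    property_re = r'property=[\'"]og:%s[\'"]' % re.escape(prop)
    patterns = [
        # property attribute first, content attribute second
        r'<meta\s[^>]*?' + property_re + r'[^>]*?\s' + content_re,
        # content attribute first, property attribute second
        r'<meta\s[^>]*?' + content_re + r'[^>]*?\s' + property_re,
    ]
    for pattern in patterns:
        match = re.search(pattern, html)
        if match:
            # Exactly one of the two quote-style groups is set.
            return next(g for g in match.groups() if g is not None)
    return None


title = og_search_property(
    'title', '<meta property="og:title" content="A clip">')
description = og_search_property(
    'description', "<meta content='Some text' property='og:description'/>")
```

Because `[^>]` is used between the attributes, neither pattern can run past the end of the `<meta>` tag it started in.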
2013-11-12  [common] Simplify og_search_property  Philipp Hagemeister
2013-11-05  Fix AssertionError when og property not found  Marcin Cieślak
On tvp.pl some webpages contain OpenGraph metadata and some don't. If og property is not found, _og_search_description fails with

    WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
    Traceback (most recent call last):
      File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
        youtube_dl.main()
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
        _real_main(argv)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
        retcode = ydl.download(all_urls)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
        videos = self.extract_info(url)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
        ie_result = ie.extract(url)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
        return self._real_extract(url)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
        info['description'] = self._og_search_description(webpage)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
        return self._og_search_property('description', html, fatal=False, **kargs)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
        return unescapeHTML(escaped)
      File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
        assert type(s) == type(u'')
    AssertionError

The patch allows me to use:

    try:
        info['description'] = self._og_search_description(webpage)
        info['thumbnail'] = self._og_search_thumbnail(webpage)
    except RegexNotFoundError:
        pass
2013-11-03  Add the 'webpage_url' field to info_dict  Jaime Marquínez Ferrándiz
The URL of the video page; it must allow reproducing the result. It is set automatically by YoutubeDL if it is missing.
2013-10-30  Remove superfluous space  Philipp Hagemeister
2013-10-28  Merge remote-tracking branch 'origin/master'  Philipp Hagemeister
2013-10-28  New debug option --write-pages  Philipp Hagemeister
2013-10-28  [Instagram] get the non-https link, as they are serving Akamai cert from a instagram.com domain  Filippo Valsorda
2013-10-23  [vimeo] Fix pro videos and player.vimeo.com urls  Jaime Marquínez Ferrándiz
The old process can still be used for those videos. Added RegexNotFoundError, which is raised by _search_regex if it can't extract the info.
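A dedicated exception lets callers tell "pattern not found" apart from other failures. The sketch below assumes a standalone function `search_regex` with `fatal` and `default` parameters; only the name RegexNotFoundError comes from the log above, and the real `_search_regex` method may behave differently.

```python
import re


class RegexNotFoundError(Exception):
    """Raised when a required pattern does not match (sketch)."""


def search_regex(pattern, string, name, fatal=True, default=None):
    """Return the first capture group, or raise/return a default."""
    match = re.search(pattern, string)
    if match:
        return match.group(1)
    if fatal:
        raise RegexNotFoundError('Unable to extract %s' % name)
    return default


video_id = search_regex(r'id=(\d+)', 'watch?id=42', 'video id')

try:
    search_regex(r'id=(\d+)', 'watch?name=x', 'video id')
    missing_raised = False
except RegexNotFoundError:
    missing_raised = True
```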
2013-10-21  The 'format' field now defaults to '{format_id} - {width}x{height}{format_note}'  Jaime Marquínez Ferrándiz
Following the YoutubeIE format. The 'format_note' gives additional info about the format, for example '3D' or 'DASH video'.
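As an illustration of that default, the template can be filled from a format dictionary. How the note is joined to the resolution is not stated in the log, so rendering it as a parenthesised suffix is an assumption, as is the function name `default_format_string`.

```python
def default_format_string(fmt):
    """Build a 'format' value from a format dict (illustrative only)."""
    note = fmt.get('format_note')
    # Assumption: the note is appended in parentheses when present.
    note_suffix = ' ({0})'.format(note) if note else ''
    return '{format_id} - {width}x{height}{note}'.format(
        format_id=fmt['format_id'],
        width=fmt['width'],
        height=fmt['height'],
        note=note_suffix,
    )


plain = default_format_string(
    {'format_id': '22', 'width': 1280, 'height': 720})
with_note = default_format_string(
    {'format_id': '84', 'width': 1280, 'height': 720, 'format_note': '3D'})
```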
2013-10-18  fix typos  Philipp Hagemeister
2013-10-06  Allow users to specify an age limit (fixes #1545)  Philipp Hagemeister
With these changes, users can now restrict which videos are downloaded based on their intended audience, by specifying their age with --age-limit YEARS. Add rudimentary support in youtube, pornotube, and youporn.
2013-10-04  Clarify that url and ext are optional when formats is given (#980)  Philipp Hagemeister
2013-10-04  Document formats (for #980)  Philipp Hagemeister
2013-08-29  Fix detection of the webpage charset if it's declared using ' instead of "  Jaime Marquínez Ferrándiz
Like in "<meta charset='utf-8'/>"
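A charset sniffing pattern that tolerates either quote style might look like the following. The regular expression, the function name `detect_charset`, and the UTF-8 fallback are assumptions for illustration, not the code in common.py.

```python
import re

# Accept charset=utf-8, charset="utf-8" and charset='utf-8'.
CHARSET_RE = re.compile(r'<meta[^>]+charset=[\'"]?([A-Za-z0-9_-]+)')


def detect_charset(page_head, default='utf-8'):
    """Guess the declared charset from the start of an HTML page."""
    match = CHARSET_RE.search(page_head)
    return match.group(1).lower() if match else default


single_quoted = detect_charset("<meta charset='utf-8'/>")
double_quoted = detect_charset('<meta charset="ISO-8859-1">')
http_equiv = detect_charset(
    '<meta http-equiv="Content-Type" content="text/html; charset=gbk">')
```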
2013-08-28  [sohu] Handle encoding, and fix tests  Philipp Hagemeister
2013-08-28  Merge remote-tracking branch 'origin/reuse_ies'  Philipp Hagemeister
2013-08-28  [addanime] improve  Philipp Hagemeister
2013-08-23  Merge pull request #937 from jaimeMF/subtitles_rework  Jaime Marquínez Ferrándiz
Subtitles rework
2013-08-21  Cache suitable regular expressions  Philipp Hagemeister
This speeds up TestAllURLsMatching.test_no_duplicates by about 8000% at the cost of minimal memory overhead.
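One way to cache a compiled URL pattern per extractor class is sketched below. The class names, the `_VALID_URL_RE` attribute, and the example URLs are hypothetical; the actual change may store the compiled pattern differently.

```python
import re


class Extractor(object):
    # Hypothetical URL pattern for illustration.
    _VALID_URL = r'https?://example\.com/watch/(?P<id>\d+)'

    @classmethod
    def suitable(cls, url):
        """Return True if this extractor can handle the URL."""
        # Check cls.__dict__ rather than hasattr so that a subclass
        # compiles its own pattern instead of inheriting the parent's.
        if '_VALID_URL_RE' not in cls.__dict__:
            cls._VALID_URL_RE = re.compile(cls._VALID_URL)
        return cls._VALID_URL_RE.match(url) is not None


class OtherExtractor(Extractor):
    _VALID_URL = r'https?://example\.org/v/(?P<id>\w+)'


first = Extractor.suitable('http://example.com/watch/123')
cached = Extractor._VALID_URL_RE
second = Extractor.suitable('http://example.com/about')
other = OtherExtractor.suitable('https://example.org/v/abc')
```

The pattern is compiled on the first call and reused afterwards, which is the memory-for-time trade the commit message describes.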
2013-07-20  Use a dictionary for storing the subtitles  Jaime Marquínez Ferrándiz
Errors while getting the subtitles are reported as warnings. If no subtitles are found, return an empty dict.
2013-07-17  Use unescapeHTML for OpenGraph properties  Philipp Hagemeister
These are attribute values, so we don't need the more complex and whitespace-destroying cleanHTML - we just need to unescape quotes, that's it.
2013-07-13  Strip hash info from URL when making requests (Fixes #1038)  Philipp Hagemeister
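Removing the fragment before a request can be done with the standard library, as in this sketch. Using `urllib.parse.urldefrag` and the function name `strip_fragment` are choices made here for illustration; the commit itself may strip the hash another way.

```python
from urllib.parse import urldefrag


def strip_fragment(url):
    """Return the URL without its '#fragment' part."""
    return urldefrag(url).url


request_url = strip_fragment('http://example.com/video?id=1#t=30')
unchanged = strip_fragment('http://example.com/video?id=1')
```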
2013-07-13  Improve OpenGraph property matching  Philipp Hagemeister
2013-07-13  Use re.DOTALL by default when searching OpenGraph properties  Jaime Marquínez Ferrándiz
2013-07-12  InfoExtractor: add some helper methods to extract OpenGraph info  Jaime Marquínez Ferrándiz
2013-07-11  Remove video_result helper method  Philipp Hagemeister
Calling it was more complex than actually including the type in the video info
2013-07-08  YoutubeIE: reuse instances of InfoExtractors (closes #998)  Jaime Marquínez Ferrándiz
When an IE is added to the list, it's also added to a dictionary. When an IE is requested, the dictionary is checked first, and a new instance is created only if none exists. That way _real_initialize is only called once for each IE, saving time if it needs to log in, for example.
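The list-plus-dictionary scheme can be sketched as below. `ExtractorRegistry`, `ie_key`, `get`, and `LoginExtractor` are hypothetical names used only for this example; `_real_initialize` is the one name taken from the log.

```python
class InfoExtractor(object):
    def __init__(self):
        self._ready = False

    @classmethod
    def ie_key(cls):
        # Hypothetical key: the class name.
        return cls.__name__

    def initialize(self):
        # Run the possibly expensive setup only once per instance.
        if not self._ready:
            self._real_initialize()
            self._ready = True

    def _real_initialize(self):
        pass


class ExtractorRegistry(object):
    def __init__(self):
        self._ies = []           # ordered list of extractors
        self._ie_instances = {}  # key -> shared instance

    def add(self, ie):
        self._ies.append(ie)
        self._ie_instances[ie.ie_key()] = ie

    def get(self, ie_class):
        # Look in the dictionary first; create only if missing.
        ie = self._ie_instances.get(ie_class.ie_key())
        if ie is None:
            ie = ie_class()
            self.add(ie)
        return ie


class LoginExtractor(InfoExtractor):
    login_count = 0

    def _real_initialize(self):
        LoginExtractor.login_count += 1  # stands in for a login request


registry = ExtractorRegistry()
first = registry.get(LoginExtractor)
first.initialize()
second = registry.get(LoginExtractor)
second.initialize()
```

Both lookups return the same object, so the simulated login runs once.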
2013-07-08  Merge branch 'master' of github.com:rg3/youtube-dl  Philipp Hagemeister
2013-07-08  [3sat] Add support (Fixes #1001)  Philipp Hagemeister
2013-07-07  VimeoIE: authentication support (closes #885) and add a method in the base InfoExtractor to get the login info  Jaime Marquínez Ferrándiz
2013-07-01  Add --list-extractor-descriptions (human-readable list of IEs)  Philipp Hagemeister
2013-06-29  Document view_count (Closes #963)  Philipp Hagemeister
2013-06-25  improve generic and encrypted signature error messages  Filippo Valsorda
2013-06-23  Remove useless headers  Philipp Hagemeister
2013-06-23  Fix generic class move (add all files)  Philipp Hagemeister