author | Marco Bonetti <sid77@slackware.it> | 2010-05-12 23:29:46 +0200 |
---|---|---|
committer | David Somero <xgizzmo@slackbuilds.org> | 2010-05-12 23:29:46 +0200 |
commit | 0c91c7f00b8e234d7a94ca1a0427d67bdaa63fc5 (patch) | |
tree | 6f85c1a8ab9e4dbb93819843786278d5c7c126ee /libraries/BeautifulSoup/README | |
parent | f1ff3efef1fdc18608a58f3f478da49596952d03 (diff) |
libraries/BeautifulSoup: Added to 12.2 repository
Diffstat (limited to 'libraries/BeautifulSoup/README')
-rw-r--r-- | libraries/BeautifulSoup/README | 26 |
1 files changed, 26 insertions, 0 deletions
diff --git a/libraries/BeautifulSoup/README b/libraries/BeautifulSoup/README
new file mode 100644
index 000000000000..002dc2b40981
--- /dev/null
+++ b/libraries/BeautifulSoup/README
@@ -0,0 +1,26 @@
+Beautiful Soup is a Python HTML/XML parser designed for quick turnaround
+projects like screen-scraping. Three features make it powerful:
+
+ 1. Beautiful Soup won't choke if you give it bad markup. It yields a
+    parse tree that makes approximately as much sense as your original
+    document. This is usually good enough to collect the data you need
+    and run away.
+
+ 2. Beautiful Soup provides a few simple methods and Pythonic idioms for
+    navigating, searching, and modifying a parse tree: a toolkit for
+    dissecting a document and extracting what you need. You don't have to
+    create a custom parser for each application.
+
+ 3. Beautiful Soup automatically converts incoming documents to Unicode and
+    outgoing documents to UTF-8. You don't have to think about encodings,
+    unless the document doesn't specify an encoding and Beautiful Soup can't
+    autodetect one. Then you just have to specify the original encoding.
+
+Beautiful Soup parses anything you give it, and does the tree traversal
+stuff for you. You can tell it "Find all the links", or "Find all the links
+of class externalLink", or "Find all the links whose urls match "foo.com",
+or "Find the table heading that's got bold text, then give me that text."
+
+Valuable data that was once locked up in poorly-designed websites is now
+within your reach. Projects that would have taken hours take only minutes
+with Beautiful Soup.
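
The README names three link queries ("all the links", "links of class externalLink", "links whose urls match foo.com") without showing code. The sketch below makes them concrete using only Python's standard-library `html.parser`. It is a stand-in for illustration, not Beautiful Soup's API, and the `LinkCollector` class, the `SAMPLE` markup, and the variable names are all hypothetical. The sample deliberately leaves tags unclosed to mirror the "bad markup" case.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: record (href, classes) for every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)
            href = attr.get("href") or ""
            classes = (attr.get("class") or "").split()
            self.links.append((href, classes))


# Made-up sample markup; the <p> and the second <a> are never closed.
SAMPLE = """
<p><a href="http://foo.com/page" class="externalLink">Foo</a>
<a href="/local">Local
<a class="externalLink nav" href="http://example.org/">Example</a>
"""

collector = LinkCollector()
collector.feed(SAMPLE)
collector.close()

# "Find all the links"
all_links = [href for href, _ in collector.links]

# "Find all the links of class externalLink"
external_links = [href for href, classes in collector.links
                  if "externalLink" in classes]

# "Find all the links whose urls match foo.com"
foo_links = [href for href in all_links if re.search(r"foo\.com", href)]
```

Writing a collector class like this for each task is the per-application parser work that the README says Beautiful Soup spares you; the library exposes comparable queries as search methods on its parse tree.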