Update docs for dropped HTTP client code #520
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,7 +3,8 @@ Password-Protected Feeds | |
|
||
:program:`Universal Feed Parser` supports downloading and parsing | ||
password-protected feeds that are protected by :abbr:`HTTP (Hypertext Transfer Protocol)` | ||
authentication. Both basic and digest authentication are supported. | ||
basic authentication. For any other type of authentication, you can handle the | ||
authentication yourself and then parse the retrieved feed. | ||
|
||
|
||
Downloading a feed protected by basic authentication (the easy way) | ||
|
@@ -17,89 +18,23 @@ In this example, the username is test and the password is basic. | |
.. code-block:: pycon | ||
|
||
>>> import feedparser | ||
>>> d = feedparser.parse('http://test:basic@feedparser.org/docs/examples/basic_auth.xml') | ||
>>> d = feedparser.parse('http://test:basic@$READTHEDOCS_CANONICAL_URL/examples/basic_auth.xml') | ||
Review comment: The previous URL was no longer available, so the example didn't work. I believe this should expand to https://feedparser.readthedocs.io/en/latest/examples/basic_auth.xml instead, which should work. That feed doesn't actually seem to be protected by basic auth (the same goes for https://feedparser.readthedocs.io/en/latest/examples/digest_auth.xml), but I checked that this works with a different URL that does have basic auth. |
||
>>> d.feed.title | ||
'Sample Feed' | ||
|
||
The same technique works for digest authentication. (Technically, | ||
:program:`Universal Feed Parser` will attempt basic authentication first, but | ||
if that fails and the server indicates that it requires digest authentication, | ||
:program:`Universal Feed Parser` will automatically re-request the feed with | ||
the appropriate digest authentication headers. *This means that this technique | ||
will send your password to the server in an easily decryptable form.*) | ||
|
||
Downloading a feed with other types of authentication | ||
----------------------------------------------------- | ||
|
||
.. _example.auth.inline.digest: | ||
For any other type of authentication, you should retrieve the feed yourself and | ||
handle authentication as needed (e.g. via `requests | ||
<https://requests.readthedocs.io>`_ - this is what :program:`Universal Feed Parser` | ||
uses internally), and then you can just call ``feedparser.parse`` on the | ||
retrieved feed content. | ||
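As a rough sketch of the "retrieve it yourself" approach described above, the following handles digest authentication and then passes the downloaded content to ``feedparser.parse``. It uses the standard library's ``urllib.request`` instead of ``requests`` so that the download helpers have no extra dependencies. The helper names (``build_digest_opener``, ``fetch_feed``, ``parse_protected_feed``) and the ten-second timeout are illustrative assumptions, not part of :program:`Universal Feed Parser`.

```python
import urllib.request


def build_digest_opener(url, username, password):
    """Return an opener that can answer HTTP digest challenges for `url`."""
    # A realm of None means the credentials apply to whatever realm
    # the server names in its challenge.
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, url, username, password)
    handler = urllib.request.HTTPDigestAuthHandler(passwords)
    return urllib.request.build_opener(handler)


def fetch_feed(url, username, password, timeout=10):
    """Download the raw feed bytes, handling authentication ourselves."""
    opener = build_digest_opener(url, username, password)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def parse_protected_feed(url, username, password):
    """Fetch a digest-protected feed and hand its content to feedparser."""
    # Imported here so the download helpers above also work in an
    # environment where feedparser is not installed.
    import feedparser

    content = fetch_feed(url, username, password)
    # parse() is given the retrieved content rather than a URL.
    return feedparser.parse(content)
```

With ``requests``, the download step would presumably reduce to ``requests.get(url, auth=requests.auth.HTTPDigestAuth(username, password)).content``; either way, only the retrieved content is handed to ``feedparser.parse``.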
|
||
Downloading a feed protected by digest authentication (the easy but horribly insecure way) | ||
------------------------------------------------------------------------------------------ | ||
|
||
In this example, the username is test and the password is digest. | ||
|
||
.. code-block:: pycon | ||
|
||
>>> import feedparser | ||
>>> d = feedparser.parse('http://test:digest@feedparser.org/docs/examples/digest_auth.xml') | ||
>>> d.feed.title | ||
'Sample Feed' | ||
|
||
|
||
|
||
You can also construct a HTTPBasicAuthHandler that contains the password | ||
information, then pass that as a handler to the ``parse`` function. | ||
HTTPBasicAuthHandler is part of the standard `urllib2 <http://docs.python.org/lib/module-urllib2.html>`_ module. | ||
|
||
Downloading a feed protected by :abbr:`HTTP (Hypertext Transfer Protocol)` basic authentication (the hard way) | ||
-------------------------------------------------------------------------------------------------------------- | ||
|
||
.. code-block:: python | ||
|
||
import urllib2, feedparser | ||
|
||
# Construct the authentication handler | ||
auth = urllib2.HTTPBasicAuthHandler() | ||
|
||
# Add password information: realm, host, user, password. | ||
# A single handler can contain passwords for multiple sites; | ||
# urllib2 will sort out which passwords get sent to which sites | ||
# based on the realm and host of the URL you're retrieving | ||
auth.add_password('BasicTest', 'feedparser.org', 'test', 'basic') | ||
|
||
# Pass the authentication handler to the feed parser. | ||
# handlers is a list because there might be more than one | ||
# type of handler (urllib2 defines lots of different ones, | ||
# and you can build your own) | ||
d = feedparser.parse( | ||
'$READTHEDOCS_CANONICAL_URL/examples/basic_auth.xml', | ||
handlers=[auth], | ||
) | ||
|
||
|
||
|
||
Digest authentication is handled in much the same way, by constructing an | ||
HTTPDigestAuthHandler and populating it with the necessary realm, host, user, | ||
and password information. This is more secure than | ||
:ref:`stuffing the username and password in the URL <example.auth.inline.digest>`, | ||
since the password will be encrypted before being sent to the server. | ||
|
||
|
||
Downloading a feed protected by :abbr:`HTTP (Hypertext Transfer Protocol)` digest authentication (the secure way) | ||
----------------------------------------------------------------------------------------------------------------- | ||
|
||
.. code-block:: python | ||
|
||
import urllib2, feedparser | ||
|
||
auth = urllib2.HTTPDigestAuthHandler() | ||
auth.add_password('DigestTest', 'feedparser.org', 'test', 'digest') | ||
d = feedparser.parse( | ||
'$READTHEDOCS_CANONICAL_URL/examples/digest_auth.xml', | ||
handlers=[auth], | ||
) | ||
|
||
|
||
The examples so far have assumed that you know in advance that the feed is | ||
password-protected. But what if you don't know? | ||
Determining that a feed is password-protected | ||
--------------------------------------------- | ||
|
||
If you try to download a password-protected feed without sending all the proper | ||
password information, the server will return an | ||
|
@@ -113,12 +48,7 @@ you will need to parse it yourself. Everything before the first space is the | |
type of authentication (probably ``Basic`` or ``Digest``), which controls which | ||
type of handler you'll need to construct. The realm name is given as | ||
realm="foo" -- so foo would be your first argument to auth.add_password. Other | ||
information in the www-authenticate header is probably safe to ignore; the | ||
:file:`urllib2` module will handle it for you. | ||
Review comment: Don't think this part of the sentence makes sense since |
||
|
||
|
||
Determining that a feed is password-protected | ||
--------------------------------------------- | ||
information in the www-authenticate header is probably safe to ignore. | ||
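The parsing step described above (scheme before the first space, realm inside ``realm="..."``) could be sketched roughly as follows. The function name ``parse_www_authenticate`` is hypothetical, and the sketch assumes the header value carries a single challenge.

```python
import re


def parse_www_authenticate(header):
    """Split a WWW-Authenticate value into (scheme, realm).

    Assumes one challenge per header value; realm is None when absent.
    """
    # Everything before the first space is the authentication type.
    scheme, _, params = header.partition(' ')
    # The realm is given as realm="foo"; the parameter name is case-insensitive.
    match = re.search(r'realm="([^"]*)"', params, re.IGNORECASE)
    realm = match.group(1) if match else None
    return scheme, realm
```

For example, ``parse_www_authenticate('Basic realm="foo"')`` yields ``('Basic', 'foo')``; any other parameters in the header are simply skipped.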
|
||
.. code-block:: pycon | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,19 +19,6 @@ you should change the User-Agent to your application name and | |
Customizing the User-Agent | ||
Review comment: Setting |
||
-------------------------- | ||
|
||
.. code-block:: pycon | ||
|
||
>>> import feedparser | ||
>>> d = feedparser.parse('$READTHEDOCS_CANONICAL_URL/examples/atom10.xml', | ||
... agent='MyApp/1.0 +http://example.com/') | ||
|
||
You can also set the User-Agent once, globally, and then call the ``parse`` | ||
function normally. | ||
|
||
|
||
Customizing the User-Agent permanently | ||
-------------------------------------- | ||
|
||
.. code-block:: pycon | ||
|
||
>>> import feedparser | ||
|
@@ -44,13 +31,3 @@ download a feed from a web server. This is discouraged, because it is a | |
violation of `RFC 2616 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.36>`_. | ||
The default behavior is to send a blank referrer, and you should never need to | ||
override this. | ||
|
||
|
||
Customizing the referrer | ||
------------------------ | ||
|
||
.. code-block:: pycon | ||
|
||
>>> import feedparser | ||
>>> d = feedparser.parse('$READTHEDOCS_CANONICAL_URL/examples/atom10.xml', | ||
... referrer='http://example.com/') |
Basic auth still seems to work fine, as requests supports these types of URLs; for anything else it seems like you need to handle authentication yourself.