Posted on 2008-11-26 01:07:27
MagpieRSS: RSS for PHP
[url=http://magpierss.sourceforge.net/]http://magpierss.sourceforge.net/[/url]
MagpieRSS provides an XML-based (expat) RSS parser in PHP.
MagpieRSS is compatible with RSS 0.9 through RSS 1.0. It also parses RSS 1.0 modules, RSS 2.0, and Atom (with a few exceptions).

Why?
I wrote MagpieRSS out of frustration with the limitations of existing solutions. In particular, many of the existing PHP solutions seemed to:
- use a parser based on regular expressions, making for an inherently fragile solution
- only support early versions of RSS
- discard all the interesting information besides item title, description, and link
- not build proper separation between parsing the RSS and displaying it.
In particular, I failed to find any PHP RSS parser that could parse RSS 1.0 feeds well enough to be useful for the RSS-based event feeds we generate at [url=http://protest.net/]Protest.net[/url].

Features
- Easy to Use
  As simple as: require('rss_fetch.inc'); $rss = fetch_rss($url);
- Parses RSS 0.9 - RSS 1.0
  Parses most RSS formats, including support for [url=http://www.purl.org/rss/1.0/modules/]1.0 modules[/url] and limited namespace support. RSS is packed into convenient data structures that are easy to use in PHP and appropriate for passing to a templating system like [url=http://smarty.php.net/]Smarty[/url].
- Integrated Object Cache
  Caching the parsed RSS means that the 2nd request is fast, and that including the fetch_rss() call in your PHP page won't destroy your performance or force you to rely on an external cron job. And it happens transparently.
- HTTP Conditional GETs
  Save bandwidth and speed up download times with intelligent use of Last-Modified and ETag. See [url=http://fishbowl.pastiche.org/archives/001132.html]HTTP Conditional Get for RSS Hackers[/url].
- Configurable
  Makes extensive use of constants to allow overriding default behaviour and installation on shared hosts.
- Modular
  - rss_fetch.inc - wraps a simple interface (fetch_rss()) around the library
  - rss_parse.inc - provides the RSS parser and the RSS object
  - rss_cache.inc - a simple (no GC) object cache, optimized for RSS objects
  - rss_utils.inc - utility functions for working with RSS; currently provides parse_w3cdtf(), for parsing [url=http://www.w3.org/TR/NOTE-datetime]W3CDTF[/url] into epoch seconds
- More
  - Secure - supports HTTP authentication and SSL
  - Bandwidth friendly - supports transparent GZIP encoding to reduce bandwidth usage
  - Does not use fopen(); works even if allow_url_fopen is disabled
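
To make the HTTP Conditional GETs feature listed above a little more concrete, here is a minimal sketch of the idea in plain PHP. It is not MagpieRSS's own code: both function names are hypothetical, and the sketch assumes the ETag and Last-Modified values from the previous response were stored alongside the cached feed.

```php
<?php
// Hypothetical helper (not part of MagpieRSS): build the request headers
// for a conditional GET from validators saved with a cached feed.
function conditional_get_headers(?string $etag, ?string $lastModified): array
{
    $headers = [];
    if ($etag !== null && $etag !== '') {
        // Ask the server to reply 304 if the entity tag still matches.
        $headers[] = 'If-None-Match: ' . $etag;
    }
    if ($lastModified !== null && $lastModified !== '') {
        // Ask the server to reply 304 if nothing changed since this date.
        $headers[] = 'If-Modified-Since: ' . $lastModified;
    }
    return $headers;
}

// Hypothetical helper: a 304 Not Modified status means the cached copy
// is still current and no feed body was sent.
function can_reuse_cache(int $statusCode): bool
{
    return $statusCode === 304;
}
```

When the server answers 304, the feed body is not transferred at all, which is where the bandwidth saving comes from.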

Magpie's approach to parsing RSS
Magpie takes a naive and inclusive approach. It is absolutely non-validating: as long as the RSS feed is well formed, Magpie will cheerfully parse new, never-before-seen tags in your RSS feeds.
This makes it very simple to support the varied versions of RSS, but forces the consumer of an RSS feed to be cognizant of how it is structured (at least if you want to do something fancy).
Magpie parses an RSS feed into a simple object with 4 fields: channel, items, image, and textinput.

channel
$rss->channel contains key-value pairs of all tags, without nested tags, found between the root tag (<rdf:RDF> or <rss>) and the end of the document.

items
$rss->items is an array of associative arrays, each one describing a single item. For example, an item that looks like:
<item rdf:about="http://protest.net/NorthEast/calendrome.cgi?span=event&ID=210257">
  <title>Weekly Peace Vigil</title>
  <link>http://protest.net/NorthEast/calendrome.cgi?span=event&ID=210257</link>
  <description>Wear a white ribbon</description>
  <dc:subject>Peace</dc:subject>
  <ev:startdate>2002-06-01T11:00:00</ev:startdate>
  <ev:location>Northampton, MA</ev:location>
  <ev:enddate>2002-06-01T12:00:00</ev:enddate>
  <ev:type>Protest</ev:type>
</item>
is parsed and pushed onto the $rss->items array as:
array(
    'title' => 'Weekly Peace Vigil',
    'link' => 'http://protest.net/NorthEast/calendrome.cgi?span=event&ID=210257',
    'description' => 'Wear a white ribbon',
    'dc' => array(
        'subject' => 'Peace'
    ),
    'ev' => array(
        'startdate' => '2002-06-01T11:00:00',
        'enddate' => '2002-06-01T12:00:00',
        'type' => 'Protest',
        'location' => 'Northampton, MA'
    )
);
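
Because namespaced tags such as dc:subject and ev:startdate end up in nested arrays, code that reads them has to cope with feeds that omit them. The sketch below shows one way to do that. It runs on a hard-coded copy of the item above, so no feed is fetched, and item_field() is a hypothetical helper, not a MagpieRSS function.

```php
<?php
// Sample data copied from the parsed item shown above.
$items = [
    [
        'title' => 'Weekly Peace Vigil',
        'link' => 'http://protest.net/NorthEast/calendrome.cgi?span=event&ID=210257',
        'description' => 'Wear a white ribbon',
        'dc' => ['subject' => 'Peace'],
        'ev' => [
            'startdate' => '2002-06-01T11:00:00',
            'enddate' => '2002-06-01T12:00:00',
            'type' => 'Protest',
            'location' => 'Northampton, MA',
        ],
    ],
];

// Hypothetical helper: read a namespaced field, returning $default
// when either the namespace or the field is absent from the item.
function item_field(array $item, string $ns, string $name, $default = null)
{
    return $item[$ns][$name] ?? $default;
}

foreach ($items as $item) {
    $start = item_field($item, 'ev', 'startdate', 'no start date');
    $where = item_field($item, 'ev', 'location', 'unknown location');
    echo $item['title'], ' (', $where, ', ', $start, ")\n";
}
// prints: Weekly Peace Vigil (Northampton, MA, 2002-06-01T11:00:00)
```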

image and textinput
$rss->image and $rss->textinput are associative arrays including name-value pairs for anything found between the respective parent tags.

Usage Examples:
A very simple example would be:
require_once 'rss_fetch.inc';

$url = 'http://magpie.sf.net/samples/imc.1-0.rdf';
$rss = fetch_rss($url);

echo "Site: ", $rss->channel['title'], "<br>";
foreach ($rss->items as $item) {
    // Array keys are quoted; bare words would be read as undefined constants.
    $title = $item['title'];
    $url = $item['link'];
    echo "<a href=\"$url\">$title</a><br>";
}

More soon... In the meantime, you can check out a [url=http://www.infinitepenguins.net/rss/]cool tool built with MagpieRSS[/url], version 0.1.
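
The usage example above echoes feed text straight into the page. Since Magpie is non-validating and the feed content comes from a third party, it may be worth escaping it first. The following is a minimal sketch under that assumption. render_item_links() is a hypothetical helper, not part of MagpieRSS, and it takes any array shaped like $rss->items.

```php
<?php
// Hypothetical helper (not part of MagpieRSS): render items as an HTML
// list, escaping feed-supplied text so it cannot inject markup.
function render_item_links(array $items): string
{
    $html = "<ul>\n";
    foreach ($items as $item) {
        // Fall back to placeholders when a feed omits title or link.
        $title = htmlspecialchars($item['title'] ?? '(untitled)', ENT_QUOTES, 'UTF-8');
        $link = htmlspecialchars($item['link'] ?? '#', ENT_QUOTES, 'UTF-8');
        $html .= "  <li><a href=\"$link\">$title</a></li>\n";
    }
    return $html . "</ul>\n";
}

// Usage with hand-written sample data instead of a fetched feed.
echo render_item_links([
    ['title' => 'Peace & Justice <rally>', 'link' => 'http://example.com/?a=1&b=2'],
]);
```

With Magpie installed, the same helper could be called as render_item_links($rss->items) after fetch_rss().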

Todos
RSS Parser
- Swap in a smarter parser that includes optional support for validation and required fields.
- Improve RSS 2.0 support, in all its wacky permutations (as much as I'm annoyed by it)
- Improve support for modules that rely on attributes
RSS Cache
- Light-weight garbage collection
Fetch RSS
- Attempt to [url=http://diveintomark.org/archives/2002/08/15.html]auto-detect an RSS feed[/url], given a URL, much like [url=http://diveintomark.org/projects/misc/rssfinder.py.txt]rssfinder.py[/url] does.
Misc
- More examples
- A test suite
- RSS generation, perhaps with [url=http://usefulinc.com/rss/rsswriter/]RSSwriter[/url]?