I'm trying to set up a [planet of my users' blogs](http://help.schmonz.com/planet/). I've enabled the aggregate, meta, and tag plugins (but not htmltidy, that thing has a gajillion dependencies). `aggregateinternal` is 1. The cron job is running and I've also enabled the webtrigger. My usage is like so:

    \[[!inline pages="internal(planet/*) show=0"]]

    \[[!aggregate
    name="Amitai's blog"
    url="http://www.schmonz.com/"
    dir="planet/schmonz-blog"
    feedurl="http://www.schmonz.com/atom/"
    expirecount="2"
    tag="schmonz"
    ]]

    \[[!aggregate
    name="Amitai's photos"
    url="http://photos.schmonz.com/"
    dir="planet/schmonz-photos"
    feedurl="http://photos.schmonz.com/main.php?g2_view=rss.SimpleRender&g2_itemId=7"
    expirecount="2"
    tag="schmonz"
    ]]

(and a few more `aggregate` directives like these)

Two things aren't working as I'd expect:

1. `expirecount` doesn't take effect on the first run, only on the second. (This is minor, just a bit confusing at first.)

2. Where are the article bodies for e.g. David's and Nathan's blogs? The bodies aren't showing up in the `._aggregated` files for those feeds, while the bodies for my own blog do. That explains the planet problem, but I don't understand the underlying aggregation problem. (Those feeds include article bodies, and they show up normally in my usual feed reader, rss2email.) How can I debug this further?

--[[schmonz]]

> I only looked at David's, but its RSS feed is not escaping the HTML
> inside the RSS `description` tags, which is illegal for RSS 2.0. These
> unknown tags then get ignored, including their content, and all that's
> left is whitespace. Escaping the HTML to `&lt;` and `&gt;` fixes the
> problem. You can see the feed validator complain about it here:
> <http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fwww.davidj.org%2Frss.xml>
>
> It's sorta unfortunate that [[cpan XML::Feed]] doesn't just assume the
> unescaped HTML is part of the description field. Probably other feed
> parsers are more lenient. --[[Joey]]
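
The effect Joey describes can be reproduced outside ikiwiki. The following is a minimal sketch in Python, not the XML::Feed code path. It assumes a parser that treats unescaped markup inside `description` as child elements and reads only the element's own direct text, and it uses Python's standard `xml.etree.ElementTree` as a stand-in for such a parser. The sample body and the `description_text` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article body, standing in for what a blog would publish.
BODY = "<p>Hello <b>planet</b></p>"


def description_text(inner):
    """Return the direct text of <description>, ignoring child elements.

    This mimics (as an assumption) a parser that drops unknown tags
    together with their content.
    """
    item = ET.fromstring(
        f"<item><description>\n{inner}\n</description></item>"
    )
    return (item.find("description").text or "").strip()


# Unescaped HTML: <p> becomes a child element, so only whitespace is left.
unescaped = description_text(BODY)

# Escaped HTML (&lt;p&gt;...): the whole body survives as plain text.
escaped = description_text(escape(BODY))

print(repr(unescaped))
print(repr(escaped))
```

With the unescaped body, `unescaped` is the empty string, because only whitespace precedes the first child element. With the escaped body, `escaped` is the original HTML again, which matches the observation that escaping to `&lt;` and `&gt;` brings the article bodies back.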