Work around XML::Atom strangeness that results in double-encoded posts
author Simon McVittie <smcv@ http://smcv.pseudorandom.co.uk/>
Tue, 3 Feb 2009 19:48:55 +0000 (19:48 +0000)
committer Simon McVittie <smcv@ http://smcv.pseudorandom.co.uk/>
Tue, 3 Feb 2009 19:48:55 +0000 (19:48 +0000)
See [[bugs/Aggregated_Atom_feeds_are_double-encoded]]. By default,
XML::Atom outputs strings of UTF-8 bytes with the Perl UTF8 flag stripped
off, which IkiWiki assumes to be Latin-1 and re-encodes as UTF-8 on
output. XML::Feed does not currently (0.41-1) set the magic variable
$XML::Atom::ForceUnicode to change this behaviour (I've filed a bug on
CPAN), but IkiWiki can usefully set the same variable as a workaround.
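
For illustration only, here is a minimal sketch of the double-encoding
described above. It uses only the core Encode module and does not call
XML::Atom or IkiWiki; the sample string and the variable names are
assumptions made up for this example. It shows how UTF-8 bytes without
the UTF8 flag grow when they are treated as Latin-1 characters and
encoded again, and how decoding them first avoids that.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample: a character string with one non-ASCII character (U+00E9).
my $text = "caf\x{e9}";

# Stand-in for what the commit message says XML::Atom returns by default:
# UTF-8 encoded bytes with the UTF8 flag off.
my $bytes = encode('UTF-8', $text);

# Perl treats each byte of an unflagged string as a Latin-1 character,
# so encoding it again encodes the two bytes of U+00E9 separately.
my $double = encode('UTF-8', $bytes);

# Decoding first gives a real character string, which encodes cleanly.
my $chars = decode('UTF-8', $bytes);
my $fixed = encode('UTF-8', $chars);

printf "bytes:  %d octets, UTF8 flag %s\n",
    length($bytes), (utf8::is_utf8($bytes) ? 'on' : 'off');
printf "double: %d octets\n", length($double);
printf "fixed:  %d octets\n", length($fixed);
```

Setting $XML::Atom::ForceUnicode, as the patch does, is meant to make
the decode step unnecessary by having XML::Atom return character
strings in the first place.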

IkiWiki/Plugin/aggregate.pm

index c667ee2a9f7429512d6287fdd91bd6422cd1500e..e1baae666a02ffe0e9230ed4d81667a2f7ece79a 100644
@@ -534,6 +534,11 @@ sub aggregate (@) {
                }
 
                foreach my $entry ($f->entries) {
+                       # XML::Feed doesn't work around XML::Atom's bizarre
+                       # API, so we will. Real unicode strings? Yes please.
+                       # See [[bugs/Aggregated_Atom_feeds_are_double-encoded]]
+                       local $XML::Atom::ForceUnicode = 1;
+
                        my $c=$entry->content;
                        # atom feeds may have no content, only a summary
                        if (! defined $c && ref $entry->summary) {