X-Git-Url: https://sipb.mit.edu/gitweb.cgi/ikiwiki.git/blobdiff_plain/121405e8aa80e7ceb5283b0ff8c9865458a6dd52..e46a3b753463e71d8a24c35a5035cfbc47dd4816:/doc/plugins/contrib/field/discussion.mdwn diff --git a/doc/plugins/contrib/field/discussion.mdwn b/doc/plugins/contrib/field/discussion.mdwn index b243e2dfe..24c37cc4c 100644 --- a/doc/plugins/contrib/field/discussion.mdwn +++ b/doc/plugins/contrib/field/discussion.mdwn @@ -24,10 +24,64 @@ behaviour, an auxiliary plugin would be easy.) >>> Based on `field_get_value()`, yes. That would be my ideal. Do you think I should implement that as an ikiwiki branch? --[[KathrynAndersen]] +>>>> This doesn't solve cases where certain fields are treated specially; for +>>>> instance, putting a `\[[!meta permalink]]` on a page is not the same as +>>>> putting it in `ymlfront` (in the latter case you won't get your +>>>> `` header), and putting `\[[!meta date]]` is not the same as putting +>>>> `date` in `ymlfront` (in the latter case, `%pagectime` won't be changed). +>>>> +>>>> One way to resolve that would be to have `ymlfront`, or similar, be a +>>>> front-end for `meta` rather than for `field`, and call +>>>> `IkiWiki::Plugin::meta::preprocess` (or a refactored-out function that's +>>>> similar). +>>>> +>>>> There are also some cross-site scripting issues (see below)... --[[smcv]] + >> (On the site I mentioned, I'm using an unmodified version of `field`, >> and currently working around the collision by tagging books' pages >> with `bookauthor` instead of `author` in the YAML.) --s +>> Revisiting this after more thought, the problem here is similar to the +>> possibility that a wiki user adds a `meta` shortcut +>> to [[shortcuts]], or conversely, that a plugin adds a `cpan` directive +>> that conflicts with the `cpan` shortcut that pages already use. (In the +>> case of shortcuts, this is resolved by having plugin-defined directives +>> always win.) For plugin-defined meta keywords this is the plugin +>> author's/wiki admin's problem - just don't enable conflicting plugins! - +>> but it gets scary when you start introducing things like `ymlfront`, which +>> allow arbitrary, wiki-user-defined fields, even ones that subvert +>> other plugins' assumptions. +>> +>> The `pagetemplate` hook is particularly alarming because page templates are +>> evaluated in many contexts, not all of which are subject to the +>> htmlscrubber or escaping; because the output from `field` isn't filtered, +>> prefixed or delimited, when combined with an arbitrary-key-setting plugin +>> like `ymlfront` it can interfere with other plugins' expectations +>> and potentially cause cross-site scripting exploits. For instance, `inline` +>> has a `pagetemplate` hook which defines the `FEEDLINKS` template variable +>> to be a blob of HTML to put in the `` of the page. As a result, this +>> YAML would be bad: +>> +>> --- +>> FEEDLINKS: +>> --- +>> +>> (It might require a different case combination due to implementation +>> details, I'm not sure.) +>> +>> It's difficult for `field` to do anything about this, because it doesn't +>> know whether a field is meant to be plain text, HTML, a URL, or something +>> else. +>> +>> If `field`'s `pagetemplate` hook did something more limiting - like +>> only emitting template variables starting with `field_`, or from some +>> finite set, or something - then this would cease to be a problem, I think? +>> +>> `ftemplate` and `getfield` don't have this problem, as far as I can see, +>> because their output is in contexts where the user could equally well have +>> written raw HTML directly; the user can cause themselves confusion, but +>> can't cause harmful output. --[[smcv]] + From a coding style point of view, the `$CamelCase` variable names aren't IkiWiki style, and the `match_foo` functions look as though they could benefit from being thin wrappers around a common `&IkiWiki::Plugin::field::match` @@ -77,8 +131,8 @@ I think it should just be part of `field` rather than a separate plugin. error("sort=field requires a parameter"); } - my $left = IkiWiki::Plugin::field::field_get_value($_[2], $_[0]); - my $right = IkiWiki::Plugin::field::field_get_value($_[2], $_[1]); + my $left = IkiWiki::Plugin::field::field_get_value($_[0], $a); + my $right = IkiWiki::Plugin::field::field_get_value($_[0], $b); $left = "" unless defined $left; $right = "" unless defined $right; @@ -87,17 +141,6 @@ I think it should just be part of `field` rather than a separate plugin. 1; -------- - -Bug report: `field` has an unnecessary `use YAML::Any`, presumably from before -you separated out `ymlfront`. Trivial patch available from -field-etc branch in git://git.pseudorandom.co.uk/git/smcv/ikiwiki.git (gitweb: -) ---[[smcv]] - -> Can do for the field plugin (delete one line? Easy.) Will push when I get to a better connection. --[[KathrynAndersen]] ->> Done! -K.A. - ---- Disclaimer: I've only looked at this plugin and ymlfront, not other related @@ -125,3 +168,135 @@ smcv's discuission of field author vs meta author above. --[[Joey]] > So, yes, it does cater to mostly my personal needs, but I think it is more generally useful, also. > --[[KathrynAndersen]] +>> Is it fair to say, then, that `field`'s purpose is to take other +>> plugins' arbitrary per-page data, and present it as a single +>> merged/flattened string => string map per page? From the plugins +>> here, things you then use that merged map for include: +>> +>> * sorting - stolen by [[todo/allow_plugins_to_add_sorting_methods]] +>> * substitution into pages with Perl-like syntax - `getfield` +>> * substitution into wiki-defined templates - the `pagetemplate` +>> hook +>> * substitution into user-defined templates - `ftemplate` +>> +>> As I mentioned above, the flattening can cause collisions (and in the +>> `pagetemplate` case, even security problems). +>> +>> I wonder whether conflating Page Text Variables with Page Variables +>> causes `field` to be more general than it needs to be? +>> To define a Page Variable (function-like field), you need to write +>> a plugin containing that Perl function; if we assume that `field` +>> or something resembling it gets merged into ikiwiki, then it's +>> reasonable to expect third-party plugins to integrate with whatever +>> scaffolding there is for these (either in an enabled-by-default +>> plugin that most people are expected to leave enabled, like `meta` +>> now, or in the core), and it doesn't seem onerous to expect each +>> plugin that wants to participate in this mechanism to have code to +>> do so. While it's still contrib, `field` could just have a special case +>> for the meta plugin, rather than the converse? +>> +>> If Page Text Variables are limited to being simple strings as you +>> suggest over in [[forum/an_alternative_approach_to_structured_data]], +>> then they're functionally similar to `meta` fields, so one way to +>> get their functionality would be to extend `meta` so that +>> +>> \[[!meta badger="mushroom"]] +>> +>> (for an unrecognised keyword `badger`) would store +>> `$pagestate{$page}{meta}{badger} = "mushroom"`? Getting this to +>> appear in templates might be problematic, because a naive +>> `pagetemplate` hook would have the same problem that `field` combined +>> with `ymlfront` currently does. +>> +>> One disadvantage that would appear if the function-like and +>> meta-like fields weren't in the same namespace would be that it +>> wouldn't be possible to switch a field from being meta-like to being +>> function-like without changing any wiki content that referenced it. +>> +>> Perhaps meta-like fields should just *be* `meta` (with the above +>> enhancement), as a trivial case of function-like fields? That would +>> turn `ymlfront` into an alternative syntax for `meta`, I think? +>> That, in turn, would hopefully solve the special-fields problem, +>> by just delegating it to meta. I've been glad of the ability to define +>> new ad-hoc fields with this plugin without having to write an extra plugin +>> to do so (listing books with a `bookauthor` and sorting them by +>> `"field(bookauthor) title"`), but that'd be just as easy if `meta` +>> accepted ad-hoc fields? +>> +>> --[[smcv]] + +>>> Your point above about cross-site scripting is a valid one, and something I +>>> hadn't thought of (oops). + +>>> I still want to be able to populate pagetemplate templates with field, because I +>>> use it for a number of things, such as setting which CSS files to use for a +>>> given page, and, as I said, for titles. But apart from the titles, I +>>> realize I've been setting them in places other than the page data itself. +>>> (Another unreleased plugin, `concon`, uses Config::Context to be able to +>>> set variables on a per-site, per-directory and a per-page basis). + +>>> The first possible solution is what you suggested above: for field to only +>>> set values in pagetemplate which are prefixed with *field_*. I don't think +>>> this is quite satisfactory, since that would still mean that people could +>>> put un-scrubbed values into a pagetemplate, albeit they would be values +>>> named field_foo, etc. --[[KathrynAndersen]] + +>>>> They can already do similar; `PERMALINK` is pre-sanitized to +>>>> ensure that it's a "safe" URL, but if an extremely confused wiki admin was +>>>> to put `COPYRIGHT` in their RSS/Atom feed's ``, a malicious user +>>>> could put an unsafe (e.g. Javascript) URL in there (`COPYRIGHT` *is* +>>>> HTML-scrubbed, but "javascript:alert('pwned!')" is just text as far as a +>>>> HTML sanitizer is concerned, so it passes straight through). The solution +>>>> is to not use variables in situations where that variable would be +>>>> inappropriate. Because `field` is so generic, the definition of what's +>>>> appropriate is difficult. --[[smcv]] + +>>> An alternative solution would be to classify field registration as "secure" +>>> and "insecure". Sources such as ymlfront would be insecure, sources such +>>> as concon (or the $config hash) would be secure, since they can't be edited +>>> as pages. Then, when doing pagetemplate substitution (but not ftemplate +>>> substitution) the insecure sources could be HTML-escaped. +>>> --[[KathrynAndersen]] + +>>>> Whether you trust the supplier of data seems orthogonal to whether its value +>>>> is (meant to be) interpreted as plain text, HTML, a URL or what? +>>>> +>>>> Even in cases where you trust the supplier, you need to escape things +>>>> suitably for the context, not for security but for correctness. The +>>>> definition of the value, and the context it's being used in, changes the +>>>> processing you need to do. An incomplete list: +>>>> +>>>> * HTML used as HTML needs to be html-scrubbed if and only if untrusted +>>>> * URLs used as URLs need to be put through `safeurl()` if and only if +>>>> untrusted +>>>> * HTML used as plain text needs tags removed regardless +>>>> * URLs used as plain text are safe +>>>> * URLs or plain text used in HTML need HTML-escaping (and URLs also need +>>>> `safeurl()` if untrusted) +>>>> * HTML or plain text used in URLs need URL-escaping (and the resulting +>>>> URL might need sanitizing too?) +>>>> +>>>> I can't immediately think of other data types we'd be interested in beyond +>>>> text, HTML and URL, but I'm sure there are plenty. +>>>> +>>>> One reasonable option would be to declare that `field` takes text-valued +>>>> fields, in which case either consumers need to escape +>>>> it with ``, and not interpret it as a URL +>>>> without first checking `safeurl`), or the pagetemplate hook needs to +>>>> pre-escape. +>>>> +>>>> Another reasonable option would be to declare that `field` takes raw HTML, +>>>> in which case consumers need to only use it in contexts that will be +>>>> HTML-scrubbed (but it becomes unsuitable for using as text - problematic +>>>> for text-based things like sorting or URLs, and not ideal for searching). +>>>> +>>>> You could even let each consumer choose how it's going to use the field, +>>>> by having the `foo` field generate `TEXT_FOO` and `HTML_FOO` variables? +>>>> --[[smcv]] + +>>> Another problem, as you point out, is special-case fields, such as a number of +>>> those defined by `meta`, which have side-effects associated with them, more +>>> than just providing a value to pagetemplate. Perhaps `meta` should deal with +>>> the side-effects, but use `field` as an interface to get the values of those special fields. + +>>> --[[KathrynAndersen]]