It makes me think that one non-negotiable feature of any webapp architecture is ...

samstokes · on Nov 9, 2012

Yesod (a Haskell web framework) tries pretty hard, e.g. http://www.yesodweb.com/book/shakespearean-templates

javajosh · on Nov 9, 2012

Cool, thanks for that.

(My hobby: posting "nothing like X exists" in a Hacker News thread. :)

JonnieCache · on Nov 9, 2012

Rails sort of does that, and has since 3.0.

http://yehudakatz.com/2010/02/01/safebuffers-and-rails-3-0/

javajosh · on Nov 9, 2012

Neat. Something like SafeBuffer is a practical way to approach the problem.

It seems like with the rise of 'zero copy' approaches we could do even better: simply designate a memory region as unsafe, and transform it into a safe version depending on the context in which it is used. These transforms would want to add a little metadata pointing back to the original unsafe region, in case the transformed region is ever subsequently used in a different execution context.

Alas, from the perspective of one program, the input to another always just looks like a string, which means that somehow our host program (and programmer) needs to signal the appropriate transform on, say, concatenation. The only way I can think of around this requirement is to force implementors of contexts to tag their interfaces as contexts, and for callers to construct arguments to those functions such that constituents derived from unsafe regions are detectable. For example, we have a SQL context that takes an array of string pointers, where some of the pointers point to 'unsafe' regions, and we just concatenate the elements of the array to construct the context argument.
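
A minimal Python sketch of that last idea, for illustration only. The `Unsafe` marker and the `sql_context` function are hypothetical names, not part of any library, and the sketch swaps in bound parameters (via the standard `sqlite3` module) as the "safe transform" rather than anything at the memory-region level: the caller hands the context a list of parts, and the context decides what to do with the ones marked unsafe.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Unsafe:
    """Hypothetical marker for a value that came from an untrusted source."""
    value: str


def sql_context(parts):
    """Join parts into (query, params); each Unsafe part becomes a '?' placeholder."""
    query, params = [], []
    for part in parts:
        if isinstance(part, Unsafe):
            query.append("?")          # never spliced into the SQL text
            params.append(part.value)  # passed to the driver as data instead
        else:
            query.append(part)
    return "".join(query), tuple(params)


def demo():
    """Run a hostile input through the context against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice')")
    hostile = Unsafe("alice' OR '1'='1")
    query, params = sql_context(["SELECT name FROM users WHERE name = ", hostile])
    rows = conn.execute(query, params).fetchall()
    return query, params, rows
```

Here `demo()` produces the query text `SELECT name FROM users WHERE name = ?` and matches no rows, because the hostile string is compared as a literal name rather than parsed as SQL. The caller only concatenates; the choice of transform lives entirely inside the context.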

porlw · on Nov 9, 2012

Check out taint mode in Perl. It's been around forever, and I don't understand why all web frameworks don't have a similar concept.

lazyjones · on Nov 9, 2012

The Play framework (Scala, Java) and Mojolicious (Perl), and probably many other newer frameworks, escape output by default, so at least they make you think before allowing XSS.

fnayr · on Nov 9, 2012

Same with Django (Python).

wglb · on Nov 9, 2012

Ah, the fun part of this is "interpreted as code". Which language? HTML, XML, JS, CSS, JSON? Get that part wrong or slightly off, and what you sanitized for one context isn't safe in another. And sometimes there can be nested contexts.

While the idea of "taint" is useful, it is only half the battle. The other half is accounting for the context.
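
To make the "other half" concrete, here is a rough Python sketch in which the same tainted value gets a different escaper depending on where it is about to land. `Tainted`, `render`, and the escaper names are hypothetical, invented for this example; only `html.escape`, `json.dumps`, and `urllib.parse.quote` are real standard-library calls, and the three contexts shown are nowhere near a complete list.

```python
import html
import json
from urllib.parse import quote


class Tainted(str):
    """Hypothetical marker type: a string that came from an untrusted source."""


def for_html(value):
    # Safe for HTML text and quoted attribute values.
    return html.escape(value, quote=True)


def for_js_string(value):
    # json.dumps yields a quoted string literal; additionally escaping "<"
    # keeps a "</script>" in the data from closing an inline script block.
    return json.dumps(value).replace("<", "\\u003c")


def for_url_component(value):
    # Percent-encode everything, including "/" (safe="").
    return quote(value, safe="")


ESCAPERS = {"html": for_html, "js": for_js_string, "url": for_url_component}


def render(value, context):
    """Escape tainted values for the named context; pass trusted text through."""
    if isinstance(value, Tainted):
        return ESCAPERS[context](value)
    return value


payload = Tainted('"><script>')
as_html = render(payload, "html")  # '&quot;&gt;&lt;script&gt;'
as_url = render(payload, "url")    # '%22%3E%3Cscript%3E'
as_js = render(payload, "js")      # a quoted JS literal with "<" escaped
```

None of the three outputs is safe in either of the other two contexts, which is the point. Nested contexts, such as a JS string inside an HTML attribute, would need the escapers composed in the right order, which this sketch does not attempt.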