Open proposals and issues

This talk of new capabilities for a new release of Rivet has had me revisiting something I've been thinking about for quite a while...

It's really easy for developers to use forms in a way that they shouldn't, such as putting stuff in hidden fields that could easily be monkeyed with by an attacker. That data needs to be kept server side and accessed with an opaque, unguessable session ID cookie or equivalent.

It's also super common for developers to accept input from forms with insufficient checks on the validity of the input data, the source of SQL injection attacks and so forth.

I'm thinking of what you might call a response broker. (This isn't an original idea -- I've read about stuff like this.)

Rather than doing a load_response, the page handling the response would explicitly name the fields it expects to get from the form. You invoke a proc specifying the fields you expect to find, their data types, optional code to validate them, and an array to stuff the validated fields into.

set wantVarList {{username string} {id integer} {uid check_routine validate_uid} {password check_routine validate_password} {hash base64} {email email}}

set status [response_broker $wantVarList response]

Since every page using the response broker would need to check for response broker parse failures, it would probably be nice to be able to specify a general handler routine that would run and then abort the page, removing the need to check the return value.

Now if I run…

response_broker $wantVarList response

…if the page continues then the response array would contain validated fields found in the form for the variables named in wantVarList and no others. You might want the presence of unexpected fields to also blow out, but that would be likely to bite you pretty often when the reasons are harmless. Maybe log them and include a {field ignore} option that will inhibit logging for expected-but-ignored fields.
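To make the idea concrete, here is a rough sketch of what such a broker could look like in plain Tcl. Everything in it is an assumption for illustration: response_broker is not an existing Rivet command, the extra formData argument (a flat key/value list) only stands in for however the form fields would really be fetched, the return value is a list of error messages rather than a settled status convention, the base64 and email checks are deliberately crude, and unexpected_fields is a made-up helper for the "log the extras" idea.

```tcl
# Hypothetical sketch of the proposed response_broker; not an existing Rivet command.
# formData is a flat key/value list standing in for the submitted form fields.
# Returns a list of error messages; an empty list means every field validated.
proc response_broker {wantVarList arrayName formData} {
    upvar 1 $arrayName response
    array unset response
    set errors {}
    foreach spec $wantVarList {
        lassign $spec field type checker
        if {$type eq "ignore"} {
            continue
        }
        if {![dict exists $formData $field]} {
            lappend errors "$field: missing"
            continue
        }
        set value [dict get $formData $field]
        switch -- $type {
            string        { set ok 1 }
            integer       { set ok [string is integer -strict $value] }
            base64        { set ok [regexp {^[A-Za-z0-9+/]+={0,2}$} $value] }
            email         { set ok [regexp {^[^@[:space:]]+@[^@[:space:]]+\.[^@[:space:]]+$} $value] }
            check_routine { set ok [expr {[$checker $value] ? 1 : 0}] }
            default       { set ok 0 }
        }
        if {$ok} {
            set response($field) $value
        } else {
            lappend errors "$field: invalid $type"
        }
    }
    return $errors
}

# Hypothetical helper: form fields that wantVarList does not mention at all.
# Fields declared as {field ignore} count as mentioned, so they are not reported.
proc unexpected_fields {wantVarList formData} {
    set known {}
    foreach spec $wantVarList {
        lappend known [lindex $spec 0]
    }
    set extra {}
    foreach field [dict keys $formData] {
        if {$field ni $known} {
            lappend extra $field
        }
    }
    return $extra
}

# Usage with made-up form data.
proc validate_uid {value} { return [regexp {^[0-9]+$} $value] }
set wantVarList {{username string} {id integer} {uid check_routine validate_uid} {email email} {submit ignore}}
set form {username karl id 42 uid 1001 email karl@example.com submit Go tracking abc}
set status [response_broker $wantVarList response $form]
set extras [unexpected_fields $wantVarList $form]
```

In this sketch the response array only ever holds fields that were both requested and valid, so a page that reads nothing but response(...) never touches unvalidated input. The fields in extras are the ones you would log.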

We've done a form package at FlightAware that has a specific look and feel and that allows you to specify both Tcl and JavaScript validation code. Of course you still need the Tcl code on the server, but it's nice in the modern era to also provide JavaScript validation in the browser. We could probably open source it if its appearance was genericized or something, but I mention it mainly to point out the usefulness of such an approach for both the developer and the users, and a likely need to push Rivet into providing more stuff to support developers making modern websites.

We use separate virtual interpreters (SVI) for development (for each developer we diddle auto_path to give them source-controlled private copies of all of our packages) and it's really badass. The problem comes in the all-or-nothing approach of SVI. We set up a virtual host for each developer, for both port 80 (well, really 8080 + varnish on 80) and 443. This results in 18 interpreters in each httpd process on our development machine, making for very large httpd processes and slow startup time after a graceful restart.

Right now if one of our developers changes a package, private to them, they still have to do an apachectl graceful to pick up the change. This restarts all of the httpd processes and reinitializes all of the interpreters. Our interpreter initialization is intense. Each FlightAware httpd process loads 468 packages.

I'd like to be able to cause only one vhost's Tcl interpreters to be reloaded by a Tcl_DeleteInterp / Tcl_CreateInterp / Rivet initialization process. Instead of a graceful, you'd be able to specify something like a trigger file for each vhost. Every time a vhost (with separate virtual interpreters) serves a page, it gets the mtime of the trigger file. If the mtime of the trigger file has changed since the last time the interpreter served a page, Rivet deletes the virtual host's interpreter, creates and initializes a new one, and then handles the page. [I tried to write this but kind of lost control of it and was not successful.]

This way, developers could totally reload all their libraries without any httpd processes being stopped or started. This would also lower overall overhead, because a lot of the time an httpd process goes its whole lifetime without ever handling a page for many, if not most, of its virtual interpreters.

An additional improvement would be to create the ability to not even initialize a vhost's separate virtual interpreter until the first time it is needed.
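The trigger-file check and the lazy creation can be sketched together at the Tcl level. This is only an illustration of the logic: the real thing would live in mod_rivet's C code around Tcl_DeleteInterp / Tcl_CreateInterp, and the names here (vhost_interp, vhostState, the initScript argument standing in for Rivet initialization) are all made up.

```tcl
# Per-vhost state: vhost name -> {interp lastMtime}. All names here are hypothetical.
array set vhostState {}

# Return the interpreter for a vhost, creating it on first use and recreating it
# whenever the trigger file's mtime differs from the one seen at creation time.
proc vhost_interp {vhost triggerFile initScript} {
    global vhostState
    # A missing trigger file is treated as mtime 0, i.e. "never touched".
    set mtime [expr {[file exists $triggerFile] ? [file mtime $triggerFile] : 0}]
    if {[info exists vhostState($vhost)]} {
        lassign $vhostState($vhost) interp lastMtime
        if {$mtime == $lastMtime} {
            return $interp
        }
        # Trigger file changed: throw away only this vhost's interpreter.
        interp delete $interp
    }
    # First request for this vhost, or a reload: build a fresh interpreter.
    set interp [interp create]
    $interp eval $initScript
    set vhostState($vhost) [list $interp $mtime]
    return $interp
}
```

A developer would then just touch their own trigger file after changing a package. Nothing is created for a vhost until vhost_interp is first called for it, which is the lazy initialization described above.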

.rvt templates run within the ::request namespace. It is a sensible practice to parse other templates or to source other scripts from an .rvt file, but that code is then also run within the ::request namespace, which is going to be destroyed after the request has been served. This has consequences for how specific Tcl commands work (e.g. 'package require ...') and has to be documented extensively in the manual. (Suggested by Karl.)
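The basic effect can be shown in plain Tcl without Rivet. The sketch below only simulates a request by evaluating a script inside a ::request namespace and then deleting it; the serve proc and the names defined in the script are made up, and it does not try to reproduce anything Rivet-specific such as what 'package require' does.

```tcl
# Simulate one request: evaluate a "template" script inside ::request,
# then destroy the namespace the way it would be after the request is served.
proc serve {script} {
    namespace eval ::request $script
    namespace delete ::request
}

serve {
    # Namespace variable: lives in ::request and disappears with it.
    variable counter 1
    # Unqualified proc name: becomes ::request::helper and disappears too.
    proc helper {} { return "per-request" }
    # Fully qualified names land in the global namespace and survive.
    proc ::persistent {} { return "global" }
    set ::hits 1
}

set helperGone   [expr {[info commands ::request::helper] eq ""}]
set persistentOk [expr {[info commands ::persistent] ne ""}]
```

After serve returns, helper and counter are gone while ::persistent and ::hits are still there, so anything a sourced script defines without a fully qualified name has to be redefined on every request.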