Handling of Transfer-Encoding chunked

Sep 25, 2010 at 11:41 PM

Hi.

I glanced at the Input object code and don't see any special magic there. I was wondering whether you have any plans to handle a request body sent over HTTP/1.1 with Transfer-Encoding: chunked.

Any client that uses libcurl can easily send a chunked request body by setting "Expect: 100-continue" and "Transfer-Encoding: chunked" before it even knows how much it will send. The Git client does that when the cached request body crosses 1 MB; it starts shoving output straight into the request's pipe.

CherryPy handles chunked transfer encoding by wrapping the input object in a reader class that looks for an "unexpected" end of input and sends EOF up to wsgi.input. It's not exactly per PEP 333, but it helps bring sanity to the WSGI world.

Any thoughts about transparent handling of chunked bodies for NWSGI?

Daniel.

Coordinator
Sep 26, 2010 at 5:07 AM

Hi Daniel,

I would like to support it, but I would need a simple testcase - can it be done with the curl command line tool?

My understanding is that it should actually be handled by IIS before it ever gets to NWSGI, but I've never tested it.

Sep 27, 2010 at 6:49 PM
Edited Sep 27, 2010 at 7:07 PM

Sample curl command to send a chunked body:

curl upload_url --upload-file some_local_file_name --header "Transfer-Encoding: chunked"
(Tried it on Windows with the curl that ships as part of Cygwin.)
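
For reference, here is a rough sketch of the framing that such a chunked body uses on the wire: each chunk is prefixed with its size in hex, and a zero-size chunk ends the body. The helper names encode_chunked and decode_chunked are made up for illustration, and the decoder ignores trailers and does no error handling.

```python
def encode_chunked(pieces):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = []
    for piece in pieces:
        if piece:  # a zero-length chunk would terminate the body early
            out.append(b"%x\r\n" % len(piece) + piece + b"\r\n")
    out.append(b"0\r\n\r\n")  # last chunk, no trailers
    return b"".join(out)


def decode_chunked(data):
    """Recover the payload from a complete chunked body (trailers ignored)."""
    body = []
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry ";extension" data after the hex size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            break
        body.append(data[pos:pos + size])
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return b"".join(body)


wire = encode_chunked([b"hello", b" world"])
payload = decode_chunked(wire)
```

Note that no Content-Length is needed anywhere in this framing, which is why the sender can start before it knows the total size.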

I tried it on IIS 7.x + NWSGI 2.x + a WSGI app that shows environ.
The headers include "Transfer-Encoding". Content-Length is zero (not an empty string, not -1, not absent).

It looks like IIS does not cache the body; it presents it as a simple file-like object.

Frankly, I would not want it to do that. I would much prefer a simple pipe inserted between the request body and wsgi.input.
The pipe would read bytes from the request inside a .NET try/catch until it hits "unexpected end of stream", and signal EOF to
the wsgi.input file-like object when the exception is caught.
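
To make the idea concrete, here is a minimal Python sketch of such a pipe. It is not NWSGI or CherryPy code: EOFOnErrorReader and FlakyStream are hypothetical names, and treating IOError/EOFError as the "unexpected end of stream" signal is an assumption (the real server would catch whatever .NET exception the request stream throws).

```python
class EOFOnErrorReader:
    """File-like wrapper that reports a truncated stream as a normal EOF."""

    def __init__(self, raw, errors=(IOError, EOFError)):
        self._raw = raw
        self._errors = errors  # assumed "unexpected end of stream" exceptions
        self._eof = False

    def read(self, size=-1):
        if self._eof or size == 0:
            return b""
        try:
            data = self._raw.read(size)
        except self._errors:
            # The underlying stream ran dry: tell the caller it is EOF.
            self._eof = True
            return b""
        if not data:
            self._eof = True
        return data


class FlakyStream:
    """Stand-in for a request stream that raises once the body is exhausted."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def read(self, size=-1):
        if not self._chunks:
            raise IOError("unexpected end of stream")
        return self._chunks.pop(0)


reader = EOFOnErrorReader(FlakyStream([b"abc", b"def"]))
first, second, third = reader.read(3), reader.read(3), reader.read(3)
```

Once the wrapper has seen the error it keeps returning an empty string, so the app can call read() again without touching the broken stream.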

This position is in line with point #2 here:
http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html
(Not sure if there is a more recent draft of "WSGI 1.1" anywhere.)

When a chunked-aware WSGI app detects "Transfer-Encoding: chunked", it would elect to read from wsgi.input until EOF
instead of reading the number of bytes specified in Content-Length, which may be (a) missing, (b) wrong, or (c) ambiguous (e.g. -1).
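
A sketch of what that choice could look like in the app, assuming the server has already de-chunked the body and signals EOF on wsgi.input. read_request_body is a hypothetical helper, and HTTP_TRANSFER_ENCODING is the usual CGI-style environ key for that header.

```python
import io


def read_request_body(environ, block_size=8192):
    """Read the request body, trusting EOF over Content-Length when chunked."""
    stream = environ["wsgi.input"]
    transfer_encoding = environ.get("HTTP_TRANSFER_ENCODING", "").lower()

    if "chunked" in transfer_encoding:
        # Content-Length is meaningless here; read until the server says EOF.
        pieces = []
        while True:
            block = stream.read(block_size)
            if not block:
                break
            pieces.append(block)
        return b"".join(pieces)

    # Classic path: read exactly Content-Length bytes, or nothing at all.
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    return stream.read(length) if length > 0 else b""


chunked_env = {
    "HTTP_TRANSFER_ENCODING": "chunked",
    "CONTENT_LENGTH": "0",
    "wsgi.input": io.BytesIO(b"hello world"),
}
body = read_request_body(chunked_env)
```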

Incidentally, sending EOF does not break WSGI 1.0 apps. Since they are supposed to read exactly Content-Length
bytes, and the Content-Length header is either zero or unusable, they will either read zero bytes from the input or blow
up if they can't handle non-integer contents in Content-Length.
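
A small illustration of those two outcomes, using a made-up strict reader (read_body_wsgi10 is hypothetical, not taken from any framework):

```python
import io


def read_body_wsgi10(environ):
    """Strict WSGI 1.0 style: read exactly Content-Length bytes."""
    # int() raises ValueError if the header is not an integer.
    length = int(environ.get("CONTENT_LENGTH", "0"))
    return environ["wsgi.input"].read(length)


# Content-Length of zero: the app simply never looks at the body.
zero_env = {"CONTENT_LENGTH": "0", "wsgi.input": io.BytesIO(b"payload")}
zero_result = read_body_wsgi10(zero_env)

# Non-integer Content-Length: the app fails before reading anything.
bad_env = {"CONTENT_LENGTH": "chunked?", "wsgi.input": io.BytesIO(b"payload")}
try:
    read_body_wsgi10(bad_env)
    blew_up = False
except ValueError:
    blew_up = True
```

Either way the app never reaches the end of the stream, so an EOF signal added by the server goes unnoticed.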

Edit:
I originally provided a curl command that sends a multipart/form-data body, which is a headache to play with. I simplified the
command to send a chunked but simple, non-multipart stream.

A chunked body can also come in as multipart/form-data, which is another complication on top of a basic
chunked body or one sent with "Content-Encoding: gzip".

CherryPy handles multipart as well, but only as part of its own cherrypy namespace, not in standard WSGI space.
It decodes the parts and mounts them as separate objects in a list attached to the "request" object.
It's not really something I would like to see in a WSGI server, but it was an interesting thing to note.
I would be perfectly content to get the entire byte stream in wsgi.input and chop it myself if "multipart" is present.
I would probably put multipart chopping into WSGI middleware rather than into the server itself.
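
A rough sketch of what such middleware might look like, using the standard library email parser to do the chopping. The environ key "example.multipart_parts", split_multipart and MultipartMiddleware are all hypothetical names; the sketch assumes the server signals EOF on wsgi.input and it buffers the whole body in memory, so it is only meant to show the shape of the idea.

```python
import io
from email.parser import BytesParser

PARTS_KEY = "example.multipart_parts"  # hypothetical environ key


def split_multipart(content_type, body):
    """Return a list of (field name, payload bytes) for a multipart body."""
    # The email parser needs the Content-Type header to learn the boundary.
    header = ("Content-Type: %s\r\n\r\n" % content_type).encode("latin-1")
    message = BytesParser().parsebytes(header + body)
    if not message.is_multipart():
        return []
    parts = []
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        parts.append((name, part.get_payload(decode=True)))
    return parts


class MultipartMiddleware:
    """Chop multipart bodies before the wrapped WSGI app sees the request."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        content_type = environ.get("CONTENT_TYPE", "")
        if content_type.lower().startswith("multipart/"):
            body = environ["wsgi.input"].read()  # relies on EOF from the server
            environ[PARTS_KEY] = split_multipart(content_type, body)
            # Put the raw bytes back so the app can still read them itself.
            environ["wsgi.input"] = io.BytesIO(body)
        return self.app(environ, start_response)
```

The wrapped app can then look at environ[PARTS_KEY] or ignore it and read the raw stream as before.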