NPH (Non-Parsed Header) Scripts

NPH scripts were originally created to let a CGI process write its output directly to the socket fd. This implementation has a performance benefit, as well as the following implications:

HTTP headers are neither parsed nor modified by the HTTP server.
Script output is not buffered by the HTTP server.

It is the latter attribute that many popular NPH scripts depend on.

Orenosv offers flexible output buffering facilities, one of which is unbuffered output. Unlike NPH, unbuffered output does parse and modify response headers, but it does not buffer script output. Thus most NPH scripts can achieve their purpose by using unbuffered output. To offer further compatibility, Orenosv also correctly parses raw response headers generated by NPH scripts.

This NPH compatibility mode can be enabled on a per-script basis by prepending the "nph-" prefix to the name of the script, regardless of the global "outbuf_mode" parameter.
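
As a rough illustration of what such a script looks like, here is a minimal sketch of an NPH-style CGI script (it would be saved under a name such as "nph-ticker.py", a hypothetical file name). It writes the raw status line and headers itself and flushes after every chunk, relying on the output not being buffered. The function names, the tick count, and the delay are assumptions made for the example, not part of Orenosv.

```python
import sys
import time


def nph_response_head(status="200 OK", content_type="text/plain"):
    """Build the raw response head that an NPH script writes by itself."""
    lines = [
        "HTTP/1.0 " + status,            # NPH scripts emit the status line too
        "Content-Type: " + content_type,
        "",                              # empty line terminates the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def main(out=None, ticks=3, delay=1.0):
    """Write the head, then one line per tick, flushing after each write."""
    if out is None:
        out = sys.stdout.buffer
    out.write(nph_response_head())
    out.flush()
    for i in range(ticks):
        out.write(("tick %d\n" % i).encode("ascii"))
        out.flush()  # each line is meant to reach the client immediately
        time.sleep(delay)
```

A real script would call main() at the end of the file. With buffered output, the client would see all ticks at once when the script exits; with unbuffered output (or NPH mode), they arrive one at a time.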

Other points:

The output does go through the HTTP server, so there's no performance benefit.
Unbuffered output is also subject to output filtering, which may buffer output. Make sure to configure NPH scripts not to go through any output filters.
NPH mode is also applicable to scripts that are run by ISAPI interpreters (such as PERLIS.DLL).

Multi-process implementation

Currently, processes in a process group (called "helper processes" hereafter) are each implemented as an independent HTTP server that listens on a TCP port, so the main server acts as a specialized form of proxy. A helper process listens on the "localhost" IP address (127.0.0.1) and on a port number that it obtains from the OS when it starts up (using bind(0)).
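
The bind(0) technique can be sketched as follows. This is only an illustration using Python's standard socket module, not the Orenosv code; the function name and the backlog value are assumptions.

```python
import socket


def open_helper_listener():
    """Listen on 127.0.0.1 with an OS-assigned port; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))   # port 0 asks the OS to pick a free port
    sock.listen(5)
    port = sock.getsockname()[1]  # the port number the OS actually assigned
    return sock, port
```

The helper then has to report the assigned port back to the main server, since the main server cannot know it in advance.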

This implementation may be resource intensive in terms of TCP ports if many helper processes are required, and it adds communication overhead between the main server and the helpers. Several performance optimizations are possible.

passing the client connection handle to a helper process
Since W2K doesn't allow a socket handle to be dissociated from an IO completion port, the helper process can't use its own IOCP and would have to switch to synchronous IO or event-based async IO. This would require entirely new network code.
passing response data via shared memory or a temporary file
The basic request-response exchange is done via HTTP over TCP, but the body of the response is passed via shared memory or a temporary file if it is large. This is an easy option because it doesn't require changing the TCP socket code or the HTTP header handling code. It just needs a special header telling the main server where to find the response body. Removing those resources (the block in shared memory or the temporary file) will be done in the main server. Note that the response body will not be pipelined to the main process; rather, it arrives in one shot. The memory block or temporary file will be passed directly to the winsock API to avoid copying. However, this method may be even more resource-intensive than the current one...
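
The temporary-file variant of the second option might look roughly like the sketch below. The header name "X-Body-File" and both function names are hypothetical, invented for this example. The sketch also reads the file into memory for simplicity, whereas the idea described above is to hand the file directly to the winsock API.

```python
import os
import tempfile

BODY_FILE_HEADER = "X-Body-File"  # hypothetical header name


def helper_store_body(body):
    """Helper side: write the body to a temp file, return response headers."""
    fd, path = tempfile.mkstemp(prefix="helper-body-")
    with os.fdopen(fd, "wb") as f:
        f.write(body)
    return {BODY_FILE_HEADER: path, "Content-Length": str(len(body))}


def main_fetch_body(headers):
    """Main server side: read the body, then remove the temp file."""
    path = headers[BODY_FILE_HEADER]
    try:
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)  # cleanup is the main server's responsibility
```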

The security against localhost users is as follows. When a parent process starts up a child process, it generates a secret code and passes it on to the child via the pipe, which should be protected by the OS. The parent process adds this secret code to its HTTP request headers when passing the client request to the child process, and the child checks that secret to see if it's from the parent. TCP communications between "localhost" endpoints are also protected by the OS.
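
A minimal sketch of this secret check, assuming the secret travels in a request header, is shown below. The header name "X-Helper-Secret", the secret length, and the function names are assumptions for illustration; the actual header and format used by Orenosv are not specified here.

```python
import hmac
import secrets

SECRET_HEADER = "X-Helper-Secret"  # hypothetical header name


def new_secret():
    """Parent side: generate a random secret to hand to the child."""
    return secrets.token_hex(16)


def add_secret(headers, secret):
    """Parent side: copy the request headers and attach the secret."""
    out = dict(headers)
    out[SECRET_HEADER] = secret
    return out


def is_from_parent(headers, secret):
    """Child side: accept the request only if the secret matches."""
    supplied = headers.get(SECRET_HEADER, "")
    # compare_digest avoids leaking the match position through timing
    return hmac.compare_digest(supplied.encode("utf-8"), secret.encode("utf-8"))
```

A request from another local user who connects to the helper's port directly would lack the correct header value and would be rejected by the child.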

SSI limitations

The reason why SSI is a separate CGI executable is that its implementation will never be complete. Its parser is very simple-minded.

it doesn't understand HTML syntax, so an SSI directive enclosed in HTML comments will be expanded.
it is not internationalized in any way. In particular, it will have trouble with double-byte character sets if a charset contains 7-bit bytes as part of a double-byte character. In the case of Japanese charsets, you can avoid the problem by using the euc-jp charset.
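
To make the double-byte issue concrete, the sketch below lists the 7-bit bytes found in two encodings of the same kanji. Shift_JIS is used here as an example of a charset whose double-byte characters can contain a 7-bit byte; the sample character and the function name are choices made for this illustration only.

```python
def seven_bit_bytes(text, charset):
    """Return the bytes below 0x80 in the given encoding of text."""
    return bytes(b for b in text.encode(charset) if b < 0x80)


sample = "表"  # a kanji whose second Shift_JIS byte is 0x5C (backslash)

print(seven_bit_bytes(sample, "shift_jis"))  # prints b'\\'
print(seven_bit_bytes(sample, "euc_jp"))     # prints b''
```

A byte-oriented parser sees the 0x5C in the Shift_JIS form as an ordinary backslash, while the euc-jp form contains only bytes with the high bit set.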

Since SSI has to parse user-created files, it must be robust in handling erroneous documents; otherwise it might fall into a tight CPU loop. However, it will never be internationalized, so it can never handle all erroneous cases correctly. Making it a CGI enables Orenosv to kill it off after a certain time limit (using the -x option in mod_cgi).
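
The general idea of running a parser as a separate process and killing it after a time limit can be sketched as follows. This uses Python's subprocess module and is not how mod_cgi is implemented; the function name and return shape are assumptions.

```python
import subprocess


def run_with_time_limit(argv, limit_seconds):
    """Run a child process; kill it if it outlives the limit.

    Returns (finished, output), where finished is False if it was killed.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        out, _ = proc.communicate(timeout=limit_seconds)
        return True, out
    except subprocess.TimeoutExpired:
        proc.kill()
        out, _ = proc.communicate()  # reap the child, keep partial output
        return False, out
```

Because the runaway code lives in its own process, a tight CPU loop in the parser costs at most the time limit and does not take the server down with it.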

Other limitations:

"virtual=" attribute can only specify a file that resides in the same alias directory tree as the referencing document. This is a limitation that stems from the CGI architecture.
"cgi" commands are not implemented.
"cgi" will have to be implemented as a call into the HTTP server.

Performance problem with using localhost(127.0.0.1)

I encountered this problem on one of my notebook PCs. This problem is present in Windows 2000 from SP0 to SP2.

See MS Knowledge Base article Q294769,
Data Transfer to "Localhost" or Loopback Address Is Slow.

A workaround is to use a hostname that is assigned a valid (non-loopback) IP address.

About 'Content-Encoding' field

It seems to me that the interpretation of 'Content-Encoding' differs between Netscape and IE. In both Netscape 4.7 and Mozilla 0.9, it is the content handler that is responsible for handling 'Content-Encoding', which is understandable. In IE 5.0, the HTTP transport handles 'Content-Encoding', which is good because this field is defined in the HTTP spec.

The latest HTTP 1.1 spec suggests using 'Transfer-Encoding: gzip' instead of 'Content-Encoding' for the purpose of entity transfer. 'Content-Encoding', instead, should be used to represent metadata about the entity.

For example, the headers:
Content-Type: text/plain
Content-Encoding: gzip

suggest to the user agent that the file is plain text compressed with gzip, and that the entity should be presented in gzip'ped form to whatever the user agent is using to render the content. When saving to disk, the user agent should use this information to name the file according to the conventions of the OS on which it is running. On Unix and Windows, the file should be named like 'xxx.txt.gz'.

On the other hand, when the server emits 'Transfer-Encoding: gzip', as in:
Content-Type: text/plain
Transfer-Encoding: gzip

it suggests to the user agent that the entity will be transferred in the format specified by Transfer-Encoding. The user agent, or the HTTP transport component of the user agent, should uncompress the entity and present the ungzip'ped form to the upper layer. When saving to disk, the agent should save it in the uncompressed form.
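
The two interpretations described above can be summarized in a small sketch of the receiving side. It models only the distinction made in this note (what the upper layer sees and how a saved file is named); it is not a description of any real browser, and the function name and header dictionary are assumptions.

```python
import gzip


def receive_entity(headers, body, base_name):
    """Return (bytes handed to the upper layer, suggested file name)."""
    if headers.get("Transfer-Encoding", "").strip().lower() == "gzip":
        # Transfer coding: the transport undoes it, the entity is plain text.
        return gzip.decompress(body), base_name
    if headers.get("Content-Encoding", "").strip().lower() == "gzip":
        # Content coding: the entity itself is a gzip'ped file.
        return body, base_name + ".gz"
    return body, base_name
```

For example, with 'Content-Encoding: gzip' and a base name of 'xxx.txt' the function returns the compressed bytes and 'xxx.txt.gz', while with 'Transfer-Encoding: gzip' it returns the uncompressed bytes and 'xxx.txt'.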

Following the RFC strictly will confuse existing browsers. Actually I don't know if any current browser supports Transfer-Encoding: gzip yet. Anyway modfilt_zlib will have to have an option to use Transfer-Encoding instead of Content-Encoding.