Orenosv HTTP/FTP Server

Features of Orenosv

Optimized for Windows NT/2000

I was a user of Win32 Apache 1.3 when it was still in beta. At that time it was unstable and its performance wasn't impressive. This was probably because 1) Apache 1.x was heavily based on Unix code, and 2) the Win32 version of Apache had to support both Windows NT and Windows 95/98.

I started writing an experimental HTTP server that uses Windows NT's I/O completion ports (IOCPs) and the TransmitFile API, and I have been improving it on and off for about two years. There is nothing extraordinary about it: it just uses NT-only APIs for network operations and won't run on Windows 95/98/Me.

No performance comparison with any version of IIS has been done, but I expect IIS to easily outperform Orenosv; one reason is that IIS caches static files in memory. I also think that Apache 2.0 will be optimized for each platform it supports, so once Apache 2.0 becomes stable on NT/2000, there may be no performance advantage. But who knows at this point...

In addition to performance benefits, Orenosv also fully takes advantage of various Windows NT security features. Details will be given below.

Minimum resource usage when sending static files

Static files are served with asynchronous TransmitFile(), thus effectively only one thread is active even when there are hundreds of users downloading files.

On NT Workstation and 2000 Professional, there is a restriction that only two concurrent executions of TransmitFile() are allowed. When Orenosv runs out of available TransmitFile calls on those OSes, it uses a single dedicated asynchronous sendfile thread to handle concurrent clients. This dedicated sendfile thread is also used when the HTTP connection is in bandwidth control mode. Note that when the sendfile thread is used, a single request needs two threads to run to completion, incurring a performance penalty.
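
The fallback rule above can be sketched roughly as follows. This is only an illustration in Python, not Orenosv's code: the class name `SendDispatcher`, the `fast_slots` parameter, and the queue-fed worker are hypothetical, and the "fast" path merely stands in for an asynchronous TransmitFile() call.

```python
import queue
import threading

class SendDispatcher:
    """Pick a limited fast path or one shared fallback sender thread."""

    def __init__(self, fast_slots=2):
        # Models the limit of two concurrent calls on workstation editions.
        self._slots = threading.BoundedSemaphore(fast_slots)
        self._fallback = queue.Queue()
        self.log = []  # records which path served each request
        worker = threading.Thread(target=self._sendfile_thread, daemon=True)
        worker.start()

    def submit(self, name, throttled=False):
        # Throttled connections always go through the sendfile thread.
        if not throttled and self._slots.acquire(blocking=False):
            self.log.append((name, "fast"))
            return "fast"
        self._fallback.put(name)
        return "fallback"

    def complete_fast(self):
        # Called when a fast-path transfer finishes, freeing its slot.
        self._slots.release()

    def _sendfile_thread(self):
        # One thread serves every queued transfer, one after another.
        while True:
            name = self._fallback.get()
            self.log.append((name, "fallback"))
            self._fallback.task_done()

    def drain(self):
        # Block until the sendfile thread has handled everything queued.
        self._fallback.join()
```

With `fast_slots=2`, a third concurrent request is handed to the fallback thread, and it takes the fast path again only after `complete_fast()` frees a slot.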

Supports PHP and Perl using ISAPI extension DLLs

I have been using the ISAPI version of PHP4 on this server for some time now. ActiveState's PERLIS also works on the Orenosv server. Other ISAPI extensions have not been tested.

Separate processes for executing ISAPI extensions

Actually, not only ISAPI-handlers but also most handler modules can be executed in a different process from that of the web server.

Orenosv has a concept of process groups, each of which has a predefined number of prespawned processes. If a handler is specified to be executed in one process group, a request for the handler invocation will be routed to and executed in one of those processes in the process group.

The following attributes can be specified for a process group:

concurrency level of handler executions
Each process is multi-threaded and thus capable of executing several requests simultaneously. This attribute can be used to restrict that concurrency level.
reuse limit
A process can exit after the specified number of handler executions.
idle timeout
A process can exit after the specified period of idle state.
OS user in whose security context a process executes
A process can execute in security context of an OS user that is different from that of the main process.
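
To make these attributes concrete, here is a small Python sketch of how a process group and its retirement rule might be modeled. The names `ProcessGroup` and `should_exit` and all field names are hypothetical; Orenosv's real configuration syntax and internals are not shown here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProcessGroup:
    workers: int                          # number of prespawned processes
    concurrency: int                      # max simultaneous executions per process
    reuse_limit: Optional[int] = None     # exit after this many executions
    idle_timeout: Optional[float] = None  # exit after this many idle seconds
    os_user: Optional[str] = None         # None means the main process's user

def should_exit(group, executions, idle_seconds):
    """Return True when a worker process should retire itself."""
    if group.reuse_limit is not None and executions >= group.reuse_limit:
        return True
    if group.idle_timeout is not None and idle_seconds >= group.idle_timeout:
        return True
    return False
```

A group with neither limit set never retires its workers in this model; whether Orenosv behaves the same way by default is an assumption.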

Let's say you want to use a PHP extension which is known to be thread-unsafe. In this case you can run the PHP ISAPI handler in a process group whose concurrency level is 1. If a PHP extension or any ISAPI extension leaks memory or handles, you can run it in a process group that recycles its processes once every 200 executions.

Another example is an ISAPI DLL that changes per-process resources. For example, PERLIS seems to set the CRT locale to the OS default, which on my OS is "Japanese_Japan.932", while it was originally set to "C" (PERLIS is probably calling setlocale(LC_ALL, "")). This may cause trouble in other ISAPI DLLs that depend on the CRT locale being "C". There is no real solution to this kind of problem other than separating the processes in which they are executed.

These attributes can be specified per process group. Therefore you can partition your applications into different process groups, each with a different number of worker processes, OS user context, and other attributes.

 Example:

  process group 1 : 1 worker process, concurrency of 8, os user of wwwguest
  process group 2 : 8 worker processes, concurrency of 1, os user of phpgw

  *.php                -> process group 1
  /phpgroupware/*.php  -> process group 2
  (note: path pattern matching is done in reverse order of appearing)
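
The reverse-order matching noted above can be illustrated with a short Python sketch. It assumes glob-style patterns as implemented by Python's `fnmatch` module, which may differ from Orenosv's actual pattern syntax; `RULES` and `route` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Rules in configuration order, taken from the example above.
RULES = [
    ("*.php", "process group 1"),
    ("/phpgroupware/*.php", "process group 2"),
]

def route(path, rules=RULES):
    """Return the target of the last-listed rule whose pattern matches."""
    for pattern, target in reversed(rules):
        if fnmatchcase(path, pattern):
            return target
    return None  # no handler mapping applies
```

Because later rules are tried first, the more specific /phpgroupware/ rule wins even though *.php also matches the same path.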

Running handlers in a process group will definitely have overhead compared to running them "in-process", so there is a trade-off that you have to consider. The overhead of a multi-process configuration is not low in the current implementation. Please see detail. The implementation will be changed if there's enough demand for it.

Buffered output of dynamic contents

Orenosv is able to buffer output of dynamic content-handlers. It can buffer either 1) up to a specified limit in memory (partial buffering), or 2) all the output, spilling to disk as necessary (full buffering). The buffering can also be disabled completely (no buffering or unbuffered).
The purpose of case 1) is obvious. The purpose of 2) is to avoid pipelining of data from the content generator to the web browser.
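
As a rough sketch of full buffering (case 2), Python's standard `tempfile.SpooledTemporaryFile` keeps data in memory up to a limit and then spills to a temporary file. The function name `buffer_fully` and the 64KB default are hypothetical, and this is not how Orenosv itself is implemented.

```python
import tempfile

def buffer_fully(chunks, memory_limit=64 * 1024):
    """Collect all handler output, spilling to disk past memory_limit."""
    buf = tempfile.SpooledTemporaryFile(max_size=memory_limit)
    length = 0
    for chunk in chunks:
        buf.write(chunk)
        length += len(chunk)
    buf.seek(0)  # rewind so the sender can read from the start
    return buf, length  # the total length is known before sending begins
```

Since the handler has finished before any byte goes out, the total length is available up front and the send can proceed at whatever pace the network allows.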

Between the content generator and the web browser, we have a path like this:

content-generator -> web server -> [server-up-link] ->
the Internet -> [client-down-link] -> web browser

Pipelining output has an adverse effect on the server side when this path is slow. The path will be slow when there's a bottleneck somewhere along the wire, such as users on dial-up modems, your limited bandwidth to the ISP, or network congestion somewhere between your server and your users' PCs.

It is very often the case that a content-generator needs many resources during its execution, such as threads, processes, database connections, temporary files, and work memory. Thus it is important that a content-generator finishes its work as soon as possible and never blocks on a slow network connection.

By buffering all of the output in the web server, pipelined data is converted to a static (temporary) file transfer, which the web server can finely control for throttling, etc. Another benefit is that a connection that would otherwise have to be closed because of unknown content length can be kept alive. (If the client is HTTP/1.1 and Orenosv has to do partial buffering, Orenosv will automatically send the generated output in HTTP/1.1 chunked encoding, instead of closing the connection.)
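
For reference, HTTP/1.1 chunked encoding frames each piece of output with its length in hexadecimal and ends the body with a zero-length chunk. The following Python generator is a minimal sketch of that framing (without trailers); the function name `chunked` is hypothetical.

```python
def chunked(chunks):
    """Yield HTTP/1.1 chunked-encoding frames for an iterable of bytes."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would terminate the body early
            yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker, no trailers
```

This lets the server keep the connection open without knowing the total content length in advance.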

Output Filters for HTTP compression and regex-based content rewriting

Orenosv has an output filter mechanism. An output filter is a module that accepts output from any handler (content-generator) and processes it in a way that is specific to the filter. The filter then sends it out to the network. Additionally, you can stack several filters in a line, forming a chain of filters.

As an application of this mechanism, Orenosv has a builtin module that applies HTTP compression to the output of any handler. Using this filter you can compress dynamic content as well as static files on the fly. The compression filter uses the zlib library.

There is also another output filter that dynamically rewrites parts of the content (i.e. the HTTP response body) based on regular-expression rewriting rules. This filter uses the PCRE (Perl Compatible Regular Expressions) library for regex processing. Actually this filter is implemented as a submodule for Filter Extender.
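
The idea of a filter chain can be sketched in Python as generators stacked on top of each other. This is an illustration only: the function names are hypothetical, Python's `re` module stands in for PCRE, and the HTTP header handling that real compression needs (such as Content-Encoding) is left out.

```python
import re
import zlib

def regex_filter(pattern, replacement):
    """Return a filter that rewrites the whole body with one regex rule."""
    compiled = re.compile(pattern)
    def run(chunks):
        # Join first so that a match can span chunk boundaries.
        yield compiled.sub(replacement, b"".join(chunks))
    return run

def deflate_filter(chunks):
    """Compress a stream of chunks incrementally with zlib."""
    comp = zlib.compressobj()
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:
            yield out
    yield comp.flush()

def apply_filters(chunks, filters):
    """Stack filters so each one consumes the previous one's output."""
    for f in filters:
        chunks = f(chunks)
    return chunks
```

Order matters in such a chain: rewriting has to run before compression, since a regex cannot meaningfully match against compressed bytes.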

Currently output filters are executed in main process only. The main process collects outputs from handlers running in another process, applies filters, and then sends them out.

Please note that output filters are different from ISAPI filters, and Orenosv does not currently support ISAPI filters. It seems to me that the ISAPI filter mechanism is a set of hooks, rather than a systematic filtering mechanism.

Bandwidth Control (Throttling)

Orenosv lets you create bandwidth groups, each of which is assigned a fixed network output rate (bytes per second). An HTTP request is associated with a bandwidth group if the request's URL matches the group's URL pattern. The response to the request is then sent at a network output rate controlled by the associated bandwidth group.

If there are N requests (that are in the sending phase) in a bandwidth group of 32KB per second, the 32KB/sec bandwidth is distributed among those N requests. Note that by default the bandwidth is not fairly distributed to each request. Rather, packets are sent on a first-come-first-served basis. (Please see Users Guide for more discussion on this.)
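
One simplified reading of first-come-first-served distribution is sketched below in Python: each second, the group's byte budget is spent on queued sends in arrival order, so an early request can use up the whole budget. The function name, the 8KB packet size, and the one-second tick are assumptions made for the illustration; Orenosv's actual scheduler may differ.

```python
from collections import deque

def simulate_second(pending, rate, packet=8 * 1024):
    """Spend one second's byte budget on queued sends, in arrival order.

    pending: deque of [request_id, bytes_left] lists, oldest first.
    Returns {request_id: bytes_sent_this_second}.
    """
    budget = rate
    sent = {}
    while pending and budget > 0:
        req = pending[0]
        n = min(packet, req[1], budget)
        req[1] -= n
        budget -= n
        sent[req[0]] = sent.get(req[0], 0) + n
        if req[1] == 0:
            pending.popleft()  # this response is fully sent
    return sent
```

With a 32KB/sec group, a 40KB response queued ahead of a 10KB one takes the entire first second, and the smaller response is only served in the second one.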

Please note that bandwidth control of dynamically generated content must be used in conjunction with full output buffering. The generated content is written out to disk (if it overflows the memory buffer) and a single sendfile thread handles asynchronous sends for all throttled connections. Static files are, of course, sent the same way, served directly from disk.

This technique is absolutely necessary to handle many hundreds (or thousands) of throttled connections, because otherwise a bandwidth-controlled server thread would tie up precious server resources while trickling output. Bandwidth throttling is available in most HTTP servers (like Apache), but in many of them a thread (or a process) simply pauses between sends.

OS-level Security

Orenosv user authentication and authorization can optionally be integrated with the Windows operating system.

OS-level authentication
For both HTTP and FTP, users can be authenticated against the Windows account database.
OS-level authorization (NTFS-ACL)
For the FTP service, in addition to Orenosv-level access control, file access by OS-authenticated users can also be authorized by the Windows operating system. Files need to be on a local or remote NTFS-formatted drive because the OS uses the ACL feature of NTFS.
Per-process security attribute
For the HTTP service, both CGI processes and worker processes in process groups can be executed in the security context of an administrator-specified OS user. Combined with Orenosv's flexible handler mapping, the administrator can specify exactly which applications run in which OS user context.

Automatic death-detection and restart

The Orenosv main server is monitored by its parent, the Orenosv admin server. If the main server causes an access violation, it dies fast, and the admin server restarts it as soon as it detects the death. "Hang" detection will also be implemented.
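
The parent/child monitoring pattern can be sketched in a few lines of Python. This is a generic illustration, not Orenosv's admin server: the function name `supervise` and the `max_restarts` bound are hypothetical, and any non-zero exit code is treated as a crash.

```python
import subprocess
import sys

def supervise(cmd, max_restarts=3):
    """Run cmd and restart it whenever it exits abnormally.

    Returns the number of times the child was started.
    """
    starts = 0
    while True:
        starts += 1
        code = subprocess.call(cmd)  # blocks until the child exits
        if code == 0 or starts > max_restarts:
            return starts

if __name__ == "__main__":
    # A child that always "crashes" is started 1 + max_restarts times.
    crash = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(crash, max_restarts=2))
```

Detecting a hang would need something more than waiting for the child to exit, for example a periodic heartbeat, which this sketch does not attempt.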

Support for IP Version 6

Orenosv supports use of IPv6 TCP connections for both HTTP and FTP. It can accept requests from both protocols simultaneously.

Simple Virtual Hosts

Orenosv supports simple name-based virtual hosting. For each virtual host, the administrator specifies a virtual path to which the virtual host is mapped. Orenosv converts a URL that specifies a virtual host to a URL that has the mapped path.

Let's say virtual hosts are configured as follows:

                 [vhost name]  [URL to map to]
http_vhost_dir = virthost-1    /vhosts/virthost-1
http_vhost_dir = virthost-2    /vhosts/virthost-2
http_vhost_dir = virthost-3    /vhosts/virthost-3
http_vhost_dir = vhost3-alias  /vhosts/virthost-3

If a client sends a request that specifies a virtual host,

http://virthost-1/abc/def.html

then the URL is converted to the following

http://myserver/vhosts/virthost-1/abc/def.html

From then on, all processing is done using this non-vhost path, except that "server variables" (i.e., CGI environment variables or ISAPI GetServerVariable) return values that are appropriate for the request's virtual host. The following variables will have virtual-host-specific values.

SERVER_NAME,REQUEST_URI,SCRIPT_NAME,URI... : values that you would expect from a virtual host server.
DOCUMENT_ROOT : the filesystem path that corresponds to non-vhost virtual path. In the above example, whatever the filesystem path that /vhosts/virthost-1/ is mapped to.
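
The mapping described above can be sketched in Python as follows. The table mirrors the http_vhost_dir example; the function name `map_request` and the returned dictionary are hypothetical, and only two of the server variables are modeled.

```python
# Virtual host name -> virtual path, as in the http_vhost_dir example.
VHOST_DIRS = {
    "virthost-1": "/vhosts/virthost-1",
    "virthost-2": "/vhosts/virthost-2",
    "virthost-3": "/vhosts/virthost-3",
    "vhost3-alias": "/vhosts/virthost-3",
}

def map_request(host, path, vhost_dirs=VHOST_DIRS):
    """Return (internal_path, server_vars) for a request's host and path."""
    name = host.lower()
    prefix = vhost_dirs.get(name, "")  # unknown hosts are left unmapped
    internal_path = prefix + path
    server_vars = {
        "SERVER_NAME": name,
        "REQUEST_URI": path,  # the application still sees the vhost URI
    }
    return internal_path, server_vars
```

The server works with `internal_path` internally, while the application sees the values in `server_vars`, so the rewrite stays invisible to it.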

In short, from the web server administrator's perspective, he/she doesn't need to pay much attention to virtual hosting, except for the http_vhost_dir configs. On the other hand, from the application developer's perspective, the server is fully operating in virtual hosting mode.

Since this vhosting functionality is kept very simple, Orenosv will be able to change virtual host configs dynamically, without restarting the server. This will be implemented in a future version.

Integrated FTP service

Orenosv has a builtin FTP service in the main process.

User authentication database can be shared between HTTP and FTP services.
IPv6 enabled
FTPS (FTP over SSL) support

For more detail, see ftp_en.html.

Minimally dependent on Windows OS

Orenosv runs on Windows NT/2000/XP, so how can it be minimally dependent on Windows?

Orenosv's core functionality depends only on the core OS facilities of Windows NT-based operating systems that we all know to be reliable, versatile and secure. These facilities include robust process/thread/memory/file management and the TCP/IP stack. For its core functions, it does not use higher-level components like CryptAPI (encryption and SSL) or WinInet (HTTP/FTP protocol handling). Of course there are many parts of Orenosv that depend on components like CryptAPI, WinInet and Windows Security, but they are there for optional functionality that can easily be disabled or replaced with other methods.

There are many advantages of being minimally dependent on Windows.

Availability of Orenosv is not tied to Windows versions.
Look at IIS versions. You cannot run IIS 6 on Windows 2000 or run IIS 5 or 6 on Windows NT 4.0. You have to upgrade the whole operating system just to obtain a minor functional upgrade.
More controllable, more predictable for the Orenosv author.
One notable example in Windows XP is the CryptAPI/Schannel API. Microsoft decided to install the "update of root certificates" component in every default Windows XP installation. When this component is installed, any application that uses CryptAPI/Schannel for its SSL processing will attempt to contact a Windows update site and install the certificate of any root CA that Microsoft deems trustworthy, all without any consent from the user.
Microsoft made Windows XP not just an operating system, and not just an integrated package of an operating system and a set of Internet applications, but an integral part of Microsoft's Internet services infrastructure.

Maybe these things are meant to give a regular consumer Windows user a seamless Internet experience, but for people who use Windows as a plain tool, they just cause more trouble than necessary. These are the reasons why Orenosv tries to be self-contained, using written-from-scratch libraries and open source software like OpenSSL and zlib.

Other functionalities

Support for large files
Can transfer files larger than 2GB. Can also handle request bodies over 2GB.
IP-based access-control
file-based authentication and authorization, allows multiple realms.
limited support for SSI (server side includes)
Implemented as a CGI executable. See details.

Features/Improvements planned

In addition to general code improvements/cleanup/bug fixing, the following are planned.

Minor but necessary features

To allow each script to specify output buffering mode and/or buffer size.
builtin log rotation

Output Caching

Buffered output is already on disk, so why not preserve it and use it as a cache? Here's why not: 1) In general, caching is applicable to very little dynamic content; the output of most applications cannot be cached without a proper invalidation model. 2) Caching at the application level would be a much better solution, because the application knows exactly what and when to cache, and the app-level cache can also be used for other purposes or by other applications. Caching by the application framework is also a good choice, because an application usually assembles a page from several templates which the application framework generates.

Still, I'm considering adding caching features, because some applications, like report generation or thumbnail generation, don't need a strict invalidation model, just a simple expiry time. It is important to note that stateful (session-based) pages can never be cached. When building dynamic pages, you should try to distinguish between stateful and stateless pages so that the stateless ones remain cacheable.
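
A cache with a simple expiry time, as opposed to a real invalidation model, could look something like the Python sketch below. Since the feature is only planned, everything here is hypothetical: the class name `ExpiryCache`, keying by URL, and the injectable clock (used to make the behavior easy to test).

```python
import time

class ExpiryCache:
    """Cache keyed by URL whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # url -> (expires_at, body)

    def get(self, url, generate):
        """Return the cached body, regenerating it once it has expired."""
        now = self.clock()
        hit = self._entries.get(url)
        if hit is not None and now < hit[0]:
            return hit[1]
        body = generate()  # run the (expensive) content generator
        self._entries[url] = (now + self.ttl, body)
        return body
```

Only stateless pages should go through such a cache; a session-based page keyed by URL alone would be served to the wrong user.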