RPM Installation and Administration


This document describes how to download, install, and configure the RPM Web Server. It also contains a note about the log file structure and a typical list of related files.

Downloading

From the bottom up, the following things are required to build and run RPM:
  1. A Unix machine on the Internet. (Tested platforms include Linux 1.x and 2.x, HP/UX 9.0.x, SunOS 4.1.x, and Solaris 2.5.)
  2. Either an installed and running Web server that understands the Common Gateway Interface (CGI), or the standard inetd set up to invoke RPM for you. (The latter is what we do at CFHT.)
  3. A precompiled RPM binary.

Installation


You now have to choose between running RPM as an independent HTTP server on its own port number and running it as a CGI script under some other Web server. At CFHT, we run RPM in inetd mode on port 911. The binary is automatically installed in /usr/local/cfht/bin/ by the makefile, and /etc/inetd.conf needs to be set up manually to point there, as described below.

For running under `inetd'

You will have to add the following line to /etc/inetd.conf:
  rpm stream tcp nowait root /usr/local/bin/rpm /usr/local/bin/rpm
  
The file "rpm" should be a symbolic link in the file system pointing to whatever version you want to run. Next, add this line to /etc/services:
  rpm             911/tcp                 # rpm http server	
  
Change "911" to whatever port you want RPM to answer on; "80" is the standard HTTP port. Finally, send your inetd process the HUP signal so that it re-reads its configuration files (or, if desperate or lazy, reboot, and the changes will take effect for sure). Typical URLs with this scheme will look something like this:
  http://rpm.cfht.hawaii.edu:911/some.file.on.your.server.xyz

For running under another httpd as a CGI script

Copy the binary into your /cgi-bin directory, but rename the file to something that starts with "nph-". This little bit of filename magic tells the server not to attempt to parse RPM's headers, since it can handle all that itself. For example, install the file with
  cp httpserver-OSNAME /usr/local/httpd/cgi-bin/nph-rpm
This would allow you to access files through RPM with a virtual URL such as
  https://www.cfht.hawaii.edu/cgi-bin/nph-rpm/some.file.on.your.server.xyz

Configuration

The server should now run. Other features can be controlled by editing the file /etc/rpm.conf, as described in the sections that follow.

A Note about Security

Anyone who connects to RPM has the ability to not only read files in the specified directories, but also to execute programs, write files, and delete files on your system. If you are not careful when setting up RPM, you can open yourself up to some very big security holes, so read this section carefully.

/etc/rpm.conf: the {MIME-TYPE ... } section

NOTE: Many useful defaults are already compiled into the server. You can probably skip to the next section and take the defaults to get started...

Use this to tell the server what kind of file something is according to its extension. Suppose you wanted to add "*.htm" as a valid extension for an HTML file. You would add:
  {MIME-TYPE
    ...
    htm = text/html
    ...
  }
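
To make the extension lookup concrete, here is a minimal Python sketch of how a table like {MIME-TYPE ...} could be consulted. It is not RPM's code: the table contents, the function name mime_type_for, the case-insensitive match, and the "application/octet-stream" fallback are all assumptions made for illustration, and the defaults actually compiled into the server are not listed in this document.

```python
import os

# Illustrative table only; RPM's compiled-in defaults are not shown in this manual.
MIME_TYPES = {
    "html": "text/html",
    "htm": "text/html",   # the extension added in the example above
    "gif": "image/gif",
}


def mime_type_for(filename, table=MIME_TYPES, fallback="application/octet-stream"):
    """Return the MIME type for a filename, based only on its extension.

    Lower-casing the extension and the fallback type are assumptions.
    """
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return table.get(ext, fallback)


print(mime_type_for("welcome.htm"))
```

A file with no extension, or with one that is missing from the table, gets the fallback type in this sketch.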

/etc/rpm.conf: the {FEATURE ... } section

NOTE: The defaults built into the server for this section are probably good enough, so unless you want to experiment you can probably skip to the next section...

Not all browsers support the same features. The formats that browsers can accept, and the types of tricks they can do, were originally intended to be negotiated by headers specific to each feature (for example, if a client sends "Connection: Keep-Alive", it is supposed to mean that the client supports some kind of persistent connection mode). Unfortunately, this system is quite broken: many features have no useful or consistent corresponding header, and even the ones that do, like Keep-Alive, are not always to be trusted. Netscape 2.0, for example, claims that it can do Keep-Alive, but its implementation is broken. Because of this mess, the {FEATURE ...} section is used to tell RPM which versions of which browsers support certain features. The easiest way to override defaults in the {FEATURE ...} section is to set a particular feature to "*" (on for everyone) or to "X" (off for everyone). See the source file "rpm_config.cc" for more details.

Supported {FEATURE ...}'s


last-modified
Client properly uses "If-Modified-Since/Last-Modified" headers, and the server can safely tell the browser to "use cached copy" if the dates agree. This is set to "*" by default since no browsers are known to have any problems with this.

frames
This is a list of browsers that support Netscape's "frames".

javascript
This usually contains a list of browsers that have halfway-working support for JavaScript. If a browser is not included in the list, RPM will not bother to include JavaScript generated from ".rpm" files. JavaScript in normal HTML documents will still be sent out, however.

java
A list of browsers that support Java.

dynamic
A list of browsers that understand how to handle server-push dynamic documents. If a browser is not on this list, it will be sent a single snapshot frame of the page, rather than a self-updating display.

keep-alive
A list of browsers that RPM's keep-alive implementation is known to work with. Keep-Alive can speed up the loading of pages, especially if many small GIFs are involved. It is not, however, necessary.

/etc/rpm.conf: the {CONFIG ... } section

All other setup parameters for the server fall into the {CONFIG ...} section. Most can be omitted, and the server will use built-in defaults. The lists below cover all the parameters that can be overridden. Note that many are obscure and would not need to be changed under most circumstances.

The most useful {CONFIG ...}'s


local_access
Specifies the access level given to hosts connecting from the local domain. For full protection against name spoofing, RPM should be run with a tcp_wrapper, a firewall, or an enhanced inetd. Possible settings are:
  demo (the default): nobody can execute any ACTIONs, but server-side script tags are executed and all files are fair game and can be viewed
  none: completely shuts down the server
  authorized: prompts for a valid username and password before allowing any access, including ACTIONs
  open: no passwords; anyone can load and execute what they like, including ACTIONs

remote_access
Takes the same settings as local_access, but authorized is the default setting. This applies to all hosts outside of the local internet domain.
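
The four access levels and their two defaults can be summarized as a small lookup table. The Python sketch below is only a restatement of the descriptions above, not RPM's implementation. The names AccessLevel, LEVELS, and access_for are hypothetical, and how RPM decides whether a host is "local" is not modelled: the caller passes that in as is_local.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessLevel:
    serves_files: bool    # can documents be viewed at all?
    needs_password: bool  # is a username/password prompt required first?
    allows_actions: bool  # may ACTIONs be executed?


# One row per setting described for local_access / remote_access.
LEVELS = {
    "demo":       AccessLevel(serves_files=True,  needs_password=False, allows_actions=False),
    "none":       AccessLevel(serves_files=False, needs_password=False, allows_actions=False),
    "authorized": AccessLevel(serves_files=True,  needs_password=True,  allows_actions=True),
    "open":       AccessLevel(serves_files=True,  needs_password=False, allows_actions=True),
}

# Documented defaults: demo for local hosts, authorized for remote hosts.
DEFAULTS = {"local_access": "demo", "remote_access": "authorized"}


def access_for(config, is_local):
    """Pick the access level for a connection, given parsed {CONFIG} values."""
    key = "local_access" if is_local else "remote_access"
    return LEVELS[config.get(key, DEFAULTS[key])]


print(access_for({}, is_local=False))
```

With an empty configuration, a remote host is asked for a password and a local host gets demo access, which matches the defaults stated above.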

host
It is possible to have multiple {CONFIG ...} sections in the rpm.conf file. This can be useful if you want several servers to read the same global configuration file, or if a single server machine is hosting several virtual sites (even with IP aliasing!). If a config section contains a host=... setting, then it applies only to a server accessed at that hostname. In this way you can specify different document directories or access permissions for different servers in the same configuration file. To guarantee that this works with all browsers, you must use IP aliasing, but since most browsers send a "Host:" header, you can specify additional virtual hosts for those browsers without using IP aliasing.

port
This setting can be used to create virtual servers in the same way as the host setting, either instead of it or in combination with it.
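
As a rough illustration of virtual servers, the Python sketch below filters a list of {CONFIG} sections down to the ones that apply to a given request. It is an assumption-laden model rather than RPM's logic: each section is represented as a plain dictionary, a section without a host or port key is taken to apply everywhere, and the function name applicable_sections and the example hostnames are invented. How RPM combines several matching sections is not described in this document, so the sketch only selects them.

```python
def applicable_sections(sections, host, port):
    """Return the {CONFIG} sections that apply to a request for host:port.

    Assumption: a section with no "host" (or "port") key applies to every
    host (or port). Hostnames are compared without regard to case.
    """
    selected = []
    for section in sections:
        if "host" in section and section["host"].lower() != host.lower():
            continue
        if "port" in section and int(section["port"]) != int(port):
            continue
        selected.append(section)
    return selected


# Hypothetical configuration: one global section and two virtual servers.
sections = [
    {"webmaster": "me@my.machine.net"},
    {"host": "www.example.org", "default_dir": "/www/main"},
    {"host": "docs.example.org", "port": "8080", "default_dir": "/www/docs"},
]

for section in applicable_sections(sections, "docs.example.org", 8080):
    print(section)
```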

webmaster
Place the webmaster's email address here.

core_dir
If you don't set this one, or if you set it to some string that's not a real directory on your file system, then all signals will be trapped, and some kind of message will hopefully be sent to the client that something went wrong. It may be more useful, in some cases, to allow a core to be generated, in which case you should set this to the directory where the core should be left (for example "/tmp"). You can use the rpmcrashtest server directive to test core generation.

log_dir
Path to the directory where RPM will build an HTML-browsable log structure. The directory should be writable ONLY BY ROOT.

script_dir
Path to the directory where JavaScript and server-side script tags will be installed.

default_dir
Path to the "document root" directory on your server.

default_user
This should be an unprivileged username (like "nobody") to use when accessing pages in the default_dir.

default_home
Set this to something like "~/rpm" to automatically give ALL users RPM directories without having to add them to the SECURE section one-by-one.

default_index
A comma-separated list of filenames to look for in a directory before generating an automatic directory listing. Default value is "index.rpm,index.cgi,index.html".
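
The lookup that default_index controls can be pictured with the short Python sketch below. It assumes that the names are tried in the order listed and that the first one found wins, which seems the natural reading but is not stated explicitly here. The function name find_index is hypothetical.

```python
import os

DEFAULT_INDEX = "index.rpm,index.cgi,index.html"  # documented default value


def find_index(directory, default_index=DEFAULT_INDEX):
    """Return the path of the first index file found in directory, or None.

    Assumption: names are tried in list order and the first match wins.
    """
    for name in default_index.split(","):
        candidate = os.path.join(directory, name.strip())
        if os.path.isfile(candidate):
            return candidate
    return None  # the server would generate an automatic directory listing


print(find_index("."))
```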

local_list NEW in RPM 2.0.8
A comma-separated list of filename patterns which should never be served to a remote host, regardless of the setting of remote_access. Default is "/LocalAccess/*,/local_access/*".

log_list
A comma-separated list of filename patterns for directories that contain RPM log files. The default is "/logs/*".

cgi_list
A comma-separated list of filename patterns which should be executed as CGI scripts instead of being sent out normally. The default is "*.cgi,/cgi-bin/*". NOTE: This allows a CGI script to exist ANYWHERE on the server, so long as the filename ends in ".cgi".

cgi_nph_list
If and only if a filename was matched by cgi_list, it is then checked against this list. If it also matches this list, it is executed as a "no-parse-header" CGI script. The default is "*/nph-*".
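
The two-stage check performed with cgi_list and cgi_nph_list can be sketched in Python as follows. The exact pattern syntax RPM uses is not spelled out in this document, so the sketch assumes shell-style wildcards in which "*" matches any run of characters, including "/". The names matches_any and classify are hypothetical.

```python
from fnmatch import fnmatchcase

CGI_LIST = "*.cgi,/cgi-bin/*"   # documented default
CGI_NPH_LIST = "*/nph-*"        # documented default


def matches_any(path, pattern_list):
    """True if path matches at least one pattern in a comma-separated list."""
    return any(fnmatchcase(path, p.strip()) for p in pattern_list.split(","))


def classify(path, cgi_list=CGI_LIST, cgi_nph_list=CGI_NPH_LIST):
    """Return "static", "cgi", or "nph-cgi" for a requested path."""
    if not matches_any(path, cgi_list):
        return "static"       # cgi_nph_list is never consulted in this case
    if matches_any(path, cgi_nph_list):
        return "nph-cgi"      # executed as a no-parse-header script
    return "cgi"


for path in ("/index.html", "/users/jane/form.cgi", "/cgi-bin/nph-rpm"):
    print(path, classify(path))
```

Note that a file such as /docs/nph-notes.html stays an ordinary document in this sketch, because the nph pattern only matters once cgi_list has matched.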

Rarely Used {CONFIG}'s


verbose
If set to on, more information will be logged to syslog/cfhtlog. The default is off.

redirect_links
If set to on, a request for a URL that turns out to be a symlink to another file will result in an HTTP redirect, but only when the target is at the same directory level. The default is off.

timings
If set to on, an approximation of how long it takes each document to reach a user with a graphical browser will be logged, and displayed at the bottom of each document in fine print. If set to hidden, approximate transfer times will be logged, but not shown to the user. If set to off, no attempt will be made to time performance.

proxy
If set to on, the server will act both as a regular HTTP server and as a (non-caching) proxy server.

profile
If set to on, the server will enable profiling timers to measure server performance. A table of how long it took to complete various parts of document preparation and server setup will be included with the HTML sent to the browser. If set to off, you can still get most of the timings by appending the rpmprofile directive to a URL.

image_generation
The prefix with which all special paths that access the built-in server-side image generation begin. The default is "/imggen". You can't have a real file or directory by this name on your server.

icon_generation
The prefix with which all special paths that access the built-in server-side icon selection begin. The default is "/icongen". You can't have a real file or directory by this name on your server.

server_id
Don't change this. It identifies the version of the server that is running. The value that was precompiled into the binary is what we want to see here.

server_date_fmt
Internal stuff for the HTTP protocol that RPM uses. This is the format string that strftime should use for the dates in the HTTP headers. If you mess this up, some clients' last-modified caching might break!
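
For orientation, the Python sketch below formats a timestamp with the strftime pattern that produces the RFC 1123 date layout commonly used in HTTP headers. The value RPM actually compiles in for server_date_fmt is not quoted in this document, so treat the format string here as an assumption, and the function name http_date as hypothetical.

```python
import time

# RFC 1123 layout; an assumed stand-in for RPM's compiled-in server_date_fmt.
HTTP_DATE_FMT = "%a, %d %b %Y %H:%M:%S GMT"


def http_date(epoch_seconds, fmt=HTTP_DATE_FMT):
    """Format a Unix timestamp as an HTTP header date (always in GMT)."""
    return time.strftime(fmt, time.gmtime(epoch_seconds))


print(http_date(0))  # Thu, 01 Jan 1970 00:00:00 GMT
```

The day and month names come from the C library's locale, so a server that changes its time locale could emit dates that clients fail to parse.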

keep_alive_max
Maximum number of requests to serve sequentially on a single keep-alive connection. Default is 30.

keep_alive_timeout
If no new requests come in for this many seconds, break the connection during keep-alive mode. Default is 45. Note: This is different from RPM dynamic (server-push) documents, which keep a connection open indefinitely.

network_timeout
Pauses longer than this when reading data from the client will cause the server to close the connection. Default is 45.

exec_timeout
If set to 0 (the default), processes that the server forks (such as CGIs and handlers) have unlimited time to complete their functions. Otherwise, an error will result after the number of seconds you set, which guards against runaway processes.
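
The meaning of exec_timeout, where 0 stands for "no limit", can be shown with a small Python sketch. It only demonstrates the convention: RPM does not work this way internally as far as this document says, and the function name run_child and its return convention are invented for the example.

```python
import subprocess
import sys


def run_child(argv, exec_timeout=0):
    """Run a child process; return its exit status, or None if it timed out.

    exec_timeout == 0 means unlimited time, mirroring the documented default.
    """
    limit = None if exec_timeout == 0 else exec_timeout
    try:
        return subprocess.run(argv, timeout=limit).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return None


# A well-behaved child finishes normally when no limit is set.
print(run_child([sys.executable, "-c", "pass"]))
```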

bgcolor
The default background color to use for generated forms. For pegasus, this defaults to "#BBBFAA".

bgcolor_light
A lighter version of the background color for greyed-out items. Default is "#C0C0C0".

demo_head
Full path to a ".rpm" file to process and send before the body of a file in "demo" mode. This demo header should contain only valid tags normally found in the {BODY} section of RPM files.

demo_tail
Full path to a ".rpm" file to process and send after the body of a file in "demo" mode.

These {CONFIG}'s only apply to CFHT's version


CFHTLOGU
This defaults to "~/.,session.np". It tells the server what to store in the environment variable $CFHTLOGU before initializing the standard CFHT logging functions.

CFHTLOGS
This defaults to "/tmp/pipes/syslog.np". It tells the server what to store in the environment variable $CFHTLOGS before initializing the standard CFHT logging functions.

hform_netscape_exec
Complete path to the netscape executable to run when an hform is invoked.

hform_netscape_version
The version number of the Netscape being used, exactly as it appears in the _MOZILLA_VERSION X-property. The default is "3.0", but other possibilities are things like "2.01" or "4.0b6".

hform_default_dir
A directory that contains default copies of customized versions of the files "preferences" (to control some aspects of Netscape's appearance) and "Netscape" (to control still more aspects, through X-resources).

hform_new_window
The text to appear on the title bar of a new hform while it's busy loading the real thing.

hform_empty
The URL to load into an hform window while it's busy loading the real thing.

hform_http_server
The HTTP server where RPM is running. Note that settings in a user's .,net.par file will override this.

hform_version_prop
The name of the X-property that Netscape makes available to advertise its version. By default, Netscape uses "_MOZILLA_VERSION", but at CFHT it has been changed to "_MOZILLA_RPMVERS" so that hform will not interact with the user's own Netscape window, if they happen to be running it at the same time.

An example /etc/rpm.conf

All of the above may seem complex, but an /etc/rpm.conf file would rarely need to use most of these features. They are provided only to allow you to keep RPM up-to-date with the latest browsers without having to recompile the binary. A typical /etc/rpm.conf file can be pretty short:

{CONFIG
 local_access = authorized
 remote_access = none
 webmaster = me@my.machine.net
 log_dir = /usr/local/rpm/logs
 default_dir = /usr/local/rpm/documents
 default_home = ~/rpm
}

Note how all the other sections have been completely omitted, leaving only {CONFIG}. The example above allows anyone with a subdirectory called "rpm" off of their home directory to put out RPM documents on the server (default_home=~/rpm), and allows anyone with a valid user-id and password on the machine running RPM to get access to any of the pages. Passwords are sent unencrypted across the network. You have been warned.
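
For readers who want to inspect such a file with a script, the Python sketch below parses the layout used in the examples in this document: a line starting with "{NAME" opens a section, "key = value" lines fill it, and a line containing only "}" closes it. This grammar is inferred from the examples alone. Comments, quoting, and anything else the real parser in RPM may accept are not handled, and the function name parse_rpm_conf is hypothetical.

```python
def parse_rpm_conf(text):
    """Parse '{NAME ... }' blocks of 'key = value' lines into (name, dict) pairs.

    Sections are returned in file order, so repeated {CONFIG} sections are kept.
    """
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("{"):
            current = (line[1:].strip(), {})
            sections.append(current)
        elif line == "}":
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[1][key.strip()] = value.strip()
    return sections


EXAMPLE = """\
{CONFIG
 local_access = authorized
 remote_access = none
 webmaster = me@my.machine.net
 log_dir = /usr/local/rpm/logs
 default_dir = /usr/local/rpm/documents
 default_home = ~/rpm
}
"""

for name, values in parse_rpm_conf(EXAMPLE):
    print(name, values)
```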

Log File Structure

If the directory specified in {CONFIG log_dir="..."} is writable by root, RPM will create an HTML-browsable log of all transactions in this directory. You can point any browser at this directory locally to see the logs, or you can load them through RPM if a link called "logs" that points to the directory has been set up in {CONFIG default_dir="..."}. In the latter case, RPM will update the log screen dynamically if you leave it up. Accesses to the /logs/ directory are not themselves logged, to prevent obvious feedback problems.

File List

Here are some typical places RPM-related files might show up with the installation we have at CFHT:
/etc/services
/etc/inetd.conf
/etc/rpm.conf
/usr/local/cfht/bin/rpm-neptune -> (sym-link to appropriate version)
/usr/local/cfht/conf/rpm/home/*  (document root)
~user/rpm/*                      (each user's document root)

1996 October 23 by Sidik Isani
isani@cfht.hawaii.edu