CFP(4) manual page

Name

cfp - configuration file syntax and usage

Overview

Many of our software packages require us to maintain non-volatile configuration or state information between program invocations. This information can be hardware configuration, user parameters which are sticky from invocation to invocation, run time control data extracted from a program to reduce maintenance and reconfiguration time, etc.

Standard answers were needed for many specification, design, and implementation problems such as file naming conventions, file permission checking, file syntax and semantics, error handling, etc. A library with low maintenance code that could be read by all members of the team was required.

A set of routines collectively called the ’configuration file parser’ was created to manage this ’sticky’ data. It was then discovered that an extension to these routines could handle command line parsing as well. These additional routines are available but are not required for applications that are interested in the original purpose. Our form handlers also tie into this package to allow smarter and friendlier programs with a minimum of rewrite. Most recently an I/O configuration standard has been created using these same tools.

All of the configuration file parse routine names start with cfp_. All defines can be found in <cfht/cfp.h>. The library to use is libcfht.a. The configuration files themselves are called par files, and their names typically end with a .par suffix.

Argument Parsing

A very common use of par files is handling sticky command line arguments. A simple program is found below:

#include <cfht/cfp.h>

extern double dojob(double val);    /* application-specific work */

int
main(int argc, char **argv)
{
    FILE_ID id;
    double val;

    /* handle user input */
    id = cfp_parse(CFP_PAR, argc, argv);
    /* get current value of required variable */
    cfp_getcur(id, "val", (void *) &val, CFP_DOUBLE, sizeof val);
    /* do some custom work */
    val = dojob(val);
    /* custom work changed value, so save new value */
    cfp_putcur(id, "val", (void *) &val, CFP_DOUBLE);
    /* save changes since parse in real file */
    cfp_putf(id);    /* needed to save putcur update */
    return 0;
}
The simplest way to handle a program such as the one above is to name the parameter file the same as the program and append a .par suffix. For example, if the program were named dojob, then the parameter file would be called dojob.par. When using the cfp tools in this manner, many activities are automatic. Master versions of the par files are maintained in a single directory and copied to the user’s directory the first time they are updated.

The contents of dojob.par when val is equal to 10 might look like:

val = { r, 10 } # current value

The syntax of the val line above is quite powerful. The 10 is the current value of the val option. The current value is always found in the second field. The r in the first field tells cfp_parse(3) that val can be updated (replaced) in dojob.par so that the next invocation of the program also sees the new value. If the user invoked the dojob program three times in succession, starting with the above dojob.par file, as follows:

dojob

dojob -val 20

dojob

the value of val would start out as 10, go to 20, and stay at 20.

File Format

par files are ASCII files with a relatively free format. Blank lines are ignored. Spaces and TAB characters can be used freely between fields on a line. Comments can be put on any line by using the # character; the # and all characters to the end of that line are ignored. There is no mechanism for line continuation. In all of the following discussion, it will be assumed that all comments and blank lines have already been removed; any format described below can be followed by a comment.
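The comment rule can be illustrated with a minimal sketch in C. This is not the libcfht implementation; the function name strip_comment is hypothetical. It also assumes that a # which is backslash-escaped or inside double quotes does not start a comment, as the constant-line rules on this page suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libcfht: truncate `line` in place at the
 * first '#' that is neither backslash-escaped nor inside double quotes,
 * then trim trailing spaces, tabs, and line terminators. */
void strip_comment(char *line)
{
    int in_quotes = 0;
    size_t i, end;

    assert(line != NULL);
    for (i = 0; line[i] != '\0'; i++) {
        if (line[i] == '\\' && line[i + 1] != '\0') {
            i++;                        /* skip the escaped character */
        } else if (line[i] == '"') {
            in_quotes = !in_quotes;     /* enter or leave a quoted run */
        } else if (line[i] == '#' && !in_quotes) {
            break;                      /* the comment starts here */
        }
    }
    end = i;
    while (end > 0 && strchr(" \t\r\n", line[end - 1]) != NULL)
        end--;
    line[end] = '\0';
}
```

A line that is empty after this step counts as a blank line and would simply be skipped.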

Non-blank, non-comment lines come in three forms: constant lines, variable lines, and alias lines. Typically the values assigned on constant lines do not change whereas the values assigned to variables do change. Alias lines provide alternate (possibly variable) names for other lines. The general form for a line is:

<name> <assignment> <value>

Each part is described in detail below.
<name>
A name labels a line for reference from a program or the command line. Each name in a par file must be unique within that par file. References to a name from a program or on a command line may use any unique abbreviation of the name. Names can be any number of characters chosen from the letters, numbers, and the special characters per cent, dollar sign, ampersand, period, underscore, plus, and left and right square brackets (%$&._+[]). Per cent and dollar sign are not recommended because they have additional meanings. Case is significant. Legal names include "Xwindows", "Filter1", "+HELP_+f28", "2Door", and "filtername[B]". However, for left and right square brackets, see the note below about ranges.
A program can access the names in a par file in any order. A hash table is used internally, so there is no preferred access order (in earlier versions, sequential access was faster).
<assignment>
Constant and variable lines are defined by the presence of an equal sign (=) as the assignment symbol. Alias lines are defined by the sequence -> as the assignment symbol.
<value>
The form for value depends on which type the line is.
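The three line forms can be told apart from the assignment symbol and the first character of the value. The following C fragment is a sketch of that classification only, assuming comments have already been removed. The names par_kind and classify_line are hypothetical and are not part of libcfht.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum par_kind { PAR_BLANK, PAR_CONSTANT, PAR_VARIABLE, PAR_ALIAS, PAR_INVALID };

/* Letters, digits, and the special characters % $ & . _ + [ ] are legal. */
static int is_name_char(int c)
{
    return c != '\0' &&
           (isalnum((unsigned char)c) || strchr("%$&._+[]", c) != NULL);
}

/* Hypothetical helper: classify one comment-free par file line. */
enum par_kind classify_line(const char *s)
{
    assert(s != NULL);
    while (*s == ' ' || *s == '\t') s++;
    if (*s == '\0') return PAR_BLANK;
    if (!is_name_char(*s)) return PAR_INVALID;
    while (is_name_char(*s)) s++;               /* consume <name> */
    while (*s == ' ' || *s == '\t') s++;
    if (s[0] == '-' && s[1] == '>') return PAR_ALIAS;
    if (*s != '=') return PAR_INVALID;
    s++;                                        /* consume '=' */
    while (*s == ' ' || *s == '\t') s++;
    /* A value that opens with a left brace is a variable line. */
    return *s == '{' ? PAR_VARIABLE : PAR_CONSTANT;
}
```

Because neither = nor - is a legal name character, the end of the name and the start of the assignment symbol are unambiguous even without separating spaces.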

Constant Lines

The value for a constant begins with the first non-blank character following the equal sign and continues through the last non-blank character before the end of line (remember that comments have already been eliminated). If a double quote (") is the first non-blank character of the value, it and the next unescaped double quote on the line are not included in the value. To obtain a double quote as part of such a value, it must be escaped by preceding it with a backslash (\). Any other double quotes are part of the value. If any of the characters in the value is a #, backslash, newline, or return, each instance must be preceded by a backslash. The # does not need to be escaped inside paired double quotes. Spaces and tabs inside the value are maintained and do not have to be quoted; leading or trailing spaces and tabs must be quoted if they are to be kept as part of the value. If a constant line value starts with a double quote when the par file is read in (as in D or F below), the entire value will be enclosed in double quotes when the file is output. A constant value may not begin with an unquoted left brace ({).

The following lines show legal constant lines with the actual value shown as a comment, between hyphens which are not part of the value. The left "column" shows the file version of the definition; the comment shows what a program sees as the value.

A = # --
B = "" # --
C = The character "a" # -The character "a"-
D = "A quoted string" # -A quoted string-
E = one " # -one "-
F = "leading \"ed" with more # -leading "ed with more-
G = specials \# \\ \n \r # -specials # \ \n \r-
H = inside spaces # -inside spaces-
I = no leading spaces # -no leading spaces-
J = " leading spaces " # - leading spaces -

Note that the "\n" and "\r" in the value for G are real newline and return, respectively.
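The decoding of a constant value, as described above, can be sketched in C. This is an illustration written from the rules on this page, not the library's own code. The name decode_constant is hypothetical, and its input is assumed to be the text after the equal sign with any comment already removed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decode the raw text of a constant value into `out`.
 * Returns the decoded length, or -1 if `out` is too small. */
int decode_constant(const char *raw, char *out, size_t outsz)
{
    size_t len, i = 0, n = 0;
    int drop_close = 0;     /* set while the closing quote is still pending */

    assert(raw != NULL && out != NULL && outsz > 0);
    while (*raw == ' ' || *raw == '\t') raw++;          /* leading blanks */
    len = strlen(raw);
    while (len > 0 && (raw[len - 1] == ' ' || raw[len - 1] == '\t'))
        len--;                                          /* trailing blanks */

    if (len > 0 && raw[0] == '"') {     /* opening quote is not in the value */
        drop_close = 1;
        i = 1;
    }
    for (; i < len; i++) {
        char c = raw[i];
        if (c == '\\' && i + 1 < len) {
            c = raw[++i];               /* take the escaped character */
            if (c == 'n') c = '\n';
            else if (c == 'r') c = '\r';
        } else if (c == '"' && drop_close) {
            drop_close = 0;             /* matching close quote: drop it */
            continue;
        }
        if (n + 1 >= outsz) return -1;
        out[n++] = c;
    }
    out[n] = '\0';
    return (int)n;
}
```

Applied to the value text of lines D, F, and G above, this sketch yields the strings shown between hyphens in their comments, with the \n and \r in G becoming real newline and return characters.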

Variable Lines

The value for a variable begins with a left brace ({) and continues through the matching right brace (}). Within the braces there can be up to five fields: the mode field, the current value field, the default value field, the supplied value field, and the range field. These fields are separated by commas. Rightmost fields that are not used for a particular variable instance may be omitted, including their preceding comma. Other fields may be omitted by just including their following comma. When all fields are specified, a variable line looks like:

<name> = { <mode>, <current>, <default>, <supplied>, <range> }

Each field is delimited by spaces, tabs, the preceding and/or following comma, and the preceding and/or following brace. If any of these characters are part of the desired value for a field, the field must be enclosed in double quotes ("). If any of the characters in a field is a #, the field must be enclosed in double quotes. If any of the characters in a field is a double quote, backslash, newline, or return, the field must be enclosed in double quotes and each instance of the mentioned characters must be preceded by a backslash. When a par file is output by a program using cfp_putf(3) , fields are enclosed in double quotes only when required.

<mode>

The mode field gives actions to be done on the current value when it is parsed, read, written, or output (with cfp_parse(3) , cfp_getcur(3) , cfp_putcur(3) , or cfp_putf(3) , respectively). Each action is specified by the single character given below. These characters can appear in any order in the mode field, but the preferred order is alphabetical (dmnors), and the cfp routines will always output them in that order.
o
When cfp_parse(3) encounters -name while parsing a command line, a mode character of o indicates that a value is optional (e.g., -val 10, or -val are both legal). If neither o nor n is given in the mode field, -name without a value is invalid. It is always valid for -name to be omitted from the command line.
n
When cfp_parse(3) encounters -name while parsing a command line, a mode character of n indicates that a value is not allowed (i.e., -val 10 is not legal, but -val is). If neither o nor n is given in the mode field, -name without a value is invalid. It is always valid for -name to be omitted from the command line. This is useful for Booleans (see the "dns" example below).
d
This mode is used two ways to provide default values. If cfp_parse(3) encounters -name while parsing a command line, no value is given, and no s mode character is present, the default value (third field) is copied to the current value (second field). If cfp_getcur(3) is called on this variable, no value has been provided for this variable on the command line, and no mode s supplied value has been used, the default value is copied to the current value and returned.
s
When cfp_parse(3) encounters -name while parsing a command line, a mode character of s indicates that, if no value is provided for -name on the command line, the supplied value (fourth field) is copied to the current value (second field).
r
When a par file is written back to disk (by cfp_putf(3) ), a mode character of r indicates that the current value is to be saved in the file. If an r mode character is not given, the original value from the file is retained as the current value. r variables provide long term memory, between program executions.

An example of using several specifiers at once can be found in the use of Booleans. Suppose you want a par file variable to be TRUE if its name appears on the command line and FALSE if not for a program named runme. Then the runme.par file should contain:

now = { dns, , False, True } # some Boolean
The programmer would get the option value with:
cfp_getcur(id, "now", (void *) &now, CFP_BOOLEAN, sizeof(now));
where the C language now variable is type BOOLEAN. Because of the n mode character, a value cannot be supplied on the command line with -now. Because there is no r mode character, the current value field in the file will never change. If the program is invoked as:
runme
the variable now would have the value FALSE after the cfp_getcur(3) call because the d mode character caused cfp_getcur(3) to substitute the default value (here "False"). If the program is invoked as:
runme -now
the variable now would have the value TRUE after the cfp_getcur(3) call because the s mode character caused cfp_parse(3) to substitute the supplied value (here "True") since -now appeared on the command line without a value.
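The interaction of the o, n, d, and s mode characters can be modelled with a small stand-alone C sketch. It does not call libcfht. The structure par_var and the functions on_option and get_current are hypothetical stand-ins for what cfp_parse(3) and cfp_getcur(3) are described as doing, and all values are kept as strings for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory model of one variable line. */
struct par_var {
    const char *mode;       /* mode characters, e.g. "dns" */
    char current[32];       /* second field */
    const char *dflt;       /* third field */
    const char *supplied;   /* fourth field */
    int touched;            /* command line gave a value or used mode s */
};

static void set_current(struct par_var *v, const char *s)
{
    snprintf(v->current, sizeof v->current, "%s", s);
}

/* Models the parser meeting "-name [value]"; value is NULL when absent.
 * Returns 0 if the option is legal, -1 if it is invalid. */
int on_option(struct par_var *v, const char *value)
{
    assert(v != NULL);
    if (value != NULL) {
        if (strchr(v->mode, 'n')) return -1;    /* n: value not allowed */
        set_current(v, value);
        v->touched = 1;
        return 0;
    }
    if (!strchr(v->mode, 'o') && !strchr(v->mode, 'n'))
        return -1;                              /* a value is required */
    if (strchr(v->mode, 's')) {
        set_current(v, v->supplied);            /* s: use supplied value */
        v->touched = 1;
    } else if (strchr(v->mode, 'd')) {
        set_current(v, v->dflt);                /* d without s: default */
    }
    return 0;
}

/* Models reading the current value, applying the d rule when needed. */
const char *get_current(struct par_var *v)
{
    assert(v != NULL);
    if (strchr(v->mode, 'd') && !v->touched)
        set_current(v, v->dflt);
    return v->current;
}
```

With a variable initialised as { "dns", "", "False", "True", 0 }, calling get_current alone gives False, while calling on_option with a NULL value first gives True, matching the two runme invocations above.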

<current>

This field contains the current value of the variable. This is the active memory for the variable.

<default>

This field contains the default value of the variable. This is used in conjunction with the d mode character and by the Defaults button of xform(1) .

<supplied>

This field contains the supplied value of the variable. This is used in conjunction with the s character (when an option is supplied but no value is given).

<range>

This field contains a range specifier for legal values for this variable. If the m mode character is present, all references to this variable’s current value by cfp_getcur(3) , cfp_putcur(3) , cfp_getval(3) , cfp_putval(3) , and reference on the command line cause range checking on the current value. This includes xform(1) text edit fields. For a cfp call with a type of CFP_INTEGER, range checking is done on an integer value; with a type of CFP_DOUBLE, range checking is done on a double value; with a type of CFP_STRING, the value type depends on the range specifier.

A range specifier can be either int to accept any integer value, double to accept any double value, or a range value that looks like a number line interval. Parentheses indicate open ranges, and square brackets indicate closed ranges. Minimum and maximum values can be given as numbers, references to other variables (%name to incorporate the current value of the variable name), or omitted to indicate no check. Numbers can be integers, doubles, or times. A number line interval can be preceded by int or double to provide a type. If all of the numbers given are integers (i.e., they include only signs and digits, no decimal points, no exponents, and no colons), the range is for an integer value when CFP_STRING is given as the variable type. Note that for variable references there MUST be a space between the variable name and the ]. Examples include the following, with the range specifier on the left and the acceptable values on the right.

int - any integer value
double - any floating value or time
"[0,1]" - 0 or 1
"double [ 1, 10 ]" - floats from 1.0 through 10.0 inclusive
" [ -1.0, +1.0 ) " - floats from -1 inclusive to 1 exclusive
"[ 1, %xmax ]" - raster sizes from 1 through chip size
"[ -90.0, +90.0 ]" - declination values
"[ 0, 24:0:0 )" - right ascension values
"[1]" - any positive integer
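The open and closed interval semantics can be sketched in C as follows. This is a simplified illustration, not the behaviour of cfp_range(3): the function in_range is hypothetical and handles only plain numeric bounds. The int and double keywords, %name references, times such as 24:0:0, and integer-only checking are deliberately left out, and a single number with no comma is read as a lower bound only, which is an interpretation of the last example above.

```c
#include <assert.h>
#include <stdlib.h>

static const char *skip_ws(const char *s)
{
    while (*s == ' ' || *s == '\t') s++;
    return s;
}

/* Hypothetical helper: returns 1 if v lies in the interval described by
 * spec, 0 if it does not, and -1 if spec is not a plain numeric interval. */
int in_range(const char *spec, double v)
{
    int lo_closed, hi_closed, has_lo = 0, has_hi = 0;
    double lo = 0.0, hi = 0.0;
    char *end;
    const char *s;

    assert(spec != NULL);
    s = skip_ws(spec);
    if (*s != '[' && *s != '(') return -1;
    lo_closed = (*s == '[');                /* '[' closed, '(' open */
    s = skip_ws(s + 1);
    if (*s != ',' && *s != ']' && *s != ')') {
        lo = strtod(s, &end);               /* minimum, if given */
        if (end == s) return -1;
        has_lo = 1;
        s = skip_ws(end);
    }
    if (*s == ',') {
        s = skip_ws(s + 1);
        if (*s != ']' && *s != ')') {
            hi = strtod(s, &end);           /* maximum, if given */
            if (end == s) return -1;
            has_hi = 1;
            s = skip_ws(end);
        }
    }
    if (*s != ']' && *s != ')') return -1;
    hi_closed = (*s == ']');

    if (has_lo && (lo_closed ? v < lo : v <= lo)) return 0;
    if (has_hi && (hi_closed ? v > hi : v >= hi)) return 0;
    return 1;
}
```

An omitted bound disables the check on that side, so "[ 0, )" in this sketch accepts any non-negative value.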

Alias Lines

The value on an alias line provides an indirect reference mechanism. These indirect references can be nested. It does not matter whether the alias or the reference appears first in the par file. There are two forms.

If a par file contains

var = 3
refvar -> var
any reference to refvar will actually read or write the value of var.

If a par file contains

var1 = 1
var2 = 2
refvar = { r, var1 }
refrefvar -> %refvar
the current value of refvar determines what variable is actually accessed when refrefvar is referenced. With the above values any reference to refrefvar will reference var1, but if refvar's value is changed to var2, var2 will be referenced instead.
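Both alias forms can be modelled with a short C sketch over an in-memory table. It is an illustration only; the par_entry structure and the resolve function are hypothetical, and the hop limit of 16 is an arbitrary guard against alias loops rather than anything the cfp routines are documented to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of one par file line. */
struct par_entry {
    const char *name;
    const char *target;     /* alias target, or NULL if not an alias */
    const char *current;    /* current value, or NULL for an alias */
};

static const struct par_entry *find(const struct par_entry *t, size_t n,
                                    const char *name)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(t[i].name, name) == 0) return &t[i];
    return NULL;
}

static const struct par_entry *resolve_n(const struct par_entry *t, size_t n,
                                         const char *name, int budget)
{
    const struct par_entry *e = find(t, n, name);

    while (e != NULL && e->target != NULL) {
        const char *next = e->target;
        if (budget-- <= 0) return NULL;         /* probable alias loop */
        if (next[0] == '%') {
            /* "%var": the current value of var names the real target */
            const struct par_entry *ref = resolve_n(t, n, next + 1, budget);
            if (ref == NULL || ref->current == NULL) return NULL;
            next = ref->current;
        }
        e = find(t, n, next);
    }
    return e;
}

/* Follow aliases from `name` to the line that actually holds a value. */
const struct par_entry *resolve(const struct par_entry *t, size_t n,
                                const char *name)
{
    assert(t != NULL && name != NULL);
    return resolve_n(t, n, name, 16);
}
```

Because the lookup is by name, the order of alias and target lines in the table does not matter, which mirrors the statement above about their order in the par file.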

Generic Parameter Handling

Many uses of par files are much simpler than the input arguments cases described above. Sometimes a string (representing a number or not) needs to be saved and/or retrieved. There is to be no optional user interaction and no automatic type conversion.

The basic theory behind all par file usage is that every data item has a name which is used as a lookup key. Therefore, given a par file that looks like:


angle = 20
size = 50 100
type = low mass
the code fragment to retrieve this data might be:


cfp_getstr(id, "angle", buf, sizeof(buf));
angle = atoi(buf);
cfp_getstr(id, "size", buf, sizeof(buf));
sscanf(buf, "%d %d", &height, &width);
cfp_getstr(id, "type", type, sizeof(type));
There is also a set of macros defined for cfp_getcur (CFP_GETCUR), cfp_getval (CFP_GETVAL), cfp_putcur (CFP_PUTCUR), and cfp_getstr (CFP_GETSTR). These convenience macros take care of calling cfht_log automatically. They are invoked exactly the same way as the normal function calls. If they detect an error, they exit the current routine, returning FAIL. Note that there is no way to determine the return value from cfp_getcur when using the CFP_GETCUR macro.
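The general shape of such a macro can be sketched in C. This is not the definition found in <cfht/cfp.h>; every name below (DEMO_PASS, DEMO_FAIL, DEMO_CHECKED, demo_log, demo_get, load_settings) is hypothetical, and the status values are assumptions made only so that the example is self-contained.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_PASS 0         /* assumed success status */
#define DEMO_FAIL (-1)      /* assumed failure status */

static int demo_errors = 0; /* counts logged failures */

/* Stand-in for a logging call. */
static void demo_log(const char *what)
{
    fprintf(stderr, "error: %s failed\n", what);
    demo_errors++;
}

/* Evaluate `call`; on failure, log it and return DEMO_FAIL from the
 * function in which the macro is used. */
#define DEMO_CHECKED(call)              \
    do {                                \
        if ((call) != DEMO_PASS) {      \
            demo_log(#call);            \
            return DEMO_FAIL;           \
        }                               \
    } while (0)

/* Stand-in for a parameter lookup that can fail. */
static int demo_get(int should_fail, int *out)
{
    assert(out != NULL);
    if (should_fail) return DEMO_FAIL;
    *out = 42;
    return DEMO_PASS;
}

/* The macro hides the status check, so this routine stays short. */
int load_settings(int should_fail, int *angle)
{
    DEMO_CHECKED(demo_get(should_fail, angle));
    return DEMO_PASS;
}
```

Since a macro of this shape expands to a statement and not an expression, the status of the wrapped call is consumed inside it. This is one plausible reason why the return value is not available to the caller.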

Generic File Usage

Most of the cfp technology identifies the file in use by an id. This id is of type FILE_ID. id can be acquired in one of two ways:
cfp_parse()
can be called with the magic CFP_PAR argument instead of an id. In this case the par file is automatically found by tacking .par on the back of argv[0] and then looking in several possible system areas as well as the home directory. This is the preferred technique for a program’s main par file.
cfp_init()
In this case the file path name is known, and all that is required is an id for the file. This is often used in conjunction with cfp_file(3) which will hunt down the true path to a file when all that is known is the simple name. This technique is used when the par file name itself is a variable or when programs share a par file.

See Also

cfp_cursub(3) , cfp_file(3) , cfp_finish(3) , cfp_getcur(3) , cfp_getf(3) , cfp_getln(3) , cfp_getsiz(3) , cfp_getstr(3) , cfp_getval(3) , cfp_init(3) , cfp_io(3) , cfp_match(3) , cfp_parse(3) , cfp_putcur(3) , cfp_putf(3) , cfp_putln(3) , cfp_putstr(3) , cfp_putval(3) , cfp_range(3)

Warnings

When declaring par file variables, enclose strings in double quotes if the initial value has any special characters (spaces, tabs, commas, colons, semi-colons, quotes, pound signs, right braces, or newlines in particular). If the value is changed later, the parse routines will use quotes when needed. A null string can be given with "". A quote can be included as "\"". For example:

Lamp1 = {r, "He Ar", "He Ne"}

