Standard answers were needed for many specification, design, and implementation problems such as file naming conventions, file permission checking, file syntax and semantics, error handling, etc. A library with low maintenance code that could be read by all members of the team was required.
A set of routines collectively called the ’configuration file parser’ was created to manage this ’sticky’ data. It was then discovered that an extension to these routines could handle command line parsing as well. These additional routines are available but are not required for applications that are interested in the original purpose. Our form handlers also tie into this package to allow smarter and friendlier programs with a minimum of rewrite. Most recently an I/O configuration standard has been created using these same tools.
All of the configuration file parse routine names start with cfp_. All defines can be found in <cfht/cfp.h>. The library to use is libcfht.a. The configuration files themselves are called par files, and their names typically end with a .par suffix.
    #include <cfht/cfp.h>

    extern double dojob(double);   /* the program's own work routine */

    int
    main(int argc, char **argv)
    {
        FILE_ID id;
        double val;

        /* handle user input */
        id = cfp_parse(CFP_PAR, argc, argv);

        /* get current value of required variable */
        cfp_getcur(id, "val", (void *) &val, CFP_DOUBLE, sizeof val);

        /* do some custom work */
        val = dojob(val);

        /* custom work changed value, so save new value */
        cfp_putcur(id, "val", (void *) &val, CFP_DOUBLE);

        /* save changes since parse in real file */
        cfp_putf(id);   /* needed to save putcur update */

        return 0;
    }

The simplest way to handle a program such as the one above is to name the parameter file the same as the program and append a .par suffix. For example, if the program were named dojob, then the parameter file would be called dojob.par. When using the cfp tools in this manner, many activities are automatic.
Master versions of the par files are maintained in a single directory and
copied to the user’s directory the first time they are updated. The contents of dojob.par when val is equal to 10 might look like:
    val = { r, 10 }    # current value
The syntax of the val line above is quite powerful. The 10 is the current value of the val option. The current value is always found in the second field. The r in the first field tells cfp_parse(3) that val can be updated (replaced) in dojob.par so that the next invocation of the program also sees the new value. If the user invoked the dojob program three times in succession, starting with the above dojob.par file, as follows:
    dojob
    dojob -val 20
    dojob
the value of val would start out as 10, go to 20, and stay at 20.
A comment begins with a # character; the # and all characters to the end of that line are ignored. There is no mechanism for line continuation. In all of the following discussion, it will be assumed that all comments and blank lines have already been removed; any format described below can be followed by a comment.
Non-blank, non-comment lines come in three forms: constant lines, variable lines, and alias lines. Typically the values assigned on constant lines do not change whereas the values assigned to variables do change. Alias lines provide alternate (possibly variable) names for other lines. The general form for a line is:
<name> <assignment> <value>, as described in detail below.
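The three line forms can be told apart from the assignment symbol and the first character of the value. The following is a minimal sketch of that idea only; it is not part of libcfht, and the names line_kind and classify_line are hypothetical. It assumes comments have already been removed and that a name ends at the first blank, equal sign, or hyphen.

```c
#include <string.h>

enum line_kind { LINE_INVALID, LINE_CONSTANT, LINE_VARIABLE, LINE_ALIAS };

/* Classify one par-file line (comment already stripped).
 * Hypothetical helper for illustration; not a cfp routine. */
enum line_kind classify_line(const char *line)
{
    const char *p = line + strspn(line, " \t");   /* skip leading blanks */
    size_t name_len = strcspn(p, " \t=-");        /* assumed end of name */

    if (name_len == 0)
        return LINE_INVALID;
    p += name_len;
    p += strspn(p, " \t");
    if (p[0] == '-' && p[1] == '>')               /* name -> target */
        return LINE_ALIAS;
    if (p[0] != '=')
        return LINE_INVALID;
    p++;
    p += strspn(p, " \t");
    /* a value that opens with a left brace makes this a variable line */
    return (*p == '{') ? LINE_VARIABLE : LINE_CONSTANT;
}
```

With this sketch, "val = { r, 10 }" is classified as a variable line, "angle = 20" as a constant line, and "refvar -> var" as an alias line.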
Names are made up of letters, digits, and the special characters (%$&._+[]). Per cent and dollar sign are not recommended because they have additional meanings. Case is significant. Legal names include "Xwindows", "Filter1", "+HELP_+f28", "2Door", and "filtername[B]". However, for left and right square brackets, see the note below about ranges.
Constant and variable lines use the equal sign (=) as the assignment symbol. Alias lines are defined by the sequence -> as the assignment symbol.
If a double quote (") is the first non-blank character of the value, it and the next unquoted/unescaped double quote on the line are not included in the value. To obtain one double quote as part of this value, it must be quoted or escaped by preceding it with a backslash (\). Any other double quotes are part of the value.
If any of the characters in the value is a #, backslash, newline, or return, each instance of the mentioned characters must be preceded by a backslash. The # does not need to be escaped inside of paired double quotes. Spaces and tabs inside the value are maintained and do not have to be quoted; leading or trailing spaces and tabs must be quoted if they are to be kept as part of the value. If a constant line value starts with double quotes when the par file is read in (as in D or F below), the entire value will be enclosed in double quotes when the file is output. A constant value may not begin with an unquoted left brace ({).
The following lines show legal constant lines with the actual value shown as a comment, between hyphens which are not part of the value. The left "column" shows the file version of the definition; the comment shows what a program sees as the value.
    A =                             # --
    B = ""                          # --
    C = The character "a"           # -The character "a"-
    D = "A quoted string"           # -A quoted string-
    E = one "                       # -one "-
    F = "leading \"ed" with more    # -leading "ed with more-
    G = specials \# \\ \n \r        # -specials # \ \n \r-
    H = inside spaces               # -inside spaces-
    I =    no leading spaces        # -no leading spaces-
    J = " leading spaces "          # - leading spaces -
Note that the "\n" and "\r" in the value for G are real newline and return, respectively.
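The quoting and escaping rules for constant values can be expressed as a short decoder. The following is a sketch under stated assumptions, not the libcfht implementation: decode_constant is a hypothetical name, the input is the text that follows the = on one line, and a backslash before any character other than n or r is assumed to yield that character unchanged.

```c
#include <stddef.h>

/* Decode the text after '=' on a constant line into out[].
 * Returns the value length, or -1 if out[] is too small.
 * Hypothetical helper for illustration; not a cfp routine. */
int decode_constant(const char *src, char *out, size_t outsize)
{
    size_t len = 0;     /* characters written so far */
    size_t keep = 0;    /* prefix that trailing-blank trimming must not touch */
    int in_quotes = 0;

    if (outsize == 0)
        return -1;
    while (*src == ' ' || *src == '\t')
        src++;                          /* unquoted leading blanks are dropped */
    if (*src == '"') {                  /* a leading quote opens a quoted run */
        in_quotes = 1;
        src++;
    }
    for (; *src != '\0' && *src != '\n'; src++) {
        char c = *src;
        int protect = in_quotes;

        if (c == '\\' && src[1] != '\0') {
            c = *++src;                 /* escaped character */
            if (c == 'n') c = '\n';
            else if (c == 'r') c = '\r';
            protect = 1;
        } else if (in_quotes && c == '"') {
            in_quotes = 0;              /* closing quote is not part of the value */
            continue;
        } else if (!in_quotes && c == '#') {
            break;                      /* start of comment */
        }
        if (len + 1 >= outsize)
            return -1;
        out[len++] = c;
        if (protect)
            keep = len;
    }
    while (len > keep && (out[len - 1] == ' ' || out[len - 1] == '\t'))
        len--;                          /* drop unquoted trailing blanks */
    out[len] = '\0';
    return (int)len;
}
```

Fed the right-hand sides of lines A through J, this sketch produces the values shown between the hyphens, including the real newline and return for G.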
The value of a variable line begins with an unquoted left brace ({) and continues through the matching right brace (}). Within the braces there can be up to five fields: the mode field, the current value field, the default value field, the supplied value field, and the range field. These fields are separated by commas. Rightmost fields that are not used for a particular variable instance may be omitted, including their preceding comma. Other fields may be omitted by just including their following comma. A variable line looks like
    <name> = { <mode>, <current>, <default>, <supplied>, <range> }
when all fields are specified.
Each field is delimited by spaces, tabs, the preceding and/or following comma, and the preceding and/or following brace. If any of these characters are part of the desired value for a field, the field must be enclosed in double quotes ("). If any of the characters in a field is a #, the field must be enclosed in double quotes. If any of the characters in a field is a double quote, backslash, newline, or return, the field must be enclosed in double quotes and each instance of the mentioned characters must be preceded by a backslash. When a par file is output by a program using cfp_putf(3), fields are enclosed in double quotes only when required.
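The field rules can be illustrated with a small splitter that takes the braced value of a variable line and separates it into at most five fields. This is a sketch for illustration only, not the parser used by libcfht; split_fields, MAX_FIELDS, and FIELD_LEN are hypothetical names, and inside double quotes a backslash is assumed to make the next character literal.

```c
#include <stddef.h>

#define MAX_FIELDS 5     /* mode, current, default, supplied, range */
#define FIELD_LEN  64    /* arbitrary limit chosen for this sketch */

/* Split "{ f1, f2, ... }" into fields[]; omitted fields become "".
 * Returns the number of fields, or -1 on a syntax error.
 * Hypothetical helper for illustration; not a cfp routine. */
int split_fields(const char *s, char fields[MAX_FIELDS][FIELD_LEN])
{
    int n = 0;

    while (*s == ' ' || *s == '\t')
        s++;
    if (*s != '{')
        return -1;
    s++;
    for (;;) {
        size_t len = 0;

        if (n == MAX_FIELDS)
            return -1;                       /* more than five fields */
        while (*s == ' ' || *s == '\t')
            s++;
        if (*s == '"') {                     /* quoted field */
            s++;
            while (*s != '"') {
                if (*s == '\\')
                    s++;                     /* next character is literal */
                if (*s == '\0' || len + 1 >= FIELD_LEN)
                    return -1;
                fields[n][len++] = *s++;
            }
            s++;                             /* skip closing quote */
        } else {                             /* bare field, possibly empty */
            while (*s != '\0' && *s != ',' && *s != '}' &&
                   *s != ' ' && *s != '\t') {
                if (len + 1 >= FIELD_LEN)
                    return -1;
                fields[n][len++] = *s++;
            }
        }
        fields[n++][len] = '\0';
        while (*s == ' ' || *s == '\t')
            s++;
        if (*s == '}')
            return n;                        /* rightmost fields may be absent */
        if (*s != ',')
            return -1;
        s++;
    }
}
```

For example, the value { dns, , False, True } yields four fields with an empty current value, and {r, "He Ar", "He Ne"} yields three fields whose quoted values keep their embedded spaces.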
<mode>
The mode field gives actions to be done on the current value when it is parsed, read, written, or output (with cfp_parse(3), cfp_getcur(3), cfp_putcur(3), or cfp_putf(3), respectively). Each action is specified by the single character given below. These characters can appear in any order in the mode field, but the preferred order is alphabetical (dmnors), and the cfp routines will always output them in that order.
o - When cfp_parse(3) encounters -name while parsing a command line, a mode character of o indicates that a value is optional (e.g., -val 10 and -val are both legal). If neither o nor n is given in the mode field, -name without a value is invalid. It is always valid for -name to be omitted from the command line.
n - When cfp_parse(3) encounters -name while parsing a command line, a mode character of n indicates that a value is not allowed (i.e., -val 10 is not legal, but -val is). If neither o nor n is given in the mode field, -name without a value is invalid. It is always valid for -name to be omitted from the command line. This is useful for Booleans (see the "dns" example below).
d - This mode is used two ways to provide default values. If cfp_parse(3) encounters -name while parsing a command line, no value is given, and no s mode character is present, the default value (third field) is copied to the current value (second field). If cfp_getcur(3) is called on this variable, no value has been provided for this variable on the command line, and no s mode supplied value has been used, the default value is copied to the current value and returned.
s - When cfp_parse(3) encounters -name while parsing a command line, a mode character of s indicates that, if no value is provided for -name on the command line, the supplied value (fourth field) is copied to the current value (second field).
r - When a par file is written back to disk (by cfp_putf(3)), a mode character of r indicates that the current value is to be saved in the file. If an r mode character is not given, the original value from the file is retained as the current value. r variables provide long term memory between program executions.
An example of using several specifiers at once can be found in the use of Booleans. Suppose you want a par file variable to be TRUE if its name appears on the command line and FALSE if not, for a program named runme. Then the runme.par file should contain:
    now = { dns, , False, True }    # some Boolean
The programmer would get the option value with:
    cfp_getcur(id, "now", (void *) &now, CFP_BOOLEAN, sizeof(now));
where the C language now variable is type BOOLEAN. Because of the n mode character, a value cannot be supplied on the command line with -now. Because there is no r mode character, the current value field in the file will never change. If the program is invoked as:
    runme
the variable now would have the value FALSE after the cfp_getcur(3) call because the d mode character caused cfp_getcur(3) to substitute the default value (here "False"). If the program is invoked as:
    runme -now
the variable now would have the value TRUE after the cfp_getcur(3) call because the s mode character caused cfp_parse(3) to substitute the supplied value (here "True") since -now appeared on the command line without a value.
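The interaction of the o, n, d, and s mode characters can be summarized as a small decision function. The sketch below only models the rules as described in this section; it is not how libcfht is implemented, and struct par_var and resolve_current are hypothetical names. Range checking (m) and saving (r) are left out.

```c
#include <string.h>

/* Hypothetical in-memory form of one variable line. */
struct par_var {
    const char *mode;       /* mode characters, e.g. "dns" */
    const char *current;    /* second field */
    const char *defval;     /* third field */
    const char *supplied;   /* fourth field */
};

/* on_cmdline: nonzero if -name appeared on the command line.
 * arg: the value that followed -name, or NULL if none was given.
 * Returns the value a later read would see, or NULL if the command
 * line is invalid for this variable. Illustration only. */
const char *resolve_current(const struct par_var *v, int on_cmdline,
                            const char *arg)
{
    int has_o = strchr(v->mode, 'o') != NULL;
    int has_n = strchr(v->mode, 'n') != NULL;
    int has_d = strchr(v->mode, 'd') != NULL;
    int has_s = strchr(v->mode, 's') != NULL;

    if (on_cmdline) {
        if (arg != NULL)
            return has_n ? NULL : arg;   /* n: a value is not allowed */
        if (!has_o && !has_n)
            return NULL;                 /* -name needs a value */
        if (has_s)
            return v->supplied;          /* s takes precedence over d */
        if (has_d)
            return v->defval;
        return v->current;
    }
    /* -name omitted: d substitutes the default on read */
    return has_d ? v->defval : v->current;
}
```

Applied to the now variable with mode dns, the sketch gives "False" when -now is absent, "True" when -now is present, and rejects -now followed by a value.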
<current>
This field contains the current value of the variable. This is the active memory for the variable.
<default>
This field contains the default value of the variable. This is used in conjunction with the d mode character and by the Defaults button of xform(1).
<supplied>
This field contains the supplied value of the variable. This is used in conjunction with the s mode character (when an option is supplied but no value is given).
<range>
This field contains a range specifier for legal values for this variable. If the m mode character is present, all references to this variable’s current value by cfp_getcur(3), cfp_putcur(3), cfp_getval(3), cfp_putval(3), and references on the command line cause range checking on the current value. This includes xform(1) text edit fields. For a cfp call with a type of CFP_INTEGER, range checking is done on an integer value; with a type of CFP_DOUBLE, range checking is done on a double value; with a type of CFP_STRING, the value type depends on the range specifier.
A range specifier can be either int to accept any integer value, double to accept any double value, or a range value that looks like a number line interval. Parentheses indicate open ranges, and square brackets indicate closed ranges. Minimum and maximum values can be given as numbers, references to other variables (%name to incorporate the current value of the variable name), or omitted to indicate no check. Numbers can be integers, doubles, or times. A number line interval can be preceded by int or double to provide a type. If any numbers given are integers (i.e., they include only signs and digits, no decimal points, no exponents, and no colons), the range is for an integer value when CFP_STRING is given as the variable type. Note that for variable references there MUST be a space between the variable name and the ]. Examples include the following, with the range specifier on the left and the acceptable values on the right.
    int                     - any integer value
    double                  - any floating value or time
    "[0,1]"                 - 0 or 1
    "double [ 1, 10 ]"      - floats from 1.0 through 10.0 inclusive
    " [ -1.0, +1.0 ) "      - floats from -1 inclusive to 1 exclusive
    "[ 1, %xmax ]"          - raster sizes from 1 through chip size
    "[ -90.0, +90.0 ]"      - declination values
    "[ 0, 24:0:0 )"         - right ascension values
    "[1]"                   - any positive integer
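The open and closed interval notation can be checked with a few lines of C. The following sketch covers only plain numeric bounds with an optional omitted minimum or maximum; it does not handle the int or double prefix, %name references, or time values, and it returns -1 for those. The function name in_range is hypothetical and the code is not taken from libcfht.

```c
#include <stdlib.h>

/* Test x against an interval such as "[ 1, 10 )".
 * Returns 1 if x is inside, 0 if outside, -1 if the specifier is not
 * one this sketch understands. Illustration only; not a cfp routine. */
int in_range(const char *spec, double x)
{
    char open, close;
    double lo, hi;
    int have_lo, have_hi;
    char *end;

    while (*spec == ' ')
        spec++;
    open = *spec;
    if (open != '[' && open != '(')
        return -1;
    spec++;

    lo = strtod(spec, &end);         /* strtod skips leading blanks */
    have_lo = (end != spec);         /* no conversion: minimum omitted */
    spec = end;
    while (*spec == ' ')
        spec++;
    if (*spec != ',')
        return -1;
    spec++;

    hi = strtod(spec, &end);
    have_hi = (end != spec);         /* no conversion: maximum omitted */
    spec = end;
    while (*spec == ' ')
        spec++;
    close = *spec;
    if (close != ']' && close != ')')
        return -1;

    /* square brackets include the bound, parentheses exclude it */
    if (have_lo && (open == '[' ? x < lo : x <= lo))
        return 0;
    if (have_hi && (close == ']' ? x > hi : x >= hi))
        return 0;
    return 1;
}
```

With " [ -1.0, +1.0 ) " the value -1.0 is accepted and +1.0 is rejected, matching the inclusive and exclusive ends described above.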
If a par file contains
    var = 3
    refvar -> var
any reference to refvar will actually read or write the value of var.
If a par file contains
    var1 = 1
    var2 = 2
    refvar = { r, var1 }
    refrefvar -> %refvar
the current value of refvar determines what variable is actually accessed when refrefvar is referenced. With the above values any reference to refrefvar will reference var1, but if refvar’s value is changed to var2, var2 will be referenced instead.
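Alias lookup, including the %name form of indirection, can be modeled with a small table walk. This is a sketch of the behavior described above and not the libcfht data structure; struct par_line, find_line, resolve, and par_lookup are hypothetical names, and the depth limit of 8 is an arbitrary guard against alias loops.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one par-file line. */
struct par_line {
    const char *name;
    int is_alias;         /* 1 for "name -> target" lines */
    const char *value;    /* current value, or the alias target */
};

static const struct par_line *find_line(const struct par_line *tab,
                                        size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

static const char *resolve(const struct par_line *tab, size_t n,
                           const char *name, int depth)
{
    const struct par_line *p;

    if (depth > 8)
        return NULL;                      /* probably an alias loop */
    p = find_line(tab, n, name);
    if (p == NULL)
        return NULL;
    if (!p->is_alias)
        return p->value;                  /* ordinary line: its value */
    if (p->value[0] == '%') {
        /* "%var": the current value of var names the real target */
        const char *target = resolve(tab, n, p->value + 1, depth + 1);
        return target ? resolve(tab, n, target, depth + 1) : NULL;
    }
    return resolve(tab, n, p->value, depth + 1);
}

/* Return the value reached through name, following aliases. */
const char *par_lookup(const struct par_line *tab, size_t n,
                       const char *name)
{
    return resolve(tab, n, name, 0);
}
```

With a table holding var1, var2, refvar, and refrefvar as in the second example, looking up refrefvar yields the value of var1, and after the value of refvar is changed to var2 the same lookup yields the value of var2.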
The basic theory behind all par file usage is that every data item has a name which is used as a lookup key. Therefore, given a par file that looks like:
    angle = 20
    size = 50 100
    type = low mass
the code fragment to retrieve this data might be:
    cfp_getstr(id, "angle", buf, sizeof(buf));
    angle = atoi(buf);
    cfp_getstr(id, "size", buf, sizeof(buf));
    sscanf(buf, "%d %d", &height, &width);
    cfp_getstr(id, "type", type, sizeof(type));
There is also a set of macros defined for cfp_getcur (CFP_GETCUR), cfp_getval (CFP_GETVAL), cfp_putcur (CFP_PUTCUR), and cfp_getstr (CFP_PUTCUR). These convenience macros will take care of calling cfht_log automatically. They are invoked exactly the same way as the normal function calls. If they detect an error, they will exit the current routine returning FAIL. Note that there is no way to determine the return value from cfp_getcur when using the CFP_GETCUR macro.
cfp technology identifies the file in use by an id. This id is of type FILE_ID. An id can be acquired in one of two ways:
- cfp_parse() - can be called with the magic CFP_PAR argument instead of an id. In this case the par file is automatically found by tacking .par on the back of argv[0] and then looking in several possible system areas as well as the home directory. This is the preferred technique for a program’s main par file.
- cfp_init() - In this case the file path name is known, and all that is required is an id for the file. This is often used in conjunction with cfp_file(3) which will hunt down the true path to a file when all that is known is the simple name. This technique is used when the par file name itself is a variable or when programs share a par file.
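The naming step behind the CFP_PAR convention, deriving the par file name from argv[0], might look like the sketch below. It is an illustration only: par_file_name is a hypothetical helper, stripping any leading directory from argv[0] is an assumption, and the search through system areas and the home directory that cfp_parse performs is not modeled.

```c
#include <stdio.h>
#include <string.h>

/* Build "<program>.par" from argv[0].
 * Returns 0 on success, -1 if out[] is too small.
 * Hypothetical helper for illustration; not a cfp routine. */
int par_file_name(const char *argv0, char *out, size_t outsize)
{
    const char *base = strrchr(argv0, '/');
    int n;

    /* assumption: only the last path component names the program */
    base = (base != NULL) ? base + 1 : argv0;
    n = snprintf(out, outsize, "%s.par", base);
    return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
```

For a program started as /usr/local/bin/dojob this produces dojob.par, the name used in the earlier example.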
Lamp1 = {r, "He Ar", "He Ne"}