Software Development at CFHT

Anyone contributing software to the main development tree (/cfht/src/..., formerly /usr/local/cfht/dev/{pegasus,medusa,etc.}) must follow these basic guidelines. You may be able to keep track of what you are doing without following all of this, but in order for everyone in the group to be able to make some basic assumptions about the software in /cfht/src/, we all need to comply. If you don't, you could be causing one of your colleagues (and possibly yourself) a huge headache some time in the future. This is an attempt at a practical guide on how to maintain some basic organization in the CFHT source directories. If you have your own methods of achieving the same ends, please make sure that the result is compatible with what is described here, especially the information concerning the use of RCS and archiving.

Contents

Location of Files
Makefile conventions
Index files
Changes files
Check-in and Check-out
Releases
Archival Back-ups

Location of Files

All CFHT Unix machines now have a /cfht/ automount point. Subdirectories are as follows:
/cfht/src/
This directory is always a mount of saturn:/usr/local/cfht/dev/, which contains subdirectories "pegasus" for global session-related software, "medusa" for instrument specific software, and "epeus" for third party software used in observing sessions. The new Makefile setup will build stuff from either /cfht/src/ or the old location, /usr/local/cfht/dev/. For now, if a machine has both, please build from /usr/local/cfht/dev/, until the transition to /cfht/src/ has been completed.

/cfht/man/
This is always a mount of saturn:/usr/local/cfht/man/.

The rest of the directories (bin, lib, include, and obs) are different for different machines. All (old) SunOS machines currently mount all of these from "uwila"; Sparc Engines use "titan" (even at the summit, which is why you need to use /usr/local/cfht/dev/ instead of /cfht/src/ on those machines!). Waimea HPs (HP-UX 9.x) use "saturn", while those at the summit use "neptune". A limited set of HP-UX 10.x files exists on "hookele". All Solaris machines (2.5.1 and 2.6; beware, things compiled under 2.6 could cause problems if you try to run them under 2.5.1!) currently use "makani", except for the 12K detector hosts "paniau" and "akua", which have their own local copies. I need to create a new CLASS for them in auto.cfht. Some work is still needed to finalize /etc/auto.cfht, but this is the current state.

/cfht/bin/
The file /cfht/src/local-install.sh makes a set of symbolic links into /usr/local/bin of every machine at CFHT, pointing to a few programs in /cfht/bin/ that are generally useful. For example, the "clone" command for director can be run by any user from any machine (as long as they at least have /usr/local/bin/ in their PATH). unroll and tcshandler are a couple of other links created by local-install.sh at the moment. /cfht/bin/ shouldn't need to appear in anyone's PATH (if it does, it should be at the end). Observing accounts have symbolic links in $HOME/bin. If you wish to make programs runnable by staff members, add them to the /cfht/src/local-install.sh script and re-run it.
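
As a rough picture of the kind of linking described here, the sketch below creates one symbolic link per program name. It is not the contents of the real local-install.sh: the function name link_tools and its arguments are hypothetical, and only the directory and program names in the usage comment come from the text above.

```shell
# link_tools SRC_DIR DEST_DIR NAME...
# Hypothetical helper, not the real local-install.sh: for each NAME, make
# DEST_DIR/NAME a symbolic link to SRC_DIR/NAME, replacing any old link.
link_tools() {
    src_dir=$1
    dest_dir=$2
    shift 2
    for name in "$@"; do
        ln -sf "$src_dir/$name" "$dest_dir/$name"
    done
}

# Example, using the names mentioned above:
#   link_tools /cfht/bin /usr/local/bin clone unroll tcshandler
```

Because the links are recreated with -f, re-running the helper after adding a name is harmless for the links that already exist.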

/cfht/lib/
At the present time, we only use static libraries, so at runtime, nothing uses /cfht/lib/. On saturn or in Waimea, /cfht/lib/ generally contains the latest static libraries matching what we are working on in the source tree, so if you build and link on saturn, that is what your executable will get. At the summit, we only do "make install"s of libraries when they have been tested, so the stuff in /cfht/lib/ on neptune may differ. This means that if you build executables at the summit, they can behave differently than those built from the same sources in Waimea. Copying a binary built on saturn into an account will not be subject to this, but that could change if we ever start using sharable libraries in /cfht/lib/. (In that case, your executable would load whatever is in /cfht/lib/ at runtime, so an executable could change its behavior simply by being moved to the summit.)

/cfht/include/
This directory must always be paired with a matching /cfht/lib/. It contains include files that are installed at the same time as the matching library in /cfht/lib/. With or without sharable libraries, it will never get used at runtime, though.

/cfht/conf/
I think I had intended for this directory to be the one that contains installed par-files and other text files that can get used by programs at runtime. Right now, the automount points to the wrong place. I think I need to change it to point to /usr/local/cfht/dev/conf/ (which is also architecture-specific in the old setup). For now, where it matters, programs are still looking directly in the fixed mount /usr/local/cfht/dev/conf/ and nothing should use /cfht/conf/ until I fix it.

/cfht/obs/
This is the last of the architecture-specific directories. It contains intermediate object files (.o) and executables and libraries before they are installed. "make clean" removes entire subdirectories from this tree to force recompilation, but normally the files in /cfht/obs/ are used by "make" to resolve dependencies and allow it to figure out which components have changed and need to be rebuilt and/or reinstalled.

There is another use for the files in /cfht/obs/. When programs are installed in /cfht/bin/, they are stripped of debugging symbols. As long as the last step you did was a "make install", there will be a matching copy with symbols in /cfht/obs/progname/progname which can be used by gdb to debug cores or running versions of the installed copy in /cfht/bin/. There is more information on how to do this in another document... yet to be written. (Note that as soon as you change the source and do another "make" without a "make install" the version in /cfht/obs/ will no longer match, and gdb will warn you of this if it happens.)
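
Until that document exists, here is a minimal sketch of the idea, assuming the /cfht/obs/progname/progname layout described above. The helper name obs_copy and the OBS_ROOT override are hypothetical and exist only for illustration.

```shell
# obs_copy NAME: print the path of the unstripped build of NAME.
# OBS_ROOT is a hypothetical override for testing; it defaults to /cfht/obs.
obs_copy() {
    printf '%s/%s/%s\n' "${OBS_ROOT:-/cfht/obs}" "$1" "$1"
}

# Examples (gdb takes the program with symbols first, then a core file
# or the process id of a running copy):
#   gdb "$(obs_copy director)" core
#   gdb "$(obs_copy director)" 1234
```

This only helps while the copy in /cfht/obs/ still matches the installed binary, as noted above.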


Makefile conventions

Within "pegasus", "medusa", and "epeus" each project or instrument may contain a subdirectory. When possible, creating a separate subdirectory for each program is recommended because it keeps the Makefiles simple. I currently do not have an example of how to generate more than one compiled executable from a single subdirectory. If you make one, please put it here as an example. Most Makefiles include "Make.Common" twice. So far I only have three examples:
  1. A project Makefile which doesn't install anything but just lists the subdirectories that "make world" should traverse. It is not enough to simply give the subdirectories in the order that they should be compiled! You must set up proper dependencies using the form in the example below, or "make -j" for multi-processor machines will not work properly. Here is an example:
    # Makefile for cli-2.8 project tree
    include ../Make.Common
    SUBDIRS=libcli director directest clicmd clicap clidup runon
    clicmd.d clicap.d clidup.d runon.d director.d directest.d: libcli.d.install
    include ../Make.Common
    

    The dependency line indicates that libcli must be installed before the others can even be built.

  2. A Makefile for a C program. Programs should always be at the "leaves" of the directory tree, meaning there should be no further subdirectories for "make" to descend into at this level. Example:
    # Makefile for directest
    include ../Make.Common
    $(EXECNAME) $(EXECNAME)-pure: $(OBJS) libcli.a libcfht.a
    include ../Make.Common
    # Rest of Makefile is auto-generated
    

  3. A Makefile for a C library. These are even simpler, assuming you just want every *.h file installed and every *.c file compiled:
    # Makefile for libcli.a
    include ../Make.Common
    include ../Make.Common
    # Rest of Makefile is auto-generated
    

    If you need to explicitly list your source files, that is possible too. See the link below.
How does the Makefile know whether it is to generate a library (.a) or an executable? The name of the subdirectory always gives the name of the object being generated, and if it starts with "lib*", the Makefile knows it has to make a library. This means you must create a subdirectory for each library and program, and they must have the same name. More specifics on Makefiles are given on the Make.Common page.
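
The naming rule can be illustrated in a few lines of shell. This is only a sketch of the rule as stated above, not how Make.Common implements it; the function name target_kind is hypothetical, and the libcli path is an assumed location used for the example.

```shell
# target_kind DIR: print "library" if the directory name starts with "lib",
# otherwise "executable". Illustrates the naming rule only.
target_kind() {
    case "$(basename "$1")" in
        lib*) echo library ;;
        *)    echo executable ;;
    esac
}

target_kind /cfht/src/pegasus/cli/libcli     # prints: library
target_kind /cfht/src/pegasus/cli/director   # prints: executable
```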

Index files

Placing some kind of README or file-list at each directory node is generally a good idea. If you use the suggested format below, you will be able to take advantage of the "make titles" function of Make.Common, which automatically inserts a description and copyright comment into the top of several types of source files. Here is the suggested format. The file must be called Index if you want "make titles" to find it:
# Description:

  package		DIRECTOR cli wrapper
  version		2.8
  organization		Canada-France-Hawaii Telescope
  email			daprog@cfht.hawaii.edu

# Contents:

  Makefile		Use with GNU make to build director

  director.cc		main event loop for director program

  builtins.cc		builtin commands that director handles itself

  Curs.hh		manages screen output and keyboard input
  Curs.cc		See Curs.hh

  Roll.hh		holds of the latest N lines echoed in shared memory
  Roll.cc		See Roll.hh

  Pipes.hh		reads line-by-line from pipe or fifo sources
  Pipes.cc		See Pipes.hh

  setserial.h		Stty type-stuff used by Pipes.cc to set up serial ports
  setserial.c		See setserial.h

To see how "make titles" inserts this information in C code, see the files in /cfht/src/pegasus/cli/director/ for an example. "make titles" will also search the files listed in Index for unusually long lines and markers that you can leave within comments. If it finds === (three equal signs) it will be flagged as a warning (parsable by the emacs compile buffer, so you can click with the middle mouse button in the editor and it will automatically pull up the file with the cursor in the correct place) and %%% (three percent signs) will be flagged as an error. Files that exist but are not identified in Index are also flagged. You can run the command "titler" with no arguments to get some built-in help on this utility.
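
To get a feel for the marker check without running "make titles", something like the following could be used. It is a hypothetical stand-in, not titler's real code: the function name scan_markers and the exact message wording are assumptions, chosen so that the output has the "file:line: message" shape that the emacs compile buffer can parse.

```shell
# scan_markers FILE...
# Hypothetical stand-in for the marker check, not titler's real code.
# Prints "file:line: ..." diagnostics; returns non-zero if any %%% was found.
scan_markers() {
    status=0
    for f in "$@"; do
        awk -v f="$f" '
            index($0, "%%%") { print f ":" NR ": error: found %%% marker"; bad = 1 }
            index($0, "===") { print f ":" NR ": warning: found === marker" }
            END { exit bad + 0 }
        ' "$f" || status=1
    done
    return "$status"
}

# Example:
#   scan_markers director.cc builtins.cc
```

It does not check line lengths or compare the directory against Index; those remain jobs for the real utility.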

Changes files

Placing a file called "Changes" in a subdirectory can serve several purposes. First, RCS (see below) keeps a change log only on a file-by-file basis. If it is useful to describe a change in terms of what it did to the overall project, it might help to maintain a Changes file that tracks this development.

Second, if you make all of the lines in Changes into comments, except for the last line, which sets a version number, then you can include Changes in your Makefiles and use it to assign a "1.0, 1.1, 1.2, 2.0 ..." style versioning to your program, rather than the Pegasus default (the date, in a string *-YYMMDD). In this case, it is usually most meaningful to have a Changes file only for the top-level of a project, and then include it in the Makefiles of all the components, so that a set of utilities all have a consistent version number. Using these manual version numbers gives you a little more flexibility, but you must use it responsibly. For example, once version "2.5" is being used, especially in an observing environment, you had better switch to version "2.6" as soon as you start modifying the code again to avoid confusion, no matter how small the changes might be. On the other hand, with the default dated versioning, you do not have the option to move to a new version number until the next day! Here's an example of the Changes from the cli (director) project:


# CLI Version History
#
# 1.0 - First version.  Used for uh8k run in 97I.
# 2.0 - Shared memory roll buffer. status: changed to statusbar: message
# ------CHANGE IN SHARED MEMORY SEGMENT SIZE (2.0/2.1 can't clone each other).
# 2.1 - More efficient packing of roll messages (Shm size from 1M->160K)
# 2.2 - Silently handle SIGALRM for client "low priority". Used for aobir 97II.
# 2.3 - Allow spaces in comlist for displaying in help. Used for uh8k 97II.
# ------CHANGE IN SHARED MEMORY SEGMENT SIZE; Versions above/below this
# ------line cannot attach to each other's shared memory segments!
# ------Use of <2.4 should be discontinued anyway, as 2.4 is stable and
# ------fully supports all features of previous 2.x version.  Extra entries
# ------have been added to shared memory structures to hopefully avoid the
# ------need for further incompatibilities.
# 2.4 - More tolerant of named pipe problems; blank entries in shm for future
# 2.5 - Better debug info for director and clicmd; re-start write()'s to pipes.
# 2.6 - Minor fixes to curses screen update code; detect rmd() errors.
#       Entire environment is now passed to agents on remote hosts.
#       Clicmd utility now supports sym-linking to command names.
#       Clones can only be activated by entering account password first.
#       Clones automatically get infosize of parent if no '-i' option given.
#       Infolines that run to end of screen don't erase next line anymore.
#       Autoprobe for rxvt turns on color support even if TERM variable wrong.
#       Added -t TERM and -C (no title clock) command line options.
#       cli_system() no longer interruptable by SIGALRM or other signals.
#       cli_sh_cmdstr() and cli_remsh() added to libcli.a
#       Now requires Posix "termios.h" terminal i/o routines.
# 2.7 - Removed cli_sh_cmdstr and cli_remsh().  Replaced with external "runon".

VERSION=2.7

And it is included in the Makefiles below as follows:
include ../Make.Common
include ../Changes
.
.
.
include ../Make.Common

So that installed binaries will be versioned as foo-2.7 rather than foo-981209.
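
Outside of make, the same version number can be read from a Changes file with a one-line filter. This is just a convenience sketch; the function name changes_version is hypothetical, and it assumes the file follows the layout shown above (comments plus a final VERSION= line).

```shell
# changes_version FILE: print the value of the last VERSION= line in FILE.
changes_version() {
    sed -n 's/^VERSION=//p' "$1" | tail -n 1
}

# Example (with the program name "foo" used in the text above):
#   echo "foo-$(changes_version ../Changes)"
```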

RCS Check-in and Check-out

Once a group of files has reached a stable, usable condition, each file should be "checked in" to RCS. After this has been done, you should NEVER move a file, use root access to change anything, manually chmod/chown a file, or otherwise try to circumvent RCS! Learn these simple commands and save all of us a lot of confusion in the long run.

You may find it useful to create the following aliases (the commands "co" and "ci" are already aliased in this way if you use bash at CFHT.) The examples below assume you have them. If not, be sure to add the options each time you type the command. Note that you can call the alias something other than co and ci if you don't want to clobber the original command.

    ci    "ci -V3 -u"
    co    "co -V3 -l"

The -V3 means operate in a mode compatible with RCS version 3. (I'm not sure this is needed by new projects anymore, but it just makes things compatible with the co/ci that are still in /usr/bin/ on our HP-UX 9 machines.) The -u/-l mean that you will always be unlocking a file when you check it in, and locking it when you check it out. Locking just means no one else will be able to edit the file while you're working on it.
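
If you would rather not shadow the original commands, the same options can be wrapped in shell functions under other names. The names rci and rco below are hypothetical; only the options come from the text above.

```shell
# Wrappers equivalent to the aliases above, under different names so the
# original commands stay reachable. The names rci/rco are hypothetical.
rci() { ci -V3 -u "$@"; }   # check in, then unlock (working copy is kept)
rco() { co -V3 -l "$@"; }   # check out and take the lock

# Example:
#   rco foo.x     # lock and edit
#   rci foo.x     # check the edited file back in
```

In bash, the aliases themselves would be written as alias ci='ci -V3 -u' and alias co='co -V3 -l'.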

Initial Check-in

Let's say file foo.x has reached the stable, usable stage described above and is ready for its first check-in. First, make sure an RCS subdirectory exists, or RCS will make a mess in the current directory. If it doesn't exist, simply run:
    % mkdir ./RCS
Initial check-in for foo.x will then look something like this, assuming you set up the alias properly:
    % ci foo.x
    RCS/foo.x,v  <--  foo.x
    enter description, terminated with single '.' or end of file:
    NOTE: This is NOT the log message!
    >> This is my description for file foo.x
    >> .[Return]
    initial revision: 1.1
    done
    % _
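The "make sure an RCS subdirectory exists" step is easy to forget, so it can be folded into a small helper. This is a sketch only; the function name ensure_rcs_dir is hypothetical.

```shell
# ensure_rcs_dir [DIR]: create DIR/RCS unless it already exists.
# DIR defaults to the current directory.
ensure_rcs_dir() {
    dir=${1:-.}
    [ -d "$dir/RCS" ] || mkdir "$dir/RCS"
}

# Typical use before a first check-in:
#   ensure_rcs_dir . && ci -V3 -u foo.x
```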

Help, the file is now gone!

Well, it shouldn't have disappeared, but if you didn't alias ci to be "ci -u" then RCS will remove the file from the current directory. To keep things simple, we never want to use RCS in this way, so if this happens, please fix your aliases and then check the file out again immediately. But don't worry, RCS will never remove the ,v version of the file that lives in the RCS subdirectory.

Check-out

The next thing you'll probably want to do is check the file out immediately again, to continue editing what will become version 1.2 upon the next check-in:
    % co foo.x
    RCS/foo.x,v  -->  foo.x
    revision 1.1 (locked)
    done
    % _
The "(locked)" is important, and shows that you have exclusive control over editing the file now. In case someone else already grabbed the lock before you, a message like this may show up:
    % co foo.x
    RCS/foo.x,v  -->  foo.x
    co: RCS/foo.x,v: Revision 1.1 is already locked by bonehead.
    % _
If you see this, try to contact bonehead and get them to check the file back in, using the procedure below. Another thing you might see is this:
    % co foo.x
    RCS/foo.x,v  -->  foo.x
    revision 1.1 (locked)
    writable foo.x exists; remove it? [ny](n):
    % _
Say `n'! This appears if you already have the file checked out. If you check it out again, you will get back to the old version. The only time you would want to do this is if you want to cancel all of your edits since the last check-in. In this case, I strongly recommend you first run rcsdiff on the file to see what changes you are about to lose in reverting to the previous version. (See the rcsdiff example below.)
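
To see which files in a directory are probably checked out before running co again, you can look for writable working files that have an RCS archive. The sketch below relies on an assumption: files are checked in with "ci -u", which leaves the working copy read-only, so a writable copy usually means someone is editing it. The function name is hypothetical.

```shell
# list_writable_rcs_files [DIR]
# Print working files in DIR that have an RCS/NAME,v archive and whose
# owner write bit is set. Assumes check-ins were done with "ci -u".
list_writable_rcs_files() {
    dir=${1:-.}
    for archive in "$dir"/RCS/*,v; do
        [ -e "$archive" ] || continue
        name=$(basename "$archive" ,v)
        # -perm -200 matches files whose owner write bit is set
        find "$dir/$name" -prune -type f -perm -200 -print 2>/dev/null
    done
}

# Example:
#   list_writable_rcs_files /cfht/src/pegasus/cli/director
```

It does not say who holds the lock; rlog on the file shows that.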

Check-in

If you haven't made any changes, but you want to unlock the file again so someone else can check it out, just run ci and it will be smart enough to figure out that there's no need to make a "version 1.2" and it will just return the file to its safe, checked-in state:
    % ci foo.x
    RCS/foo.x,v  <--  foo.x
    file is unchanged; reverting to previous revision 1.1
    done
    % _
On the other hand, if you've made changes to the file, I strongly recommend running "rcsdiff -c" on the file first to double-check that you've only changed the parts of the file you meant to change. rcsdiff will show you what's different between the last checked-in version and the version you are about to check in. The -c option selects "context-style" output, which is shown below:
    % rcsdiff -c foo.x | less
    ===================================================================
    RCS file: RCS/foo.x,v
    retrieving revision 1.1
    diff -c -r1.1 foo.x
    *** /tmp/T0a19438       Wed Dec  2 23:29:38 1998
    --- foo.x       Wed Dec  2 23:27:55 1998
    ***************
    *** 1 ****
    --- 1,2 ----
      This is a dummy file, dummy.
    + Here's a new line I added to the end for version 1.2.
    % _
New lines are marked with a "+", removed ones with a "-" and changed ones will show up twice, marked with "!" (first the old one, then the new one.) If this all looks good, then check in the file, and summarize the changes shown by rcsdiff by typing in a line or two describing what's new.
    % ci foo.x
    RCS/foo.x,v  <--  foo.x
    new revision: 1.2; previous revision: 1.1
    enter log message, terminated with single '.' or end of file:
    >> Added a line to the file for demo purposes
    >> .[Return]
    done
    % _
If you ever receive anything different than what you see in the examples here, PLEASE do not try to force things to work. Get help and clean up the mess immediately, before things get out of hand.
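
The rcsdiff example above only shows the "+" marker. To see all three context-diff markers without touching RCS, plain "diff -c" can be run on two scratch files. This is an illustrative sketch; the file contents are made up.

```shell
# Two versions of a small file: one line changed, one removed, one added.
tmp=$(mktemp -d)
printf 'keep 1\nchange me\nkeep 2\ndelete me\nkeep 3\n' > "$tmp/old"
printf 'keep 1\nchanged\nkeep 2\nkeep 3\nadded\n'       > "$tmp/new"

# diff exits with status 1 when the files differ, which is expected here.
out=$(diff -c "$tmp/old" "$tmp/new") || true
printf '%s\n' "$out"
rm -rf "$tmp"
```

In the output, "change me" and "changed" should each carry a "!", "delete me" a "-", and "added" a "+".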

One final note on check-in: To keep things simple, I recommend avoiding the branching feature of RCS versioning.

Rlog and RCS keywords

If you insert a $Log$ token in a comment near the top of your file, RCS will maintain a log message history below it. This can quickly lead to files with several pages of log messages and a tiny bit of code at the bottom, so you may prefer not to include $Log$ in the file itself. You can always access the log information separately using the rlog command:
    % rlog foo.x | less
    RCS file: RCS/foo.x,v
    Working file: foo.x
    head: 1.2
    branch:
    locks: strict
    access list:
    symbolic names:
    keyword substitution: kv
    total revisions: 2;     selected revisions: 2
    description:
    This is my description for file foo.x
    ----------------------------
    revision 1.2
    date: 1998/12/02 23:32:00;  author: isani;  state: Exp;  lines: +1 -0
    Added a line to the file for demo purposes
    ----------------------------
    revision 1.1
    date: 1998/12/02 23:17:40;  author: isani;  state: Exp;
    Initial revision
    =============================================================================
    % _
Even if you decide not to include $Log$ at the top of your file, it will be useful to put $Id$ in a comment near the top of the file. RCS will change this key into a useful summary of information about the file, and will automatically keep it current each time you co and ci the file. In C code, you might also want to use the RCSID(); macro defined in <cfht/cfht.h> so that compiled .o files contain a string indicating from which version of the C file they were generated. In any case, try to include $Id$ in at least some kind of comment near the top of the file so that it is obvious to anyone editing that the file is under RCS. Here's a short C file that shows what $Id$'s look like when you insert them:
    /*
     * Copyright, blah blah blah blah
     * $Id$
     */
    
    #include <cfht/cfht.h>
    
    RCSID("$Id$");
    
    /* ...code starts here */
And here is how they look after RCS has replaced them:
    /*
     * Copyright, blah blah blah blah
     * $Id: test.c,v 1.1 98/12/03 17:36:24 isani Locked $
     */
    
    #include <cfht/cfht.h>
    
    RCSID("$Id: test.c,v 1.1 98/12/03 17:36:24 isani Locked $");
    
    /* ...code starts here */
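A quick way to find files that still lack the keyword is to search for it. The sketch below is one possible check, with a hypothetical function name; it accepts both the bare $Id$ form and the expanded "$Id: ... $" form.

```shell
# missing_id FILE...: print each FILE that contains neither "$Id$" nor an
# expanded "$Id: ... $" keyword.
missing_id() {
    for f in "$@"; do
        grep -q '\$Id[$:]' "$f" || printf '%s\n' "$f"
    done
}

# Example:
#   missing_id *.c *.h
```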

RCS and Emacs

If you use Emacs or XEmacs, you should be able to use the key-strokes ^X-v-v to lock and unlock a file instead of running co and ci from the command line. This way you probably won't get the -V3 option, but I don't think it is very important except for things like consistent log formatting in older files. -V3 also affects the formatting of the information in a $Header$ keyword. Files that have $Header$ should really have $Id$ instead, and there should be no problem changing them to $Id$'s in which case the -V3 option makes no real difference either way.

Summary of Check-in/Check-out

mkdir ./RCS         Creates the RCS subdirectory. Do this once, before the first check-in.
ci foo.x            Checks the file in and unlocks it (with the alias "ci -V3 -u").
co foo.x            Checks the file out and locks it for editing (with the alias "co -V3 -l").
rcsdiff -c foo.x    Shows a context diff between the last checked-in version and the working file.
rlog foo.x          Shows the description and the log message of each revision.


Releases

Pegasus only has ways of making an entire release of all the software in the development tree, and making a binary release of everything used by a specific observing account. These are covered in another document. Releases or backups of source trees can be made on a smaller scale using the more manual methods below.

Archival Back-ups

Since we currently have no release system that can be applied per project in the Pegasus source trees, and since RCS only works on the level of one file at a time, here are some ways to make backups of an entire source tree manually. Whether you use the advice given here or not, please be sure that the commands you use preserve ownership, permissions, file time stamps, and file types! One easy way to do this is to use the GNU version of cp, which is installed as "gcp" site-wide. When used with the -a option ("archive"), it makes an exact backup of the original file. The -v option ("verbose") is also recommended. For example, you can use it to make a backup before editing a text file:
    % ls -latr testfile.par*
    ... 6239 Jul 21 02:02 testfile.par
    % gcp -av testfile.par testfile.par.LAST
    % edit testfile.par
    % ls -latr testfile.par*
    ... 6239 Jul 21 02:02 testfile.par.LAST
    ... 6239 Dec  9 12:31 testfile.par
    % _
The -latr options for ls cause it to list all files in long format, sorted with the most recently touched files at the bottom of the listing. You can see the importance of always copying with "gcp -a", so that the information from "ls -latr" makes sense. You can use timestamps in combination with the "find" command as well. Say you were about to start hacking on files, and you want to be able to generate a list of the files you messed with after you're done. "ls -latr" will quickly show you files at the current directory level, but if the files are scattered throughout a tree, "find" is more appropriate. If you create a marker file before you start editing, or use "touch" to create it with the appropriate time-stamp after, you can use the -newer option to locate all the files that have been modified:
    % touch MARKER
    ... Edit a bunch of files from the current directory down ...
    % gfind -newer MARKER -print
    ... Lists all the files you edited (all the files modified after MARKER
        was created) ...
You might also be able to use a recent .tgz file as the marker. Just remember that gfind will list all files with a modification date later than that of the marker file.
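
The marker technique can be wrapped up as below. This is a sketch: the function name changed_since and the FIND override are hypothetical, and plain "find" is used by default on the assumption that it supports -newer (the text above uses GNU find, installed as "gfind").

```shell
# changed_since MARKER [DIR]: list regular files under DIR (default: the
# current directory) modified more recently than MARKER.
# FIND is a hypothetical override, e.g. FIND=gfind.
changed_since() {
    "${FIND:-find}" "${2:-.}" -type f -newer "$1" -print
}

# Creating the marker after the fact with an explicit time stamp
# (format [[CC]YY]MMDDhhmm), here 9 Dec 1998 08:00:
#   touch -t 199812090800 MARKER
#   changed_since MARKER
```

The marker itself is never listed, since a file is not newer than itself.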

GNU copy can also be used to make copies of entire subdirectories, keeping timestamps, symbolic links, and permissions intact.

    % cd /cfht/src/medusa
    % gcp -av myproject myproject-snapshot-981111
    myproject -> myproject-snapshot-981111
    myproject/Makefile -> myproject-snapshot-981111/Makefile
    myproject/foo.c -> myproject-snapshot-981111/foo.c
    .
    .
    .
    % _
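The snapshot naming used above can be automated with a small wrapper. This is a sketch under assumptions: the function name snapshot and the GCP override are hypothetical, and the default "cp" is assumed to accept -a the way GNU cp does. The -v option is left out so that the only output is the name of the new directory.

```shell
# snapshot DIR: copy DIR to DIR-snapshot-YYMMDD, preserving ownership,
# permissions, time stamps and symbolic links, and print the new name.
# GCP is a hypothetical override, e.g. GCP=gcp.
snapshot() {
    src=${1%/}
    dest="$src-snapshot-$(date +%y%m%d)"
    if [ -e "$dest" ]; then
        echo "snapshot: $dest already exists" >&2
        return 1
    fi
    "${GCP:-cp}" -a "$src" "$dest" && printf '%s\n' "$dest"
}

# Example:
#   cd /cfht/src/medusa && snapshot myproject
```

Refusing to reuse an existing destination avoids the surprise of the copy landing inside an older snapshot.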
However, if your Makefile is set up correctly, there is a better way to make a backup of a project. Let's say your project subdirectory is called "fuzzbuster/"; then /cfht/src/medusa/fuzzbuster/ (physically on machine saturn) should always be the latest copy of your project files. This should also be the only place where the RCS/ subdirectories appear, otherwise you could end up with two different copies of the same version of the same file. To make a snap-shot of the entire tree, not including the RCS directories, you can use the following target provided by Make.Common (see Makefile conventions to make sure your Makefile properly includes Make.Common for this to work).
    % cd /cfht/src/medusa/fuzzbuster/
    % make tar
    % cd ..
    % ls -l fuzzbuster-VERSION.tgz
Once you've created a versioned archive, don't make any further changes under that version number! Also, be sure to save any buffers in your editor before making the tar file.
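
For a directory that does not use Make.Common, a similar archive can be made by hand. The sketch below is a rough stand-in for "make tar", not the actual Make.Common recipe: the function name tar_project and the GTAR override are hypothetical, and it assumes a tar that supports --exclude, as GNU tar does.

```shell
# tar_project DIR VERSION: write DIR-VERSION.tgz next to DIR, leaving out
# any RCS/ subdirectories. Not the real "make tar" recipe.
# GTAR is a hypothetical override, e.g. GTAR=gtar.
tar_project() {
    src=${1%/}
    parent=$(dirname "$src")
    name=$(basename "$src")
    # Run from the parent so that member names start with "NAME/".
    ( cd "$parent" &&
      "${GTAR:-tar}" --exclude=RCS -czf "$name-$2.tgz" "$name" )
}

# Example:
#   tar_project /cfht/src/medusa/fuzzbuster 1.0
```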

Be careful when extracting the .tgz file, because if you do it in /cfht/src/medusa/, it could overwrite a newer version of itself. One way to avoid this is to temporarily rename the project directory to project-VERSION before running "make tar". You can even leave the directory name with a version number in it (Make.Common will automatically strip off the -VERSION part, if it exists) and make a symlink pointing to it without the version. In any case, before extracting a file, you should always look at the contents first, and if necessary, create a subdirectory, cd to it, and untar there. To look at the contents, use the following:

    % gtar tzvf fuzzbuster-VERSION.tgz | less
To check if the tar file matches what is on disk you can use:
    % gtar dzvf fuzzbuster-VERSION.tgz | less
To do a full context-style comparison between two versions, extract them both in a scratch space and use gdiff:
    % mkdir TMP
    % cd TMP
    % gtar xzvf ../fuzzbuster-VERSION-A.tgz
    % mv fuzzbuster fuzzbuster-VERSION-A
    % gtar xzvf ../fuzzbuster-VERSION-B.tgz
    % gdiff -rc fuzzbuster-VERSION-A fuzzbuster | less
Or to compare the file with the version on disk:
    % gdiff -rc fuzzbuster-VERSION-A ../fuzzbuster | less
Aside: The output of "gdiff -rc" is human-readable, but is also suitable for input to the "patch" program for making incremental upgrades.
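
The "create a subdirectory and untar there" advice can be reduced to one command. This is a sketch: the function name extract_scratch and the GTAR override are hypothetical, and the scratch directory defaults to TMP as in the example above.

```shell
# extract_scratch ARCHIVE [SCRATCH]: unpack ARCHIVE inside a scratch
# directory (default ./TMP) instead of on top of the live source tree.
# GTAR is a hypothetical override, e.g. GTAR=gtar.
extract_scratch() {
    scratch=${2:-TMP}
    mkdir -p "$scratch" &&
    "${GTAR:-tar}" -xzf "$1" -C "$scratch"
}

# Example, followed by a comparison against the tree on disk:
#   extract_scratch fuzzbuster-VERSION.tgz
#   gdiff -rc TMP/fuzzbuster fuzzbuster | less
```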

If a certain version of a program is in use by observing accounts, but source in /cfht/src/ doesn't match it, because it is under development, it is advisable to keep a .tgz file at the same level as the project directory with a clearly marked version. For example, if we are running rpm-2.0.4 as the Web server at the summit, but the version in the pegasus source tree is rpm-2.0.5, then it is useful to be able to quickly access the source for 2.0.4 if there is a problem with the version in use. If some bug is showing up in 2.0.4 that has been fixed in 2.0.5, it will be very confusing looking at the source code for 2.0.5! For this reason, /cfht/src/pegasus/ should contain at least the .tgz file rpm-2.0.4.tgz and any other versions that are in use by anything critical. Older .tgz files can be archived on tape or CDROM and removed from the pegasus source tree.

Summary of Archiving/File maintenance commands

ls -latr                   Lists all files with the most recent ones at the bottom.
gfind -newer somefile      Lists files modified after somefile.
gcp -av original backup    Creates exact backup copies of files or directories.
gdiff -rc first second     Generates context diff output of files or directories.
make tar                   Creates a compressed tar image of a project, excluding RCS. Use this just before upgrading the VERSION number.
gtar czvf FILE.tgz FILES   Creates compressed tar file. (Use "make tar" on source directories.)
gtar dzf FILE.tgz          Lists differences between files on disk and tar file.
gtar tzvf FILE.tgz         Lists everything in the tar file.
gtar xzvf FILE.tgz         Extracts the files into the current directory, overwriting anything in the way.

Sidik Isani
Last modified: Mon Dec 14 10:21:20 HST