Starlab Internals

Starlab stories

The ``story'' data structure is an integral part of Starlab. A story is simply an unstructured piece of text, of arbitrary length, which may be associated with any Starlab node. Internally, stories are implemented as linked lists of character strings organized into lines and chapters, making it easy to extend and/or edit their contents. However, from the programmer's point of view, they are most usefully thought of as generalized ``scratchpads'', in which lines of text or pieces of data specific to a particular node can be stored and retrieved via library functions.

Conventionally, Starlab nodes contain two stories (strictly, story pointers): the ``log'' story, in which specific events or points of interest in the history of a node may be noted, and the ``dyn'' story, intended to store dynamical variables that are not part of any of the standard classes (see ``structure''). For any node, the member functions get_log_story() and get_dyn_story() return pointers to the two stories. (The pointers may be NULL if no story data exist.) In addition, the starbase class includes a ``star'' story pointer (accessed by the starbase member function get_star_story()) and the hydrobase class includes a ``hydro'' story pointer (accessed by get_hydro_story()).

A line of text may be appended to the log story via the member function log_comment(). A typical use might be

        b->log_comment("Close encounter with black hole #7 at time 10 Myr");

The intent is that the log story forms a sequential record of key events in a node's history. A specific piece of data may be written to a story in the form of a ``keyword = value'' line by use of the ``put'' functions:

        putiq(story *,  char *, int)           // write an int
        putulq(story *, char *, unsigned long) // write an unsigned long
        putrq(story *,  char *, real)          // write a real
        putra(story *,  char *, real *, int)   // write a real array
        putsq(story *,  char *, char *)        // write a character string
        putvq(story *,  char *, vector)        // write a vector

For example, the function call

        putiq(b->get_dyn_story(), "N_orbits", 2);

writes the line

        N_orbits  =  2

to node b's dyn story. The data may subsequently be accessed via the corresponding ``get'' functions:

        int            getiq(story *, char *)
        unsigned long  getulq(story *, char *)
        real           getrq(story *, char *)
        void           getra(story *, char *, real *, int)
        char *         getsq(story *, char *)
        vector         getvq(story *, char *)

e.g.

        if (getiq(b->get_dyn_story(), "N_orbits") > 0) ...

Unlike log_comment(), which always appends a new line, putiq() overwrites the existing line for a keyword if there is one, and creates a new story line otherwise. To check whether a line already exists, use the find_qmatch() function:

        if (find_qmatch(b->get_dyn_story(), "N")) {
            // an "N = " line already exists
        } else {
            // no "N = " line in the story
        }
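The difference between appending and overwriting can be illustrated with a small self-contained model. The sketch below is not Starlab code: ToyStory and the toy_* functions are hypothetical stand-ins for story, log_comment(), find_qmatch(), putiq(), and getiq(). The fallback argument of toy_get_int() is also an assumption, since the behavior of the real ``get'' functions for a missing keyword is not described here.

```cpp
#include <list>
#include <sstream>
#include <string>

// Hypothetical stand-in for a Starlab story: a linked list of text lines.
struct ToyStory {
    std::list<std::string> lines;
};

// Return the keyword of a "keyword = value" line, or "" for free text.
// On success, value receives the text that follows the '='.
static std::string split_line(const std::string& line, std::string& value) {
    value.clear();
    std::istringstream in(line);
    std::string key, eq;
    if (!(in >> key >> eq) || eq != "=") return "";
    std::getline(in >> std::ws, value);
    return key;
}

// Analogue of log_comment(): always appends a new line.
void toy_add_line(ToyStory& s, const std::string& text) {
    s.lines.push_back(text);
}

// Analogue of find_qmatch(): the line holding this keyword, or null.
std::string* toy_find(ToyStory& s, const std::string& key) {
    if (key.empty()) return nullptr;
    for (std::string& line : s.lines) {
        std::string value;
        if (split_line(line, value) == key) return &line;
    }
    return nullptr;
}

// Analogue of putiq(): overwrite the matching line, else append a new one.
void toy_put_int(ToyStory& s, const std::string& key, int value) {
    std::string text = "  " + key + "  =  " + std::to_string(value);
    if (std::string* line = toy_find(s, key)) *line = text;
    else s.lines.push_back(text);
}

// Analogue of getiq(): parse the stored value.
// Returning a caller-supplied fallback for a missing key is an assumption.
int toy_get_int(ToyStory& s, const std::string& key, int fallback) {
    std::string* line = toy_find(s, key);
    if (!line) return fallback;
    std::string value;
    split_line(*line, value);
    return std::stoi(value);
}
```

In this model, calling toy_put_int() twice with the keyword "N_orbits" leaves a single "N_orbits" line holding the second value, whereas two calls to toy_add_line() always produce two lines.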

Stories thus provide a highly flexible (if rather inefficient) means of getting data from one part of a program to another, without the need to create a specific route for it to take (e.g. parameters in a sequence of function calls, or extern data, or FORTRAN common blocks). In typical use, a function may write some (real) data to the dyn story of node b using

        putrq(b->get_dyn_story(), "an_interesting_number", value);

Later (or in another program), the data can be retrieved by

        result = getrq(b->get_dyn_story(), "an_interesting_number");

The value of such a general mechanism in a complex programming environment should be obvious. Many Starlab tools rely on this approach.

It is possible to write either type of story line to any type of story; the more general function add_story_line(story*, char*) appends a string to any story. However, as a matter of style, we prefer not to write unstructured text to the dyn, star, or hydro stories, and to limit the use of the ``keyword = value'' format in log stories to the root node only (where it is heavily used in passing initial and run-time information from one program to another -- see for example the output of any of the ``make...'' tools that create N-body systems).

Once written, stories remain part of a node's ``permanent record'' until they are explicitly deleted or the node is destroyed (in which case the story data may yet live on in the story of the parent or root node). When the system is written out using put_node(), all stories are saved as plain text along with the ``hard-coded'' class data; similarly, stories are read in and reconstructed when get_node() is used. In this way, the stories survive as data move from one program to another, or from one segment of a long simulation to the next.

Internal and external data representations

Internally, Starlab data consists of ``standard'' class variables, for which memory is explicitly allocated, and free-format stories of variable length, which may themselves contain additional variables in text form. By contrast, Starlab's external data representation treats all data as stories, i.e. all data written by put_node() are in ``keyword = value'' text form. On input, get_node() interprets and sets known class variables from the story line; everything else remains in story form.
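The input rule just described (recognized keywords set class variables, everything else stays in story form) can be sketched as follows. This is an illustration only, not the get_node() implementation: ToyDyn and read_dyn_line() are hypothetical names, and only two keywords (m and r) are treated as known class variables.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical node part with two "hard-coded" variables and a dyn story.
struct ToyDyn {
    double mass = 0.0;
    double pos[3] = {0.0, 0.0, 0.0};
    std::vector<std::string> dyn_story;   // unrecognized lines, kept as text
};

// Interpret one line of external data.
void read_dyn_line(ToyDyn& d, const std::string& line) {
    std::istringstream in(line);
    std::string key, eq;
    in >> key >> eq;
    if (eq == "=" && key == "m") {
        in >> d.mass;                            // known class variable
    } else if (eq == "=" && key == "r") {
        in >> d.pos[0] >> d.pos[1] >> d.pos[2];  // known class variable
    } else {
        d.dyn_story.push_back(line);             // remains in story form
    }
}
```

Feeding the lines "m  =  0.5" and "N_orbits  =  2" to read_dyn_line() sets mass from the first and stores the second, unchanged, in dyn_story.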

Externally, a node is represented as simple (more or less) human-readable text. For example, part of the output of an N-body calculation might look like:

        (Particle
          i = 4
          N = 1
        (Log
        Close encounter with black hole #7 at time 10 Myr
        )Log
        (Dynamics
          N_orbits  =  2
          an_interesting_number  =  42
          m  =  0.5
          r  =  -0.1  0.2   0.5
          v  =  0.3  -0.4  -0.3
        )Dynamics
        (Hydro
        )Hydro
        (Star
          Type   =  main_sequence
          T_cur  =  0
          M_rel  =  1
          M_env  =  0.99
          M_core =  0.01
          T_eff  =  6000
          L_eff  =  1
        )Star
        )Particle

This snapshot fragment represents a single node, delimited by the matching ``(Particle'' and ``)Particle'' lines. Following the opening (Particle come lines containing node class information: the particle's index i and the total number of particles N ``contained'' within this node -- 1 for a ``leaf,'' 2 for a binary center-of-mass node, and so on. The remainder of the data is divided into four stories:

Log	`(Log ... )Log` Contains precisely the log story of the node, line for line.

Dynamics	`(Dynamics ... )Dynamics` Contains the dyn story of the node (if any) and the values of all true class variables associated with the dynamics: mass (`m`), position (`r`), velocity (`v`), etc., in ``keyword = value'' form. (Mass is actually part of the `node` class, but it seems more natural not to separate it from the `dyn` variables here.) On input, `m`, `r`, `v`, etc. are interpreted and stored in the appropriate class location; everything else is saved in the dyn story.

Hydro	`(Hydro ... )Hydro` Contains the hydro story and any `hydrobase` class data for the node (in this case, no hydro part exists). As with the Dynamics story, any data corresponding to a `hydrobase` class member is interpreted and stored appropriately; the remainder becomes the node's hydro story.

Star	`(Star ... )Star` Contains the star story and any `starbase` class data for the node. As with the Hydro story, any data corresponding to a `starbase`, `single_star`, or `double_star` class member is interpreted and stored; the remainder becomes the node's star story.
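As a rough illustration of the output side, the following sketch writes a single node in the external format described above. It is not put_node(): ToyNode and write_toy_node() are hypothetical, the Hydro and Star parts are omitted, and the order of lines inside the Dynamics block simply follows the sample snapshot.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, much-reduced node; not the Starlab node class.
struct ToyNode {
    int index = 0;                        // written as "i"
    int n_leaves = 1;                     // written as "N"
    double mass = 0.0;                    // written as "m"
    std::vector<std::string> log_story;   // free text, one entry per line
    std::vector<std::string> dyn_story;   // "keyword = value" lines
};

void write_toy_node(std::ostream& out, const ToyNode& b) {
    out << "(Particle\n";
    out << "  i = " << b.index << "\n";
    out << "  N = " << b.n_leaves << "\n";
    out << "(Log\n";
    for (const std::string& line : b.log_story) out << line << "\n";
    out << ")Log\n";
    out << "(Dynamics\n";
    for (const std::string& line : b.dyn_story) out << line << "\n";
    out << "  m  =  " << b.mass << "\n";  // class variable joins the story lines
    out << ")Dynamics\n";
    out << ")Particle\n";
}
```

The point of the sketch is that, on output, class variables such as the mass and free-format story lines end up side by side in the same block, in the same textual form.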

The internal tree structure is expressed by nesting the external node representations. Thus, if a second (Particle line is found before the delimiting )Particle for the current node, then the new particle is taken to be the daughter of the first, and so on, recursively. In this way the entire Starlab tree structure is faithfully preserved. For example,

        (Particle        <--------.
          N = 1                   |
        (Log                      |
        )Log                      |
        (Dynamics                 |
          m  =  1                 |
        )Dynamics                 |
        )Particle        <--------'
        (Particle        <--------.
          N = 1                   |
        (Log                      |
        )Log                      |
        (Dynamics                 |
          m  =  1                 |
        )Dynamics                 |
        )Particle        <--------'
        (Particle        <--------.
          N = 1                   |
        (Log                      |
        )Log                      |
        (Dynamics                 |
          m  =  1                 |
        )Dynamics                 |
        )Particle        <--------'

represents three sister nodes at the same level, while

        (Particle        <----------------.
          N = 2                           |
        (Log                              |
        )Log                              |
        (Dynamics                         |
          m  =  1                         |
        )Dynamics                         |
        (Particle        <--------.       |
          N = 1                   |       |
        (Log                      |       |
        )Log                      |       |
        (Dynamics                 |       |
          m  =  0.5               |       |
        )Dynamics                 |       |
        )Particle        <--------'       |
        (Particle        <--------.       |
          N = 1                   |       |
        (Log                      |       |
        )Log                      |       |
        (Dynamics                 |       |
          m  =  0.5               |       |
        )Dynamics                 |       |
        )Particle        <--------'       |
        )Particle        <----------------'

represents a parent and two daughters (arrows added for clarity).
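The nesting rule lends itself to a short recursive reader. The sketch below is a simplified model, not get_node(): ToyTreeNode, read_particle(), and read_all() are hypothetical names, the delimiters are assumed to stand alone on their lines without indentation, and all other lines (including the Log and Dynamics blocks) are kept as raw text.

```cpp
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical tree node: raw content lines plus daughters.
struct ToyTreeNode {
    std::vector<std::string> lines;
    std::vector<std::unique_ptr<ToyTreeNode>> daughters;
};

// Read one node; the opening "(Particle" line has already been consumed.
std::unique_ptr<ToyTreeNode> read_particle(std::istream& in) {
    std::unique_ptr<ToyTreeNode> node(new ToyTreeNode);
    std::string line;
    while (std::getline(in, line)) {
        if (line == "(Particle") {
            // A new particle before our own ")Particle": it is a daughter.
            node->daughters.push_back(read_particle(in));
        } else if (line == ")Particle") {
            return node;                  // end of this node
        } else {
            node->lines.push_back(line);  // class data or story text
        }
    }
    return node;                          // input ended early
}

// Read every top-level node in the stream.
std::vector<std::unique_ptr<ToyTreeNode>> read_all(std::istream& in) {
    std::vector<std::unique_ptr<ToyTreeNode>> nodes;
    std::string line;
    while (std::getline(in, line)) {
        if (line == "(Particle") nodes.push_back(read_particle(in));
    }
    return nodes;
}
```

Applied to the first example above, read_all() would return three nodes with no daughters; applied to the second, it would return one node with two daughters.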

Finally, the ASCII representation of Starlab data is quite inefficient in terms of space, although we have not yet found disk usage to be a limiting factor in our simulations. Should it become necessary, on some systems (for example Linux, but unfortunately not the DEC UNIX systems that host GRAPEs) the data can be compressed and uncompressed automatically using gzip and gunzip, without having to type these commands explicitly on the command line. The requirement is that the non-standard include file "pfstream.h" be available. Use of automatic compression is controlled by the following environment variables, set in local/cshrc.starlab:

#	STARLAB_HAS_GZIP		Define if gzip is available AND if
#					pfstream.h exists; else set to null
#	STARLAB_USE_GZIP		(RUNTIME) Use compressed I/O if set

See ``installation'' for more details.
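For illustration, a run-time test of such a switch might look like the sketch below. How Starlab itself tests STARLAB_USE_GZIP is not described here, so the helper names env_is_set() and want_compressed_io() are hypothetical, as is the choice to treat a variable that is defined but empty as ``set''.

```cpp
#include <cstdlib>

// True if the environment variable is defined at all (even if empty).
// Treating an empty value as "set" is an assumption of this sketch.
bool env_is_set(const char* name) {
    return std::getenv(name) != nullptr;
}

// Hypothetical helper: should compressed I/O be used for this run?
bool want_compressed_io() {
    return env_is_set("STARLAB_USE_GZIP");
}
```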