.\" -*- nroff -*-
.TH Properties 3 "7 October 1998" "CS 216" "Properties documentation"
.\" Turn off hyphenation
.hlm 0
.\" Turn off adjustment
.na
.SH NAME
Properties, proptest \- properties manager class and its test driver
.SH SYNOPSIS
.B "~nmoor0/CS216/bin/proptest"
.RI "[" filename "]"
.\" Back to standard adjustment
.ad
.\" Allow hyphenation, omit if you want no hyphenation in the man page
.hlm 1

.SH AVAILABILITY
.PP
This program may be found on cslab, in the directory ~nmoor0/CS216/bin/.

.SH "ASSOCIATED FILES"
Properties.man (this file) may be found in ~nmoor0/CS216/man/ on any cslab
machine.  A test file, proptest.prop, may be found in ~nmoor0/CS216/misc/,
again on any cslab machine.

.SH DESCRIPTION
.B proptest
treats its argument as a filename; if not provided with an argument,
it defaults to \(lqproptest.prop\(rq.  Each line of that file should
follow one of the following formats:

.RS
.RI [ ws ] property [ ws "]=[" ws "][" value ]
.br
.RI [ ws "] #" "comment"
.br
.RI [ ws ]
.RE

.PP
Each \fIproperty\fR begins with an alphabetic character, followed by any number
of alphanumeric characters and/or dots (\(lq.\(rq).  Two or more dots may not
appear consecutively, and the property name may not end with a dot.

.PP
The property list is loaded from the file into memory, where it may be operated
upon by the user.  Each property has associated with it a string value (which
may be empty).
.B proptest
presents the user with a menu of operations.  These include printing the entire
property list to the screen, saving it to a file, retrieving the value of a
particular property, testing for the existence of a property, adding (or
updating) a property, and removing a property.  When the user enters a command,
.B proptest
prompts the user for the needed parameters, tests for the validity of these
parameters, and performs the operation requested.  The program handles errors
somewhat gracefully; it displays a concise error message and returns to the
prompt.


.SH "RETURN VALUE"
.B proptest
exits with status 0 on success and 1 otherwise.  Currently no error is
treated as fatal, so
.B proptest
always returns 0.

.SH "NOTES ON IMPLEMENTATION"

.PP
.B proptest
is intended as a driver program for the
.B Properties
class; this class implements a properties manager.
.B proptest
itself is rather simple.  The
.B main()
function consists of a loop which reads a command from stdin, dispatches that
command to a
.B docommand()
function, and loops (assuming the user did not type \(lqquit\(rq).
.B docommand()
reads additional parameters from stdin and calls the appropriate function in
the
.B Properties
class.  It catches any exceptions that may be thrown (so that the program does
not abruptly terminate).

.PP
.B Properties
consists of a number of methods.  These include constructors, a destructor, the
assignment operator,
.BR load() ","
a number of property retrieval functions,
.BR addProperty() ", " remove() ", " isDefined() ", " save() ", and " valid_property() .
In addition, there is the associated
.B ostream <<
operator.

.PP
.BR load() ","
as the name suggests, loads into memory a property file (in the format 
outlined in the
.B DESCRIPTION
section above), the filename for which is indicated by its
.B const char *
argument.  Properties in memory are stored as a linked list of
.RI ( property ", " value )
pairs (this implementation detail may be changed in the future; STL
.BR map s
are a quite attractive alternative).  Each line is parsed by
.B match_prop()
(a protected static member function\(emsee below).
.B load()
prints a warning message to stderr when it encounters an invalid line
or property name.  This behaviour would ideally be generalised, such
that
.BR Properties "\(aqs"
constructor accepted as an argument a function object, which would be
called for each error (we cannot use exceptions, as we wish to
continue processing further lines in the event of an error).  Such a
behaviour may be implemented in a future version of this class.

.PP
The
.B Properties
class contains a number of functions to retrieve the value of a
property.  The most basic of these is
.BR getValueFor() ,
which takes as its argument a property name.  It returns the value
associated with that property (always a string); if the property is
undefined, it throws a
.B NotFound
exception (see below).  There are other functions
.RB ( getBoolValueFor() ", " getIntValueFor() ", and " getDoubleValueFor() )
which interpret the value of a property as a particular type
.RB ( bool ", " int ", and " double ", respectively)."
If the property does not fit the type requested, these methods throw a
.B BadFormat
exception (see below).  
.B getIntValueFor()
interprets values as
.B strtol()
would (a leading \(lq0x\(rq indicates hexadecimal, a leading \(lq0\(rq octal).
.B getDoubleValueFor()
uses
.B strtod()
to make the conversion (thus accepting scientific notation such as
\(lq1.3E-4\(rq).  Finally,
.B getBoolValueFor()
interprets \(lq1\(rq, \(lqon\(rq, or \(lqtrue\(rq (any capitalisation) as
.BR true ,
and \(lq0\(rq, \(lqoff\(rq, or \(lqfalse\(rq as
.BR false .
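.PP
The conversions described above can be sketched roughly as follows.  This
is an illustration only, not the code of the
.B Properties
class: the helper names
.B parse_int()
and
.B parse_bool()
are hypothetical, and they return
.B false
where the class would throw
.BR BadFormat .
Passing base 0 to
.B strtol()
is what makes a leading \(lq0x\(rq or \(lq0\(rq select hexadecimal or octal.
.RS
.nf
```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse an integer the way strtol() does with base 0.
// Returns false where the real class would throw BadFormat.
bool parse_int(const std::string &value, long &out)
{
    const char *start = value.c_str();
    char *end = 0;
    errno = 0;
    long n = std::strtol(start, &end, 0);
    // Reject empty input, trailing junk, and out-of-range values.
    if (end == start || *end != 0 || errno == ERANGE)
        return false;
    out = n;
    return true;
}

// Hypothetical helper: accept 1/on/true and 0/off/false in any capitalisation.
bool parse_bool(const std::string &value, bool &out)
{
    std::string v;
    for (std::string::size_type i = 0; i < value.size(); ++i)
        v += static_cast<char>(std::tolower(static_cast<unsigned char>(value[i])));
    if (v == "1" || v == "on" || v == "true") { out = true; return true; }
    if (v == "0" || v == "off" || v == "false") { out = false; return true; }
    return false;
}
```
.fi
.RE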


.PP
The
.B addProperty()
method adds a
.RI ( property ", " value )
pair to the property list.  It uses
.B valid_property()
(see below) to test the validity of the property name.  If the property name is 
invalid, it throws a
.B BadProperty
exception.  If the indicated property already exists in the property list,
.B addProperty()
updates its value\(emthe same property may not appear twice in the list.
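.PP
A minimal sketch of this update-or-append behaviour over a linked list is
given below.  The names
.B PropList
and
.B add_or_update()
are hypothetical, a
.B std::list
stands in for whatever list type the class really uses, and the name
validation step is omitted.
.RS
.nf
```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical stand-in for the internal list of (property, value) pairs.
typedef std::list<std::pair<std::string, std::string> > PropList;

// Update the value if the property is already present; otherwise append it.
// The linear scan is what makes the operation O(n).
void add_or_update(PropList &props, const std::string &name,
                   const std::string &value)
{
    for (PropList::iterator it = props.begin(); it != props.end(); ++it) {
        if (it->first == name) {
            it->second = value;   // existing property: replace its value
            return;
        }
    }
    props.push_back(std::make_pair(name, value));
}
```
.fi
.RE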

.PP
.BR remove() ,
surprisingly enough, removes a pair (specified by property name) from the
property list.  If the specified property does not exist,
.B remove()
throws a
.B NotFound
exception.

.PP
.B isDefined()
is quite simple; it returns
.B true
if the specified property is in the property list,
.B false
otherwise.  Because the property list is stored as a standard linked list, this
method (as well as most of the others) is
.RI O( n )
with respect to the number of properties in the list.

.PP
.B save()
writes the property list to a file suitable for use with
.BR Properties::load() .
It utilises
.B ostream operator<<()
(see below) to do the actual output.  Note that
.BR load() ing
a file and then
.BR save() ing
it loses information (such as comments, blank lines, and redefinitions).

.PP
.B valid_property()
(a static member function) simply tests the validity of a property name.  The
proper syntax for property names is described in the
.B DESCRIPTION
section above.  This method basically tacks \(aq=\(aq to the end of
the property name, and hands the result off to
.BR match_prop() .
.B match_prop()
itself is a protected static member function.  It attempts to match the
property part of a
.IR property = value
line.  This function matches the ed-style regular expression:
.RS
\fC^[ \\t]*\\([a-z][a-z0-9]*\\(\\.[a-z0-9][a-z0-9]*\\)*\\)[ \\t]*=\fR
.RE
This regexp corresponds to optional leading whitespace, followed by at least
one alphabetic character, followed by any number of alphanumeric characters,
followed by optional whitespace and a necessary equals sign; dots may be
interspersed within the alphanumeric sequence, but two dots may not occur
adjacently, and the last character of that sequence must not be a dot.
The text enclosed within the outer \fC\\(\fR . . . \fC\\)\fR (which corresponds to the
property name) is placed in the string to which
.BR match_prop() \(aqs
second parameter refers.  The value returned by
.B match_prop()
indicates the total number of characters matched (including whitespace and the
equals sign).  If the line is invalid, it returns 0 and sets its second (string
reference) argument to the empty string.

.PP
Rather than using a more obvious ad-hoc parser,
.B match_prop()
uses a deterministic finite automaton (DFA) to match the string.  We do not
implement a full regular expression handler, however: the state table for the
above regexp is hard-coded into the program.  The algorithm here is basically:
.RS
.IP 1
.I state
= 0;
.I prop
= the empty string;
.RI ( prop
is the reference argument which will hold the matched property name.)
.IP 2
if we are at the end of the string, return failure
.IP 3
.I in
= code representing the type of the next character (alphabetic, numeric, dot,
whitespace, equals sign, or other).
.IP 4
.I state
=
.IR next_state ( state ", " in )
.IP 5
if
.I state
is a failing state, return failure
.IP 6
if
.I state
is an accepting state, return success
.IP 7
if
.I in
is not \(lqwhitespace\(rq, append the current character to
.IR prop .
.IP 8
Go to step 2
.RE
Note that this algorithm differs from 
.BR grep (1)
and
.BR ed (1)
with respect to certain parts of its matching.  Notably, it quits as soon as it
reaches an accepting state, rather than trying to find a maximal match.
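.PP
The steps above can be sketched as a small table-driven matcher.  This is
one possible reading of the algorithm, not the state table actually
compiled into the class: the state names, the function names
.B char_type()
and
.BR match_prop_sketch() ,
and the decision to accept upper-case letters as alphabetic are all
assumptions made for the example.
.RS
.nf
```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum CharType { ALPHA, DIGIT, DOT, WS, EQUALS, OTHER, NUM_TYPES };
enum State { START, NAME, AFTER_DOT, TRAILING_WS, FAIL, ACCEPT };

// Classify one character (the role the text assigns to chartypeof()).
CharType char_type(char c)
{
    unsigned char u = static_cast<unsigned char>(c);
    if (std::isalpha(u)) return ALPHA;
    if (std::isdigit(u)) return DIGIT;
    if (c == '.') return DOT;
    if (std::isblank(u)) return WS;   // space or tab
    if (c == '=') return EQUALS;
    return OTHER;
}

// Hard-coded transition table: rows are states, columns are character types.
const State TABLE[4][NUM_TYPES] = {
    /*                 ALPHA  DIGIT  DOT        WS           EQUALS  OTHER */
    /* START       */ { NAME,  FAIL,  FAIL,      START,       FAIL,   FAIL },
    /* NAME        */ { NAME,  NAME,  AFTER_DOT, TRAILING_WS, ACCEPT, FAIL },
    /* AFTER_DOT   */ { NAME,  NAME,  FAIL,      FAIL,        FAIL,   FAIL },
    /* TRAILING_WS */ { FAIL,  FAIL,  FAIL,      TRAILING_WS, ACCEPT, FAIL }
};

// Returns the number of characters matched (including whitespace and the
// equals sign), or 0 on failure, in which case prop is left empty.
std::size_t match_prop_sketch(const std::string &line, std::string &prop)
{
    State state = START;
    prop.clear();
    for (std::size_t i = 0; i < line.size(); ++i) {
        CharType in = char_type(line[i]);
        state = TABLE[state][in];
        if (state == FAIL) break;
        if (state == ACCEPT) return i + 1;   // stop at the first accepting state
        if (in != WS) prop += line[i];
    }
    prop.clear();   // ran off the end of the line or hit a failing state
    return 0;
}
```
.fi
.RE
For the line \(lqa.b = 1\(rq this sketch returns 5 and leaves \(lqa.b\(rq in
.IR prop ;
for \(lqa..b=1\(rq it returns 0 and leaves
.I prop
empty.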

.PP
.B readPropLine()
is a protected static member function which reads a line from an
.B istream
and breaks it into property and value components.  It is a wrapper around
.BR match_prop() .
.B writePropLine()
is also a protected static member function.  It writes a specified
.RI ( property ", " value )
pair to an
.B ostream
in the format expected by
.BR readPropLine() .
.B ostream &operator<<()
is a friend function of the
.B Properties
class.  It basically calls
.B writePropLine()
for each pair in the property list.

.PP
There are also a few miscellaneous functions defined in and/or used by
.BR Properties .
These include:
.B chartypeof()
(a private static member function, called only by
.BR match_prop() ,
which converts a
.B char
into a character code of the type mentioned in the
.B match_prop()
algorithm above);
.B stripws()
(in util.cc), which removes leading and trailing whitespace from a
.BR std::string ;
and
.BR lowercasify() ,
which, oddly enough, converts a
.B std::string
to lowercase.

.PP
Various member functions of the 
.B Properties
class may throw exceptions.  These exceptions are classes defined within the
scope of
.B Properties
(so they are referred to as \fBProperties::\fIExceptionName\fR).  The exception 
classes are:
.RS
.IP \(bu
.BR CannotOpen ,
thrown by
.BR load() ", " save() ,
and one of the constructors.  This exception indicates that a file could not be 
opened in the appropriate mode.
.B CannotOpen::errno()
gives the value of
.B errno
when the failure was detected;
.B CannotOpen::filename()
gives the filename that could not be opened.

.IP \(bu
.BR BadProperty ,
thrown by
.BR addProperty() .
This exception indicates that the name of a property is invalid (according to
.BR valid_property() ).
.B BadProperty::propname()
gives the invalid property in question.

.IP \(bu
.BR BadFormat ,
thrown by
.BR getBoolValueFor() ", " getIntValueFor() ", and " getDoubleValueFor() .
This exception indicates that, although the requested property exists, its
value does not match the requested type.  For example, calling
.BR getIntValueFor "(\(dqfoo\(dq)"
when the value of foo is \(lqpurple\(rq will result in a
.B BadFormat
exception being thrown. 
.B BadFormat::propname()
gives the name of the property in question;
.B BadFormat::value()
gives the (string) value of that property.

.IP \(bu
.BR NotFound ,
thrown by
.BR get * ValueFor()
and
.BR remove() .
This exception indicates that an undefined property was referred to.
.B NotFound::propname()
gives the name of the property which was referred to.
.RE

There is also an internal-only
.B BadLine
exception, thrown by
.B readPropLine()
(which, as pointed out above, is a protected static member function).
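.PP
The nesting of exception classes might look roughly like the following
sketch.
.B PropertiesSketch
is a hypothetical cut-down class that knows a single property; only the
idea of an exception class declared inside the class scope, with a
.B propname()
accessor, is taken from the description above.  The return types and the
.B describe()
helper are assumptions.
.RS
.nf
```cpp
#include <cassert>
#include <string>

// Hypothetical cut-down class; only the nesting mirrors the text.
class PropertiesSketch {
public:
    class NotFound {
    public:
        explicit NotFound(const std::string &name) : name_(name) {}
        const std::string &propname() const { return name_; }
    private:
        std::string name_;
    };

    // Stand-in lookup that knows a single property, "colour".
    std::string getValueFor(const std::string &name) const
    {
        if (name != "colour")
            throw NotFound(name);
        return "purple";
    }
};

// Catch the nested exception by its qualified name and report which
// property was missing.
std::string describe(const PropertiesSketch &props, const std::string &name)
{
    try {
        return name + "=" + props.getValueFor(name);
    } catch (const PropertiesSketch::NotFound &e) {
        return "undefined: " + e.propname();
    }
}
```
.fi
.RE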

.SH AUTHOR
.RI "Neil Moore <" neil@cs.uky.edu ">, <" nmoor0@sac.uky.edu ">"

.SH BUGS
.PP
The
.B Properties
class could be made more general in a number of places.  For example,
.B load()
should take a function object argument, which would be called for each error in
the input file.  In addition, functions such as
.BR readPropLine() ", " writePropLine() ", and " match_prop()
should be made virtual (and therefore non-static), so that subclasses of
.B Properties
could redefine them.  Finally, the property list should probably be stored in a
.B map
(which would probably be implemented as a balanced tree, making lookups
.RI "O(log " n )
rather than
.RI O( n )).
All of these improvements may be implemented in the future.
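.PP
A map-based property list might look roughly like the sketch below.  The
names
.BR PropMap ,
.BR set_property() ,
and
.B is_defined()
are hypothetical and are not part of the current class.
.RS
.nf
```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical replacement for the linked list: keys are property names.
typedef std::map<std::string, std::string> PropMap;

// Insert or update in one step; a map cannot hold the same key twice.
void set_property(PropMap &props, const std::string &name,
                  const std::string &value)
{
    props[name] = value;
}

// Lookup is logarithmic in the number of properties.
bool is_defined(const PropMap &props, const std::string &name)
{
    return props.find(name) != props.end();
}
```
.fi
.RE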
