HASHTABLE(3)                                                      HASHTABLE(3)



NAME
       hashtable - keyed sets and maps

SYNOPSIS
       #include "hashtable.h"
       #include "hashtable.c"

       struct hashtable *hashtable_new(unsigned size);
       void hashtable_delete(struct hashtable *table);

       struct hashtable *hashtable_initialise(struct hashtable *table);
       struct hashtable *hashtable_release(struct hashtable *table);

       struct hashtable_entry *hashtable_insert(struct hashtable *table,
           char *key, unsigned length);

       struct hashtable_entry *hashtable_find(struct hashtable *table,
           char *key);

DESCRIPTION
       These  functions  provide hash tables that use the method of coalescing
       lists [1].  They are particularly suitable for building symbol tables.

       The hashtable_new() function allocates and  returns  a  pointer  to  an
       empty  hash  table  structure  that can contain size entries.  When the
       structure is no longer needed, the hashtable_delete() function  can  be
       used to destroy it.  Programs that want to allocate a hash table struc-
       ture in automatic or static storage can use  hashtable_initialise()  to
       prepare  it for use and subsequently hashtable_release() to reclaim any
       resources  it   has   used.    (The   functions   hashtable_new()   and
       hashtable_delete()  perform this initialisation and reclamation implic-
       itly.)

       The hashtable_insert() function creates a new entry in table  with  the
       given  key,  growing  the table to make room if necessary.  It does not
       check if an entry with the  key  already  exists  in  the  table.   The
       hashtable_find()  function  searches  table for an entry with the given
       key and returns a pointer to it, or null if no such entry exists.   The
       length  argument specifies the number of bytes in the key.  The default
       type of key is char * and the length should not include any  terminaing
       nul character.

       Two  data  structures are declared by the header file.  Several members
       of these structures are potentially useful to the client, as follows.

          struct hashtable_entry
          {
            char     *hashtable_entry_key;
            unsigned  hashtable_entry_length;
          };

          struct hashtable
          {
            unsigned int             hashtable_capacity;
            struct hashtable_entry **hashtable_entries;
          };

       The hashtable_entry_key and hashtable_entry_length are copies of  those
       provided  to  the  hashtable_insert function at the time when the entry
       was created.  The hashtable_capacity gives the size of (one  more  than
       the  maximum  valid  index  into)  hashtable_entries.   Each element of
       hashtable_entries is either null or  a  pointer  to  a  hashtable_entry
       structure.

       The client must never modify the contents of these structures directly,
       with the exception of any additional members that were  added  to  them
       explicitly (see the next section).

TYPICAL USE AND BASIC CUSTOMISATION
       Programs  that  wish  only  to  manage a set of unique strings can just
       include hashtable.h to declare the API, and add hashtable.c to the list
       of source files.  The types and functions described in this manual will
       then be available.

       Simple customisation is possible by  defining  preprocessor  macros  to
       change default types and names.

       HASHTABLE_NAME
              The  tag  given  to  the  hashtable  structure,  and  the prefix
              prepended to the hash table functions.  The  default  is  'hash-
              able'.

       HASHTABLE_ENTRY_NAME
              The  tag given to the hashtable entry structure.  The default is
              'hashable_entry'.

       HASHTABLE_MEMBERS
              Additional members to be added to the hashtable structure.  They
              are  placed at the beginning of the structure and initialised to
              zero.

       HASHTABLE_ENTRY_MEMBERS
              Additional members to be added to the hashtable entry structure.
              They  are  placed  at the beginning of the structure and will be
              initialised to zero.

       These symbols affect the  declarations  made  in  hashtable.h  and  the
       implementations  of  associated functions in hashtable.c.  The same set
       of definitions must be in effect when including  the  header  file  and
       compiling  the  source  file.   The  easiest  way  to ensure this is to
       include hashtable.c in the program after the necessary definitions.

       For example, the following  complete  program  tallies  the  number  of
       occurences of each unique word passed on the command line.

          #define HASHTABLE_NAME          SymbolTable
          #define HASHTABLE_ENTRY_NAME    Symbol
          #define HASHTABLE_ENTRY_MEMBERS int tally;

          #include "hashtable.c"
          #include <stdio.h>

          struct Symbol *lookup(struct SymbolTable *t, char *key)
          {
            unsigned int len= strlen(key);
            struct Symbol *found= SymbolTable_find(t, key, len);
            if (!found) found= SymbolTable_insert(t, key, len);
            return found;
          }

          int main(int argc, char **argv)
          {
            unsigned int i;
            struct SymbolTable *symbols= SymbolTable_new(argc * 2);
            while (--argc) {
              ++argv;
              struct Symbol *symbol= lookup(symbols, *argv);
              symbol->tally++;
            }
            for (i= 0;  i < symbols->hashtable_capacity;  ++i) {
              struct Symbol *s;
              if ((s= symbols->hashtable_entries[i]))
                printf("%7i %s\n", s->tally, s->hashtable_entry_key);
            }
            SymbolTable_delete();
            return 0;
          }

ADVANCED CUSTOMISATION
       The  following  preprocessor  symbols affect only the implementation in
       hashtable.c, unless otherwise noted.

       HASHTABLE_KEY_T
              The type of a key.  The default type is 'char *'.   If  the  key
              type  is not 'pointer to integer' then appropriate copy, compare
              and hash functions should also be specified (see  below).   Note
              that this symbol affects both hashtable.c and hashtable.h.

       HASHTABLE_KEY_COPY
              The  function  used  to  copy  a  key.   The default is value is
              'strdup'.  Note that the implemenation will reclaim the  storage
              for  copied  keys by calling free(), so the copy function should
              use malloc() (or one of its  cousins)  to  allocate  memory  for
              copied keys.

       HASHTABLE_KEY_COMPARE
              The  function  used  to  compare two keys.  This function should
              return non-zero if the two keys are equal.  The default value is
              '!strcmp'.

       HASHTABLE_FRACTION
              Controls  when  a  table grows to make room for new keys.  It is
              specified as a fraction of the  total  capacity  of  the  table.
              When this fractional capacity is exceeded, the table is grown by
              HASHTABLE_FACTOR (see below).  The default value is '1/2'.  Note
              that  the  fractional value must not be enclosed in parentheses,
              otherwise the compiler will round down to  zero  and  the  table
              will grow every time a key is inserted.

       HASHTABLE_FACTOR
              The factor by which the capacity of the table is increased every
              time it is grown.  The default value is '3'.

       HASHTABLE_HASH_FUNCTION
              The name of the function that generates a  numeric  hash  for  a
              key.  The signature of this function must be

                 unsigned int
                 HASHTABLE_HASH_FUNCTION ( HASHTABLE_KEY_T key,
                                           unsigned int length )

              and  it  must always return the same value when given equivalent
              keys.

       The symbols that affect hashtable.h are undefined automatically at  the
       end  of  that  file.  In the unlikely event they are useful beyond that
       file, defining HASHTABLE_KEEP_DEFINITIONS will preserve them.  All sym-
       bols  affecting  either  hashtable.h or hashtable.c are unconditionally
       undefined at the end of hashtable.c.  This permits the  C  file  to  be
       included  more than once in a parent source file, to create implementa-
       tions for multiple table types.

RETURN VALUE
       The hashtable_new() function returns a pointer to  the  newly-allocated
       hash table, or null if memory could not be allocated.

       The  hashtable_initiliase(),  and  hashtable_release() functions return
       table.

       The hashtable_insert() function returns a pointer to the  newly-created
       table entry, or null if required memory cannot be allocated.

       The  hashtable_find()  function returns a pointer to the existing table
       entry corresponding to key, or null if the key does not  exist  in  the
       table.

CAVEATS AND BUGS
       There  is  no  way  to  remove a key from a table.  You should not feel
       upset about this.

       The implementation trusts the client not to free any  of  the  pointers
       returned by hashable_insert() and hashable_find().

       The contents of the table may be rearranged during insertion.  Pointers
       returned  from  hashable_insert()  and  hashable_find()  are  therefore
       invalid after (and iteration is unstable over) insertions.

SEE ALSO
       hcreate(3)

       [1] Donald Knuth, The Art of Computer Programming, Algorithm 6.4C.

       The latest version of this software and documentation:

          http://piumarta.com/software/hashtable

AUTHOR
       The software and this manual page were written by Ian Piumarta (first-
       name at last-name dot com).

       Please send bug reports and suggestions for improvements to the author
       at the above address.



                                                                  HASHTABLE(3)