***********   This is the                    **********
***********   "FREQUENTLY ASKED QUESTION"    **********
***********   file for the SaGA Toolbox      **********

This is just a beginning. This file will be constantly expanded and
updated as a result of users' input.

CONTENTS:

1. General.
   1.1 What is it all about?
   1.2 How do I quickly find what this toolbox can and cannot do and
       whether it has a capability I need?
   1.3 What are the current MATLAB capabilities for spatial analysis?
   1.4 Why does GRIDDATA bog down?
   1.5 Should I wait instead for the MATLAB 5.0 release for many
       improvements in this area?

2. Registration, license.
   2.1 What is a Limited license?
   2.2 Do I have to prove that I am a full-time student for the
       student license registration?
   2.3 Why such a short free trial period?

3. Analysis of irregular spatial data.
   3.1 What are the most commonly used methods of dealing with
       irregular spatial data?
   3.2 What is triangulation-based interpolation?
   3.3 What is natural neighbour interpolation?
   3.4 What are kriging, objective mapping, optimal interpolation and
       minimum curvature interpolation?
   3.5 What is the inverse distance method?
   3.6 What is extrapolation as opposed to interpolation?
   3.7 Which interpolation/gridding method should I use?

4. Issues in Computational Geometry.
   4.1 What is the convex hull of a set of points?
   4.2 What is a simplex?
   4.3 What is Delaunay triangulation?
   4.4 What is Voronoi tessellation?
   4.5 Where do these terms - Delaunay, Voronoi - come from and how
       are they pronounced?
   4.6 What is the volume of an n-dimensional simplex?
   4.7 How do I calculate the circumsphere around an n-dimensional
       simplex?

1. General.
-----------

1.1 What is it all about?

SaGA is a toolbox intended to deal with geometry and with spatial
data - planar, spherical, three-dimensional and multidimensional.
Its capabilities can be divided roughly into two groups.

First - which will probably find the most usage - is a set of
routines for interpolation/mapping from irregularly-spaced points
(observations). This is a complicated problem frequently encountered
in geology, geophysics, meteorology and many other fields. Here one
can find and try different methods used in this complex operation:
triangulation, natural-neighbour, inverse distance, kriging,
objective mapping and minimum curvature techniques, with various
options and parameters. No single method is well suited for all
purposes and datasets, yet by trying the various methods and programs
contained in SaGA one can most likely find the approach best suited
to one's data.

Second - there is a large number of routines dealing with geometric
modeling and computational geometry. They can be used for many
purposes, from simple plotting to constructive geometry, mesh
generation, visibility analysis, etc. These routines range from
relatively simple operations on planar polygons, like area and
center-of-mass computations, intersections, unions, point-in-polygon
tests, etc., to rather sophisticated multi-dimensional computational
geometry algorithms, like convex hull and Delaunay triangulation.

In addition to these two groups there are also quite a few graphics
programs. They include simple planar drawings, like ellipses and
circles, filled contour plots and a variety of surface-plotting
routines. Some of them produce quite beautiful 3-D pictures, like
"knots" or Moebius strips.
1.2 How do I quickly find what this toolbox can and cannot do and
whether it has a capability I need?

There is fairly extensive on-line documentation available for SaGA.
It is detailed in the "Doclist" file in the /SAGA directory.
Generally the various documentation files start with a capital
letter, so they appear first in the directory listing. For a very
quick overview one can start with the "Readme" file. Most of the
files (at least the front-end routines) are listed in "Contents".
The "Flowchart" file contains information about the calling sequence
and the dependencies among the various programs (which routine calls
which). "Whatsnew" reports upgrades, bug fixes and further
development, and is updated regularly. This file, "SagaFAQ", contains
much useful information about various routines and the underlying
methods. And don't forget to read the "License" file about
registration and licensing.

1.3 What are the current MATLAB capabilities for spatial analysis?

Well, there aren't many. MATLAB was originally intended to deal with
_matrices_, that is, "regular data" objects. Algorithms dealing with
irregular data are often difficult to put in the form of matrix
algebra (where MATLAB is at its best) or to vectorize in other ways.
About the only function designed to deal with irregularly-spaced
2-dimensional data, GRIDDATA, performs very poorly in terms of
memory, speed, flexibility and accuracy. Because of this it has drawn
hundreds of complaints from users.

Concerning development in this area within the MATLAB environment but
outside of The MathWorks, I am currently aware of one publication
(and a program) for kriging: Denis Marcotte, "Cokriging with Matlab",
Computers & Geosciences, 1991. Many scientists and engineers have no
doubt done a lot of work in this area using MATLAB for their own
problems, yet not much is readily available to other users.

1.4 Why does GRIDDATA bog down?

GRIDDATA is a 2-dimensional gridding function supplied with MATLAB
since version 4.0. It has some very serious drawbacks which have
caused grievances from a large number of users. In particular, it
uses too much memory: it has no domain-subdivision procedure similar
to QUADTREE (which is used throughout the SaGA toolbox), and moreover
it uses more large matrices than actually needed. In fact one should
not use GRIDDATA at all unless:
 - the dataset is small enough (no more than several hundred points),
 - extrapolation is not required,
 - accuracy and performance constraints are lax enough.
In addition, input to GRIDDATA requires some important preprocessing:
the data must be well-scaled in all directions, with mean and trend
removed; otherwise the results can be ridiculous - one can get highly
biased estimates, especially for points near the boundary or outside
of the known set. A minimal sketch of such preprocessing follows
below.
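The following sketch shows one possible way to do this preprocessing
in plain MATLAB (the variable names and the grid used here are
illustrative assumptions, not part of GRIDDATA itself): scale each
coordinate, remove a least-squares planar trend, grid the residual
only, and then restore the trend on the grid.

    % x, y, z - column vectors of irregular observations
    xs = (x - mean(x))/std(x);             % scale each direction
    ys = (y - mean(y))/std(y);
    G  = [xs ys ones(size(xs))];           % least-squares planar trend
    c  = G\z;                              % trend coefficients
    zr = z - G*c;                          % residual: mean, trend removed
    [XI,YI] = meshgrid(-2:.1:2, -2:.1:2);  % grid in scaled coordinates
    ZI = griddata(xs, ys, zr, XI, YI);     % interpolate the residual
    ZI = ZI + c(1)*XI + c(2)*YI + c(3);    % restore the trend on the grid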
1.5 Should I wait instead for the MATLAB 5.0 release for many
improvements in this area?

One can certainly expect many improvements in the area of spatial
data analysis (especially considering the current rudimentary level).
During the MATLAB user conference on October 16-18, 1995 in Cambridge
The MathWorks gave an overview of many new features of MATLAB
version 5. Some of the capabilities currently available in SaGA will
be offered (for example, there is going to be a MEX-file for
2-dimensional Delaunay triangulation). However, I have not seen a
serious breakthrough in the area of spatial interpolation or
geometrical modeling. I believe The MathWorks' major interests
encompass linear algebra, signal processing, control, simulation,
optimization and some other fields. Although among MATLAB users there
is a great number of geophysicists, geologists and oceanographers for
whom the analysis of spatially irregular data is very important, The
MathWorks currently does not seem to have either the expertise or the
commitment to address these needs.

In short, I argue that the SaGA toolbox provides much more in this
area than what The MathWorks would be able to come up with any time
soon. There are, however, many features from which future releases of
SaGA can benefit. Among them are "vectorized patch" 3-D graphics,
more sophisticated data structures, etc. I will try to incorporate
these improvements into SaGA as soon as version 5 is released.

2. Registration, license.
-------------------------

2.1 What is a Limited license?

Limited registration assumes that the user needs only a small part of
this toolbox's capabilities and feels that the full registration fee
is more than he/she can afford for this level of usage. It
nevertheless entitles users to full technical support and updates.

2.2 Do I have to prove that I am a full-time student for the student
license registration?

No, I believe you. Just let me know what college or university you
are enrolled in and the department you are affiliated with (for
statistical survey purposes).

2.3 Why such a short free trial period?

The free trial period of 15 days is certainly too short to become
familiar with all the capabilities of the SaGA toolbox. However, I
believe it can be enough to determine whether you need it at all. I
encourage early registration so that I can start interacting with
users, listen to their comments and needs, and make improvements and
additions as soon as possible. I have already received, and expect
more, very valuable input from users for future development. Together
we can make it a better product.

3. Analysis of irregular spatial data
-------------------------------------

3.0 *** This section is intended to provide only basic information
about the various interpolation methods. To really learn a lot of
practical and useful things about interpolation, mapping and
contouring I highly recommend the excellent book by David F. Watson,
"Contouring: a practical guide ..." (see the Bibliogr file).

3.1 What are the most commonly used methods of dealing with irregular
spatial data?

Most data analysis methods assume that data lie on a regular grid. It
is incomparably easier to organize, manipulate and analyze data in
this case. Yet in a great number of real-world situations this is not
the case - data are not where we want them to be but where they
happened to be measured or available. To manipulate and visualize
such data easily one needs to interpolate them to a regular grid or
to some other points where they are needed. There are many methods of
doing that, as well as many classification principles to group them
by. I prefer the following (incomplete and approximate at best)
classification:

 - geometrical:
   These are based mostly on the Delaunay/Voronoi decomposition of a
   domain. The information for a given grid point is obtained from
   the nearest points which share the same or adjacent Delaunay
   triangles. This is implemented in the program INTERPTR (which has
   the auxiliaries EXTRAPTR and GRAD2EST for extrapolation and
   gradient information). There is also a newer and more advanced
   method of "natural neighbour interpolation" based on the Voronoi
   tessellation. It will be implemented in future versions.

 - algebraic (e.g. polynomial, rational):
   Unlike in the 1-dimensional and 2-D regular-grid cases,
   polynomials themselves have not proved particularly useful for
   irregular data interpolation. Various rational functions are used
   much more frequently, as in the "inverse distance" method, which
   is probably the simplest gridding algorithm (in SaGA it is
   implemented in the INVDISTI function). There are also spline-type
   methods based on rational functions, such as the "minimum
   curvature" type (MINCURVI in SaGA). This group of methods differs
   from the geometrical ones in that they are global - all
   observations are used for each grid point (although this is no
   longer strictly the case when the quadtree subdivision procedure
   is used), whereas in geometrical methods only near-neighbouring
   points are used. Another difference is that geometric methods
   operate only with the metric properties (distances) of the data
   and do not use any parametric (polynomial, rational)
   representation, while algebraic ones explicitly rely on such a
   representation for the "Green's function" of influences. In this
   sense geometrical methods are more "natural", while algebraic ones
   are smoother, with ready expressions for derivatives, etc.

 - statistical:
   These methods are based on the idea that the interpolated field
   can be represented as a sum of a space-time mean (and maybe a
   trend) plus a random component with a known covariance function.
   The expected mean-squared error of a linear combination of
   observed values is then minimized, and the resulting interpolation
   coefficients are obtained through inversion of the "Green's
   function matrix" between all observation points. Formally this is
   similar to some algebraic spline-type methods, such as minimum
   curvature, but the "Green's function matrix" is calculated using
   covariances between points and not predefined algebraic
   expressions. Two widely used methods of this class are "objective
   mapping" and "kriging" (the more general term "optimal
   interpolation" is also used). They are very similar to each other;
   probably the main difference is the application domain: "kriging"
   is popular with geologists and geophysicists, while meteorologists
   and oceanographers use "objective mapping". In SaGA both methods
   are represented; the corresponding functions are KRIGING and
   OBJMAP. (A minimal sketch of this kind of calculation follows
   below.)
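The following sketch illustrates the statistical scheme just
described, for a single interpolation point. It is not the actual
OBJMAP/KRIGING code; the variable names and the Gaussian covariance
model are assumptions made for the example.

    % xd (N by 2) - data locations, zd (N by 1) - demeaned data
    % values, xg (1 by 2) - interpolation point, R0 - assumed
    % decorrelation radius of the covariance model exp(-(r/R0)^2).
    N = size(xd,1);
    C = zeros(N,N);                % data-data covariance matrix
    for i = 1:N
      for j = 1:N
        C(i,j) = exp(-(norm(xd(i,:)-xd(j,:))/R0)^2);
      end
    end
    c0 = zeros(N,1);               % data-grid covariances
    for i = 1:N
      c0(i) = exp(-(norm(xd(i,:)-xg)/R0)^2);
    end
    w  = C\c0;    % weights: inversion of the "Green's function" matrix
    zg = w'*zd;   % interpolated (demeaned) value at xg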
3.2 What is triangulation-based interpolation?

This is a group of methods belonging to the broader class of what I
call "geometric interpolation methods". First we perform a
triangulation of the set of observation points - in SaGA there are
two programs for doing this: TRIANGUL (only for planar sets) and
DELAUNAY (general-purpose, for sets of arbitrary dimension). Both
perform a Delaunay triangulation (see section 4.3). Then we find
which interpolation points belong to which triangles (the program
INMESH does that). Then we perform a linear interpolation within each
triangle to the interpolation points inside it. The weights of the
data at the triangle vertices are proportional to the so-called
barycentric (area-based) coordinates; a minimal sketch of this step
follows below. The resulting surface is equivalent to a taut rubber
sheet passing through all the data points. The interpolation is
continuous (and linear) inside each triangle, but the gradient has
jumps across triangle boundaries. It is possible to use gradient
information to make it smooth. See the INT3INFO file for more details
and a description of the INTERPTR routine which performs this
interpolation. It has options for extrapolation beyond the convex
hull (using EXTRAPTR) and for incorporating gradient information
(with GRAD2EST).
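Here is a minimal sketch of the in-triangle step, assuming the
enclosing triangle has already been found. The variable names are
illustrative; this is not the INTERPTR code itself.

    % p1, p2, p3 - triangle vertices (1 by 2 row vectors),
    % z1, z2, z3 - data values at the vertices,
    % q - interpolation point inside the triangle.
    A  = det([p2-p1; p3-p1])/2;      % signed area of the whole triangle
    w1 = det([p2-q;  p3-q ])/2 / A;  % sub-triangle opposite vertex 1
    w2 = det([p3-q;  p1-q ])/2 / A;  % sub-triangle opposite vertex 2
    w3 = det([p1-q;  p2-q ])/2 / A;  % w1 + w2 + w3 == 1
    zq = w1*z1 + w2*z2 + w3*z3;      % linear (barycentric) interpolation

For q inside the triangle all three weights lie between 0 and 1; a
negative weight signals that q lies outside (an extrapolation point).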
3.3 What is natural neighbour interpolation?

Natural neighbour interpolation is among the most sophisticated and
reliable (although computationally expensive) methods. It is also
related to the Delaunay triangulation but is more complicated than
the simple triangulation-based interpolation described in the
previous section. First one needs to introduce the concept of natural
neighbours. The natural neighbours of a point X are the points
belonging to all Delaunay triangles whose circumcircles contain X.
Then the "natural neighbour coordinates" are computed for all natural
neighbours. These coordinates are area-based and are equal to the
fraction of the Voronoi polytope of a given data point which is
intersected by the Voronoi polytope the interpolation point would
have in the set of basis points. The actual calculations are simpler
than that and use Delaunay triangles rather than Voronoi polytopes.
The interpolation coefficients can be negative when the corresponding
triangles are opposite-signed. The construction method is described
in a paper by Sibson and in Watson's book (see Bibliogr).
Implementation of this method in SaGA is reserved for a future
version.

3.4 What are kriging, objective mapping, optimal interpolation and
minimum curvature interpolation?

All of them are methods of "global" interpolation (like splines) in
the sense that all available data points formally participate in the
calculation of the value at each interpolation point. The weights for
each datum are calculated by inversion of a "Green's function" matrix
which depends on the distances between data points. The Green's
functions differ among these methods: for the minimum curvature
method the Green's function is a fixed expression (such as
r^2*(log(r)-1) ), while for kriging and objective mapping it depends
on the semivariogram and the correlation function of the interpolated
field, respectively. Therefore some a priori statistics must be known
(or assumed). For more details and the syntax of the corresponding
routines MINCURVI, OBJMAP and KRIGING see the MAPINFO file and the
"help" sections of those programs.

3.5 What is the inverse distance method?

Basically it is a fairly simple weighted-average method. The inverse
distance method estimates values at interpolation points as weighted
sums of the data values. The weights are inversely proportional to
the distance R between the interpolation and basis points:

    w ~ R^(-n),

where n is some positive power index. The choice of the power index n
can depend on the dimensionality of the problem and on the data
itself; there are no universally optimal values. One reasonable
constraint is that n should be greater than d, the dimension of the
dataset. This ensures that if a dataset contains many points which
are approximately uniformly (though perhaps randomly) distributed in
space, the total influence of far-away points will not increase with
distance, so the nearest points will be given larger weights. Thus,
for example, for a planar dataset one can choose n = 3 or 4, for 3-D
data n = 4 or 5, etc. A minimal sketch of this weighting scheme
follows below.

The program INVDISTI allows one to choose the power index n.
Moreover, it allows one to blend estimates with different n in one
call. One can experiment with the data to choose the best combination
of n and blending coefficients for a particular dataset. The INVDISTI
program is currently the only one in SaGA which can interpolate data
in dimensions higher than 2.
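The sketch below computes one interpolated value by inverse distance
weighting in plain MATLAB (illustrative names; not the INVDISTI code
itself, which is more elaborate).

    % xd (N by d) - data locations, zd (N by 1) - data values,
    % xg (1 by d) - interpolation point, n - power index (n > d).
    N = size(xd,1);
    w = zeros(N,1);
    for i = 1:N
      w(i) = norm(xd(i,:) - xg)^(-n);   % w ~ R^(-n); assumes xg does
    end                                 % not coincide with a data point
    w  = w/sum(w);                      % normalize weights to sum to 1
    zg = w'*zd;                         % weighted average of the data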
3.6 What is extrapolation as opposed to interpolation?

It is very easy to tell them apart in the one-dimensional case. Let
us assume that all available data x_i lie between the points
A = min(x_i) and B = max(x_i). Then if our estimation point is also
between A and B it is interpolation; if it is < A or > B it is
extrapolation. For the 2-D and higher-dimensional cases it is a
little more complicated:

Interpolation - for points inside the _convex hull_ of a set of
points (see section 4.1). In 2-D it can be viewed as the inside of a
"fence" corralling all the points of a set.

Extrapolation - for points beyond the convex hull of a set.

Interpolation and extrapolation typically differ significantly in
accuracy. Interpolation can usually be done much more accurately,
because the known points are "all around" the interpolation points.
Moreover, some methods (especially the geometrical ones) are
intrinsically designed for interpolation; for extrapolation points
the rigorous rules of these methods do not apply, and some "ad hoc"
generalizations are usually implemented.

Various methods handle extrapolation differently. Global spline-like
methods, such as minimum curvature (in MINCURVI), are the least
reliable and can produce artificial overshoots and trends. On the
other hand, the formally similar kriging and objective mapping
methods handle distant points without strong overshoots, although
with relatively low accuracy. Inverse distance is probably the most
robust: for infinitely distant points the extrapolated value tends to
a simple average of all the available values. For triangulation-based
methods, such as those using barycentric or natural neighbour
coordinates, there is no unique way to extrapolate; various ad-hoc
schemes can give either reasonable or very poor results depending on
the data. In general one should be very cautious about values at
extrapolation points, especially when the data are highly variable
near the boundary of the convex hull and the extrapolation points are
far away from the dataset.

3.7 Which interpolation/gridding method should I use?

The SaGA toolbox offers a variety of tools for interpolation from
irregular points. They can be roughly divided into 3 groups:

1. Inverse distance - the simplest, although often the least
   accurate, method; works for arbitrary-dimensional datasets and is
   implemented in the function INVDISTI. If you are not particularly
   concerned about accuracy but would rather prefer speed, try it. To
   improve accuracy you can experiment with the power index n and
   with the weights for estimates with different n. In some cases one
   can optimize these parameters to achieve quite good interpolation.

2. Triangulation-based method. Usually robust and accurate enough,
   although one must be cautious about points near the boundary of
   the convex hull (where Delaunay triangles are often long and
   thin). In SaGA it is implemented in the INTERPTR function and has
   various options: to extrapolate beyond the convex hull (the
   perimeter encompassing the set of points), and to "smooth" the
   results using gradient estimates, with control of the "tautness"
   of the resulting surface. It uses the following functions:
   TRIANGUL, CONVEX2, CONVEX20, INFLATE, INMESH, EXTRAPTR, ADJSPX,
   GRAD2EST, GRAD2LS, GRAD2TCP, UNIQUE.

3. "Global" interpolation (objective mapping, kriging, the minimum
   curvature method). All these methods require inversion of some
   kind of "Green's function" matrix (see the sketch after this
   list). Such methods are most often used in geophysics, meteorology
   and oceanography. They usually require some a priori information
   about the structure of the "correlation function" or "variogram"
   of the field to be interpolated. These methods are implemented in
   the functions OBJMAP, KRIGING and MINCURVI. They also need the
   functions DETREND2, MKBLOCKS, PTSINBLK, QUADTREE, QTREE0.
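As an illustration of group 3, here is a minimal sketch of a
spline-type scheme of the minimum curvature kind, using the fixed
Green's function r^2*(log(r)-1) mentioned in section 3.4. It is not
the MINCURVI code itself (which also detrends the data and uses
quadtree blocks); the variable names are illustrative.

    % xd (N by 2) - data locations, zd (N by 1) - demeaned data
    % values, xg (1 by 2) - interpolation point.
    N = size(xd,1);
    G = zeros(N,N);
    for i = 1:N
      for j = 1:N
        r = norm(xd(i,:) - xd(j,:));
        if r > 0, G(i,j) = r^2*(log(r) - 1); end   % g(0) = 0
      end
    end
    a = G\zd;          % coefficients: "Green's function" inversion
    zg = 0;            % evaluate the surface at xg
    for i = 1:N
      r = norm(xd(i,:) - xg);
      if r > 0, zg = zg + a(i)*r^2*(log(r) - 1); end
    end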
4. Issues in Computational Geometry.
------------------------------------

4.1 What is the convex hull of a set of points?

This is one of the most fundamental concepts in computational
geometry. The convex hull can be viewed as the "inside" region of a
dataset. For the planar case one can visualize the data points as
nails in a board with a string pulled around them so that all of them
are inside. The area bounded by this string is the convex hull, and
the string itself is its boundary. Similarly, in the 3-D case it can
be viewed as the inside of "wrapping paper" stretched around a set of
points.

A more rigorous definition for an n-dimensional set can be the
following: take all the (hyper)planes passing through n points of the
set such that one of the half-spaces separated by such a plane does
not contain any points of the set. The intersection of all the other
half-spaces (those containing all points of the set) of all these
(hyper)planes is the convex hull of this set.

In practical terms, computing the convex hull means finding all the
facets of its boundary - combinations of n points which define a
plane dividing space into 2 half-spaces, one of which is void of set
points. Therefore the data structure of a convex hull can be an
N by n matrix of indices of set points, so that each row is a list of
vertices of such a facet. The CONVEXH routine returns such a list and
also an optional matrix of the same N by n size where each row is the
normal vector to the corresponding facet. For the planar case the
output is actually simpler - all we need to know is the list of
vertices of the polygon which forms the boundary of the convex hull.
For this case there is a separate (much simpler) program CONVEX2
which returns a vector of indices of the points in this polygon.

4.2 What is a simplex?

The "simplest" combination of points in n-dimensional space enclosing
a finite volume. One might imagine the simplest object to be a
(hyper)cubic cell in n dimensions, such as a rectangle in 2-D, a cube
in 3-D, etc. Yet such a cell is _not_ the simplest or most convenient
object. The simplest object is a triangle in the planar case, a
tetrahedron in 3-D, etc. Its convenience and universality can be
illustrated by the following important property: any polygon
(polyhedron) can be divided into non-intersecting triangles
(tetrahedra), possibly sharing vertices, edges or faces, but it
cannot in general be divided into rectangles (cubes). This property
is used for triangulation - a procedure having many applications,
from interpolation to mesh generation and finite element analysis.
There is a general property of a set of points in n-dimensional
space: the convex hull of any such set can be divided into
non-intersecting (but possibly adjacent) simplices with vertices
belonging to this set of points.

4.3 What is Delaunay triangulation?

Planar triangulation is a division of the convex hull of a set of
points into adjacent triangles with vertices from this set.
Triangulation can be done in many ways, and obviously some ways are
better than others. The Delaunay triangulation is the one with the
following important property: inside the circumcircle of any triangle
there are no other points of the set. Such a triangulation is
therefore "natural" and optimal in many respects. It is relatively
easy to construct, and there are many known algorithms which perform
it. The programs TRIANGUL and DELAUNAY both perform a planar
triangulation.

This concept is easily generalized to an n-dimensional set. The
Delaunay triangulation (in this case a partition into simplices),
among the many other possible triangulations, has the property that
inside the circumsphere of any simplex there are no other points
belonging to the set. Delaunay triangulation is proven to be
statistically optimal in a certain sense and is the most widely used
in practice. The program DELAUNAY computes the Delaunay triangulation
in arbitrary dimension. It returns a matrix where each row is a list
of indices of the points defining a simplex.
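The defining ("empty circumcircle") property can be checked in the
planar case with the classic 3 by 3 in-circle determinant. A minimal
sketch, with illustrative names (this is not how TRIANGUL or DELAUNAY
are implemented internally):

    % a, b, c - triangle vertices (1 by 2 row vectors) listed
    % counterclockwise; p - some other point of the set.
    M = [a-p, sum((a-p).^2)
         b-p, sum((b-p).^2)
         c-p, sum((c-p).^2)];
    inside = det(M) > 0;   % true: p lies inside the circumcircle, so
                           % triangle a,b,c violates the Delaunay property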
4.4 What is Voronoi tessellation?

This is another fundamental partition of a domain. The simplest
definition is probably the following: the Voronoi polytope of a point
A belonging to a set S is the region consisting of all points of
space which are closer to A than to any other point of S. The whole
space can thus be divided into Voronoi polytopes. The Voronoi
tessellation is dual to the Delaunay triangulation: its vertices are
the circumcenters of the Delaunay triangles (simplices in higher
dimensions). These two structures can be constructed from one another
relatively easily. The program VORONOI2 calculates and plots a
2-dimensional Voronoi diagram.

4.5 Where do these terms - Delaunay, Voronoi - come from and how are
they pronounced?

Such questions arise regularly in newsgroups such as
sci.math.num-analysis or comp.graphics.algorithms. Delaunay and
Voronoi are probably the two biggest names in computational geometry.
Delaunay was a mathematics professor at Moscow University in the
1920s-30s(?). He seems to have been of French descent: there is no
Russian name with a similar spelling, while it is a fairly common
French name, pronounced [de-la-'ne]. This is quite possible, since
quite a lot of French people settled in Russia after the French
Revolution and during the 19th century. Voronoi worked in Germany and
in France at the beginning of this century. It is interesting to note
that, in contrast to Delaunay, he probably was of Russian origin:
this is a typical Russian word and name, pronounced [vo-ro-'noy],
meaning "raven-black" (a color, typically of a horse). These names
are connected with a fundamental partition of a domain - the Delaunay
triangulation and the Voronoi tessellation. This partition, and the
discovery of its many properties, is also connected with other names,
such as Dirichlet and Thiessen.

4.6 What is the volume of an n-dimensional simplex?

Choose one vertex of the simplex to be the origin. The coordinates of
the other vertices relative to it form an n by n matrix R. The
determinant of this matrix divided by the factorial of n gives the
volume (up to sign; take the absolute value for the geometric
volume):

    V = det(R)/n!

In 2-D (a triangle) it is equal to 1/2 of the cross-product of the
vectors forming two sides of the triangle. In 3-D (a tetrahedron) it
is 1/6 of the triple product of the vectors forming 3 edges (the
volume of the parallelepiped formed by these edges). The routine
VOLSPX calculates this value given the vertex coordinates.

4.7 How do I calculate the circumsphere around an n-dimensional
simplex?

Choose one vertex of the simplex to be the origin. The coordinates of
the other vertices relative to it form an n by n matrix R. Divide
each vector (row of the matrix R) by its squared length divided
by 2. This gives the equations of the planes through the mid-points
of the simplex edges, perpendicular to them. The intersection of
these planes gives the center of the circumsphere. In MATLAB the last
stage is a single backslash operation on the normalized matrix:
c = R\[1 1 ... 1]'. This method is implemented in the program
CIRCMSPH in SaGA.
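The two calculations above fit in a few lines of plain MATLAB. A
minimal sketch (cf. VOLSPX and CIRCMSPH, whose actual code may
differ; P and the other names are illustrative):

    % P - (n+1) by n matrix of simplex vertex coordinates,
    % one vertex per row.
    n = size(P,2);
    R = P(2:n+1,:) - ones(n,1)*P(1,:);  % edge vectors from first vertex
    V = abs(det(R))/prod(1:n);          % simplex volume |det(R)|/n!
    b = sum(R'.^2)'/2;                  % squared row lengths divided by 2
    c = (R\b)';                         % circumcenter relative to P(1,:)
    c = c + P(1,:);                     % back to absolute coordinates
    rad = norm(c - P(1,:));             % circumsphere radius

For example, for the triangle P = [0 0; 1 0; 0 1] this gives V = 0.5,
c = [0.5 0.5] and rad = sqrt(0.5), as expected. Solving R*c = b is
equivalent to the normalized form c = R\[1 1 ... 1]' described above,
with each row of R divided by the corresponding element of b.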