INTRODUCTION: Welcome to STAPLOT v3.1

STAPLOT is a simple, flexible package for the graphical analysis of irregularly spaced or sampled data. It was designed for hydrographic data, but I'm sure it can be taken apart and customized to some extent for other types of data.

STAPLOT is written entirely in Matlab, and in all of the following I assume the reader has some basic familiarity with Matlab syntax and array handling. I have not provided any numerical or analytical data processing tools, and I strongly urge the user to pick up Phillip P. Morgan's "seawater" package (make sure you get v1.2b or later) and Rich Pawlowicz's geophysical package (both available at the SEA-MAT website) for use with STAPLOT. Four of Rich's routines are incorporated within STAPLOT -- the xcontour package, and dist.m.

The basic unit of currency in STAPLOT is the station. A station is a related set of data containing a position (Latitude, Longitude), possibly a station number, a date or a time, and then some associated data such as potential temperature or oxygen concentration. The number of stations in your dataset is nstn (mnemonic: Number of STatioNs).

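STAPLOT recognizes 3 types of variables:

Type 1: a row vector with one value per station, size [1 nstn].

Type 2: a vector with one value per ROW of your type 3 data (see "A WORD ON TYPE 2 VARIABLES" below).

Type 3: a matrix with one column per station and possibly several values per station, size [??? nstn].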

An example of a type 1 variable would be Longitude. Typical type 3 variables might be in situ density, mixing coefficient, or salinity. As you can see, the number of data values per station is not fixed. This means you can use STAPLOT for both high- and low-resolution data.

If your stations are not of equal length, don't worry: just pad the end of each column with NaN's to fill each out to the length of the longest station. You may want to interpolate in the vertical to standard levels or 10db levels for deep CTD stations. This precaution can enormously reduce the amount of memory your dataset will use.
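Here is a minimal sketch of the padding step, assuming your raw casts start out in a cell array with one vector per station. The names 'casts' and 'maxlen' and the numbers are made up for illustration; they are not part of STAPLOT.

```matlab
% Hypothetical raw data: three stations of unequal length.
casts = {[200 220 500], [210 205 470 480], [180 185]};

nstn   = numel(casts);                   % number of stations
maxlen = max(cellfun(@numel, casts));    % length of the longest station

O2 = NaN(maxlen, nstn);                  % start from an all-NaN matrix
for k = 1:nstn
    n = numel(casts{k});
    O2(1:n, k) = casts{k}(:);            % fill the top of column k; the rest stays NaN
end
```

Each station ends up in its own column, and the short ones are padded at the bottom with NaN.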

A word of warning is appropriate here: If you have type 3 data matrices of different lengths, for example bottle salinities and CTD salinities, it is up to you to keep track of which is which. If you want to plot the bottle salinities versus, say, the CTD temperatures, you'll have to subsample the CTD temps to the bottle levels and put that information into a new variable. When you plot the CTD salinities, you'll plot versus the original temperatures; when you plot the bottle salinities, you'll plot versus the new, subsampled CTD temps. Luckily, adding new variables in STAPLOT is simple, as is deleting them once you're done.
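One way the subsampling might be done is with Matlab's interp1, station by station. This is only a sketch: the variable names (PRctd, TEctd, PRbot, TEbot) and the values are hypothetical, linear interpolation is my assumption, and it presumes the CTD pressures in each column are distinct.

```matlab
% Hypothetical CTD pressure (db) and temperature, 2 stations, NaN-padded.
PRctd = [10 10; 20 20; 30 30; 40 NaN];
TEctd = [20 19; 18 17; 16 15; 14 NaN];
% Hypothetical bottle pressures for the same 2 stations.
PRbot = [15 10; 35 25; NaN 30];

TEbot = NaN(size(PRbot));                 % new, bottle-sized type 3 variable
for k = 1:size(PRbot, 2)
    ok   = ~isnan(PRctd(:,k)) & ~isnan(TEctd(:,k));   % valid CTD samples
    want = ~isnan(PRbot(:,k));                        % real bottle levels
    TEbot(want,k) = interp1(PRctd(ok,k), TEctd(ok,k), PRbot(want,k));
end
```

The bottle salinities would then be plotted versus TEbot, and the CTD salinities versus the original TEctd.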

A WORD ON TYPE 2 VARIABLES

The last type of variable used by STAPLOT is the type 2. Type 2 variables are vectors. They are expected to be the same length as the number of ROWS in your type 3 data. One example of a type 2 variable would be standard levels. Suppose you've gone to all the trouble of interpolating your data to standard pressures. Rather than maintaining an entire matrix with nstn columns, each column containing the standard levels, you could declare a type 2 variable, 'stdlev'. When you subsequently plot any of your interpolated, type 3 data versus the vector 'stdlev', it will automatically be applied to each column of the stations. This can save a lot of memory, disk space, and time. On the other hand, if you have bottle data as in the example above, don't forget and try to plot it versus 'stdlev'! You'll only cause yourself trouble, and you may get an incorrect result.
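To see what the type 2 vector stands in for, here is a small sketch with made-up levels. 'PRfull' and 'fits' are hypothetical names, and I have written 'stdlev' as a column vector, which is an assumption about orientation.

```matlab
nstn   = 5;                               % hypothetical number of stations
stdlev = [0 10 20 30 50 75 100]';         % hypothetical standard levels (db)

% The full type 3 matrix that stdlev replaces: one identical column per station.
PRfull = repmat(stdlev, 1, nstn);

% A quick sanity check before plotting type 3 data versus stdlev:
TE   = NaN(numel(stdlev), nstn);          % placeholder for interpolated data
fits = (numel(stdlev) == size(TE, 1));    % true only if the row counts agree
```

If 'fits' comes out false (as it would for bottle data of a different length), the vector does not belong with that matrix.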

GETTING STARTED

To use STAPLOT you need to set up your data in a few simple arrays:

These variable names can be user-specified, but those are the defaults. STAPLOT may on occasion use the length of your longitude variable to determine the number of stations you have (nstn), so it's important to get this right.

STAPLOT will keep track of any number of cruises or sections for you, by using an index in the variable 'sections'. By default, all your data will be treated as if they belong to a single cruise. If all your data are in one section, sections will look like this:

sections = [1 nstn];

Suppose instead that you have 2 cruises, and you want to distinguish between them. Then you might use:

sections = [1 L; L+1 nstn];

Where L is the number of stations in the first cruise, nstn-L is the number of stations in the second cruise, and nstn is as usual the total number of stations. Here's an example:

sections = [1 35; 36 42; 43 66; 67 90];

so we have (nstn =) 90 stations in all, split between 4 cruises.
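If you want to check a 'sections' index by hand, the station counts fall straight out of the two columns. A short sketch (the names nsec, nper and secid are mine, not STAPLOT's):

```matlab
sections = [1 35; 36 42; 43 66; 67 90];

nsec = size(sections, 1);                      % number of cruises or sections
nper = sections(:,2) - sections(:,1) + 1;      % stations in each section
nstn = sections(end, 2);                       % total number of stations

% Section number for every station, handy for sorting things out later.
secid = zeros(1, nstn);
for k = 1:nsec
    secid(sections(k,1):sections(k,2)) = k;
end
```

The entries of nper should add up to nstn; if they don't, the index has a gap or an overlap.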

Every quantity you'd like to plot must be set into its own type 1 or type 3 variable. Here are some examples:

therm : a type 1 variable indicating depth of the thermocline, in meters.

Size: [1 nstn]

therm = [400 420 ... 550];

O2: a type 3 variable of Dissolved O2 concentration, with several values per station.

Size: [??? nstn]. Short stations are padded with nan's.

O2 = [200 210 ... 180
      220 205 ... 185
      500 NaN ... 330
      NaN 480 ... 450];

Here I have reported O2 values in u-mol/kg, which brings me to another point: STAPLOT will not keep track of units for you! If you mix them up and misuse them, it will happily comply and plot complete junk. Notice that missing data are indicated by nan's. So the dissolved oxygen values for the first station are: 200, 220, 500. The depths of these measurements would be found elsewhere, in a separate pressure or depth matrix (see below).

PRO2: a type 3 variable indicating the pressure (in db) at which the oxygen values above were measured.

PRO2 = [ 25  25 ...  22
        200 200 ... 205
        450 425 ... 330
        NaN 500 ... 450];

Now we know at what pressures the O2 data were measured. There are several points to be gleaned here. One: notice that the first column actually has less data than the rest -- it's simply padded out in the 4th row to make the data into a matrix. Use nan whenever you have to pad your data. STAPLOT is designed to handle nan's, and all of the routines are robust with respect to nan.

Now look at the O2 in the second column. What is going on there? The data are not being padded, are they? If so, why wasn't the padding put in at the end? ...

Notice that the third pressure for station 2 is a real value. What this indicates is that a measurement was made at 425db, but the O2 data were later thrown out. Nan is not only the value to use for padding your data, it is the value to insert for missing or bad data as well. An alternative, if you don't care to keep track of missing values, would have been to 'tighten up' column 2 to look like this:

O2(:,2) = [210
           205
           480
           NaN];

with

PRO2(:,2) = [25 200 500 NaN]';

Please note the way --> ' <-- is being used to create a column vector in the above expression.
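If you have many columns to tighten up, doing it by hand gets old fast. Here is a small sketch of the idea for one station; the names O2col, PRO2col, good and npad are made up for illustration.

```matlab
O2col   = [210 205 NaN 480]';      % station 2 oxygen, with the bad value as NaN
PRO2col = [ 25 200 425 500]';      % the matching pressures

good = ~isnan(O2col);              % rows that hold a real oxygen value
npad = sum(~good);                 % how many rows of padding are needed

% Keep the good rows in order, then pad the bottom back out with NaN.
O2tight   = [O2col(good);   NaN(npad, 1)];
PRO2tight = [PRO2col(good); NaN(npad, 1)];
```

Note that the pressure column is tightened with the same mask, so the two stay lined up row for row.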

Lastly, notice that the last station (rightmost column) does not actually go any deeper than the other two; it's just at a slightly increased vertical resolution.

To make things a little easier, there are several quantities recognized by STAPLOT in the default. They are: pressure, depth, potential temperature, in situ temperature, and salinity. The default names are as follows:

PR : pressure

DE : depth

PT : potential temperature

TE : in situ temperature

SA : salinity

You can change these to anything you prefer, but it's simpler not to do so, at least until you get the hang of it. I strongly suggest interpolating deep stations to a common pressure scale (e.g., 10db or standard levels). It's not strictly necessary, but very useful. However, it's not a good idea if you're using bottle data. The default for PR and DE is that they are expected to be positive, increasing downwards. This behavior may be set in the session options file.

Once you have gotten your data into this format, SAVE IT in a .mat file. You are ready to begin.

A SAMPLE STAPLOT SESSION

First of all, copy all the STAPLOT routines into one directory. Start Matlab and cd to that directory.

STAPLOT is begun by typing 'staplot' (no quotes) at the user prompt in Matlab. There are a couple of sample sessions included with the code, and I'd strongly suggest you try using one before loading your own data. When first started, STAPLOT will prompt you to load session 'sample1'. If you've followed the directions, just hitting the carriage return will load up the sample1 session.

The basic idea is that you have a number of stations, and you wish to examine them and possibly sort or identify them by geographical location or by the presence of some characteristics, e.g., a salinity minimum at a particular depth.

On startup, you get three windows: a menu, an edit window, and a "Stations" window, which is geographic. Little grey crosses show up at all the station locations in this window.

Near the top of the menu, you'll see a selection labelled 'Sel All'. Select this and you'll see the numbers 1-10 appear in the edit window. Those are the 'active' stations (the column number of type 3 variables or the index into type 1 variables).

First of all, where are we? The Stations window has an x-axis running from about -75 to -60. This indicates a West longitude (East is positive). The latitude range (y-axis) is in the northern hemisphere. Go down the menu and select "Worldmap". A rough map shows the geographic limits of the dataset in red. Kill this window when you're done looking at it.

In the main menu, you'll see a pulldown submenu with the top item labelled Dist. When you pull this down, you see a lot of variables. These are what is available for plotting. Dist is simply the first variable on the list. Pull down and select, in order, first Lon, then Lat. Immediately underneath the variable submenu is the plotting submenu. Select "XY Plot". You'll see a number of brightly colored circles appear in the Station window. Each station is assigned a color and a plotting symbol, which can be changed at any time, and this is one of the primary ways you will be able to distinguish among your stations in property-property plots. The symbol for each station remains consistent from plot to plot, although if you change that symbol, previous plots will not automatically be updated. STAPLOT will save the station color and symbol for you between sessions, so you can begin where you left off the last time.

Now select PT and SA from the variables submenu and again, "XY Plot". You'll get a new window, a T/S diagram. But it's sideways! Kill this window, go back and select SA and then PT, and again "XY Plot". That's better. The moral is that the variable you choose first becomes the domain, and the second one becomes the range.

Pull down the isopycnals menu (Sigma_0, Sigma_1, ...) and select "Sigma_0". Presto! Isopycnal lines. The choice of values for the isopycnals is another thing that can be customized and saved in the session options. Under "Cl Fig", pull down "CL Iso" to wipe the isopycnals out without disturbing the T/S plot. Kill the T/S plot entirely.

Back to the menu again. Select "PT" and "PR" and then "XY Plot". You'll see potential temperature versus depth. Note that the depth is positive downwards. Click on the face of this plot to make it the current figure (as in 'gcf' in Matlab). (In general, STAPLOT will perform the operation you've chosen on whichever is the current figure. One exception is the isopycnals, which are automatically directed to the T/S plot.)

Now that T/Z is the current fig, pull down the top right menu and select "flipud". Select it again to turn the plot "rightside up". Note the green and yellow stations with prominent layers in which the temperature does not change. Select the third button of the menu, "StaSel". This is one of the most important functions of STAPLOT. It will turn red. Go back to the temp/depth plot, and, placing the mouse in the lower right of the figure (inside the axis box, please), hold down the left mouse button and drag out a box that captures only the yellow and green stations. (Just clicking near the data does not work!) An important note to Mac users -- because of the way the Mac handles mouse calls, you will have to click, drag out a box, and click again at the opposite corner of the selection box.

Go back to "StaSel" and click to turn it off. The button should turn grey again. Now look at the edit window. You should see only two stations listed, 2 and 4. They will not appear until "StaSel" has been turned off. This is because "StaSel" is cumulative -- you can select several stations and they are not toted up until you turn the function off again. Clicking on "Sta ID" tells you that these stations are cruise (or section) 1, stations 205 and 207, and their respective locations (Lon, Lat).

Select SA and PT and "XY Plot" again (you did kill the first one, didn't you?). What do you get? A T/S diagram, but only 2 stations show up -- the 'active' or 'selected' stations. To make life easy, let's give these stations their own, distinctive color. The edit window should still be showing just '2 4'.

Pull down the top left menu, that says "Color". Pick yellow. Select "XY Plot" again. But wait! What variables will be plotted? Try it and see. STAPLOT keeps track of the last 2 or in some cases 3 variables you've selected. You should get the same T/S plot, only both stations are now your chosen color. Now go to the menu "Sel All" and "XY Plot" ... the rest of the stations show up on the T/S diagram again, with the stations 2 and 4 being the only ones in yellow. Select Lon, Lat, and "XY Plot" and you can see the geographic distribution of this thermal feature as it appears in your data.

What do the data look like? In the main workspace, do a 'whos'. You should see a number of row vectors size [1 10] and some arrays size [152 10]. Note an empty variable (size [0 0]) called "Dist". Click on the "Stations" plot. Select "StaSel", and drag a box around the 4 stations at ~30N. Turn off "StaSel" and note the active stations in the edit window. They should be [7 8 9 10]. Now click on "StaSel" again, and this time, select the same 4 stations, but one by one, clicking and dragging a small box around each one, from west to east. Now turn "StaSel" off. This time, the edit window should show [10 9 8 7]. Why is this important? Here's one example:

From the variables list, select Dist, PR and then TE. From the list of plot choices, select "Contline". You should get a colored contour plot. Note that station 8 has a colder surface layer than the surrounding ones. Do a 'whos' in the main workspace. Notice that Dist is now 1x4. What happened? Type 'Dist' at the command prompt. What you see is the cumulative distance between stations, in km. But Dist has not been calculated for ALL the stations, only the active ones.

Moreover, the distance between stations has been calculated in the order in which you selected them. Selecting stations one by one ensures they will be listed in the order you want. Otherwise, they are simply arranged in the order (increasing column number) in which they appear in the dataset. If you are plotting something versus Dist, be sure you have the stations selected in the correct order! Plotting versus longitude or latitude is a way to avoid this kind of mix-up, since those variables are specified rather than calculated each time you select them.
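To see why the order matters, here is a rough sketch of a cumulative along-track distance computed for two different orderings of the same stations. This is NOT the dist.m routine STAPLOT uses; it is a plain haversine formula with an assumed mean Earth radius of 6371 km, and the positions and the names legdist and cumdist are made up.

```matlab
R   = 6371;                                % mean Earth radius in km (assumed)
d2r = pi/180;                              % degrees to radians

% Haversine distance between consecutive positions (row vectors, in degrees).
legdist = @(lo, la) 2*R*asin(sqrt( ...
    sin(diff(la)*d2r/2).^2 + ...
    cos(la(1:end-1)*d2r).*cos(la(2:end)*d2r).*sin(diff(lo)*d2r/2).^2));
cumdist = @(lo, la) [0 cumsum(legdist(lo, la))];

Lon = [-70 -68 -66 -64];                   % hypothetical stations along 30N
Lat = [ 30  30  30  30];

sta_we    = [1 2 3 4];                     % selected west to east
sta_mixed = [1 3 2 4];                     % same stations, scrambled order

d_we    = cumdist(Lon(sta_we),    Lat(sta_we));
d_mixed = cumdist(Lon(sta_mixed), Lat(sta_mixed));
```

The scrambled selection doubles back on itself, so its total distance (the last element of d_mixed) comes out longer than the west-to-east total, and a section plotted against it would be garbled.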

You can edit your station selections manually in the edit window (grey bar), which should accept whatever cut and paste protocol your system supports (as well as the simple backspace). You can also manipulate the station selection at the command prompt in the main workspace. The variable 'sta' contains the active stations. One handy use of this is to maintain your own list of station types (such as 'stations-that-are-on-the-shelf', or 'stations-containing-high-DO-values'), in a Matlab variable, e.g.:
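A minimal sketch of what such lists might look like. The list names and station numbers are made up for illustration, and I'm assuming 'sta' is visible in the main workspace as described above.

```matlab
% Hypothetical hand-maintained lists of station indices.
shelf_sta  = [1 2 3];                      % stations that are on the shelf
highDO_sta = [2 4 7];                      % stations containing high DO values

both   = intersect(shelf_sta, highDO_sta); % stations on both lists
either = union(shelf_sta, highDO_sta);     % stations on at least one list

% Make one of the lists the active selection. Note that intersect and union
% return sorted indices, which matters if you plan to plot versus Dist.
sta = either;
```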

And the next time you call a function in STAPLOT, the edit window will be updated to reflect the updated list of active stations.

This discussion brings up an important point: for type 1 and type 3 variables, STAPLOT will happily handle vectors and matrices that are only the size of the current selection of stations. If you want to calculate a quantity on the fly that is only applicable to the selected stations, you can add it to the list of plottable quantities and you can safely plot it against any properties that are already defined, whether for all stations or also only for the current ones. However ... if you select a different set of stations, or subselect among the set for which this quantity has been defined, you will have to remember to resize or recalculate it yourself. The single exception is cumulative distance, which is built-in and is recalculated each time it's called. (As a related consequence, do not put anything in a variable called Dist; it will be overwritten.)

Back to the contour map of Temp vs. Pressure and along-track distance. Click inside the axis box to make sure it is the current figure. Try to station select in this plot. Read the error message in the main workspace. You can't select stations here, because STAPLOT has no way to tell if you've recalculated "Dist" in the interim between the time you made the plot and when you tried to select stations. You cannot select stations in any plot of a variable defined for only some, not all, of the total number of stations. C'est la vie.

Make sure that 7,8,9 & 10 are your active stations. (If you've changed the selection of stations, change back now).

Without reselecting variables, try "ContFill" instead of "Contline". Notice that no new window is opened. Now select "XYZ Plot" instead. You do get a new window. Note the cute slidy bars allowing you to change the viewpoint. Also note that, while on a Contour plot, the "y" variable becomes the vertical axis, on the XYZ plot, the "z" variable plays this role. This follows standard convention. For a more satisfying arrangement, select Dist, TE and then PR and "XYZ Plot".

From the "Plot Ctrl" submenu, switch the view from 2-D to 3-D. Try selecting stations in each. Use the edit window (the grey bar) to edit your selections.

A LITTLE MORE ADVANCED

Okay, how about loading a different data set? Type "pwd" at the command prompt in Matlab to make sure you are still in the STAPLOT home directory. If not, cd to that directory.

Pull down and select "Load data" from the menu. Answer the questions that appear in the main workspace. Hit 'y' for the first question, and type 'sample2' (no quotes) for 'name of options file'. Ignore the question about the path, and just hit return. You'll see a similar set of stations, but the properties are slightly different. In fact, in the variables list, you'll see T1, R1, and S1 (mean temp, density, and salinity for the upper 1000m of the water column). But if you type a 'whos' in the main workspace ... they're not there! On the other hand, typing 'showtype' in the workspace shows they are correctly defined.

These variables don't show in the main workspace because they were not loaded at the very beginning when you started STAPLOT -- there were no corresponding variables in the first sample session (sample1). The list of variables that will be visible in the main workspace when you start STAPLOT is contained in a 'startup' file. The name of this file is the session name (e.g. sample1), plus 'o.m' (e.g. "sample1o.m"). Use your favorite editor to examine sample1o.m.

The session options themselves are stored in "sample1.m". Session options include map projection, station colors, number of contour lines, values of isopycnals, available plotting colors, isobaths to be loaded, what variables are defined, the names used for the default variables (longitude, latitude, pressure, depth, salinity, in situ temperature, and potential temperature), whether pressure and depth are positive upward or downward, the colors in which to plot the chosen isobaths, and not least, the name and path for the dataset to be loaded at startup time.

All of these session options can be changed to suit your needs, and I strongly encourage you to make use of the options files. In particular, direct editing of the options file is the only way to change the following:

Any other options, such as the list of defined variables and the file and path of the dataset, may be changed there as well. In fact, one of the easiest ways to create a new session is to copy the options file from the old one (changing section, nstn, the proplist etc. as appropriate). The data file and path can be changed at STAPLOT startup time, or when a new dataset is added or loaded. The list of defined properties can be changed at any time during a session, by using the "Add Prop" and "Del Prop" selections in the property submenu.

Now, back to the problem at hand. To recapitulate, the property list in STAPLOT shows 'T1', 'R1', and 'S1' (mean temp, density, and salinity for the upper 1000m of the water column). But they do not appear in the main workspace. To access them, in the main workspace, simply type 'global R1 S1 T1' at the prompt. You will have to do this any time you load a new data set and it has new and different variables in it. But the plotlist and 'showtype' will always show you the whole list accessible to STAPLOT. Plot a few things. Notice that the length (number of rows in type 3 variables) of these stations is different.

Now what if you want to combine the 2 datasets? Try quitting STAPLOT with the "Quit" button. Now type 'clear all' and 'staplot' at the Matlab command prompt. Which dataset does it ask you to load? It should be sample2. Go ahead and load it. Now pull down the "load data" menu and select "Append data". Read the warnings! Hit return ([CR]) to accept padding. Notice that the resulting dataset is padded to the length of the data in samp1dat. Don't save the merged dataset.

A word of warning here: CUSTOMIZE each options file before merging datasets, and try to be as consistent as possible. If you load something using the default options, DON'T try to merge it before saving the options and manually customizing them. (Look in sample1.m and sample2.m for examples.) This will save many headaches. And again, DO customize your options files. They are liberally-commented text files, and quite readable.

Once you're ready to go again, select "Sel All", then choose Lon and Lat from the property submenu. Choose "XY Plot". You should see a plot with one set of X's and one of O's. Which is which? Try selecting just one section using the "Section X" menu bar (note there are now 4) and hit "XY Plot" again. You won't see any change -- first you have to clear the figure.

Select "Cl Fig", and "XY Plot" again et voila. While you're at it, try "Cl Sta". Select each section or cruise leg in turn, and select a distinct color for it. When you're done, "Sel All", and "XY Plot". Now you can see which section is which. Select "SA" and "PT" and "XY Plot". Only 2 of the colors show up! Type 'sum(isnan(PT))' at the prompt. You'll see that the last 10 columns of potential temperature are just padding -- this variable was not defined in the data you appended. Which goes to show you that it's not terribly efficient to merge datasets until you've put some effort into making them consistent. Kill the PT/S plot and select SA and TE instead. You should see all 4 sections appearing on the T/S diagram.
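The same check can be turned into a list of the stations that are nothing but padding. A small sketch on a made-up matrix (the values and the names nbad and allpad are hypothetical):

```matlab
% Hypothetical PT matrix: 3 rows, 4 stations; the last 2 stations are all padding.
PT = [10  12 NaN NaN;
       8 NaN NaN NaN;
       6   7 NaN NaN];

nbad   = sum(isnan(PT), 1);                % NaN count for each station (column)
allpad = find(nbad == size(PT, 1));        % stations with no PT data at all
```

Giving sum the explicit dimension argument keeps it counting down the columns even if the matrix happens to have only one row.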

Select all stations along ~30N at one fell swoop. You should get sta equal to [7 8 9 10 12 13 15 16 17]. Select Lon, PR, TE and "ContFill". You'll notice ContFill does not always do a good job when the station lengths are quite different. Choose "Contline" instead. The extent of the axis is greater than the extent of what's actually plotted. This is true for all the plots, just more obvious here. Plot axes are sized to the range of the whole dataset, not just the currently active stations, so you can see at a glance how the range of selected data compares with the total range.

Click on "Zoom" (or first on the plot face and then on "Zoom" if you've done anything in the interim) and zoom in to just the contoured area. Now select "StaSel" and try it out. Did you select any stations? Try to zoom in some more. The zoom feature should be turned off, since it uses the same functions of the mouse that StaSel does. Turn zoom on again and double-click in the plot window to zoom back out to the original limits.

Lastly, click on the "Stations" window and select "sealevel". Add "b2000", the 2000m isobath. Still don't recognize the location?

Pull down the "xlims" option from the "Plot Ctrl" submenu. Type '-90 -40', then click on the red "X" to show you're done. Your plot will look pretty gross. Select "ylims" from the same menu and change them to '0 40'. Hit return. Notice that the "ylims" menu did not go away, although the y-axis limits change. Delete 40 and replace it with 35. Make sure to click on the red "X" when you're done. (Note that when these options are active, they are covering up another menu, the plot symbol menu.) Now the map should be recognizable, but the yticks are screwed up. Select "YTicks" and type in a '10'. Ticks may be specified by a scalar value (the tick interval), or you can type in each specific tick if you want irregularly spaced ticks, or if Matlab seems to be resetting the ticks to some weird value. Don't worry if your tick specifications scroll off the tiny screen, they will be read and executed.

Click on "Grid". This is starting to look pretty good. Select "SaveOpts", pick a unique tagname like 'junk' (NOT a filename that exists in the current directory, please!). Open junk.m and junko.m in your favorite editor. Notice that they claim to describe 'samp1dat', while actually the number of stations &c are the numbers for the merged data. If you had saved the merged data into a new file (either by replying 'y' right after the merge, or by selecting the "Save Data" option), that filename would be the one (correctly) listed in 'junk.m'. Try "SaveData", use the filename 'junk' (STAPLOT will supply the '.mat' extension, just as in the usual Matlab session) and then "SaveOpts". Using your favorite text editor, examine junk.m and junko.m again.

When you're all done, don't forget to remove junk.* from the STAPLOT home directory. Clear everything and run the 'install.m' script for STAPLOT. This will give you some suggestions for adding the STAPLOT directories to your Matlabpath so that STAPLOT can be accessed from anywhere, not just by cd-ing to its resident directory.

A quick note about creating isobaths. The quickest way I've found to create an isobath file is to load a gridded bathymetry into Matlab and contour it at the desired level, making sure to SAVE THE CONTOUR LINES:

[c,s] = contour( ... );

The format of the line will be a 2xN matrix. The first column of each line segment contains the contour value and the number of points in that segment. By reading the number of points into a variable, and iteratively stepping forward to the next such column, you can locate all of the columns which contain this information. Once located, replace all such columns with a column of nans: [NaN NaN]'. Save the contour line in a variable called dlines. Now you have an isobath file that can be used in STAPLOT.
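The stepping procedure just described can be sketched as follows. To keep the example self-contained I've built a small contour matrix by hand in the layout that contour returns, rather than contouring real bathymetry; the numbers are made up.

```matlab
% Hand-built contour matrix: two line segments at the 2000 level.
% Each segment starts with a header column [level; number of points].
c = [2000 -70 -69 -68 2000 -60 -59;
        3  30  31  32    2  20  21];

dlines = c;
k = 1;
while k <= size(c, 2)
    npts = c(2, k);               % number of points in this segment
    dlines(:, k) = NaN;           % replace the header column with [NaN NaN]'
    k = k + npts + 1;             % step forward to the next header column
end
```

Note that the loop reads the point counts from the untouched copy 'c', so overwriting the headers in 'dlines' doesn't upset the stepping.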

Quit STAPLOT and start converting all your data to the appropriate format. :-)

A NOTE ON PLATFORMS

STAPLOT has been run and tested on Unix, Linux, Windows95, Windows 3.11 and Mac systems. Windows users will not see the color selections in color :-(. Mac users will note a number of differences, among them the following: Improper rotation of the axis and contour line labels (Macs apparently handle only 0 and 90 degree rotations), which will give you a WARNING message scrolling up the screen any time you use polar projections or the contline function. I don't know if this is an inherent Mac feature, or a limitation in the Matlab implementation for Macs, but I suspect the former. I suppose I could suppress the label rotations, but I prefer to let Mac upgrade their OS.

Another Mac difference is that the color selection and the pulldown axes handling submenus will not appear on the STAPLOT main menu, but instead on the main menu of the Mac itself. Select the STAPLOT menu window to make these functions appear.

If you like STAPLOT, you're welcome to hand it around to anyone else you think might. STAPLOT is freeware. Please be sure to include all the files and scripts in their entirety (a la GNU public licensing), including a copy of this document. Bug reports and suggestions to me at:

dbyrne@grayling.umeoce.maine.edu

-- Deirdre Byrne 98/04/09