Please note that portions of the following paper have been published and are copyright IEEE (Institute of Electrical and Electronics Engineers) and cannot be reproduced without the express written permission of the IEEE.

A Prototype Metadata Database for Online Analytical Processing of Environmental Data

Harold Geller1,2, John Ertlschweiger3,4, Sarah Conger5 and August Ryberg6

Presented at the Ninth International Conference on Scientific and Statistical Database Management

The Evergreen State College, Olympia, Washington, August 13-15, 1997


We present preliminary results on the development of a prototype database system that demonstrates the utility of integrating environmental metadata within an online analytical processing environment. Using existing data derived from CD-ROMs of the National Snow and Ice Data Center (NSIDC), the Consortium for International Earth Science Information Network (CIESIN) and the U.S. Geological Survey (USGS), we populated a prototype metadata database whose architecture facilitates scientific and statistical investigations of geophysical parameters associated with the polar regions, while allowing for data fusion from other regions and earth science disciplines. Thus, the database schema facilitates interdisciplinary studies of the polar regions and of global environmental change. The user can combine two disparate sources of geophysical data in a single query that yields a useful product. For example, the user can examine the sulfur dioxide concentration in parts per million at a specific latitude, longitude, and altitude, and find that portion of the polar region that has greater than 90% cloud cover for a specified time of the year. Additionally, the user might wish to examine the correlation of cloud cover with a specific aerosol pollutant, establish a temporal regime, and determine the utility of aerosol pollutants as a forecaster of sea surface temperature or ice concentration. We also demonstrate the utility of allowing access to this database via the World Wide Web using an interface to the underlying Oracle database management system. Figure 1 summarizes the overarching approach.

Figure 1 Polar Ice and Aerosol Data System Diagram

1Center for Earth Observing and Space Research (CEOSR), George Mason University, Fairfax, Virginia
2SAIC/General Sciences Corporation, Laurel, Maryland
3Institute for Systems and Software Engineering (ISSE), George Mason University, Fairfax, Virginia
4BDM International, McLean, Virginia
5Hughes/STX, Lanham, Maryland
6PRC, Inc., McLean, Virginia


Remote sensing is used to monitor the entire cryosphere because of the difficulty of accessing these regions. For example, remote sensing has been used to monitor the variability of seasonal snow cover, for the dual purpose of assisting commercial ski resorts and city water planners [Duguay and Hurtubise, 1992]; the periodic movement of glaciers, known as glacier surge, to assist global change researchers and the merchant marine [Molnia, 1993]; and ice drifts, polynya evolution, effects of storms, stresses of winds and wind direction, and standard sea ice mechanisms [Kozo et al., 1992].

The polar regions play a large role in the comprehension and prediction of global climate change. There is a consensus that warming in the polar regions would be higher than in the lower latitudes and perhaps more seasonally dependent. This is due in part to the fact that in the polar regions there is a positive feedback mechanism between the solar radiation and the Bond albedo. Furthermore, the polar regions often display a strong temperature inversion effect near the surface which acts as a mechanism for trapping energy [Maxwell, 1987].

The National Snow and Ice Data Center (NSIDC) is the primary source of data access for scientists who wish to analyze geophysical parameters in the polar regions. Often, this data is provided in a form that requires reformatting by interdisciplinary scientists. Our system architecture allows for the simple query and access to the database, and provides a manageable means to begin statistical studies of the polar regions. It is an architecture which is analogous to a relational online analytical processing architecture. Such questions as the extent of sea ice coverage over a period approaching a solar cycle can be addressed with a low level of effort.

Certain aspects of global climate may best be studied from the perspective that the earth is a system comprised of non-linearly interacting components such as the atmosphere, hydrosphere and biosphere [Lovelock, 1974 and 1991]. These components exhibit aspects of self-organization and adaptivity both within and between the individual components. In this context, the sustaining of the earth's global atmosphere implies the regulation of temperature, acidity, and distribution of elements such as sulfur, phosphorus, and nitrogen. This regulation is considered to take place by the operation and interaction of biogeochemical cycles, that is, the circulation of substances defined by their fluxes, associated processes, and reservoirs [Wollast et al., 1993].

One postulated temperature regulation loop is based on aerosol production changes resulting from the biochemical production of dimethyl sulfide (DMS) [Charlson, 1987]. In this scenario DMS is produced in aqueous solution by marine phytoplankton [Anderson, 1992 and Charlson, 1987]. DMS is transported from the sea surface to the atmosphere and is converted, via oxidation [Charlson, 1987], to non-sea-salt sulfate (NSS-SO4), which leads to the generation of cloud condensation nuclei (CCN). The albedo of the resulting clouds may reduce solar radiation reaching the surface and lower the surface temperature. The production of DMS by phytoplankton is hypothesized to be directly proportional to temperature. The control is effected since the production of CCN is directly proportional to the production of DMS. This may be supported by satellite observations of DMS, cloud cover and plankton coverage.

One approach to investigation of such processes is through data exploration. To perform such studies, it is necessary to access a wide variety of databases containing dissimilar data with considerable variance in quality, density of spatial and temporal coverage and descriptions of data lineage. Our prototype effort demonstrates one approach to this type of database integration and data fusion.

Metadata is fundamentally ancillary information about the provided data. It may consist of a set of descriptive attributes which describe a dataset or some other item of information [Shelley and Johnson, 1995]. Without metadata, the dataset itself may be of little use to the scientist. For example, knowing the value of a temperature reading is of little use unless it is also known where, when and how the measurement was taken. Metadata is crucial to the development and maintenance of any heterogeneous distributed scientific database or application. Most commercial databases possess a static schema; scientific databases, however, generally continue to grow with the addition of new data and new data types. To integrate these new data sets, there should exist some common set of metadata. The success of any database or application may best be measured by the number of new relationships discovered and by the new questions that these relationships stimulate [Bretherton, 1994].

Vast amounts of earth science data have already been collected over the past few years with the expectation of an even larger amount in the future. To assist in the fusion of this data much work has been done in the development of metadata standards. The Directory Interchange Format (DIF) was developed at the National Aeronautics and Space Administration (NASA) to support the NASA Master Directory and the Global Change Master Directory [Shelley and Johnson, 1995]. The Federal Geographic Data Committee (FGDC) has developed a content standard for digital geospatial metadata. This standard establishes the names of data elements, definitions of these elements, and information about the values that are to be provided for the data elements. An executive order issued in 1994 states that all new geospatial data collected, either directly or indirectly, shall use the standard developed by the FGDC [Standards for Digital Geospatial Metadata, 1994].

Our intent within this effort was to adhere to these standards in the development of our database schema so that new datasets would be easily integrated into our system in the future, or that our datasets would be integrated into larger sets.

Technical Approach

Our prototype effort was partially based on work done previously at the Consortium for International Earth Science Information Network (CIESIN). This effort involved the collection of raw satellite data, processing of satellite data, generation of summary statistics of the data, creation of images and the development of a CD-ROM [Colvin et al., 1993, Geller et al., 1993 and Geller and Colvin, 1994]. The focus for this effort was to extend the capabilities presented on the CD-ROM. This was to be accomplished via the World Wide Web and a web browser such as Netscape. Thus, by allowing for a single database query using a graphical user interface with active displays (with the use of Java applets), any user might query two disparate scientific databases to extract intelligent information. The users could be research scientists, students, or organizations with a specific need for environmental information. Figure 2 summarizes the effort described in this paper, and its connection to the previous effort, with the current effort occurring on the right side of the dashed line.

Figure 2 Polar Regions Metadata Database Dataflow Diagram

An integral part of the database design is to generate representative queries that will be used to extract information from the database. In this case domain experts were surveyed to ascertain typical queries that would be generated by the scientific community while conducting their research and experimentation. For our prototype the following sample queries were formulated for use during database design and later during the test phase. Initially we formulated a series of questions that the user should be able to answer with the help of the prototype. These questions included:

  1. What is the source of the data and who is the PI?
  2. What is the spatial and temporal coverage of the data?
  3. What region of the electromagnetic spectrum are these data?
  4. What is the histogram of the pixel count for sea ice concentration in a given month and year?
  5. What does the plot of the number of pixels that have a sea ice concentration greater than 90% over a two year period look like?
  6. What does an image of the data look like, and what format is it in?
  7. How do sea surface temperature and sea ice concentration compare for the same month?
  8. Where can I go to get more information about where the data were obtained?

Typically, researchers have to install custom software on their own computers or make use of ftp servers to access archival data. Thus, the user would not have the opportunity to perform initial analysis of the data until it was downloaded to their workstation. This data screening process could be time consuming and inefficient.

More recently, earth and space science services have appeared on the World Wide Web (Web) eliminating the need for special software to connect to archives. Also, Web applications can be designed to allow the user to perform some initial screening of the data allowing for the downloading of only those data of interest. Thus, our user interface was implemented via a client server Web application.

Design and Implementation Discussion

To facilitate data mining and to allow for data analysis that addresses global climate questions, we chose to develop a mechanism that would demonstrate simultaneous access to both sea-ice concentration data and selected aerosol concentration data. Each dataset has a unique set of metadata and data structure. It was these two dissimilar databases that we sought to incorporate into our prototype. The gathering of both observed aerosol data and sea ice concentration data, establishing metadata descriptions of this data, and determining appropriate schemas for the database were undertaken.

The sea ice concentration data was derived from polar orbiting instruments, namely sensors on the Defense Meteorological Satellite Program (DMSP) platform. These raw data are archived at the National Snow and Ice Data Center (NSIDC) in Boulder, Colorado. The processed sea ice concentration data used in developing our prototype effort was extracted from the Consortium for International Earth Science Information Network (CIESIN) CD-ROM titled Sea Ice in the Polar Regions and The Arctic Observatory, dated 1996.

The polar ice data consists of processed image data beginning with January 1985 and is continuous through December 1990 for three of the five parameters. The five classes, or parameters of specific data, include ice velocity, sea surface temperature, sea ice concentration, sea surface wind speed, and cloud coverage. There is one image for each parameter, for each month during the data period. Each parameter is divided into subclasses or bins of data defining a specific range of data values. Each data bin contains the number of pixels of data which satisfy the subclass characteristics, and each pixel represents a 25 by 25 kilometer square on the surface of the Earth. For the purpose of this example, only the sea ice concentration parameter has been decomposed into the concentration subclasses. Each parameter may possess different values for the data bin subclass. However, in every case, each subclass contains the number of pixels of data that represent the subclass.

The polar ice data available for this prototype consisted of five environmental parameters remotely derived from satellite observations over the Earth's north polar region. The five parameters are sea ice concentration, cloud cover, sea surface temperature, sea surface wind speed, and ice speed. The following metadata pertains to all parameters except ice speed, which has a spatial resolution of 127 km square.

Temporal Resolution: Monthly from January 1985 to December 1990

Spatial resolution: 25 km x 25 km footprint - except ice speed 127 km x 127 km

Latitude: 48N - 87N

Longitude: 180W - 180E

Ice concentration and cloud cover are measured in percent coverage of the surface within the footprint area. Sea surface wind speed and ice speed are measured in meters/second, and sea surface temperature is measured in kelvins. The sea ice concentration parameter has eleven data bins with the bin ranges illustrated in Figure 3 using an object modeling technique (OMT) approach.


Figure 3 Sea Ice Concentration Polar Data Parameter with Subclasses


The remaining parameters in the polar regions metadata database have the following bin ranges:

Cloud Cover (%) 0, 1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, 91-100

Ice Speed (m/s) 0.00, 0.01-0.05, 0.06-0.10, 0.11-0.15, 0.16-0.20

Sea Surface Temp (K) 0-199, 200-220, 221-225, 226-230, 231-235, 236-240, 241-245, 246-250, 251-255, 256-260, 261-265, 266-270, 271-275, 276-280

Surface Wind Speed (m/s) 0.0, 0.1-2.0, 2.1-4.0, 4.1-6.0, 6.1-8.0, 8.1-10.0, 10.1-15.0, 15.1-20.0
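As an illustration of how a measured value maps onto these bins, the following minimal Python sketch looks up the bin for a cloud cover value. The bin ranges are copied from the list above; the function name `find_bin` and the zero-based bin index are assumptions made for this example and are not part of the prototype.

```python
# Cloud cover bins from the list above, as inclusive (low, high) percent ranges.
CLOUD_COVER_BINS = [(0, 0), (1, 10), (11, 20), (21, 30), (31, 40), (41, 50),
                    (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]


def find_bin(value, bins):
    """Return the zero-based index of the bin containing value, or None.

    Hypothetical helper: the ranges are treated as inclusive and are written
    for whole-number values, so a value between two ranges (for example 10.5)
    matches no bin and yields None.
    """
    for index, (low, high) in enumerate(bins):
        if low <= value <= high:
            return index
    return None


high_cloud_bin = find_bin(95, CLOUD_COVER_BINS)   # falls in the 91-100 range
clear_sky_bin = find_bin(0, CLOUD_COVER_BINS)     # falls in the single-value 0 bin
```

The same lookup would apply to the other parameters by substituting their bin lists.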


The acquired aerosol data consisted of concentrations of chemical species that are candidates for CCNs and the related spatial and temporal characteristics. Dimethylsulfide and sulfur dioxide concentrations and chemistry descriptions have been obtained [Blomquist, et al 1996; Yvon, et al 1996-1; Yvon, et al, 1996-2; Suhre, 1995; Andreae, 1995]. Other species are carbonyl sulfide and carbon dioxide [Thornton, et al, 1996], hydrogen peroxide [Heikes, et al, 1996], ozone [Singh, et al, 1996] and nitrogen oxides [Buhr, et al, 1996]. The aerosol species are illustrated in Figure 4 using the OMT approach. Notice that the organization and structure of aerosol data is significantly different from that of the polar ice data. The metadata schema developed for the aerosol database was intended to support queries addressing both sea ice concentration data and aerosol data queries and is summarized as follows:


I. Parameter

A. Species = Char(10)

1. Hydrogen Peroxide = H2O2

2. Carbonyl Sulfide = OCS

3. Sulfur Dioxide = SO2

B. Concentration (decimal 0.0)

1. Parts per million

2. Parts per billion

3. Parts per trillion


II. Spatial Characteristics

A. Surface area (specified as 4 corners of a rectangle)

1. nwlat = decimal degrees 0.0

2. nwlon = decimal degrees 0.0

3. selat = decimal degrees 0.0

4. selon = decimal degrees 0.0

B. Altitude range (specified as from - to)

1. From: decimal km 0.0

2. To: decimal km 0.0

III. Temporal Characteristics

A. Range (a period of time)

1. Begin: yyyy/mm/dd/hh

2. End: yyyy/mm/dd/hh

B. Resolution

1. year = decimal 0.0

2. month = decimal 0.0

3. day = decimal 0.0

4. hour = decimal 0.0

Figure 4 Select Aerosol Data Parameters with Subclasses
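The aerosol metadata outline above can be pictured as a single record type. The sketch below is one possible Python rendering, not the prototype's implementation: the class name, the field names other than `nwlat`, `nwlon`, `selat` and `selon`, the `covers` method, and the sample values are all hypothetical, and the rectangle is assumed not to cross the 180 degree meridian.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AerosolRecord:
    """One aerosol observation with its spatial and temporal metadata."""
    species: str          # chemical formula, e.g. "SO2" (Char(10) in the outline)
    concentration: float
    units: str            # "ppm", "ppb" or "ppt"
    nwlat: float          # northwest corner, decimal degrees
    nwlon: float
    selat: float          # southeast corner, decimal degrees
    selon: float
    alt_from_km: float    # altitude range in kilometers
    alt_to_km: float
    begin: datetime       # temporal range
    end: datetime

    def covers(self, lat, lon, alt_km, when):
        """True if the point and time fall inside this record's extent."""
        in_area = self.selat <= lat <= self.nwlat and self.nwlon <= lon <= self.selon
        in_altitude = self.alt_from_km <= alt_km <= self.alt_to_km
        in_time = self.begin <= when <= self.end
        return in_area and in_altitude and in_time


# Invented sample values, for illustration only.
sample = AerosolRecord("SO2", 0.5, "ppb", 60.0, -170.0, 50.0, -150.0,
                       0.0, 2.0, datetime(1991, 9, 1), datetime(1991, 10, 31))
inside = sample.covers(55.0, -160.0, 1.0, datetime(1991, 9, 15))
```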

Database population of the prototype database system is described here to illustrate how the full database system population would be undertaken. Our prototype was implemented using the Oracle relational database management system. Its robust design allows for the addition of new parameters without adversely affecting the existing data or tables in the relational database. Additionally, triggers have been implemented in the prototype to provide automatic updating of critical metadata parameters. A listing of some of the major tables in the relational database, with their attributes, is provided below.



(param_id NUMBER (3) NOT NULL,

data_date DATE NOT NULL,

image_name VARCHAR2 (100),

PRIMARY KEY (param_id, data_date));


CREATE TABLE cloud_cover

(param_id NUMBER (3) NOT NULL,

data_date DATE NOT NULL,

bin_id NUMBER (2) NOT NULL,

pixel_count NUMBER (10),

PRIMARY KEY (data_date, bin_id));


CREATE TABLE sea_ice_concentration

(param_id NUMBER (3) NOT NULL,

data_date DATE NOT NULL,

bin_id NUMBER (2) NOT NULL,

pixel_count NUMBER (10),

PRIMARY KEY (data_date, bin_id));


CREATE TABLE sea_surface_temp

(param_id NUMBER (3) NOT NULL,

data_date DATE NOT NULL,

bin_id NUMBER (2) NOT NULL,

pixel_count NUMBER (10),

PRIMARY KEY (data_date, bin_id));


CREATE TABLE sea_surface_wind_speed

(param_id NUMBER (3) NOT NULL,

data_date DATE NOT NULL,

bin_id NUMBER (2) NOT NULL,

pixel_count NUMBER (10),

PRIMARY KEY (data_date, bin_id));


CREATE TABLE ice_speed

(param_id NUMBER (3) NOT NULL,

data_date DATE NOT NULL,

bin_id NUMBER (2) NOT NULL,

pixel_count NUMBER (10),

PRIMARY KEY (data_date, bin_id));



(bin_id NUMBER (2) NOT NULL,

param_id NUMBER (3) NOT NULL,

bin_low NUMBER (4),

bin_high NUMBER (4),

PRIMARY KEY (bin_id, param_id));



(param_id NUMBER (2) NOT NULL,

param_name VARCHAR2 (25) NOT NULL,

data_units VARCHAR2 (20),

loc_units VARCHAR2 (20),

alt_units VARCHAR2 (20),

date_updated DATE,

who_updated VARCHAR2 (8),

poc_name VARCHAR2 (30),


reference VARCHAR2 (250));



(param_id NUMBER (3) NOT NULL,

p_serno NUMBER (5) NOT NULL,

start_date DATE,

end_date DATE,

loc_id NUMBER (4),

alt_id NUMBER (3),

PRIMARY KEY (param_id, p_serno));



(param_id NUMBER (3) NOT NULL,

p_serno NUMBER (5) NOT NULL,

start_date DATE,

end_date DATE,

loc_id NUMBER (4),

alt_id NUMBER (3),

PRIMARY KEY (param_id, p_serno));



(param_id NUMBER (3) NOT NULL,

p_serno NUMBER (5) NOT NULL,

start_date DATE,

end_date DATE,

loc_id NUMBER (4),

alt_id NUMBER (3),

PRIMARY KEY (param_id, p_serno));



(param_id NUMBER (2) NOT NULL,

p_serno NUMBER (5) NOT NULL,

loc_id NUMBER (3) NOT NULL,

nw_lat NUMBER (4),

nw_long NUMBER (4),

se_lat NUMBER (4),

se_long NUMBER (4),

PRIMARY KEY (param_id, p_serno, loc_id));



(param_id NUMBER (2) NOT NULL,

p_serno NUMBER (5) NOT NULL,

alt_id NUMBER (3) NOT NULL,

low_alt NUMBER (6),

high_alt NUMBER (6),

concentration NUMBER (5),

PRIMARY KEY (param_id, p_serno, alt_id));
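The automatic updating of metadata by triggers, mentioned before the table listing, might look roughly like the sketch below. It is an illustration only: SQLite (through Python's standard `sqlite3` module) stands in for Oracle, whose trigger syntax differs, and the table name `parameter` and trigger name `parameter_touch` are hypothetical. Only the column names are taken from the listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table name; columns follow the parameter metadata listing.
CREATE TABLE parameter (
    param_id     INTEGER PRIMARY KEY,
    param_name   TEXT NOT NULL,
    data_units   TEXT,
    date_updated TEXT
);

-- Stamp the row whenever its descriptive attributes change. The trigger
-- fires only on param_name or data_units, so its own update of
-- date_updated does not fire it again.
CREATE TRIGGER parameter_touch
AFTER UPDATE OF param_name, data_units ON parameter
BEGIN
    UPDATE parameter
       SET date_updated = datetime('now')
     WHERE param_id = NEW.param_id;
END;
""")

conn.execute(
    "INSERT INTO parameter (param_id, param_name, data_units) VALUES (?, ?, ?)",
    (1, "cloud_cover", "percent"),
)
before = conn.execute(
    "SELECT date_updated FROM parameter WHERE param_id = 1").fetchone()[0]

conn.execute("UPDATE parameter SET data_units = '%' WHERE param_id = 1")
after = conn.execute(
    "SELECT date_updated FROM parameter WHERE param_id = 1").fetchone()[0]
```

Before the update `date_updated` is empty; afterwards the trigger has filled it in without any action by the person editing the row.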



We chose to develop a user interface incorporating Web-based client/server applications. We concluded that there are five essential criteria for such a client/server user interface:

We built a demonstration version of the interface to our Oracle database which adheres to these five criteria. This version takes advantage of Web technology including HTML forms, Netscape frames, JavaScript, and Java. Figure 5 illustrates the flow of information within the client/server architecture.

Figure 5 Client/Server Architecture


The user is first presented with an HTML page with links to several query forms. The user must first select one of the forms. The user can then enter a query by selecting parameters (e.g. cloud cover) and a date or time period. This query is then submitted to the query engine for processing. The query engine is a common gateway interface (CGI) script which performs the following functions:

There are two query forms currently available to the user. The first form allows the user to simultaneously display two images of the arctic region based on the date and parameter type selected. This form was based on the Arctic Observatory CD-ROM interface and allows the user to compare and contrast two images as well as view some basic metadata associated with the images. The user also has the capability to bring up a histogram of each image in a separate window and download a TIFF version of the image. It should be noted that the actual images are not stored within the Oracle RDBMS. Instead the database maintains pointers to files that are located on our Web server.

The second query form allows the researcher to perform an analysis of trends in the data and compare these trends to those in other datasets, such as aerosols. For example, the researcher can bring up a graph of sea ice concentration as a function of time and compare it to concentrations of aerosols during the same time period.

The interface allows the user to gain a basic understanding of the types of datasets contained within the archive and perform first order analyses of the data to determine if it will be useful for their studies. By using Web protocols, the interface is accessible to multiple computer platforms.

Image data was analyzed on a Windows-compatible platform using the data analysis software called Transform. This software was derived from software written at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (UIUC). It allows for the display of histogram data from the TIFF formatted images stored on the CIESIN CD-ROM and the HDF formatted data on the USGS CD-ROM. These histograms provided the raw data for input into our polar regions metadata database system prototype.

The histogram data for the various geophysical parameters are stored in an Oracle database, whose design was discussed earlier in this paper. These data can then be queried using the SQL*Plus module of Oracle.
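A query of this kind, corresponding to the sample question about pixels with sea ice concentration greater than 90% over a two year period, might be sketched as follows. This is not the query used in the prototype: SQLite stands in for Oracle so that the example is runnable, ISO date strings stand in for the Oracle DATE type, the bin table name `data_bin` is hypothetical (its name is not given in the listing), the bin ranges are assumed to resemble those of cloud cover, and all pixel counts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sea_ice_concentration (
    param_id    INTEGER NOT NULL,
    data_date   TEXT    NOT NULL,   -- ISO date text stands in for Oracle DATE
    bin_id      INTEGER NOT NULL,
    pixel_count INTEGER,
    PRIMARY KEY (data_date, bin_id)
);
-- Hypothetical name for the bin definition table.
CREATE TABLE data_bin (
    bin_id   INTEGER NOT NULL,
    param_id INTEGER NOT NULL,
    bin_low  INTEGER,
    bin_high INTEGER,
    PRIMARY KEY (bin_id, param_id)
);
""")

# Assumed bin ranges and invented pixel counts, for illustration only.
conn.executemany("INSERT INTO data_bin VALUES (?, ?, ?, ?)",
                 [(9, 1, 81, 90), (10, 1, 91, 100)])
conn.executemany("INSERT INTO sea_ice_concentration VALUES (?, ?, ?, ?)",
                 [(1, "1985-01-01", 9, 1200), (1, "1985-01-01", 10, 5400),
                  (1, "1985-02-01", 9, 1100), (1, "1985-02-01", 10, 6000)])

# Monthly pixel totals for bins lying entirely above 90% concentration.
rows = conn.execute("""
    SELECT s.data_date, SUM(s.pixel_count)
      FROM sea_ice_concentration s
      JOIN data_bin b
        ON b.bin_id = s.bin_id AND b.param_id = s.param_id
     WHERE b.bin_low > 90
       AND s.data_date BETWEEN ? AND ?
     GROUP BY s.data_date
     ORDER BY s.data_date
""", ("1985-01-01", "1986-12-31")).fetchall()
```

Each returned row pairs a month with the number of pixels above the threshold, which is the series one would plot for that sample question.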

Each record in the database for the DMSP-derived data contains a value that is best interpreted as an area of the polar region, that is, the number of 25 by 25 kilometer cells whose value of the geophysical parameter falls within the bin that the record represents. Thus, each cell covers 625 square kilometers and represents the geophysical parameter averaged over the month.
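The conversion from a stored pixel count to a surface area is simple arithmetic, sketched here in Python. The function name and its default argument are assumptions made for this example; the 25 km and 127 km footprints are those stated earlier in the paper.

```python
def pixels_to_area_km2(pixel_count, footprint_km=25.0):
    """Convert a pixel count to area in square kilometers.

    Hypothetical helper: each pixel is treated as a square footprint,
    25 km on a side by default (625 square kilometers per pixel).
    """
    return pixel_count * footprint_km ** 2


one_pixel = pixels_to_area_km2(1)            # 625.0 square kilometers
ice_speed_pixel = pixels_to_area_km2(1, 127.0)   # ice speed uses a 127 km footprint
```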

The user is able to view histograms and browse imagery via a CGI program which is accessible via a standard Web based HTML file from any platform supporting a sufficiently advanced Web browser (for example, Netscape 3.0). The first Web page presented to the user allows the user to decide whether images or histogram plots are desired. This page is depicted in Figure 6. Note that the user has the option to actively investigate the original sources of the data, that is either the National Snow and Ice Data Center, CIESIN or the U.S. Geological Survey. Then the user is presented choices for either viewing the browse imagery or plotting histogram data.

The image Web page is depicted in Figure 7. The user is presented a forms based page. In the lower portion of the page, the user chooses the month, year and parameter for which an image is available. Once chosen, the user then clicks the submit button and the image will be displayed in the upper portion of the screen. These choices are identical on both halves of the screen, allowing the user to display and compare the images for the requested month and year.

The histogram display Web page is depicted in Figure 8. Here the user chooses the parameter of choice followed by a qualifier, that is, either equals, less than or greater than a chosen value for the bin or bins of interest. A start date, month and year, and an end date, month and year are also to be chosen by the user. The user can then submit a request, which is sent out as a query to the Oracle database management system. The results are displayed in the upper half of the Web page, originally developed as a bar chart type of representation.
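To make the translation from form fields to a database query concrete, the following Python sketch builds a parameterized SQL statement from the choices described above. It is an assumption about how such a translation could work, not the prototype's CGI code: the function name is hypothetical, the comparison is made against `bin_id` for simplicity, and the `?` placeholders follow SQLite conventions, whereas Oracle uses named or numbered bind variables.

```python
# Qualifiers offered on the form, mapped to SQL comparison operators.
OPERATORS = {"equals": "=", "less than": "<", "greater than": ">"}

# Histogram tables from the table listing earlier in the paper.
TABLES = {"cloud_cover", "sea_ice_concentration", "sea_surface_temp",
          "sea_surface_wind_speed", "ice_speed"}


def build_histogram_query(table, qualifier, bin_value, start_date, end_date):
    """Return (sql, params) for a monthly pixel-count query.

    Table names and operators are checked against fixed sets because they
    cannot be passed as bind parameters; user-supplied values are bound.
    """
    if table not in TABLES:
        raise ValueError(f"unknown parameter table: {table}")
    if qualifier not in OPERATORS:
        raise ValueError(f"unknown qualifier: {qualifier}")
    sql = (f"SELECT data_date, SUM(pixel_count) FROM {table} "
           f"WHERE bin_id {OPERATORS[qualifier]} ? "
           "AND data_date BETWEEN ? AND ? "
           "GROUP BY data_date ORDER BY data_date")
    return sql, (bin_value, start_date, end_date)


sql, params = build_histogram_query(
    "cloud_cover", "greater than", 9, "1985-01-01", "1986-12-31")
```

Restricting the table and operator to known values keeps free-form user input out of the SQL text, which matters for a query engine exposed on the Web.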

Figure 6


Thus, our approach to this effort may be viewed as a customized online analytical processing (OLAP) approach integrating a Web user interface. The authors are fully aware of the fact that our approach does not meet Codd's twelve rules for OLAP as published in his often quoted paper [Codd, 1995]. However, this approach was taken due to the resources at hand and a knowledge of the specific statistical questions that were to be asked of the database.

Consisting of a standard relational database management system on a mainframe and a customized query generator on a local system, this approach is analogous to two-tier data warehousing architectures. This raises the question of the applicability of the two major types of OLAP tools available today, that is the MOLAP (multidimensional database online analytical processing) and ROLAP (relational online analytical processing).

Unfortunately, as of this date, both MOLAP and ROLAP tools are geared to the business community. This is understandable given the higher sales volumes attainable from the business community as compared to the scientific community. However, with the coming data deluge in the science community, especially in the earth and space sciences thanks to upcoming missions from the National Aeronautics and Space Administration (NASA), it is hoped that the vendors will listen to more of the functional and statistical requirements from the science community and help move the community into such applications. Which OLAP technology is best applied in the science community is not a subject for this paper, and we recommend that the reader investigate comparison treatises on this subject [Raden, 1997]. In the meantime, the science community appears to be approaching this OLAP technology through customized efforts such as that ongoing within George Mason University's Center for Earth Observing and Space Research (CEOSR) and its prototype development effort of a Virtual Domain Applications Data Center (VDADC).

Figure 7

Figure 8



We have demonstrated one approach to the development of a scientific database which allows for some statistical analyses. This approach was based on a standard relational database management system (Oracle) and a Web-based front end with CGI scripts and Java applets for query construction and display of results. The most obvious advantage of this type of architecture is that it is portable to any platform and is relatively user friendly. This approach allows the user interface to grow and incorporate new technologies as they become available. We believe that the portability of Java to create analysis tools that are transparently downloaded to the researcher's computer is a feature that future developments should incorporate. Such enhancements will improve the ability of interdisciplinary researchers to develop and test hypotheses.

To determine the efficacy of our schema and of the integration of sea-ice data with aerosol data, certain questions remain to be addressed. These include establishing queries against the data, providing an interface that allows the results of the queries to be examined in a fruitful way, and providing tools to aid in evaluating the results and generating new queries. For example, to address some aspects of global warming, the user needs to be able to generate a query that will obtain a correlation of average northern Pacific CO2 concentrations and polar sea ice extent. To address the impact of the hypothesized temperature control loop, alluded to earlier in this paper, it must be possible to construct a query correlating the average Pacific (or Atlantic) CCN concentrations with both sea ice extent and polar cloud cover. Similar queries are needed for specific surface and upper troposphere altitudes as well as latitude and longitude sectors that encompass smaller regions of the globe.
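Once two such time series have been extracted, the correlation itself is straightforward to compute. The sketch below uses the Pearson correlation coefficient as one reasonable choice; the paper does not specify which correlation measure was intended, and the function name and sample values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


# Invented monthly values: a concentration series and a sea ice extent series.
concentration = [350.0, 351.0, 352.0, 353.0]
ice_extent = [12.0, 11.5, 11.0, 10.5]
r = pearson(concentration, ice_extent)   # perfectly anticorrelated sample data
```

A coefficient near -1 or +1 would suggest a strong linear relationship worth closer study, while a value near zero would not by itself rule out a non-linear one.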

To address the degree of cooperation among the physical processes associated with the temperature control loop hypothesis, it is felt that pattern recognition may be a useful tool to apply to the data. This would require additional types of data such as ocean circulation and atmosphere circulation and dynamics. Other information that may be best addressed by pattern recognition includes sea surface temperatures with both spatial and temporal characteristics and phytoplankton presence and bloom quantities. Future satellite data acquisition programs may provide such global data.

The sparseness of the aerosol data causes problems related to data gaps. Since the data is currently stored as a mixture of spatial points and ranges, grids need to be generated from the data and queried against. This presents some additional problems with which remote sensing scientists have grown familiar, namely gap filling routines and techniques.
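As a simple illustration of gap filling, the following Python sketch fills missing values in a one-dimensional series by linear interpolation between the nearest known neighbors. It is only one of many possible techniques and is not the method used in the prototype; the function name is hypothetical, and a real gridding step would work in two or more dimensions.

```python
def fill_gaps(values):
    """Fill None entries by linear interpolation between known neighbors.

    Gaps before the first or after the last known value are left as None,
    since there is nothing to interpolate between.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Invented sample series with interior and edge gaps.
series = [None, 2.0, None, None, 8.0, None]
result = fill_gaps(series)
```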

While we did not examine the use of commercially available OLAP tools to address the needs of this or similar scientific database analysis systems, we believe that such commercial systems, if enhanced with the assistance of the earth and remote sensing science communities, may be a source of off-the-shelf solutions for future researchers. We can only hope that the commercial vendors view the market with enough vigor to advance this application of their technologies.


This work was initiated as part of a course at George Mason University given by Dr. Larry Kerschberg and Dr. George Michaels known as CSI 810, Scientific Databases. A portion of this work was undertaken as part of course requirements for CSI 996 with Dr. Menas Kafatos. We thank Tom Sanders for his assistance in preparation of Figure 7.


Anderson, T.L., G.V. Wolfe and S.G. Warren (1992). "Biological Sulfur, Clouds and Climate", Encyclopedia of Earth System Science, Vol. 1, PP 363 - 376, Academic Press, Inc.

Andreae, M.O., W. Elbert and S.J. de Mora (1995). "Biogenic sulfur emissions and aerosols over the tropical South Atlantic, 3. Atmospheric dimethylsulfide, aerosols and cloud condensation nuclei"; JGR, V100, D6, pp 11,335 - 11,356; June 20 1995.

Blomquist, B.W., A.R. Bandy and D.C. Thornton (1996). "Sulfur gas measurements in the eastern North Atlantic Ocean during the Atlantic Stratocumulus Transition Experiment/ Marine Aerosol and Gas Exchange"; JGR, V 101, D2, pp 4377-4392, February 20, 1996

Bretherton, F. (1994). A Reference Model for Metadata: A Strawman, DRAFT 3/2/94, available on the World Wide Web at

Buhr, M.P., K.J. Hsu, C.M. Liu, R. Liu, L. Wei, Y.C. Liu and Kuo (1996). "Trace gas measurements and air mass classification from a ground station in Taiwan during the PEM-West A experiment (1991)"; JGR, V101, D1, pp. 2025 -2035, January 20, 1996.

Charlson, R.J., et al. (1987). "Oceanic Phytoplankton, Atmospheric Sulfur, Cloud Albedo and Climate", Nature Vol. 326, PP 655-661.

Codd, E.F. (1995). Twelve Rules for On Line Analytical Processing, Computerworld, April 13, 1995.

Colvin, P., F. Tanis, C. Chiesa and H. Geller (1993). Design and Development of an Arctic Geographic Information System for Global Change Research. Eos, 74:43,87.

Duguay, C.R. and P. Hurtubise (1992). Monitoring the Spatial and Spectral Variability of Seasonal Snow Cover with Landsat MSS and TM for Climate Studies ASPRS/ACSM/RT 92 Technical Papers Volume 1 - Global Change and Education, August 1992, pp. 346-354.

FGDC, The Federal Geographic Data Committee (1994). Content Standards for Digital Geospatial Metadata (June 8), Federal Geographic Data Committee, Washington, D.C.

Geller, H.A., R.J. Coullahan and P. Colvin (1993). Mechanisms for Gaining Access to Geophysical Data and Information for Global Change Research. Eos, 74:43,86.

Geller, H. and P. Colvin, (1994). Utilization of Model and Empirical Data in an Arctic GIS for Geophysical Model Refinement. Eos, 75:44,88.

Heikes, B.G., et al. (1996). "Hydrogen peroxide and methylhydroperoxide distributions related to ozone and odd hydrogen over the North Pacific in the fall of 1991"; JGR, V101, D1, pp. 1891 - 1905, January 20, 1996.

Kozo, T.L., R.W. Fett, L.D. Farmer and D.S. Sodhi (1992). Clues to Causes of Deformation Features in Coastal Sea Ice Eos, Transactions, American Geophysical Union, Volume 73, Number 36, September 8, 1992, pp. 385-389.

Lovelock, J.E. (1991). Healing Gaia: Practical Medicine for the Planet, Harmony Books - Crown Publishers, New York, New York.

Lovelock, J.E. and L. Margulis (1974). "Atmospheric Homeostasis By and For the Biosphere: the Gaia Hypothesis", Tellus, XXVI, pp. 2 - 9.

Maxwell, B. (1987). Atmospheric and Climatic Change in the Canadian Arctic Northern Perspectives, Volume 15, Number 2, p. 4.

Molnia, B.F. (1993). Major Glacier Surge Continues Eos, Transactions, American Geophysical Union, Volume 74, Number 45, November 9, 1993, pp. 521-524.

Raden, N. (1997). "Choosing the Right OLAP Technology" in Planning and Designing the Data Warehouse by Barquin, R.C. and H.A. Edelstein, eds., Prentice Hall PTR, Upper Saddle River, New Jersey.

Shelley, E.P. and B.D. Johnson (1995). Metadata: Concepts and Models. Proceedings of the Third National Conference on the Management of Geoscience Information and Data, organised by the Australian Mineral Foundation, Adelaide, Australia, 18-20 July 1995, pp 4.1-5.

Singh, H.B., et al. (1996). "Low ozone in the marine boundary layer off the tropical Pacific Ocean: Photochemical loss, chlorine atoms, and entrainment"; JGR, V101, D1, pp 1907 - 1917, January 20, 1996.

Suhre, K., M.O. Andreae and R. Rosset (1995). "Biogenic sulfur emissions and aerosols over the tropical South Atlantic, 2. One-dimensional simulation of sulfur chemistry in the marine boundary layer"; JGR, V100, D6, pp 11,323 - 11,334; June 20, 1995.

Thornton, D.C., A.R. Bandy and B.W. Blomquist (1996). "Impact of anthropogenic and biogenic sources and sinks on carbonyl sulfide in the North Pacific troposphere"; JGR, V101, D1, pp 1873 - 1881; January 20, 1996.

Wollast, R., Mackenzie, F., Chou, L., eds, (1993). Interactions of C, N, P and S Biogeochemical Cycles and Global Change, (proceedings of NATO Advanced Workshop on Interactions of C, N, P and S Biogeochemical Cycles, Melreux, Belgium, March 4-8, 1991), Springer-Verlag Berlin Heidelberg, printed in Germany.

Yvon, S.A., E.S. Saltzman, D.J. Cooper, T.S. Bates, and A.M. Thompson (1996). "Atmospheric sulfur cycling in the tropical Pacific marine boundary layer (12S, 135W): A comparison of field data and model results 1. Dimethylsulfide"; JGR, V101, D3, pp 6899 - 6909, March 20, 1996.

Yvon, S.A. and E.S. Saltzman (1996). "Atmospheric sulfur cycling in the tropical Pacific marine boundary layer (12S, 135W): A comparison of field data and model results 2. Sulfur dioxide"; JGR, V101, D3, pp 6911 - 6918, March 20, 1996.