Purpose and objectives

The main purpose of the DPDC is to make diatom data easily available to paleoecologists, paleoclimatologists, and diatomists interested in global change. The DPDC is intended primarily for researchers looking for long-term data related to climate change and other global environmental issues, and for diatom ecologists who use methods for inferring environmental characteristics from diatom data and need more information on the ecological characteristics of taxa. Most, but not all, of the data are from studies published in the literature. All data are freely available to the public.

Key objectives of the DPDC

  • improve diatom paleolimnologists’ ability to infer trends in characteristics related to global change by providing easy access to a large amount of data on diatom distributions and ecology
  • improve accuracy of ecosystem models by giving modelers easy access to many data sets of long-term ecological trends inferred from diatom data
  • create a DPDC database with an underlying structure compatible with other National Climatic Data Center Paleoclimate-related databases, e.g., the North American Pollen Database
  • create a website where users can:
    - browse and download data sets so they can explore them further and use them in their own studies
    - search the database for occurrences of taxa in all samples and view associated ecological information
  • provide a desktop application that contributors can use to enter and submit their data, as well as a program that adds those data to the DPDC database with minimal time required from a database administrator
  • eventually include a large number of North and South American diatom data sets relating to global change

Data contents

As of October 2002, the DPDC includes more than 20 data sets with over 4,000 samples, representing over 700 sites. The main types of data included are site information, diatom counts, water chemistry and other environmental data, diatom-inferred values, and dates for stratigraphies.

The DPDC is intended to include a wide range of diatom data sets. Sullivan and Charles (1994) describe a range of possible data set types and provide a list of data sets that could be added. Data sets are not limited by geographic area, though the initial emphasis will be on North and South America. They can represent any time period, but those of most current interest include the Late Glacial to Holocene, the Younger Dryas, the Little Ice Age, recent times (measurement records), and times of rapid environmental change.

Raw diatom counts (actual numbers of valves or cells counted by analysts) are included in the database. Both raw and percentage counts are provided in response to most retrievals. The taxon names used in the data sets are stored as submitted by the contributors. These names are linked to a master list of diatom taxon names in the database so that all counts can be interrelated. This allows retrieval of information about the occurrence of individual taxa among all data sets. There has been no attempt to harmonize taxonomy among data sets. The taxonomic code system is based on the list developed for the PIRLA Project (Paleoecological Investigation of Recent Lake Acidification) in the mid-1980s, and later expanded and modified to the NADED (North American Diatom Ecological Database) list of names currently used in the Phycology Section at the Academy of Natural Sciences. DPDC data set contributors have also contributed additional taxa to the list.
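
The relationship between raw and percentage counts, and the role of the master taxon list, can be illustrated with a short Python sketch. The sample identifiers, counts, and taxon codes below are made up for illustration and are not taken from the DPDC. The sketch also assumes that percentages are computed relative to the total number of valves counted in a sample, which the text above does not state explicitly.

```python
from collections import defaultdict

# Hypothetical master list: contributor-supplied name -> made-up taxon code.
MASTER_CODES = {
    "Tabellaria flocculosa": "TAX001",
    "Eunotia incisa": "TAX002",
}

def to_percentages(raw_counts):
    """Convert raw valve counts for one sample to percentages of the sample total."""
    total = sum(raw_counts.values())
    if total == 0:
        return {name: 0.0 for name in raw_counts}
    return {name: 100.0 * n / total for name, n in raw_counts.items()}

def occurrences_by_code(samples):
    """Group non-zero counts from all samples under master-list codes.

    `samples` maps a sample identifier to {contributor name: raw count}.
    Names are kept exactly as submitted; only the code is shared.
    """
    found = defaultdict(list)
    for sample_id, counts in samples.items():
        for name, n in counts.items():
            code = MASTER_CODES.get(name)
            if code is not None and n > 0:
                found[code].append((sample_id, name, n))
    return dict(found)

# Two made-up samples from two different data sets.
samples = {
    "lakeA_0cm": {"Tabellaria flocculosa": 150, "Eunotia incisa": 50},
    "lakeB_2cm": {"Tabellaria flocculosa": 30, "Eunotia incisa": 90},
}
percent = to_percentages(samples["lakeA_0cm"])
hits = occurrences_by_code(samples)
```

Here `percent` holds the percentage form of one sample, and `hits["TAX001"]` lists every sample in which that taxon occurs, which is the kind of cross-data-set lookup the master list makes possible.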

Database structure

Many aspects of the database design are similar to those of the North American Pollen Database and other databases associated with the NOAA Paleoclimatology Program in the National Climatic Data Center. This similarity is required so that the diatom data can be viewed and analyzed with PaleoVu and other programs that allow browsing and visualization of multiple types of paleo data. Key parts of the database structure, especially those dealing with diatom data, were borrowed directly from the PIRLA database (Paleoecological Investigation of Recent Lake Acidification).

The DPDC data are managed with Microsoft SQL Server 7.0 on a Dell PowerEdge computer running Windows 2000. The web application uses Internet Information Services (IIS) 5.0, Active Server Pages (ASP), and Visual Basic 6.

The DPDC is designed to hold many types of detailed information about a project. The basic data, such as diatom counts, dates, site locations, and physical and chemical environmental variables, are stored in tables, along with derived or secondary information such as inferred variables. Other tables hold much of the underlying supporting information, e.g., taxonomic information, inference techniques, raw data and techniques concerning dating, worker names and addresses, and textual notes on many fields. The amount of information that can be entered about a project is limited primarily by how much the investigator is willing to contribute.

Data availability

All data submitted to the DPDC will be available to the public. This is in accordance with the policies of the agencies and programs that sponsored the DPDC. Data sets and results of data searches can be downloaded as tab-delimited ASCII files. In the future, we would like to make the information available in formats and programs designed for the NCDC, such as PaleoVu, SiteSeer, and ShowTime. These programs allow investigators to view data by selecting sites displayed on a geographical map, and to search on various parameters for available data.
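
Because downloads are tab-delimited ASCII, they can be read with standard tools. A minimal Python sketch follows. The column names (`site`, `depth_cm`, `taxon`, `count`) and the rows are assumptions for illustration only, since the actual layout of DPDC download files is not described here.

```python
import csv
import io

def read_tab_delimited(handle):
    """Read a tab-delimited text stream with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))

# Stand-in for a downloaded file. A real file would be opened with
# open(path, newline="", encoding="ascii") and passed in the same way.
example = io.StringIO(
    "site\tdepth_cm\ttaxon\tcount\n"
    "Lake A\t0.5\tEunotia incisa\t50\n"
    "Lake A\t0.5\tTabellaria flocculosa\t150\n"
)
rows = read_tab_delimited(example)

# Values arrive as strings, so numeric columns need explicit conversion.
total = sum(int(row["count"]) for row in rows)
```

Each row becomes a dictionary keyed by the header names, so the same function works for any tab-delimited file that starts with a header row.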


History

Participants at a NOAA-funded Workshop on Feasibility of a Paleolimnology Data Co-op, held at the Academy of Natural Sciences in Philadelphia in May 1993, recommended creation of a paleolimnology data cooperative (Sullivan and Charles 1994). Based partly on this recommendation, the NOAA Paleoclimatology Program funded a proposal (beginning June 1995) to form the Diatom Paleolimnology Data Cooperative (DPDC). The proposal was submitted by Donald F. Charles of the Academy, P. Roger Sweets of the University of Louisville, and Timothy J. Sullivan of E&S Environmental Chemistry of Corvallis, OR. The National Science Foundation’s Earth System History program funded further development of the database from June 1998 to July 2002.

The first developer to work on the database was Kellie B. Vaché of E&S Environmental Chemistry. He designed and developed the first version of the database in 1996. Beginning in 1998, Patrick Cotter (Phycology Section, Patrick Center, ANSP) took over responsibility for the database, revised the database structure, wrote many queries, and developed the initial website application. In January 2001, Kathleen Sprouffske (ANSP PCER Phycology Section) became the DPDC database administrator and application developer. She made additional modifications to the database design, updated the web application, wrote a DPDC data entry application, and developed an application to automatically upload data from the data entry application to the SQL Server database. Chamira Ratnayaka (ANSP PCER Phycology Section) participated in the application development efforts during the spring and summer of 2002, and Kai Snyder (E&S Environmental Chemistry) provided valuable feedback on the data entry application.


Acknowledgments

John Keltner at the World Data Center for Paleoclimatology in Boulder, Colorado, provided considerable assistance with many aspects of DPDC database design, especially in making it compatible with other databases developed within the NOAA Paleoclimatology Program. Sherilyn Fritz, John Smol, and Platt Bradbury served as advisors to the project, reviewing the database and website at various stages and making helpful comments. Roger Sweets placed an announcement of the DPDC on the Diatom Home Page at Indiana University, contacted several people about making contributions, and acquired several data sets. Diane Winter (ANSP PCER Phycology Section) reformatted files and prepared several data sets for entry into the DPDC. She also provided many valuable comments on the design of the data entry application, the application instruction manual, and the data entry process in general.


Reference

Sullivan, T.J. and D.F. Charles. 1994. The feasibility and utility of a paleolimnology/paleoclimate data cooperative for North America. Journal of Paleolimnology 10: 265-273.

Related Websites

The European Diatom Database Initiative (EDDI) is a web-accessible database of diatom training sets and transfer functions similar in many ways to the DPDC, but covering many regions in Europe, and parts of Africa and Asia. The EDDI website has applications that allow users to apply transfer functions to their sediment core data online. Transfer functions are primarily for inferring environmental conditions related to surface water acidification, eutrophication and climate change.

The Diatom Home Page is a major resource for those interested in diatoms and related algae. It has links to many other diatom-related sites.

