More Information on the DPDC

Purpose and objectives

The main purpose of the DPDC is to make diatom data easily available to paleoecologists, paleoclimatologists, and diatomists interested in global change. The DPDC is intended primarily for researchers looking for long-term data related to climate change and other global environmental issues, and for diatom ecologists who infer environmental characteristics from diatom data and need more information on the ecological characteristics of taxa. Most, but not all, of the data are from studies published in the literature. All data are freely available to the public.

Key objectives of the DPDC

  • improve diatom paleolimnologists’ ability to infer trends in characteristics related to global change by providing easy access to a large amount of data on diatom distributions and ecology
  • improve accuracy of ecosystem models by giving modelers easy access to many data sets of long-term ecological trends inferred from diatom data
  • create a DPDC database with an underlying structure compatible with other paleoclimate-related databases, e.g., the North American Pollen Database
  • create a website where users can:
    - browse and download data sets so they can explore them further and use them in their own studies
    - search the database for occurrences of taxa in all samples and view associated ecological information
  • include a large number of North and South American diatom data sets relating to global change

Data contents

As of October 2021, the DPDC includes more than 20 data sets with over 4000 samples representing over 700 sites. The main types of data included are site information, diatom counts, water chemistry and other environmental data, and chronologies.

The DPDC is intended to include a wide range of diatom data sets (Sullivan and Charles 1994). They are not limited by geographic area, but emphasize North and South America. They can represent any time period, but those of most current interest include Late Glacial to Holocene, Younger Dryas, Little Ice Age, recent times (measurement records), and times of rapid environmental change.

Raw diatom counts (actual numbers of valves or cells counted by analysts) are included in the database. Both raw and percentage counts are provided in response to most retrievals. The taxon names used in the data sets submitted by contributors are linked to a master list of diatom taxon names in the database so that all counts can be interrelated. This allows retrieval of information about the occurrence of individual taxa among all data sets. There has been no attempt to harmonize taxonomy among data sets. The taxonomic code system is based on the list developed for the PIRLA Project (Paleoecological Investigation of Recent Lake Acidification) in the mid-1980s, which was later expanded and modified into the NADED (North American Diatom Ecological Database) list of names currently used in the Phycology Section at The Academy of Natural Sciences of Drexel University. DPDC data set contributors have also added names to the list.
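The relationship between raw and percentage counts can be illustrated with a minimal sketch. It assumes that a percentage count is a taxon's share of the total valves counted in one sample; the function name and the sample values below are hypothetical and are not taken from the DPDC.

```python
def to_percentages(raw_counts):
    """Convert raw valve counts for one sample into percent abundances.

    raw_counts maps a taxon name to the number of valves counted.
    Assumption: percentages are relative to the sample's total count.
    """
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("sample has no counted valves")
    return {taxon: 100.0 * n / total for taxon, n in raw_counts.items()}


# Hypothetical sample; names and counts are illustrative only.
sample = {
    "Aulacoseira distans": 120,
    "Tabellaria flocculosa": 60,
    "Eunotia incisa": 20,
}
percentages = to_percentages(sample)
```

Keeping the raw counts alongside the percentages lets users recompute abundances after merging or excluding taxa.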

Database structure

Many aspects of the database design are similar to those of the North American Pollen Database and other constituent databases within Neotoma. This compatibility is required so that the diatom data can be viewed and analyzed with programs that browse and visualize multiple types of paleo data. Key parts of the database structure, especially those dealing with diatom data, were borrowed directly from the PIRLA (Paleoecological Investigation of Recent Lake Acidification) database.

The DPDC data are managed at the Academy of Natural Sciences (ANSP) with Microsoft SQL Server 2014 on a server running Windows Server 2012 R2. The web application is served by Internet Information Services (IIS 8.5). Older applications built with Active Server Pages (ASP) and Visual Basic 6 no longer work, and it is not yet known whether they can be restored.

The DPDC is designed to hold many types of information about a project. The basic data, such as diatom counts, dates, site locations, and physical and chemical environmental variables, are stored in tables, along with derived or secondary information. There are also tables designed to hold much of the underlying, supportive information, such as taxonomic information, raw data and techniques concerning dating, worker names and addresses, and textual notes on many fields. The amount of information about a project that can be entered is limited primarily by the amount investigators were willing to contribute.
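As a rough illustration of how tables of this kind can relate to one another, the sketch below builds a tiny relational layout and looks up every occurrence of one taxon across samples. It is not the actual DPDC schema: the table names, column names, and data are all hypothetical, and SQLite is used in place of SQL Server only so that the example is self-contained.

```python
import sqlite3

# In-memory stand-in database; every name below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE site (site_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    site_id INTEGER NOT NULL REFERENCES site(site_id),
    depth_cm REAL
);
CREATE TABLE diatom_count (
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    taxon_id INTEGER NOT NULL REFERENCES taxon(taxon_id),
    valves INTEGER NOT NULL,
    PRIMARY KEY (sample_id, taxon_id)
);
""")
conn.executemany("INSERT INTO taxon VALUES (?, ?)",
                 [(1, "Taxon X"), (2, "Taxon Y")])
conn.executemany("INSERT INTO site VALUES (?, ?)",
                 [(1, "Lake A"), (2, "Lake B")])
conn.executemany("INSERT INTO sample VALUES (?, ?, ?)",
                 [(1, 1, 0.5), (2, 1, 10.0), (3, 2, 0.5)])
conn.executemany("INSERT INTO diatom_count VALUES (?, ?, ?)",
                 [(1, 1, 50), (1, 2, 150), (2, 1, 80), (3, 2, 200)])


def taxon_occurrences(connection, taxon_name):
    """Return (site, depth_cm, valves) for each sample containing the taxon."""
    query = """
        SELECT site.name, sample.depth_cm, diatom_count.valves
        FROM diatom_count
        JOIN taxon ON taxon.taxon_id = diatom_count.taxon_id
        JOIN sample ON sample.sample_id = diatom_count.sample_id
        JOIN site ON site.site_id = sample.site_id
        WHERE taxon.name = ?
        ORDER BY site.name, sample.depth_cm
    """
    return connection.execute(query, (taxon_name,)).fetchall()


occurrences = taxon_occurrences(conn, "Taxon X")
```

Because every count row points at a single master taxon table, one query can find a taxon in samples from any data set.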

Data availability

All data in the DPDC are available to the public. This is in accordance with policies of agencies and programs that sponsored the DPDC. Data sets and results of data searches can be downloaded as tab-delimited ASCII files.
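A downloaded tab-delimited file can be read with standard tools. The sketch below uses Python's csv module and assumes the file has a header row; the column names (SampleID, Taxon, Count) are hypothetical and may not match the actual download format.

```python
import csv
import io


def read_tab_delimited(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hypothetical file contents; real downloads may use different columns.
example = "SampleID\tTaxon\tCount\nS1\tTaxon X\t50\nS1\tTaxon Y\t150\n"
rows = read_tab_delimited(example)

# Values are read as strings, so counts are converted explicitly.
counts = {row["Taxon"]: int(row["Count"]) for row in rows}
```

For a file on disk, the same reader can be given an open file object in place of the io.StringIO wrapper.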

Data downloaded from this website should be cited, along with the original source publications and investigators. An example citation follows. “Data were obtained from the Diatom Paleolimnology Data Cooperative ("), a constituent database of the Neotoma Paleoecology Database ("). The work of data contributors, data stewards, and the Neotoma and DPDC communities is gratefully acknowledged.”


History and funding

Participants at a NOAA-funded Workshop on Feasibility of a Paleolimnology Data Co-op, held at the Academy of Natural Sciences in Philadelphia in May 1993, recommended creation of a paleolimnology data cooperative (Sullivan and Charles 1994). Based partly on this recommendation, the NOAA Paleoclimatology Program funded a proposal (beginning June 1995) to form the Diatom Paleolimnology Data Cooperative (DPDC). The proposal was submitted by Donald F. Charles of the Academy, P. Roger Sweets of the University of Louisville, and Timothy J. Sullivan of E&S Environmental Chemistry of Corvallis, OR. The National Science Foundation’s Earth System History program funded further development of the database from 1998 to 2002 and again from 2004 to 2008. Further development of the DPDC database stopped in 2010, when data entry shifted to the Neotoma Paleoecology Database (Williams et al. 2018). The DPDC is now a constituent database in Neotoma and is supported by Neotoma grants from the National Science Foundation, Division of Earth Sciences, Geoinformatics Program (2016-2020, 2020-2023).

The first developer to work on the database was Kellie B. Vaché of E&S Environmental Chemistry. He designed and developed the first version of the database in 1996. Beginning in 1998, Patrick Cotter (Phycology Section, Patrick Center, ANSP) took over responsibility for the database, revised the database structure, wrote many queries, and developed the initial website application. In January 2001, Kathleen Sprouffske (ANSP PCER Phycology Section) became the DPDC database administrator and application developer. She made additional modifications to the database design, updated the web application, wrote a DPDC data entry application, and developed an application to automatically upload data from the data entry application to the SQL Server database. Chamira Ratnayaka (ANSP PCER Phycology Section) participated in the application development efforts during the spring and summer of 2002, and Kai Snyder (E&S Environmental Chemistry) provided valuable feedback on the data entry application.


Acknowledgments

John Keltner at the World Data Center for Paleoclimatology in Boulder, Colorado, provided considerable assistance with many aspects of DPDC database design. Sherilyn Fritz, John Smol, and Platt Bradbury served as advisors to the project, reviewing the database and website at various stages and making helpful comments. Roger Sweets placed an announcement of the DPDC on the Diatom Home Page at Indiana University, contacted several people about making contributions, and acquired several data sets. Diane Winter (ANSP PCER Phycology Section) reformatted files and prepared several data sets for entry into the DPDC. She also provided many valuable comments on the design of the data entry application, the application instruction manual, and the data entry process in general. Mihaela Enache and Sonja Hausmann also helped acquire and enter several data sets. Pat Palmer has ably maintained the DPDC database and website for the past several years; Patrick Boylan provided much valuable assistance with data management.


References

Sullivan, T.J. and D.F. Charles. 1994. The feasibility and utility of a paleolimnology/paleoclimate data cooperative for North America. Journal of Paleolimnology 10: 265-273. DOI: 10.1007/BF00684036

Williams, J.W., E.C. Grimm, J. Blois, D.F. Charles, E. Davis, S.J. Goring, R.W. Graham, A.J. Smith, M. Anderson, J. Arroyo-Cabrales, A.C. Ashworth, J.L. Betancourt, B.W. Bills, R.K. Booth, P. Buckland, B.B. Curry, T. Giesecke, S.T. Jackson, C. Latorre, J. Nichols, T. Purdum, R.E. Roth, M. Stryker, H. Takahara. 2018. The Neotoma Paleoecology Database, a multiproxy, international, community-curated data resource. Quaternary Research 89: 156-177. DOI: 10.1017/qua.2017.105

Related Websites

The European Diatom Database Initiative (EDDI) is a web-accessible database of diatom training sets and transfer functions similar in many ways to the DPDC, but covering many regions in Europe, and parts of Africa and Asia. The EDDI website has applications that allow users to apply transfer functions to their sediment core data online. Transfer functions are primarily for inferring environmental conditions related to surface water acidification, eutrophication and climate change.

The Diatom Home Page is a major resource for those interested in diatoms and related algae. It has links to many other diatom-related sites.