
A function to retrieve collections from the Paleobiology Database (PBDB), matched to Macrostrat units.


get_fossils(
  unit_id = NULL,
  column_id = NULL,
  interval_name = NULL,
  age = NULL,
  age_top = NULL,
  age_bottom = NULL,
  lithology = NULL,
  lithology_id = NULL,
  lithology_type = NULL,
  lithology_class = NULL,
  environ = NULL,
  environ_id = NULL,
  environ_type = NULL,
  environ_class = NULL,
  econ = NULL,
  econ_id = NULL,
  econ_type = NULL,
  econ_class = NULL,
  project_id = NULL,
  strat_name_id = NULL,
  sf = FALSE
)



unit_id: integer. Filter PBDB collections to those within one or more unit(s) as specified by their unique identification number(s).

column_id: integer. Filter PBDB collections to those within one or more column(s) as specified by their unique identification number(s).

interval_name: character. Filter PBDB collections to those that overlap with a named chronostratigraphic time interval (e.g., "Permian").

age: numeric. Filter PBDB collections to those that overlap with the specified numerical age, in millions of years before present.

age_top: numeric. Filter PBDB collections to those that overlap with the age range between the specified numerical age and age_bottom, in millions of years before present. age_bottom must also be specified and must be older than age_top.

age_bottom: numeric. Filter PBDB collections to those that overlap with the age range between the specified numerical age and age_top, in millions of years before present. age_top must also be specified and must be younger than age_bottom.

lithology: character. Filter PBDB collections to those containing a named lithology (e.g., "shale", "sandstone").

lithology_id: integer. Filter PBDB collections to those containing one or more lithology(ies) as specified by their unique identification number(s).

lithology_type: character. Filter PBDB collections to those containing a named lithology type (e.g., "carbonate", "siliciclastic").

lithology_class: character. Filter PBDB collections to those containing a named lithology class (e.g., "sedimentary", "igneous", "metamorphic").

environ: character. Filter PBDB collections to those containing a named environment (e.g., "delta plain", "pond").

environ_id: integer. Filter PBDB collections to those containing one or more environment(s) as specified by their unique identification number(s).

environ_type: character. Filter PBDB collections to those containing a named environment type (e.g., "fluvial", "eolian", "glacial").

environ_class: character. Filter PBDB collections to those containing a named environment class (e.g., "marine", "non-marine").

econ: character. Filter PBDB collections to those containing a named economic attribute (e.g., "brick", "ground water", "gold").

econ_id: integer. Filter PBDB collections to those containing one or more economic attribute(s) as specified by their unique identification number(s).

econ_type: character. Filter PBDB collections to those containing a named economic attribute type (e.g., "construction", "aquifer", "mineral").

econ_class: character. Filter PBDB collections to those containing a named economic attribute class (e.g., "material", "water", "precious commodity").

project_id: integer. Filter PBDB collections to those contained within a Macrostrat project as specified by its unique identification number.

strat_name_id: integer. Filter PBDB collections to those containing a unit that matches one or more stratigraphic name(s) as specified by their unique identification number(s).

sf: logical. Should the results be returned as an sf object? Defaults to FALSE.
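
The requirement that age_top be younger than age_bottom can be checked before a query is sent. The following is a minimal sketch and is not part of the package: check_age_range() is a hypothetical helper, and the rmacrostrat package name used in the guarded call is an assumption, since this page does not name the package. The helper rejects equal bounds, which follows the wording above but may be stricter than the function itself.

```r
# Hypothetical helper (not part of any package): validates an age range
# before it is passed to get_fossils(). Ages are in millions of years
# before present, so the "top" (younger) age is the smaller number.
check_age_range <- function(age_top, age_bottom) {
  if (!is.numeric(age_top) || !is.numeric(age_bottom)) {
    stop("age_top and age_bottom must both be numeric")
  }
  if (length(age_top) != 1 || length(age_bottom) != 1) {
    stop("age_top and age_bottom must each be a single value")
  }
  if (age_top >= age_bottom) {
    stop("age_top must be younger (smaller) than age_bottom")
  }
  invisible(TRUE)
}

# A valid range passes silently
valid <- check_age_range(age_top = 66, age_bottom = 73)

# Reversed bounds raise an error; capture its message instead of stopping
reversed <- tryCatch(
  check_age_range(age_top = 73, age_bottom = 66),
  error = function(e) conditionMessage(e)
)

# The query needs network access, so it only runs in an interactive
# session where the (assumed) rmacrostrat package is installed.
if (interactive() && requireNamespace("rmacrostrat", quietly = TRUE)) {
  fossils <- rmacrostrat::get_fossils(
    lithology = "sandstone",
    age_top = 66,
    age_bottom = 73
  )
}
```

Validating locally gives a clearer error message than a failed request would, and avoids an unnecessary call to the remote service.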


A dataframe containing the following columns:

  • collection_no: The unique identification number of the collection, as assigned in the PBDB.

  • collection_name: The unique name of the collection, as assigned in the PBDB.

  • t_age: The top age of the unit containing the collection, estimated using the continuous time age model, in millions of years before present.

  • b_age: The bottom age of the unit containing the collection, estimated using the continuous time age model, in millions of years before present.

  • pbdb_occs: The count of PBDB occurrences in the specified PBDB collection.

  • genus_no: A vector containing the unique identification number for each genus that appears in the collection, corresponding to the genus_no column in the Paleobiology Database.

  • taxon_no: The count of unique taxa in the specified PBDB collection.

  • unit_id: The unique identification number of the Macrostrat unit containing the specified PBDB collection.

  • col_id: The unique identification number of the Macrostrat column containing the specified PBDB collection.

  • refs: Reference for the source of the data.

  • strat_name_concept_id: The unique identification number of the stratigraphic name concept containing the specified PBDB collection.

If sf = TRUE, an sf object is returned instead.
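
As a rough illustration of how the returned columns might be used, the sketch below builds a small placeholder data frame with a subset of the columns listed above and summarises it with base R. The values are invented purely for illustration and are not real PBDB or Macrostrat records; a real call to get_fossils() would supply the data instead.

```r
# Placeholder data with a subset of the documented columns.
# These values are made up for illustration; they are NOT real records.
fossils <- data.frame(
  collection_no = c(1, 2, 3),
  t_age = c(66.0, 70.5, 68.0),   # top age of containing unit (Ma)
  b_age = c(72.1, 73.0, 69.5),   # bottom age of containing unit (Ma)
  pbdb_occs = c(12, 3, 40),      # occurrences per collection
  unit_id = c(101, 101, 202)     # containing Macrostrat unit
)

# Duration (in Myr) of the unit containing each collection
fossils$duration <- fossils$b_age - fossils$t_age

# Total PBDB occurrences per Macrostrat unit
occs_by_unit <- aggregate(pbdb_occs ~ unit_id, data = fossils, FUN = sum)

# Collections with at least 10 occurrences
well_sampled <- fossils[fossils$pbdb_occs >= 10, ]
```

Because t_age and b_age describe the containing unit rather than the collection itself, the duration computed here is an age uncertainty window for the collection, not a measured range.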


More information on the inputs for this function can be found using the definition functions (beginning with def_). Terminology related to the PBDB can be found in the suggested references below.


Christopher D. Dean


Lewis A. Jones


Peters, S.E. and McClennen, M. (2016). The Paleobiology Database application programming interface. Paleobiology, 42(1), pp. 1–7. doi:10.1017/pab.2015.39.

Uhen, M.D., Allen, B., Behboudi, N., Clapham, M.E., Dunne, E., Hendy, A., Holroyd, P.A., Hopkins, M., Mannion, P., Novack-Gottshall, P. and Pimiento, C. (2023). Paleobiology Database User Guide Version 1.0. PaleoBios, 40(11), pp. 1–56. doi:10.5070/P9401160531.



# Get fossils by Macrostrat column ID
ex1 <- get_fossils(column_id = 10)
# Get fossils by Macrostrat unit ID
ex2 <- get_fossils(unit_id = 6279)
# Get fossils by lithology and age
ex3 <- get_fossils(lithology = "sandstone", age_top = 66, age_bottom = 73)
# Get fossils by environment type and age
ex4 <- get_fossils(environ_type = "fluvial", age = 253)