Here we will use data on fossilized organisms, which we call “fossil occurrences”. The data were originally obtained from the Paleobiology Database, a free-to-use resource that is the standard repository for such data and has been used in hundreds to thousands of scientific research articles to study the history and dynamics of biodiversity in deep time.
## phylum class order family genus
## 1 Chordata Mammalia Artiodactyla Bovidae Brabovus
## 2 Chordata Mammalia Cetacea Protocetidae Georgiacetus
## 3 Chordata Mammalia Sirenia Prorastomidae Pezosiren
## 4 Chordata Mammalia Artiodactyla Xiphodontidae Dichodon
## 5 Chordata Mammalia Carnivora Canidae Cynodictis
## 6 Chordata Mammalia NO_ORDER_SPECIFIED Nyctitheriidae Saturninia
## species early_interval late_interval max_ma min_ma midpoint
## 1 Brabovus nanincisus Pliocene 5.333 2.588 3.9605
## 2 Georgiacetus vogtlensis Lutetian 47.800 41.300 44.5500
## 3 Pezosiren portelli Lutetian 47.800 41.300 44.5500
## 4 Dichodon cervinus Late Eocene 37.200 33.900 35.5500
## 5 Cynodictis lacustris Late Eocene 37.200 33.900 35.5500
## 6 Saturninia gracilis Late Eocene 37.200 33.900 35.5500
## lng lat
## 1 35.13000 -3.13000
## 2 -81.76056 33.14333
## 3 -77.91666 18.33333
## 4 -1.08560 50.67720
## 5 -1.08560 50.67720
## 6 -1.55000 50.66667
Students can use functions in the package to calculate the diversity of a certain taxonomic rank, and plot it through time.
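To give an idea of what such a calculation involves, here is a minimal sketch that counts how many genera have an age range overlapping a given point in time. It uses only the six occurrences printed above; the helper `count_genera_at` is hypothetical, written for illustration, and is not necessarily how the package function computes diversity.

```r
# Toy data: the six occurrences printed above (ages in Ma).
occ <- data.frame(
  genus  = c("Brabovus", "Georgiacetus", "Pezosiren",
             "Dichodon", "Cynodictis", "Saturninia"),
  max_ma = c(5.333, 47.8, 47.8, 37.2, 37.2, 37.2),
  min_ma = c(2.588, 41.3, 41.3, 33.9, 33.9, 33.9)
)

# Hypothetical helper (an assumption, not a package function):
# number of distinct genera whose age range includes time t (in Ma).
count_genera_at <- function(occ, t) {
  overlaps <- occ$max_ma >= t & occ$min_ma <= t
  length(unique(occ$genus[overlaps]))
}

ages <- c(45, 35, 4)
div  <- sapply(ages, count_genera_at, occ = occ)
data.frame(age = ages, div = div)
```

With real data, the same idea would be applied over a fine grid of ages, or over geological intervals, rather than three hand-picked time points.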
To make it more fun, we will compare three taxonomic levels (species, genus, and family) and plot them on a relative, log scale:
spDTT = calcFossilDivTT(mammals_fossil, tax.lvl = "species")
genusDTT = calcFossilDivTT(mammals_fossil, tax.lvl = "genus")
famDTT = calcFossilDivTT(mammals_fossil, tax.lvl = "family")
# And to allow comparisons, we will use relative richness:
plot(x=genusDTT$age, xlim = rev(range(genusDTT$age)),
y=log(genusDTT$div)-log(max(genusDTT$div)),
xlab="Time (Million years ago)",
ylab="Log relative diversity",
type="l", col="blue", ylim=c(-7,0))
lines(x=famDTT$age,
y=log(famDTT$div)-log(max(famDTT$div)),
col="red")
lines(x=spDTT$age,
y=log(spDTT$div)-log(max(spDTT$div)),
col="black")
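The quantity on the y axis, `log(div) - log(max(div))`, is simply the log of richness divided by its own maximum, so every curve peaks at 0 regardless of how many taxa that level contains. A small check with made-up richness values (the numbers below are hypothetical):

```r
# Hypothetical richness values for one taxonomic level.
div <- c(10, 50, 200, 100)

rel_a <- log(div) - log(max(div))  # form used in the plot
rel_b <- log(div / max(div))       # same quantity, written as a ratio

all.equal(rel_a, rel_b)
max(rel_a)  # the peak of the curve sits at 0
```

On this scale, the lower axis limit of -7 used in the plot corresponds to roughly 0.1% of the maximum richness.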
We can also visualize fossil records by running:
Are they different? How could we test this? Hint: look at the column names of the fossil object:
## [1] "phylum" "class" "order" "family"
## [5] "genus" "species" "early_interval" "late_interval"
## [9] "max_ma" "min_ma" "midpoint" "lng"
## [13] "lat"
Now, how complete is the record? We can use another dataset to explore this:
And see the proportion of living species with a fossil occurrence.
## [1] 0.1978531
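A proportion like this one can be obtained by checking, for each living species, whether its name also appears among the fossil occurrences. The sketch below uses small made-up vectors; `living_species` and `fossil_species` are hypothetical names, not objects from the package.

```r
# Hypothetical toy vectors (not the package data).
living_species <- c("sp_a", "sp_b", "sp_c", "sp_d", "sp_e")
fossil_species <- c("sp_b", "sp_e", "sp_x")  # sp_x is extinct

# TRUE for each living species that has at least one fossil occurrence.
has_fossil <- living_species %in% fossil_species

# The mean of a logical vector is the proportion of TRUE values.
prop_with_fossil <- mean(has_fossil)
prop_with_fossil
```

The same logical vector can be split by a grouping column (for instance, taxonomic order) to compare completeness across groups.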
Students can, for instance, explore how this varies across different mammal groups, and which types of factors (e.g., biological or geological) seem to influence it the most.
Results from this dataset can also be compared (when relevant) with other fossil records:
We can also compare temporal trends in species number, for instance. To help with that, we also provide biodiversity time series for more clades:
## clade source stem_age rel_time time_ma richness
## 1 anth Alroy2010 617 617.000 0.000 7138
## 2 anth Alroy2010 617 609.754 7.246 7246
## 3 anth Alroy2010 617 601.030 15.970 12673
## 4 anth Alroy2010 617 593.970 23.030 8485
## 5 anth Alroy2010 617 583.100 33.900 14475
## 6 anth Alroy2010 617 575.800 41.200 9280
It contains many fossil datasets, and we will plot only the first four of them:
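clades <- unique(timeseries_fossil$clade)[1:4]
cols <- c("#ffd353", "#ef8737", "#bb292c", "#62205f")
# One panel per clade, arranged in a 2 x 2 grid
par(mfrow = c(2, 2))
for (i in seq_along(clades)) {
  # Keep only the rows of the i-th clade
  aux <- timeseries_fossil[timeseries_fossil$clade == clades[i], ]
  # Reverse the x axis so that time runs from past (left) to present (right)
  plot(aux$time_ma, log(aux$richness), col = cols[i], lwd = 3,
       main = clades[i], type = "l", frame.plot = FALSE,
       xlab = "Time (Mya)", ylab = "Log richness",
       xlim = rev(range(aux$time_ma)))
}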
Students can explore this other dataset and compare it with the previously discussed ones. They can compare richness through time, calculate statistics related to richness change, and explore factors (e.g., mass extinctions) that might have affected some clades, among other learning problems.
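As one example of a statistic related to richness change, the sketch below computes the per-interval rate of change in log richness (per million years) from the six rows printed above. The choice of statistic and the object name `series` are ours, for illustration only.

```r
# The six rows of the time series printed above.
series <- data.frame(
  time_ma  = c(0, 7.246, 15.970, 23.030, 33.900, 41.200),
  richness = c(7138, 7246, 12673, 8485, 14475, 9280)
)

# Order from oldest to youngest so that changes run forward in time.
series <- series[order(series$time_ma, decreasing = TRUE), ]

dt   <- -diff(series$time_ma)            # interval lengths in Myr (positive)
rate <- diff(log(series$richness)) / dt  # change in log richness per Myr

data.frame(from_ma = head(series$time_ma, -1),
           to_ma   = tail(series$time_ma, -1),
           rate    = rate)
```

Positive values indicate intervals in which richness increased, and negative values indicate intervals in which it declined.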
But fossils are not the only way to explore the timescale of evolution; molecular data offer another. Below, we show how some functions that use this type of data work.
With other functions, students can also explore molecular sequences and compare species.
First we load the dataset of protein sequences from the cytochrome oxidase 1 gene. This gene, often known as CO1, is a mitochondrial gene that plays a key role in cellular respiration (i.e., the primary aerobic pathway to energy (ATP) generation). Its protein product contains approximately 513 amino acids (AA) and has been used by previous studies for reconstructing phylogenetic trees and estimating divergence times between taxa by assuming a molecular clock:
##
## 17 amino acid sequences, each with length 513
## $cnidaria
## [1] "-RWIFSTNHKDIGTLYL"
##
## $snake
## [1] "TRWLFSTNHKDIGTLYL"
##
## $echinoderm
## [1] "NRWLFSTNHKDIGTLYL"
##
## $mollusk
## [1] "MRWLFSTNHKDIGTLYI"
##
## $alligator
## [1] "HRWFFSTNHKDIGTLYF"
##
## $lamprey
## [1] "IRWLFSTNHKDIGTLYL"
##
## $shark
## [1] "NRWLFSTNHKDIGTLYL"
##
## $bird
## [1] "NRWLFSTNHKDIGTLYL"
##
## $frog
## [1] "TRWLFSTNHKDIGTLYL"
##
## $fish
## [1] "TRWLFSTNHKDIGTLYL"
##
## $platypus
## [1] "NRWLFSTNHKDIGTLYL"
##
## $human
## [1] "DRWLFSTNHKDIGTLYL"
##
## $chimpanzee
## [1] "DRWLFSTNHKDIGTLYL"
##
## $bryozoa
## [1] "MRWLGSTNHKDIGTLYF"
##
## $annelid
## [1] "MRWLYSTNHKDIGTLYF"
##
## $insect
## [1] "RQWLFSTNHKDIGTLYF"
##
## $crustacea
## [1] "RQWLFSTNHKDIGTLYL"
We can compare two sequences in terms of the number of AA differences. For instance, if we want to compare a species of snake with a species of bird, we type:
## [1] 123
And to calculate the proportion of differences, we type:
## snake
## 0.2397661
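The two outputs above are consistent with each other: 123 differences over 513 aligned positions gives 123/513 ≈ 0.2398. To make the logic of such a comparison explicit, here is a minimal base R sketch applied to the short snake and bird fragments printed earlier. The helper `count_aa_diff` is hypothetical and is not one of the package functions.

```r
# The 17-residue fragments printed above for snake and bird.
snake <- "TRWLFSTNHKDIGTLYL"
bird  <- "NRWLFSTNHKDIGTLYL"

# Hypothetical helper: number of positions at which two aligned
# sequences of equal length carry different amino acids.
count_aa_diff <- function(a, b) {
  a_chars <- strsplit(a, "")[[1]]
  b_chars <- strsplit(b, "")[[1]]
  stopifnot(length(a_chars) == length(b_chars))
  sum(a_chars != b_chars)
}

n_diff <- count_aa_diff(snake, bird)  # number of differing positions
p_diff <- n_diff / nchar(snake)       # proportion of differing positions
c(n_diff = n_diff, p_diff = p_diff)
```

This position-by-position comparison only makes sense for aligned sequences, where each position is assumed to be homologous across species.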
We can also quickly visualize the sequences by directly handling all elements in this object:
And using patterns of molecular divergence, as well as fossil occurrences, we can build and, especially, date molecular phylogenies.