Statistics on Grundtvig's use of bible references

Nikolaj Frederik Severin Grundtvig is one of the most influential persons in Danish history. He was a polymath and a very prolific man: a pastor, author, poet, philosopher, historian, teacher and politician, as Wikipedia describes him.

This project includes a statistical analysis of his use of bible references, as found in https://tekster.kb.dk/. The data comes from the Center for Grundtvigforskning, which is the publisher of Grundtvigs Værker, referred to as GV.

I use the counts of individual bible references to compare all the years of Grundtvig's professional life. The philologists at the Grundtvig centre have identified 11499 references to the bible in the GV. They refer to 4637 different locations in the bible, i.e., he uses each location about 2.5 times on average. In reality there is a large number of bible locations that appear just once in a reference, whereas a number of favourites are cited 50-60 times in the corpus.
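As a quick check of the average quoted above, the two counts can be divided on the command line. This is only a sketch of the arithmetic; the function name mean_uses is made up for the illustration and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): mean number of
# references per distinct bible location.
mean_uses() {
    awk -v refs="$1" -v locs="$2" 'BEGIN { printf "%.2f\n", refs / locs }'
}

# The two counts quoted in the text: 11499 references, 4637 locations.
mean_uses 11499 4637   # prints 2.48
```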

What you find below is an outline of my analysis of his bible references from a technical point of view, with references to scripts and data. See also the paper.

Prerequisites

Most scripts require that Saxon is available. Set it up with

SAXON_JAR="/usr/share/maven-repo/net/sf/saxon/Saxon-HE/9.9.1.5/Saxon-HE-9.9.1.5.jar"
SAXON="java -jar $SAXON_JAR "

or source parameters.sh. The path may need to be adapted to your installation.
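Since the jar path differs between installations, it can help to check it before running the other scripts. The following is a minimal sketch; the function check_saxon is hypothetical and not part of the repository, and the path is simply the one quoted above.

```shell
#!/bin/sh
# Path as quoted above; adjust it to your own installation.
SAXON_JAR="/usr/share/maven-repo/net/sf/saxon/Saxon-HE/9.9.1.5/Saxon-HE-9.9.1.5.jar"

# Hypothetical helper: succeed only if the given jar file exists.
check_saxon() {
    if [ -f "$1" ]; then
        echo "ok"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Report the result without aborting the calling shell.
check_saxon "$SAXON_JAR" || echo "adjust SAXON_JAR in parameters.sh"
```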

The references are extracted from the texts using find and XSLT, which requires access to the TEI source files. I cannot provide those, since they are not owned by me or my employer.

Anyway, my data extraction is done with a shell one-liner and an XSLT script:

find build/text-retriever/gv/  \
	-name 'txt.xml' -exec xsltproc \
	--stringparam file {}  ../bible-references/explore-references.xsl {} \;  > \
	../bible-references/table-all.xml

The result is stored in table-all.xml, which is really one huge HTML table.

Aggregating references by year

I then wrote a shell script, aggregate.sh, which runs an XSL transform on table-all.xml that counts the number of references per year. It does the same for individual bible references, which are passed as the ref parameter.


#!/bin/bash

source "parameters.sh"

$SAXON table-all.xml aggregate-per-year.xsl > aggregated-references-per-year.text

$SAXON table-all.xml aggregate-per-year-for-given-reference.xsl > selected_ref_1.text
$SAXON ref='1 Kor 13,12'   table-all.xml aggregate-per-year-for-given-reference.xsl > selected_ref_2.text
$SAXON ref='1 Kor 13,13'   table-all.xml aggregate-per-year-for-given-reference.xsl > selected_ref_3.text
$SAXON ref='Matt 28,18-20' table-all.xml aggregate-per-year-for-given-reference.xsl > selected_ref_4.text

gnuplot < plot_references.gp
gnuplot < plot_selected_references.gp

This yields two graphs
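The counting that aggregate.sh delegates to the two stylesheets can be illustrated on a simplified input. The sketch below assumes a tab-separated list with one line per reference (year, then bible reference). That format and both function names are assumptions made for the illustration; the real data lives in table-all.xml and is processed with XSLT.

```shell
#!/bin/sh
# Hypothetical input format: "<year><TAB><bible reference>", one line per reference.

# Count all references per year.
count_per_year() {
    awk -F '\t' '{ n[$1]++ } END { for (y in n) print y "\t" n[y] }' | sort -n
}

# Count only the references equal to the one given as $1.
count_per_year_for_ref() {
    awk -F '\t' -v ref="$1" '$2 == ref { n[$1]++ }
        END { for (y in n) print y "\t" n[y] }' | sort -n
}

# Made-up sample data, only to show the shape of the result.
printf '1810\t1 Kor 13,12\n1810\tMatt 28,18-20\n1811\t1 Kor 13,12\n' | count_per_year
```

Years without any matching reference are simply absent from the output of this sketch.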

Aggregating references by year for verse only

I then repeat the aggregation for bible references in verse (songs, psalms and poems) with aggregate-poetry.sh.


#!/bin/bash

source "parameters.sh"

$SAXON poetry-table.xml aggregate-per-year.xsl > aggregated-references-per-year.text

$SAXON poetry-table.xml aggregate-per-year-for-given-reference.xsl > selected_poetry_ref_1.text
$SAXON ref='1 Mos 2,7'   poetry-table.xml aggregate-per-year-for-given-reference.xsl > selected_poetry_ref_2.text
$SAXON ref='Joh 6,63'   poetry-table.xml aggregate-per-year-for-given-reference.xsl > selected_poetry_ref_3.text
$SAXON ref='Ordsp 4,23' poetry-table.xml aggregate-per-year-for-given-reference.xsl > selected_poetry_ref_4.text

gnuplot < plot_selected_poetry_references.gp

The resulting time series are presented in the plot selected_poetry_refs_per_year.pdf.

Making cluster analysis and a cladogram

Here I use the counts of individual bible references to calculate how similar the years are to each other, collect the pairwise values in a distance matrix, and from that plot a cladogram.

$SAXON table-all.xml clustering_data.xsl  > clustering-data.text

The clustering-data.text file contains one line per year and 4637 columns, one for each bible location Grundtvig referred to. The values in the table are the number of times he referenced that location in that year. Most entries are zero. The script do-cluster-analysis.r does the calculations. See Altuna Akalin (2020), Computational Genomics with R.

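#!/usr/bin/Rscript

# Read the year-by-location count table; the first column (the year)
# becomes the row name.
yearly_quotes <- read.table(file = "clustering-data.text",
                            row.names = 1,
                            sep = "",
                            quote = "\"",
                            dec = ".",
                            fill = TRUE,
                            comment.char = "#")

pdf("cladogram.pdf", width = 15, height = 10)

# Pairwise distances between years. With dist()'s default p = 2,
# "minkowski" gives the same distances as "euclidean".
# d <- dist(yearly_quotes, method = "euclidean")
d <- dist(yearly_quotes, method = "minkowski")

# Complete-linkage hierarchical clustering.
hc <- hclust(d, method = "complete")
hcd <- as.dendrogram(hc)

plot(hcd, xlab = "Years")

# Close the PDF device so the file is written out.
dev.off()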

The final result is in cladogram.pdf.
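To make the distance step concrete: R's dist() with method "minkowski" and its default p = 2 computes the ordinary Euclidean distance between two rows. The sketch below does the same for two years of a table laid out like clustering-data.text (year in the first column, counts in the rest). The function name euclid and the three-column sample are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical helper: Euclidean distance between the rows for years $1 and $2.
# Reads a whitespace-separated table on stdin, with the year in column 1.
euclid() {
    awk -v a="$1" -v b="$2" '
        $1 == a { for (i = 2; i <= NF; i++) x[i] = $i; n = NF }
        $1 == b { for (i = 2; i <= NF; i++) y[i] = $i }
        END {
            s = 0
            for (i = 2; i <= n; i++) s += (x[i] - y[i]) * (x[i] - y[i])
            printf "%.3f\n", sqrt(s)
        }'
}

# Made-up counts for three bible locations in three years.
printf '1810 2 0 1\n1811 0 0 1\n1812 2 1 1\n' | euclid 1810 1811   # prints 2.000
```

Years that cite the same locations equally often end up close together, which is what the complete-linkage clustering then groups into branches.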

Number of works per year by genre

Transform all files with text-types-by-year.xsl by running run_text-types-by-year.pl, which collects all data in a single file, e.g., counts-of-text-by-type-and-year.xml. Aggregate that file per year with breakdown-per-year.xsl, then plot the result with plot_words_per_year.gp.

See words_per_year.pdf

