GitHub - domsteinbach/hssfaks-data-extractor: A custom crawler for old "Handschriften-Faksimile" websites generating json data by reading out all pages

HSSFAKS DATA EXTRACTOR

Extracts all data from the Handschriften-Faksimile pages as JSON.

It iterates over all pages by calling the base website's nextPage() function and reads out the data displayed on each page. At the end, the extracted data is offered as a JSON file download.
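
The loop described above can be sketched roughly as follows. This is only an illustration, not the code in pageParser.js: the names crawlPages, toJson and downloadJson, the file name manuscript.json, and the convention that the "next page" callback returns false on the last page are all assumptions made for this example.

```javascript
// Hypothetical sketch of the crawl loop; not the actual pageParser.js code.

// Collect one record per page until there is no next page.
// readCurrentPage: returns the data shown on the current page.
// goToNextPage: advances the site (e.g. by calling its nextPage()) and,
//   by assumption, returns false when no further page exists.
// maxPages: safety limit so a broken site cannot loop forever.
function crawlPages(readCurrentPage, goToNextPage, maxPages = 10000) {
  const records = [];
  for (let i = 0; i < maxPages; i++) {
    records.push(readCurrentPage());
    if (!goToNextPage()) break;
  }
  return records;
}

// Serialize the collected records as pretty-printed JSON.
function toJson(records) {
  return JSON.stringify(records, null, 2);
}

// Browser only: offer the JSON string as a file download.
// The default file name is an arbitrary choice for this sketch.
function downloadJson(json, fileName = "manuscript.json") {
  const blob = new Blob([json], { type: "application/json" });
  const url = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = url;
  link.download = fileName;
  link.click();
  URL.revokeObjectURL(url);
}
```

Passing the page reader and the page advance as callbacks keeps the loop independent of any particular facsimile site, so it can be tried out against a small fake before being run in the browser.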

How to run:

Import pageParser into the template: <script src="pageParser.js" type="text/javascript"></script>
Call readOutManuscript(), e.g. by creating a button that calls the readOutManuscript() function (for example, one labeled "Get data into normal form!").
Open the HTML of the facsimile in a browser, start at the very first page of the very first manuscript, and press the button. A JSON file is downloaded containing all data needed to build a proper website or database.
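
One possible way to create such a button from a script, instead of editing the template markup, is sketched below. The helper addExtractButton is hypothetical and not part of pageParser.js; it takes the document as a parameter only so it can be tried outside a browser, and it assumes readOutManuscript() is available globally once pageParser.js has loaded.

```javascript
// Hypothetical helper; not part of pageParser.js.
// Creates a button, wires its click handler, and appends it to the page body.
function addExtractButton(doc, onClick, label = "Get data into normal form!") {
  const button = doc.createElement("button");
  button.textContent = label;
  button.addEventListener("click", onClick);
  doc.body.appendChild(button);
  return button;
}

// In the browser, after pageParser.js has been loaded
// (assumes readOutManuscript is defined as a global function):
// addExtractButton(document, () => readOutManuscript());
```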

