To run:
    stack exec example -- --cache-dir cache -a user-agents.txt -o output.csv
During testing/development, you can run the scraper from within GHCi:
    cd example
    stack ghci
    mainTest "--cache-dir cache --cache-only -a user-agents.txt -o output.csv"
To run the scraper with anonymization:
    cd example
    bash build-proxies.sh > torrc-file
    tor -f torrc-file &
    # wait until the Tor logs report success
    stack exec example -- --cache-dir cache -a user-agents.txt --torrc torrc-file -o outdata.csv -m 8111 +RTS -N15

where

* 8111 is the port to an EKG monitor on localhost
* -N15 is how many cores to use
* After a long time you will need to kill the process manually.
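The "wait until logs report success" step can be scripted rather than watched by hand. The following is a minimal sketch, not part of this repository: the helper name `wait_for_line`, the log file name `tor.log`, and the timeout values are all assumptions made for illustration. It also assumes Tor's notices are redirected into that file and that success is signalled by a line containing `Bootstrapped 100%`.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this repository.
# Usage: wait_for_line FILE TEXT [TIMEOUT_SECONDS]
# Polls FILE once per second until some line contains TEXT (fixed string).
# Returns 0 once TEXT is found, 1 if TIMEOUT_SECONDS elapse first.
wait_for_line() {
  local file=$1 text=$2 timeout=${3:-120} waited=0
  until grep -qF -- "$text" "$file" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

With the helper sourced into the current shell, the anonymized run might look like this (again assuming Tor logs to standard output, which the generated torrc-file may override):

    tor -f torrc-file > tor.log 2>&1 &
    wait_for_line tor.log 'Bootstrapped 100%' 300 && \
      stack exec example -- --cache-dir cache -a user-agents.txt --torrc torrc-file -o outdata.csv -m 8111 +RTS -N15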
Develop with one of:
    stack ghci
    nix-shell --run 'cabal repl'
Build with one of:
    stack build
    nix-shell --run 'cabal build'
    nix-build