Add strategy for Internet Archive torrents #158

@nemobis

The Internet Archive has over 50 million torrent files for as many items.

The trackers are:

The archive.org servers don't seed the torrents over BitTorrent themselves; they only provide web seeds. Few torrents have any seeders at all, but a few have leechers. It would be nice to be able to search the most active ones. Some of the most active torrents include audio or video.

You can get the info_hash list from the "btih" field inside the "files" field of each item's XML metadata, or as JSON with the internetarchive Python utility, for instance:

ia search -i "format:bittorrent -format:warc -format:arc -access-restricted-item:true" | xargs -P4 -I§ sh -c "ia metadata § | jello -rl '\
result=[f.btih for f in _.files if f.name.endswith(\"archive.torrent\")]
result' " > ia_all_torrents_btih.txt
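
For anyone without jello installed, the same filtering step can be sketched in plain Python. This is only a rough equivalent, assuming that `ia metadata <identifier>` prints one JSON document per line and that each entry in "files" carries "name" and, for torrents, "btih" keys as in the command above. The function names `extract_btih` and `btih_from_lines` are made up for this example.

```python
import json


def extract_btih(metadata):
    """Return the btih values of files named *archive.torrent in one item's metadata dict."""
    return [
        f["btih"]
        for f in metadata.get("files", [])
        # Same filter as the jello query, plus a guard for entries without a btih key.
        if f.get("name", "").endswith("archive.torrent") and "btih" in f
    ]


def btih_from_lines(lines):
    """Yield btih values from an iterable of JSON lines (assumed: one item per line)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between items
        yield from extract_btih(json.loads(line))
```

To use it as a filter, pipe the `ia metadata` output into a script that prints every value from `btih_from_lines(sys.stdin)`.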
