Stars
Code for the paper 'A Longitudinal Study of Content Control Mechanisms' presented at the TempWeb workshop (WWW'24)
A list of AI agents and robots to block.
High performance search for IP addresses and CIDR ranges
Known tags and settings suggested to opt out of having your content used for AI training.
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Perl toolchain docs, specs, guidelines, etc.
Downloadable snapshots of the Chrome Top Million Websites pulled from public CrUX data in Google BigQuery.
List all IP ranges from Google Cloud Platform
Google Chrome, Firefox, and Thunderbird extension that lets you write email in Markdown and render it before sending.
Extracted CPAN (all latest files extracted)
The core software distribution for the Inform 7 programming language.
Papers from the computer science community to read and discuss.
Advanced Data Protection Control (ADPC) is a mechanism to communicate data subjects' (users') consent and privacy decisions with data controllers (service providers).
A Python program to scrape secrets from GitHub using a large repository of dorks.
Apache Block Bad Bots, (Referer) Spam Referrer Blocker, Vulnerability Scanners, Malware, Adware, Ransomware, Malicious Sites, WordPress Theme Detectors and Fail2Ban Jail for Repeat Offenders
A Headless Chrome rendering solution
Rendertron middleware for Python applications.
Essential metrics for a healthy site.