susm
recursively crawls a website, following HTML links, scripts, stylesheets and sitemaps.
For each file it encounters that references or includes a source map
(e.g., JavaScript bundles, CSS files),
it attempts to locate and download that map.
susm
then attempts to extract any source code files and write them to disk,
preserving their relative paths as defined in the map.
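For context, compiled assets usually point to their map with a sourceMappingURL comment on the final line of the file. As an illustrative check (the host and file names below are placeholders), you can reveal such a reference yourself:
curl -s https://example.com/assets/app.min.js | tail -n 1
For a bundle susm can unpack, this would print something like //# sourceMappingURL=app.min.js.map.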
First, clone the repository:
git clone https://github.com/dixslyf/susm.git
cd susm
To build the scraper, run:
cargo build --release
The compiled binary will be available at target/release/susm
(assuming Cargo's default target directory).
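To check that the build succeeded, you can invoke the binary directly, for example:
./target/release/susm --help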
This repository provides a Nix flake.
To build the scraper with Nix, run:
nix build github:dixslyf/susm
To run the scraper:
nix run github:dixslyf/susm
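When running through Nix, arguments intended for susm go after the -- separator that nix run uses to end its own options. For example (the URL is a placeholder):
nix run github:dixslyf/susm -- site https://example.com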
susm has two primary modes of operation:
- Crawl a website and unpack discovered source maps:
  susm site <URL> [OPTIONS]
- Unpack a single local source map file:
  susm file <PATH> [OPTIONS]
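For example, to crawl a site, or to unpack a map that is already on disk (the URL and path are placeholders):
susm site https://example.com
susm file ./app.min.js.map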
For additional options, run:
susm --help
susm
applies a polite crawling policy by default.
Requests are rate-limited per host to avoid overloading servers.
By default, susm
waits 500 milliseconds between requests with a slight random jitter.
The request interval can be adjusted with the --request-interval (-i) flag.
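For example, to slow the crawler to roughly one request per second (assuming the interval is given in milliseconds, matching the 500 ms default):
susm site https://example.com --request-interval 1000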
susm
also respects the robots.txt
exclusion standard.
Before crawling, it retrieves and parses the site’s robots.txt
file (if present)
and skips any paths disallowed for its user agent.
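For instance, given a robots.txt along these lines (illustrative only):
User-agent: *
Disallow: /admin/
susm would skip every URL under /admin/ and crawl the rest of the site as usual.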