DEV Community

r4b1t_h0l3

→ Try it: gnomeman4201.github.io/r4b1t

It's a curated random link generator for security and OSINT researchers. 53,869 verified live URLs. Roll one. See what happens.

Yes, it's basically StumbleUpon for your niche. That's the point. StumbleUpon worked. Nobody built a replacement for it when it died, especially not for security research. So I did.

What a real session looks like

I opened the tool this morning. Here's exactly what I rolled, in order:

- cvedb.shodan.io: Shodan's CVE database. Structured vulnerability data, searchable, free.
- hnd.techlearningcollective.com: Hackers Next Door, an infosec conference I'd never heard of.
- easyperf.net: Performance engineering blog. Low-level, serious, no fluff.
- engineeringblog.yelp.com/2014/11/scaling-elasticsearch: 2014 Yelp post on Elasticsearch at scale. Still accurate.
- domains-index.com: Domain registration intelligence. OSINT pivoting tool.
- metapicz.com: EXIF metadata viewer. Forgot this existed. Bookmarked.
- discover.maxar.com: Satellite imagery browser. Geospatial OSINT.
- github.com/jamesm0rr1s/BurpSuite-Add-and-Track-Custom-Issues: BurpSuite extension I didn't know existed.
- insanecoding.blogspot.co.uk/2014/05/a-good-idea-with-bad-usage-devurandom: Post on /dev/urandom misuse from 2014. Referenced everywhere, never read it until now.
- en.wikipedia.org/wiki/Software_development: Wikipedia. In a pool of 53,000 URLs this came up. I hit SPROUT on it anyway.

Ten rolls. Two tools I'm adding to my workflow. One conference I'm looking up. Three things I'd completely forgotten existed.

How it works: entirely in your browser

The entire pool loads into a JavaScript array in memory on page open. One 2 MB fetch of a plain text file. Everything else runs client-side. No backend decides what you see. No server query on each roll. No tracking.
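To make the load step above concrete, here is a minimal sketch. It assumes the pool is a newline-delimited text file with one URL per line. The path pool.txt and the helpers parsePool and loadPool are hypothetical names for illustration, not taken from the r4b1t source.

```javascript
// Hypothetical helper: turn the fetched text into an in-memory array.
// Assumes one URL per line; blank lines and surrounding whitespace are dropped.
function parsePool(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Hypothetical loader: one fetch on page open, then everything is local.
// "pool.txt" is a placeholder path, not the project's real file name.
async function loadPool(path = "pool.txt") {
  const response = await fetch(path);
  if (!response.ok) {
    throw new Error(`pool fetch failed: ${response.status}`);
  }
  return parsePool(await response.text());
}
```

Once the array is in memory, each roll is just an index lookup, which is why no server round trip is needed per roll.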
Rolling a URL:

    // domainCount and lastUrl are state kept outside this function.
    function ee(pool) {
      let url, attempts = 0;
      do {
        url = pool[Math.floor(Math.random() * pool.length)];
        attempts++;
        const domain = new URL(url).hostname.replace(/^www\./, "");
        // Re-roll over-represented domains, but only while attempts < 25.
        if ((domainCount[domain] || 0) >= 2 && attempts < 25) continue;
        // Accept anything that isn't an immediate repeat of the last URL.
        if (url !== lastUrl) break;
      } while (attempts < 30);
      return url;
    }

Domain diversity is enforced in the roll: a domain that domainCount already shows at two or more gets re-rolled for the first 24 draws, and the loop stops after 30 draws no matter what. 14,488 unique domains in the pool.

SPROUT: semantic navigation without AI

Hit SPROUT on any URL and get four directional suggestions:

- DEEPER: further into this niche
- SIDEWAYS: adjacent territory
- OPPOSITE: contrasting view
- WEIRD: unexpected tangent

I originally used the Anthropic API for this. It worked. Then I ripped it out.

What actually happens now:

- Read the OG title + description already fetched for the preview card (zero extra cost)
- Query Wikipedia's free API for the domain and get the intro extract and article categories
- Extract top keywords by frequency, filtered against a stopword list
- Score the pool by keyword overlap (a simplified take on Jaccard similarity: matched keywords divided by total keywords)

    function xe(url, keywords) {
      // Parse the URL so hostname and pathname are defined.
      const { hostname, pathname } = new URL(url);
      const text = (hostname + pathname).toLowerCase();
      let hits = 0;
      for (const kw of keywords) {
        // Substring match; assumes keywords are already lowercase.
        if (text.includes(kw)) hits++;
      }
      // Fraction of keywords found; Math.max avoids dividing by zero.
      return hits / Math.max(keywords.length, 1);
    }

Runs against up to 3,000 randomly sampled URLs. In milliseconds. In your browser. No API key. No cost. No rate limit.

Where it fails: OG metadata is often garbage (generic descriptions, SEO spam, or missing entirely). When that happens SPROUT falls back to URL-only token matching, which is coarser. I accepted this tradeoff over adding an AI dependency. The WEIRD direction is intentionally low-signal; it's supposed to surprise you.

The pool, and the link rot problem

Starting corpus: ~120,000 URLs from Start.me OSINT pages and GitHub awesome-lists across 21 categories. Every URL swept with HEAD requests, 10 second timeout, 50 concurrent workers, results checkpointed to SQLite.
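The sweep pattern described above (HEAD request, timeout, bounded concurrency) can be sketched roughly as follows. This is an illustration only, written in JavaScript to match the other snippets; the project's own sweep is a Python script, and the SQLite checkpointing is left out here. The names checkUrl and sweep, and the injectable fetchFn parameter, are assumptions for the sketch.

```javascript
// Hypothetical sketch of a liveness sweep; not the project's pool_sweep.py.
// Checks one URL with a HEAD request and a timeout. fetchFn is injectable
// so the logic can be exercised without touching the network.
async function checkUrl(url, { timeoutMs = 10000, fetchFn = fetch } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchFn(url, { method: "HEAD", signal: controller.signal });
    return res.ok; // any 2xx status counts as alive
  } catch (err) {
    return false; // network error or timeout
  } finally {
    clearTimeout(timer);
  }
}

// Runs checkUrl over the list with at most `concurrency` requests in flight.
// Returns the URLs that answered, in their original order.
async function sweep(urls, { concurrency = 50, ...checkOpts } = {}) {
  const alive = new Array(urls.length).fill(false);
  let next = 0;
  async function worker() {
    while (next < urls.length) {
      const i = next++; // no await between read and increment, so no race
      alive[i] = await checkUrl(urls[i], checkOpts);
    }
  }
  const workerCount = Math.min(concurrency, urls.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return urls.filter((_, i) => alive[i]);
}
```

Each worker pulls the next index from a shared counter, so the number of in-flight requests never exceeds the concurrency limit. One caveat for any sweep of this shape: some servers reject HEAD outright, so a live site can look dead unless there is a GET fallback.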
What survived: 53,869 verified live URLs across 14,488 unique domains.

Yes, it's my curated bookmarks folder. That's also the point: random across the whole internet is noise. The curation is what makes the randomness useful.

Link rot: A GitHub Actions workflow runs pool_sweep.py every Sunday, hits every URL, and auto-commits the pruned pool. Dead links get culled weekly. It's not perfect (a site can return 200 while serving a parking page), but it catches the obvious rot.

What it doesn't do

No login. No analytics. No recommendation engine. No ads.

The Cloudflare Worker handles OG metadata fetching only: origin-locked, rate limited at 60 req/min per IP, RFC1918 blocked. The core loop (roll, visit, skip) works without it entirely.

Source: github.com/GnomeMan4201/r4b1t
Submit a URL: GitHub Issues

badBANANA Research Collective
