A GPL dep can quietly poison your closed-source product. I built a tiny offline tool that catches it.
DEV Community


A few months ago a lawyer asked our team a simple question: "Can you prove nothing in this product is GPL?" We couldn't, not quickly. A couple thousand transitive deps across Node and Python services, and the honest answer was "uh, probably?", which is not what you want to tell a lawyer.

So I went looking for a tool to just tell me, locally, right now, which licenses my dependencies carry and which ones are a problem. What I found:

- license-checker (the npm default, ~900K weekly downloads) has been unmaintained for years, and it dumps raw license strings; it won't tell you GPL is a bigger deal than MPL.
- Snyk, FOSSA, Black Duck: all good, all want a signup, an API token, and a network round-trip before classifying a folder already on my disk.

But the info I needed was already in my node_modules: every package ships a package.json license field, every Python wheel a METADATA file. Why am I uploading anything?

So I built licsniff, a zero-dependency CLI that reads those files locally, classifies each license into a risk tier, and exits. No account, no network, nothing to set up.

    npx licsniff

    PACKAGE         VERSION  LICENSE                RISK
    some-gpl-lib    2.1.0    GPL-3.0                strong-copyleft
    mystery-pkg     0.0.3    (none)                 unknown
    copyleft-utils  1.4.0    LGPL-2.1               weak-copyleft
    left-pad        1.3.0    MIT                    permissive
    fast-json       3.1.4    (MIT OR Apache-2.0)    permissive

Riskiest first. The line you actually need to worry about is at the top.

Tiers, not just strings

The whole point is that "GPL-3.0" is only useful if you know what bucket it falls into. licsniff sorts every license into one of five tiers:

- permissive: MIT, ISC, BSD, Apache-2.0, 0BSD, Unlicense, CC0… Use freely.
- weak-copyleft: LGPL-*, MPL-2.0, EPL-*, CDDL-*. File/linking obligations.
- strong-copyleft: GPL-*, AGPL-*. Can force you to open-source your code.
- proprietary: UNLICENSED, SEE LICENSE IN …. Not open source at all.
- unknown: missing or unrecognized. The scariest, honestly: you don't even know what you're shipping.
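To make the tiers concrete, here is a minimal Python sketch of how a single license string could be bucketed. It is not licsniff's source: `classify_license`, the alias table, and the lookup sets are hypothetical names that cover only the IDs mentioned above, and compound AND/OR expressions are deliberately out of scope.

```python
import re

# Hypothetical tables built from the tier list above; a real tool's tables
# would be much longer and may be organized differently.
ALIASES = {"GPLV3": "GPL-3.0", "APACHE LICENSE 2.0": "APACHE-2.0"}
PERMISSIVE = {"MIT", "ISC", "BSD", "APACHE-2.0", "0BSD", "UNLICENSE", "CC0-1.0"}
WEAK_PREFIXES = ("LGPL-", "MPL-", "EPL-", "CDDL-")
STRONG_PREFIXES = ("AGPL-", "GPL-")


def classify_license(raw):
    """Return the risk tier for one license string (no AND/OR handling)."""
    if raw is None or not raw.strip():
        return "unknown"
    text = raw.strip().upper()
    # npm's markers for "not open source" must be checked before anything else.
    if text == "UNLICENSED" or text.startswith("SEE LICENSE IN"):
        return "proprietary"
    # Map free-form spellings onto an SPDX-style ID where we know one.
    text = ALIASES.get(text, text)
    # Fold version-suffix variants: GPL-3.0+, GPL-3.0-only, GPL-3.0-or-later.
    text = re.sub(r"(\+|-ONLY|-OR-LATER)$", "", text)
    if text in PERMISSIVE or text.startswith("BSD-"):
        return "permissive"
    if text.startswith(WEAK_PREFIXES):
        return "weak-copyleft"
    if text.startswith(STRONG_PREFIXES):
        return "strong-copyleft"
    return "unknown"


if __name__ == "__main__":
    for sample in ["MIT", "GPL-3.0-only", "GPLv3", "LGPL-2.1", "UNLICENSED", None]:
        print(sample, "->", classify_license(sample))
```

One detail worth noticing: npm's `UNLICENSED` (no rights granted) and the SPDX ID `Unlicense` (a public-domain-style dedication) differ by a single letter, which is why the sketch tests for the proprietary marker with an exact match before consulting the permissive set.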
SPDX expressions, parsed properly

Real metadata isn't clean. You get (MIT OR Apache-2.0), GPL-3.0 AND MIT, GPLv3, GPL-3.0+, GPL-3.0-only, Apache License 2.0. licsniff normalizes all of it and evaluates the boolean expressions the way they actually work:

- OR → the least restrictive option wins (you pick the friendly one), so (MIT OR GPL-3.0) is permissive.
- AND → the most restrictive wins, so GPL-3.0 AND MIT is strong-copyleft. (You don't want a false "permissive" gating your CI here.)

The flag that earns its keep: --fail-on

This is what made it stick on our team. Drop one line in CI:

    licsniff --fail-on strong-copyleft

It exits 1 the moment any dependency lands at or above that tier, so a GPL transitive dep can't sneak in through an npm install without failing the build. There's also --summary for counts and --json | jq for everything else.

It runs on both ecosystems

Half our services are Node, half Python, so licsniff ships on both registries. Same tool, same tiers; each version audits its own ecosystem:

    npx licsniff        # Node: scans node_modules, zero deps
    pipx run licsniff   # Python: scans site-packages, pure stdlib

The Python build reads *.dist-info/METADATA, including the modern PEP 639 License-Expression: field newer wheels use. Both ports implement the same classifier and are tested against the same vectors, so they assign the same tier to the same license string.

A few design notes

- One pure function at the core. classifyLicense(idOrName) → {tier, spdx} has no I/O, no clock, no globals; the CLI is just a thin folder-reader around it. That's why the Node and Python builds can be checked against each other: they run one shared test table.
- Offline and read-only by design. Never writes a file, never opens a socket. Safe in air-gapped CI, on a client's machine, everywhere.

Try it / break it

Code, issues, and the full README:

- Node: https://github.com/jjdoor/licsniff
- Python: https://github.com/jjdoor/licsniff-py

It's MIT and small.
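For anyone hunting for a string that gets mis-tiered, it helps to know what the OR/AND rule from the SPDX section predicts for a compound expression. Below is a minimal Python sketch of that rule, not licsniff's implementation: `tier_of_expression`, `TIER_OF`, and `RANK` are hypothetical, the lookup table covers only a handful of IDs, and WITH exceptions are not handled. It assumes AND binds tighter than OR, as in the SPDX expression grammar, and it ranks only the three open-source tiers, since the post does not say where proprietary and unknown sit in the ordering.

```python
import re

# Restrictiveness of the three open-source tiers (higher = more restrictive).
RANK = {"permissive": 0, "weak-copyleft": 1, "strong-copyleft": 2}

# Tiny stand-in for a per-license classifier (hypothetical table).
TIER_OF = {
    "MIT": "permissive",
    "Apache-2.0": "permissive",
    "LGPL-2.1": "weak-copyleft",
    "GPL-3.0": "strong-copyleft",
}


def tier_of_expression(expr):
    """Evaluate an SPDX-style expression: OR takes the min tier, AND the max."""
    tokens = re.findall(r"\(|\)|[^\s()]+", expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        # OR: the licensee chooses, so the least restrictive branch wins.
        best = parse_and()
        while peek() == "OR":
            take()
            best = min(best, parse_and(), key=RANK.__getitem__)
        return best

    def parse_and():
        # AND: every term applies, so the most restrictive one wins.
        worst = parse_atom()
        while peek() == "AND":
            take()
            worst = max(worst, parse_atom(), key=RANK.__getitem__)
        return worst

    def parse_atom():
        tok = take()
        if tok == "(":
            inner = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return inner
        return TIER_OF[tok]  # KeyError for IDs outside the toy table

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]}")
    return result


if __name__ == "__main__":
    for e in ["(MIT OR GPL-3.0)", "GPL-3.0 AND MIT", "MIT OR LGPL-2.1 AND GPL-3.0"]:
        print(e, "->", tier_of_expression(e))
```

Precedence is where a naive left-to-right evaluator goes wrong: `MIT OR LGPL-2.1 AND GPL-3.0` groups as `MIT OR (LGPL-2.1 AND GPL-3.0)`, which leaves a permissive choice, while `(MIT OR LGPL-2.1) AND GPL-3.0` does not.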
I'd genuinely like to know which license string it mis-tiers: paste me a weird one from your node_modules and I'll add it to the vectors.

How are you checking your dependency licenses today? Or are you, like past me, just hoping no lawyer asks?
