For six months, this website was compromised. I am not sure exactly what happened, but the most likely cause was password reuse, which lends itself to this kind of compromise. The problem became apparent when I noticed an unusual link to a ride-sharing service. Later, I saw more such links. That’s when I realized that I couldn’t simply scan every blog post manually, so I decided to write a small interactive link checker tool. The tool whitelists the starting domain and lets you whitelist URLs on a per-domain basis. The whitelist is persisted at the end of execution and is reused the next time you run the tool.

Say your website is example.com:

go run outbound-link-checker.go \
  -domain example.com \
  -starting-url https://example.com \
  -num-url-crawl-limit -1

The tool starts from the starting URL and scans all the links on the page. Links within the domain are scanned further. Links outside the domain are checked against the whitelist, and the tool prompts you to whitelist any domain that is not already on it.

Using the tool, I caught quite a few more bad links. Note: the tool does not execute JavaScript, so it will miss any dynamically generated links.