BigPhish is an application developed for Internet researchers and law enforcement professionals to study the deployment of phishing kits in the wild. It originates from a 2021 USENIX paper, which proposed a method to discover new phishing domains by leveraging Certificate Transparency (CT) logs. By continuously monitoring these logs and crawling potential phishing domains, BigPhish can identify and track domains that have been set up with a phishing kit. Phishing kits are identified by searching for fingerprints, which are unique file paths associated with those kits. BigPhish offers an all-in-one solution: it monitors CT logs, crawls domains, stores the results in Elasticsearch, and provides a Web application to interact with them.
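To make the fingerprint idea concrete, here is a minimal sketch of matching a domain against kit fingerprints. It is not BigPhish's actual implementation: the fingerprint table, the kit names, the `identify_kits` function, and the rule that every path of a kit must be present are all assumptions made for illustration. The HTTP probe is replaced by an in-memory lookup so the example runs without network access.

```python
from typing import Callable, Dict, List

# Hypothetical fingerprint table: kit name -> file paths assumed unique to that kit.
FINGERPRINTS: Dict[str, List[str]] = {
    "example-kit-a": ["/admin/panel.php", "/assets/kit-a.css"],
    "example-kit-b": ["/secure/step2.php"],
}


def identify_kits(
    domain: str,
    path_exists: Callable[[str, str], bool],
    fingerprints: Dict[str, List[str]] = FINGERPRINTS,
) -> List[str]:
    """Return the kits whose fingerprint paths are all present on the domain.

    Assumption: a kit matches only if every one of its paths exists.
    """
    return [
        kit
        for kit, paths in fingerprints.items()
        if paths and all(path_exists(domain, path) for path in paths)
    ]


# Stand-in for a real HTTP probe: an in-memory set of (domain, path) pairs.
KNOWN_PATHS = {
    ("phish.example", "/admin/panel.php"),
    ("phish.example", "/assets/kit-a.css"),
}


def fake_path_exists(domain: str, path: str) -> bool:
    """Report whether the path is known to exist on the domain (test double)."""
    return (domain, path) in KNOWN_PATHS


print(identify_kits("phish.example", fake_path_exists))
```

Passing the probe in as a function keeps the matching logic separate from the crawling logic, so a real crawler could supply its own check while tests use the in-memory version.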
BigPhish consists of 10 modules, some off-the-shelf and some custom-built. Each module runs in its own Docker container:
- Certscanner detects potential phishing domains from a stream of TLS certificates.
- Crawler automatically visits the potential phishing domains, analyzes them, and tries to identify the used phishing kit.
- Elasticsearch provides index storage for all gathered information.
- Kibana for easy interaction with Elasticsearch.
- VPN routes the requests from the crawler through a VPN connection.
- API to allow for easy and secure access to the phishing data.
- NodeJS serves as a front-end Web application to interact with the data.
- NGINX is a bunkerized NGINX container placed in front of the API to secure Internet-facing connections.
- MinIO provides object storage for items like screenshots and downloaded phishing kits.
- Monitor service that monitors the phishing domains and sends out notifications about new domains.
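As a rough illustration of the first step in this pipeline, the sketch below screens domain names taken from TLS certificates for suspicious keywords. The keyword list, the `flag_domains` function, and the substring-matching rule are hypothetical; the actual Certscanner may select candidates quite differently.

```python
from typing import Iterable, List, Sequence

# Hypothetical keywords; a real deployment would use its own list.
SUSPICIOUS_KEYWORDS: Sequence[str] = ("login", "secure", "verify")


def flag_domains(
    cert_domains: Iterable[str],
    keywords: Sequence[str] = SUSPICIOUS_KEYWORDS,
) -> List[str]:
    """Return unique, normalized domains that contain any keyword."""
    flagged: List[str] = []
    seen = set()
    for name in cert_domains:
        normalized = name.strip().lower()
        # Certificates may carry wildcard names such as "*.example.com".
        if normalized.startswith("*."):
            normalized = normalized[2:]
        if normalized in seen:
            continue
        seen.add(normalized)
        if any(keyword in normalized for keyword in keywords):
            flagged.append(normalized)
    return flagged


# Domain names as they might appear in a batch of certificates.
sample = [
    "*.secure-bank-login.example",
    "blog.example",
    "SECURE-bank-login.example",
]
print(flag_domains(sample))
```

Domains flagged this way would only be candidates; in the architecture described above, the crawler is the component that visits them and tries to identify a kit.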
BigPhish was developed by TNO's cybercrime team in close collaboration with Dutch law enforcement.