Purpose
An introduction to the wonderful world of honeypots
I've always been very interested in how attackers identify and attempt to exploit systems on the Internet, specifically hosts that don't appear to be very active. With the rise of online scanner repositories such as Shodan and Censys, identifying vulnerable hosts has become significantly easier. I was curious exactly what would happen if I set up an intentionally accessible host on the Internet and just recorded the actions taken against it. This post covers the first five full months of logs and their analysis.
In December of 2019, I started playing around with a free instance of AWS. I created a micro instance and was keenly aware of the cloud misconfigurations that come from default or non-existent security controls. I was worried about people trying to break into it, but then I thought about it for a while and realized it would be interesting to record their actions.
This setup is known as a honeypot. Honeypots have been used as a security tool on computer networks for decades and are widely used in personal, professional, research, and enterprise applications. Several companies, such as Thinkst, base their entire business model around creating reliable honeypots that are easy to set up and manage in order to identify malicious activity. Thinkst is also the author of OpenCanary, the open source project I've used on my honeypot.
Basically, honeypots are systems that are either intentionally misconfigured or pretend to be vulnerable. Actions are recorded when attackers attempt to exploit services, allowing me to analyze the attacks for data points such as:
- Targeted services
- Commonly used usernames and passwords
- Geolocation
- Exploit attempts over time
Cyber Threat Intelligence
Taking the Attacks and Defending with them
I've decided to publish the data gathered from my honeypots to help identify these attackers and help defend other networks. Knowledge of the attacks I've identified feeds the Cyber Threat Intelligence cycle and provides specific information on which IP addresses to block. Every IP in my intelligence feeds performed one or more unsolicited malicious actions against these honeypots.
I've been uploading these IPs daily to AbuseIPDB using their bulk upload API with a log parsing script I wrote by hand. As of writing this, I'm at about 46,000 reports for malicious activity. I also finished up a new set of scripts to automatically parse and upload this intelligence daily to post as feeds on my website. All of these scripts run off of a Raspberry Pi 4b sitting on my desk.
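As a rough illustration of the parsing step, here is a minimal sketch (not my actual script) that turns OpenCanary's JSON log lines into the CSV layout AbuseIPDB accepts for bulk reports. The field names `src_host`, `dst_port`, and `utc_time`, the timestamp format, and the port-to-category mapping are assumptions for the example; check them against your own logs and the AbuseIPDB category list before relying on them.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical mapping from honeypot port to AbuseIPDB category IDs.
# Verify the IDs against the AbuseIPDB category list before use.
PORT_CATEGORIES = {22: "18,22", 21: "5,18", 23: "18,23"}
DEFAULT_CATEGORIES = "14"


def build_bulk_csv(log_lines):
    """Build a bulk-report CSV string from JSON log lines.

    Emits one row per unique (source IP, destination port) pair.
    """
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["IP", "Categories", "ReportDate", "Comment"])
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        ip = event.get("src_host")
        port = event.get("dst_port")
        stamp = event.get("utc_time")
        if not ip or not stamp or (ip, port) in seen:
            continue
        try:
            # Assumed timestamp layout: "YYYY-MM-DD HH:MM:SS.ffffff" in UTC.
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue
        seen.add((ip, port))
        when = when.replace(tzinfo=timezone.utc)
        writer.writerow([
            ip,
            PORT_CATEGORIES.get(port, DEFAULT_CATEGORIES),
            when.isoformat(timespec="seconds"),
            f"Honeypot hit on port {port}",
        ])
    return out.getvalue()
```

The resulting string can be written to a file and submitted to the bulk upload endpoint with whatever HTTP client you prefer. Deduplicating per IP and port keeps a single noisy attacker from flooding the report.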
Honeypot Setup
How I listen...
This specific honeypot is running on an AWS micro instance geo-located in Virginia. I am running OpenCanary with six separate ports open to the Internet.
- SSH
- Redis
- SIP
- Telnet
- HTTP
- FTP
I have a Raspberry Pi 4b at my house which connects to the honeypot at midnight UTC and downloads the latest attack logs. The same Pi parses the logs and uploads the attacker IPs to AbuseIPDB, then builds the specific intel feeds before they are uploaded to this site.
Note: At one point the OpenCanary daemon consumed all of the host's resources and blocked any additional connections. The system load was at 100% and required a restart with `opencanaryd --restart`. This also resulted in the syslog completely filling the disk, followed by some not-fun attempts to log into a server without any disk space. I also learned that tab completion doesn't work when there's no disk space!
Data Overview
Key Takeaways
Overall, there were 430,444 events where someone on the Internet attempted to exploit my host. Inside of these events is a ton of great data, with most of the events being attacks against the SSH or Redis services.
I expected to receive a lot of brute-forcing and other scanning attempts, but I was very surprised by some key data points. For example, HTTP saw only 168 events, far fewer than expected, and none of them were attempts to log into the fake application.
Here's exactly how the data broke down between each protocol:
Port | Protocol | Events |
---|---|---|
22 | SSH | 277,604 |
6379 | Redis | 141,859 |
5060 | SIP | 5,745 |
23 | Telnet | 4,190 |
80 | HTTP | 168 |
21 | FTP | 87 |
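A breakdown like the table above can be produced with a simple tally. This is a minimal sketch rather than my exact tooling; it assumes each log line is a JSON object with an integer `dst_port` field, and the function names are hypothetical.

```python
import json
from collections import Counter


def events_per_port(log_lines):
    """Count events per destination port from JSON log lines."""
    counts = Counter()
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore malformed lines
        if not isinstance(event, dict):
            continue
        port = event.get("dst_port")
        # Skip events without a real service port (for example -1 or missing).
        if isinstance(port, int) and port > 0:
            counts[port] += 1
    return counts


def port_shares(counts):
    """Convert raw counts into percentage shares of the counted events."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {port: 100 * n / total for port, n in counts.items()}


# Shares computed from the per-protocol counts in the table above.
table_counts = {22: 277604, 6379: 141859, 5060: 5745, 23: 4190, 80: 168, 21: 87}
shares = port_shares(table_counts)
```

Using the table's own numbers, SSH and Redis together account for well over 95% of the per-protocol events.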
As for geo-analysis, China was overwhelmingly the top attacker across all open ports. The U.S. came second, followed by France in third. I believe most of the attacks from France originated from OVH, which hosts several types of virtual private servers.
There was a pretty significant amount of brute-forcing, so here's a quick wordcloud of the most frequently attempted usernames and passwords. Please note that there is no working username or password for this honeypot, so no matter how hard these people try they will never be able to log in.
Unsurprisingly, the top username attempted was `root`. Some other interesting usernames include `butter`, `csgo`, `minecraft`, and the classic `changeme`.
What the wordcloud doesn't show is that the most common password by far was blank: attackers tried to log into the host without a password at all. Some interesting passwords I found in this wordcloud are `apple`, `hacker`, `raspberry`, and of course the tried and true `123456`.
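The username and password rankings come down to counting credential fields in the login events. Here is a minimal sketch, assuming each event carries a `logdata` object with `USERNAME` and `PASSWORD` keys (an assumption about the log layout; adjust to match your logs). The function name and the `<blank>` label are hypothetical.

```python
import json
from collections import Counter


def top_credentials(log_lines, n=10):
    """Return the n most common usernames and passwords from login events."""
    users = Counter()
    passwords = Counter()
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        data = event.get("logdata")
        if not isinstance(data, dict):
            continue
        if "USERNAME" in data:
            users[str(data["USERNAME"])] += 1
        if "PASSWORD" in data:
            # Label empty passwords so they stay visible in the ranking.
            passwords[str(data["PASSWORD"]) or "<blank>"] += 1
    return users.most_common(n), passwords.most_common(n)
```

Labeling empty passwords explicitly matters here, because a wordcloud silently drops them even when they are the most common attempt.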
There's a ton more great data inside of these attacks, so I'll break it down by protocol and show some of the interesting information I found during my analysis.
SSH Analysis
Attacks on TCP port 22
SSH was the top targeted port on my honeypot by far, because logging into poorly configured SSH services is still one of the most effective ways to break into servers. Looking at the attackers by IP address shows that one specific IP, `221.237.8.118`, performed over 28% of all attacks on SSH.
Looking further into this top IP shows that it is (unsurprisingly) hosted in China and is from one of the telecoms in the area. Searching in other Cyber Threat Intelligence databases shows other reports of attacks across the Internet as well.
Beyond some other analysis of IP addresses, there's really nothing too surprising coming out of this SSH data. A lot of it is default username and password combinations, which should always be changed when a host is initially configured. I'll continue to gather and analyze this data on SSH because it clearly shows that this protocol is the top choice for attackers to exploit. I expect this to remain the same as long as usernames and passwords are used.
Redis Analysis
Attacks on TCP port 6379
Redis is an in-memory key-value data store, commonly used as a database or cache. My first exposure to this program was when I pwned the box Postman on HackTheBox. By default, Redis does not require authentication, so an instance exposed to the Internet can allow anonymous access. This led to an excellent analysis about Redis Security and how anyone can drop files on a server to allow further exploitation. Postman follows this same exploit chain, which was a lot of fun to learn.
When setting up this honeypot I knew that enabling the Redis port for analysis would very likely display similar attacks. I was very interested in the differences of exploiting a target in a controlled, authorized environment such as HackTheBox (legal and allowed!) versus exploits launched from the Internet without my authorization (illegal and immoral!).
Analyzing the activity over time shows large spikes from attackers attempting similar exploits. There were nine separate spikes, each containing at least 9,000 events sent to the honeypot. Outside of those spikes, the service saw around 25 to 200 events per day. Drilling into the events shows that almost 99% of the attacks on the Redis service used the `AUTH` command, indicating attempts to log into the system.
Analysis of the `AUTH` attempts showed that 96.79% of them came from China, followed by Russia with 2.09%. In total, 10,392 separate passwords were attempted on the Redis service.
The attacks against the Redis service generally finish with the `CONFIG` command, followed by either a directory to set or a key to create. I learned about an interesting attack that attempts to configure a cron job to maintain persistence on the host. I was not able to capture the attacker's specific cron job payload, which may be a limitation of how interactive the honeypot application is. Overall, this is very interesting data showing that Redis is still very much a target for cybercriminals.
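To separate the `AUTH` noise from the more interesting `CONFIG` activity, the Redis events can be tallied by command and screened for cron-related paths. This is a minimal sketch under assumptions: it expects each event's `logdata` to hold the command in `CMD` and its arguments in `ARGS`, which may not match your log layout, and the list of cron paths is illustrative only.

```python
import json
from collections import Counter

# Illustrative directories an attacker might point Redis at to plant a cron job.
CRON_HINTS = ("/var/spool/cron", "/etc/cron")


def summarize_redis(log_lines, port=6379):
    """Tally Redis commands and flag CONFIG calls that reference cron paths."""
    counts = Counter()
    suspicious = []
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict) or event.get("dst_port") != port:
            continue
        data = event.get("logdata")
        if not isinstance(data, dict):
            continue
        cmd = str(data.get("CMD", "")).upper()
        if not cmd:
            continue
        counts[cmd] += 1
        args = str(data.get("ARGS", ""))
        if cmd == "CONFIG" and any(hint in args for hint in CRON_HINTS):
            suspicious.append((event.get("src_host"), args))
    return counts, suspicious
```

The `suspicious` list only shows that a cron directory was referenced. It does not recover the payload itself, which the honeypot never received.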
Telnet Analysis
Attacks on TCP port 23
Telnet is an old application used to deliver shell-based access to a host or an application. It's been around pretty much forever and is used to interface with devices that may not otherwise have an input or output. What's really interesting is the fact that a ton of Internet of Things (IoT) devices use telnet as the main method of communication. Botnets such as Mirai have taken advantage of these poorly secured IoT devices and vulnerable protocols to establish world-wide botnets. I had no idea what to expect and was interested to jump into the data.
Overall, there were 4,190 attacks on TCP port 23 from 169 attacking IPs. 115 usernames were attempted with 294 password attempts. While analyzing usernames I found what appear to be HTTP GET requests passed over TCP 23. The most interesting request was for a very specific PHP file named `B7uAS15Xx.php`. This makes me believe that the attacker was searching for a previously planted PHP backdoor file, though OSINT does not reveal any information about this specific filename.
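Spotting HTTP requests hiding in the Telnet username field is easy to automate. The following is a minimal sketch, assuming the attempted usernames have already been extracted into a list of strings; the regular expression and function name are my own illustration, not part of OpenCanary.

```python
import re

# Matches the start of an HTTP request line, e.g. "GET /path HTTP/1.1".
HTTP_REQUEST_LINE = re.compile(
    r"^(GET|POST|HEAD|PUT|DELETE|OPTIONS)\s+(\S+)\s+HTTP/\d(?:\.\d)?",
    re.IGNORECASE,
)


def http_paths_in_usernames(usernames):
    """Return the requested paths from usernames that look like HTTP requests."""
    paths = []
    for name in usernames:
        match = HTTP_REQUEST_LINE.match(name.strip())
        if match:
            paths.append(match.group(2))
    return paths
```

For example, passing `["root", "GET /example.php HTTP/1.1"]` returns only the path from the second entry, since `root` is an ordinary username.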
SIP Analysis
Attacks on UDP port 5060
There were 5,745 events for Session Initiation Protocol (SIP) launched from 890 IPs. 40.7% of these attacks were from Estonia, followed by 20.4% from the USA and 16.25% from France. China accounted for only 0.942% of SIP attacks, which is very different from the rest of the data. It's possible that attackers in China launched their SIP attacks from cloud infrastructure in other countries.
Analyzing the SIP User Agents shows that the top UA was `friendly-scanner`, known to be used by the SIPVicious toolset. This scanner is anything but friendly. It's interesting to note that these attackers didn't even bother trying to hide the fact that they're using SIPVicious and are just scanning anyway. Some other interesting UAs are `Zedan 5`, `Avaya IP Phone 1120E`, `Trixbox`, and `Cisco`. There was even one event that used `COVID-19`!
`OPTIONS` was the most used method in the SIP traffic, seen in 79.3% of events, seemingly used to determine which extensions are available to dial and to enumerate the phone. `INVITE` was used 10.1% of the time and appears to be an attempt to establish a phone call with the host.
HTTP Analysis
Attacks on TCP port 80
I was surprised to see such a low number of events on the HTTP service, as well as no login attempts on the fake application.
I found some interesting User Agents used to poke the HTTP site. The first was `CATExplorador/1.0beta (sistemes at domini dot cat; http://domini.cat/catexplorador.html)`, but there is no information at the site linked inside of the UA. OSINT shows that this is a known crawler, but there's not much other information behind it. I also identified `facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)`, which appears to be the bot that gathers information and images for a Facebook link. I was not able to find where my honeypot was shared on Facebook.
Conclusions
Interesting analysis and future events
I've had a great time capturing and reporting this malicious activity from my honeypot. Setting up this infrastructure also forced me to write more Python, Bash, and cron jobs across several different devices. There are a handful of things I'm interested in changing, specifically figuring out a way to run the fake HTTP application over HTTPS on port 443. I believe I would receive many more web-based attacks on 443 than on 80. I should also enable Samba support to help identify attacks across TCP 445, since I believe SMB will continue to be targeted by attackers as well.
I will continue to publish the threat intel on this website daily, so check it out!
If you've made it this far, thanks for reading! If you're interested in getting your hands on my data, just let me know! The best way to contact me is email at m4lwhere@protonmail.com.