Tags: how-to, bug bounty, hack the box, python, recon, luigi
Welcome back! If you found your way here without reading the prior posts in this series, you may want to start with some of the links to previous posts (below). This post is part three of a multi-part series demonstrating how to build an automated pipeline for target reconnaissance. The target in question could be the target of a pentest, bug bounty, or capture the flag challenge (shout out to my HTB peoples!). By the end of the series, we’ll have built a functional recon pipeline that can be tailored to fit your own needs.
Previous posts:
Part III will:
Add nmap to our pipeline
Add a searchsploit vulnerability check to our pipeline
Part III’s git tags:
To get the repository to the point at which we’ll start, we can run one of the following commands. Which command to use depends on whether the repository is already present: the first clones it fresh, the second is run from within an existing clone.
git clone --branch stage-2 https://github.com/epi052/recon-pipeline.git
git checkout tags/stage-2
Roadmap:
If you would like to skip to this point in the code, run the following git command from within the cloned repository:
git checkout tags/stage-2
In this post, we’ll add nmap to our pipeline. Our nmap scan will read target information from the pickled dictionary created by the ParseMasscanOutput Task. After reading in the target information, we’ll generate nmap commands that only scan the open ports that belong to each host. To facilitate scans across multiple hosts, we’ll be adding threading to this particular module. Let’s begin.
We’ll start by adding a new file to our recon module named nmap.py with the following contents.
1  import luigi
2  from luigi.util import inherits
3
4  from recon.masscan import ParseMasscanOutput
5
6  @inherits(ParseMasscanOutput)
7  class ThreadedNmap(luigi.Task):
8      threads = luigi.Parameter(default=10)
9
10     def requires(self):
11         args = {
12             "rate": self.rate,
13             "target_file": self.target_file,
14             "top_ports": self.top_ports,
15             "interface": self.interface,
16             "ports": self.ports,
17         }
18         return ParseMasscanOutput(**args)
19
20     def output(self):
21         return luigi.LocalTarget(f"{self.target_file}-nmap-results")
If you’ve read Part I and Part II already, nothing in the code above should be surprising. We’re creating a new Task that depends on the ParseMasscanOutput Task. We’re adding a new Parameter named threads and giving it a default of 10. The threads variable corresponds to the number of threads used to run multiple nmap scans in parallel. In the output method, we specify the folder on the filesystem that this Task produces. Other than that, this is pretty standard fare.
Next up, we have the run method. The run method is where our heavy lifting is done.
The code below is a simple sanity check for the value passed to the threads Parameter.
23     def run(self):
24         """ Parses pickled target info dictionary and runs targeted nmap scans against only open ports. """
25         try:
26             self.threads = abs(int(self.threads))
27         except (TypeError, ValueError):  # int() raises ValueError for non-numeric strings
28             return logging.error("The value supplied to --threads must be a non-negative integer.")
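To see how that sanity check behaves with different inputs, here is a minimal sketch that runs outside of luigi. The helper name parse_thread_count and its default fallback are assumptions made for illustration; they are not part of the pipeline.

```python
def parse_thread_count(value, default=10):
    """Hypothetical helper: coerce a --threads style value to a usable worker count."""
    try:
        count = abs(int(value))
    except (TypeError, ValueError):
        # int(None) raises TypeError, int("ten") raises ValueError
        return default
    # ThreadPoolExecutor refuses max_workers=0, so fall back to the default there too
    return count or default


for raw in ("4", "-8", "ten", None, 0):
    print(repr(raw), "->", parse_thread_count(raw))
```

The point worth noticing is that a non-numeric string such as "ten" raises ValueError rather than TypeError, so a check that wants to reject bad command-line input needs to handle both.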
Next, we load the pickled dictionary of target information from disk and deserialize it.
30         ip_dict = pickle.load(open(self.input().path, "rb"))
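If you want to poke at that round trip by hand, the sketch below pickles and reloads a small dictionary with the same shape ({ip: {protocol: set of ports}}). The sample data and the filename are made up for the example, and it uses a with block so the file handle is closed as soon as loading finishes.

```python
import pickle
import tempfile
from pathlib import Path

# sample data only; shape is {ip: {protocol: set of port strings}}
ip_dict = {"10.10.10.155": {"tcp": {"22", "80"}, "udp": {"161"}}}

with tempfile.TemporaryDirectory() as tmpdir:
    pickle_path = Path(tmpdir) / "masscan.parsed.pickle"  # hypothetical filename

    with open(pickle_path, "wb") as f:
        pickle.dump(ip_dict, f)

    # the context manager closes the handle once pickle.load returns
    with open(pickle_path, "rb") as f:
        loaded = pickle.load(f)

print(loaded == ip_dict)
```

As always with pickle, only load files your own pipeline produced; unpickling untrusted data can execute arbitrary code.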
After that, we build out a template for our nmap commands. The list below defines the structure we’ll use for each execution of nmap. There are two placeholders. The first placeholder is where we’ll specify the scan type, either -sT for TCP or -sU for UDP. The second placeholder is where we’ll specify the ports to scan.
32         nmap_command = [
33             "nmap",
34             "--open",
35             "PLACEHOLDER-IDX-2",
36             "-sC",
37             "-T",
38             "4",
39             "-sV",
40             "-Pn",
41             "-p",
42             "PLACEHOLDER-IDX-9",
43             "-oA",
44         ]
As seen above, our nmap commands will resemble what’s below, after replacing the placeholders with meaningful data.
nmap --open -sT -sC -T 4 -sV -Pn -p 43,25,21,53,22 -oA
Obviously what’s above is not a complete nmap command. There are a few more pieces we need to add to the list before we’re done. The last two pieces of the command are seen below on lines 55 and 57. On line 55, we add the argument to the -oA option, specifying the name of our three output files and the directory in which they’ll live. On line 57, we add the target as the last item in the list, ultimately making it the last part of the command itself. On line 61, we have a python equivalent of mkdir -p. This line won’t error out if the folder already exists and creates parent folders if they’re needed. The folder that’s created is the one we specified in the output method above.
46         commands = list()
47
48         for target, protocol_dict in ip_dict.items():
49             for protocol, ports in protocol_dict.items():
50                 tmp_cmd = nmap_command[:]
51                 tmp_cmd[2] = "-sT" if protocol == "tcp" else "-sU"
52
53                 tmp_cmd[9] = ",".join(ports)  # ports is a set; nmap wants a comma-separated string
54                 # arg to -oA, will drop into subdir off curdir
55                 tmp_cmd.append(f"{self.output().path}/nmap.{target}-{protocol}")
56
57                 tmp_cmd.append(target)  # target as final arg to nmap
58
59                 commands.append(tmp_cmd)
60
61         Path(self.output().path).mkdir(parents=True, exist_ok=True)
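To make the command construction easier to experiment with, here is a rough standalone sketch of the same idea for a single target and protocol. The function name build_nmap_command is hypothetical, and it assumes the ports arrive as a set of strings, which means they have to be joined into one comma-separated string before nmap (or subprocess) can use them.

```python
def build_nmap_command(target, protocol, ports, results_dir):
    """Hypothetical helper: build one nmap argument list for a target/protocol pair."""
    scan_type = "-sT" if protocol == "tcp" else "-sU"
    return [
        "nmap", "--open", scan_type, "-sC", "-T", "4", "-sV", "-Pn",
        # every item handed to subprocess must be a string, so join the set of ports
        "-p", ",".join(sorted(ports, key=int)),
        # -oA takes a basename; nmap adds .nmap, .gnmap, and .xml itself
        "-oA", f"{results_dir}/nmap.{target}-{protocol}",
        target,  # target is the final argument
    ]


cmd = build_nmap_command("10.10.10.155", "tcp", {"22", "80", "443"}, "htb-targets-nmap-results")
print(" ".join(cmd))
```

Sorting the ports is purely cosmetic here; it just keeps the generated command stable from run to run, since sets have no guaranteed order.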
The astute reader may be wondering why we’re storing each command list in another list. The answer is that we need all the commands in an iterable in order to add threading to our Task. Python can make simple threading tasks like this incredibly easy (not every threaded program is this simple…). Check out the code below for the implementation.
63         with concurrent.futures.ThreadPoolExecutor(max_workers=self.threads) as executor:
64             executor.map(subprocess.run, commands)
In the code above, we use a ThreadPoolExecutor in order to execute parallel nmaps. The class creates a pool of threads for use. Imagine we have 17 nmap commands to execute. Assuming we passed 10 to the max_workers keyword argument, the first ten will spawn immediately, one per thread. When the first nmap command finishes, its thread is returned to the pool, where it picks up the eleventh command. This process repeats until all of the commands are executed.
The call to map on line 64 calls the subprocess.run function once for each command in commands, with at most max_workers of those calls running at any one time (one per worker thread). Pretty neat, right? Writing a threaded application doesn’t get much easier than that.
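If you’d like to watch the pool work without pointing nmap at anything, the sketch below swaps in 17 harmless stand-in commands (each one just starts the Python interpreter and exits). Everything about the stand-in commands is an assumption for the demo; only the ThreadPoolExecutor and subprocess.run usage mirrors the Task.

```python
import concurrent.futures
import subprocess
import sys

# 17 stand-in commands; each asks the current Python interpreter to do nothing and exit
commands = [[sys.executable, "-c", "pass"] for _ in range(17)]

with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    # map() schedules one subprocess.run call per command, at most 10 at a time
    results = list(executor.map(subprocess.run, commands))

print(len(results))
print(all(proc.returncode == 0 for proc in results))
```

One difference from the Task is the list() around executor.map. An exception raised inside a worker only surfaces when its result is pulled from the iterator, so consuming the results is a cheap way to find out if one of the calls blew up.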
Here we have the finalized code with comments.
import pickle
import logging
import subprocess
import concurrent.futures
from pathlib import Path

import luigi
from luigi.util import inherits

from recon.masscan import ParseMasscanOutput


@inherits(ParseMasscanOutput)
class ThreadedNmap(luigi.Task):
    """ Run nmap against specific targets and ports gained from the ParseMasscanOutput Task.

    nmap commands are structured like the example below.

    nmap --open -sT -sC -T 4 -sV -Pn -p 43,25,21,53,22 -oA htb-targets-nmap-results/nmap.10.10.10.155-tcp 10.10.10.155

    The corresponding luigi command is shown below.

    PYTHONPATH=$(pwd) luigi --local-scheduler --module recon.nmap ThreadedNmap --target-file htb-targets --top-ports 5000

    Args:
        threads: number of threads for parallel nmap command execution
        rate: desired rate for transmitting packets (packets per second) *--* Required by upstream Task
        interface: use the named raw network interface, such as "eth0" *--* Required by upstream Task
        top_ports: Scan top N most popular ports *--* Required by upstream Task
        ports: specifies the port(s) to be scanned *--* Required by upstream Task
        target_file: specifies the file on disk containing a list of ips or domains *--* Required by upstream Task
    """

    threads = luigi.Parameter(default=10)

    def requires(self):
        """ ThreadedNmap depends on ParseMasscanOutput to run.

        TargetList expects target_file as a parameter.
        Masscan expects rate, target_file, interface, and either ports or top_ports as parameters.

        Returns:
            luigi.Task - ParseMasscanOutput
        """
        args = {
            "rate": self.rate,
            "target_file": self.target_file,
            "top_ports": self.top_ports,
            "interface": self.interface,
            "ports": self.ports,
        }
        return ParseMasscanOutput(**args)

    def output(self):
        """ Returns the target output for this task.

        Naming convention for the output folder is TARGET_FILE-nmap-results.

        The output folder will be populated with all of the output files generated by
        any nmap commands run. Because the nmap command uses -oA, there will be three
        files per target scanned: .xml, .nmap, .gnmap.

        Returns:
            luigi.local_target.LocalTarget
        """
        return luigi.LocalTarget(f"{self.target_file}-nmap-results")

    def run(self):
        """ Parses pickled target info dictionary and runs targeted nmap scans against only open ports. """
        try:
            self.threads = abs(int(self.threads))
        except (TypeError, ValueError):  # int() raises ValueError for non-numeric strings
            return logging.error("The value supplied to --threads must be a non-negative integer.")

        ip_dict = pickle.load(open(self.input().path, "rb"))

        nmap_command = [  # placeholders will be overwritten with appropriate info in loop below
            "nmap",
            "--open",
            "PLACEHOLDER-IDX-2",
            "-sC",
            "-T",
            "4",
            "-sV",
            "-Pn",
            "-p",
            "PLACEHOLDER-IDX-9",
            "-oA",
        ]

        commands = list()

        """
        ip_dict structure
        {
            "IP_ADDRESS":
                {'udp': {"161", "5000", ... },
                ...
                i.e. {protocol: set(ports) }
        }
        """
        for target, protocol_dict in ip_dict.items():
            for protocol, ports in protocol_dict.items():
                tmp_cmd = nmap_command[:]
                tmp_cmd[2] = "-sT" if protocol == "tcp" else "-sU"

                # ports is a set; nmap wants a comma-separated string
                tmp_cmd[9] = ",".join(ports)

                # arg to -oA, will drop into subdir off curdir
                tmp_cmd.append(f"{self.output().path}/nmap.{target}-{protocol}")

                tmp_cmd.append(target)  # target as final arg to nmap

                commands.append(tmp_cmd)

        # basically mkdir -p, won't error out if already there
        Path(self.output().path).mkdir(parents=True, exist_ok=True)

        with concurrent.futures.ThreadPoolExecutor(max_workers=self.threads) as executor:
            executor.map(subprocess.run, commands)
Now that we have our nmap scans complete, the first thing we’ll do with them is run searchsploit against the results to check for any low hanging fruit. Let’s go!
As we saw above, there is a definite pattern to writing these Tasks now. Let’s start by getting the boilerplate out of the way. We’re still working in the nmap.py file for this Task.
67 @inherits(ThreadedNmap)
68 class Searchsploit(luigi.Task):
69     def requires(self):
70         args = {
71             "rate": self.rate,
72             "ports": self.ports,
73             "threads": self.threads,
74             "top_ports": self.top_ports,
75             "interface": self.interface,
76             "target_file": self.target_file,
77         }
78         return ThreadedNmap(**args)
79
80     def output(self):
81         return luigi.LocalTarget(f"{self.target_file}-searchsploit-results")
Just like earlier, this code is nothing new. We’re creating a new Task that depends on the ThreadedNmap Task that we just created. Similar to the ThreadedNmap Task, this Task will create a folder of results in which to store each run of searchsploit.
Next, we’ll look at the run method, where we execute searchsploit and save the results. If you’re not aware, searchsploit has a --nmap option that accepts nmap’s xml results as input. Therefore, we’re going to grab each xml file that we created in the ThreadedNmap task above and pass each one of them to searchsploit for processing.
83     def run(self):
84         for entry in Path(self.input().path).glob("nmap*.xml"):
85             proc = subprocess.run(["searchsploit", "--nmap", str(entry)], stderr=subprocess.PIPE)
86             if proc.stderr:
87                 Path(self.output().path).mkdir(parents=True, exist_ok=True)
88
89                 # grab the target specifier out of TGT-nmap-results/nmap.10.10.10.157-tcp -> i.e. 10.10.10.157
90                 target = entry.stem.replace("nmap.", "").replace("-tcp", "").replace("-udp", "")
91
92                 Path(
93                     f"{self.output().path}/searchsploit.{target}-{entry.stem[-3:]}.txt"
94                 ).write_bytes(proc.stderr)
There’s a lot going on with line 84, so let’s break it down. Based on the results of output from ThreadedNmap (which we access here as this Task’s input()), we look in that directory for any files that start with nmap and end with .xml. Once we have all of the xml files, we iterate over each one.
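Here is a small sketch of that glob in isolation. It fakes an nmap results folder in a temporary directory (the filenames are sample data following the nmap.TGT-PROTOCOL pattern) and shows that only the xml files come back.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    results_dir = Path(tmpdir)

    # -oA leaves three files per scan; only the .xml ones should match the pattern
    for name in (
        "nmap.10.10.10.155-tcp.xml",
        "nmap.10.10.10.155-tcp.nmap",
        "nmap.10.10.10.155-tcp.gnmap",
        "nmap.10.10.10.155-udp.xml",
    ):
        (results_dir / name).touch()

    # glob() yields Path objects lazily; sort the names so the order is predictable
    xml_files = sorted(entry.name for entry in results_dir.glob("nmap*.xml"))

print(xml_files)
```

Note that glob() makes no promise about ordering, which is fine for the Task since each file is processed independently.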
Line 85 is where we execute searchsploit. The command structure is simple; an example is shown below. Note that the xml file lives in the nmap results folder, since that’s where ThreadedNmap wrote it.
searchsploit --nmap wall-nmap-results/nmap.10.10.10.157-tcp.xml
Line 86 simply checks whether or not our command generated output on STDERR. searchsploit prints to STDERR, so that’s where we need to look to capture any results.
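To see what capturing STDERR looks like without having searchsploit installed, the sketch below uses a stand-in child process that writes only to STDERR. The child command is an assumption for the demo; the subprocess.run call follows the same pattern as the Task, with STDOUT captured as well so the difference is visible.

```python
import subprocess
import sys

# stand-in for searchsploit: a child process that writes only to STDERR
child = [sys.executable, "-c", "import sys; sys.stderr.write('results on stderr')"]

proc = subprocess.run(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print(proc.stdout)  # empty bytes, nothing was written to STDOUT
print(proc.stderr)  # the bytes we would write to the results file
```

Because no text or encoding argument is passed, proc.stderr is a bytes object, which is why the Task writes it out with write_bytes rather than write_text.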
On line 87, we’re creating the directory where our searchsploit results will live. The path we’re using is specified above in the output method. Again, this is python’s equivalent to mkdir -p, so we don’t need to worry about an exception if the directory already exists.
Next, on line 90, we’re doing some string manipulation. The xml files in the directory containing nmap results all conform to the same pattern: nmap.TGT-PROTOCOL.xml. We want to grab the TGT portion of the filename to reuse it for naming our files for this Task. First, we grab the filename without its extension (entry.stem) and simply do a .replace() on all of the parts of the string we no longer want. Whether .replace() finds something to replace or not, it returns a string (either altered or unaltered, as appropriate). Knowing that, we can chain .replace() calls together without worrying about whether any of the substrings exist.
Finally, on line 92, we create the output file where we’ll store our results and write what we saw on STDERR to that file. Hopefully the only potentially confusing part of this line is entry.stem[-3:]. Because entry.stem has already dropped the .xml extension, that snippet simply grabs the last three characters of what remains, which is the protocol (tcp or udp). Our final naming convention resembles some of the earlier Tasks we created and can be seen below.
htb-targets-searchsploit-results/searchsploit.10.10.10.154-tcp.txt
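The filename handling is easy to try on its own. The sketch below walks one sample path (made up for the example) through the same steps: take the stem, slice off the protocol, strip the pieces we don’t want, and assemble the output name.

```python
from pathlib import Path

# sample path following the nmap.TGT-PROTOCOL.xml pattern
entry = Path("htb-targets-nmap-results/nmap.10.10.10.154-tcp.xml")

stem = entry.stem        # filename without the .xml extension
protocol = stem[-3:]     # last three characters: tcp or udp

# chained replace() calls; each returns the string unchanged if nothing matches
target = stem.replace("nmap.", "").replace("-tcp", "").replace("-udp", "")

outfile = f"htb-targets-searchsploit-results/searchsploit.{target}-{protocol}.txt"
print(stem)
print(outfile)
```

The slice works because both protocol names happen to be three characters long; a different naming scheme would need something sturdier, such as splitting on the last hyphen.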
Our finalized code can be seen below in all its commented and docstrung glory!
@inherits(ThreadedNmap)
class Searchsploit(luigi.Task):
    """ Run searchsploit against each nmap*.xml file in the TARGET-nmap-results directory and write results to disk.

    searchsploit commands are structured like the example below.

    searchsploit --nmap htb-targets-nmap-results/nmap.10.10.10.155-tcp.xml

    The corresponding luigi command is shown below.

    PYTHONPATH=$(pwd) luigi --local-scheduler --module recon.nmap Searchsploit --target-file htb-targets --top-ports 5000

    Args:
        threads: number of threads for parallel nmap command execution *--* Required by upstream Task
        rate: desired rate for transmitting packets (packets per second) *--* Required by upstream Task
        interface: use the named raw network interface, such as "eth0" *--* Required by upstream Task
        top_ports: Scan top N most popular ports *--* Required by upstream Task
        ports: specifies the port(s) to be scanned *--* Required by upstream Task
        target_file: specifies the file on disk containing a list of ips or domains *--* Required by upstream Task
    """

    def requires(self):
        """ Searchsploit depends on ThreadedNmap to run.

        TargetList expects target_file as a parameter.
        Masscan expects rate, target_file, interface, and either ports or top_ports as parameters.
        ThreadedNmap expects threads

        Returns:
            luigi.Task - ThreadedNmap
        """
        args = {
            "rate": self.rate,
            "ports": self.ports,
            "threads": self.threads,
            "top_ports": self.top_ports,
            "interface": self.interface,
            "target_file": self.target_file,
        }
        return ThreadedNmap(**args)

    def output(self):
        """ Returns the target output for this task.

        Naming convention for the output folder is TARGET_FILE-searchsploit-results.

        The output folder will be populated with all of the output files generated by
        any searchsploit commands run.

        Returns:
            luigi.local_target.LocalTarget
        """
        return luigi.LocalTarget(f"{self.target_file}-searchsploit-results")

    def run(self):
        """ Grabs the xml files created by ThreadedNmap and runs searchsploit --nmap on each one, saving the output. """
        for entry in Path(self.input().path).glob("nmap*.xml"):
            proc = subprocess.run(["searchsploit", "--nmap", str(entry)], stderr=subprocess.PIPE)
            if proc.stderr:
                Path(self.output().path).mkdir(parents=True, exist_ok=True)

                # grab the target specifier out of TGT-nmap-results/nmap.10.10.10.157-tcp -> i.e. 10.10.10.157
                target = entry.stem.replace("nmap.", "").replace("-tcp", "").replace("-udp", "")

                Path(
                    f"{self.output().path}/searchsploit.{target}-{entry.stem[-3:]}.txt"
                ).write_bytes(proc.stderr)
Before we finish up, we can test out our new addition to the pipeline with the following command.
PYTHONPATH=$(pwd) luigi --local-scheduler --module recon.nmap Searchsploit --target-file htb-targets --top-ports 1000
That wraps things up for this post. In the next installment, we’ll incorporate subdomain enumeration into our pipeline!