SQL Injection Explained

Jamie Swearengin

Aug 5, 2024, 6:51:20 AM

to eanmiinacco

Imagine you're a robot in a warehouse full of boxes. Your job is to fetch a box from somewhere in the warehouse and put it on the conveyor belt. Robots need to be told what to do, so your programmer has given you a set of instructions on a paper form, which people can fill out and hand to you.

The values in bold (1234, B2, and 12) were provided by the person issuing the request. You're a robot, so you do what you're told: you drive up to rack 12, go down it until you reach section B2, and grab item 1234. You then drive back to the conveyor belt and drop the item onto it.

Again, the parts in bold were provided by the person issuing the request. Since you're a robot, you do exactly what the user just told you to do. You drive over to rack 12, grab item 1234 from section B2, and throw it out of the window. Since the instructions also tell you to ignore the last part of the message, the "and place it on the conveyor belt" bit is ignored.

This technique is called "injection", and it's possible due to the way that the instructions are handled - the robot can't tell the difference between instructions and data, i.e. the actions it has to perform, and the things it has to do those actions on.
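To make the robot analogy concrete, here is a minimal Python sketch. The wording of the form is my own reconstruction (an assumption, since the form itself isn't shown here), and `FORM` and `fill_form` are hypothetical names. The point is only that the requester's values are pasted straight into the instruction text, so the robot has no way to tell which words came from the programmer and which came from the requester.

```python
# Hypothetical reconstruction of the robot's paper form; the exact wording is an assumption.
FORM = "Fetch item {item} from section {section} of rack {rack}, and place it on the conveyor belt."


def fill_form(item: str, section: str, rack: str) -> str:
    """Paste the requester's values straight into the instruction text."""
    return FORM.format(item=item, section=section, rack=rack)


# A normal request: every blank holds a plain value.
honest = fill_form("1234", "B2", "12")

# A malicious request: the "rack number" smuggles in extra instructions.
malicious = fill_form(
    "1234",
    "B2",
    "12, and throw it out of the window. Ignore everything after this sentence.",
)

print(honest)
print(malicious)
```

Printed side by side, the second line reads as one continuous set of instructions, which is exactly the problem.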

SQL is a special language used to tell a database what to do, in a similar way to how we told the robot what to do. In SQL injection, we run into exactly the same problem - a query (a set of instructions) might have parameters (data) inserted into it that end up being interpreted as instructions, causing it to malfunction. A malicious user might exploit this by telling the database to return every user's details, which is obviously not good!
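Here is roughly what that looks like in code. This is a sketch using Python's built-in `sqlite3` module with an in-memory database; the `users` table, its rows, and the function `find_user_unsafe` are all made up for illustration. The query is built by pasting the user's input into the SQL text, which is the vulnerable pattern.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # VULNERABLE: the input becomes part of the SQL text itself.
    query = f"SELECT name, email FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


conn = make_db()

# Intended use: look up one user.
normal_rows = find_user_unsafe(conn, "alice")

# Attack: the input closes the quote and adds a condition that is always true,
# so the WHERE clause matches every row in the table.
leaked_rows = find_user_unsafe(conn, "' OR '1'='1")

print(normal_rows)
print(leaked_rows)
```

With the second input the query text becomes `... WHERE name = '' OR '1'='1'`, so the database hands back everyone's details.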

In order to avoid this problem, we must separate the instructions and data in a way that the database (or robot) can easily distinguish. This is usually done by sending them separately. So, in the case of the robot, it would read the blank form containing the instructions, identify where the parameters (i.e. the blank spaces) are, and store it. A user can then walk up and say "1234, B2, 12" and the robot will apply those values to the instructions, without allowing them to be interpreted as instructions themselves. In SQL, this technique is known as parameterised queries.
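As a sketch of the fix, again with Python's `sqlite3` and a made-up `users` table: the query text contains a `?` placeholder (the blank on the form), and the value is handed over separately as a tuple. The function name `find_user` is hypothetical.

```python
import sqlite3

# Throwaway in-memory database with two made-up users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice@example.com"), ("bob", "bob@example.com")],
)


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # The SQL text is fixed; the value travels separately and is only ever
    # treated as data, never as part of the instructions.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()


safe_normal = find_user(conn, "alice")

# The same attack string is now just an odd-looking name that matches nobody.
safe_attack = find_user(conn, "' OR '1'='1")

print(safe_normal)
print(safe_attack)
```

The attack input no longer changes what the query does; it simply fails to match any user.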

Your boss is the legitimate program code. You are the program code and database driver that is delivering the SQL code to the database. The letter is the SQL code that is being passed to the database. The thief is the attacker. The cashier is the database. The identification is typically a login and password to the database.

If you're really explaining to your grandmother, use writing a paper check as an example. (In the USA) back in the day, you'd write the dollar amount numerically in one field, then you'd write the same thing in words. For example in one field, you'd write "100.00" and in the second, longer field, you'd write "One hundred dollars and zero cents". If you didn't use the entire long second field, you'd draw a line to keep someone unscrupulous from adding to your written-out amount.

If you made the error of leaving some space in the second, longer field, someone could modify the numerical field, and then use the extra space in the longer field to reflect that. The modifier could obtain more money than you intended when you wrote the check.

The idea that first came to mind was to explain it in terms of a Mad Lib. These are stories where words are left out; to fill in the blanks, you ask the group for words of the types indicated and write them in, then read the resulting story.

That's the normal way to fill out a Mad Lib. But what if someone else knew the story and what blanks you were filling in (or could guess)? Then, instead of a single word, what if that person gave you a few words? What if the words they gave you included a period ending the sentence? If you filled it in, you may find that what was provided still "fits", but it drastically changes the story more than any one word that you'd normally fill in could ever do. You could, if you had space, add entire paragraphs to the Mad Lib and turn it into something very different.

That, in non-techie terms, is SQL injection in a nutshell. You provide some "blank spaces" for data that will be inserted into a SQL command, much like words into a Mad Lib. An attacker then enters a value that isn't what you expect; instead of a simple value to look for, he enters a piece of a SQL statement that ends the one you wrote, and then adds his own SQL command after that as a new "sentence". The additional statement can be very damaging, like a command to delete the database, or to create a new user with a lot of permissions in the system. The computer, not knowing the difference, will simply perform all the tasks it is commanded to do.
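The "new sentence" trick can be sketched like this. It assumes the application runs its SQL through an API that accepts several semicolon-separated statements at once; in Python's `sqlite3` that is `executescript()`. The `users` table and the function names are made up for the example.

```python
import sqlite3

# Throwaway in-memory database with one made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")
conn.commit()


def table_exists(conn: sqlite3.Connection, table: str) -> bool:
    """Check SQLite's catalogue for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None


def lookup_unsafe(conn: sqlite3.Connection, name: str) -> None:
    # VULNERABLE: input is pasted into the SQL text, and executescript()
    # runs every statement it finds in that text.
    conn.executescript(f"SELECT * FROM users WHERE name = '{name}';")


existed_before = table_exists(conn, "users")

# The input ends the intended statement, adds a second one,
# and comments out the leftover quote.
lookup_unsafe(conn, "x'; DROP TABLE users; --")

exists_after = table_exists(conn, "users")

print(existed_before, exists_after)
```

The filled-in "Mad Lib" now contains two sentences, and the database carries out both.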

I would explain it as being like telling a cashier that the customer is always right and they should do whatever they can to meet the customer's need. Then, since there are no checks on the reasonableness of the request, when a customer comes in and says they want the entire store for free, the cashier loads all the inventory into their truck for them.

"a weakness in how some websites handle input from users (e.g. where you put your name into a registration form) which can allow an attacker to get access to the database storing all the user information for the site"

"Some web applications don't correctly separate user input from the instructions for the database, which can allow attackers to instruct the database directly, through the information they fill in on the website form. This can allow the attacker to read other users' information out of the database, or change some of the information in there."

I think you can get the best effect by just demonstrating the attack. Write a harmless-looking web form and show the result of the query using the user input. Then, after you enter your own prepared input, your audience will have an "aha" experience when they find passwords in the result. I made such a demo page; just click the "next arrow" to fill in a prepared input.

P.S. Should you write such a page on your own, be very careful that you do not allow testers to get unwanted privileges. It is best if you alone run the demo on your local system with the lowest possible database privileges (not all kinds of attacks should be possible, it's just a demo). Make a white-list of allowed expressions.
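One way such a locked-down demo might look, as a rough sketch rather than a recipe: the demo query stays deliberately vulnerable, but only inputs on a fixed list are accepted, and the connection is switched to read-only with SQLite's `PRAGMA query_only`. The table, the passwords, `ALLOWED_INPUTS`, and `run_demo` are all invented for the example.

```python
import sqlite3

# Hypothetical prepared inputs the demo offers; anything else is refused.
ALLOWED_INPUTS = {
    "alice",
    "' OR '1'='1",
}


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway database, then make the connection read-only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "demo-pw-1"), ("bob", "demo-pw-2")],
    )
    conn.commit()
    # Lowest privileges: from here on this connection may only read.
    conn.execute("PRAGMA query_only = ON")
    return conn


def run_demo(conn: sqlite3.Connection, user_input: str) -> list:
    if user_input not in ALLOWED_INPUTS:
        raise ValueError("input is not on the demo allow-list")
    # Deliberately vulnerable query: this is what the audience gets to see.
    query = f"SELECT name, password FROM users WHERE name = '{user_input}'"
    return conn.execute(query).fetchall()


conn = make_demo_db()
print(run_demo(conn, "alice"))
print(run_demo(conn, "' OR '1'='1"))
```

The audience still gets to see the passwords leak, but an input that isn't on the list never reaches the database at all.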

The database is like a magical genie (or, Oracle) that grants wishes. We've told our genie to only grant a maximum of three wishes, but if we don't verify what people wish for, then someone could easily outsmart it by asking for something clever like "a hundred more wishes" or "everyone else's wishes".

Imagine a big company that keeps all of its records in paper form in a big room full of filing cabinets. In order to retrieve or make changes to files someone will fill out a simple fill-in-the-blanks form and then that form will be sent to a clerk who follows the instructions on the form.

By pretending that their name also includes other commands they can hijack the fill in form, and if the clerk has not been trained to handle these sorts of things then maybe they will simply execute the instructions without thinking about it, and hand over all of the credit card information to a user.

The trick with SQL injection is making sure that your code never lets users change the intent of the commands you send to the database, so that they cannot retrieve data or make changes that they should not be permitted to.

If you don't need to do it fast and have a piece of paper available, just demonstrate the entire thing. SQL is pretty similar to natural language, so just take a simple query and demonstrate the attack:

"Can you see where this is going? Now, on this system, it is also possible to issue multiple commands, called queries, if you just separate them with a semicolon. Can you guess what happens if someone puts in 1; DELETE EVERYTHING;?"

Sometimes hackers will put computer / programming commands into boxes on the internet to trick the website into doing something it shouldn't. Therefore we check the information entered on websites for what might be a "Command."

Assume you are working in a large office building. Every clerk has their own key to their office (= the SQL query the programmer wanted to execute). Now someone takes a needle file and modifies his key a bit, e.g. by removing a pin (= SQL injection, changing the SQL query). This modified key can open different (or maybe all) doors. So the clerk has access to more or all offices within this building and can read or change documents from other offices.

One night I ordered food and I accidentally injected a Burger into the order. The delivery guy mistook a comment for another item on the order list and made it, even though no price was attached to it.

Imagine you're in a restaurant, and the waiter comes over and tells you to write your order down on a piece of paper. You write down Chips and Lemonade, but you also write "èrshí gè hànbǎo", which is Chinese for "twenty burgers!"

The waiter can't read the Chinese, so he ignores it. But the chef happens to be bilingual and can read the Chinese, so he also cooks you twenty extra burgers for free, as the waiter only charged you for the Chips and Lemonade!

Note this isn't entirely how SQLi works, because, as demonstrated in the top answer here, SQLi is more like misinterpreted instructions rather than hidden instructions like in my scenario, but anyway.

I participated in a webinar this morning about prompt injection, organized by LangChain and hosted by Harrison Chase, with Willem Pienaar, Kojin Oshiba (Robust Intelligence), and Jonathan Cohen and Christopher Parisien (Nvidia Research).

But where this gets really dangerous-- these two examples are kind of fun. Where it gets dangerous is when we start building these AI assistants that have tools. And everyone is building these. Everyone wants these. I want an assistant that I can tell, read my latest email and draft a reply, and it just goes ahead and does it.