I was reading “Does $_SESSION[‘username’] need to be escaped before getting into an SQL query?” and it said “You need to escape every string you pass to the sql query, regardless of its origin”. Now I know something like this is really basic. A Google search turned up over 20,000 results. Stack Overflow alone had 20 pages of results, but no one actually explains what escaping a string is or how to do it. It is just assumed. Can you help me? I want to learn because, as always, I am making a web app in PHP.
I have looked at:
Inserting Escape Characters,
What are all the escape characters in Java?,
Cant escape a string with addcslashes(),
what does mysql_real_escape_string() really do?,
How can i escape double quotes from a string in php?,
MySQL_real_escape_string not adding slashes?,
remove escape sequences from string in php
I could go on, but I am sure you get the point. This is not laziness.
Escaping a string means reducing ambiguity in the quotes (and other characters) used in that string. For instance, when you’re defining a string, you typically surround it with either double quotes or single quotes.
But what if my string had double quotes within it, as in "Hello "World""?
Now I have ambiguity: the interpreter doesn’t know where my string ends. If I want to keep my double quotes, I have a couple of options. I could use single quotes around my string.
Or I can escape my quotes by putting a backslash in front of each one.
Any quote that is preceded by a backslash is escaped, and understood to be part of the value of the string.
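As a minimal sketch of the two options just described (the sample sentence is made up for illustration), the following shows that single-quoting and backslash-escaping produce the same string value in PHP:

```php
<?php
// Option 1: single quotes outside, so the inner double quotes are ordinary characters.
$single = 'She said "hello" to me.';

// Option 2: double quotes outside, with each inner double quote escaped by a backslash.
$escaped = "She said \"hello\" to me.";

// Both spellings describe exactly the same string value.
var_dump($single === $escaped); // bool(true)

// The backslashes only guide the parser; they are not stored in the value.
echo $escaped, PHP_EOL; // She said "hello" to me.
```

Which option to pick is mostly a matter of readability: single quotes avoid the backslashes, while double quotes are needed if the string should also interpolate variables.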
When it comes to queries, MySQL has certain keywords it watches for that we cannot use in our queries without causing some confusion. Suppose we had a table of values where a column was named “Select”, and we wanted to select that:
SELECT select FROM myTable
We’ve now introduced some ambiguity into our query. Within our query, we can reduce that ambiguity by using back-ticks:
SELECT `select` FROM myTable
This removes the confusion we’ve introduced by using poor judgment in selecting field names.
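If identifiers ever have to be quoted from PHP code, the back-tick rule can be sketched roughly as below. The helper name quote_identifier is hypothetical (it is not part of PHP or MySQL), and the sketch assumes MySQL's convention that a back-tick inside a quoted identifier is written as two back-ticks:

```php
<?php
// Hypothetical helper: wrap an identifier in back-ticks and double any
// back-tick it contains, so the identifier cannot end the quoting early.
function quote_identifier(string $name): string
{
    return '`' . str_replace('`', '``', $name) . '`';
}

$query = 'SELECT ' . quote_identifier('select') . ' FROM ' . quote_identifier('myTable');

echo $query, PHP_EOL; // SELECT `select` FROM `myTable`
```

In practice it is usually simpler to avoid reserved words as column names, so that no quoting is needed at all.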
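A lot of this can be handled for you by simply passing your values through mysql_real_escape_string(). (Note that the mysql_* extension was removed in PHP 7.0; mysqli_real_escape_string() is its counterpart in the mysqli extension.) In the example below you can see that we’re passing user-submitted data through this function to ensure it won’t cause any problems for our query: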
// Query
$query = sprintf("SELECT * FROM users WHERE user='%s' AND password='%s'",
    mysql_real_escape_string($user),
    mysql_real_escape_string($password));
Other methods exist for escaping strings, such as quotemeta, and more, though you’ll find that when the goal is to run a safe query, by and large developers prefer mysql_real_escape_string or pg_escape_string (in the context of PostgreSQL).
Some characters have special meaning to the SQL database you are using. When these characters are being used in a query they can cause unexpected and/or unintended behavior including allowing an attacker to compromise your database. To prevent these characters from affecting a query in this way they need to be escaped, or to say it a different way, the database needs to be told to not treat them as special characters in this query.
In the case of mysql_real_escape_string(), it escapes \x00, \n, \r, \, ', " and \x1a, as these, when not escaped, can cause the previously mentioned problems, which include SQL injection against a MySQL database.
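To make the effect concrete, here is a rough illustration of what escaping does to a hostile value. The function illustrate_escape is a hypothetical stand-in written only for this example; it assumes the character list given in the PHP manual for mysql_real_escape_string() (\x00, \n, \r, \, ', " and \x1a), ignores the connection character set, and is not a substitute for the driver's own escaping or for prepared statements:

```php
<?php
// Illustration only (hypothetical helper): prefix each special character
// with a backslash, in the way the MySQL escaping functions are documented to do.
function illustrate_escape(string $value): string
{
    return strtr($value, [
        "\x00" => '\\0',   // NUL byte
        "\n"   => '\\n',   // newline
        "\r"   => '\\r',   // carriage return
        '\\'   => '\\\\',  // backslash itself
        "'"    => "\\'",   // single quote
        '"'    => '\\"',   // double quote
        "\x1a" => '\\Z',   // Control-Z
    ]);
}

// Hostile input that tries to close the quoted value early.
$user = "x' OR '1'='1";

$unsafe = sprintf("SELECT * FROM users WHERE user='%s'", $user);
$safe   = sprintf("SELECT * FROM users WHERE user='%s'", illustrate_escape($user));

echo $unsafe, PHP_EOL; // SELECT * FROM users WHERE user='x' OR '1'='1'
echo $safe, PHP_EOL;   // SELECT * FROM users WHERE user='x\' OR \'1\'=\'1'
```

In the first query the attacker's quotes end the string and add a condition that is always true; in the second, the escaped quotes stay inside the string value.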
For simplicity, you could basically imagine the backslash “\” to be a command to the interpreter during runtime.
For example, while interpreting this statement:
$txt = "Hello world!";
during the lexical analysis phase (when the statement is split into individual tokens), the interpreter identifies the variable name, the assignment operator, the quoted string, and the closing semicolon.
However, the backslash within the string will cause an extra set of tokens and is interpreted as a command to do something with the character that immediately follows it:
$txt = "this \" is escaped";
results in an extra \ token. The interpreter already knows (or has preset routes it can take) what to do based on the character that follows the \ token. So in the case of \" it proceeds to treat the quote as a character and not as the end-of-string command.