What characters are allowed as a shortcode tag and how should they be sanitized?

I am working on a plugin where users can define shortcode tags themselves. What would you suggest allowing there? My thought is to allow only ASCII characters.

So how do I sanitize the input? strip_tags and then a regex that allows only a-z and 0-9, or is there a better solution? Does WordPress have a filter for that? Could I maybe use the filter WordPress uses for slugs?

Thanks for the answers. I will just do this: since there must be at least one ASCII character anyway, I will simply require three.

foreach ( $shortcodes as $key => $var ) {
    // Strip away everything except a-z, 0-9 and underscore.
    $var = preg_replace( '/[^a-z0-9_]/', '', $var );

    // If fewer than 3 characters remain AFTER the strip, don't save.
    if ( strlen( $var ) < 3 ) {
        continue;
    }
    // ... otherwise save $var ...
}

Answers:

Method 1

You can use almost any character. Only the character / is dangerous; do not allow it. WordPress uses preg_quote to escape the shortcode name, but it does not pass its own regex delimiter / when doing so. The shortcode name is therefore not properly escaped, and you get a PHP warning.

Besides that, there are just two basic rules for a shortcode name:

  1. It should be at least two characters long.
  2. It should contain at least one US-ASCII character (a-z0-9).

So this works:

foreach ( array ( '.-o', ']b', 'äoß', 'o"o', "o'o", '❤m' ) as $shortcode )
{
    add_shortcode( $shortcode, 't5_crazy_shortcode_handler' );
}

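// $shortcode gets a default so that no required parameter follows optional ones.
function t5_crazy_shortcode_handler( $attrs = array(), $content = NULL, $shortcode = '' )
{
    // Escaped \$ prints the variable names literally as labels.
    return "<pre>\$shortcode: $shortcode\n\n\$attrs\n"
        . htmlspecialchars( print_r( $attrs, TRUE ) )
        . "\n\n\$content"
        . htmlspecialchars( print_r( $content, TRUE ) )
        . '</pre>';
}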
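
As a rough illustration, the rules from this answer could be checked before a user-defined tag is registered. This is only a sketch: the function name myplugin_is_valid_shortcode_tag is hypothetical, it is not part of WordPress, and it encodes nothing more than the three rules stated above (no /, at least two characters, at least one of a-z0-9). Your WordPress version may reject additional characters, so treat it as a starting point.

```php
<?php
// Hypothetical helper (not a WordPress function): checks the rules from
// this answer and nothing more.
function myplugin_is_valid_shortcode_tag( string $tag ): bool
{
    // Rule 0: the regex delimiter "/" must never appear in the tag.
    if ( strpos( $tag, '/' ) !== false ) {
        return false;
    }

    // Rule 1: at least two characters. The "u" flag counts UTF-8
    // characters rather than bytes; invalid UTF-8 fails the match.
    if ( preg_match( '/^.{2,}$/us', $tag ) !== 1 ) {
        return false;
    }

    // Rule 2: at least one US-ASCII character from a-z0-9.
    return preg_match( '/[a-z0-9]/', $tag ) === 1;
}
```

You would call it on each user-supplied tag and skip add_shortcode() whenever it returns false.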

Method 2

It seems WordPress has some issues with shortcode tags that contain hyphens, so you probably want to avoid them. I am unsure whether this is still an issue in WP 3.3.x.

Most of the 'sanitize' functions in WP's wp-includes/formatting.php file (like sanitize_title) do something similar to what you need, but they allow hyphens. If you want only alphanumeric characters and no hyphens, you are probably better off writing a function that takes a string and uses preg_replace to remove spaces and keep only alphanumeric characters. You could replace the spaces with underscores, since the Shortcode API does not appear to have issues with those in shortcode tags.
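
A minimal sketch of such a function might look like the following. The name myplugin_sanitize_shortcode_tag is made up for illustration and is not a WordPress API. Lowercasing the input and the three-character minimum are assumptions borrowed from the asker's own snippet, not requirements of the Shortcode API.

```php
<?php
// Hypothetical helper (not part of WordPress): reduces user input to a
// conservative shortcode tag, or returns null if too little is left.
function myplugin_sanitize_shortcode_tag( string $raw ): ?string
{
    // Normalize case and drop surrounding whitespace.
    $tag = strtolower( trim( $raw ) );

    // Replace each run of whitespace with a single underscore.
    $tag = preg_replace( '/\s+/', '_', $tag );

    // Keep only a-z, 0-9 and underscore; hyphens and "/" are removed.
    $tag = preg_replace( '/[^a-z0-9_]/', '', $tag );

    // Assumed minimum length of three, checked AFTER stripping.
    return strlen( $tag ) >= 3 ? $tag : null;
}
```

With this sketch, an input such as "My Gallery" becomes "my_gallery", while an input that shrinks below three characters is rejected instead of being saved.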


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
