Could you provide a regex that matches Twitter usernames?
Extra bonus if a Python example is provided.
Answers:
Method 1
(?<=^|(?<=[^a-zA-Z0-9-_.]))@([A-Za-z]+[A-Za-z0-9-_]+)
I’ve used this as it disregards emails.
Here is a sample tweet:
@Hello how are @you doing @my_friend, email @000 me @ [email protected] @shahmirj
Matches:
- @Hello
- @you
- @my_friend
- @shahmirj
It will also work for hashtags, I use the same expression with the @ changed to #.
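Since the question asks for Python, here is a minimal sketch of this expression run through `re.findall`. The sample string is the tweet above with a placeholder address (someone@example.com, an assumption) in the email slot, and the names `MENTION_RE` and `find_mentions` are made up for illustration.

```python
import re

# Method 1's expression, unchanged. The lookbehind only accepts an @ that
# sits at the start of the string or right after a character other than a
# letter, digit, hyphen, underscore or dot, which is what skips addresses.
MENTION_RE = re.compile(r'(?<=^|(?<=[^a-zA-Z0-9-_.]))@([A-Za-z]+[A-Za-z0-9-_]+)')


def find_mentions(text):
    """Return the captured usernames, without the leading @."""
    return MENTION_RE.findall(text)


# someone@example.com is a placeholder address, not from the original answer.
sample = ("@Hello how are @you doing @my_friend, email @000 me @ "
          "someone@example.com @shahmirj")
print(find_mentions(sample))  # ['Hello', 'you', 'my_friend', 'shahmirj']
```

Note that the pattern needs at least two characters after the @ and a leading letter, so a one-letter handle such as @a is skipped, as is @000.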
Method 2
If you’re talking about the @username thing they use on twitter, then you can use this:
import re
twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')
To make every instance an HTML link, you could do something like this:
my_html_str = twitter_username_re.sub(lambda m: '<a href="http://twitter.com/%s" rel="nofollow noreferrer noopener">%s</a>' % (m.group(1), m.group(0)), my_tweet)
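One thing to keep in mind with this pattern: it does not look at what comes before the @, so the domain part of an email address is picked up as a username too. A quick sketch, with made-up sample text:

```python
import re

twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')

# Hypothetical sample text: one real mention and one email address.
sample = 'ping @alice or mail bob@example.com'

# findall returns the captured group, i.e. the name without the @.
print(twitter_username_re.findall(sample))  # ['alice', 'example']
```

If the input can contain addresses, one of the patterns that checks the preceding character is probably a safer choice.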
Method 3
The regex I use, which has been tested in multiple contexts:
/(^|[^@\w])@(\w{1,15})\b/
This is the cleanest way I’ve found to test and replace Twitter usernames in strings.
#!/usr/bin/python
import re
text = "@RayFranco is answering to @jjconti, this is a real '@username83' but this is an@email.com, and this is a @probablyfaketwitterusername"
# Raw strings keep \w, \b and the \1, \2 backreferences intact
ftext = re.sub(r'(^|[^@\w])@(\w{1,15})\b', r'\1<a href="http://twitter.com/\2" rel="nofollow noreferrer noopener">\2</a>', text)
print(ftext)
This returns, as expected:
<a href="http://twitter.com/RayFranco" rel="nofollow noreferrer noopener">RayFranco</a> is answering to <a href="http://twitter.com/jjconti" rel="nofollow noreferrer noopener">jjconti</a>, this is a real '<a href="http://twitter.com/username83" rel="nofollow noreferrer noopener">username83</a>' but this is an@email.com, and this is a @probablyfaketwitterusername
Based on Twitter specs:
Your username cannot be longer than 15 characters. Your real name can be longer (20 characters), but usernames are kept shorter for the sake of ease.
A username can only contain alphanumeric characters (letters A-Z, numbers 0-9) with the exception of underscores, as noted above. Check to make sure your desired username doesn’t contain any symbols, dashes, or spaces.
Method 4
Twitter recently open-sourced implementations, in various languages including Java, Ruby (gem) and JavaScript, of the code they use for finding usernames, hashtags, lists and URLs.
It is very regular expression oriented.
Method 5
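The only characters accepted in the form are A-Z, 0-9, and underscore. Usernames are not case-sensitive, though, so you could use r'(?i)@[a-z0-9_]+' to match everything correctly and also discern between users. (The (?i) flag has to come at the very start of the pattern; recent Python versions reject it anywhere else.)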
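Because usernames are not case-sensitive, @Alice and @alice point at the same account. A small sketch of one way to handle that: match with `re.IGNORECASE` and lowercase the result before comparing. The names `MENTION_RE` and `unique_mentions` are made up for the example.

```python
import re

# Same character set as above; IGNORECASE stands in for the inline (?i) flag.
MENTION_RE = re.compile(r'@([a-z0-9_]+)', re.IGNORECASE)


def unique_mentions(text):
    """Return mentioned usernames, lowercased, in order of first appearance."""
    names = (name.lower() for name in MENTION_RE.findall(text))
    # dict.fromkeys drops duplicates while keeping insertion order.
    return list(dict.fromkeys(names))


print(unique_mentions('@Alice, @alice and @BOB_1'))  # ['alice', 'bob_1']
```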
Method 6
This is a method I have used in a project that takes the text attribute of a tweet object and returns the text with both the hashtags and user_mentions linked to their appropriate pages on Twitter, complying with the most recent Twitter display guidelines.
import re

def link_tweet(tweet):
    """
    This method takes the text attribute from a tweet object and returns it
    with user_mentions and hashtags linked
    """
    # \A matches only at the very start of the text, \s any whitespace before the @ or #
    tweet = re.sub(r'(\A|\s)@(\w+)', r'\1@<a href="http://www.twitter.com/\2" rel="nofollow noreferrer noopener">\2</a>', str(tweet))
    return re.sub(r'(\A|\s)#(\w+)', r'\1#<a href="http://search.twitter.com/search?q=%23\2" rel="nofollow noreferrer noopener">\2</a>', str(tweet))
When you call this method, pass in my_tweet[x].text as the parameter. Hope this is helpful.
Method 7
Shorter, /@([\w]+)/ works fine.
Method 8
This regex seems to solve Twitter usernames:
^@[A-Za-z0-9_]{1,15}$
Max 15 characters, allows underscores directly after the @, (which Twitter does), and allows all underscores (which, after a quick search, I found that Twitter apparently also does). Excludes email addresses.
Method 9
In case you need to match all the handle, @handle and twitter.com/handle formats, this is a variation:
import re
match = re.search(r'^(?:.*twitter\.com/|@?)(\w{1,15})(?:$|/.*$)', text)
handle = match.group(1)
Explanation, examples and working regex here:
https://regex101.com/r/7KbhqA/3
Matched
- myhandle
- @myhandle
- @my_handle_2
- twitter.com/myhandle
- https://twitter.com/MyHandle
- https://twitter.com/myhandle/randomstuff
Not matched
- mysuperhandleistoolong
- @mysuperhandleistoolong
- https://twitter.com/mysuperhandleistoolong
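`re.search` returns None when nothing matches, so calling `.group(1)` on the result directly raises AttributeError for inputs like the ones under "Not matched". A minimal sketch of a None-safe wrapper follows; `extract_handle` and `HANDLE_RE` are made-up names, and the dot in twitter.com is escaped here so that it only matches a literal dot.

```python
import re

# Accepts a bare handle, an @handle, or anything ending in twitter.com/handle
# (optionally followed by a further path). At most 15 word characters.
HANDLE_RE = re.compile(r'^(?:.*twitter\.com/|@?)(\w{1,15})(?:$|/.*$)')


def extract_handle(text):
    """Return the handle found in text, or None if there is no match."""
    match = HANDLE_RE.search(text)
    return match.group(1) if match else None


print(extract_handle('@my_handle_2'))                              # my_handle_2
print(extract_handle('https://twitter.com/myhandle/randomstuff'))  # myhandle
print(extract_handle('@mysuperhandleistoolong'))                   # None
```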
Method 10
You can use the following regex: ^@[A-Za-z0-9_]{1,15}$
In python:
import re
pattern = re.compile('^@[A-Za-z0-9_]{1,15}$')
pattern.match('@Your_handle')
This will check if the string exactly matches the regex.
In a ‘practical’ setting, you could use it as follows:
pattern = re.compile('^@[A-Za-z0-9_]{1,15}$')
if pattern.match('@Your_handle'):
    print('Match')
else:
    print('No Match')
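One caveat with `match` plus `$`: in Python, `$` also matches just before a trailing newline, so '@Your_handle\n' passes the check above. If that matters for your input, `re.fullmatch` is stricter. A short sketch (`is_handle` and `HANDLE_RE` are made-up names):

```python
import re

# No ^ or $ needed: fullmatch only succeeds if the whole string matches.
HANDLE_RE = re.compile(r'@[A-Za-z0-9_]{1,15}')


def is_handle(s):
    """True only if s is exactly an @ followed by 1 to 15 allowed characters."""
    return HANDLE_RE.fullmatch(s) is not None


print(is_handle('@Your_handle'))    # True
print(is_handle('@Your_handle\n'))  # False
# For comparison, match() with $ lets the trailing newline through:
print(re.match(r'^@[A-Za-z0-9_]{1,15}$', '@Your_handle\n') is not None)  # True
```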
Method 11
I have used the existing answers and modified them for my use case. (username must be longer than 4 characters)
^[A-Za-z0-9_]{5,15}$
Rules:
- Your username must be longer than 4 characters.
- Your username must be no longer than 15 characters.
- Your username can only contain letters, numbers and ‘_’.
Source: https://help.twitter.com/en/managing-your-account/twitter-username-rules
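A rough sketch of these rules as a Python check. The letter ranges are spelled out as A-Za-z, because a single A-z range would also admit the ASCII characters that sit between Z and a, such as [ and ^. The names `USERNAME_RE` and `is_valid_username` are made up for the example.

```python
import re

# 5 to 15 characters, letters, digits and underscore only.
USERNAME_RE = re.compile(r'[A-Za-z0-9_]{5,15}')


def is_valid_username(name):
    """True if name satisfies the length and character rules above."""
    return USERNAME_RE.fullmatch(name) is not None


print(is_valid_username('jack_'))    # True
print(is_valid_username('jack'))     # False (only 4 characters)
print(is_valid_username('a' * 16))   # False (16 characters)
print(is_valid_username('ab[cd'))    # False ([ is not allowed)
# The looser A-z range lets the bracket through:
print(re.fullmatch(r'[A-z0-9_]{5,15}', 'ab[cd') is not None)  # True
```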
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.