Get URL from a text

Possible Duplicate:
regex for URL including query string

I have a text message like this:

Hey! try this http://www.test.com/test.aspx?id=53

Our requirement is to extract the link from the text. We are using the following code:

List<string> list = new List<string>();
Regex urlRx = new
Regex(@"(?<url>(http:|https:[/][/]|www.)([a-z]|[A-Z]|[0-9]|[/.]|[~])*)",
RegexOptions.IgnoreCase);

MatchCollection matches = urlRx.Matches(message);
foreach (Match match in matches)
{
   list.Add(match.Value);
}
return list;

It returns the URL, but not the complete one. The output of the code is:

http://www.test.com/test.aspx

But we need the complete URL, like:

http://www.test.com/test.aspx?id=53

Please suggest how to resolve this issue. Thanks in advance.

Answers:


Method 1

Try this regex; it also captures the query string:

(http|ftp|https)://([\w+?\.\w+])+([a-zA-Z0-9~!@#$%^&*()_\-=+\\/?.:;',]*)?

You can test it on gskinner
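
As a rough sketch, the pattern above could be applied from C# as follows. The escaping inside the character classes is an assumption (backslashes are easily lost when a regex is copied between pages), and the names `UrlFinder` and `FindUrls` are made up for this example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlFinder
{
    // Method 1 pattern with assumed escaping: the hyphen and the backslash
    // inside the last character class are escaped so they act as literals.
    private static readonly Regex UrlRx = new Regex(
        @"(http|ftp|https)://([\w+?\.\w+])+([a-zA-Z0-9~!@#$%^&*()_\-=+\\/?.:;',]*)?",
        RegexOptions.IgnoreCase);

    // Returns every substring of the message that matches the pattern.
    public static List<string> FindUrls(string message)
    {
        var urls = new List<string>();
        foreach (Match match in UrlRx.Matches(message))
        {
            urls.Add(match.Value);
        }
        return urls;
    }
}
```

Calling `UrlFinder.FindUrls` on the message from the question should return the link including `?id=53`. Because `.` and `,` are in the character class, punctuation that directly follows a link in running text would be captured as well.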

Method 2

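// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
public List<string> GetLinks(string message)
{
    List<string> list = new List<string>();
    // A scheme (or a literal "www."), then the host, then any number of path/query segments.
    // The dot after "www" and the hyphen in the last character class are escaped so they match literally.
    Regex urlRx = new Regex(@"((https?|ftp|file)://|www\.)[A-Za-z0-9.-]+(/[A-Za-z0-9?&=;+!'()*\-._~%]*)*", RegexOptions.IgnoreCase);

    MatchCollection matches = urlRx.Matches(message);
    foreach (Match match in matches)
    {
        list.Add(match.Value);
    }
    return list;
}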

var list = GetLinks("Hey yo check this: http://www.google.com/?q=stackoverflow and this: http://www.mysite.com/?id=10&author=me");

It will find the following types of links:

http:// ...
https:// ...
ftp:// ...
file:// ...
www. ...

Method 3

If you will be using these URLs later in your code (extracting a part, the query string, etc.), please consider using the Uri class combined with the HttpUtility helper.

Uri uri;
string strUrl = "http://www.test.com/test.aspx?id=53";
// Use UriKind.Absolute: PathAndQuery throws InvalidOperationException on a relative Uri.
bool isUri = Uri.TryCreate(strUrl, UriKind.Absolute, out uri);
if (isUri) {
    Console.WriteLine(uri.PathAndQuery); // /test.aspx?id=53
} else {
    Console.WriteLine("invalid");
}

It could help you with these operations.
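
To illustrate the HttpUtility part, here is a minimal sketch that reads a single query string value from a URL once it has been extracted. It assumes `System.Web.HttpUtility` is available in your project; the names `UrlParts` and `GetQueryValue` are hypothetical.

```csharp
using System;
using System.Web;

public static class UrlParts
{
    // Returns the value of one query string parameter, or null when the
    // input is not an absolute URL or the parameter is missing.
    public static string GetQueryValue(string url, string key)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
        {
            return null;
        }
        // ParseQueryString accepts the query with its leading '?'.
        return HttpUtility.ParseQueryString(uri.Query)[key];
    }
}
```

For the URL from the question, `UrlParts.GetQueryValue(strUrl, "id")` would give `53`. Returning null for both "not a URL" and "parameter missing" keeps the sketch short; real code may want to tell the two cases apart.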


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.
