uniq won’t remove duplicates

I was using the following command

curl --silent http://api.openstreetmap.org/api/0.6/relation/2919627 http://api.openstreetmap.org/api/0.6/relation/2919628 | grep node | awk '{print $3}' | uniq

when I noticed that uniq wasn’t removing the duplicates. Any idea why?

Answers:


Method 1

uniq only compares adjacent lines, so you have to sort the output first for it to catch every duplicate. See the man page:

Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).

So you can pipe the output into sort first and then into uniq. Or you can use sort’s ability to sort and remove duplicates in one step, like so:

$ ...your command... | sort -u

Examples

sort | uniq

$ cat <(seq 5) <(seq 5) | sort | uniq
1
2
3
4
5

sort -u

$ cat <(seq 5) <(seq 5) | sort -u
1
2
3
4
5
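
To see why the sort step matters, here is a minimal sketch that runs uniq on unsorted input. The sample lines and variable names are made up for illustration; they are not from the question.

```shell
# Hypothetical sample input: "a" appears three times, but only two copies are adjacent.
sample=$(printf 'a\nb\na\na\n')

# uniq alone collapses only the adjacent pair, so "a" still shows up twice.
adjacent_only=$(printf '%s\n' "$sample" | uniq)

# Sorting first makes every duplicate adjacent, so uniq can collapse them all.
fully_deduped=$(printf '%s\n' "$sample" | sort | uniq)

printf 'uniq alone:\n%s\n' "$adjacent_only"
printf 'sort | uniq:\n%s\n' "$fully_deduped"
```

With this input, uniq alone should leave three lines (a, b, a), while sort | uniq leaves two (a, b).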

Your example

$ curl --silent http://api.openstreetmap.org/api/0.6/relation/2919627 http://api.openstreetmap.org/api/0.6/relation/2919628 \
      | grep node | awk '{print $3}' | sort -u
ref="1828989762"
ref="1829038636"
ref="1829656128"
ref="1865479751"
ref="451116245"
ref="451237910"
ref="451237911"
ref="451237917"
ref="451237920"
ref="451237925"
ref="451237933"
ref="451237934"
ref="451237941"
ref="451237943"
ref="451237945"
ref="451237947"
ref="451237950"
ref="451237953"
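
Note that sort -u reorders the lines (the output above is in lexical order, not the order the API returned). If the original order matters, one common alternative, which is not part of the answer above, is awk's "seen" idiom. A minimal sketch on made-up sample lines shaped like the output above:

```shell
# Hypothetical sample in the same shape as the output above, unsorted and
# with one non-adjacent duplicate.
sample=$(printf 'ref="451116245"\nref="1828989762"\nref="451116245"\n')

# awk prints a line only the first time it is seen: seen[$0]++ evaluates to 0
# (false) on first sight, so !seen[$0]++ is true exactly once per distinct line.
first_seen=$(printf '%s\n' "$sample" | awk '!seen[$0]++')

printf '%s\n' "$first_seen"
```

This keeps the first occurrence of each line in input order, at the cost of holding every distinct line in memory.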

Method 2

Remove adjacent duplicate lines while retaining line order (that is, without sorting first) by running uniq with no options:

uniq input.txt >output.txt

Do not use the -u option for this. As the man page says, -u prints only the lines that are not repeated, so a line that has an adjacent duplicate is dropped entirely instead of being printed once. Also note that, without sorting, uniq only catches duplicates that sit next to each other; non-adjacent duplicates like those in the question will remain.
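
The quickest way to check what each option really does is to run it on a tiny input. A minimal sketch, with a made-up sample and illustrative variable names:

```shell
# Hypothetical sample: "a" is repeated on adjacent lines, "b" and "c" are not.
sample=$(printf 'a\na\nb\nc\n')

# Plain uniq: each run of identical adjacent lines is printed once.
plain=$(printf '%s\n' "$sample" | uniq)

# uniq -u: only lines that are NOT repeated; "a" is dropped entirely.
only_unique=$(printf '%s\n' "$sample" | uniq -u)

# uniq -d: only lines that ARE repeated, one copy per run.
only_repeated=$(printf '%s\n' "$sample" | uniq -d)

printf 'uniq:\n%s\n' "$plain"
printf 'uniq -u:\n%s\n' "$only_unique"
printf 'uniq -d:\n%s\n' "$only_repeated"
```

On this input, plain uniq should give a, b, c; uniq -u should give b, c; and uniq -d should give just a.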


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
