Why isn’t this awk command doing a full outer join?

Objective: Merge the contents of two files using the common key present in both files.

 file1.txt
 =========
 key1   11
 key2   12
 key3   13


 file2.txt
 =========
 key2   22
 key3   23
 key4   24
 key5   25


 Expected Output :
 ==================
 key1   11
 key2   12    22
 key3   13    23 
 key4   24
 key5   25

Approaches tried:

  1. join command:
    join -a 1 -a 2 file1.txt file2.txt ## full outer join
  2. awk:
    awk 'FNR==NR{a[$1]=$2;next;}{ print $0, a[$1]}' file2.txt file1.txt

Approach 2 produces a right outer join, not a full outer join:

   key1  11
   key2  12    22
   key3  13    23

What needs to be modified in approach 2 to result in a full outer join?

Answers:


Method 1

My solution using join:

join -a1 -a2  -1 1 -2 1 -o 0,1.2,2.2 -e "NULL" file1 file2

I don’t know much about awk for joining large files and always use join.

key1 11 NULL
key2 12 22
key3 13 23
key4 NULL 24
key5 NULL 25

Method 2

My solution with awk:

awk '{a[$1]=a[$1]" "$2} END{for(i in a)print i, a[i]}' file1.txt file2.txt

Using the key (the first field) as the array index, append the second field of each line to the corresponding a[key], separated by a space. At the end, print every index together with its accumulated array element.

Output:

AMD$ awk '{a[$1]=a[$1]" "$2} END{for(i in a)print i, a[i]}' file1.txt file2.txt
key1  11
key2  12 22
key3  13 23
key4  24
key5  25
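
Note that the output above no longer shows which file a lone value came from: the 11 for key1 is from file1.txt, while the 24 for key4 is from file2.txt. As a minimal sketch (not part of the original answer), the FNR==NR idiom from the question can be kept and extended with an END block, filling the missing side with a placeholder. The "NULL" placeholder is borrowed from Method 1, and the array names left and seen, the temporary directory, and the final sort are assumptions made for illustration; the sort is there because awk does not guarantee any order for "for (k in array)".

```shell
# Recreate the sample inputs from the question in a scratch directory.
tmp=$(mktemp -d)
printf 'key1 11\nkey2 12\nkey3 13\n' > "$tmp/file1.txt"
printf 'key2 22\nkey3 23\nkey4 24\nkey5 25\n' > "$tmp/file2.txt"

result=$(awk '
    # First file: remember the value for each key.
    FNR == NR { left[$1] = $2; next }

    # Second file: print the row, using NULL when the first file lacks the key.
    {
        print $1, (($1 in left) ? left[$1] : "NULL"), $2
        seen[$1] = 1
    }

    # Finally, emit keys that appeared only in the first file.
    END {
        for (k in left)
            if (!(k in seen))
                print k, left[k], "NULL"
    }
' "$tmp/file1.txt" "$tmp/file2.txt" | sort)

printf '%s\n' "$result"
rm -r "$tmp"
```

Like the one-liner above, this still holds the whole first file in memory, so it is best suited to inputs of moderate size.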

Method 3

With awk, try:

awk '{a[$1]=($1 in a)?a[$1]" "$2:$2};END{for(i in a)print i,a[i]}' file1 file2

For huge files, you should use join instead of the awk approach, since the awk approach stores the contents of all files in memory before printing anything.
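
One caveat when switching to join: it expects both inputs to be sorted on the join field. The sample files in the question already are, but real data may not be. The following is a minimal sketch, assuming unsorted inputs and reusing the join options from Method 1; the temporary directory, the .sorted file names, and the shuffled sample rows are made up for illustration.

```shell
# Use one collation for both sort and join so they agree on key order.
export LC_ALL=C

tmp=$(mktemp -d)
# Same data as in the question, deliberately shuffled.
printf 'key3 13\nkey1 11\nkey2 12\n' > "$tmp/file1.txt"
printf 'key5 25\nkey2 22\nkey4 24\nkey3 23\n' > "$tmp/file2.txt"

# Sort each file on its first field (the join key).
sort -k1,1 "$tmp/file1.txt" > "$tmp/file1.sorted"
sort -k1,1 "$tmp/file2.txt" > "$tmp/file2.sorted"

# Full outer join: -a 1 and -a 2 keep unpaired lines from both files,
# -o fixes the output columns, and -e fills the missing ones with NULL.
result=$(join -a 1 -a 2 -e NULL -o 0,1.2,2.2 "$tmp/file1.sorted" "$tmp/file2.sorted")

printf '%s\n' "$result"
rm -r "$tmp"
```

Because sort can spill to disk and join only streams through its inputs, this pipeline does not need to hold either file fully in memory.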

Method 4

Your first approach with join seems to work fine here, as long as the command is written in lowercase rather than in capital letters:

$>join -a 1 -a 2 file1.txt file2.txt 
key1 11
key2 12 22
key3 13 23
key4 24
key5 25


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
