join2 : Joins a master file to a transaction file.
(For records that match, joins data from master;
for records that do not match, joins dummy data.)
Usage : join2 key=<key> <master> <tran>
Options : +<string>
-f<n>
-e
-s<c>
Version : Tue Jan 9 09:02:34 JST 2024
Edition : 1
Every record in the text file <tran> is output. Where the <key> fields
of a <tran> record match the corresponding fields of <master>, the
remaining fields of that <master> record are inserted immediately
after the <key> fields of the <tran> record. For records that do not
match, the padding string "_" is inserted once for each non-key field
of <master>. A different padding string can be specified with the
+<string> option.
The <key> fields in <master> and <tran> MUST be sorted. Also, the
key fields in <master> must only contain unique values (the same
value cannot be repeated in the <key> fields). The <key> fields in
<tran> do not have this requirement; multiple records in <tran> can
have the same value in the <key> fields.
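The join described above can be illustrated with a minimal Python sketch. It is not the actual implementation of join2: it handles a single key field only, assumes the key is the 1st field of <master>, and uses a dictionary lookup instead of a merge over sorted input. The function name and parameters are hypothetical.

```python
def join2(master_lines, tran_lines, key_pos, pad="_"):
    """Sketch of join2 for one key field (key_pos is 1-based, in tran).

    Assumption: the key is the 1st field of master and is unique there.
    """
    master = {}
    n_extra = 0
    for line in master_lines:
        fields = line.split()
        master[fields[0]] = fields[1:]   # non-key fields of master
        n_extra = len(fields) - 1
    out = []
    for line in tran_lines:
        fields = line.split()
        key = fields[key_pos - 1]
        # Unmatched records get one padding string per non-key master field.
        extra = master.get(key, [pad] * n_extra)
        out.append(" ".join(fields[:key_pos] + extra + fields[key_pos:]))
    return out


master = ["0000003 Wilson_____ 26 F", "0000005 Hawking____ 50 F"]
tran = ["A 0000001 46 39", "A 0000003 30 50"]
for row in join2(master, tran, key_pos=2):
    print(row)
# A 0000001 _ _ _ 46 39
# A 0000003 Wilson_____ 26 F 30 50
```

Because the sketch looks keys up in a dictionary, it does not depend on the input being sorted; the real command does require sorted key fields, as stated above.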
<key> designates the field position as follows:
  single field       2         the 2nd field
                     NF        the last field
                     NF-1      the field just before the last field
  contiguous fields  2/4       from the 2nd field to the 4th field
                     4/2       from the 4th field to the 2nd field
                     NF-3/NF   from the NF-3 field to the NF field
  combination        2@NF      the 2nd field and the NF field
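How these key specifications map to field numbers can be sketched in Python. The helper below is hypothetical and only illustrates the notation; it ignores comparison-method suffixes, and it assumes that 4/2 covers the same fields as 2/4, which the text does not state explicitly.

```python
def resolve_key(spec, nf):
    """Return the 1-based field numbers named by a key spec for nf fields."""
    def position(token):
        if token.startswith("NF"):
            rest = token[2:]                 # "" or "-<k>"
            return nf + (int(rest) if rest else 0)
        return int(token)

    fields = []
    for part in spec.split("@"):             # "@" combines several keys
        if "/" in part:                      # "/" denotes a contiguous range
            start, end = (position(t) for t in part.split("/"))
            lo, hi = sorted((start, end))    # assumption: 4/2 equals 2/4
            fields.extend(range(lo, hi + 1))
        else:
            fields.append(position(part))
    return fields


print(resolve_key("NF-3/NF", 5))   # [2, 3, 4, 5]
print(resolve_key("2@NF", 5))      # [2, 5]
```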
There is no limit on the length of the key field or on the number
of key fields. The key field can also contain multi-byte characters
such as Japanese.
If you specify "r" as comparison method after the field position,
the fields are compared in reverse order. If you specify "n" as
comparison method after the field position, that field's values will
be compared as numbers. If you specify "nr" as comparison method
after the field, the values will be compared in reverse order as
numbers. If you specify comparison method before or after the "/",
you must use the same comparison method for both fields.
2n/5n OK
2n/5nr Error
2n/5r Error
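The rule that both ends of a range must carry the same comparison method can be sketched as a small check. The function and regular expression below are hypothetical, not part of join2. Whether a one-sided form such as 2n/5 is accepted is not stated here; this sketch rejects it as an assumption.

```python
import re

# A field position (number, NF, or NF-<k>) with an optional method suffix.
TOKEN = re.compile(r"^(NF(?:-\d+)?|\d+)(nr|n|r|e)?$")


def range_method(part):
    """Return the comparison method of a "start/end" key, or raise ValueError."""
    methods = []
    for token in part.split("/"):
        m = TOKEN.match(token)
        if m is None:
            raise ValueError(f"bad field position: {token!r}")
        methods.append(m.group(2) or "")
    if len(set(methods)) > 1:
        raise ValueError(f"mixed comparison methods in {part!r}")
    return methods[0]


print(range_method("2n/5n"))   # n
for bad in ("2n/5nr", "2n/5r"):
    try:
        range_method(bad)
    except ValueError as err:
        print("Error:", err)
```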
When you specify "e" as the comparison method, or specify the -e option
without a comparison method, the following sequences in the field are
replaced before the fields are compared as strings:
_ ==> 0x20 (space)
\0 ==> 0x00 (null)
\t ==> 0x09 (tab stop)
\n ==> 0x0a (new line)
\r ==> 0x0d (carriage return)
\_ ==> 0x5f (underscore)
\\ ==> 0x5c (back slash)
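The replacement table above can be sketched as a small decoder in Python. This is an illustration only, with a hypothetical function name. How join2 treats a backslash followed by any other character is not stated; the sketch leaves such a backslash unchanged as an assumption.

```python
# Character after a backslash -> replacement, following the table above.
ESCAPES = {"0": "\x00", "t": "\t", "n": "\n", "r": "\r", "_": "_", "\\": "\\"}


def decode_field(text):
    """Apply the "e" replacements to one field value."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in ESCAPES:
            out.append(ESCAPES[text[i + 1]])
            i += 2
        elif ch == "_":
            out.append(" ")      # a bare underscore stands for a space
            i += 1
        else:
            out.append(ch)       # assumption: anything else is kept as-is
            i += 1
    return "".join(out)


print(decode_field("New_York"))      # New York
print(decode_field(r"snake\_case"))  # snake_case
```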
If <master> is an empty file (0 bytes), an error is generated. If the
-f<n> option is specified, this error is not generated and <n> is
used as the number of non-key fields of <master>.
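The effect of -f<n> on the number of padding fields can be sketched as follows. The function is hypothetical. It assumes that when <master> is not empty the field count is taken from <master> itself; what join2 does when -f<n> is given together with a non-empty <master> is not stated here.

```python
def non_key_field_count(master_lines, n_key_fields=1, f=None):
    """Number of padding fields needed for an unmatched record."""
    if master_lines:
        # Assumption: a non-empty master determines the count.
        return len(master_lines[0].split()) - n_key_fields
    if f is None:
        raise ValueError("master is empty and -f<n> was not given")
    return f                     # empty master: fall back to -f<n>


print(non_key_field_count(["0000003 Wilson_____ 26 F"]))  # 3
print(non_key_field_count([], f=3))                       # 3
```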
If "-" is specified for <master>, the command reads <master> from
standard input. If "-" is specified for <tran>, or if <tran> is
omitted, the command reads <tran> from standard input.
(Master file: master)
$ cat master
0000003 Wilson_____ 26 F
0000005 Hawking____ 50 F
0000007 Newton_____ 42 F
0000010 Tesla______ 50 F
(Transaction file: grades)
$ cat grades
A 0000000 91 59 20 76 54
A 0000001 46 39 8 5 21
A 0000003 30 50 71 36 30
A 0000004 58 71 20 10 6
A 0000005 82 79 16 21 80
A 0000007 50 2 33 15 62
A 0000008 52 91 44 9 0
A 0000009 60 89 33 18 6
A 0000010 95 60 35 93 76
A 0000011 92 56 83 96 75
Rows that don't match <master> are padded with "_".
$ join2 key=2 master grades > data
$ cat data
A 0000000 _ _ _ 91 59 20 76 54
A 0000001 _ _ _ 46 39 8 5 21
A 0000003 Wilson_____ 26 F 30 50 71 36 30
A 0000004 _ _ _ 58 71 20 10 6
A 0000005 Hawking____ 50 F 82 79 16 21 80
A 0000007 Newton_____ 42 F 50 2 33 15 62
A 0000008 _ _ _ 52 91 44 9 0
A 0000009 _ _ _ 60 89 33 18 6
A 0000010 Tesla______ 50 F 95 60 35 93 76
A 0000011 _ _ _ 92 56 83 96 75
Multiple contiguous fields can be specified as the key fields. In this
example they are the two leftmost fields of master.
(Master file: master)
$ cat master
A 0000003 Wilson_____ 26 F
A 0000005 Hawking____ 50 F
B 0000007 Newton_____ 42 F
C 0000010 Tesla______ 50 F
(Transaction file: grades)
$ cat grades
01 A 0000000 91 59 20 76 54
02 A 0000001 46 39 8 5 21
03 A 0000003 30 50 71 36 30
04 A 0000004 58 71 20 10 6
05 A 0000005 82 79 16 21 80
06 B 0000007 50 2 33 15 62
07 B 0000008 52 91 44 9 0
08 C 0000009 60 89 33 18 6
09 C 0000010 95 60 35 93 76
10 C 0000011 92 56 83 96 75
Match key on the 2nd and 3rd fields.
$ join2 key=2/3 master grades > data
$ cat data
01 A 0000000 _ _ _ 91 59 20 76 54
02 A 0000001 _ _ _ 46 39 8 5 21
03 A 0000003 Wilson_____ 26 F 30 50 71 36 30
04 A 0000004 _ _ _ 58 71 20 10 6
05 A 0000005 Hawking____ 50 F 82 79 16 21 80
06 B 0000007 Newton_____ 42 F 50 2 33 15 62
07 B 0000008 _ _ _ 52 91 44 9 0
08 C 0000009 _ _ _ 60 89 33 18 6
09 C 0000010 Tesla______ 50 F 95 60 35 93 76
10 C 0000011 _ _ _ 92 56 83 96 75
The "+<string>" option lets you choose the string used for padding.
Specify the padding string immediately after the "+".
(Master file: master)
$ cat master
0000003 Wilson_____ 26 F
0000005 Hawking____ 50 F
0000007 Newton_____ 42 F
0000010 Tesla______ 50 F
(Transaction file: grades)
$ cat grades
0000000 91 59 20 76 54
0000001 46 39 8 5 21
0000003 30 50 71 36 30
0000004 58 71 20 10 6
0000005 82 79 16 21 80
0000007 50 2 33 15 62
0000008 52 91 44 9 0
0000009 60 89 33 18 6
0000010 95 60 35 93 76
0000011 92 56 83 96 75
Use "@" as the dummy data.
$ join2 +@ key=1 master grades > data
$ cat data
0000000 @ @ @ 91 59 20 76 54
0000001 @ @ @ 46 39 8 5 21
0000003 Wilson_____ 26 F 30 50 71 36 30
0000004 @ @ @ 58 71 20 10 6
0000005 Hawking____ 50 F 82 79 16 21 80
0000007 Newton_____ 42 F 50 2 33 15 62
0000008 @ @ @ 52 91 44 9 0
0000009 @ @ @ 60 89 33 18 6
0000010 Tesla______ 50 F 95 60 35 93 76
0000011 @ @ @ 92 56 83 96 75
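With the -f<n> option, an empty master (here /dev/null) is accepted
and every record is padded with <n> fields.
$ join2 -f3 key=1 /dev/null grades > data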
$ cat data
0000000 _ _ _ 91 59 20 76 54
0000001 _ _ _ 46 39 8 5 21
0000003 _ _ _ 30 50 71 36 30
0000004 _ _ _ 58 71 20 10 6
0000005 _ _ _ 82 79 16 21 80
0000007 _ _ _ 50 2 33 15 62
0000008 _ _ _ 52 91 44 9 0
0000009 _ _ _ 60 89 33 18 6
0000010 _ _ _ 95 60 35 93 76
0000011 _ _ _ 92 56 83 96 75