join1 : Joins a master file to a transaction file.
(Only records with matching key fields are selected)
Usage : join1 key=<key> <master> <tran>
Options : +ng[<fd>]
-e
-s<c>
Version : Tue Jan 9 09:02:34 JST 2024
Edition : 1
Only those records in the text file <tran> where the <key> fields of
<tran> match corresponding fields of <master> are selected, then
joined with the fields in <master> and output. The join occurs by
adding the fields from <master> immediately after the <key> fields
in <tran>.
The <key> fields in <master> and <tran> MUST be sorted. Also, the
key fields in <master> must only contain unique values (the same
value cannot be repeated in the <key> fields). The <key> fields in
<tran> do not have this requirement; multiple records in <tran> can
have the same value in the <key> fields.
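The two requirements on <master> (sorted, and no repeated key) can be checked before running the join. The following is a minimal sketch using the standard sort command rather than anything provided by join1. It assumes the key is the first field of <master> and that byte order (LC_ALL=C) is the ordering you sorted with; the file names and the check_master helper are hypothetical.

```sh
work=$(mktemp -d)

# A well-formed master: sorted, one record per key.
printf '%s\n' '0000003 Wilson_____ 26 F' \
              '0000005 Hawking____ 50 F' > "$work/master"

# A broken master: the key 0000005 appears twice.
printf '%s\n' '0000005 Hawking____ 50 F' \
              '0000005 Hawking____ 51 F' > "$work/dup"

check_master() {
  # -c only checks the order and sorts nothing;
  # combined with -u it also fails when a key is repeated.
  LC_ALL=C sort -c -u -k1,1 "$1" 2>/dev/null
}

check_master "$work/master" && echo "master: ok"
check_master "$work/dup"    || echo "dup: unsorted or repeated key"
```

For a transaction file, which may repeat keys, the same check without -u would apply.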
<key> designates the field position as follows:
single field 2 the 2nd field
NF the last field
NF-1 the field just before the last field
contiguous fields 2/4 from the 2nd field to the 4th field
4/2 from the 4th field to the 2nd field
NF-3/NF from NF-3 field to the NF field
combination 2@NF the 2nd field and the NF field
There is no limit on the length of the key field or on the number
of key fields. The key field can also contain multi-byte characters
such as Japanese.
If you specify "r" as the comparison method after the field position,
the fields are compared in reverse order. If you specify "n", that
field's values are compared as numbers. If you specify "nr", the
values are compared as numbers in reverse order. If you specify a
comparison method on either side of the "/", you must use the same
comparison method for both fields.
2n/5n OK
2n/5nr Error
2n/5r Error
When you specify "e" as the comparison method, or specify the -e option
with no method, characters in the field are replaced as follows and
compared as strings:
_ ==> 0x20 (space)
\0 ==> 0x00 (null)
\t ==> 0x09 (tab stop)
\n ==> 0x0a (new line)
\r ==> 0x0d (carriage return)
\_ ==> 0x5f (underscore)
\\ ==> 0x5c (back slash)
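To make the replacement table concrete, here is a rough awk sketch that applies the same substitutions to a line of text. It is an illustration of the table only, not how join1 implements it. The decode_e name is hypothetical, and \0 is left untouched because awk strings cannot portably hold a null byte.

```sh
decode_e() {
  awk '{
    out = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\\" && i < n) {
        # Backslash escape: look at the following character.
        d = substr($0, ++i, 1)
        if      (d == "t")  out = out "\t"
        else if (d == "n")  out = out "\n"
        else if (d == "r")  out = out "\r"
        else if (d == "_")  out = out "_"
        else if (d == "\\") out = out "\\"
        else                out = out c d   # unknown escape (and \0): keep as-is
      } else if (c == "_") {
        out = out " "                       # bare underscore means a space
      } else {
        out = out c
      }
    }
    print out
  }'
}

printf '%s\n' 'New_York\_City' | decode_e
```

The last line prints "New York_City": the bare underscore becomes a space and the escaped one stays an underscore.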
If the file name is omitted or if it is specified as "-" then the
command will read from standard input.
Select the records from "expense" that contain data about the four
people listed in "master", and add the information from "master" to
those records.
(Master file: master)
$ cat master
0000003 Wilson_____ 26 F
0000005 Hawking____ 50 F
0000007 Newton_____ 42 F
0000010 Tesla______ 50 F
(Transaction file: expense)
$ cat expense
20070401 0000001 300
20070403 0000001 500
20070404 0000001 700
20070401 0000003 200
20070402 0000003 400
20070405 0000003 600
20070401 0000005 250
20070402 0000005 450
20070402 0000007 210
20070404 0000007 410
20070406 0000007 610
"master" is joined using the second field of "expense" as the key.
The join is only performed on records whose key fields exist in
"master". Records that don't match are ignored. "expense" can
contain multiple records that have the same key.
$ join1 key=2 master expense > data
$ cat data
20070401 0000003 Wilson_____ 26 F 200
20070402 0000003 Wilson_____ 26 F 400
20070405 0000003 Wilson_____ 26 F 600
20070401 0000005 Hawking____ 50 F 250
20070402 0000005 Hawking____ 50 F 450
20070402 0000007 Newton_____ 42 F 210
20070404 0000007 Newton_____ 42 F 410
20070406 0000007 Newton_____ 42 F 610
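For comparison, the same result can be approximated with the standard POSIX join command. This is only a sketch for readers who want to see the field placement spelled out, not a description of how join1 works internally. It assumes both files are sorted on the key in byte order, and the -o list is written by hand for this particular pair of layouts.

```sh
work=$(mktemp -d)

cat > "$work/master" <<'EOF'
0000003 Wilson_____ 26 F
0000005 Hawking____ 50 F
0000007 Newton_____ 42 F
0000010 Tesla______ 50 F
EOF

cat > "$work/expense" <<'EOF'
20070401 0000001 300
20070403 0000001 500
20070404 0000001 700
20070401 0000003 200
20070402 0000003 400
20070405 0000003 600
20070401 0000005 250
20070402 0000005 450
20070402 0000007 210
20070404 0000007 410
20070406 0000007 610
EOF

# -1 1 / -2 2 : key is field 1 of master and field 2 of expense.
# -o          : date, key, the master fields, then the amount.
LC_ALL=C join -1 1 -2 2 -o 2.1,0,1.2,1.3,1.4,2.3 \
    "$work/master" "$work/expense" > "$work/data"

cat "$work/data"
```

With join the output layout has to be listed field by field, whereas join1 places the <master> fields after the key automatically.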
Example of specifying multiple contiguous fields as the key.
(Master file: master)
$ cat master
A 0000003 Wilson_____ 26 F
A 0000005 Hawking____ 50 F
B 0000007 Newton_____ 42 F
C 0000010 Tesla______ 50 F
(Transaction: grades)
$ cat grades
01 A 0000000 91 59 20 76 54
02 A 0000001 46 39 8 5 21
03 A 0000003 30 50 71 36 30
04 A 0000004 58 71 20 10 6
05 A 0000005 82 79 16 21 80
06 B 0000007 50 2 33 15 62
07 B 0000008 52 91 44 9 0
08 C 0000009 60 89 33 18 6
09 C 0000010 95 60 35 93 76
10 C 0000011 92 56 83 96 75
Using the second and third fields of "grades" as the key fields
(the file is sorted in ascending order by fields 2 and 3), select only
the records whose key value also exists in "master" (sorted in
ascending order by fields 1 and 2), then join them to the data in
"master" and output the result.
$ join1 key=2/3 master grades > data
$ cat data
03 A 0000003 Wilson_____ 26 F 30 50 71 36 30
05 A 0000005 Hawking____ 50 F 82 79 16 21 80
06 B 0000007 Newton_____ 42 F 50 2 33 15 62
09 C 0000010 Tesla______ 50 F 95 60 35 93 76
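The two-field key can be pictured as a single composite key. The awk sketch below reproduces the listing above under that reading. It is an approximation for illustration, not join1 itself: it holds "master" in memory and does not depend on sort order, and the field positions are hard-coded for these two files.

```sh
work=$(mktemp -d)

cat > "$work/master" <<'EOF'
A 0000003 Wilson_____ 26 F
A 0000005 Hawking____ 50 F
B 0000007 Newton_____ 42 F
C 0000010 Tesla______ 50 F
EOF

cat > "$work/grades" <<'EOF'
01 A 0000000 91 59 20 76 54
02 A 0000001 46 39 8 5 21
03 A 0000003 30 50 71 36 30
04 A 0000004 58 71 20 10 6
05 A 0000005 82 79 16 21 80
06 B 0000007 50 2 33 15 62
07 B 0000008 52 91 44 9 0
08 C 0000009 60 89 33 18 6
09 C 0000010 95 60 35 93 76
10 C 0000011 92 56 83 96 75
EOF

awk '
  NR == FNR {                  # first file: master, key = fields 1 and 2
    rest = $3
    for (i = 4; i <= NF; i++) rest = rest " " $i
    m[$1 " " $2] = rest
    next
  }
  ($2 " " $3) in m {           # second file: grades, key = fields 2 and 3
    out = $1 " " $2 " " $3 " " m[$2 " " $3]
    for (i = 4; i <= NF; i++) out = out " " $i
    print out
  }
' "$work/master" "$work/grades" > "$work/data"

cat "$work/data"
```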
It is also possible to select multiple non-adjacent fields from the
left as key fields. In this case join the fields together with the
"@" symbol. In the example below, "tran" is sorted by fields 4 and
2 while "master" is sorted by fields 3 and 1.
$ join1 key=4@2 master tran > data
You can also select records that have no matching key in "master".
Matching records are output to stdout while non-matching records are
output to stderr. In this case, matching records are joined with
data from the master file, but non-matching records have no key in
the master file so they are not joined with anything and are output
as-is.
$ join1 +ng key=<key> <master> <tran> > ok-data 2> ng-data
(Master file: master)
$ cat master
0000003 Wilson_____ 26 F
0000005 Hawking____ 50 F
0000007 Newton_____ 42 F
0000010 Tesla______ 50 F
(Transaction file: grades)
$ cat grades
0000000 91 59 20 76 54
0000001 46 39 8 5 21
0000003 30 50 71 36 30
0000004 58 71 20 10 6
0000005 82 79 16 21 80
0000007 50 2 33 15 62
0000008 52 91 44 9 0
0000009 60 89 33 18 6
0000010 95 60 35 93 76
0000011 92 56 83 96 75
Output from "grades" the data for the four people who exist in
"master" and also output all records from "grades" that do not
exist in "master".
$ join1 +ng key=1 master grades > ok-data 2> ng-data
$ cat ok-data # (Matched Data)
0000003 Wilson_____ 26 F 30 50 71 36 30
0000005 Hawking____ 50 F 82 79 16 21 80
0000007 Newton_____ 42 F 50 2 33 15 62
0000010 Tesla______ 50 F 95 60 35 93 76
$ cat ng-data # (Unmatched Data)
0000000 91 59 20 76 54
0000001 46 39 8 5 21
0000004 58 71 20 10 6
0000008 52 91 44 9 0
0000009 60 89 33 18 6
0000011 92 56 83 96 75
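The split between matched and unmatched records can also be sketched in awk, which may help when checking results on a machine without join1. This is an approximation of the behaviour shown above, not the actual implementation: it assumes the key is field 1 in both files and sends unmatched records to standard error through a helper pipe.

```sh
work=$(mktemp -d)

cat > "$work/master" <<'EOF'
0000003 Wilson_____ 26 F
0000005 Hawking____ 50 F
0000007 Newton_____ 42 F
0000010 Tesla______ 50 F
EOF

cat > "$work/grades" <<'EOF'
0000000 91 59 20 76 54
0000001 46 39 8 5 21
0000003 30 50 71 36 30
0000004 58 71 20 10 6
0000005 82 79 16 21 80
0000007 50 2 33 15 62
0000008 52 91 44 9 0
0000009 60 89 33 18 6
0000010 95 60 35 93 76
0000011 92 56 83 96 75
EOF

awk '
  # master: remember everything after the key
  NR == FNR { key = $1; sub(/^[^ ]+ /, ""); m[key] = $0; next }
  # matched: key, master fields, then the rest of the record
  $1 in m   { key = $1; sub(/^[^ ]+ /, ""); print key, m[key], $0; next }
  # unmatched: pass through unchanged on standard error
            { print | "cat 1>&2" }
' "$work/master" "$work/grades" > "$work/ok-data" 2> "$work/ng-data"

cat "$work/ok-data"
```

Here "$work/ok-data" receives the four joined records and "$work/ng-data" the six records that have no key in "master".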