sm2 : Sum values by key
Usage   : sm2 <k1> <k2> [<s1> <s2>]
          sm2 [key=<key>] [val=<val>]
Options : +count
          -e
          -s<c>
Version : Tue Jan 9 09:02:34 JST 2024
Edition : 1
Sums the fields of records in <file> that have the same key. The key
fields run from "k1" to "k2", and every field from "s1" to "s2" is
summed. Records with the same key are combined into a single output
line. Fields that are neither key fields nor sum fields are not
output.
<key> designates the field position as follows:
  single field       2        the 2nd field
                     NF       the last field
                     NF-1     the field just before the last field
  contiguous fields  2/4      from the 2nd field to the 4th field
                     4/2      from the 4th field to the 2nd field
                     NF-3/NF  from the NF-3 field to the NF field
  combination        2@NF     the 2nd field and the NF field
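The NF notation above resembles the NF variable of awk, so the single-field positions can be illustrated with plain awk. This is only an analogy under that assumption, not sm2 itself, and show_positions is a hypothetical helper name.

```shell
# show_positions [FILE...]
# Hypothetical helper: prints the fields that the positions 2, NF and
# NF-1 refer to, using the equivalent awk expressions.
show_positions() {
  awk '{ print $2, $NF, $(NF-1) }' "$@"
}

# Usage: printf 'a b c d e\n' | show_positions
```

For a five-field line, NF is 5, so NF-1 refers to the 4th field.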
There is no limit on the length of the key field or on the number
of key fields. The key field can also contain multi-byte characters
such as Japanese.
A comparison method may follow the field position. "r" compares the
field in reverse order, "n" compares its values as numbers, and "nr"
compares them as numbers in reverse order. In a range, a comparison
method given on either side of the "/" must be the same on both
sides:
2n/5n OK
2n/5nr Error
2n/5r Error
When you specify "e" as the comparison method, or specify the -e
option without a comparison method, the following sequences in the
field are replaced before the field is compared as a string:
_ ==> 0x20 (space)
\0 ==> 0x00 (null)
\t ==> 0x09 (tab)
\n ==> 0x0a (newline)
\r ==> 0x0d (carriage return)
\_ ==> 0x5f (underscore)
\\ ==> 0x5c (backslash)
<val> is specified in the same way as <key>, except that no
comparison method is allowed.
(Original Data: data1) Quantity sold per item, store and day
Columns: No Store Date A B C D E
0001 43rd_St_Shop 20060201 91 59 20 76 54
0001 43rd_St_Shop 20060202 46 39 8 5 21
0001 43rd_St_Shop 20060203 82 0 23 84 10
0002 Tribeca_Shop 20060201 30 50 71 36 30
0002 Tribeca_Shop 20060202 78 13 44 28 51
0002 Tribeca_Shop 20060203 58 71 20 10 6
0003 Wall_St_Shop 20060201 82 79 16 21 80
0003 Wall_St_Shop 20060202 50 2 33 15 62
0003 Wall_St_Shop 20060203 52 91 44 9 0
0004 Midtown_Shop 20060201 60 89 33 18 6
0004 Midtown_Shop 20060202 95 60 35 93 76
0004 Midtown_Shop 20060203 92 56 83 96 75
Output the total quantity sold at each shop. The key is fields 1 to 2
and the sum fields are fields 4 to 8.
$ sm2 1 2 4 8 data1 | fcols
0001 43rd_St_Shop 219 98 51 165 85
0002 Tribeca_Shop 166 134 135 74 87
0003 Wall_St_Shop 184 172 93 45 142
0004 Midtown_Shop 247 205 151 207 157
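For readers without sm2 at hand, the grouping in this example can be approximated with awk. The sketch below is not the sm2 implementation: sum_by_key is a hypothetical helper, it assumes whole-number values, and it assumes that records with the same key are adjacent, as they are in data1.

```shell
# sum_by_key K1 K2 S1 S2 [FILE...]
# Hypothetical helper approximating "sm2 K1 K2 S1 S2".
# Assumes whole-number values and adjacent same-key records.
sum_by_key() {
  k1=$1 k2=$2 s1=$3 s2=$4
  shift 4
  awk -v k1="$k1" -v k2="$k2" -v s1="$s1" -v s2="$s2" '
    BEGIN { k1 += 0; k2 += 0; s1 += 0; s2 += 0 }
    # Print the current key followed by its accumulated sums.
    function emit(   i, line) {
      line = key
      for (i = s1; i <= s2; i++) line = line " " sum[i]
      print line
    }
    {
      # Build the key from fields k1..k2, forcing a string value.
      cur = $k1 ""
      for (i = k1 + 1; i <= k2; i++) cur = cur " " $i
      # A change of key closes the previous group.
      if (NR > 1 && cur != key) { emit(); split("", sum) }
      key = cur
      for (i = s1; i <= s2; i++) sum[i] += $i
    }
    END { if (NR > 0) emit() }
  ' "$@"
}

# Usage: sum_by_key 1 2 4 8 data1
```

Because the sketch closes a group whenever the key changes, unsorted input would produce one output line per run of equal keys rather than one per key.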
The "+count" option also inserts the total number of records that
match each key immediately after the key field.
(Original Data: data2)
1111 3
1111 5
1111 2
2222 3
2222 10
3333 4
3333 8
3333 9
3333 6
Output the number of records and the sum for each key.
$ sm2 +count 1 1 2 2 data2
1111 3 10
2222 2 13
3333 4 27
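The effect of "+count" in this example can be sketched in awk as follows. count_and_sum is a hypothetical helper, fixed to key field 1 and sum field 2, and it assumes whole-number values and adjacent same-key records; it is an illustration, not how sm2 is implemented.

```shell
# count_and_sum [FILE...]
# Hypothetical helper approximating "sm2 +count 1 1 2 2".
count_and_sum() {
  awk '
    # Print key, record count, and sum for the current group.
    function emit() { print key, n, total }
    {
      cur = $1 ""
      if (NR > 1 && cur != key) { emit(); n = 0; total = 0 }
      key = cur
      n++
      total += $2
    }
    END { if (NR > 0) emit() }
  ' "$@"
}

# Usage: count_and_sum data2
```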
If the values contain decimals, sm2 outputs each sum with the largest
number of decimal places found among the values summed for it.
$ fcols -- data3
a 1.4 2.55
a 2 4
b 1.33 2.1
b 5.222 3.12
$ sm2 1 1 2 3 data3 | fcols --
a 3.4 6.55
b 6.552 5.22
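The precision rule shown in this example can be sketched in awk by tracking, for each sum, the largest number of decimal places seen. sum_keep_precision is a hypothetical helper fixed to key field 1 and sum fields 2 to NF; it assumes adjacent same-key records and uses binary floating point, so it is an approximation rather than the arithmetic sm2 itself performs.

```shell
# sum_keep_precision [FILE...]
# Hypothetical helper approximating "sm2 1 1 2 NF" for decimal values.
sum_keep_precision() {
  awk '
    # Number of digits after the decimal point in s.
    function places(s,   p) {
      p = index(s, ".")
      return p ? length(s) - p : 0
    }
    # Print the key, then each sum with its recorded precision.
    function emit(   i, line) {
      line = key
      for (i = 2; i <= nf; i++)
        line = line " " sprintf("%." (prec[i] + 0) "f", sum[i])
      print line
    }
    {
      cur = $1 ""
      if (NR > 1 && cur != key) { emit(); split("", sum); split("", prec) }
      key = cur
      nf = NF
      for (i = 2; i <= NF; i++) {
        sum[i] += $i
        if (places($i) > prec[i]) prec[i] = places($i)
      }
    }
    END { if (NR > 0) emit() }
  ' "$@"
}

# Usage: sum_keep_precision data3
```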
If you specify 0 0 for the key, sm2 outputs the grand total of the
sum fields.
$ cat data4
a 1
b 2
c 3
$ sm2 0 0 2 2 data4
6
To sum the sizes of all files in a specified directory:
$ ls -l directory | tail -n +2 | sm2 0 0 5 5
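A grand total with key 0 0 corresponds to a plain column sum, which can be sketched in awk. grand_total is a hypothetical helper that takes the field number as its first argument and assumes whole-number values such as file sizes.

```shell
# grand_total FIELD [FILE...]
# Hypothetical helper approximating "sm2 0 0 FIELD FIELD".
# Assumes whole-number values; prints 0 for empty input.
grand_total() {
  f=$1
  shift
  awk -v f="$f" '
    { total += $f }
    END { printf "%.0f\n", total }
  ' "$@"
}

# Usage: grand_total 2 data4
#        ls -l directory | tail -n +2 | grand_total 5
```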