tagcount : Count the number of rows that contain the same key
Usage : tagcount <k1> <k2> <file>
tagcount key=<key> <file>
Options : -e
-s<c>
--tagname <tag>
Version : Tue Jan 9 09:02:34 JST 2024
Edition : 1
This tool counts the rows (records) that share the same key field
value in the file passed as an argument or via standard input, and
outputs the count for each key value.
There are two ways to specify the key fields: using <k1> as the first
field and <k2> as the last field, or using key=<key> as the key
specification. <key> designates the field positions as follows:
single field        TAGa         TAGa field
contiguous fields   TAGa/TAGb    from TAGa field to TAGb field
combination         TAGa@TAGb    TAGa field and TAGb field
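The three forms above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function name expand_key is hypothetical, comparison methods and brace quoting are ignored, and it assumes that a range may appear as one part of an "@" combination and that the first tag of a range precedes the last one in the header.

```python
def expand_key(spec, header):
    """Expand a key specification into an ordered list of tag names.

    Hypothetical helper for illustration only. Handles plain tags,
    TAGa/TAGb ranges and "@" combinations.
    """
    tags = []
    for part in spec.split("@"):
        if "/" in part:
            first, last = part.split("/", 1)
            # index() raises ValueError when a tag is not in the header
            start, end = header.index(first), header.index(last)
            if start > end:
                raise ValueError(f"{first} comes after {last} in the header")
            tags.extend(header[start:end + 1])
        else:
            header.index(part)  # validate that the tag exists
            tags.append(part)
    return tags


header = "K1 K2 T1 T2 N1 N2 N3 N4 N5".split()
print(expand_key("K1", header))      # ['K1']
print(expand_key("K1/T1", header))   # ['K1', 'K2', 'T1']
print(expand_key("K1@T2", header))   # ['K1', 'T2']
```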
There is no limit on the length of the key field or on the number
of key fields. The key field can also contain multi-byte characters
such as Japanese.
If you specify ":r" as the comparison method after the field position,
that field's values are compared in reverse order. If you specify ":n",
the values are compared as numbers. If you specify ":nr", the values
are compared as numbers in reverse order. If you specify a comparison
method before or after the "/", you must use the same comparison
method for both fields.
TAGa:n/TAGb:n OK
TAGa:n/TAGb:nr Error
TAGa:n/TAGb:r Error
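The rule illustrated by these three lines can be sketched in Python as follows. The function names are hypothetical and the code is not taken from the tool. The text does not say how a method given on only one side of the "/" is treated, so this sketch conservatively rejects it.

```python
VALID_METHODS = {"", "r", "n", "nr", "e"}


def split_method(field):
    """Split 'TAG:method' into (tag, method); method is '' when absent."""
    tag, _, method = field.partition(":")
    if method not in VALID_METHODS:
        raise ValueError(f"unknown comparison method: {method!r}")
    return tag, method


def check_range(spec):
    """Return (first_tag, last_tag, method) for a TAGa/TAGb range.

    Raises ValueError when the two ends carry different methods.
    Assumption: a method on one side only also counts as a mismatch.
    """
    first, last = spec.split("/", 1)
    tag_a, method_a = split_method(first)
    tag_b, method_b = split_method(last)
    if method_a != method_b:
        raise ValueError(f"comparison methods differ: {method_a!r} vs {method_b!r}")
    return tag_a, tag_b, method_a


print(check_range("TAGa:n/TAGb:n"))   # ('TAGa', 'TAGb', 'n')
for bad in ("TAGa:n/TAGb:nr", "TAGa:n/TAGb:r"):
    try:
        check_range(bad)
    except ValueError as err:
        print("Error:", err)
```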
When you specify ":e" as the comparison method, or specify the -e
option and no method, characters in the field are replaced as follows
and the field is compared as a string:
_   ==> 0x20 (space)
\0  ==> 0x00 (null)
\t  ==> 0x09 (tab)
\n  ==> 0x0a (new line)
\r  ==> 0x0d (carriage return)
\_  ==> 0x5f (underscore)
\\  ==> 0x5c (backslash)
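As a rough illustration of the replacement table, the following Python sketch decodes a field value before comparison. The name decode_field is hypothetical, and leaving a backslash followed by any other character unchanged is an assumption, since the text does not cover that case.

```python
import re

# Replacement table from the manual text above
ESCAPES = {
    "_": " ",
    "\\0": "\x00",
    "\\t": "\t",
    "\\n": "\n",
    "\\r": "\r",
    "\\_": "_",
    "\\\\": "\\",
}

# Try a backslash escape first, then fall back to a bare underscore,
# so that "\_" yields an underscore and not a backslash plus a space.
PATTERN = re.compile(r"\\[0tnr_\\]|_")


def decode_field(text):
    """Apply the replacements to one field value (illustrative sketch)."""
    return PATTERN.sub(lambda m: ESCAPES[m.group(0)], text)


print(decode_field("New_York"))        # New York
print(decode_field(r"snake\_case"))    # snake_case
```

Scanning left to right in a single pass matters here: applying the replacements one after another could decode the output of an earlier replacement a second time.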
The tag name specifying the field can be enclosed in braces {}. In
this case, tag names can include special characters such as "/" or
"@". Tag names can also include pairs of braces. When a comparison
method is attached to a brace-enclosed tag name, the ":" must be
omitted.
{TAGa}n/{TAGb}n
$ cat data
K1 K2 T1 T2 N1 N2 N3 N4 N5
01 Texas 01 Austin 91 59 20 76 54
01 Texas 02 Dallas 46 39 8 5 21
01 Texas 03 Houston 82 0 23 84 10
02 New_York 04 Manhattan 30 50 71 36 30
02 New_York 05 Brooklyn 78 13 44 28 51
02 New_York 06 Bronx 58 71 20 10 6
02 New_York 07 Queens 39 22 13 76 08
02 New_York 08 StatenIsland 82 79 16 21 80
02 New_York 09 Harlem 50 2 33 15 62
03 New_Jersey 10 Newark 52 91 44 9 0
03 New_Jersey 11 Camden 60 89 33 18 6
03 New_Jersey 12 Trenton 95 60 35 93 76
04 Connecticut 13 Hartford 92 56 83 96 75
04 Connecticut 14 New_Haven 30 12 32 44 19
04 Connecticut 15 Mystic 48 66 23 71 24
04 Connecticut 16 Bridgetown 45 21 24 39 03
Counts the rows for each state and outputs the result.
$ tagcount K1 K2 data
K1 K2 *
01 Texas 3
02 New_York 6
03 New_Jersey 3
04 Connecticut 4
Use the key=<key> format.
$ tagcount key=K1/K2 data
K1 K2 *
01 Texas 3
02 New_York 6
03 New_Jersey 3
04 Connecticut 4
Use the --tagname <tag> option to provide a name for the results
field.
$ tagcount --tagname COUNT K1 K2 data
K1 K2 COUNT
01 Texas 3
02 New_York 6
03 New_Jersey 3
04 Connecticut 4
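The counting shown in the examples above can be reproduced with a short Python sketch. It is an approximation for illustration, not the tool's implementation: the function signature is hypothetical, fields are assumed to be separated by whitespace, and rows with equal keys are grouped wherever they occur, because the text does not state whether the input must be sorted by key. The default tag name "*" follows the first two examples.

```python
DATA = """\
K1 K2 T1 T2 N1 N2 N3 N4 N5
01 Texas 01 Austin 91 59 20 76 54
01 Texas 02 Dallas 46 39 8 5 21
01 Texas 03 Houston 82 0 23 84 10
02 New_York 04 Manhattan 30 50 71 36 30
02 New_York 05 Brooklyn 78 13 44 28 51
02 New_York 06 Bronx 58 71 20 10 6
02 New_York 07 Queens 39 22 13 76 08
02 New_York 08 StatenIsland 82 79 16 21 80
02 New_York 09 Harlem 50 2 33 15 62
03 New_Jersey 10 Newark 52 91 44 9 0
03 New_Jersey 11 Camden 60 89 33 18 6
03 New_Jersey 12 Trenton 95 60 35 93 76
04 Connecticut 13 Hartford 92 56 83 96 75
04 Connecticut 14 New_Haven 30 12 32 44 19
04 Connecticut 15 Mystic 48 66 23 71 24
04 Connecticut 16 Bridgetown 45 21 24 39 03
"""


def tagcount(lines, keys, tagname="*"):
    """Count rows per key; the first line holds the tag names."""
    rows = [line.split() for line in lines if line.strip()]
    header, records = rows[0], rows[1:]
    positions = [header.index(k) for k in keys]
    counts = {}  # dicts keep insertion order, so keys appear as first seen
    for rec in records:
        key = tuple(rec[i] for i in positions)
        counts[key] = counts.get(key, 0) + 1
    out = [" ".join(list(keys) + [tagname])]
    out += [" ".join(key + (str(n),)) for key, n in counts.items()]
    return out


for line in tagcount(DATA.splitlines(), ["K1", "K2"], tagname="COUNT"):
    print(line)
```

With the sample data this prints the same header and four state rows as the --tagname COUNT example.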