Difference between revisions of "Awk"

Revision as of 01:07, 19 November 2018

Basics

awk -F"," -v VAR=<value> '{<code>}'
Use "," as the field separator and set the awk variable VAR to <value> before the program runs.
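
A minimal sketch of both options together. The sample input and the variable name prefix are made up for illustration.

```shell
# Two comma-separated records; print the second field of each,
# prefixed with a value passed in from the shell via -v.
out=$(printf 'a,b,c\nd,e,f\n' | awk -F"," -v prefix="row" '{ print prefix ": " $2 }')
printf '%s\n' "$out"
```

This should print "row: b" and "row: e" on separate lines.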


awk '{ for(i = 1; i <= NF; i++) { print $i; } }'
Iterate over all fields

Standard variables

  • NF Number of fields on the line
  • FS The field separator (default is whitespace)
  • $0 The entire line
  • $1 First field in line
a=length(field)
Get the length of a field.
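
A small sketch that prints several of these values for one made-up input line:

```shell
# NF = number of fields, $1 = first field,
# length($3) = length of the third field, length($0) = length of the whole line.
out=$(printf 'alpha beta gamma\n' | awk '{ print NF, $1, length($3), length($0) }')
printf '%s\n' "$out"
```

For this input the result is "3 alpha 5 16".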


String manipulation

substr(<string>,<start>,<num>)
From <string>, return <num> characters starting at position <start> (the first character is position 1).
n=split(var,ARR,<fs>)
Split var into the array ARR using <fs> as the field separator; n holds the number of elements in ARR. If <fs> is not given, the variable FS is used (default: whitespace).
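
A short sketch of substr and split on a made-up date string. The names s, year and parts are illustrative only.

```shell
out=$(awk 'BEGIN {
    s = "2018-11-19"
    year = substr(s, 1, 4)      # 4 characters starting at position 1
    n = split(s, parts, "-")    # n is 3; parts[1..3] hold the pieces
    print year, n, parts[2]
}')
printf '%s\n' "$out"
```

This prints "2018 3 11".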

For each line it reads (called var here), awk implicitly does the equivalent of:

NF=split(var,ARR," ")
NR++
$0=var
for ( i=1 ; i<=NF ; i++ ) {
 $i=ARR[i]
}
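
One way to watch this splitting happen, as a sketch with a made-up string: assigning to $0 makes awk recompute NF and the fields, and runs of whitespace count as a single separator.

```shell
# Assign a line by hand, then inspect the resulting fields.
out=$(awk 'BEGIN { $0 = "one two  three"; print NF, $2 }')
printf '%s\n' "$out"
```

This prints "3 two", even though there are two spaces before "three".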
gsub(<regexp>,<string>,<variable>)
Replace every match of <regexp> in <variable> with <string> and return the number of replacements.
<variable> is modified in place; if it is omitted, $0 is used.
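
A minimal gsub sketch on a made-up line, showing both the return value and the modified variable:

```shell
# Replace every "-" in the line with "+"; n receives the number of replacements.
out=$(printf 'a-b-c\n' | awk '{ n = gsub(/-/, "+", $0); print n, $0 }')
printf '%s\n' "$out"
```

This prints "2 a+b+c".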
printf "%-10s %05d\n", $1, $2
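Format output with a printf-style format string, as in Python:Strings#Advanced. Here $1 is left-aligned in 10 columns and $2 is zero-padded to 5 digits.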
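
A sketch of the same format on a made-up line. A "|" is added between the two conversions only to make the padding visible.

```shell
# %-10s pads the first field with spaces on the right up to 10 columns;
# %05d pads the second field with leading zeros up to 5 digits.
out=$(printf 'disk 42\n' | awk '{ printf "%-10s|%05d\n", $1, $2 }')
printf '%s\n' "$out"
```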

Calculations

Print the average of field 4 over all records in <file> that contain 'GW'

awk '/GW/ {GW+=$4; n++}   # sum field 4 and count the matching records
END {if (n > 0) print GW/n}' <file>
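
A sketch of the averaging on made-up data, assuming the average should be taken over the matching records only. It therefore divides by a counter n rather than by NR, which counts every record read. The names sum and n are illustrative.

```shell
# Three records, two of which contain GW; their 4th fields are 10 and 20.
out=$(printf 'GW a b 10\nXX a b 99\nGW a b 20\n' |
      awk '/GW/ { sum += $4; n++ } END { if (n > 0) print sum / n }')
printf '%s\n' "$out"
```

This prints 15. Dividing by NR (3 here) would give 10 instead.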

Matching

/<pattern>/ {code}
Execute code if the current line matches <pattern>
var ~ /<pattern>/ {code}
Execute code if var matches <pattern>
Use !~ instead of ~ to negate the match
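
A small sketch of ~ and !~ applied to the first field. The interface-like names in the sample input are made up.

```shell
input='eth0 up
lo up
eth1 down'

# Lines whose first field starts with "eth"
matched=$(printf '%s\n' "$input" | awk '$1 ~ /^eth/ { print $1 }')
# Lines whose first field does not start with "eth"
unmatched=$(printf '%s\n' "$input" | awk '$1 !~ /^eth/ { print $1 }')

printf '%s\n' "$matched" "$unmatched"
```

Here matched holds eth0 and eth1 (one per line) and unmatched holds lo.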

Magic

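gawk '{if (NR%3 == 0) {print p$0; p=""} else {p=p$0}} END {if (NR%3) print p}'
Combine every 3 lines into one, with no separator between them. The END block prints a trailing group of fewer than 3 lines, which would otherwise be dropped. (Basic code found on Stack Overflow: https://stackoverflow.com/questions/3194534/joining-two-consecutive-lines-using-awk-or-sed)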
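
A variant sketch that puts a separator between the joined lines and also prints an incomplete last group. The variable name sep and the sample numbers are made up; nothing here needs gawk extensions, so plain awk is used.

```shell
# Start a new group on lines 1, 4, 7, ...; otherwise append to the current group.
# Print the group after every 3rd line, and in END if lines are left over.
out=$(printf '1\n2\n3\n4\n5\n6\n7\n' |
      awk -v sep=" " '{ p = (NR % 3 == 1 ? $0 : p sep $0) }
                      NR % 3 == 0 { print p }
                      END { if (NR % 3) print p }')
printf '%s\n' "$out"
```

For seven input lines this gives "1 2 3", "4 5 6" and "7" on separate lines.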