

awk -F"," -v VAR=<value> '{codeblock}'
Use "," as the field separator and set the awk variable VAR to <value>. With -v you can pass values from the shell into the awk program.
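A minimal sketch of both options together. The sample data and the variable name COL are made up for illustration; COL selects which comma-separated field to print.

```shell
# Hypothetical sample data: three comma-separated fields per line.
out=$(printf 'alice,30,NL\nbob,25,BE\n' | awk -F"," -v COL=2 '{ print $COL }')
echo "$out"   # prints 30 and 25 on separate lines
```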
awk '{ for(i = 1; i <= NF; i++) { print i,$i; } }'
Iterate over all fields of each line and print every field number with its value.

Standard variables

  • NF Number of fields on the line
  • FS The field separator (default is whitespace)
  • $0 The entire line
  • $1 First field in line
  • NR The number of the current line (records read so far, counted across all input files)
length($1)
Get the length of a field (here the first field).
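As a small illustration of the standard variables, the following sketch prints NR, NF, the length of the first field and the last field for each line. The input lines are invented sample data.

```shell
# Hypothetical input: two lines with a different number of fields.
out=$(printf 'one two three\nfour five\n' | awk '{ print NR, NF, length($1), $NF }')
echo "$out"
# prints:
# 1 3 3 three
# 2 2 4 five
```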


Arrays can have named keys and work a bit like a Python dict or a Perl hash.

Output each string found in the array <ar> only once.

{
 dict[ar[1]] = 1
} END {
 for (key in dict)
  OUT = OUT " " key
 print OUT
}
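The same idea as a self-contained sketch: here the key is simply the first field of each input line, and the array name seen and the fruit names are made up. Because "for (key in ...)" visits keys in arbitrary order, the result is piped through sort to make the output predictable.

```shell
# Hypothetical input with duplicates; each distinct value is printed once.
out=$(printf 'pear\napple\npear\napple\nfig\n' |
  awk '{ seen[$1] = 1 } END { for (key in seen) print key }' | sort)
echo "$out"
# prints apple, fig and pear on separate lines
```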

String manipulation

substr(<string>,<start>,<num>)
From <string> return <num> characters, starting from position <start> (the first character is position 1).
n=split(var,ARR,<fs>)
Split var into array ARR; n holds the number of elements in ARR. <fs> is the field separator; if it is not given, the variable FS is used (default: whitespace).
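A short sketch of both operations, assuming the functions meant here are the standard substr() and split(). The date string and the array name parts are made up for the example.

```shell
out=$(awk 'BEGIN {
  s = "2024-05-17"
  print substr(s, 6, 2)      # 2 characters starting at position 6
  n = split(s, parts, "-")   # n is 3; parts[1] is 2024
  print n, parts[1]
}')
echo "$out"
# prints:
# 05
# 3 2024
```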

For each line it reads, awk by default does the equivalent of:

NF=split(var,ARR," ")
for ( i=1 ; i<=NF ; i++ ) {
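gsub(<regexp>,<string>,<variable>)
Replace every match of <regexp> with <string> in <variable> and return the number of replacements. <variable> is modified in place. sub() takes the same arguments but replaces only the first match.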
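A minimal sketch, assuming the function described here is gsub(). The variable name line and its contents are invented; the example shows both the return value and the in-place modification.

```shell
out=$(awk 'BEGIN {
  line = "a-b-c-d"
  n = gsub(/-/, "+", line)   # line itself is changed
  print n, line
}')
echo "$out"   # prints: 3 a+b+c+d
```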
printf "%-10s %05d\n", $1, $2
Format output with printf-style format specifiers, similar to those in Python (see Python:Strings#Advanced).
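To see what the format string above does, here is a sketch with one invented input line. %-10s left-aligns the first field in a column 10 characters wide, and %05d pads the second field with zeros to 5 digits.

```shell
# Hypothetical input: a name and a number.
out=$(printf 'disk 42\n' | awk '{ printf "%-10s %05d\n", $1, $2 }')
echo "[$out]"   # the brackets make the padding visible
```

The resulting line is 16 characters long: 10 for the padded name, 1 space and 5 digits.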


Print the average of field 4 over all records in <file> that contain 'GW'

awk '/GW/ {GW+=$4; n++}
END {if (n) print GW/n}' <file>
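A worked sketch on invented data. It divides by the number of matching records rather than by NR, since NR also counts the lines that do not contain 'GW'; that is one reading of "average over the matching records". The names sum and n are made up.

```shell
# Hypothetical input: two GW records (10 and 20) and one other record.
out=$(printf 'GW1 a b 10\nXX1 a b 99\nGW2 a b 20\n' |
  awk '/GW/ { sum += $4; n++ } END { if (n > 0) print sum / n }')
echo "$out"   # prints 15
```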


/<pattern>/ {codeblock}
Execute code if the current line matches <pattern>
var ~ /<pattern>/ {codeblock}
Execute code if var matches <pattern>
Use !~ instead of ~ to negate the match
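The pattern forms can be combined in one program. In this sketch the interface names and states are invented sample data: the first rule prints names starting with "eth" whose second field does not match "down", the second rule reports the ones that do.

```shell
out=$(printf 'eth0 up\nlo up\neth1 down\n' | awk '
/^eth/ && $2 !~ /down/ { print $1 }
$2 ~ /down/            { print $1, "is down" }')
echo "$out"
# prints:
# eth0
# eth1 is down
```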

Control statements

if ( VAR == <value> ) { codeblock } else { codeblock }
The if/else construction. awk keywords are lowercase and there is no "then" keyword.
if (n)
if (n != 0)
Both forms are true if n is not equal to zero
n < 5 ? 10 : 0
Conditional expression: if n < 5 this returns 10, else 0
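A small sketch that combines a conditional expression with if/else, assuming the expression meant here is the ?: operator. The variable name bonus and the input numbers are made up.

```shell
out=$(printf '3\n7\n' | awk '{
  bonus = ($1 < 5) ? 10 : 0
  if (bonus != 0) { print $1, "gets", bonus } else { print $1, "gets nothing" }
}')
echo "$out"
# prints:
# 3 gets 10
# 7 gets nothing
```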
for (var in ARR) { print ARR[var] }
Read all indexes of ARR in arbitrary order.
if (var in ARR) { print ARR[var] }
Check if index var is in ARR
if (ARR[var] == "" )
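Check if index var has no value (empty or unset). Note that merely referencing ARR[var] creates the element, so use (var in ARR) to test whether an index exists.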
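The difference between the two tests can be seen in this sketch (the keys are invented). Testing with "in" leaves the array untouched, while comparing ARR["c"] with "" creates that element as a side effect.

```shell
out=$(awk 'BEGIN {
  ARR["a"] = 1
  if ("a" in ARR) print "a is present"
  if (!("b" in ARR)) print "b is absent"
  if (ARR["c"] == "") print "c is empty"   # this reference creates ARR["c"]
  if ("c" in ARR) print "c now exists"
}')
echo "$out"
# prints all four messages, one per line
```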
for (init;test;incr) { codeblock }
Loop. Start with init, as long as test is true, execute codeblock, after each loop execute incr.
for (num=10;num<=100;num++) {
print num
}
function NAME (par,par,..)
Define a function. Parameters are local to the function, so declare every local variable as an extra parameter to avoid overwriting a global variable. Conversely, assigning to a global variable is one way to return results from the function.
return <value>
End the function and return <value> to the caller
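A sketch of the local-variable convention. The function max() is a hypothetical helper: its third parameter m is never passed by the caller and only serves as a local variable, so the global m keeps its value.

```shell
out=$(awk '
# max() is a hypothetical helper; the extra parameter m is a local variable.
function max(a, b,    m) {
  m = (a > b) ? a : b
  return m
}
BEGIN {
  m = "global"
  print max(3, 8)
  print m
}')
echo "$out"
# prints:
# 8
# global
```

The extra spaces before m in the parameter list are only a common convention to separate real parameters from locals.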


gawk '{if (NR%3 == 0) {print p$0;p=""}else{p=p$0}}'
Combine every 3 lines into one (basic code found on Stack Overflow [1])
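A sketch of the same one-liner on invented input. It uses plain awk, and the END rule is an addition not in the original: it prints the leftover lines when the input length is not a multiple of 3.

```shell
# Hypothetical input: seven one-character lines.
out=$(printf '1\n2\n3\n4\n5\n6\n7\n' | awk '
{ if (NR % 3 == 0) { print p $0; p = "" } else { p = p $0 } }
END { if (p != "") print p }')
echo "$out"
# prints:
# 123
# 456
# 7
```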