Source: AWK

AWK is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and it is a standard feature of most Unix-like operating systems.
The AWK language is a data-driven scripting language consisting of a set of actions to be taken against streams of textual data – either run directly on files or used as part of a pipeline – for purposes of extracting or transforming text, such as producing formatted reports. The language extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions. While AWK has a limited intended application domain and was especially designed to support one-liner programs, the language is Turing-complete, and even the early Bell Labs users of AWK often wrote well-structured large AWK programs.
AWK was created at Bell Labs in the 1970s, and its name is derived from the surnames of its authors: Alfred Aho (author of egrep), Peter Weinberger (who worked on tiny relational databases), and Brian Kernighan. The acronym is pronounced the same as the name of the bird species auk, which is illustrated on the cover of The AWK Programming Language. When written in all lowercase letters, as awk, it refers to the Unix or Plan 9 program that runs scripts written in the AWK programming language.

History

Structure of AWK programs

Commands

= The print command
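As a small illustration of print, the sketch below shows the difference between separating arguments with a comma and simply writing them next to each other. The sample input line is made up for the example.

```shell
# A comma inserts the output field separator (a space by default);
# juxtaposition concatenates the strings with nothing in between.
printf 'a b c\n' | awk '{ print $1, $3; print $1 $3 }'
```

With this input the first print should write the two fields separated by a space, and the second should write them run together.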

= Built-in variables
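A brief sketch of three commonly used built-in variables: NR (the current record number), NF (the number of fields in the current record), and FS (the input field separator, set here with the -F option). The colon-separated sample records are hypothetical.

```shell
# -F: sets FS to ":" so each record splits into a name and a number.
printf 'alice:30\nbob:25\n' |
    awk -F: '{ print NR, NF, $1 }'
```

Each output line should show the record number, the field count, and the first field of that record.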

= Variables and syntax

= User-defined functions
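A minimal sketch of a user-defined function, assuming an implementation that supports them (any POSIX awk should). The function name max is invented for this illustration.

```shell
# Functions are defined at the top level of the program, next to the rules.
awk 'function max(a, b) { return (a > b) ? a : b }
     BEGIN { print max(3, 7) }'
```

Because the program consists only of a BEGIN rule, awk runs it and exits without reading any input.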

Examples

= Hello, World!
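One common way to write this program is with a single BEGIN rule, sketched here as a shell one-liner:

```shell
# BEGIN runs before any input is read, so awk exits without waiting for input.
awk 'BEGIN { print "Hello, world!" }'
```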

= Print lines longer than 80 characters
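A possible one-liner for this task uses the built-in length function as a pattern with no action. The sample input (one short line and one generated 100-character line) is an assumption made for the sketch.

```shell
# Build a hypothetical 100-character line to test against.
long_line=$(awk 'BEGIN { for (i = 0; i < 100; i++) printf "x"; print "" }')

# A pattern with no action prints every line for which the pattern is true.
printf 'short\n%s\n' "$long_line" |
    awk 'length($0) > 80'
```

Only the generated long line should be printed; the short line is filtered out.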

= Count words
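A sketch of a word counter in the spirit of wc, assuming whitespace-separated words and single-byte characters. The variable names words and chars and the sample text are chosen for illustration.

```shell
printf 'the quick brown fox\njumps over\n' |
    awk '{ words += NF; chars += length($0) + 1 }   # +1 for the newline
         END { print NR, words, chars }'
```

NF gives the number of words on each line, and in the END rule NR holds the total number of lines read.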

= Sum last word
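One way to sketch this is to use $NF, which refers to the last field of each line because NF is the number of fields. The input lines and the variable name total are hypothetical.

```shell
# $NF is the last field; uninitialized variables start as zero.
printf 'apples 3\npears 4\nplums 5\n' |
    awk '{ total += $NF } END { print total }'
```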

= Match a range of input lines

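NR is the number of the current input record (normally the line number) and % is the modulo operator, so NR % 4 == 1 is true for the 1st, 5th, 9th, etc., lines of input. Likewise, NR % 4 == k is true for every fourth line starting at line k, for k from 1 to 3.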
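The modulo arithmetic can be combined with a range pattern, as in the following sketch. The choice of NR % 4 == 3 as the end of the range and the eight-line sample input are assumptions made for illustration.

```shell
# The range is on from a line where NR % 4 == 1 through the next line where
# NR % 4 == 3, so every fourth line of input is skipped.
printf '%s\n' a b c d e f g h |
    awk 'NR % 4 == 1, NR % 4 == 3 { printf "%6d  %s\n", NR, $0 }'
```

With eight input lines, lines 1 to 3 and 5 to 7 should be printed with their line numbers, while lines 4 and 8 fall outside the range.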

= Calculate word frequencies
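Word frequencies are a natural fit for associative arrays. The sketch below assumes whitespace-separated words and folds case with tolower; the array name count and the sample sentence are made up.

```shell
# count is an associative array indexed by the lowercased word.
printf 'The cat saw the other cat\n' |
    awk '{ for (i = 1; i <= NF; i++) count[tolower($i)]++ }
         END { for (w in count) print w, count[w] }' |
    sort
```

The order in which for (w in count) visits keys is unspecified, which is why the output is piped through sort.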

= Match pattern from command line
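One possible approach, not necessarily the one originally shown here, is to pass the pattern into the program with the -v option and match it dynamically with the ~ operator. The shell variable pattern, the awk variable pat, and the sample input are all hypothetical.

```shell
pattern='^b'    # hypothetical pattern, e.g. taken from "$1" in a script

# -v assigns the shell value to the awk variable pat before the program runs.
printf 'apple\nbanana\nblueberry\ncherry\n' |
    awk -v pat="$pattern" '$0 ~ pat { print NR ": " $0 }'
```

Passing the pattern with -v avoids splicing shell text into the AWK program itself, although awk still processes backslash escapes in the assigned value.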

Self-contained AWK scripts

The -f flag tells awk that the argument that follows is the file from which to read the AWK program; sed uses the same flag for the same purpose. Because both programs are often used for one-liners, they default to executing a program given as a command-line argument rather than one read from a separate file.
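The use of -f can be sketched as follows. The temporary file stands in for a real script file, and the interpreter path /usr/bin/awk on the first line is an assumption that may need adjusting on a given system.

```shell
script=$(mktemp)    # hypothetical temporary file holding the AWK program
cat > "$script" <<'EOF'
#!/usr/bin/awk -f
# The interpreter path above is an assumption; awk may live elsewhere.
BEGIN { print "Hello, world!" }
EOF
chmod +x "$script"

# Explicit form; running "$script" directly would rely on the #! line instead.
output=$(awk -f "$script")
echo "$output"
rm -f "$script"
```

The #! line is an ordinary comment as far as awk is concerned, so the same file works both when run directly and when passed to awk -f.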

Versions and implementations

AWK was originally written in 1977 and distributed with Version 7 Unix.
In 1985 its authors started expanding the language, most significantly by adding user-defined functions. The language is described in the book The AWK Programming Language, published 1988, and its implementation was made available in releases of UNIX System V. To avoid confusion with the incompatible older version, this version was sometimes called "new awk" or nawk. This implementation was released under a free software license in 1996 and is still maintained by Brian Kernighan (see external links below).
Old versions of Unix, such as UNIX/32V, included awkcc, which converted AWK to C. Kernighan wrote a program to turn awk into C++; its state is not known.

BWK awk, also known as nawk, refers to the version by Brian Kernighan. It has been dubbed the "One True AWK" because of the use of the term in association with the book that originally described the language and the fact that Kernighan was one of the original authors of AWK. FreeBSD refers to this version as one-true-awk. This version also has features not in the book, such as tolower and ENVIRON; see the FIXES file in the source archive for details. This version is used by, for example, Android, FreeBSD, NetBSD, OpenBSD, macOS, and illumos. Brian Kernighan and Arnold Robbins are the main contributors to a source repository for nawk: github.com/onetrueawk/awk.
gawk (GNU awk) is another free-software implementation and the only implementation that makes serious progress implementing internationalization and localization and TCP/IP networking. It was written before the original implementation became freely available. It includes its own debugger, and its profiler enables the user to make measured performance enhancements to a script. It also enables the user to extend functionality with shared libraries. Some Linux distributions include gawk as their default AWK implementation. As of version 5.2 (September 2022) gawk includes a persistent memory feature that can remember script-defined variables and functions from one invocation of a script to the next and pass data between unrelated scripts, as described in the Persistent-Memory gawk User Manual: www.gnu.org/software/gawk/manual/pm-gawk/.
gawk-csv. The CSV extension of gawk provides facilities for inputting and outputting CSV formatted data.
mawk is a very fast AWK implementation by Mike Brennan based on a bytecode interpreter.
libmawk is a fork of mawk, allowing applications to embed multiple parallel instances of awk interpreters.
awka (whose front end is written atop the mawk program) is another translator of AWK scripts into C code. When compiled, statically including the author's libawka.a, the resulting executables are considerably sped up and, according to the author's tests, compare very well with other versions of AWK, Perl, or Tcl. Small scripts will turn into programs of 160–170 kB.
tawk (Thompson AWK) is an AWK compiler for Solaris, DOS, OS/2, and Windows, previously sold by Thompson Automation Software (which has ceased its activities).
Jawk is a project to implement AWK in Java, hosted on SourceForge. Extensions to the language are added to provide access to Java features within AWK scripts (i.e., Java threads, sockets, collections, etc.).
xgawk is a fork of gawk that extends gawk with dynamically loadable libraries. The XMLgawk extension was integrated into the official GNU Awk release 4.1.0.
QSEAWK is an embedded AWK interpreter implementation included in the QSE library that provides embedding application programming interface (API) for C and C++.
libfawk is a very small, function-only, reentrant, embeddable interpreter written in C.
BusyBox includes an AWK implementation written by Dmitry Zakharov. This is a very small implementation suitable for embedded systems.
CLAWK by Michael Parker provides an AWK implementation in Common Lisp, based upon the regular expression library of the same author.
goawk is an AWK implementation in Go with a few convenience extensions by Ben Hoyt, hosted on GitHub.
The gawk manual has a list of more Awk implementations.

Books

Aho, Alfred V.; Kernighan, Brian W.; Weinberger, Peter J. (1988-01-01). The AWK Programming Language. New York, NY: Addison-Wesley. ISBN 0-201-07981-X. Retrieved 2017-01-22.
Aho, Alfred V.; Kernighan, Brian W.; Weinberger, Peter J. (2023-09-06). The AWK Programming Language, Second Edition. Hoboken, New Jersey: Addison-Wesley Professional. ISBN 978-0-13-826972-2. Archived from the original on 2023-10-27. Retrieved 2023-11-03.
Robbins, Arnold (2001-05-15). Effective awk Programming (3rd ed.). Sebastopol, CA: O'Reilly Media. ISBN 0-596-00070-7. Retrieved 2009-04-16.
Dougherty, Dale; Robbins, Arnold (1997-03-01). sed & awk (2nd ed.). Sebastopol, CA: O'Reilly Media. ISBN 1-56592-225-5. Retrieved 2009-04-16.
Robbins, Arnold (2000). Effective Awk Programming: A User's Guide for Gnu Awk (1.0.3 ed.). Bloomington, IN: iUniverse. ISBN 0-595-10034-1. Archived from the original on 12 April 2009. Retrieved 2009-04-16.

External links

The Amazing Awk Assembler by Henry Spencer.
"AWK (formerly) at Curlie". Curlie. Archived from the original on 2022-03-18.
awklang.org The site for things related to the awk language