Bash cut command

The beauty of programming in bash is that if a command is available in the environment that you are working in or your script will run in, you can make use of it just by knowing its name. Commands such as those included in the Coreutils software toolbox are available on most systems. The cut command is no exception.

Despite how it sounds, if you happen to be most comfortable working in a desktop environment with a graphical user interface, the cut command doesn’t fill your clipboard. Instead, it cuts out pieces of standard input or a file and spills them out on your screen. Now you are bourne-again.

As it happens, the cut command is a powerful tool that helps you navigate through the complexities of text formatted documents and get things done in the command line and bash scripts like a boss.

Here we will focus on examples, getting our hands dirty as we dive deeper into the bash cut command. Read on.

When to use the cut command?

Use the cut command when manipulating field-delimited text files such as CSV files, log files, or any text file with a regular format. For example, you may want to reduce the number of columns in a file without reaching for other commands like awk. You may also want to retrieve the first section of text found inside parentheses without using other commands like grep.
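Both use cases can be sketched in a few lines. The sample data and the helper names sample_csv and first_parens are made up for illustration; this is a minimal sketch, not the only way to do it.

```shell
# Invented sample data: three columns, comma-delimited.
sample_csv() {
  printf '%s\n' 'id,name,email' '1,ann,ann@example.com' '2,bob,bob@example.com'
}

# Reduce the number of columns: keep only fields 1 and 3.
sample_csv | cut -d, -f1,3

# Retrieve the text inside the first pair of parentheses:
# keep what follows the first "(", then what precedes the first ")".
first_parens() {
  cut -d'(' -f2 | cut -d')' -f1
}
echo 'total (3 files) done' | first_parens
```

The second helper already chains two cuts, a pattern the double cut examples later in this article rely on.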

Cut command examples in bash: Single cut examples

Example) Some random cut

Here is a quick cut example where the delimiter is fixed but the selected field is variable, showing how to use the cut command dynamically. Because the field number comes from RANDOM, the output is either a or b.

Commands

echo "a|b" | cut '-d|' "-f$(( RANDOM%2+1))"

Output

a
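To see both outcomes without relying on chance, the field number can be passed in as a parameter instead. The helper name pick_field is hypothetical; it only wraps the same cut call.

```shell
# pick_field N: print field N of a |-delimited line read from stdin.
pick_field() {
  cut -d'|' -f"${1}"
}

# Walk through both fields of the same input line.
for n in 1 2
do
  echo "a|b" | pick_field "${n}"
done
```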

Example) Cutting out IP addresses from nslookup output

nslookup is a helpful command-line utility for looking up host IPs and names, and it ships with commonly used DNS tools. It may be old, but it gets the job done. Its output format is, to the best of my knowledge, standard across most systems.

For example, consider the command that follows.

Command

nslookup linuxhint.com

Output

Server:  dns.google
Address:  8.8.8.8
Name:    linuxhint.com
Address:  64.91.238.144

Now suppose that we want to reduce the nslookup output to a single IP by cutting. Here is a snippet showing how to cut out nslookup IP values in bash. Note that we assume the lookup always succeeds, just to make our example work. You may implement a more robust version as an exercise.

Commands

_ ()
{
nslookup "${1}" | tail -n 2 | cut '-d:' '-f2' | xargs
}
_ linuxhint.com

Output

64.91.238.144
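As one possible take on the exercise, the sketch below avoids counting lines from the end and instead keeps only the Address lines that follow the Name: line. It runs on canned text so that it works offline; sample_lookup and answer_ips are hypothetical names, and the sample assumes the layout nslookup typically prints, with the answer after a Name: line.

```shell
# Canned text in the usual nslookup layout; no network access needed.
sample_lookup() {
  printf '%s\n' \
    'Server:  dns.google' \
    'Address:  8.8.8.8' \
    '' \
    'Name:    linuxhint.com' \
    'Address:  64.91.238.144' \
    ''
}

# Print the addresses listed after the Name: line.
# -f2- keeps everything after the first colon, so IPv6 addresses survive.
answer_ips() {
  sed -n '/^Name:/,$p' | grep '^Address' | cut -d: -f2- | xargs
}

sample_lookup | answer_ips
```

When the lookup fails and no Name: line is printed, the function prints nothing rather than the resolver's own address, so a caller can test for an empty result.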

Example) Cutting out IP from dig output

Dig is a command-line utility included in a package called BIND 9, like nslookup, that I came across just recently. I guess I really should have read advanced Linux networking commands. It is particularly helpful when trying to look up large batches of host IPs. Here is what the corresponding command-line output looks like.

Command

dig linuxhint.com

Output

; <<>> DiG 9.14.6 <<>> linuxhint.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38251
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
 
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;linuxhint.com.                 IN      A
 
;; ANSWER SECTION:
linuxhint.com.          806     IN      A       64.91.238.144
 
;; Query time: 14 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; MSG SIZE  rcvd: 58

Notes on format

(1) A leading ; or ;; indicates that what follows is a comment
(2) Sections are separated by blank lines

Now suppose we want to implement the same function as the previous example using dig. Here is how it may look.

Commands

_ ()
{
dig "${1}" | grep -v -e '^;' -e '^\s*$' | cut '-f6'
}
_ linuxhint.com

Output

64.91.238.144

Notes

(1) In the example immediately above, our cut delimiter is the default, tab character
(2) In the grep command preceding cut, we filter out formatted lines discussed in Notes on format
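One detail worth spelling out is why the IP is field 6 and not field 5: cut treats every tab as a separator, so two adjacent tabs produce an empty field. The sketch below reproduces a single answer line, assuming the name is followed by two tabs as the -f6 above implies, and shows a variant that squeezes repeated tabs first. The helper name answer_line is made up.

```shell
# One answer line in the layout shown above (tabs written as \t).
answer_line() {
  printf 'linuxhint.com.\t\t806\tIN\tA\t64.91.238.144\n'
}

# The empty field between the two adjacent tabs is counted, so the IP is field 6.
answer_line | cut -f6

# Squeezing runs of tabs first makes the IP field 5,
# however many tabs pad the name.
answer_line | tr -s '\t' | cut -f5
```

The squeezed variant is less sensitive to padding, since the number of tabs after the name may change with its length.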

Example) Convert CSV to TSV using cut

You are tasked with converting a CSV file to TSV. There are plenty of ways to accomplish this task. However, we are going to use cut because we can. Here’s how.

Commands

{
csv-file() {
yes | head -n 5 | xargs -I{} echo 1,2,3,4,5
}
csv2tsv() {
cut '-d,' --output-delimiter="$( echo -n -e '\t')" '-f1-' -
}
csv-file
echo "-->"
csv-file | csv2tsv
}

Output

1,2,3,4,5
1,2,3,4,5
1,2,3,4,5
1,2,3,4,5
1,2,3,4,5
-->
1       2       3       4       5
1       2       3       4       5
1       2       3       4       5
1       2       3       4       5
1       2       3       4       5

Notes

(1) The input delimiter we use is ,
(2) We set the output delimiter to the tab character
(3) -f1- means to output all fields
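A caveat before using this on real data: cut splits on every comma and knows nothing about CSV quoting. The sketch below shows the conversion on a plain row and then what happens to a quoted field that contains a comma. The sample rows are invented, and --output-delimiter is a GNU cut option.

```shell
# A literal tab character, built portably.
tab=$(printf '\t')

# A plain row converts cleanly.
echo 'a,b,c' | cut -d, --output-delimiter="${tab}" -f1-

# A quoted field containing a comma is split in the wrong place:
# field 2 comes out as "b (with the opening quote) instead of b,c.
echo 'a,"b,c",d' | cut -d, -f2
```

For files with quoted fields, a CSV-aware tool is the safer choice; cut is fine when you know the data has no embedded delimiters.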

Double cut examples

Some formats require a double cut to get the fields we are looking for. The following examples show a few cases that you are likely to find in the wild.

Example) Cutting out the apache access log path info

In this next example, we are going to dig into some Apache access logs and retrieve the path from the URL part. If you are not sure what that means, it’s the part that comes after the domain name in the URL. In the log line below, that is everything from /inventoryService through itemId=23434300.

10.185.248.71 - - [09/Jan/2015:19:12:06 +0000] 808840 "GET /inventoryService/inventory/purchaseItem?userId=20253471&itemId=23434300 HTTP/1.1" 500 17 "-" "Apache-HttpClient/4.2.6 (java 1.5)"

Example apache log line (above) from Loggly Apache Logging Basics

Also, here are some log formats used in Apache logs. Note that the request field commonly shows up before the other composite fields in double quotes. We will use this knowledge to cut out what we need from Apache logs.

Common Log Format (CLF)
"%h %l %u %t \"%r\" %>s %b"
Common Log Format with Virtual Host
"%v %h %l %u %t \"%r\" %>s %b"
NCSA extended/combined log format
"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""

Source: Apache Module mod_log_config

Here is how the code would look as a boilerplate.

Boilerplate commands

access-log() {
echo '10.185.248.71 - - [09/Jan/2015:19:12:06 +0000] 808840 "GET /inventoryService/inventory/purchaseItem?userId=20253471&itemId=23434300 HTTP/1.1" 500 17 "-" "Apache-HttpClient/4.2.6 (java 1.5)"'

}
first-cut() { true ; }
second-cut() { true ; }
paths() {
access-log | first-cut | second-cut
}

Now if you feed the above commands into the terminal or source them from a script, you will be able to call the paths function. Initially, it doesn’t do anything, but once first-cut and second-cut have been implemented, it will.

The following assumes that the boilerplate commands (above) are loaded into context.

In the first-cut, we will need to implement a function to select what is in the first set of double-quotes. Implementation follows.

Commands

first-cut() {
cut '-d"' '-f2' -
}

Notes on above commands

(1) We expect the input to be piped in. That is where the - comes into play at the end of cut. You can get away without it, but I think it is easier to read and more explicit, so we’ll use it.

(2) The input delimiter is "

(3) The 2nd field is selected
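One behavior to keep in mind with this kind of cut: a line that does not contain the delimiter at all is passed through unchanged. The -s option suppresses such lines. The sketch below compares the two; quoted and quoted_only are hypothetical names, and the input lines are invented.

```shell
# Same cut as first-cut: lines without a double quote pass through whole.
quoted() {
  cut -d'"' -f2 -
}

# With -s, lines that contain no delimiter are dropped instead.
quoted_only() {
  cut -s -d'"' -f2 -
}

printf '%s\n' 'no quotes here' 'A "B C D" E' | quoted
printf '%s\n' 'no quotes here' 'A "B C D" E' | quoted_only
```

The first call prints both lines (the unquoted one untouched), while the second prints only B C D.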

Just to exemplify how first-cut works, let’s throw together a quick example.

Commands

echo 'A "B C D" E' | first-cut

Output

B C D

Okay. It works! Moving on.

In the second-cut we will need to implement a function to select what comes second in a line delimited by the space character.

Commands

second-cut() {
cut '-d ' '-f2' -
}

Notes on above commands

(1) second-cut is identical to first-cut except the input delimiter is the space character instead of a double quote

Just so that we are sure it works, here is a quick example.

Commands

echo 'A "B C D" E' | first-cut | second-cut

Output

C

Now that we know everything works, let’s try rerunning paths.

Commands

paths

Output

/inventoryService/inventory/purchaseItem?userId=20253471&itemId=23434300

Wrapping things up, let’s complete the boilerplate with a fully implemented version of first-cut and second-cut.

Commands

access-log() {
echo '10.185.248.71 - - [09/Jan/2015:19:12:06 +0000] 808840 "GET /inventoryService/inventory/purchaseItem?userId=20253471&itemId=23434300 HTTP/1.1" 500 17 "-" "Apache-HttpClient/4.2.6 (java 1.5)"'

}
first-cut() {
cut '-d"' '-f2' -
}
second-cut() {
cut '-d ' '-f2' -
}
paths() {
access-log | first-cut | second-cut
}
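The same two-step idea extends to the other parts of the log line. The sketch below reuses the example log line and pulls out the request method, the path without its query string (a third cut on ?), and the user agent. The function names log_line and request are made up for this illustration.

```shell
# The example log line, kept on a single line so cut sees one record.
log_line() {
  echo '10.185.248.71 - - [09/Jan/2015:19:12:06 +0000] 808840 "GET /inventoryService/inventory/purchaseItem?userId=20253471&itemId=23434300 HTTP/1.1" 500 17 "-" "Apache-HttpClient/4.2.6 (java 1.5)"'
}

# Everything inside the first pair of double quotes.
request() {
  cut -d'"' -f2
}

# Request method: first space-delimited word of the request.
log_line | request | cut -d' ' -f1

# Path without the query string: second word, then everything before "?".
log_line | request | cut -d' ' -f2 | cut -d'?' -f1

# User agent: the sixth field when splitting the whole line on double quotes.
log_line | cut -d'"' -f6
```

Counting quote-delimited fields this way only holds for the combined-style line shown here; a different log format would shift the field numbers.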

Multiple cut examples

When it comes to command-line voodoo, it doesn’t get much harder than multiple cuts. At this point you should be asking yourself: should I be using cut for everything? Why not? Either way, if it works, you will be tempted to cut your way through Linux.

Example) Cut: The Game

[ASCII-art banner spelling out CUT]

Trust me. It says cut.

Truth is that while thinking up bash cut command examples, I ran out of ideas. Why not make a game? Now that’s a good idea! How?

Dog ate my script. So, I guess I will have to write it from scratch. Hopefully, it comes out better than before.

Script

#!/bin/bash
## cut-the-game
## version 0.0.1 - initial
##################################################
banner() {
cat << EOF
tttt
ttt:::t
t:::::t
t:::::t
ccccccccccccccccuuuuuu    uuuuuuttttttt:::::ttttttt
cc:::::::::::::::cu::::u    u::::ut:::::::::::::::::t
c:::::::::::::::::cu::::u    u::::ut:::::::::::::::::t
c:::::::cccccc:::::cu::::u    u::::utttttt:::::::tttttt
c::::::c     cccccccu::::u    u::::u      t:::::t
c:::::c             u::::u    u::::u      t:::::t
c:::::c             u::::u    u::::u      t:::::t
c::::::c     cccccccu:::::uuuu:::::u      t:::::t    tttttt
c:::::::cccccc:::::cu:::::::::::::::uu    t::::::tttt:::::t
c:::::::::::::::::c u:::::::::::::::u    tt::::::::::::::t
cc:::::::::::::::c  uu::::::::uu:::u      tt:::::::::::tt
cccccccccccccccc    uuuuuuuu  uuuu        ttttttttttt
THE GAME
v0.0.1
EOF

}
game-over() {
cat << EOF
::::::::     :::    ::::    :::: :::::::::::::::::: :::     ::::::::::::::::::::::
:+:    :+:  :+: :+:  +:+:+: :+:+:+:+:      :+:    :+::+:     :+::+:       :+:    :+:
+:+        +:+   +:+ +:+ +:+:+ +:++:+      +:+    +:++:+     +:++:+       +:+    +:+
:#:       +#++:++#++:+#+  +:+  +#++#++:++# +#+    +:++#+     +:++#++:++#  +#++:++#:
+#+   +#+#+#+     +#++#+       +#++#+      +#+    +#+ +#+   +#+ +#+       +#+    +#+
#+#    #+##+#     #+##+#       #+##+#      #+#    #+#  #+#+#+#  #+#       #+#    #+#
######## ###     ######       #####################     ###    #############    ###
EOF

}
lost() {
cat << EOF
It appears that you have lost your way ...
EOF

}
egg() {
cat << EOF
##################################################
##############/   \\##############################
###########/         \############################
##########/     ^     \###########################
#########/   ^         \##########################
########/        \      | ########################
#######|  ^   ^  \\     | ########################
#######|        \\\\    / ########################
####### \  ^   \\\     / X########################
######## \            /  #########################
######### \\        //  X#########################
#########__-^^^^^^^^-___########################NS
:::::::::::::::::::::::::.........................
EOF

}
egg-in-a-meadow() {
cat << EOF
$( test ${egg_count} -gt 0 && echo -n "Deep in" || echo -n "In" ) a meadow ${meadow}
far far away. $( test ${egg_count} -gt 0 && echo -n "The" || echo -n "A" )
 cautious rabbit hides $( test ${egg_count} -gt 0 && echo -n "another" ||
echo -n "a" ) precious egg ${egg}.
Find the egg.
EOF

}
easter-egg() {
echo "${meadow}" \
| grep -e '[0-9]*' -o \
| sort \
| uniq -c \
| sort -n \
| head -1 \
| cut '-d ' '-f8-'
}
meadow() {
cat /dev/urandom \
| xxd -ps \
| head -1 \
| sed \
-e 's/0/_/g' \
-e 's/a/,/g' \
-e 's/b/|/g' \
-e 's/c/;/g' \
-e 's/d/:/g' \
-e 's/e/^/g' \
-e 's/f/$/g'
}
cut-the-game() {
local -i egg_count
egg_count=0
banner
read -p "press enter key to start"
while :
do
meadow=$( meadow )
egg=$( easter-egg )
egg-in-a-meadow
while :
do
read -n 1 -p "cut '-d" delimiter
echo -n "' -f"
read fields
test "${delimiter}" || { lost ; game-over ; return ; }
test "${fields}" || { lost ; game-over ; return ; }
meadow=$( echo "${meadow}" | cut "-d${delimiter}" "-f${fields}" )
echo -e "\n${meadow}\n"
test ! "${meadow}" = "${egg}" || {
echo -e "\nYou found the egg!\n"
egg
egg_count+=1
echo -n -e "\nYou now have ${egg_count} egg$( test ! ${egg_count} -gt 1 || echo -n s ).\n"
echo -e "\nIt appears that the rabbit left behind some tracks."
echo -e "\nDo you follow the rabbit deeper into the meadow to uncover more eggs? "
read
case ${REPLY} in
y|yes) break ;;
n|no) true
esac
return
}
test ! $( echo "${meadow}" | grep -e "${egg}" | wc -w ) -eq 0 || {
lost
game-over
return
}
done
done
}
##################################################
if [ ${#} -eq 0 ]
then
true
else
exit 1 # wrong args
fi
##################################################
cut-the-game
##################################################
## generated by create-stub2.sh v0.1.2
## on Thu, 26 Sep 2019 20:57:02 +0900
## see <https://github.com/temptemp3/sh2>
##################################################

Source: cut-the-game.sh
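The core mechanic of the game can be tried without playing it. The sketch below takes the meadow string from the sample session in this article, picks the egg the way the easter-egg function does, and then isolates it with two cuts. The name find_egg is hypothetical, and the -f8- field number assumes the count padding that GNU uniq -c prints.

```shell
# Meadow string taken from the sample session.
meadow='47$141243_7$3;189|65,,5_52,_$^48$265^$|1441:^436459641:^:344'

# Pick the least frequent run of digits (ties go to the first in sort order).
# uniq -c pads the count, so the digits themselves start at field 8.
find_egg() {
  grep -o -e '[0-9]*' | sort | uniq -c | sort -n | head -1 | cut -d' ' -f8-
}
echo "${meadow}" | find_egg

# Two cuts that dig the egg out of the meadow:
# field 2 between $ signs, then field 1 before the underscore.
echo "${meadow}" | cut -d'$' -f2 | cut -d_ -f1
```

Both commands print 141243, which is the egg named in the sample session.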

Commands

bash cut-the-game.sh
tttt
ttt:::t
t:::::t
t:::::t
ccccccccccccccccuuuuuu    uuuuuuttttttt:::::ttttttt
cc:::::::::::::::cu::::u    u::::ut:::::::::::::::::t
c:::::::::::::::::cu::::u    u::::ut:::::::::::::::::t
c:::::::cccccc:::::cu::::u    u::::utttttt:::::::tttttt
c::::::c     cccccccu::::u    u::::u      t:::::t
c:::::c             u::::u    u::::u      t:::::t
c:::::c             u::::u    u::::u      t:::::t
c::::::c     cccccccu:::::uuuu:::::u      t:::::t    tttttt
c:::::::cccccc:::::cu:::::::::::::::uu    t::::::tttt:::::t
c:::::::::::::::::c u:::::::::::::::u    tt::::::::::::::t
cc:::::::::::::::c  uu::::::::uu:::u      tt:::::::::::tt
cccccccccccccccc    uuuuuuuu  uuuu        ttttttttttt
THE GAME
v0.0.1
press enter key to start enter
In a meadow 47$141243_7$3;189|65,,5_52,_$^48$265^$|1441:^436459641:^:344
far far away. A cautious rabbit hides a precious egg 141243.
Find the egg.
cut '-d$' -f2
141243_7
cut '-d_' -f1
141243
You found the egg!
##################################################
##############/   \##############################
###########/         \############################
##########/     ^     \###########################
#########/   ^         \##########################
########/        \      | ########################
#######|  ^   ^  \     | ########################
#######|        \\    / ########################
####### \  ^   \\     / X########################
######## \            /  #########################
######### \        //  X#########################
#########__-^^^^^^^^-___########################NS
:::::::::::::::::::::::::.........................
You now have 1 egg.
It appears that the rabbit left behind some tracks.
Do you follow the rabbit deeper into the meadow to uncover more eggs? No

Bottom line

The cut command is not going anywhere. That is to say, familiarity with its usage makes a great addition to your command-line toolbox. I hope the above examples helped improve your understanding of cut.

About the author

Nicholas Shellabarger

A developer and advocate of shell scripting and vim. His works include automation tools, static site generators, and web crawlers written in bash. For work he tools with cloud computing, app development, and chatbots. He codes in bash, python, or php, but is open to offers.