Guest Post: This is a guest post on using awk, grep and piping in the Unix/Linux command line interface (CLI) from Josh Reichardt. Josh is a DevOps Engineer with about.me and the owner of Practical System Administration, where he writes about scripting, DevOps, virtualization, hardware and policies. Follow him on Twitter at @Practical_SA.
This is a follow-up post on the previous Using Find, Sed and Grep in Linux post. A majority of the use cases have been covered already, but I’d like to touch on a few more topics, including a very powerful technique for combining the mentioned commands. Additionally, I will quickly cover a handy command line utility and add a few more CLI tips and tricks on top of the last post.
Instead of going into a lot of detail about each topic, I will first focus on one aspect of the Unix/Linux way of doing things from the command line. This allows almost all of the other tools to be built on top of this concept by chaining them together to increase their utility.
This technique is known as “piping” and has been around for a very long time. It is one of the pillars of the command line in Linux and it can be extremely useful. Without piping, the value of the Linux command line would be greatly diminished. I will attempt to demonstrate some of the power and benefits of piping below.
Using Pipes in the CLI
Without getting into too much detail (we’re avoiding redirection), piping is a way to send data (output) from one program as input to another program. The symbol in Linux for the pipe is the “|” symbol. Below is a simple example that shows piping in action.
ls | wc -l
This will run the ls command and use its output as input to the wc command. Chaining these two commands together lets you easily count the number of files and directories in a given location. Already, we can see that this composability is flexible and very powerful.
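To convince yourself what wc -l is actually counting, you can swap in a predictable stand-in for ls (here printf generates three known lines instead of a real directory listing; the file1/file2/file3 names are just placeholders):

```shell
# printf emits three newline-terminated lines;
# wc -l counts the lines it receives on stdin.
printf 'file1\nfile2\nfile3\n' | wc -l
```

This reports 3 (some systems pad the count with leading spaces).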
One of my favorite ways to extract data from a given command is to pipe it through grep and then through sed. Take, for example, the docker images command. It prints out a nicely formatted list, which is perfect for piping.
Say we wanted to remove images whose names contain a specific substring (the image name, in this example) and that are more than a day or a week old. We can chain a number of commands together with pipes to accomplish this. Without pipes, Docker alone has no easy way to filter the images we want with these constraints.
docker images | grep myrepo/ | grep -E 'days|weeks' | awk '{print $3}' | xargs docker rmi
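The final xargs step is what turns the image IDs printed by awk into arguments for docker rmi. You can see the same mechanic harmlessly with echo standing in for docker rmi (the id1/id2/id3 tokens are just placeholder input for this sketch):

```shell
# xargs reads whitespace-separated tokens from stdin and passes
# them all as arguments to the given command (echo here, for safety).
printf 'id1\nid2\nid3\n' | xargs echo
```

This prints the three tokens on one line: id1 id2 id3.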
One tip that has served me well when building piped commands is to test each step incrementally. Take the above command, for example. Each of the components can be broken down individually so you can start small when testing the commands and then build on top of them. So, you can take the first part in the example above:
docker images | grep myrepo/
Then add the next pipe on top of it and test again:
docker images | grep myrepo/ | grep -E 'days|weeks'
You can continue building out and composing your command one step at a time until it does what you want and eventually end up with something similar to the original example.
The Awk Command
I’d like to briefly touch on the awk command. It wasn’t mentioned in the previous post, but it is a great addition to anybody’s skill set. Basically, awk is a tool for extracting and manipulating data that can also be used as a command line reporting tool. Like grep, awk is an extremely sophisticated tool; I have never come close to mastering it, but I have found a handful of use cases that work well for me.
The most useful feature of awk is its ability to print columns or fields. This was demonstrated in the Docker example above, but it may not have been apparent what exactly was happening. Take the following text and suppose each line represents a student’s math score.
$ cat scores.txt
Jon Doe 80
Mary Jane 90
Billy Bob 87
If you are only interested in the score, you can use awk
to filter out the other columns.
awk '{print $3}' scores.txt
This will print out the following.
80
90
87
The $3 refers to the third column, the one we want to print. Say you wanted to grab only the students’ names.
awk '{print $1 " " $2}' scores.txt
You would simply use $1 for the first name, a space character, and $2 for the last name.
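Beyond selecting columns, awk also works as the quick command line reporting tool mentioned above. As a small, self-contained sketch (piping the sample scores in with printf rather than reading scores.txt), this computes the average of the third column:

```shell
# For every line, add column 3 to a running sum; in the END block,
# divide by NR (the total number of lines read) to get the average.
printf 'Jon Doe 80\nMary Jane 90\nBilly Bob 87\n' \
  | awk '{ sum += $3 } END { printf "%.2f\n", sum/NR }'
```

For these three scores the average printed is 85.67.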
There is obviously a ton more you can do with awk; it’s a very powerful tool. If you’re interested in learning about its other features, there are some great resources available. If you are looking for a complete guide, this site does a really good job of covering most aspects of awk.
The Grep Command
The previous post had some great grep examples, so I won’t dive too deeply into grep here; instead, I’ll just add a few thoughts. One of my favorite grep commands for finding occurrences of a given pattern/string in a directory is the following.
grep -R "foo" .
This is nice when you want to find a string but can’t remember which file it is in, or when you want to quickly locate a string that appears in multiple files. It can be useful when you are looking through code and trying to figure out where and how a given variable is used. Once you know which file contains the pattern, you can build on the previous command and display the specific line number of your search string with the following command.
grep -n "foo"
Now you can jump right to the line number you want to look at in your favorite text editor without having to dig around in different files, which is a pretty nice time saver.
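You can also combine both flags. As a self-contained sketch (using a throwaway directory created with mktemp so the results are predictable; normally you would run this in your project directory), grep -Rn prints each match as path:line-number:matching-line:

```shell
# Create a temporary directory with one known file.
tmpdir=$(mktemp -d)
printf 'alpha\nfoo bar\nbaz\nfoo again\n' > "$tmpdir/sample.txt"

# -R searches recursively; -n prefixes each match with its line number.
grep -Rn "foo" "$tmpdir"

# Clean up the demo directory.
rm -rf "$tmpdir"
```

Here grep reports two matches, on lines 2 and 4 of sample.txt.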
Grep is extremely powerful for searching text; I still haven’t uncovered all of its uses and benefits. If you get curious, the grep man page is a great place to learn tidbits you would never otherwise discover, so I definitely recommend checking it out just to see some of its interesting use cases.
Conclusion
As demonstrated in this post, you can learn a handful of techniques to accelerate your CLI skills. The tools mentioned in this post (and the previous post) will give you a solid foundation for using the CLI effectively. You can accomplish a surprising amount with just a few tools like awk, grep, sed and find, along with Linux piping to glue it all together.
Knowing which tool to use for which job is the first step to becoming more proficient at the command line, and the best way to learn these tools is to practice with them as much as possible until you are familiar with how they work.
Protect Your Files Before Making Changes
While these tools can make it easier to take action on files, you always need multiple backups. Leverage SmartFile to back up files off your file system manually or automatically with scripts. Try it today for free, no credit card required!
The post More Tools from the CLI appeared first on SmartFile.