Archive for October, 2011
GNU/Linux includes many utilities for working with text files through the shell. In this post we take a quick look at accessing and manipulating text files in a “column-wise” mode. Suppose you have the following two files, each with two columns separated by the TAB character.

$ cat file1
Alice	Paris
Bob	Tokyo
Mary	London
John	New York

$ cat file2 […]
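
As a rough illustration of what column-wise access can look like, here is a minimal sketch using the standard cut and paste utilities. The excerpt does not show the contents of file2, so the second file below (ages.tsv, with a made-up age column) is purely hypothetical, and the commands are one plausible approach rather than necessarily the ones the post goes on to use.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1 as shown in the post: name<TAB>city.
printf 'Alice\tParis\nBob\tTokyo\nMary\tLondon\nJohn\tNew York\n' > file1

# Hypothetical second file (NOT the post's file2): name<TAB>age.
printf 'Alice\t30\nBob\t25\nMary\t41\nJohn\t37\n' > ages.tsv

# cut -f picks TAB-separated fields; TAB is the default delimiter,
# so "New York" stays intact even though it contains a space.
cities=$(cut -f2 file1)
printf '%s\n' "$cities"

# paste glues the files together line by line with a TAB in between,
# giving four columns; cut then drops the duplicated name column.
combined=$(paste file1 ages.tsv | cut -f1,2,4)
printf '%s\n' "$combined"
```

Note that paste matches lines purely by position, so this only makes sense when both files list the same people in the same order.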
Sed can be used to strip all HTML or XML tags from a file, leaving the plain-text version. Suppose you have a file gnulinux.html with the following contents: <p>The combination of <a href="/gnu/linux-and-gnu.html">GNU and Linux</a> is the <strong>GNU/Linux operating system</strong>, now used by millions and sometimes incorrectly called simply “Linux”.</p> Tempting but incorrect […]
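
The excerpt cuts off before showing any command, so the following is only a sketch of a commonly used approach and not necessarily the author's. It assumes the "tempting" mistake is a greedy pattern such as <.*>, and contrasts it with a negated character class. The scratch directory and the variable names are illustrative.

```shell
# Recreate the sample file in a scratch directory.
workdir=$(mktemp -d)
html="$workdir/gnulinux.html"
cat > "$html" <<'EOF'
<p>The combination of <a href="/gnu/linux-and-gnu.html">GNU and Linux</a> is the <strong>GNU/Linux operating system</strong>, now used by millions and sometimes incorrectly called simply "Linux".</p>
EOF

# Greedy pattern: .* runs from the first < to the LAST > on the line,
# so the text between the tags is deleted along with the tags.
greedy=$(sed 's/<.*>//g' "$html")

# Negated class: [^>]* stops at the first >, so each tag is removed
# on its own and the text in between survives.
plain=$(sed 's/<[^>]*>//g' "$html")
printf '%s\n' "$plain"
```

Because sed works line by line, this sketch does not handle a tag that is split across several lines, and it leaves entities such as &amp; untouched.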