Slicing and Dicing on the Command Line

If you don't know text, you don't know Linux. There are a host of methods for reformatting plain text -- including the text used by graphical applications like spreadsheets and email programs.

Plain text is a series of characters delimited into lines by newline (LF, line feed) characters. You can send this text directly to a terminal window with a utility like cat(1). There are no hidden formatting codes; it’s “just the text, ma’am.”

Before the puns get any worse, let’s dig in!

Quick Review

As you saw in last month’s column (if you didn’t see the column, you might want to review it), to start a new line at any point in plain text, simply insert a newline character. To join two lines, remove the newline between them — and maybe add a space or TAB character to separate them.
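Here is a minimal sketch of both operations, using tr(1) to insert newlines and paste(1) to remove them. The sample text and the variable names (split, joined) are invented for illustration; other tools would do the job just as well.

```shell
# Split: turn each space into a newline, so every word lands on its own line.
split=$(printf 'one two three\n' | tr ' ' '\n')
printf '%s\n' "$split"

# Join: paste -s merges all of its input lines into one line;
# -d ' ' puts a space where each newline used to be.
joined=$(printf 'one\ntwo\nthree\n' | paste -s -d ' ' -)
printf '%s\n' "$joined"
```

Swap the space in -d ' ' for a TAB if you would rather have the joined pieces separated by TABs.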

When a terminal or printer reads a TAB character, it moves the current position to the next tabstop. TAB characters are also used as field separators; you can make a simple database with TABs between the fields and a newline at the end of each record.
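To make the "simple database" idea concrete, the sketch below builds a tiny TAB-separated file with printf(1) and pulls one field out with cut(1). The records and the variable names (db, species) are made up for this example.

```shell
# Build a small "database": \t separates fields, \n ends each record.
db=$(mktemp)
printf 'name\tspecies\tage\n' > "$db"
printf 'Rex\tdog\t7\n' >> "$db"
printf 'Tom\tcat\t3\n' >> "$db"

# cut splits on TAB by default; -f2 prints the second field of every record.
species=$(cut -f2 "$db")
printf '%s\n' "$species"
rm -f "$db"
```

Because the delimiter is a TAB, a field could contain spaces without confusing cut.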

Linux utilities can also reformat text that doesn’t contain TABs. We’ll see examples of that, too.

Lots of Possibilities

Many GNU utilities started in the days of Unix — back when a tty really was a teletype. Without a graphical display (or a graphical editor) to rearrange text, programmers came up with many ways to slice, dice, and reassemble data from scripts and the command line.

We’ll see some of those techniques: enough, I hope, that people new to this way of handling text will be ready to find others, and that gurus will still get a few surprises.

Starting with a Spreadsheet

Plain text can come from lots of places, including:

  • The output of a utility (grep, for instance),
  • Text saved from an application (see Figure One for an example),
  • Text pasted into a terminal window from a graphical application, as in Figure Two near the end of this article.

Note that some of this text may not be “plain” characters. For instance, if you’re copying from a web page designed by a Macintosh user, the designer may have unwittingly included the Macintosh encoding of a special character (maybe a “curly quote”) that isn’t recognized on your Linux system.
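One rough way to check for such characters is to delete everything that is plain and count what's left. The sketch below assumes that "plain" means TAB, LF, CR, and printable ASCII; the sample text and the variable names (sample, suspect) are invented, and the two odd bytes are the Macintosh (MacRoman) codes for curly double quotes.

```shell
# Sample with two MacRoman curly-quote bytes (octal 322 and 323).
sample=$(mktemp)
printf 'plain text\n\322curly\323\n' > "$sample"

# Delete TAB, LF, CR and printable ASCII (octal 040-176); what remains is suspect.
# LC_ALL=C makes tr work byte by byte, whatever the current locale is.
suspect=$(LC_ALL=C tr -d '\11\12\15\40-\176' < "$sample" | wc -c | tr -d ' ')
printf '%s suspect byte(s)\n' "$suspect"
rm -f "$sample"
```

A count above zero only says that something unusual is present. Text in a multibyte encoding such as UTF-8 will also be counted, so treat the result as a hint to look closer rather than as proof of damage.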

For the first few examples, let’s use an OpenOffice.org spreadsheet file saved as plain text. (On the File menu, choose Save As, type Text CSV.) Assuming that the data doesn’t contain any TAB characters, you can set the Field Delimiter to TAB and the Text Delimiter to none (delete the default quote mark in that dialog box). Figure One shows this.

Figure One: Saving a spreadsheet as plain text

Below are two views of the resulting file data.txt (renamed from the default data.csv). First, plain cat outputs the TAB characters between fields, which the terminal displays by moving to the next tabstop position. Next, cat -tve shows what’s actually in the file:

$ cat data.txt
STATE	CITY	COUNTY	POP.	GOVT.
AZ	Ely	Gila	123	Mayor
CA	Alma	Lolo	345	Sheriff
TX	Leroy	El Paso	22	Bubba
$ cat -tve data.txt
STATE^ICITY^ICOUNTY^IPOP.^IGOVT.$
AZ^IEly^IGila^I123^IMayor$
CA^IAlma^ILolo^I345^ISheriff$
TX^ILeroy^IEl Paso^I22^IBubba$

Checking the data file with cat -tve or od -c is a good idea. They’ll reveal “hidden” or “non-plain” characters buried in the data. Notice the space character in the field El Paso. Because the field separator is a TAB, the space doesn’t cause any problems.
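As a small illustration of what these checks can catch, the sketch below feeds both commands a record that ends with a DOS-style carriage return. The record is invented for this example (it is not a line from data.txt), and visible is just a scratch variable.

```shell
# One record: a TAB between the fields and a stray CR before the newline.
visible=$(printf 'TX\tEl Paso\r\n' | cat -tve)
printf '%s\n' "$visible"    # TAB shows as ^I, CR as ^M, end of line as $

# od -c dumps the same bytes one at a time, using escapes such as \t, \r and \n.
printf 'TX\tEl Paso\r\n' | od -c
```

A carriage return is invisible in plain cat output, but it becomes part of the last field, which is why it pays to look before you start cutting and sorting.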

Utilities that Understand TABs

Scripting languages (Perl, awk, …) can parse and write TAB-separated data. Table One lists some other Linux utilities that handle TABs.

Table One: Some utilities that understand TABs

Utility                  Description
cut(1)                   Remove sections from each line of files
echo(1), printf(1)       Write arguments to standard output (\t makes a TAB)
expand(1), unexpand(1)   Convert TABs to spaces, spaces to TABs
paste(1)                 Merge lines of files into TAB-separated output
sed(1)                   Stream editor
sort(1)                  Sort data by one or more of its fields

Whether your data comes from a spreadsheet or some other source, if you can massage your data into TAB-separated fields, the examples below can help you slice and dice it. Examples toward the end of the article cover other types of data.
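As a first taste, here is a hedged sketch that runs three of the utilities from Table One against the sample data. It rebuilds data.txt in a temporary file so that it can run on its own; the variable names (data, tab, cities, by_pop) and the tabstop width of 10 are arbitrary choices for this example.

```shell
# Recreate the article's data.txt in a temporary file (\t is a TAB).
data=$(mktemp)
printf 'STATE\tCITY\tCOUNTY\tPOP.\tGOVT.\n' > "$data"
printf 'AZ\tEly\tGila\t123\tMayor\n' >> "$data"
printf 'CA\tAlma\tLolo\t345\tSheriff\n' >> "$data"
printf 'TX\tLeroy\tEl Paso\t22\tBubba\n' >> "$data"

tab=$(printf '\t')   # a literal TAB, for utilities that need the delimiter spelled out

# cut: keep only the CITY and POP. columns (fields 2 and 4).
cities=$(cut -f2,4 "$data")
printf '%s\n' "$cities"

# sort: skip the header line with tail, then order the records numerically by field 4.
by_pop=$(tail -n +2 "$data" | sort -t "$tab" -k4,4n)
printf '%s\n' "$by_pop"

# expand: replace TABs with spaces, using tabstops every 10 columns.
expand -t 10 "$data"
rm -f "$data"
```

Telling sort that the delimiter is a TAB keeps El Paso together as one field; with sort's default whitespace splitting, the space inside that field would count as a separator too.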
