Power Tools: Piles of Files

Using text and utilities to organize and access files.

Linux runs on text. Configuration files are often human-readable text. Many other files contain text, too, and text often flows through Standard I/O connections. Linux has powerful utilities to handle text; you can also use a scripting language.

The names of files and their locations (pathnames) are also usually text. So, the techniques you use to process text can also be used to process files.

(Of course, if looking through file listings and clicking on some of them is the best way to find what you want, Linux has GUI browsers like Nautilus and Konqueror.)

This article covers ways to make lists of files — on-the-fly or in another file — then narrow the list to just what you’re looking for. We’ll use lots of shell loops with redirected I/O; if you need an introduction, see the sections “Let a Loop Do The Work” in Great Command-line Combinations.

When the Name Isn’t Enough

The third article in the Filenames by Design series shows ways to find files by name when those files are part of a thoughtfully-designed system. If you’re like me, though, you can only wish that all of your files were in a system that makes everything easy to find. (Some projects are carefully planned. Others are 3 a.m. hacks that you can’t finish neatly before the next crisis hits.)

Attributes like the last-modification timestamp or the size can help you find a file that’s hidden like a needle in a haystack. See the sidebar Some file attributes for suggestions. Of course, attributes aren’t always enough.

One of my favorite quick ways to save files from a project is to make a tar(1) archive in gzip(1) format with a name like project-name_1996-02-15.tar.gz and transfer it into a directory named tarballs on my main system. That’s great if I remember the name of the project or when I worked on it. More likely, though, I’ve forgotten what year it was or what conference I was about to attend when I wrote that file with the example I’m looking for. It’s time for power tools.

(By the way, this is a specific example of a general technique. These ideas also work for single files that aren’t in an archive.)

Start by thinking where the data might be — and, once you find some likely spots, what tools could extract it. Here we’re looking for gzipped files. Uncompressing each file onto the disk and searching through it can take a lot of disk space. But the GNU zcat(1) utility (also known as gunzip -c) reads a compressed file in various formats, uncompresses its contents on-the-fly, and writes them to standard output. That lets you avoid temporary files by writing data into a pipe.
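Here is a minimal sketch of that idea, searching a gzipped file through a pipe with no uncompressed copy on disk. The scratch directory, the file name notes.txt.gz, and the search string are made up for the demonstration. It uses gzip -dc, which should behave the same as zcat and gunzip -c on a GNU system.

```shell
# Hypothetical demo data: a small gzipped text file in a scratch directory.
demo_dir=$(mktemp -d)
printf 'first line\nneedle in a haystack\nlast line\n' |
  gzip -c > "$demo_dir/notes.txt.gz"

# Decompress to standard output and count matching lines in the pipe;
# nothing uncompressed is ever written to the disk.
matches=$(gzip -dc "$demo_dir/notes.txt.gz" | grep -c 'needle')
echo "matching lines: $matches"

# Clean up the scratch directory.
rm -r "$demo_dir"
```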

Some File Attributes

The name and the contents aren’t the only way to find the file you want. A file also has attributes — the last modification time, for instance. You can find many attributes with utilities like ls(1) and stat(1); there are other suggestions below.

Here are some attributes you might want to search for:

  • The filename.
  • The “extension”, like .jpg for a JPEG-format photo.

    (Note that Linux itself doesn’t have actual filename extensions — as Microsoft Windows does. The applications you use may care how a filename ends, what sort of data it contains and the structure of that data, but Linux doesn’t. The file is just a sequence of data bits. Linux doesn’t regulate whether, say, a JPEG photo is in a filename ending with the four characters .jpg. A Power Tools column has more about file “types” under Linux.)

  • Part or all of the file’s pathname (one or more of the names of directories that hold the file).
  • The file’s three timestamps: last modification of the file contents, last “change” (to the file’s metadata, not to the file contents), and last access.
  • The file length.
  • If it’s a text file, the number of lines and/or words. (For instance, you might be looking for files with more than 1,000 lines.)

    The wc(1) utility can count lines and words if the file is plain text (with no non-text coding added by, say, a word processing program). The section “Data Is Just Data” of the Power Tools column Performing Data Surgery explains how Linux text files are structured.

  • Is the file actually a symbolic or hard link?
  • Linux extended file attributes store extra data with files, a sort of “tagging” system to let you identify particular files. Not all utilities support attributes, but Z shell does. See the manpages for getfattr(1) and setfattr(1); the related chattr(1) and lsattr(1) utilities manage a separate set of per-file flags.
  • If the file contains particular words, strings, or characters, grep(1) and friends can probably find them.

A scripting language with flexible searching can be a good choice for complex tests and searches for non-textual data. (One of those languages is Perl.)
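As a rough sketch, here is how a few of those attributes (name ending, line count, symbolic link or not) might be tested with find(1) and wc(1). The scratch directory and the file names short.txt, long.txt and link-to-long are invented for the example; adapt the tests to your own files.

```shell
# Hypothetical demo data: two text files and a symbolic link.
demo_dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$demo_dir/short.txt"
seq 1 1500 > "$demo_dir/long.txt"
ln -s long.txt "$demo_dir/link-to-long"

# Regular files whose names end in .txt and that have more than 1,000 lines.
long_files=$(
  find "$demo_dir" -type f -name '*.txt' |
  while IFS= read -r f
  do
    lines=$(wc -l < "$f")
    if [ "$lines" -gt 1000 ]
    then
      basename "$f"
    fi
  done
)
echo "more than 1,000 lines: $long_files"

# Entries that are actually symbolic links.
links=$(find "$demo_dir" -type l -exec basename {} \;)
echo "symbolic links: $links"

# Clean up the scratch directory.
rm -r "$demo_dir"
```

Reading the names with while read (instead of a for loop over command substitution) keeps filenames containing spaces intact.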

We’ll be searching tar archives. What’s in a tarball? It’s a series of entries, each one a set of metadata for a file followed by that file’s content. We want to find string(s) somewhere in the content of one of those files. A quick-and-dirty technique is to search the entire tarball for the string you’re looking for, filtering the search results to keep non-text characters from messing up the screen. (You may not need tar unless you’re extracting a file from the archive.) Let’s start with that:

$ cd tarballs
$ for file in *1996* *usenix*
> do
>   zcat "$file" |
>   grep -i -H --label="$file" 'pattern'
> done | cat -v
Binary file ora_1996-04-15.tar.gz matches
Binary file usenix_1999.tar.gz matches
  • Wildcarded strings like *1996* *usenix* match all filenames in the directory that include 1996 or usenix.

    If that list might contain duplicates, you could either use a more specific wildcard pattern or start the loop this way:

    for file in $(/bin/ls -d1 *1996* *usenix* | uniq)

    • /bin/ls -d1 (that’s a digit 1) lists the matching filenames, one per line, in sorted order. Using /bin/ls bypasses any alias you might have for ls. (The -d option tells ls to list directory names instead of their contents.)
    • The uniq utility removes duplicate entries from a sorted list.
  • In the loop, zcat opens each file.
  • The uncompressed tarball is filtered through grep, which does a case-insensitive (-i) search for pattern.
  • Because grep is reading from the pipe, it doesn’t see the tarball’s filename. Adding --label="$file" makes grep output the filename, expanded by the shell from $file. (The --label option seems to also require -H… on grep version 2.5.1, at least.)
  • The loop’s output (actually, the standard output of all of the grep processes in the pipe) is piped to cat -v. This makes sure that your screen won’t turn into mush if the search matches a line containing non-textual data — such as a filename, surrounded by control characters, embedded in a file’s metadata.

    The cat -v trick is a good one. It actually wasn’t needed here, though, because grep decided that the tarballs were “binary” files — that is, the first few bytes were non-textual — so it output “Binary file file matches”. Adding the option --binary-files=text tells grep to show the matching lines anyway. We’ll try that next.
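Pulling the points above together, here is one way the loop might look with duplicate names removed and --binary-files=text added, assuming GNU grep. The archive name usenix_1996.tar.gz, its contents, and the search string are invented so the example can run anywhere. Here sort -u stands in for ls piped to uniq, and the unquoted command substitution assumes filenames without spaces.

```shell
# Hypothetical demo data: a small tarball whose name matches both wildcards.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/project"
printf 'Notes for the talk\nExample: a loop with redirected I/O\n' \
  > "$demo_dir/project/notes.txt"
tar -czf "$demo_dir/usenix_1996.tar.gz" -C "$demo_dir" project

# Run the search in a subshell so the cd doesn't affect the caller.
found=$(
  cd "$demo_dir" &&
  for file in $(printf '%s\n' *1996* *usenix* | sort -u)
  do
    # Uncompress on the fly; show matching lines even from "binary" input.
    gzip -dc "$file" |
    grep -i -H --binary-files=text --label="$file" 'redirected'
  done | cat -v
)
echo "$found"

# Clean up the scratch directory.
rm -r "$demo_dir"
```

Each matching line should come out prefixed with the name of the tarball it was found in, and cat -v keeps any stray control characters readable.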
