<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Chris Bunney &#187; terminal</title>
	<atom:link href="http://www.chrisbunney.com/tag/terminal/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.chrisbunney.com</link>
	<description>Chris on Computing</description>
	<lastBuildDate>Mon, 23 Jan 2012 14:34:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Using Bash to Change the Delimiter in a CSV File</title>
		<link>http://www.chrisbunney.com/2012/01/23/using-bash-to-change-the-delimiter-in-a-csv-file/</link>
		<comments>http://www.chrisbunney.com/2012/01/23/using-bash-to-change-the-delimiter-in-a-csv-file/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 14:34:20 +0000</pubDate>
		<dc:creator>Chris</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bash]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[scripting]]></category>
		<category><![CDATA[terminal]]></category>

		<guid isPermaLink="false">http://www.chrisbunney.com/?p=171</guid>
		<description><![CDATA[The Problem I recently had a situation where I had a comma separated value (CSV) file that I wanted to easily parse within a shell script. Unfortunately the CSV data contained some double quoted strings with embedded commas, for example: "Adygeya, Republic",RU-AD,21250,RU,Russian Federation This made parsing the file quite painful, particularly as only the [...]]]></description>
			<content:encoded><![CDATA[
<h3>The Problem</h3>
<p>I recently needed to parse a comma-separated values (CSV) file within a shell script. Unfortunately, the CSV data contained some double-quoted strings with embedded commas, for example:</p>
<blockquote><p><code>"Adygeya, Republic",RU-AD,21250,RU,Russian Federation</code></p></blockquote>
<p>This made parsing the file quite painful, particularly as only the strings with an embedded comma were double quoted like this.<br />
<span id="more-171"></span></p>
<h3>The Solution</h3>
<p>I devised a utility script that parses the data and replaces the delimiter with a new character of the user&#8217;s choice. This makes CSV files far easier to work with, provided you pick a character you know won&#8217;t be in your data.</p>
<h3>The Script</h3>
<p>This script borrows the core CSV parsing from this rather good post on <a href="http://backreference.org/2010/04/17/csv-parsing-with-awk/">CSV parsing with awk</a>, but I&#8217;ve edited it to allow the substitution of the delimiter and ensure it still outputs a single record per line.</p>
<pre>#!/bin/bash

input=$1
delimiter=$2

if [ -z "$input" ];
then
	echo "Input file must be passed as an argument!"
	exit 98
fi

if ! [ -f "$input" ];
then
	echo "Input file '$input' doesn't exist!"
	exit 99
fi

if [ -z "$delimiter" ];
then
	echo "Delimiter character must be passed as an argument!"
	exit 98
fi

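gawk -v newdelim="$delimiter" '{
	c=0
	# append a trailing comma so that every field ends with one
	$0=$0","
	while($0) {
		delimiter=""
		if (c++ &gt; 0) # Evaluate and then increment c
		{
			delimiter=newdelim
		}

		match($0,/ *"[^"]*" *,|[^,]*,/)
		# save what matched in s
		s=substr($0,RSTART,RLENGTH)
		# remove surrounding quotes, padding spaces and the trailing comma
		gsub(/^ *"?|"? *,$/,"",s)
		# print through a fixed format so a "%" in the data is output literally
		printf("%s%s", delimiter, s)
		# "consume" what matched
		$0=substr($0,RLENGTH+1)
	}
	printf("\n")
}' "$input"</pre>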
<h3>Sample Input</h3>
<p><code><br />
$ cat testprovinces.csv<br />
Province,ProvinceCode,CriteriaId,CountryCode,Country<br />
Australian Capital Territory,AU-ACT,20034,AU,Australia<br />
Piaui,BR-PI,20100,BR,Brazil<br />
"Adygeya, Republic",RU-AD,21250,RU,Russian Federation<br />
Bío-Bío,CL-BI,20154,CL,Chile<br />
</code></p>
<h3>Sample Output</h3>
<p><code><br />
$ ./change-delimiter testprovinces.csv '^'<br />
Province^ProvinceCode^CriteriaId^CountryCode^Country<br />
Australian Capital Territory^AU-ACT^20034^AU^Australia<br />
Piaui^BR-PI^20100^BR^Brazil<br />
Adygeya, Republic^RU-AD^21250^RU^Russian Federation<br />
Bío-Bío^CL-BI^20154^CL^Chile<br />
</code></p>
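<p>As a rough sketch of how the converted output might then be consumed in a shell script, the function below splits each record on <code>^</code> using <code>IFS</code> and <code>read</code>. The function name <code>describe_records</code> and the variable names are illustrative only, and it assumes the five-column layout of the sample file and that <code>^</code> never appears in the data.</p>

```shell
# Read '^'-delimited records on stdin and print one summary line per record.
# describe_records is a hypothetical helper, not part of the script above.
describe_records() {
	# IFS='^' applies only to read, so each field lands in its own variable
	while IFS='^' read -r province code criteria_id country_code country
	do
		echo "$province ($code) is in $country"
	done
}

# Two records copied from the sample output (header line omitted)
printf '%s\n' \
	'Australian Capital Territory^AU-ACT^20034^AU^Australia' \
	'Adygeya, Republic^RU-AD^21250^RU^Russian Federation' |
	describe_records
```

<p>The embedded comma in the Adygeya record no longer matters, because the fields are split on <code>^</code> only.</p>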
]]></content:encoded>
			<wfw:commentRss>http://www.chrisbunney.com/2012/01/23/using-bash-to-change-the-delimiter-in-a-csv-file/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Automatically Deleting Sent Mail Stored By Mutt</title>
		<link>http://www.chrisbunney.com/2009/10/18/automatically-deleting-sent-mail-stored-by-mutt/</link>
		<comments>http://www.chrisbunney.com/2009/10/18/automatically-deleting-sent-mail-stored-by-mutt/#comments</comments>
		<pubDate>Sun, 18 Oct 2009 13:07:55 +0000</pubDate>
		<dc:creator>Chris</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[email]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[terminal]]></category>

		<guid isPermaLink="false">http://www.chrisbunney.com/?p=11</guid>
		<description><![CDATA[I recently discovered over 600mb of sent email stored by Mutt, so I investigated the cause and created a solution that automatically deleted files older than a certain age.]]></description>
			<content:encoded><![CDATA[<p><strong>The Problem</strong></p>
<p>I was recently doing some housekeeping on my webserver, removing archived <a href="http://www.chrisbunney.com/wiki/index.php/Software">software</a> downloads that I no longer needed and looking for any problems, when I found one that initially had me stumped. My home directory was around 600 MB in size, with no immediately obvious reason why.</p>
<p>I have hardly any files in my home directory, so I quickly traced the problem to <code>~/Mailbox/.Sent/cur</code>, which is, as the path suggests, related to my email server. It looked like copies of my sent mail were being stored and never deleted.</p>
<p><span id="more-11"></span></p>
<p>Curiously, the mailbox directory is used by <a href="http://www.dovecot.org/">Dovecot</a>, but Dovecot doesn&#8217;t handle sending mail so why would it be storing sent mail? Particularly as Dovecot wouldn&#8217;t have access to outgoing mail to store it. Some googling turned up nothing, which further suggested Dovecot wasn&#8217;t the issue.</p>
<p>Since the problem was with outgoing mail, I then turned my attention to <a href="http://www.postfix.org/">Postfix</a>, my SMTP server, which would have access to outgoing mail. Again, as with Dovecot, I found no information on configuring Postfix to store outgoing mail.</p>
<p>At this point I had no obvious theory to go with, so I started looking at the files themselves for something to go on. There were 374 of these files (which I counted using the technique in <a href="http://www.chrisbunney.com/2009/09/27/counting-all-files-in-a-linux-directory/">this post</a>) and as I started looking at the contents I realised they were copies of my automated backup emails.</p>
<p>These backup emails are sent using <a href="http://www.mutt.org/">Mutt</a> via a bash script run by a cron job once a day. It just so happens that I&#8217;ve had this setup running for just over a year, so the number of stored emails supports the theory that Mutt is the perpetrator.</p>
<p>To confirm Mutt was creating these files, I used Mutt to send a mail and then checked to see if it appeared in the directory. It did.</p>
<p>Therefore, it seems I have a copy of every single email sent via Mutt since setting up my email system. This arrangement is good for archiving, but bad for disk usage.</p>
<p><strong>The Solution</strong><br />
I looked into Mutt&#8217;s configuration file (.muttrc) and found an option that controls the storing of sent mail: <code>set copy</code>.</p>
<p><code>set copy = yes</code> will enable storing sent mail<br />
<code>set copy = no</code> will disable storing sent mail</p>
<p>However, this isn&#8217;t optimal, as I would like to save Mutt&#8217;s recently sent email (as a lot of it is auto generated by the server and is my only copy if it never reaches the intended recipient) but delete anything older than a certain period.</p>
<p>You can achieve this using the <code>find</code> command:</p>
<p><code>find . -ctime +90 -exec rm -f -v '{}' \;</code></p>
<p>This command searches the current directory for files whose status last changed more than 90 days ago (<code>-ctime +90</code>) and runs <code>rm -f -v</code> on each result via <code>-exec rm -f -v '{}' \;</code>. Here <code>'{}'</code> is replaced with the path of each result, and the escaped semicolon <code>\;</code> marks the end of the command passed to <code>-exec</code>.</p>
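<p>Before letting a command like this delete anything, it can be worth previewing what it matches. The sketch below does that in a throwaway directory. Two things in it are assumptions for the sake of the demonstration rather than part of the command above: it tests <code>-mtime</code> instead of <code>-ctime</code>, because a file&#8217;s change time cannot be set with <code>touch</code>, and it adds <code>-type f</code> so that only regular files are considered. The directory and file names are made up.</p>

```shell
# Work in a throwaway directory so nothing real is deleted
demo_dir=$(mktemp -d)
touch -t 200001010000 "$demo_dir/old-mail"   # modification time set to 1 Jan 2000
touch "$demo_dir/new-mail"                   # modification time is now

# Dry run: list what would be removed, with no -exec at all
old_files=$(find "$demo_dir" -type f -mtime +90)
echo "Would remove: $old_files"

# The real thing, restricted to regular files
find "$demo_dir" -type f -mtime +90 -exec rm -f -v '{}' \;

# Check what survived, then clean up the demo directory
remaining=$(find "$demo_dir" -type f)
echo "Still present: $remaining"
rm -r "$demo_dir"
```

<p>Only the file dated 2000 is listed and removed; the freshly created one is left alone.</p>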
<p>Now that I had a command that did what I want, I set up a cron job to run nightly to execute the command in the correct directory:</p>
<p><code>00 03 * * * root cd /home/chris/Maildir/.Sent/cur &amp;&amp; find . -ctime +90 -exec rm -f -v '{}' \;</code></p>
<p>This job runs as root every day at 3am. I added a <code>cd</code> command to enter the correct directory and joined it to <code>find</code> with <code>&amp;&amp;</code>, so that <code>find</code> never runs in the wrong place if the <code>cd</code> fails. I could just as easily have put the path in the argument of the <code>find</code> command. Indeed, if you want to run this command on several directories, it would be better to list each directory as an argument to <code>find</code>, but in my case, where I only wanted to run this on a single directory, I found a separate <code>cd</code> command more legible.</p>
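<p>For the several-directories case, a minimal sketch might wrap <code>find</code> in a small function that takes the age in days followed by any number of directories. The name <code>purge_older_than</code> and the demo directories are hypothetical. As an assumption for the demonstration, it tests <code>-mtime</code> rather than <code>-ctime</code>, since a change time cannot be faked with <code>touch</code>, and it adds <code>-type f</code>.</p>

```shell
# Delete regular files older than a number of days in one or more directories.
# purge_older_than is a hypothetical helper, not something Mutt or find provides.
purge_older_than() {
	# Need the age plus at least one directory: with no path, GNU find
	# would fall back to the current directory
	[ "$#" -ge 2 ] || return 1
	days=$1
	shift
	# Every remaining argument is a starting directory for a single find run
	find "$@" -type f -mtime +"$days" -exec rm -f -v '{}' \;
}

# Demonstration on two throwaway directories
sent_a=$(mktemp -d)
sent_b=$(mktemp -d)
touch -t 200001010000 "$sent_a/old" "$sent_b/old"   # dated 1 Jan 2000
touch "$sent_a/new" "$sent_b/new"                   # dated now

purge_older_than 90 "$sent_a" "$sent_b"

left=$(find "$sent_a" "$sent_b" -type f | wc -l)
echo "$left files left"
rm -r "$sent_a" "$sent_b"
```

<p>Refusing to run without a directory argument is a deliberate choice here, given that the job runs as root.</p>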
<p>With that done, I now have a much smaller archive of sent mail that is automatically cleaned up for me without me having to do anything.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chrisbunney.com/2009/10/18/automatically-deleting-sent-mail-stored-by-mutt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Counting All Files in a Linux Directory</title>
		<link>http://www.chrisbunney.com/2009/09/27/counting-all-files-in-a-linux-directory/</link>
		<comments>http://www.chrisbunney.com/2009/09/27/counting-all-files-in-a-linux-directory/#comments</comments>
		<pubDate>Sun, 27 Sep 2009 14:23:17 +0000</pubDate>
		<dc:creator>Chris</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[terminal]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.chrisbunney.com/?p=8</guid>
		<description><![CDATA[How to count the number of files in a directory using the Linux terminal]]></description>
			<content:encoded><![CDATA[<p>To count how many files there are in a directory using the terminal on a Linux machine you can combine two commands:</p>
<ul>
<li>find</li>
<li>wc</li>
</ul>
<p>We&#8217;ll use the find command to locate all the files (and exclude directories and other non-files) and then the wc command to count the files.</p>
<p><span id="more-8"></span></p>
<p>Find has many options, but we will only be using the type option:<br />
<code>find . -type f</code></p>
<p>The . indicates the directory to search in and can be replaced with any absolute or relative path. This command will also find files in subdirectories. To exclude subdirectories you can use:</p>
<p><code>find . ! -name . -prune -type f</code></p>
<p>The <code>! -name . -prune</code> part matches everything except the current directory itself and stops <code>find</code> from descending into it, so subdirectories are never searched.</p>
<p>The find command gives us a list of all the files we want to count. To count the entries in that list, we pass the output to wc.</p>
<p>By default, wc counts newlines, words, and bytes. Since the output from find is a list of files, each on its own line, we can tell wc to count only newlines with the -l option: <code>wc -l</code><br />
Combining these, we get the final command to count the number of files in a directory, excluding subdirectories:<br />
<code>find . ! -name . -prune -type f | wc -l</code></p>
<p>(Of course, if you want to include sub-directories, simply remove the <code>! -name . -prune</code> arguments and use:<br />
<code>find . -type f | wc -l</code> instead)</p>
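<p>To see the difference between the variants, here is a small sketch that builds a throwaway directory tree and counts it three ways. The directory layout and variable names are made up for the example, and the <code>-maxdepth 1</code> form is an extra shorthand that assumes GNU or BSD <code>find</code>; it is not part of the portable command above.</p>

```shell
# Build a small throwaway tree: two files at the top, one in a subdirectory
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/sub/c"

# Portable form: must start from "." for the "! -name ." test to work,
# so change directory inside the command substitution's subshell
top_level=$(cd "$demo_dir" && find . ! -name . -prune -type f | wc -l)

# GNU/BSD shorthand for the same thing
top_level_maxdepth=$(find "$demo_dir" -maxdepth 1 -type f | wc -l)

# Including subdirectories
recursive=$(find "$demo_dir" -type f | wc -l)

echo "top level: $top_level, with -maxdepth: $top_level_maxdepth, recursive: $recursive"
rm -r "$demo_dir"
```

<p>One caveat with any of these: a file name that itself contains a newline is counted more than once, because <code>wc -l</code> counts lines rather than files.</p>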
]]></content:encoded>
			<wfw:commentRss>http://www.chrisbunney.com/2009/09/27/counting-all-files-in-a-linux-directory/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

