Wednesday, October 12, 2011

Search and Replace Bash-Fu

A little while ago I was searching for a tool that would let me perform multiple search and replace operations across multiple files. I didn't look very hard before it dawned on me that this could be quite an easy problem to solve with a simple bash script. It also smelled like a challenge, and I love a good challenge, so I wrote one.

The script requires a file containing pairs of "from" and "to" values, one pair per line, delimited by a tilde (~), e.g.:

street~str
mr~mister
addr~address


Then you just call the script, giving it the path to that file and, optionally, a file extension to search for (e.g. txt). It then works through every matching file under the current directory and does the job quite nicely.
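
To make the file format concrete, here is a minimal sketch of how each line splits into its two halves and gets applied. It is not the script itself: the matches.txt name and the sample sentence are made up for illustration, and the pairs are applied to a string rather than to files.

```shell
#!/bin/bash
# Illustration only: matches.txt and the sample text are invented for this demo.
workdir=$(mktemp -d)
printf 'street~str\nmr~mister\naddr~address\n' > "$workdir/matches.txt"

text="mr smith, addr: 1 main street"

# Split each line on the tilde and apply the pair to the sample text.
while IFS='~' read -r from to
do
    text=$(printf '%s\n' "$text" | sed -e "s/${from}/${to}/g")
done < "$workdir/matches.txt"

echo "$text"
rm -rf "$workdir"
```

The pairs are applied in file order, so a later pair also sees the output of the earlier ones.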

Here is the script:

#!/bin/bash
# @author Louis van der Merwe mailto:TheMandibleClaw@gmail.com

echo
echo "*********************************************************"
echo "* BULK SEARCH AND REPLACE                               *"
echo "*********************************************************"
echo

if [ $# -eq 0 ]
then
    echo "usage: $0 <matches file (pairs of 'old value~new value', one per line)> [file extension]"
    echo
    exit 1
fi

FILE_PATH="$1"
EXTENSION="*"
SEARCH_PATH="."

test $# -eq 2 && EXTENSION="$2"

declare -a MATCHES

i=-1

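# Read the pairs verbatim (-r keeps backslashes), including a last line with no trailing newline
while IFS= read -r line || [ -n "$line" ];
do
    if [ -n "$line" ]
    then
        i=$(($i + 1))
        MATCHES[$i]="$line"
    fi

done < "${FILE_PATH}"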

find "${SEARCH_PATH}" -type f -iname "*.$EXTENSION" | while IFS= read -r f;
do
    F_TOTAL=0
    echo -ne "\n$f "

    if [ -w "$f" ]
    then
        for m in $(seq 0 ${i});
        do
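            # Split the pair on the tilde: field 1 is the search value, field 2 the replacement
            FROM=$(printf '%s\n' "${MATCHES[$m]}" | cut -d~ -f1)
            TO=$(printf '%s\n' "${MATCHES[$m]}" | cut -d~ -f2)

            # Count the lines in the file that contain the search value
            M_TOTAL=$(grep -c -- "$FROM" "$f")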

            echo -ne " [$FROM] "
            if [ $M_TOTAL -gt 0 ]
            then
                F_TOTAL=$(( $F_TOTAL + $M_TOTAL))
                sed -e "s/${FROM}/${TO}/g" "$f" > "${f}.new" && mv -f "${f}.new" "${f}"

                for s in $(seq $M_TOTAL);
                do
                    echo -ne "."
                done
            fi
        done
    else
        echo -ne " is read-only - skipping"
    fi

    echo -ne " ($F_TOTAL replacements) "

done

echo -ne "\n\n"

exit 0
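
One caveat with the script as written: the from and to values are pasted straight into a sed expression, so values containing /, & or regex characters such as . and * will not be treated literally. A possible workaround, offered as a sketch and not part of the script above, is to escape both halves first. The function names escape_pattern and escape_replacement are made up here.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original script.

# Escape characters that are special in a sed basic regex, plus the / delimiter.
escape_pattern() {
    printf '%s\n' "$1" | sed -e 's|[][\.*^$/]|\\&|g'
}

# Escape characters that are special in a sed replacement: backslash, & and the / delimiter.
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's|[\&/]|\\&|g'
}

from=$(escape_pattern "a.b/c")
to=$(escape_replacement "x&y/z")

# Only the literal text a.b/c is replaced; aXb/c is left alone because the dot is escaped.
result=$(printf '%s\n' "a.b/c aXb/c" | sed -e "s/${from}/${to}/g")
echo "$result"
```

In the script, the escaped values would only be used in the sed call. The grep count would need its own treatment, for instance grep -F for fixed strings.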


Whilst there are obviously better languages, such as Python, that one can use to solve such a problem, I always enjoy writing bash scripts. They always remind me of why I love GNU: simple applications that each perform a specific job well and can be strung together to solve complex problems.
