Wile Ethelbert from the ACME company needs to compute some statistics about all the values in a list of CSV files in the ~/csv folder.
Wile: "That's easy! Let's start by looping over all the files."
The code he writes is:
for f in $(ls ~/csv/*.csv); do
  echo filename: $f
done
Running the script produces this output:
filename: /Users/wilic/csv/-file
filename: 02.csv
filename: /Users/wilic/csv/file
filename: 99.csv
filename: /Users/wilic/csv/file
filename: **100**.csv
filename: /Users/wilic/csv/file
filename: 01.csv
Wile: "Hmmm. Something is wrong. Why do I see files not ending with .csv? Let's check the contents of the folder."
$ ls ~/csv
-file 02.csv file ?99.csv file **100**.csv file 01.csv
Wile: "Ah! There are files with spaces in their names! I have to double-quote it!"
for f in "$(ls ~/csv/*.csv)" ; do
  echo filename: $f
done
filename: /Users/wilic/csv/-file 02.csv /Users/wilic/csv/file 99.csv /Users/wilic/csv/file **100**.csv /Users/wilic/csv/file 01.csv
Wile: "No no no... This way the whole ls output is treated as a single word, so the loop runs only once! Let's change approach."
for f in $(find ~/csv -type f -name '*.csv') ; do
  echo filename: $f
done
Wile runs it, but the output is disappointing:
filename: /Users/wilic/csv/file
filename: 99.csv
filename: /Users/wilic/csv/file
filename: 01.csv
filename: /Users/wilic/csv/-file
filename: 02.csv
filename: /Users/wilic/csv/file
filename: **100**.csv
Wile: "Hmmm. Nothing has changed. How can I fix it? Hey Road Runner, can you help me please?"
They start working hard on the issue. Their first idea is to set IFS (the Internal Field Separator) to a newline, so that word splitting happens only at line breaks and not at spaces:
IFS=$'\n'
for f in $(ls ~/csv/*.csv) ; do
  echo filename: $f
done
They run it:
filename: /Users/wilic/csv/-file 02.csv
filename: /Users/wilic/csv/file
filename: 99.csv
filename: /Users/wilic/csv/file **100**.csv
filename: /Users/wilic/csv/file 01.csv
Wile: "Hmmm. What's happening? What is 99.csv?"
Road Runner: "I think I understand what's happening! The '?' in the 'file ?99.csv' filename is a newline!"
Wile: "What? A filename can contain a newline?"
Road Runner: "Yeah! A filename can contain EVERYTHING but the NUL character and the slash!"
Wile: "I thought this was an easy job... how can we solve this?"
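The three failed attempts can be reproduced safely in a throwaway directory. The sketch below is only an illustration: the temporary directory, the two sample filenames and the counter variables are assumptions made for the demo, not part of Wile's script. It counts how many words the for loop receives in each case.

```bash
# Throwaway directory with one name containing a newline and one a space.
dir=$(mktemp -d)
touch "$dir/file"$'\n'"99.csv" "$dir/file 01.csv"

# Attempt 1: default IFS (space, tab, newline) splits both names in two.
default_count=0
for f in $(ls "$dir"/*.csv); do
  default_count=$((default_count + 1))
done

# Attempt 2: quoting the substitution yields one single word.
quoted_count=0
for f in "$(ls "$dir"/*.csv)"; do
  quoted_count=$((quoted_count + 1))
done

# Attempt 3: IFS set to newline keeps the space but still breaks
# the name that contains a newline.
old_ifs=$IFS
IFS=$'\n'
newline_count=0
for f in $(ls "$dir"/*.csv); do
  newline_count=$((newline_count + 1))
done
IFS=$old_ifs

echo "default IFS: $default_count, quoted: $quoted_count, newline IFS: $newline_count"
rm -rf "$dir"
```

With two real files, none of the three attempts reports two words, which is why parsing the output of ls cannot be fixed by tuning the splitting rules.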
There are 4 solutions to this issue:
find command with the -exec parameter
find ~/csv -type f -name '*.csv' -exec YOURCOMMAND {} \;
This is very handy if you just have to execute one command on each file. If YOURCOMMAND accepts a list of files, replace \; with + so that find passes many files to each invocation instead of starting the command once per file:
find ~/csv -type f -name '*.csv' -exec YOURCOMMAND {} +
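To see the difference between the two terminators, here is a small sketch that counts how many times the command is started. The temporary directory, the sample filenames and the inline sh -c command standing in for YOURCOMMAND are assumptions made for the demo.

```bash
# Throwaway directory with three sample files (names contain spaces).
dir=$(mktemp -d)
touch "$dir/a 1.csv" "$dir/b 2.csv" "$dir/c 3.csv"

# With \; the command is started once per file, so "run" is printed 3 times.
per_file_runs=$(find "$dir" -type f -name '*.csv' \
  -exec sh -c 'echo run' sh {} \; | grep -c run)

# With + the paths are batched into a single invocation here,
# so "run" is printed only once.
batched_runs=$(find "$dir" -type f -name '*.csv' \
  -exec sh -c 'echo run' sh {} + | grep -c run)

echo "per file: $per_file_runs, batched: $batched_runs"
rm -rf "$dir"
```

In both forms find hands each path to the command as a separate argument, so spaces and newlines in filenames need no special treatment.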
glob
If you need more complex processing, or need to save values in variables for use outside the loop, you can use a bash glob:
for f in ~/csv/*.csv ; do
  [ -e "$f" ] || continue
  echo filename: "$f"
done
The output will be:
filename: /Users/wilic/csv/-file 02.csv
filename: /Users/wilic/csv/file
99.csv
filename: /Users/wilic/csv/file **100**.csv
filename: /Users/wilic/csv/file 01.csv
You may ask: "Why '[ -e "$f" ] || continue'?" Because if there are no .csv files in the folder, bash leaves the pattern unexpanded and the loop runs once with f set to the literal pattern, for example f=/Users/wilic/csv/*.csv. The test checks that the file really exists and skips that bogus iteration.
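The behaviour of an unmatched pattern is easy to verify in an empty directory. The sketch below also shows one possible alternative to the existence test, the bash nullglob option, which is not used in the original loop and is mentioned here only as an assumption about what may suit your script.

```bash
# Empty throwaway directory: the pattern below matches nothing.
dir=$(mktemp -d)

# Default behaviour: the unmatched pattern stays as a literal word,
# so the loop body runs once.
literal_count=0
for f in "$dir"/*.csv; do
  literal_count=$((literal_count + 1))
done

# With nullglob an unmatched pattern expands to nothing,
# so the loop body never runs.
shopt -s nullglob
null_count=0
for f in "$dir"/*.csv; do
  null_count=$((null_count + 1))
done
shopt -u nullglob

echo "without nullglob: $literal_count, with nullglob: $null_count"
rm -rf "$dir"
```

Note that nullglob changes every glob in the shell while it is enabled, which is why the sketch switches it off again right after the loop.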
while and find
This solution is handy if you need to recurse into subdirectories:
while IFS= read -r -d '' f; do
  echo "filename: $f"
done < <(find ~/csv -type f -name '*.csv' -print0)
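Because the loop reads from a process substitution rather than a pipe, it runs in the current shell and variables set inside it survive after done. The following sketch collects the paths into an array; the temporary directory, the sample files and the files array are assumptions made for the demo.

```bash
# Throwaway tree: one name with a newline, one in a subdirectory, one non-CSV.
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/file"$'\n'"99.csv" "$dir/sub/file 01.csv" "$dir/notes.txt"

files=()
# IFS=   keeps leading and trailing whitespace in each name
# -r     keeps backslashes literal
# -d ''  reads up to the NUL byte written by -print0
while IFS= read -r -d '' f; do
  files+=("$f")
done < <(find "$dir" -type f -name '*.csv' -print0)

# The array is still populated here, outside the loop.
echo "found ${#files[@]} csv files"
rm -rf "$dir"
```

NUL is a safe separator precisely because it is a character that can never appear inside a filename.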
globstar
This solution also recurses into subdirectories, but it works only with bash 4 or newer:
shopt -s globstar
for f in ~/csv/**/*.csv; do
  [ -e "$f" ] || continue  # skip the literal pattern when nothing matches
  echo "filename: $f"
done
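Since older shells do not know the globstar option, a script can check the bash version before enabling it. This is a minimal sketch under that assumption; the temporary directory, the sample files and the count variable are made up for the demo.

```bash
# Throwaway tree with CSV files at two different depths.
dir=$(mktemp -d)
mkdir -p "$dir/sub/deep"
touch "$dir/top.csv" "$dir/sub/deep/file 01.csv" "$dir/sub/notes.txt"

count=0
if (( BASH_VERSINFO[0] >= 4 )); then
  shopt -s globstar
  # ** matches zero or more directory levels, so top.csv is included too.
  for f in "$dir"/**/*.csv; do
    [ -e "$f" ] || continue
    count=$((count + 1))
  done
  shopt -u globstar
  echo "found $count csv files"
else
  echo "globstar needs bash 4 or newer" >&2
fi
rm -rf "$dir"
```

On a bash older than 4 the sketch only prints the warning and leaves count at zero.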