If I had to vote for the computer problem with the worst ratio of time spent on diagnosis to the ease of actually fixing it, I would definitely choose the line termination issue.
It’s the simplest thing to understand, in principle. Different operating systems have different conventions in how they format their files, in particular plain text files. If you open up a plain text file in an editor application like Notepad on Windows, you’ll see the content as simple text characters without any formatting. There are also “invisible” characters, such as tab and “newline”, which affect the formatting of the text but aren’t really text themselves.
Windows uses two invisible characters, called CR (Carriage Return) and LF (Line Feed) to represent each new line. This dates back to the days when computers didn’t have monitors, but instead literally printed their output onto spools of paper. CR+LF were instructions to the printer to bring the printer head back to the origin, and feed the paper forward, in order to begin printing a new line of text.
Unix/Linux and Mac OS X have a different convention. They use only one character, LF, to represent a new line. (Even more confusingly, the classic Mac OS used only CR.) Therefore, when you have a text file from a Windows system on your Linux system, you’ll have to convert the line endings first, or weird things will happen. At first you might have no idea what is going on, because newline characters are invisible and the file will look completely normal when you open it in a text editor. Instead you’ll blame yourself, because 99% of the time when things don’t work, the cause is some other fault in your code, like a typo in your elaborate regular expression.
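One way to make the invisible characters visible is to dump the raw bytes of a file. The following is only a minimal sketch, assuming a POSIX-style shell with od and mktemp available; the sample content is made up for the demonstration.

```shell
# Create a small throwaway file with Windows-style (CR+LF) line endings.
demo=$(mktemp)
printf 'one\r\ntwo\r\n' > "$demo"

# od -c prints every byte as a character, so CR shows up as \r and LF as \n.
od -c "$demo"

rm -f "$demo"
```

In the dump, each line of the sample should end in \r followed by \n, whereas a file with Unix line endings would show only \n.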
Fortunately, it’s happened to me often enough that when things don’t work as they should on the command line, one of the first things I do is check the file type. In Linux this is easily done with:
file FILENAME.txt
You should expect to get something like:
FILENAME.txt: ASCII text
But if you see this:
FILENAME.txt: ASCII text, with CRLF line terminators
Then the file probably came from a Windows system and needs to be converted before you work with it on Linux/Unix/Mac OS X.
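If you would rather have a number than a description, you can count the lines that end in CR. This is a rough sketch, assuming a POSIX-style shell and grep; count_crlf is a helper name I made up for illustration, not a standard utility.

```shell
# count_crlf (hypothetical helper): print how many lines of file $1 end in CR.
count_crlf() {
    cr=$(printf '\r')          # a literal CR character
    grep -c "${cr}\$" "$1"     # match CR immediately before end of line
}

demo=$(mktemp)
printf 'one\r\ntwo\r\nthree\n' > "$demo"
count_crlf "$demo"             # prints 2 for this sample
rm -f "$demo"
```

A count of zero suggests the file already uses Unix line endings. A count lower than the total number of lines points to mixed line endings, which is worth knowing before you convert.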
It’s easy to fix if you have the dos2unix utility that’s bundled on many Linux systems.
dos2unix FILENAME.txt # Silently overwrite original
dos2unix -n FILENAME.txt NEWFILE.txt # Write to new file, keep original
The reverse can be done with unix2dos.
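Where unix2dos isn’t installed either, the reverse direction can be approximated with awk. This is just a sketch, assuming the input really has LF-only line endings (a file that already has CR+LF would end up with doubled CRs); to_crlf is a made-up helper name.

```shell
# to_crlf (hypothetical helper): read LF-terminated text on standard input
# and write the same lines terminated with CR+LF.
to_crlf() {
    awk '{ printf "%s\r\n", $0 }'
}

# Dump the converted bytes to confirm that each line now ends in \r \n.
printf 'one\ntwo\n' | to_crlf | od -c
```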
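If you don’t have dos2unix, you can use sed. The escape character \r represents CR and $ anchors the match to the end of the line, so the following simply means “remove the CR character at the end of each line”. (This relies on GNU sed, the default on Linux; the sed shipped with Mac OS X may handle -i and \r differently.)
sed -i 's/\r$//' FILENAME.txt # Overwrite original
sed 's/\r$//' FILENAME.txt > NEWFILE.txt # Write to new file, keep original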
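On systems where sed doesn’t understand \r, tr is a possible fallback. The sketch below assumes a POSIX-style shell; strip_cr is a hypothetical helper name. Note that it deletes every CR in the file, not only those at the end of a line, which is usually fine for ordinary text files.

```shell
# strip_cr (hypothetical helper): copy file $1 to $2 with every CR byte removed.
# The output path must differ from the input path, or the file gets truncated.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

src=$(mktemp)
dst=$(mktemp)
printf 'one\r\ntwo\r\n' > "$src"
strip_cr "$src" "$dst"
od -c "$dst"                   # the dump should no longer contain \r
rm -f "$src" "$dst"
```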
For further reading, see the manual page of dos2unix (man dos2unix).