2009-04-16

Perl one liner for dos2unix

We have all come across this situation: you copy a bunch of files from DOS/Windows to Unix and the newline characters are all messed up (in other words, you see a "^M" at the end of each line). The scripts you lovingly wrote are all broken now.

There are two choices:

1. Use dos2unix - this will change the newline characters one file at a time. The trouble is that dos2unix (at least in some versions) writes to stdout, so you'd have to redirect the output to a temporary file and move it back over your script...

2. Use the following perl one-liner:

perl -pi -e 's/\r\n/\n/g' *

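You guessed it - Windows/DOS uses "\r\n" for newline, while Unix uses only "\n". The -p switch runs the substitution on every line and prints the result, and -i writes the output back to the same file.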

10 comments:

  1. Anonymous, 7:36 am

    very useful

  2. Anonymous, 11:58 pm

    Now how to make it recursive?

  3. Anonymous, 6:46 pm

    It's already recursive because of the "g" on the end (global). Without the "g", it would only replace the first instance of \r\n with \n.

    Replies
    1. It changes all lines in the file, but it isn't recursive: it doesn't change all files in the directory and its sub-directories, only the files matched by "*".
      It changes all lines because of the '-p' switch:
      $ perl -MO=Deparse -e 's/\r\n/\n/'
      s/\r\n/\n/;
      $ perl -MO=Deparse -pe 's/\r\n/\n/'
      LINE: while (defined($_ = <ARGV>)) {
          s/\r\n/\n/;
      }
      continue {
          die "-p destination: $!\n" unless print $_;
      }

  4. Anonymous, 2:00 pm

    Holy thread revival!!!

    Might be a suggestion to throw a $ in there, just to make sure it only acts on the end of each line.

    I've seen a few examples of dos2unix (older version, admittedly) where it stripped what it thought were Windows control characters from a UTF8 file, when in fact they were content.

  5. Anonymous, 4:19 pm

    "Now how to make it recursive?"

    find tree_root -type f -print0 | xargs -0 perl -pi -e 's/\r\n/\n/g'

    Here tree_root is the directory you want to start in; use . for the current directory. To only do certain filenames:

    find tree_root -type f -name '*.ext' -print0 ...

    The quotes are to avoid having bash apply the glob in your current directory.

  6. Anonymous, 8:38 pm

    perl -MExtUtils::Command -e dos2unix file

  7. Nice, great article. Thanks.
