Convert windows to unix text file

This article has one objective: converting a Windows text file to a UNIX text file without altering the content or layout of the resulting file.

You may ask why a Windows file needs converting for a UNIX environment like Linux at all. Isn’t Linux all-powerful? However capable the Linux operating system is, files transferred from other computing platforms can still display incorrectly.

Being able to open a file on Linux does not guarantee that its text will be displayed as intended.

You may encounter files whose text is run together on a single giant line, or displayed without the expected line breaks and paragraph spacing.

The most common symptom of a raw Windows file opened on a UNIX system like Linux is a ^M (Ctrl-M) character at the end of every line. This is the carriage return that Windows places before each line feed.
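One way to confirm that a file really carries Windows line endings before converting it is to make the carriage returns visible. The following is a minimal sketch using standard tools; the sample file and the count_crlf_lines helper are made up for illustration.

```shell
# Create a throwaway sample with Windows (CRLF) line endings.
sample="$(mktemp)"
printf 'first line\r\nsecond line\r\n' > "$sample"

# cat -v prints each carriage return as ^M, the same marker many
# terminals and editors show at the end of every line.
cat -v "$sample"

# Hypothetical helper: count the lines that end in a carriage return.
# Note that grep exits non-zero when the count is 0.
count_crlf_lines() {
    grep -c "$(printf '\r')\$" "$1"
}

count_crlf_lines "$sample"   # prints 2 for the sample above
```

A count of zero suggests the file already uses UNIX line endings and needs no conversion.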

Ways to Convert Windows File to Unix Format in Linux

We can achieve the objective of our article through several methods. These methods allow us to convert a Windows file to a UNIX file and still retain the original format of the Windows file.

Convert Windows File to Unix Using dos2unix Command

Depending on your Linux distribution, you can install the dos2unix command-line tool with one of the following commands:

$ sudo apt-get install dos2unix     [On Debian, Ubuntu and Mint]
$ sudo yum install dos2unix         [On RHEL/CentOS/Fedora and Rocky Linux/AlmaLinux]
$ sudo emerge -a sys-apps/dos2unix  [On Gentoo Linux]
$ sudo pacman -S dos2unix           [On Arch Linux]
$ sudo zypper install dos2unix      [On OpenSUSE]    

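By default, dos2unix converts a file in place, so the simplest form overwrites the file you give it:

$ dos2unix Your_Windows_File

To keep the original and write the converted text to a separate file, use the -n (new file) option:

$ dos2unix -n Your_Windows_File Final_Unix_File

So if you have a sample file created on a Windows computing system and want to open it on a Linux computing system without compromising its format, you would use the following command.

$ dos2unix -n windows_readme.txt unix_readme.txt

There is no need to create unix_readme.txt beforehand; dos2unix creates it. Note that without -n, every file name on the command line is treated as a file to convert in place, so dos2unix windows_readme.txt unix_readme.txt would not write anything from the first file into the second.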

Convert Windows File to Unix Using dos2unix

As per the screen capture, your converted Windows file should comfortably adapt to any Unix environment.
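As a rough illustration of the two ways dos2unix can be run, the sketch below builds a throwaway sample in a temporary directory and converts it once into a new file and once in place. The directory and file names are placeholders, and the guard simply skips the demo when dos2unix is not installed.

```shell
# Throwaway working directory and sample file (names are placeholders).
workdir="$(mktemp -d)"
printf 'line one\r\nline two\r\n' > "$workdir/windows_readme.txt"

if command -v dos2unix >/dev/null 2>&1; then
    # New-file mode: the original is left untouched and the result
    # is written to unix_readme.txt, which dos2unix creates itself.
    dos2unix -n "$workdir/windows_readme.txt" "$workdir/unix_readme.txt"

    # In-place mode: the named file itself is rewritten.
    dos2unix "$workdir/windows_readme.txt"
else
    echo "dos2unix is not installed; see the install commands above" >&2
fi
```

New-file mode is the safer choice while experimenting, since the Windows original stays available for comparison.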

Using AWK Command – Convert Windows File to Unix

The awk command is pre-installed on all modern full-fledged UNIX computing systems like Linux. To convert our sample Windows file, we would implement the awk command in the following manner:

$ awk '{ sub(/\r$/, ""); print }' windows_readme.txt > new_unix_readme.txt

Awk Convert Windows File to Unix

As you have noted, with the awk command we don’t need a pre-existing blank Linux file to accommodate the converted Windows file. The shell’s > redirection creates the Unix version of the file, and awk’s output populates it.
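To see what the awk approach does and does not touch, a small self-contained check can help. This sketch uses made-up file names in a temporary directory; it is meant to show that only a carriage return at the very end of a line is removed, while one in the middle of a line is left alone.

```shell
# Throwaway working directory; all names here are placeholders.
workdir="$(mktemp -d)"

# Two CRLF-terminated lines, the second with a stray CR inside "be\rta".
printf 'alpha\r\nbe\rta\r\n' > "$workdir/windows_sample.txt"

# Strip a carriage return only where it sits at the end of a line.
awk '{ sub(/\r$/, ""); print }' "$workdir/windows_sample.txt" \
    > "$workdir/unix_sample.txt"

# Dump the result byte by byte to inspect which characters remain.
od -c "$workdir/unix_sample.txt"
```

In the dump, each line should now end in a plain \n, and the \r inside the second line should still be present.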

Using tr Command – Convert Windows File to Unix

If most of the Windows files you open on your Linux environment have unnecessary Ctrl-Z and carriage return characters, then you will appreciate what the tr command has to offer.

Supposing our sample Windows file is a victim of such characters, removing them requires the following command, where \15 and \32 are the octal codes for carriage return and Ctrl-Z:

$ tr -d '\15\32' < windows_readme.txt > polished_unix_readme.txt

tr Convert Windows File to Unix

As with awk, the shell’s output redirection creates the resulting UNIX file, so there is no need to create it beforehand.
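One difference worth knowing is that tr deletes the listed characters wherever they occur, not just at line ends. The sketch below, again with placeholder file names, feeds tr a sample containing CRLF endings, a stray carriage return in the middle of a word, and a trailing Ctrl-Z.

```shell
# Throwaway working directory; all names here are placeholders.
workdir="$(mktemp -d)"

# CRLF endings, a stray CR inside "be\rta", and a final Ctrl-Z (octal 032).
printf 'alpha\r\nbe\rta\r\n\032' > "$workdir/windows_sample.txt"

# Delete every carriage return (octal 15) and every Ctrl-Z (octal 32).
tr -d '\15\32' < "$workdir/windows_sample.txt" > "$workdir/polished_sample.txt"

# Dump the result byte by byte to inspect which characters remain.
od -c "$workdir/polished_sample.txt"
```

Here the stray carriage return inside the second line is removed as well. That is usually harmless for plain text, but it is a reason to prefer an end-of-line-only approach when carriage returns elsewhere in a file are meaningful.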

The flexibility of the three discussed approaches to converting any editable Windows file to UNIX file format should save you from the headaches of having to manually edit your downloaded or transferred Windows files to remove unwanted characters and spaces while on a Linux environment.

A Quick Intro to End-Of-Line

Most people don’t realise that when they hit the Enter key to create a new paragraph in a text file, something very different is going on behind the scenes in the three major operating systems: Windows, Macintosh and Linux. The “end-of-line delimiter” (often expressed as “End-Of-Line”, “End of Line”, or just “EOL”), which some of you know as the “line break” or “newline”, is a special character or pair of characters used to designate the end of a line within a text file.

UNIX-based operating systems (like all Linux distros and BSD derivatives) use the line feed character (\n or <LF>), “classic” Mac OS uses a carriage return (\r or <CR>), while DOS/Windows uses a carriage return followed by a line feed (\r\n or <CR><LF>). Since Mac OS X is itself UNIX-based, with BSD roots, it follows the UNIX convention.
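The three conventions can be told apart by counting the carriage returns and line feeds in a file. The eol_style function below is a hypothetical helper written for this illustration; it is a rough heuristic rather than a full detector, and it reports anything that does not fit a clean pattern as mixed.

```shell
# Hypothetical helper: guess the line-ending convention of a file
# by comparing how many CR and LF bytes it contains.
eol_style() {
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    if [ "$cr" -eq 0 ] && [ "$lf" -gt 0 ]; then
        echo "unix (LF)"
    elif [ "$lf" -eq 0 ] && [ "$cr" -gt 0 ]; then
        echo "classic mac (CR)"
    elif [ "$cr" -eq "$lf" ] && [ "$cr" -gt 0 ]; then
        echo "dos/windows (CRLF)"
    else
        echo "mixed or no line endings"
    fi
}

# Try it on three tiny samples written to a temporary file.
tmp="$(mktemp)"
printf 'a\r\nb\r\n' > "$tmp"; eol_style "$tmp"
printf 'a\nb\n'     > "$tmp"; eol_style "$tmp"
printf 'a\rb\r'     > "$tmp"; eol_style "$tmp"
```

Because it only compares counts, a file with equal numbers of unpaired CR and LF bytes would be misreported as CRLF; for everyday text files that case is unlikely.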

Now, the reason most people don’t know about all this is because nobody really should have to. But while users of Linux distros and Mac OS can open Windows text files in basically any available editor and not even know the difference, the same can’t be said for Windows users opening files created in one of the other operating systems.

If you type up a simple text file in Ubuntu and save it in the default “Unix/Linux” format, in Windows it will appear as one continuous paragraph, with black squares where the line breaks (or new paragraphs) should be. While you can open the file in a more advanced text editor (or proper word processor) to view it as it should look, others you’ve sent it to are likely to just double-click it and let it open in Notepad (which, in older versions of Windows, can only handle MS-DOS EOL).

Occasionally, the reverse is the issue, but you can convert Windows text files to UNIX easily with Gedit, as well as convert them via the terminal, so hopefully the following guide will be of use.

For more detailed info on End-Of-Line, go to the Wikipedia page.

Or if you’re wanting to do the reverse, check out how to convert to Windows format via the terminal and with Save As… in Gedit.

Converting Windows EOL to Linux via the Terminal

If you find the text editor you’re using to display Windows files in Ubuntu shows ^M instead of a line break (not very likely with even the most lightweight text editors, but something you’ll probably come across if you display the text in a terminal), don’t worry – just convert them to Unix/Linux format.

While you can actually open them in Gedit and use Save As… to save over them (or to create copies) in the correct format, for more than a couple of files this would be the long, complicated solution.

By far the quickest and easiest approach is to convert the offending files via the command line. This way, you can batch-convert hundreds of such files at once rather than handling them one by one.

There are actually quite a few ways to do this, but we’ll look at a couple of tiny packages you can install, and the related commands to use.

The first, the tofrodos package, is undoubtedly the most widely used, so we’ll look at it in detail, especially as many of the guides out there are outdated now that the commands it contains have been renamed.

The second is a little package called flip, and since it’s tiny and won’t cause any issues, it’s worth installing as a backup. (I found it useful after trying to get tofrodos going on a new system, before I found out the commands had changed.)

There is no actual command tofrodos, as it is just the package that contains the commands todos and fromdos. Currently, the vast majority of online guides will list the commands as unix2dos and dos2unix, but as the developer states:

With this release the symlinks “unix2dos” and “dos2unix” are dropped from the package. This will allow the introduction of the original dos2unix package, which also supports conversion to MacOS style files.

So now you can choose to use either todos (to convert to Windows) or fromdos (to convert to Linux), or just fromdos with options (fromdos -u to convert to DOS, and fromdos -d to convert to UNIX, though obviously the -d option really isn’t needed, as it is the default behaviour for the fromdos command).

We’ll use fromdos, as it is easier to remember, and show how to alter a single file, or all text files in a given folder. When you’re ready to proceed, open a terminal in the folder containing the text file(s) and use one of the following commands (note that for the purpose of illustration, the .txt suffix is used, but you can specify any other extension for your text files).

To Convert to UNIX/Linux via Terminal:

Single file (remember to replace filename.txt with the actual name of the file)

fromdos filename.txt

All text files in a folder (if the extension differs to .txt, simply replace it in the command)

fromdos *.txt

Similarly, flip is easy to use:

flip -u filename.txt (or flip -u *.txt for multiple files)
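The *.txt wildcard only covers the current folder. For a whole directory tree, something along the following lines could be used. This is a sketch under a few assumptions: convert_tree is a hypothetical helper, the demo folder layout is invented, file names are assumed not to contain newlines, and when fromdos is not installed the helper falls back to tr, which deletes every carriage return rather than only those at line ends.

```shell
# Hypothetical helper: convert every file under DIR ($1) whose name
# matches PATTERN ($2), e.g. convert_tree . '*.txt'
convert_tree() {
    find "$1" -type f -name "$2" | while IFS= read -r f; do
        if command -v fromdos >/dev/null 2>&1; then
            fromdos "$f"
        else
            # Fallback when tofrodos is not installed (an assumption
            # of this sketch): delete all carriage returns with tr.
            tmp="$(mktemp)" || continue
            tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
            rm -f "$tmp"
        fi
    done
}

# Demo on a throwaway directory tree with two Windows-style files.
demo="$(mktemp -d)"
mkdir -p "$demo/docs/notes"
printf 'one\r\ntwo\r\n' > "$demo/docs/a.txt"
printf 'three\r\n'      > "$demo/docs/notes/b.txt"
convert_tree "$demo" '*.txt'
```

Quoting the pattern keeps the shell from expanding it before find sees it, which is what lets the match apply in every subfolder.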

Converting Windows EOL to Linux with Gedit

It’s actually very easy to convert text files with Windows EOL to Unix/Linux in Ubuntu using the default Text Editor, Gedit. Simply open the files, choose Save As…, go to Line Ending in the dialogue box and choose Unix/Linux instead of Windows. While that is easy enough, for more than one or two you’d really want to save yourself some time and hassle and perform a batch-conversion via the terminal.

Any good text editor on Windows supports saving text files with just line-feed as line termination.

For an automated conversion of text files from DOS/Windows to UNIX line endings, the batch file JREPL.BAT written by Dave Benham can be used. It is a batch file / JScript hybrid that runs a regular expression replace on a file using JScript, and it works even on Windows XP.

A single file can be converted from DOS/Windows to UNIX using for example:

jrepl.bat "\r" "" /M /F "Name of File to Modify" /O -

In this case all carriage returns are removed from the file to modify. It would of course also be possible to use "\r\n" as the search string and "\n" as the replace string, so that only a carriage return immediately before a line-feed is removed. This matters if the file also contains carriage returns elsewhere that should not be removed when converting the line terminators.

Multiple files of a directory or an entire directory tree can be converted from DOS/Windows to UNIX text files by using command FOR to CALL batch file JREPL.BAT on each file matching a wildcard pattern.

Batch file example to convert all *.sh files in current directory from DOS/Windows to UNIX.

@for %%I in (*.sh) do @call "%~dp0jrepl.bat" "\r" "" /M /F "%%I" /O -

The batch file JREPL.BAT must be stored in same directory as the batch file containing this command line.

To understand the commands used and how they work, open a command prompt window, run the following commands there, and carefully read all of the help pages displayed for each one.

  • jrepl.bat /?
  • call /?
  • for /?
