Convert .XLS to tab separated .TXT

By | February 27, 2018
Questions:

Can I somehow convert Excel .XLS file to txt-tsv (tab-separated-values) file, using C#?

Answers:

You may read that XLS file easily via OleDb (ADO.NET provider) and create a StreamWriter object to write data into the Text/TSV file.

using (OleDbConnection cn = new OleDbConnection())
        {
            using (OleDbCommand cmd = new OleDbCommand())
            {
                cn.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @"C:\path\file.xls" + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\";";
                cmd.Connection = cn;
                cmd.CommandText = "select * from [Sheet1$]";
                using (OleDbDataAdapter adp = new OleDbDataAdapter(cmd))
                {
                    DataTable dt = new DataTable();
                    adp.Fill(dt);
                    using (StreamWriter wr = new StreamWriter(@"C:\path\flie.tsv"))
                    {
                        foreach (DataRow row in dt.Rows)
                        {
                            wr.WriteLine(row[0] + "\t" + row[1]);
                        }
                    }
                }
            }
        }

Questions:
Answers:

Using OleDb can be tricky and can cause issues depending on the version of excel the spreadsheet was created on. For example the above example would work with .xls, but not .xlsx. You would have to change your connection string from “Microsoft.Jet.OLEDB.4.0” to “Microsoft.ACE.OLEDB.12.0” in order to compensate. However, it would still not be universal to all excel sheets. I would use Microsoft.Office.Interop.Excel like follows

Microsoft.Office.Interop.Excel.Application myExcel;
Microsoft.Office.Interop.Excel.Workbook myWorkbook;
Microsoft.Office.Interop.Excel.Worksheet worksheet;

myExcel = new Microsoft.Office.Interop.Excel.Application();
myExcel.Workbooks.Open(inputFileName.xls, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value);
myWorkbook = myExcel.ActiveWorkbook;
worksheet = (Microsoft.Office.Interop.Excel.Worksheet)myWorkbook.Worksheets[1];
myWorkbook.SaveAs(outputFileName.txt, Microsoft.Office.Interop.Excel.XlFileFormat.xlTextWindows, Missing.Value, Missing.Value, Missing.Value, false, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value);

myWorkbook.Close(false, Missing.Value, Missing.Value);
myExcel.Quit();

No looping, no bs. Just copy, paste, and change your filenames. The only problem I have seen is sometimes myExcel.Quit() doesn’t seem to work correctly and instances of excel stay open I the background. The way around this is to kill the process manually by your program, but I will save that for a different discussion.

Leave a Reply

Your email address will not be published. Required fields are marked *