Tuesday, March 10, 2015

Loading CSV files with special characters in Oracle DB

I often need to load data from Excel or CSV files into the Oracle Database.

Ever got those annoying question marks when you try to load the data? Or, instead of question marks, you just get blanks where the file uses special characters? Here's an example:


My database character set is UTF-8, so ideally the data you load is UTF-8 encoded as well.
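To see where the question marks come from, here is a minimal Python sketch. Python is only my choice for illustration, and the sample value "café" is made up. It assumes the file was saved as Mac OS Roman and is then read as if it were UTF-8, which is roughly what happens during a load that expects UTF-8.

```python
# The text as typed in Excel (illustrative sample value).
text = "café"

# Saved as Mac OS Roman, "é" becomes the single byte 0x8E.
mac_bytes = text.encode("mac_roman")

# Read back as UTF-8, that byte is not a valid sequence.
# errors="replace" substitutes U+FFFD, the replacement character,
# which many tools display as a question mark.
misread = mac_bytes.decode("utf-8", errors="replace")

print(mac_bytes)
print(misread)
```

The accented character is the only one that breaks, because plain ASCII letters are encoded identically in both character sets.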

In Excel I haven't found an easy way to set the encoding to UTF-8 when saving to a CSV file.
Although I specified UTF-8 in Excel (OSX) under Preferences - General - Web Options - Encoding, it still saves the file as Western (Mac OS Roman).

I use two workarounds to get around the issue. The first is to open the file in a text editor, e.g. BBEdit, click the encoding option and select UTF-8.


The second is to open Terminal and use the iconv command line tool to convert the file:

iconv -t UTF8 -f MACROMAN < file.csv > file-utf8.csv
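If iconv isn't at hand, the same conversion can be sketched in a few lines of Python. This is not part of my usual workflow, just an equivalent under the assumption that the source file really is Mac OS Roman; the function name convert_to_utf8 and its parameters are made up for this example.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "mac_roman") -> None:
    """Re-encode a text file to UTF-8, similar to the iconv command above."""
    # Work on raw bytes so line endings are passed through untouched.
    raw = Path(src).read_bytes()
    # Interpret the bytes in the encoding the file was saved in.
    text = raw.decode(source_encoding)
    # Write the same text back out as UTF-8.
    Path(dst).write_bytes(text.encode("utf-8"))
```

Calling convert_to_utf8("file.csv", "file-utf8.csv") would then do the same job as the iconv line. If the file comes from Excel on Windows, passing "cp1252" as source_encoding is probably the better guess.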

If you get a CSV file and you want to import it into Excel first, the best way I've found is to create a new Workbook and import the CSV file (instead of opening it directly). You can import by using either File - Import or Data - Get External Data - Import Text File. During the import you can specify the File origin and see which data format works for you.


After the manipulations in Excel you can save again as CSV as outlined above to make sure your resulting CSV file is UTF-8 encoded.
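A quick way to double-check the result before loading is to see whether the file decodes cleanly as UTF-8. The sketch below is a hypothetical helper (is_valid_utf8 is a name I made up) and is only a sanity check: a file that contains nothing but plain ASCII passes as well.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Illustrative sample values, not taken from a real file.
print(is_valid_utf8("café".encode("utf-8")))      # True
print(is_valid_utf8("café".encode("mac_roman")))  # False
```

For a real file you would pass in its raw bytes, for example the result of open("file-utf8.csv", "rb").read().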

Finally, to import the data, you can use APEX, SQL Developer or SQLcl to load your CSV file into your table.
