Inserting data in SQL*Plus correctly

When inserting data into the database, it is occasionally forgotten (especially by English-speakers) that we need to take steps to ensure we are inserting data correctly and without unexpected character translation.

For example, in SQL*Plus we need to ensure we set the NLS_LANG environment variable to the correct setting for our database before we initiate SQL*Plus.

Here’s a quick example showing what can go wrong:

[oracle@ORA122 ~]$ echo $NLS_LANG

[oracle@ORA122 ~]$ sqlplus neil/neil
SQL*Plus: Release 12.2.0.1.0 Production on Thu Dec 22 13:56:23 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Last Successful login time: Thu Dec 22 2016 13:46:58 +00:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

NEIL @ pdb1 > create table lang (col1 varchar2(50 CHAR));
Table created.

NEIL @ pdb1 > insert into lang values (q'{J'ai cassé l'insert}')
  2  /
1 row created.

NEIL @ pdb1 > commit;
Commit complete.

NEIL @ pdb1 > select * from lang;

COL1
--------------------------------------------------
J'ai cass? l'insert

As we can see, the accented ” é ” has been lost in translation somewhere. This is not good.

If we set NLS_LANG correctly and repeat the insert, we get a different result:

[oracle@ORA122 ~]$ export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
[oracle@ORA122 ~]$ sqlplus neil/neil
SQL*Plus: Release 12.2.0.1.0 Production on Thu Dec 22 13:56:23 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Last Successful login time: Thu Dec 22 2016 13:46:58 +00:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

NEIL @ pdb1 > insert into lang values (q'{J'ai cassé l'insert}')
  2  /
1 row created.

NEIL @ pdb1 > commit;
Commit complete.

NEIL @ pdb1 > select * from lang;

COL1
--------------------------------------------------------------------------------
J'ai cass� l'insert
J'ai cassé l'insert

Note how the original insert is now returning even more troublesome nonsense that it was previously!

So, how do we know what to set the NLS_LANG variable to avoid this? The answer is partly in the database.

NEIL @ pdb1 > select * from v$nls_parameters 
              where parameter in ('NLS_LANGUAGE','NLS_TERRITORY','NLS_CHARACTERSET');

PARAMETER                                VALUE                    CON_ID
---------------------------------------- -------------------- ----------
NLS_LANGUAGE                             AMERICAN                      3
NLS_TERRITORY                            AMERICA                       3
NLS_CHARACTERSET                         AL32UTF8                      3

Which gives us:

export NLS_LANG=AMERICAN_AMERICA.AL32UTF8

However, this just tells you about the database character set.
That may not be what we are after!

The NLS_LANG should reflect the setting of the operating system character set of the client. In Windows there is no UTF-8 client, so you need to select the correct client code page.

On Linux, check out the value of $LANG

[oracle@ORA122 ~]$ echo $LANG
en_US.UTF-8

Chances are it’s UTF8, so it all matches up.

export NLS_LANG=AMERICAN_AMERICA.UTF8

If it is Windows, well that’s more complex as you need to match the code page of the client correctly. Windows generally matches the code page to the Unicode via its API’s, but you may have different code pages based upon whether you are on the command line (sqlplus) or OEM (SQL Developer)

A list of Windows code pages is held here.

On an English Windows client, the code page is 1252.

If you are using SQL Developer, in Tools => Preferences, check the encoding.

sqldev_encoding

You can find out what command-line code page you are running using “chcp”

Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. All rights reserved.

C:\Users\neil>chcp
Active code page: 850

NOTE: This is the DOS (sqlplus) codepage, not the GUI Codepage.

MS-DOS codepage Oracle Client character set (3rd part of NLS_LANG)
437 US8PC437
737 EL8PC737
850 WE8PC850
852 EE8PC852
857 TR8PC857
858 WE8PC858
861 IS8PC861
862 IW8PC1507
865 N8PC865
866 RU8PC866
set NLS_LANG=american_america.WE8PC850

sqlplus ...

 

It is also worth bearing in mind that the new database characterset default in Oracle 12.2 is now AL32UTF8.

Don’t forget to ensure your terminal session won’t mess up any character translation too. Set it to the correct server character set.

putty_utf8

Happy Inserting!

Use Unicode character sets, and go read the manual on this. It’s not straightforward!

 

%d bloggers like this: