Details

    • Type: Workaround
    • Status: Prepared
    • Affects Version/s: EXASolution 4.2.0, EXASOL 6.0.0, Exasol 6.1.0, EXASolution 5.0.0, Exasol 6.2.x
    • Fix Version/s: None
    • Component/s: 3rd Party Tools, ODBC
    • Labels:
      None
    • Solution:
      Hide

      Note: This solution is no longer maintained. For the latest information, please refer to our documentation:

      https://community.exasol.com/t5/database-features/correct-settings-for-odbc-driver/ta-p/1440

      Background

      Reading special characters with Informatica, beyond the German or English ANSI codepage, results in character strings like "Ã, ¨". 

      Informatica allows us to configure whether the ANSI interfaces or Unicode interfaces of the ODBC driver are to be used.
      On the technical level of the ODBC-API, this means whether to call function SQLPrepareA (ANSI) or SQLPrepareW (Unicode).
      Other examples would be SQLConnectA/SQLExecDirectA versus SQLConnectW/SQLExecDirectW.

      The W-functions are using wchar_t data types. In Microsoft Windows, these are encoded by UTF-16 little-endian (previously, for Windows NT, it was UCS-2). Thus, there should not occur any problems under Windows as long as the Unicode interfaces are used, because each application under Windows should respect this "convention". 

      Prerequisites

      Informatica PowerCenter. Windows 2008R2.  Informatica has to be configured to use Unicode interfaces via UTF-8 encoding. Additionally, it is necessary to enforce the ODBC driver to apply non-standard UTF-8 encoding for its W-interfaces and to avoid any further issues for its the A-interfaces as well.

      How to set ODBC settings for special characters

      Step 1

      This could be done via ODBC connection string:

      AnsiArgEncoding=CP_UTF8;AnsiDataEncoding=CP_UTF8;UnicodeArgEncoding=CP_UTF8;UnicodeDataEncoding=CP_UTF8
      

      Alternatively, the Windows registry can be used to apply this setting on the system level for the Informatica Windows server.
      For an exemplary ODBC DSN "myexaodbcdsn", the following REG file would do the job:

      Windows Registry Editor Version 5.00
      
      [HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\myexaodbcdsn]
      "AnsiArgEncoding"="CP_UTF8"
      "AnsiDataEncoding"="CP_UTF8"
      "UnicodeArgEncoding"="CP_UTF8"
      "UnicodeDataEncoding"="CP_UTF8"
      

      Step 2

      For testing the correct behavior, a vanilla (plain, ordinary)  table can be used.
      The table was created via JDBC, with EXAplus/JDBC/Java there had not been any Unicode problems.

      create schema s;
      open schema s;
      create table t(v varchar(100));
      insert into t values 'öäü';
      insert into t values 'è';     -- http://www.i18nqa.com/debug/bug-utf-8-latin1.html 
      insert into t values '‰€œ';  -- http://www.i18nqa.com/debug/bug-iso8859-1-vs-windows-1252.html
      insert into t values 'ÁÍÏÐÝ'; -- http://www.i18nqa.com/debug/bug-double-conversion.html
      select * from t;
      
      Show
      Note: This solution is no longer maintained. For the latest information, please refer to our documentation: https://community.exasol.com/t5/database-features/correct-settings-for-odbc-driver/ta-p/1440 Background Reading special characters with Informatica, beyond the German or English ANSI codepage, results in character strings like "Ã, ¨".  Informatica allows us to configure whether the ANSI interfaces or Unicode interfaces of the ODBC driver are to be used. On the technical level of the ODBC-API, this means whether to call function SQLPrepareA (ANSI) or SQLPrepareW (Unicode). Other examples would be SQLConnectA/SQLExecDirectA versus SQLConnectW/SQLExecDirectW. The W-functions are using wchar_t data types. In Microsoft Windows, these are encoded by UTF-16 little-endian (previously, for Windows NT, it was UCS-2). Thus, there should not occur any problems under Windows as long as the Unicode interfaces are used, because each application under Windows should respect this "convention".  Prerequisites Informatica PowerCenter. Windows 2008R2.  Informatica has to be configured to use Unicode interfaces via UTF-8 encoding. Additionally, it is necessary to enforce the ODBC driver to apply non-standard UTF-8 encoding for its W-interfaces and to avoid any further issues for its the A-interfaces as well. How to set ODBC settings for special characters Step 1 This could be done via ODBC connection string: AnsiArgEncoding=CP_UTF8;AnsiDataEncoding=CP_UTF8;UnicodeArgEncoding=CP_UTF8;UnicodeDataEncoding=CP_UTF8 Alternatively, the Windows registry can be used to apply this setting on the system level for the Informatica Windows server. For an exemplary ODBC DSN "myexaodbcdsn", the following REG file would do the job: Windows Registry Editor Version 5.00 [HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\myexaodbcdsn] "AnsiArgEncoding"="CP_UTF8" "AnsiDataEncoding"="CP_UTF8" "UnicodeArgEncoding"="CP_UTF8" "UnicodeDataEncoding"="CP_UTF8" Step 2 For testing the correct behavior, a vanilla (plain, ordinary)  table can be used. The table was created via JDBC, with EXAplus/JDBC/Java there had not been any Unicode problems. create schema s; open schema s; create table t(v varchar(100)); insert into t values 'öäü'; insert into t values 'è'; -- http://www.i18nqa.com/debug/bug-utf-8-latin1.html insert into t values '‰€œ'; -- http://www.i18nqa.com/debug/bug-iso8859-1-vs-windows-1252.html insert into t values 'ÁÍÏÐÝ'; -- http://www.i18nqa.com/debug/bug-double-conversion.html select * from t;
    • Category 1:
      3rd Party Tools
    • Category 2:
      Clients, Interfaces & Drivers - ODBC

      Attachments

        Issue Links

          Activity

            People

            • Assignee:
              joas John Askew
              Reporter:
              CaptainEXA Captain EXASOL
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated: