Unicode and ANSI string operations

Windows applications use Unicode extensively.

Unicode is a worldwide character-encoding standard, used for string and character manipulation.

The differences between the related formats are covered in the post linked below:

Unicode, UTF, ASCII, ANSI format differences

Size:

Wide character (wchar_t / WCHAR on Windows): 2 bytes

ASCII character (char): 1 byte
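To make the size difference concrete, here is a small sketch in portable C. It assumes the Windows convention of a 2-byte wide character and models it with uint16_t, because wchar_t is 4 bytes on most non-Windows compilers. The names utf16_unit, narrow_bytes and wide_bytes are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stands in for a Windows WCHAR (one 2-byte UTF-16 code unit).
   Plain wchar_t is avoided because its size differs between platforms. */
typedef uint16_t utf16_unit;

/* Bytes needed to store n characters plus the terminator as char. */
size_t narrow_bytes(size_t n)
{
    assert(n < SIZE_MAX);                        /* n + 1 must not overflow */
    return (n + 1) * sizeof(char);
}

/* Bytes needed for the same text as 2-byte wide characters.
   Characters that need two UTF-16 units are ignored in this sketch. */
size_t wide_bytes(size_t n)
{
    assert(n < SIZE_MAX / sizeof(utf16_unit));   /* guard the multiplication */
    return (n + 1) * sizeof(utf16_unit);
}
```

For the 19-character path C:\Windows\System32, narrow_bytes(19) is 20 and wide_bytes(19) is 40.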

Problem:

String manipulation can sometimes cause problems in code.

When code uses both the wide-character and the ASCII character set, string operations such as copy, append, and initialize can go wrong, because the two kinds of string have different element sizes and are handled by different functions.

To support converting old ASCII code to Unicode, Microsoft took the TCHAR route.

  • It works according to the UNICODE identifier.
      • If the UNICODE identifier is defined, TCHAR works as wchar_t (wide character).
      • If the UNICODE identifier is not defined, TCHAR works as char (ASCII character).

Printing messages in an application:

  • We often use printf in an application to print messages. But when WCHAR and char are both in use, it sometimes does not print the message properly.
  • For example,

If the Unicode character set is selected,

#define MAX_BUFFER_SIZE 32767

WCHAR buf[MAX_BUFFER_SIZE];

GetSystemDirectory(buf, MAX_BUFFER_SIZE);

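printf("Buffer= %s\n", buf); // wrong for a WCHAR buffer: %S should be used

The code above prints only the first character, because in a wide string the byte after 'C' is zero and printf with %s treats it as the end of the string: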

Buffer= C

To print the correct output, change %s to %S, which in Microsoft's printf means a wide-character string:

printf("Buffer= %S\n", buf);

That gives the output:

Buffer= C:\Windows\System32
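Why only "C" comes out can be reproduced without any Windows API. The sketch below assumes little-endian UTF-16, which is how a WCHAR buffer is stored on Windows, and lays the bytes out by hand. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store an ASCII string as UTF-16LE bytes, the way a
   WCHAR buffer is laid out in memory on Windows. Each character becomes
   its ASCII byte followed by a zero byte. */
void ascii_to_utf16le(const char *src, unsigned char *dst, size_t dst_size)
{
    size_t n = strlen(src);
    assert(dst_size >= 2 * (n + 1));   /* text plus a 2-byte terminator */
    for (size_t i = 0; i <= n; i++) {  /* <= copies the terminator too */
        dst[2 * i]     = (unsigned char)src[i];
        dst[2 * i + 1] = 0;
    }
}

/* What a narrow %s sees: the bytes up to the first zero byte. */
size_t length_seen_by_narrow_printf(const unsigned char *utf16le)
{
    return strlen((const char *)utf16le);
}
```

Running "C:\\Windows\\System32" through ascii_to_utf16le produces the bytes 'C', 0, ':', 0 and so on, so the narrow view of the buffer is one character long.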

We therefore have to keep track of whether the Unicode character set is selected and write the function calls accordingly.

Use of TCHAR to overcome this problem:

The example above can be rewritten as shown below. It relies on the following definitions:

#ifdef UNICODE
typedef WCHAR TCHAR;
#else
typedef char TCHAR;
#endif

#ifdef UNICODE
#define __TEXT(x) L##x
#else
#define __TEXT(x) x
#endif

#define TEXT(x) __TEXT(x)
typedef TCHAR *LPTSTR;
typedef const TCHAR *LPCTSTR;

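These definitions must be in place at the start of the code for the type conversion to work. <windows.h> and <tchar.h> already provide them, so in practice including those headers is enough.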

#define MAX_BUFFER_SIZE 32767

TCHAR buf[MAX_BUFFER_SIZE];

GetSystemDirectory(buf, MAX_BUFFER_SIZE);

_tprintf(TEXT("Buffer = %s\n"), buf);

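_tprintf, declared in <tchar.h>, works the same as printf and maps to wprintf or printf depending on whether _UNICODE is defined. The C runtime checks _UNICODE while the Windows headers check UNICODE, so the two should be defined together.

If UNICODE is defined, TCHAR works as WCHAR; otherwise it works as char.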
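The same switch applies to function names, not only to the character type. The sketch below imitates that mechanism in portable C so it can be built without the Windows headers. MY_UNICODE, my_tchar, MY_TEXT, my_tcslen and label_length are hypothetical stand-ins, not real Windows names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for UNICODE/_UNICODE, TCHAR, TEXT and _tcslen. */
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_TEXT(s) L##s
#define my_tcslen  wcslen
#else
typedef char my_tchar;
#define MY_TEXT(s) s
#define my_tcslen  strlen
#endif

/* Written once, compiled for either character type. */
size_t label_length(const my_tchar *label)
{
    assert(label != NULL);
    return my_tcslen(label);   /* wcslen or strlen, chosen at compile time */
}
```

A call such as label_length(MY_TEXT("Buffer = ")) compiles unchanged in both configurations; only the macro definition decides which string type and which length function are used.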

Just like printing, other string operations such as copy and concatenate can be done easily.

Copy:

TCHAR path[MAX_BUFFER_SIZE];

_tcscpy(path,buf);

Concat:

_tcscat(path, TEXT("\\browser.dll"));
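Neither _tcscpy nor _tcscat checks the size of the destination. As a sketch of the same copy-then-append with an explicit bound, here is a portable version written directly against wcscpy and wcscat, which is what the _tcs names map to in a Unicode build. The function name build_path and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: writes dir followed by suffix into dst.
   dst_count is the capacity in characters, not bytes.
   Returns 1 on success, 0 if the result would not fit. */
int build_path(wchar_t *dst, size_t dst_count,
               const wchar_t *dir, const wchar_t *suffix)
{
    assert(dst != NULL && dir != NULL && suffix != NULL);

    size_t needed = wcslen(dir) + wcslen(suffix) + 1; /* +1 for terminator */
    if (needed > dst_count)
        return 0;            /* refuse instead of overrunning dst */

    wcscpy(dst, dir);        /* Unicode-build counterpart of _tcscpy */
    wcscat(dst, suffix);     /* Unicode-build counterpart of _tcscat */
    return 1;
}
```

In a Windows project the bounded routines _tcscpy_s and _tcscat_s from <tchar.h> serve the same purpose.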

Please share any suggestions so that I can make this even better. Hope this helps!
