Globalization and Localization...
The .NET Framework provides support for globalization and localization. Localization is the customisation of data and resources for specific 'locales' or languages. Whereas in pre-Internet days it wasn't unusual for an application to be designed for use in a single country, now an Internet application can be accessed by users from anywhere in the world.
A locale is a collection of data and rules specific to a language and geographical area. These include sorting rules, date and time formatting, numeric and monetary conventions and symbols, and character encoding. Localization, then, isn't simply a matter of translating the text of the user interface.
ASP.NET uses Unicode, which makes our life easier. The use of Unicode with culture encodings allows us to tailor response data. The pertinent namespaces for localization are System.Globalization and System.Threading.
When writing an application for multiple locations around the world you could write completely different sets of source code for each location. That wouldn't be very efficient, however, as the applications would overlap heavily. More sensibly, you would write one set of source code and build into the solution the ability to customise the application for different locations. This is achieved in VB.NET via techniques such as locale-aware formatting functions and resource files.
Microsoft divides the process of preparing a 'world ready application' into three phases: globalization, localizability and localization.
The obvious item is that any text should be translated; others are:
Careful judgement should be used to assess how far to proceed with localization: it can become a very time-consuming and expensive enterprise in which the costs start to outweigh the benefits.
The System.Globalization namespace provides most of the .NET support for localizing VB.NET applications. Key concepts are cultures and resource files. A culture is an identifier for a particular locale. A resource file is where you store culture-dependent resources that .NET cannot handle automatically.
Pertinent classes include the various calendar classes (GregorianCalendar, HebrewCalendar, JulianCalendar, etc.) as well as the CultureInfo class. The latter provides all the basic functionality for changing the response encoding format, which in turn controls structures such as the language, writing system and calendar used by a particular culture.
In particular, the CultureInfo class exposes the CurrentCulture and CurrentUICulture properties.
CurrentCulture indicates the culture used for locale-dependent formatting, such as the formatting of dates. CurrentUICulture indicates the culture used for resource lookups.
The current settings can be obtained via the properties:
| CultureInfo.CurrentCulture.EnglishName |
and
| CultureInfo.CurrentUICulture.EnglishName |
You have a host of other information additionally at your disposal via the CultureInfo class, including date and time format, number format and calendar information. Thus you can see that use of the CultureInfo class will be key to localizing applications.
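To make this concrete, here is a minimal console sketch of reading some of that information from a CultureInfo object. The module name and the choice of "fr-FR" are purely illustrative assumptions; the output of the first two lines depends on the machine the code runs on.

```vb
Imports System
Imports System.Globalization

Module CultureInfoDemo
    Sub Main()
        ' The cultures currently in effect (these depend on the machine's settings).
        Console.WriteLine(CultureInfo.CurrentCulture.EnglishName)
        Console.WriteLine(CultureInfo.CurrentUICulture.EnglishName)

        ' "fr-FR" is just an example culture chosen for illustration.
        Dim french As New CultureInfo("fr-FR")
        Console.WriteLine(french.EnglishName)
        Console.WriteLine(french.DateTimeFormat.LongDatePattern)       ' date format information
        Console.WriteLine(french.NumberFormat.CurrencySymbol)          ' monetary symbol
        Console.WriteLine(french.NumberFormat.NumberDecimalSeparator)  ' numeric convention
        Console.WriteLine(french.Calendar.GetType().Name)              ' default calendar class
    End Sub
End Module
```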
There's one more culture to know about: the invariant culture. This is a special culture used where there is no interaction with end users. Its use can be categorised into two areas:
Finally, the .NET Framework handles localization on a thread-by-thread basis: each thread exposes its own CurrentCulture and CurrentUICulture properties via the Thread.CurrentThread object. I don't want to go into further detail regarding threading in this article. Suffice it to say that multithreading support is a good thing, and this is why you'll see and use references to the System.Threading namespace in localization code.
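As a rough sketch of the per-thread behaviour, the console program below switches the current thread's culture and formats the same values under each one, then formats them with the invariant culture for comparison. The module name, the sample values and the three culture names are assumptions made for illustration only.

```vb
Imports System
Imports System.Globalization
Imports System.Threading

Module ThreadCultureDemo
    Sub Main()
        Dim amount As Double = 123456.78
        Dim moment As New DateTime(2024, 3, 15)

        ' The same values are formatted differently as the thread's culture changes.
        For Each name As String In New String() {"en-US", "de-DE", "ja-JP"}
            Thread.CurrentThread.CurrentCulture = New CultureInfo(name)
            Console.WriteLine("{0}: {1} | {2}", name, amount.ToString("c"), moment.ToLongDateString())
        Next

        ' The invariant culture gives a fixed format whatever the thread's culture is.
        Console.WriteLine(amount.ToString("n", CultureInfo.InvariantCulture)) ' 123,456.78
        Console.WriteLine(moment.ToString("d", CultureInfo.InvariantCulture)) ' 03/15/2024
    End Sub
End Module
```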
Time for some code to reinforce the concepts introduced thus far and to see what the allied code looks like. I've implemented an ASP.NET web form to allow you to select a specific culture and view corresponding information about it.
    <%@ Page Language="VB" %>
    <%@ Import Namespace="System.Globalization" %>
    <%@ Import Namespace="System.Threading" %>
    <script runat="server">

        ' Fill the drop-down list with every specific culture on first load.
        Sub Page_Load()
            If Not IsPostBack Then
                Dim ci As CultureInfo
                For Each ci In CultureInfo.GetCultures(CultureTypes.SpecificCultures)
                    ddlCultures.Items.Add(ci.Name)
                Next
            End If
        End Sub

        ' Switch the thread's culture to the selection and show sample formatted values.
        Sub ddlCultures_SelectedIndexChanged(ByVal s As Object, ByVal e As EventArgs)
            Thread.CurrentThread.CurrentCulture = New CultureInfo(ddlCultures.SelectedItem.Text)
            tbCultureName.Text = Thread.CurrentThread.CurrentCulture.EnglishName
            Dim dtNow As Date = DateTime.Now
            Dim dblCurrency As Double = 123456.78
            Dim dblNumber As Double = 12345678.90123
            tbDate.Text = dtNow.ToLongDateString()
            tbCurrency.Text = dblCurrency.ToString("c")
            tbNumber.Text = dblNumber.ToString("n")
        End Sub

    </script>
    <html>
    <head>
    </head>
    <body>
        <form runat="server">
            <p>Select Culture:</p>
            <p>
                <asp:DropDownList id="ddlCultures" runat="server" AutoPostBack="true" OnSelectedIndexChanged="ddlCultures_SelectedIndexChanged">
                </asp:DropDownList>
            </p>
            <p><asp:TextBox id="tbCultureName" runat="server"></asp:TextBox></p>
            <p><asp:TextBox id="tbDate" runat="server"></asp:TextBox></p>
            <p><asp:TextBox id="tbCurrency" runat="server"></asp:TextBox></p>
            <p><asp:TextBox id="tbNumber" runat="server"></asp:TextBox></p>
        </form>
    </body>
    </html>
The above is all fairly straightforward so I’ll leave you to investigate.
You can set the culture for each user based on their system configuration, or you can let them set their culture explicitly. Things become a mite more complicated with ASP.NET, as the .NET Framework normally defaults to the culture currently in use on the system on which it is running. In the web scenario this is, of course, the web server hosting your application, which isn't much good!
You can, however, code your ASP.NET application to retrieve the languages configured in the user's web browser via Request.UserLanguages, with the first language in the returned array being the browser's default language. Thus we have automated culture detection! There are a few caveats, however, that may lead you to opt for a non-automated culture selection solution.
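Caveats aside, the automated detection described above might be sketched as follows. CultureFromUserLanguages is a hypothetical helper, not part of the .NET Framework, and the "en-US" fallback is an assumption. It takes a plain string array so that it can be run outside a web page; in a page you would pass it Request.UserLanguages.

```vb
Imports System
Imports System.Globalization

Module BrowserCultureDemo
    ' Hypothetical helper: picks a culture from an array shaped like Request.UserLanguages.
    Function CultureFromUserLanguages(userLanguages As String(), fallbackName As String) As CultureInfo
        If userLanguages IsNot Nothing AndAlso userLanguages.Length > 0 Then
            ' Entries may carry a quality suffix such as "en;q=0.8"; keep only the language tag.
            Dim preferred As String = userLanguages(0).Split(";"c)(0).Trim()
            Try
                Return CultureInfo.CreateSpecificCulture(preferred)
            Catch ex As ArgumentException
                ' Unrecognised language tag: fall through to the fallback culture.
            End Try
        End If
        Return CultureInfo.CreateSpecificCulture(fallbackName)
    End Function

    Sub Main()
        ' Sample data standing in for Request.UserLanguages.
        Dim languages As String() = {"fr-CA", "en;q=0.8"}
        Dim culture As CultureInfo = CultureFromUserLanguages(languages, "en-US")
        Console.WriteLine(culture.Name) ' fr-CA
    End Sub
End Module
```

Inside a page, the result would typically be assigned to Thread.CurrentThread.CurrentCulture (and CurrentUICulture) early in the request.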
The .NET Framework supports user interface localization through the ability to select a set of user interface resources at runtime. These resources are contained in assembly resource files, which are XML-structured files that contain the pertinent text versions.
Visual Studio .NET provides strong support for working with resource files, allowing you to work directly with assembly resource files. I won't go into great detail here but briefly amongst the items you may add to the project via the 'Add New Item' dialog are resource file templates which are created with the .resx extension. The tool that opens such resource files allows you to enter names and values to identify all of the text strings in the user interface. You would create one of these resource files for each language / culture you wish to support.
In your code you'll need to import the System.Resources namespace. You may then create a ResourceManager object, which retrieves text from the assembly resource files based on the current setting of CurrentUICulture. Use the GetString method of the ResourceManager object to retrieve the text value associated with a name you've previously defined.
You don't have to use assembly resource files, by the way: resources can be stored in text files as well. For those without Visual Studio .NET, the .NET Framework provides the ResGen utility, which takes a .txt or .resx resource file and creates a binary .resources file that can be accessed directly. If you are interested in exploring further, take a look at the SDK documentation.
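The lookup described above can be tried without Visual Studio .NET or ResGen using the sketch below. It writes two small .resources files with ResourceWriter and reads them back through a file-based ResourceManager. The base name "Strings", the key "Greeting", the temp folder and the WriteResources helper are all assumptions made for illustration; a real application would more likely embed compiled .resx files in its assemblies.

```vb
Imports System
Imports System.Globalization
Imports System.IO
Imports System.Resources
Imports System.Threading

Module ResourceDemo
    ' Hypothetical helper: writes a binary .resources file holding one name/value pair.
    Sub WriteResources(fileName As String, greeting As String)
        Using writer As New ResourceWriter(fileName)
            writer.AddResource("Greeting", greeting)
        End Using
    End Sub

    Sub Main()
        Dim folder As String = Path.Combine(Path.GetTempPath(), "ResourceDemo")
        Directory.CreateDirectory(folder)

        ' One default file plus one file for French.
        WriteResources(Path.Combine(folder, "Strings.resources"), "Hello")
        WriteResources(Path.Combine(folder, "Strings.fr.resources"), "Bonjour")

        Dim rm As ResourceManager = ResourceManager.CreateFileBasedResourceManager("Strings", folder, Nothing)

        ' GetString consults CurrentUICulture to decide which file to read.
        Thread.CurrentThread.CurrentUICulture = New CultureInfo("fr-FR")
        Console.WriteLine(rm.GetString("Greeting"))

        Thread.CurrentThread.CurrentUICulture = New CultureInfo("de-DE")
        Console.WriteLine(rm.GetString("Greeting"))

        rm.ReleaseAllResources()
    End Sub
End Module
```

The intent is that the French culture resolves to the French file, while a culture with no file of its own falls back to the default one.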
There are many different approaches to representing the characters of a language as numeric codes within computer memory. For example, the ASCII encoding represents common Latin characters via numeric codes 0 through 127. .NET provides support for this and other encodings via the Encoding class and its relatives in the System.Text namespace.
Windows and .NET support Unicode. The .NET Framework's encoding of choice is UTF-16, which stores text as 16-bit code units. A single code unit can take approximately 65,000 distinct values, and characters beyond that range are represented by pairs of code units. Other Unicode encodings, such as UTF-8 and UTF-32, cover the same character set using different unit sizes.
Hopefully, you won’t need to worry about conversion to other encoding standards. If you do meet this as a requirement, the System.Text namespace provides a number of classes to support conversion to and from UTF-16. When might you need to look at this? If you need to connect or interact with other systems, or their data, which use more restrictive encodings.
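As a small illustration of such a conversion, the sketch below encodes the same four-character string as UTF-16, UTF-8 and ASCII using the System.Text.Encoding class. The module name and sample text are illustrative assumptions; the accented character is built from its code point so that the example does not depend on how the source file is saved.

```vb
Imports System
Imports System.Text

Module EncodingDemo
    Sub Main()
        ' "Café": the é is built from its code point (U+00E9).
        Dim sample As String = "Caf" & Convert.ToChar(&HE9)

        ' Encoding.Unicode is .NET's name for UTF-16 (little-endian).
        Dim utf16Bytes As Byte() = Encoding.Unicode.GetBytes(sample)
        Dim utf8Bytes As Byte() = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes)
        Dim asciiBytes As Byte() = Encoding.ASCII.GetBytes(sample)

        Console.WriteLine(utf16Bytes.Length)                    ' 8: two bytes per character
        Console.WriteLine(utf8Bytes.Length)                     ' 5: é needs two bytes in UTF-8
        Console.WriteLine(Encoding.ASCII.GetString(asciiBytes)) ' Caf? (é has no ASCII code)
    End Sub
End Module
```

The last line shows why a more restrictive encoding needs care: characters it cannot represent are silently replaced.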
A further localization issue that we haven't covered thus far is mirroring. The requirement for mirroring arises because some languages, such as Arabic, are read from right to left rather than left to right. Hence a key facet of mirroring is reversing the presentation of text strings. Ideally, however, you'd reverse (or mirror) the entire UI for complete consistency. .NET provides partial and indirect support for mirroring in ASP.NET via the HTML dir attribute, which is simply added to the document's HTML tag, as follows:
| <HTML dir="rtl"> |
If you do this you will see that partial mirroring occurs: controls will fill from right to left as you enter text, and drop-down lists, buttons, checkboxes etc. will reverse their appearance. The mirroring is imperfect, however, as menus and buttons won't change position automatically and controls will not be mirrored to the opposite position on the form from their initial design. These you will have to implement in code yourself if so desired.
This is our final topic looking at globalization and localization. Further code changes may be required in dealing with different alphabets when it comes to string indexing and data sorting.
String indexing is extracting a single character from a longer string. Remembering that our textual data is held as UTF-16 in .NET, you might think you could simply iterate through the data 16 bits at a time, treating each 16-bit sequence as a separate character for comparison. Unfortunately, matters aren't that simple.
Unicode supports surrogate pairs and combining character sequences: a surrogate pair is two 16-bit code units that together represent a single character, and a combining character sequence is a base character followed by one or more combining characters (such as accents) that are displayed as a single unit.
So, how do we deal with this variability? We use the inherent support provided by the .NET Framework. In this instance the System.Globalization.StringInfo class is designed to iterate through such string elements.
The GetTextElementEnumerator method of the StringInfo class provides a mechanism for iterating through a string that properly handles both surrogate pairs and combining characters. The Current property of the enumerator it returns gives the text element at the current position for comparison.
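A minimal sketch of this iteration follows. The sample string is an assumption chosen to contain one combining character sequence and one surrogate pair, and the module name is illustrative.

```vb
Imports System
Imports System.Globalization

Module TextElementDemo
    Sub Main()
        ' "e" + combining acute accent (U+0301), a surrogate pair (U+1D11E), then "a".
        Dim sample As String = "e" & Convert.ToChar(&H301) & Char.ConvertFromUtf32(&H1D11E) & "a"

        Console.WriteLine(sample.Length) ' 5 UTF-16 code units

        Dim elements As TextElementEnumerator = StringInfo.GetTextElementEnumerator(sample)
        Dim count As Integer = 0
        While elements.MoveNext()
            count += 1
            ' GetTextElement returns the Current element typed as a String.
            Dim element As String = elements.GetTextElement()
            Console.WriteLine("Element {0} starts at index {1} and spans {2} code unit(s)", _
                              count, elements.ElementIndex, element.Length)
        End While

        Console.WriteLine(count) ' 3 text elements
    End Sub
End Module
```

Indexing the string 16 bits at a time would have reported five "characters" here, whereas the enumerator reports the three elements a reader would actually see.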
Different cultures use different alphabetical orders to sort strings, and different cultures also compare strings differently. Fortunately the culture-aware .NET Framework will handle most situations automatically for you via the following features:
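Culture-aware sorting, for instance, can be sketched as follows. The word list, the two culture names and the module name are illustrative assumptions, and the exact ordering produced may vary with the platform's culture data, so no output is shown.

```vb
Imports System
Imports System.Globalization

Module SortingDemo
    Sub Main()
        ' "äpple" is built from a code point so the example does not depend on file encoding.
        Dim words As String() = {"zebra", Convert.ToChar(&HE4) & "pple", "apple"}

        For Each name As String In New String() {"en-US", "sv-SE"}
            Dim culture As New CultureInfo(name)

            ' Sort a copy of the list using that culture's comparison rules.
            Dim sorted As String() = CType(words.Clone(), String())
            Array.Sort(sorted, StringComparer.Create(culture, False))

            Console.WriteLine("{0}: {1}", name, String.Join(", ", sorted))
        Next
    End Sub
End Module
```

Swedish treats "ä" as a separate letter near the end of its alphabet, so the two cultures would be expected to order the same list differently.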
I hope this has provided an overview of some of the issues involved when it comes to globalization and localization of applications and .NET web applications in particular. The above pointers to the involved classes should allow you to implement some of the ideas presented in your .NET code and allow you to develop 'world ready' applications.