Passing foreign language characters to/from a database

I am trying to allow users to enter Hebrew characters into certain fields in an HTML form (processed using Java). I did some research, and it appears that the following tag needs to be part of the HTML document:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

With that in place, I get the following result: when the user enters Hebrew text in the input field, it saves and displays on the screen properly, in Hebrew. However, if I view the data in the database, it is unintelligible. Furthermore, if I try to output it to a file (using iText), it is gibberish. Conversely, if I enter the data directly into the database, it is readable in Hebrew in the database and in the output file, but it is gibberish on the screen.

Sample: If the user entered it in the browser, it appears like this: עִבְרִית

The same string, when entered directly into the database, appears like this on the screen: �Ѱ���

When looking at the database, the browser-entered string looks like this: ×¢Ö´×ְרִ×ת

The manually entered string appears like this: עִבְרִית (although it appears left-to-right in the database, whereas Hebrew is a right-to-left language; when copied and pasted here, it appears correctly, right-to-left)

Obviously, the database and the browser are not "talking" the same language when it comes to encoding. I am using SQL Server and did not make any changes to the database, other than ensuring that the field in question is defined as an nvarchar field. What am I missing?


It sounds like the database encoding is not set correctly. If the database is only expecting ISO-8859-1 (a common default encoding), it will try to interpret the UTF-8 bytes as ISO-8859-1. This rarely works: each multi-byte UTF-8 character ends up as two or more unrelated single-byte characters.
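To illustrate the kind of mismatch described above, here is a minimal Java sketch that decodes UTF-8 bytes as ISO-8859-1 and then reverses the mistake. The class and method names (`MojibakeDemo`, `misdecode`, `repair`) are made up for the example, and it only simulates the symptom; it does not show where in your particular stack the wrong decoding happens.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Simulates the suspected bug: the text is encoded as UTF-8,
    // but some layer reads those bytes as if they were ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the original bytes via ISO-8859-1,
    // then decode them correctly as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Ivrit" in Hebrew letters, written with escapes so the source
        // file's own encoding cannot affect the demo.
        String hebrew = "\u05E2\u05D1\u05E8\u05D9\u05EA";

        String garbled = misdecode(hebrew);

        // Each Hebrew letter is 2 bytes in UTF-8, so 5 letters become 10 chars.
        System.out.println(garbled.length());               // prints 10
        System.out.println(repair(garbled).equals(hebrew)); // prints true
    }
}
```

The first Hebrew letter, U+05E2, is the byte pair D7 A2 in UTF-8; read as ISO-8859-1, those bytes are "×¢", which matches the start of the string seen in the database. That is consistent with the form data being decoded with a single-byte charset somewhere before it reaches the nvarchar column, though the exact layer would still need to be confirmed.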

Here is an article from MS on the issue:


