You are currently browsing the monthly archive for December 2009.
Yeah, winter’s back. Apart from the really short days, I am really psyched.
I’d like to thank WordPress for the falling snow option, which helps everyone get in the spirit of the fun white fluffy stuff.
Sledding anyone?
Unicode Characters converted to ASCII string
I was hacking together a report today and discovered that the text I received was actually Unicode, not ASCII.
Basically I have this: こんにちは
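By using AscW(Char) you can convert a Unicode character into an integer value. Wrap that value in the &# and ; delimiters and you have an HTML numeric character reference. It isn’t perfect: AscW returns a signed 16-bit Integer, so characters with a code of 32768 or above come back as a negative number, which isn’t allowed in a reference. There is an easy workaround, adding 65536 to the negative value, which is explained in the Microsoft KB article linked in the code comment. It is used below.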
Public Function UnicodeToAscii(sText As String) As String
    Dim x As Long, sAscii As String, ascval As Long

    If Len(sText) = 0 Then
        Exit Function
    End If

    sAscii = ""
    For x = 1 To Len(sText)
        ascval = AscW(Mid(sText, x, 1))
        If (ascval < 0) Then
            ascval = 65536 + ascval ' http://support.microsoft.com/kb/272138
        End If
        sAscii = sAscii & "&#" & ascval & ";"
    Next
    UnicodeToAscii = sAscii
End Function
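To see what the function produces, here is a small sketch that calls it from the Immediate window. The Sub name DemoUnicodeToAscii is made up for illustration, and the greeting is built with ChrW$ from its code points so the example does not depend on the editor's code page.

```vb
' Hypothetical demo routine; assumes UnicodeToAscii from this post is in the same project.
Public Sub DemoUnicodeToAscii()
    Dim sHello As String

    ' こんにちは, built from the code points of its five characters
    sHello = ChrW$(12371) & ChrW$(12435) & ChrW$(12395) & ChrW$(12385) & ChrW$(12399)
    Debug.Print UnicodeToAscii(sHello)
    ' Prints: &#12371;&#12435;&#12395;&#12385;&#12399;

    ' U+FF21 (fullwidth A) is above 32767, so AscW returns -223 for it;
    ' the function adds 65536 to arrive at 65313.
    Debug.Print UnicodeToAscii(ChrW$(65313))
    ' Prints: &#65313;
End Sub
```

The second call exercises the negative-number branch, which the Japanese greeting never reaches because all of its code points are below 32768.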
Now let’s go the other way: ASCII string to Unicode
And I want this: こんにちは
I remembered that ChrW(int) converts a character code to its associated character. I really wasn’t in the mood to write parsing logic and test it, but luckily I came across a class which does this. I ripped out the method I needed and it worked great in all its simplicity. I have included this function below:
Public Function AsciiToUnicode(sText As String) As String
    Dim saText() As String, sChar As String
    Dim sFinal As String, saFinal() As String
    Dim x As Long, lPos As Long

    If Len(sText) = 0 Then
        Exit Function
    End If

    saText = Split(sText, ";") 'Unicode Chars are semicolon separated
    If UBound(saText) = 0 And InStr(1, sText, "&#") = 0 Then
        AsciiToUnicode = sText
        Exit Function
    End If

    ReDim saFinal(UBound(saText))
    For x = 0 To UBound(saText)
        lPos = InStr(1, saText(x), "&#", vbTextCompare)
        If lPos > 0 Then
            sChar = Mid$(saText(x), lPos + 2, Len(saText(x)) - (lPos + 1))
            If IsNumeric(sChar) Then
                If CLng(sChar) > 255 Then
                    sChar = ChrW$(sChar)
                Else
                    sChar = Chr$(sChar)
                End If
            End If
            saFinal(x) = Left$(saText(x), lPos - 1) & sChar
        ElseIf x < UBound(saText) Then
            saFinal(x) = saText(x) & ";" 'This Semicolon wasn't a Unicode Character
        Else
            saFinal(x) = saText(x)
        End If
    Next

    sFinal = Join(saFinal, "")
    AsciiToUnicode = sFinal

    Erase saText
    Erase saFinal
End Function
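A quick way to check that the two functions agree is a round trip: decode the text form, then encode the result again and compare. This is only a sketch; the Sub name DemoRoundTrip is made up, and it assumes both functions from this post are in the same project.

```vb
' Hypothetical demo routine; relies on AsciiToUnicode and UnicodeToAscii from this post.
Public Sub DemoRoundTrip()
    Dim sEncoded As String, sDecoded As String

    ' Text form of こんにちは
    sEncoded = "&#12371;&#12435;&#12395;&#12385;&#12399;"
    sDecoded = AsciiToUnicode(sEncoded)

    Debug.Print Len(sDecoded)                        ' Prints: 5
    Debug.Print UnicodeToAscii(sDecoded) = sEncoded  ' Prints: True
End Sub
```

The comparison is done on the encoded form because the Immediate window may not be able to display the decoded characters themselves.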
I didn’t always understand why you wouldn’t just want to work with the Unicode characters themselves. Well, it seems that not all applications treat Unicode the same way, and the characters may be changed along the way. If you store and pass around a plain ASCII text representation of the characters, there is far less room for them to be misinterpreted.
One of the neatest things about this is that I can just put the text-represented Unicode in a web page and the browser will automatically convert it to Unicode characters. This is the reason I needed to use an image above to show what the text-represented Unicode looks like. If I just put the string there, it is converted by the browser when displayed.
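As a rough illustration of that browser behavior, the sketch below writes a tiny HTML page containing only the encoded text. The Sub name WriteEncodedPage and its parameters are made up for this example; it assumes UnicodeToAscii from this post is available.

```vb
' Hypothetical helper: writes a minimal page whose body is the encoded text.
' The file contains only ASCII, since every character of sText becomes a &#...; reference.
Public Sub WriteEncodedPage(sPath As String, sText As String)
    Dim iFile As Integer

    iFile = FreeFile
    Open sPath For Output As #iFile
    Print #iFile, "<html><body><p>" & UnicodeToAscii(sText) & "</p></body></html>"
    Close #iFile
End Sub
```

Calling it with a path of your choosing and a Unicode string, then opening the file in a browser, should show the original characters even though the file on disk holds nothing but ASCII.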
If you have been to this post in the past, you have probably noticed that it has changed a bit. That is because I had it all backwards! Yeah, well, it happens. I said I wanted to change Unicode characters to an ASCII string, but the code was actually for the other way around. Well, I finally got around to fixing this and made sure the code worked before displaying it. I hope this helps someone out there.
This has driven me crazy for weeks: I just haven’t been able to access the WebDAV share I set up at dreamhost.com.
I found a perfect article on how to do it at Geek Boy’s Blog. It’s so simple,…
Make sure you add the port number to the URL you provide for the network place.
E.g. http://www.mydomain.com:80/foo
Once I did that, I connected instantly. No more need for third party apps, I can just access it. 🙂