You are currently browsing the monthly archive for December 2009.
Yeah, winter’s back. Apart from the really short days, I am really psyched.
I’d like to thank WordPress for the falling snow option, which helps everyone get in the spirit of the fun white fluffy stuff.
Sledding anyone?
Unicode Characters converted to ASCII string
I was hacking together a report today and discovered that the text I received was actually in Unicode, not ASCII.
Basically I have this: こんにちは
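By using AscW(Char) you can convert a Unicode character into an integer value. Add some delimiters to encode the string and you have an HTML numeric character reference (&#, then the decimal code, then a semicolon). It isn’t perfect: AscW returns a signed 16-bit Integer, so any character code above 32767 comes back as a negative number, which isn’t allowed in a reference. There is an easy workaround, explained in Microsoft KB article 272138: add 65536 to the negative value. It is used below.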
Public Function UnicodeToAscii(sText As String) As String
    Dim x As Long, sAscii As String, ascval As Long

    If Len(sText) = 0 Then
        Exit Function
    End If

    sAscii = ""
    For x = 1 To Len(sText)
        ascval = AscW(Mid(sText, x, 1))
        If (ascval < 0) Then
            ascval = 65536 + ascval ' http://support.microsoft.com/kb/272138
        End If
        sAscii = sAscii & "&#" & ascval & ";"
    Next

    UnicodeToAscii = sAscii
End Function
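As a cross-check outside VB, here is a minimal Python sketch of the same encoding. It is not the VB function above, just an illustration of what it should produce. The names unicode_to_ascii and ascw are made up for this example, and ascw only imitates the signed 16-bit result of AscW so the 65536 correction can be seen with real numbers.

```python
def unicode_to_ascii(text: str) -> str:
    """Encode every UTF-16 code unit of text as a decimal reference like &#12371;."""
    # VB strings are UTF-16, so walk 16-bit code units rather than code points.
    raw = text.encode("utf-16-le")
    units = [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]
    return "".join(f"&#{unit};" for unit in units)


def ascw(unit: int) -> int:
    """Hypothetical stand-in for AscW: reinterpret a code unit as a signed 16-bit value."""
    return unit - 65536 if unit >= 32768 else unit


print(unicode_to_ascii("こんにちは"))
# → &#12371;&#12435;&#12395;&#12385;&#12399;

# U+FF21 is 65313, which does not fit in a signed 16-bit Integer
print(ascw(0xFF21))          # → -223
print(65536 + ascw(0xFF21))  # → 65313
```

The second half shows why the negative check matters: without adding 65536, the reference would come out as &#-223; instead of &#65313;.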
Now let’s go the other way: ASCII string to Unicode
And I want this: こんにちは
I remembered that ChrW(int) will convert a character code to its associated character. I really wasn’t in the mood to write parsing logic and test it, but luckily I came across a class which does this. I ripped out the method I needed and it worked great in all its simplicity. I have included this function below:
Public Function AsciiToUnicode(sText As String) As String
    Dim saText() As String, sChar As String
    Dim sFinal As String, saFinal() As String
    Dim x As Long, lPos As Long

    If Len(sText) = 0 Then
        Exit Function
    End If

    saText = Split(sText, ";") ' Character references are semicolon terminated
    ' Nothing to decode: no semicolon and no "&#" marker
    If UBound(saText) = 0 And InStr(1, sText, "&#") = 0 Then
        AsciiToUnicode = sText
        Exit Function
    End If

    ReDim saFinal(UBound(saText))
    For x = 0 To UBound(saText)
        lPos = InStr(1, saText(x), "&#", vbTextCompare)
        If lPos > 0 Then
            ' Everything after "&#" should be the decimal character code
            sChar = Mid$(saText(x), lPos + 2, Len(saText(x)) - (lPos + 1))
            If IsNumeric(sChar) Then
                ' ChrW$ for every value, so codes 128-255 do not depend on the
                ' ANSI code page the way Chr$ does
                sChar = ChrW$(CLng(sChar))
            Else
                ' Not a numeric reference: put back what Mid$ and Split removed
                sChar = "&#" & sChar
                If x < UBound(saText) Then sChar = sChar & ";"
            End If
            saFinal(x) = Left$(saText(x), lPos - 1) & sChar
        ElseIf x < UBound(saText) Then
            saFinal(x) = saText(x) & ";" ' This semicolon wasn't part of a character reference
        Else
            saFinal(x) = saText(x)
        End If
    Next

    sFinal = Join(saFinal, "")
    AsciiToUnicode = sFinal

    Erase saText
    Erase saFinal
End Function
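For comparison, here is a rough Python sketch of the same decoding. It is not a port of the VB function: it swaps the Split-based parsing for a regular expression, the name ascii_to_unicode is made up for this example, and it assumes decimal references only, with codes small enough for chr() to accept.

```python
import html
import re

# Decimal references only, e.g. &#12371; (an assumption of this sketch)
_REFERENCE = re.compile(r"&#([0-9]+);")


def ascii_to_unicode(text: str) -> str:
    """Replace each decimal reference such as &#12371; with its character."""
    # Text that does not match the pattern, stray semicolons included, is left untouched.
    return _REFERENCE.sub(lambda match: chr(int(match.group(1))), text)


encoded = "&#12371;&#12435;&#12395;&#12385;&#12399;"
print(ascii_to_unicode(encoded))            # → こんにちは
print(ascii_to_unicode("Hi; &#12371;!"))    # → Hi; こ!

# The standard library decodes the same references the way an HTML parser would
print(html.unescape(encoded))               # → こんにちは
```

The regular expression avoids the bookkeeping around semicolons that were never part of a reference, which is the fiddly part of the Split approach.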
I didn’t always understand why you wouldn’t just want to work with the Unicode characters themselves. Well, it seems that not all applications treat Unicode the same way, and the characters may be changed along the way. If you store and pass around a plain ASCII representation of the characters, there is far less room for them to be misinterpreted.
One of the neatest things about this is that I can just put the text-represented Unicode in a web page and the browser will automatically convert it to Unicode characters. This is the reason I needed to use an image above to show what the text-represented Unicode looks like. If I just put the string there, it is converted by the browser when displayed.
If you have been to this post in the past, you have probably noticed that it has changed a bit. That is because I had it all backwards! Yeah, well, it happens. I said I wanted to change Unicode characters to an ASCII string, but the code was actually for the other way around. Well, I finally got around to fixing this and made sure the code worked before displaying it. I hope this helps someone out there.
This has driven me crazy for weeks: I just haven’t been able to access the WebDAV share I set up at dreamhost.com.
I found a perfect article on how to do it at Geek Boy’s Blog. It’s so simple,…
Make sure you add the port number to the URL you provide for the network place.
E.g. http://www.mydomain.com:80/foo
Once I did that, I connected instantly. No more need for third-party apps; I can just access it. 🙂
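If you have several of these URLs to fix up, the port can be added with a small script. This is only a sketch in Python: with_explicit_port is a made-up helper, and it assumes the usual defaults of port 80 for http and 443 for https.

```python
from urllib.parse import urlsplit, urlunsplit

# Conventional default ports (an assumption; your host may use others)
DEFAULT_PORTS = {"http": 80, "https": 443}


def with_explicit_port(url: str) -> str:
    """Return url with its default port spelled out, e.g. host:80."""
    parts = urlsplit(url)
    # Leave the URL alone if it already has a port or uses an unknown scheme.
    if parts.port is not None or parts.scheme not in DEFAULT_PORTS:
        return url
    netloc = f"{parts.netloc}:{DEFAULT_PORTS[parts.scheme]}"
    return urlunsplit(parts._replace(netloc=netloc))


print(with_explicit_port("http://www.mydomain.com/foo"))
# → http://www.mydomain.com:80/foo
```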

