utf-8 - Scala: convert a string between two charsets
I have a malformed UTF-8 string that should be written "michèle huà" but is output as "michÃ¨le huÃ".
According to this table, the problem is a mix-up between Windows-1252 and UTF-8: http://www.i18nqa.com/debug/utf8-debug.html
How do I make the conversion?
scala> scala.io.Source.fromBytes("michÃ¨le huÃ".getBytes(), "ISO-8859-1").mkString
res25: String = michèle huà

scala> scala.io.Source.fromBytes("michÃ¨le huÃ".getBytes(), "UTF-8").mkString
res26: String = michèle huà

scala> scala.io.Source.fromBytes("michÃ¨le huÃ".getBytes(), "Windows-1252").mkString
res27: String = michèle huÃ
Thank you.
You don't have the complete string there, due to the unfortunate issue of one character printing as a blank. "michèle huà", when encoded as UTF-8 and read as Windows-1252, is "michÃ¨le huÃ ", where the last character is 0xA0, a non-breaking space (but it typically pastes as 0x20, an ordinary space).
If you can include that character, you can convert the string successfully.
scala> val fixed = new String("michÃ¨le huÃ\u00a0".getBytes("windows-1252"), "UTF-8")
fixed: String = michèle huà
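The same round trip can be wrapped in a small helper. This is only a minimal sketch: the object name `MojibakeFix` and the methods `garble` and `repair` are hypothetical names chosen for illustration, and it assumes the text went through exactly one round of "UTF-8 bytes decoded as Windows-1252". The string literals use `\u` escapes so the example does not depend on the source file's own encoding.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object MojibakeFix {
  // Assumption: the wrong decoding step used Windows-1252.
  private val Cp1252: Charset = Charset.forName("windows-1252")

  // Reproduce the problem: encode as UTF-8, then decode as Windows-1252.
  def garble(clean: String): String =
    new String(clean.getBytes(StandardCharsets.UTF_8), Cp1252)

  // Undo it: re-encode as Windows-1252 to recover the original bytes,
  // then decode those bytes as UTF-8.
  def repair(garbled: String): String =
    new String(garbled.getBytes(Cp1252), StandardCharsets.UTF_8)
}

object MojibakeDemo extends App {
  val clean   = "mich\u00e8le hu\u00e0"     // "michèle huà"
  val garbled = MojibakeFix.garble(clean)   // ends in U+00A0, not a plain space

  // Print the code point of the last character to make the invisible one visible.
  println(f"last char: U+${garbled.last.toInt}%04X")
  println(MojibakeFix.repair(garbled) == clean)
}
```

Passing an explicit charset to `getBytes` matters here: the no-argument `getBytes()` uses the platform default, so its result can differ between machines. If the trailing U+00A0 has already been replaced by an ordinary space, the final "Ã" no longer forms a valid UTF-8 sequence and `repair` would most likely return a replacement character in its place.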