
Strings in Russian Language

Author
ippascual
New Member
  • Total Posts : 3
  • Reward points : 0
  • Joined: 2018/04/24 03:14:55
  • Location: 0
  • Status: offline
2018/05/02 01:31:54 (permalink)
0

Hi,
 
I'm trying to introduce Russian language on a firmware that is already working on a PIC24F.
 
I used the font guide, but couldn't find out how to automatically convert a Cyrillic character into a hexadecimal value so that the string is rendered correctly in its own alphabet.
 
Any idea how to convert it?
 
Best Regards
 
 
#1

11 Replies

    maxruben
    Super Member
    • Total Posts : 3392
    • Reward points : 0
    • Joined: 2011/02/22 03:35:11
    • Location: Sweden
    • Status: offline
    Re: Strings in Russian Language 2018/07/07 10:10:30 (permalink)
    3.5 (2)
    Use Russian language on what? An LCD, UART or other communication?
     
    /Ruben
    #2
    DarioG
    Allmächtig.
    • Total Posts : 54081
    • Reward points : 0
    • Joined: 2006/02/25 08:58:22
    • Location: Oesterreich
    • Status: offline
    Re: Strings in Russian Language 2018/07/07 14:37:12 (permalink)
    5 (1)
    try
    Хорватия
    grin (sorry, just a joke)

    GENOVA :D :D ! GODO
    #3
    Mysil
    Super Member
    • Total Posts : 3670
    • Reward points : 0
    • Joined: 2012/07/01 04:19:50
    • Location: Norway
    • Status: offline
    Re: Strings in Russian Language 2018/07/08 05:36:49 (permalink)
    4 (1)
    In Legacy MLA there is the Graphics Resource Converter, which can generate a resource file with font glyphs for a selection of Unicode characters, for any character set or language available in the selected font file.
     
    In current MLA it is in: .../ MLA/ framework/gfx/utilities/grc
    There is a Graphics Resource Converter Help.pdf in the same directory.
     
        Mysil
    #4
    malaugh
    Super Member
    • Total Posts : 408
    • Reward points : 0
    • Joined: 2011/03/31 14:04:42
    • Location: San Diego
    • Status: offline
    Re: Strings in Russian Language 2018/07/08 07:16:47 (permalink)
    0
    We use the Graphics Resource Converter to make the fonts for our projects, including the Russian translation. We do this with a font filter, which is a list of the characters to include. For Russian you make a filter file that contains the English and Russian characters, then have the GRC convert those. The result is a font file containing just those characters. The catch is that, to use a character, you need to know its index in the character array. We use custom firmware to convert from a UTF-8 value to that index. We also have to handle characters of varying length, since in UTF-8 the English characters are one byte and the Russian characters are two bytes.
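
A minimal sketch of that kind of conversion, not the actual firmware described above. It assumes a hypothetical filtered font whose glyphs are stored in code-point order: printable ASCII, then Ё, then А to я, then ё. The function names and the index layout are illustrative only; the real indices depend on the filter file given to the GRC.

```c
#include <stddef.h>
#include <stdint.h>

#define GLYPH_MISSING 0xFFFFu   /* returned for characters not in the font */

/* Decode one UTF-8 character (1 or 2 bytes only) and advance *s past it.
   Returns the code point, or 0xFFFF for anything longer or malformed. */
static uint16_t utf8_next(const char **s)
{
    const unsigned char *p = (const unsigned char *)*s;
    uint16_t cp;

    if (p[0] < 0x80) {                        /* 1 byte: ASCII */
        cp = p[0];
        *s += 1;
    } else if ((p[0] & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
        cp = (uint16_t)(((p[0] & 0x1F) << 6) | (p[1] & 0x3F));  /* 2 bytes */
        *s += 2;
    } else {
        cp = 0xFFFF;                          /* unsupported: skip one byte */
        *s += 1;
    }
    return cp;
}

/* Assumed (hypothetical) glyph order in the filtered font:
   0..94   U+0020..U+007E  printable ASCII
   95      U+0401          Ё
   96..159 U+0410..U+044F  А..я
   160     U+0451          ё */
static uint16_t glyph_index(uint16_t cp)
{
    if (cp >= 0x0020 && cp <= 0x007E) return (uint16_t)(cp - 0x0020);
    if (cp == 0x0401)                 return 95;
    if (cp >= 0x0410 && cp <= 0x044F) return (uint16_t)(96 + (cp - 0x0410));
    if (cp == 0x0451)                 return 160;
    return GLYPH_MISSING;
}

/* Convert a UTF-8 string to glyph indices; returns the number written. */
size_t utf8_to_glyphs(const char *s, uint16_t *out, size_t max)
{
    size_t n = 0;
    while (*s != '\0' && n < max) {
        out[n++] = glyph_index(utf8_next(&s));
    }
    return n;
}
```

With this layout the text "AЖ" (bytes 41 D0 96) yields the indices 33 and 102. A font with more scattered characters would probably want a sorted table and a binary search instead of hard-coded ranges.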
    #5
    MBedder
    Circuit breaker
    • Total Posts : 6876
    • Reward points : 0
    • Joined: 2008/05/30 11:24:01
    • Location: Zelenograd, Russia
    • Status: offline
    Re: Strings in Russian Language 2018/07/08 08:48:30 (permalink)
    5 (2)
    malaugh wrote: We use custom firmware to convert from a UTF8 value to the index. We also need to handle character values of varying length since the English characters are one byte, and the Russian characters are 2 bytes.
    Bullsh¡t. The true Russian characters are 65536 bytes wide.
    If you do not believe this you can travel to Russia and ask any bear walking by the street drinking vodka and playing balalaika LoL
    #6
    Mysil
    Super Member
    • Total Posts : 3670
    • Reward points : 0
    • Joined: 2012/07/01 04:19:50
    • Location: Norway
    • Status: offline
    Re: Strings in Russian Language 2018/07/08 15:04:27 (permalink)
    4 (2)
    Hi,
    There are several ways to address an extended character set in a font / resource file.
     
    A:
    16-bit wide character codes give access to 65536 different characters, covering all European languages and Russian as well as modern Chinese, Japanese and Korean script, without having to handle extension codes.
    In Windows and some other systems this is called the wide character (widechar) type, and it needs some extensions to the C library print formatting functions.
    These are available in XC32 as: 
        wprintf (const wchar_t *, ...)
        fwprintf(__File *, const wchar_t *, ...)
        swprintf (wchar_t *, size_t, const wchar_t *, ...)
    <wchar.h> may need to be included.
    In Legacy MLA there is a macro USE_MULTIBYTECHAR that means 16-bit characters are used.
    That name is a misnomer; it should really be USE_WIDECHAR.
    I do not know whether it has been corrected in the current MLA for PIC24.
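
A small sketch of option A, assuming the toolchain's C library really provides swprintf and accepts wide string literals; whether a given XC16/XC32 release does is something to check. The function name format_reading and the label are made up for illustration.

```c
#include <stddef.h>
#include <wchar.h>

/* "Температура" written with universal character names, so the source
   file itself stays plain ASCII regardless of editor encoding. */
const wchar_t LABEL_TEMPERATURE[] =
    L"\u0422\u0435\u043c\u043f\u0435\u0440\u0430\u0442\u0443\u0440\u0430";

/* Format "<label>: <value>" into a wide-character buffer.
   Returns the number of wide characters written, or a negative
   value if the buffer is too small or formatting fails. */
int format_reading(wchar_t *buf, size_t len, const wchar_t *label, int value)
{
    return swprintf(buf, len, L"%ls: %d", label, value);
}
```

Each element of the resulting buffer is one character code, so the rendering code can look glyphs up directly without any multibyte decoding. Note that the size of wchar_t is implementation-defined; it is not guaranteed to be 16 bits.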
     
    B:
    USE_MULTIBYTECHAR should be reserved to mean variable-width character representation.
    Unicode UTF-8 uses between 1 and 4 bytes to represent any Unicode character (1 to 3 bytes for everything that fits in 16 bits).
    It is defined in such a way that the first 128 codes, which cover the most common characters in American and European languages,
    are the same as the ASCII characters, so C language source code needs 1-byte characters only.
    Lead bytes with the high bit set indicate how many more bytes follow to complete the character.
    Look up Wikipedia or the Unicode documentation for an explanation of how UTF-8 is coded.
    With UTF-8 multibyte character representation there is no need for special handling by the compiler:
    all exotic character codes are just text data bytes as far as the compiler is concerned.
    The tricky work has to be done by application code when rendering text to a display or printer.
    There, multibyte character codes must be decoded into pointers/indices that address the font data stored by the Graphics Resource Converter.
    There are library functions that translate UTF-8 multibyte text into 16-bit wide character codes,
    which may be more convenient when addressing the font resource file.
    See:   mbstowcs (...);     and:    mbtowc(wchar_t *pwc, const char *s, size_t n);
    It is unclear to me whether the functions provided with XC16 really work as they are supposed to.
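
If the library functions turn out not to be usable, the decoding can be done by hand. The following is a rough sketch, not a library routine: utf8_to_ucs2 is a made-up name, it only keeps code points that fit in 16 bits, and it does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu  /* U+FFFD, used for anything undecodable */

/* Decode UTF-8 text into 16-bit code points (Basic Multilingual Plane only).
   Writes at most max values to out and returns the number written. */
size_t utf8_to_ucs2(const char *src, uint16_t *out, size_t max)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    while (*p != 0 && n < max) {
        uint32_t cp;
        unsigned extra;                 /* continuation bytes expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else                          { cp = REPLACEMENT_CHAR; extra = 0; }
        p++;

        while (extra > 0) {
            if ((*p & 0xC0) != 0x80) {  /* truncated or malformed sequence */
                cp = REPLACEMENT_CHAR;
                break;
            }
            cp = (cp << 6) | (*p & 0x3F);
            p++;
            extra--;
        }

        if (cp > 0xFFFF) cp = REPLACEMENT_CHAR;  /* does not fit in 16 bits */
        out[n++] = (uint16_t)cp;
    }
    return n;
}
```

For example, the bytes 5A D0 9F E2 82 AC ("ZП€") decode to 0x005A, 0x041F and 0x20AC. The 16-bit values can then be used to address the font data.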
     
    C:
    Using a font filter.
    This remaps the character representation for a closed selection of text,
    so that every character actually present is given a new character code in the range 0 to 255.
    This may work for a set of menu texts or other fixed user interface strings.
    It probably isn't usable in a text processing environment where general text is imported from files or other external sources.
     
       Mysil
     
     
     
     
     
     
    post edited by Mysil - 2018/07/08 15:47:20
    #7
    kuku
    Senior Member
    • Total Posts : 141
    • Reward points : 0
    • Joined: 2012/03/03 08:05:54
    • Location: 0
    • Status: offline
    Re: Strings in Russian Language 2020/03/17 16:19:03 (permalink)
    0
    Mysil, I work with Harmony 2 (XC32 2.40). I have many places where I build dynamic text with printf-style functions. Now I need to add some special symbols which, I think, need two bytes each. Of course printf will not work for that, which is how I found this topic.

    So I included <wchar.h>, and I can open this file from MPLAB, so it exists. But the linker cannot find swprintf:
    /firmware/src/my_screen_function.c:46: undefined reference to `swprintf'

    Do I need to include or do something more to use wprintf?
    #8
    NKurzman
    A Guy on the Net
    • Total Posts : 18687
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: Strings in Russian Language 2020/03/17 16:42:45 (permalink)
    0
    If you want a character set larger than 256 characters, you must enable it in MHC, then import a font that contains those characters.
    Remember that a full set of 65536 (16-bit) characters takes up a lot of space, so do not import more than you need.
    #9
    alj
    New Member
    • Total Posts : 14
    • Reward points : 0
    • Joined: 2015/02/14 13:31:44
    • Location: 0
    • Status: offline
    Re: Strings in Russian Language 2020/03/30 20:33:24 (permalink)
    0
    There are several legacy encodings for Russian characters: koi8r, win1251, cp866, iso-somethingsomething. They were all replaced by Unicode. And Unicode itself comes in different encodings: UTF-16 uses 16 bits per character for most characters (two 16-bit units for those outside the Basic Multilingual Plane), while in UTF-8 the character length in bytes is not constant (anywhere from 1 to 4 bytes). So there is not enough information in your question to give any answer.
     
    If you JUST want Russian (and English) and nothing else (i.e. no other language), then the easiest way is to go with koi8-r. It is 1 byte per character, and you can find free koi8-r fonts in the Unix world, because that encoding was the de facto standard on Unix prior to Unicode. That is the easiest way, but the correct one is to go Unicode (UTF-8), and for that you need some library; it is not easy to code by yourself. Everything will also take more space and become slower.
    #10
    MBedder
    Circuit breaker
    • Total Posts : 6876
    • Reward points : 0
    • Joined: 2008/05/30 11:24:01
    • Location: Zelenograd, Russia
    • Status: offline
    Re: Strings in Russian Language 2020/03/31 02:30:57 (permalink)
    5 (1)
    The correct way to use an 8-bit Russian character encoding is to use the Win1251 charset, because it is native to Windows, as opposed to the ancient and forgotten koi8-r. Trust me, I'm a lawyer Russian LoL
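
For the original question (getting hexadecimal values for Cyrillic characters), the Win1251 mapping is simple enough to sketch. This is an illustration only: unicode_to_cp1251 is a made-up helper that covers ASCII and the Russian letters and replaces everything else with '?'.

```c
#include <stdint.h>

/* Map a Unicode code point to its Windows-1251 byte.
   Covers ASCII and the 66 Russian letters; anything else becomes '?'. */
uint8_t unicode_to_cp1251(uint16_t cp)
{
    if (cp < 0x0080) {
        return (uint8_t)cp;                    /* ASCII is unchanged */
    }
    if (cp >= 0x0410 && cp <= 0x044F) {
        return (uint8_t)(cp - 0x0410 + 0xC0);  /* А..я map to 0xC0..0xFF */
    }
    if (cp == 0x0401) return 0xA8;             /* Ё */
    if (cp == 0x0451) return 0xB8;             /* ё */
    return (uint8_t)'?';
}
```

With this mapping "Привет" becomes the bytes CF F0 E8 E2 E5 F2, which can be written in C source as "\xCF\xF0\xE8\xE2\xE5\xF2" and drawn with a font laid out in Win1251 order.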
    #11
    alj
    New Member
    • Total Posts : 14
    • Reward points : 0
    • Joined: 2015/02/14 13:31:44
    • Location: 0
    • Status: offline
    Re: Strings in Russian Language 2020/03/31 02:37:43 (permalink)
    0
    Windows uses Unicode now as well, so they are both forgotten.
    But almost all koi8-r fonts are free/open source, whereas with win1251 you can get into licensing trouble.
     
    About conversion: there are plenty of open source tools (Google will find them) as well as online converters:
    https://2cyr.com/decode/
     
     
    PS: another benefit of koi8-r: if you strip the 8th bit by misconfiguring the USART, all the text is still readable in Latin characters :-P
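
A tiny illustration of that effect; strip_high_bit is a made-up helper that mimics what a 7-bit link does to each byte.

```c
/* Clear bit 7 of every byte, as a UART set to 7 data bits would. */
void strip_high_bit(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)(*s & 0x7F);
    }
}
```

The KOI8-R bytes D0 D2 C9 D7 C5 D4 ("привет") come out as "PRIWET", because KOI8-R places each Cyrillic letter 0x80 above a similar-sounding Latin letter of the opposite case.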
    #12