String to Char, Char to String, Uchar to Char behavior.

Ting 2024.07.06 06:54

After a long inspection, I found that many behaviors related to char are not documented. I probably missed them in the docs, so I chose to experiment.

Suspect I: when a string contains a hexadecimal code, the hex code always becomes 2 bytes long: \xFF becomes \xFF00. Instead of prepending the zero byte, it appends it, so the value loses its integrity as hex.

      printf("%d",0xF0);        // prints 240 (0xF0)
      printf("%d","\xF0");      // prints 0
      printf("%d","\xF0"[0]);   // prints 240 (0xF0)
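
One way to check Suspect I directly is to dump the string length and every character code, rather than passing the whole string to `%d` (which expects an integer, so the printed 0 says little about the string contents). This is a minimal sketch, assuming a build where strings hold 16-bit characters (MQL5, or MQL4 build 600+); `DumpCodes` is a hypothetical helper name, not a built-in.

```mql5
// Hypothetical helper: print every 16-bit character code held in a string.
void DumpCodes(const string label, const string s)
  {
   string out = "";
   for(int i = 0; i < StringLen(s); i++)
      out += StringFormat("%04X ", StringGetCharacter(s, i));
   PrintFormat("%s: StringLen=%d codes=%s", label, StringLen(s), out);
  }

void OnStart()
  {
   DumpCodes("one escape",   "\xF0");              // a single \x escape
   DumpCodes("four escapes", "\xF0\x9F\x93\x89");  // the four escapes from the emoji attempt
  }
```

If the first line reports a length of 1 with the code 00F0, the escape was widened with a leading zero rather than turned into F000. That would also be consistent with `"\xF0"[0]` printing 240 above.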

Solution II: therefore, when converting this odd string to a char array with my function below, each iteration points to the starting byte of a character, reads only the first byte and ignores the second, and then the pointer shifts 2 bytes for the next iteration.

Question III: how does CharToStr() convert the uchar to a ushort, and what does \xFF become after that? Suspect I doesn't seem to apply here (I haven't found a way to test it).

I'm not sure which of these claims are correct and which are not, but they seem plausible to me now, and they partially explain why my Telegram bot message containing a UTF-8 emoji was rejected. And since every part of the UTF-8 char ("\xF0\x9F\x93\x89") fits within 1 byte (0x00 to 0xFF), converting the request data with code page CP_ACP works fine on this machine (Win10) before sending it out. Below is the function I use to safely put any UTF-8 char within a string.

string oneByteToTwoByteNicely(string ori){
   string twoBytes = "";
   for(int i=0; i<StringLen(ori); i++){
      twoBytes += CharToStr(ori[i]);
   }
   return twoBytes;
}

I hereby summon the Ancient Grand Programmer: please lead an ignorant lamb like me to a better night's sleep and tell me the truth behind this. Or if you, like me, have little to no clue about it, please share your thoughts and we can check these together.

How to create bots for Telegram in MQL5

www.mql5.com

This article contains step-by-step instructions for creating bots for Telegram in MQL5. This information may prove useful for users who wish to synchronize their trading robot with a mobile device. There are samples of bots in the article that provide trading signals, search for information on websites, and send information about the account balance, quotes and screenshots of charts to your smartphone.

Ting 2024.07.06 09:31 #1

Suspect II: a hexadecimal code turns into a ushort (a single character in the string) by prepending, while CharToStr() / CharToString() uses appending.

Taras Slobodyanik 2024.07.06 10:44 #2

As for me, everything is in the site documentation.
Different functions, different data types.

string  CharToString(
   uchar  char_code      // numeric code of symbol
   );

string  ShortToString(
   ushort  symbol_code      // symbol
   );

Type     Maximum value   C++ analog
uchar    255             unsigned char, BYTE
ushort   65 535          unsigned short, WORD
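
To compare the two functions side by side, here is a minimal sketch that builds a one-character string from each and prints the resulting character code. The variable names are illustrative only. The output is deliberately not stated here, since the point is to observe what each function produces on your own terminal.

```mql5
void OnStart()
  {
   uchar  byteCode = 0xF0;    // fits in a uchar (0..255)
   ushort wideCode = 0x00F0;  // the same number held in a ushort (0..65535)

   string fromChar  = CharToString(byteCode);
   string fromShort = ShortToString(wideCode);

   // Print the length and the code of the first character of each result
   PrintFormat("CharToString : len=%d code=%04X", StringLen(fromChar),  StringGetCharacter(fromChar, 0));
   PrintFormat("ShortToString: len=%d code=%04X", StringLen(fromShort), StringGetCharacter(fromShort, 0));
  }
```

Repeating the run with other values (for example 0xFF, or codes in the 0x80 to 0x9F range) would show whether the widening depends on the code page.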

Documentation on MQL5: Language Basics / Data Types / Integer Types / Char, Short, Int and Long Types

www.mql5.com

The char type takes 1 byte of memory (8 bits) and allows expressing in the binary notation 2^8=256 values. The char type can contain both positive...

Ting 2024.07.06 16:30 #3

Ting #:
Suspect II: a hexadecimal code turns into a ushort (a single character in the string) by prepending, while CharToStr() / CharToString() uses appending.

Edit: sorry, it's the reverse; my brain was not braining.

For a hex code within a string like "\xFF" (255 in decimal), I suspected that it turns into \xFF00 (00 appended, so it is no longer 255 and the integrity is lost), while CharToStr() would retain the hex integrity by prepending (\x00FF is still 255). The reason I posted this, or why I wonder, is that putting the hex code inside a string makes it a totally different character. I couldn't get my emoji sent.

I did wonder whether it is just a problem with my machine, because I have seen some forum posts saying that the way to send an emoji is to put the bytes into the string and convert it into request data, and then it works just fine (none of them mention the integrity problem I hit).

I was exhausted and needed a good sleep. Hopefully I can prove this point later.

void OnStart()
  {
      string chart_upward_utf8 = "Price went down after spike up \xF0\x9F\x93\x89";
      api_sendMessage(chart_upward_utf8);
  }
//+------------------------------------------------------------------+
      string simpleJsonForm(string key, string val){
         string doubleQ = "\"";
         return StringConcatenate(doubleQ,key,doubleQ,":",doubleQ,val,doubleQ);
      }
      string oneByteToTwoByteNicely(string ori){
         string twoBytes = "";
         for(int i=0; i<StringLen(ori); i++){
            twoBytes += CharToStr(ori[i]);
         }
         return twoBytes;
      }
      
      //need nicely formatted 2byte wide string literal instead of 1byte weirdo
      bool api_sendMessage(string msg){
         string apiName = "sendMessage";
         string requestHeaders = "";
         string requestUrl = "";
         uchar requestData[];
         char response[];
         string responseHeaders;
         
         string token = "xxxtokenxxx";
         string chatId = "-1002220888888";
         string endpoint = "https://api.telegram.org";
         int timeout = 4000;
         
         string comma = ",";
         
         ResetLastError();
         requestUrl = StringFormat("%s/bot%s/%s",endpoint,token,apiName);
         requestHeaders = StringConcatenate("Content-Type:application/json");
         string oriData = "{" + simpleJsonForm("chat_id",chatId) + comma + simpleJsonForm("text",msg) + "}";
         StringToCharArray(oneByteToTwoByteNicely(oriData),requestData,0,StringLen(oriData),CP_ACP);
         //without calling my function, the bytes don't become the corresponding ANSI codes, or whatever they should be (integrity); they just turn into unknowns -> ???
         Print(msg);//prints - Price went down after spike up x??? 
         //it is simply broken :(
         
         int code = WebRequest("POST",requestUrl,requestHeaders,timeout,requestData,response,responseHeaders);
         if(code != 200){
            Print("error when sendMessage, ",code,", ",GetLastError(),", ",CharArrayToString(response));
         }
         if(code == 200){
            return true;
         }
         return false;
      }

I hope this clarifies the situation.

The biggest unknown is how MQL actually converts "\xFF" (1 byte; is 00 appended or prepended?) or \xFFF or \xFFFF (these 2 bytes should be left untouched, right? right?) to a 2-byte ushort.


Taras Slobodyanik 2024.07.08 15:59 #4

Ting #:
The biggest unknown is how MQL actually converts "\xFF" (1 byte; is 00 appended or prepended?) or \xFFF or \xFFFF (these 2 bytes should be left untouched, right? right?) to a 2-byte ushort.

You cannot convert one part of the text into one byte (ANSI), and another part of the text into 4 bytes (UTF-8).
It must be in the same format.

string   chart_upward_utf8 = "Price went down after spike up 📉";
uchar    arr_char[];

StringToCharArray(chart_upward_utf8,arr_char,0,WHOLE_ARRAY,CP_UTF8);
Print(CharArrayToString(arr_char,0,WHOLE_ARRAY,CP_UTF8));
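
Building on the snippet above, the following is a minimal sketch of preparing a whole request body as UTF-8 bytes and checking what actually ends up in the array. It assumes the source file is saved in a Unicode encoding so the emoji literal survives, reuses the chat id from the earlier post, and does no JSON escaping of the text. The hex dump is only there for inspection.

```mql5
void OnStart()
  {
   string text = "Price went down after spike up 📉";
   string body = "{\"chat_id\":\"-1002220888888\",\"text\":\"" + text + "\"}";

   uchar data[];
   // With WHOLE_ARRAY the terminating zero is copied as well, so drop it
   // before the array is used as a request body.
   int copied = StringToCharArray(body, data, 0, WHOLE_ARRAY, CP_UTF8);
   if(copied > 0)
      ArrayResize(data, copied - 1);

   // Dump the last few bytes to see how the emoji was encoded
   int size  = ArraySize(data);
   int start = (size > 6) ? size - 6 : 0;
   string hex = "";
   for(int i = start; i < size; i++)
      hex += StringFormat("%02X ", data[i]);
   Print("last bytes: ", hex);
  }
```

If the conversion works as expected, the dump should end with the four emoji bytes F0 9F 93 89 followed by the closing quote and brace (22 7D). The whole body is converted in one pass with one code page, so no per-character helper is needed.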
