How to fix type mismatch in mql strings - General

Maxim Kuznetsov 2020.01.27 01:16 #1621

Nikolai Semko:

No, I would have noticed. Although I don't rule out that in some cases (when working with Unicode) this is possible. In Java, for example, the char type is 2 bytes.
I tried parsing data from a crypto exchange in two ways: with this JSON library and by working with a char array.
The difference in speed turned out to be 700(!!!) times. I was shocked. Perhaps it was far from the best JSON implementation.

Characters are 16-bit little-endian (UTF-16LE), and the strings obviously come from Pascal. The arrays, by the way, come from Fortran.

Roman 2020.01.27 01:16 #1622

Nikolai Semko:

No, I would have noticed. Although I don't rule out that in some cases (when working with Unicode) this is possible. In Java, for example, the char type is 2 bytes.
I tried parsing data from a crypto exchange in two ways: with this JSON library and by working with a char array.
The difference in speed turned out to be 700(!!!) times. I was shocked. Perhaps it was far from the best JSON implementation.

When an mql string is passed to a DLL, the DLL side receives it as wchar_t*.
And the type size mismatch is not specific to Java: it depends on the architecture, the operating system or the hardware, I don't remember which.
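
The platform dependence mentioned here can be checked directly. The lines below are only a sketch in Python (chosen for brevity; the thread itself concerns MQL5 and C/C++), using the standard ctypes module to report the size of the platform's wchar_t next to a fixed 16-bit integer. The variable names are illustrative.

```python
import ctypes

# Size of the platform's C wchar_t, as seen through ctypes.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)

# A fixed-width 16-bit unit has the same size on every platform.
UNIT16_SIZE = ctypes.sizeof(ctypes.c_uint16)

print("wchar_t:", WCHAR_SIZE, "bytes")   # 2 on Windows, usually 4 on Linux/macOS
print("uint16 :", UNIT16_SIZE, "bytes")
```

Since MetaTrader DLLs are built for Windows, wchar_t is 2 bytes there, which is why the wchar_t* signature works in practice.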

700 times? Wow, I had just set this library aside for JSON parsing. So it's not worth using?
And is it better to convert the string with StringToCharArray and parse the array in a loop?

Nikolai Semko 2020.01.27 01:52 #1623

Roman:

700 times? Wow, I had just set this library aside for JSON parsing. So it's not worth using?
And is it better to convert the string with StringToCharArray and parse the array in a loop?

I think so, yes. But you should always check it yourself: take some measurements. I don't rule out that the string functions were not written in the best way back then and have since been fixed.
I took these measurements more than a year ago.

The code will of course be longer when working with char arrays, but it is more flexible.
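
To give an idea of what "parsing the array in a loop" can look like, here is a rough sketch in Python (not MQL5, and not the code that was measured). It pulls a single numeric field out of a raw byte buffer without building a JSON tree. The function name, the field names and the sample payload are made up for illustration, and the sketch assumes the key occurs once and its value is a plain or quoted decimal number.

```python
from typing import Optional


def find_number(buf: bytes, key: bytes) -> Optional[float]:
    """Return the number that follows "key": in buf, or None if it is absent."""
    needle = b'"' + key + b'":'
    start = buf.find(needle)
    if start < 0:
        return None
    i = start + len(needle)
    # Skip spaces and an optional opening quote (some feeds quote their numbers).
    while i < len(buf) and buf[i] in b' "':
        i += 1
    j = i
    # Consume the bytes that can belong to a decimal number.
    while j < len(buf) and buf[j] in b"+-.0123456789eE":
        j += 1
    if j == i:
        return None
    # A malformed number such as "-" alone raises ValueError here.
    return float(buf[i:j].decode("ascii"))


# Hypothetical payload, for illustration only.
SAMPLE = b'{"symbol":"BTCUSD","price": 8650.25,"qty":"0.5"}'
print(find_number(SAMPLE, b"price"))
print(find_number(SAMPLE, b"qty"))
```

The trade-off is the one described above: more code and fewer safety nets than a general JSON library, but only the bytes that matter are touched.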

[Deleted] 2020.01.27 12:40 #1624

Roman:

And most likely an mql string is a short[], wchar_t[] or wchar_t* under the hood.
After all, mql strings are in Unicode, and utf is 2 bytes.
And StringToCharArray converts from short[] to char[].

unicode != utf && utf != 2 bytes (one UTF is not like another) && MSVC is not the standard

The point of wchar_t is to fit any supported character into a single wchar_t (Microsoft, of course, goes its own way), and the I/O streams convert to and from the locale encoding themselves. There are no size or encoding guarantees. When you accept wchar_t in a DLL, think about whether that is correct. If, of course, you are interested in looking beyond the sandbox into the grown-up world.

Roman 2020.01.27 13:45 #1625

Vict:

unicode != utf && utf != 2 bytes (one UTF is not like another) && MSVC is not the standard

The point of wchar_t is to fit any supported character into a single wchar_t (Microsoft, of course, goes its own way), and the I/O streams convert to and from the locale encoding themselves. There are no size or encoding guarantees. When you accept wchar_t in a DLL, think about whether that is correct. If, of course, you are interested in looking beyond the sandbox into the grown-up world.

Yes, I know that Unicode and UTF are different encodings, and they are supposed to be.
I just wanted to abbreviate the word Unicode, and I guess I chose badly.

Although the Unicode reference says that the standard covers characters from almost every written language in the world.
The standard consists of two main parts: the Universal Character Set (UCS) and the Unicode Transformation Format (UTF).

Since Unicode already includes the UTF encodings, I wrote it that way to keep the word short.

I don't know whether wchar_t* is correct or not.
I used what is in Renat's examples from the article on how to write a DLL.
mql5 strings are in Unicode, which includes UTF, so I think using wchar_t* in the article's example is logical:
it lets any supported character fit into one wchar_t.

As for the lack of size/encoding guarantees, I didn't even know about that. Maybe use a C-style short* for purity, then?
If the MSVC IDE supports it correctly, of course.
After all, the environment swallows an ordinary true and offers TRUE instead.

Edgar Akhmadeev 2020.01.27 13:57 #1626

UTF-8 and UTF-16 have the corresponding bit widths.

In UTF-8, the language code pages are switched with special codes.

UTF-16 holds the whole variety of characters at once.

Roman 2020.01.27 14:13 #1627

Edgar Akhmadeev:

UTF-8 and UTF-16 have the corresponding bit widths.

In UTF-8, the language code pages are switched with special codes.

UTF-16 holds the whole variety of characters at once.

Well, as I understand from what many people write on the forum, mql5 strings are simply in UTF-16.
And the mql documentation says:
A text string is a sequence of characters in Unicode format with a trailing zero at the end.
Because of this, it is hard to tell which encoding an mql5 string actually uses.
And if Unicode already contains all the UTF families, why use the word UTF at all and introduce confusion?
It's all Unicode, plain and simple.
Or should we put it this way:
Unicode with a UTF-16 bit width?

Actually, one of the developers wrote earlier that
the mql string type consists of two parts, an 8-byte buffer and a 4-byte pointer, 12 bytes in total.

[Deleted] 2020.01.27 14:36 #1628

Roman:

Yes, I know that Unicode and UTF are different encodings, and they are supposed to be.
I just wanted to abbreviate the word Unicode, and I guess I chose badly.

Although the Unicode reference says that the standard covers characters from almost every written language in the world.
The standard consists of two main parts: the Universal Character Set (UCS) and the Unicode Transformation Format (UTF).

Since Unicode already includes the UTF encodings, I wrote it that way to keep the word short.

I don't know whether wchar_t* is correct or not.
I used what is in Renat's examples from the article on how to write a DLL.
mql5 strings are in Unicode, which includes UTF, so I think using wchar_t* in the article's example is logical:
it lets any supported character fit into one wchar_t.

You are confused. Unicode is a table of characters with codes; it used to fit into 0-65535 (2 bytes), then it grew. And spending 4 bytes per character is wasteful. That's where UTF, a variable-length encoding, came in handy (for example, UTF-8 encodes ASCII characters in one byte). So Unicode (the table) does not contain any UTF.
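
The distinction between the table and its encodings is easy to see in a few lines. This is a sketch in Python (used only because it exposes code points and encoders directly; nothing here is MQL5): each character has one code point in the table, but a different byte length in each encoding form. The helper name and the sample characters are arbitrary.

```python
# One ASCII letter, one Cyrillic letter, the euro sign, and an emoji above U+FFFF.
SAMPLES = ["A", "\u042f", "\u20ac", "\U0001F600"]


def encoded_sizes(ch: str) -> dict:
    """Code point of one character and its byte length in three encoding forms."""
    return {
        "code_point": ord(ch),
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch in SAMPLES:
    info = encoded_sizes(ch)
    print(f"U+{info['code_point']:04X}  utf-8={info['utf-8']}  "
          f"utf-16={info['utf-16']}  utf-32={info['utf-32']}")
```

The code point never changes; only the number of bytes used to store it does.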

As for the lack of size/encoding guarantees, I didn't even know about that. Maybe use a C-style short* for purity, then?
If the MSVC IDE supports it correctly, of course.
After all, the environment swallows an ordinary true and offers TRUE instead.

The standard has char16_t and char32_t, which are fixed-size types. wchar_t has a different purpose.
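
The idea of treating the buffer as fixed 16-bit units, rather than relying on whatever wchar_t happens to be, can be sketched as follows. This is Python, not DLL code, and it rests on an assumption taken from this thread rather than from any specification: that the buffer holds UTF-16LE code units terminated by a zero unit. The function name is hypothetical.

```python
def read_utf16le_z(buf: bytes) -> str:
    """Decode a zero-terminated UTF-16LE buffer, stopping at the first 0x0000 unit."""
    end = len(buf) - len(buf) % 2          # ignore a dangling odd byte, if any
    for i in range(0, end, 2):             # step over whole 16-bit units only
        if buf[i] == 0 and buf[i + 1] == 0:
            end = i
            break
    return buf[:end].decode("utf-16-le")


# Simulated buffer: text, terminating zero unit, then leftover bytes.
RAW = "EURUSD \u042f".encode("utf-16-le") + b"\x00\x00" + b"junk"
print(read_utf16le_z(RAW))
```

Stepping in pairs matters: ASCII text in UTF-16LE is full of single zero bytes, and only an aligned pair of zeros is the terminator.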

Edgar Akhmadeev 2020.01.27 14:42 #1629

Roman:

Well, as I understand from what many people write on the forum, mql5 strings are simply in UTF-16.
And the mql documentation says:
A text string is a sequence of characters in Unicode format with a trailing zero at the end.
Because of this, it is hard to tell which encoding an mql5 string actually uses.
And if Unicode already contains all the UTF families, why use the word UTF at all and introduce confusion?
It's all Unicode, plain and simple.
Or should we put it this way:
Unicode with a UTF-16 bit width?

That's not all.

Just as ANSI Cyrillic = CP1251, so:

Unicode:

UTF-8 = CP65001, // UNIX/Linux

UTF-16LE = CP1200, // Windows

UTF-16BE = CP1201,

UTF-32LE = ?

UTF-32BE = ?

ISO10646:

UCS-2 ~ UTF-16

UCS-4 = UTF-32

Confusion? Nope, haven't heard.

[Deleted] 2020.01.27 14:42 #1630

Edgar Akhmadeev:

UTF-8 and UTF-16 have the corresponding bit widths.

In UTF-8, the language code pages are switched with special codes.

UTF-16 holds the whole variety of characters at once.

What code pages, what are you talking about? The "special codes" define the number of bytes used to encode a character, because the encoding is variable-length. UTF-8 can encode any Unicode character, just as UTF-16 can. And UTF-16 is variable-length too (surrogate pairs).
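
Both mechanisms can be written out in a few lines. The following is a sketch in Python with illustrative function names: one function splits a code point above U+FFFF into a UTF-16 surrogate pair, the other reads the sequence length from the first byte of a UTF-8 sequence.

```python
def surrogate_pair(cp: int) -> tuple:
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point fits in one 16-bit unit or is out of range")
    v = cp - 0x10000                       # 20 bits remain to be split 10 + 10
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def utf8_length(lead: int) -> int:
    """Number of bytes in a UTF-8 sequence, determined by its first byte."""
    if lead < 0x80:
        return 1                           # 0xxxxxxx: ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2                           # 110xxxxx
    if 0xE0 <= lead <= 0xEF:
        return 3                           # 1110xxxx
    if 0xF0 <= lead <= 0xF4:
        return 4                           # 11110xxx
    raise ValueError("continuation byte or invalid lead byte")


high, low = surrogate_pair(0x1F600)
print(f"U+1F600 -> {high:04X} {low:04X}")
print(utf8_length("\u20ac".encode("utf-8")[0]))
```

So the "special codes" carry length information only; there is no switching of code pages in either encoding.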

Features of the mql5 language, subtleties and tricks - page 163