GitHub: .mq5 and .mqh files sometimes shown as "binary"

 

Hi all, I ran into a problem that seems to have surfaced for me recently.

Some of my .mq5 and .mqh files are incorrectly seen by GitHub as binary. I have not changed anything in my EA-creation workflow (i.e. I am not using a third-party text editor/IDE). Once this happens to a file on GitHub, I cannot get GitHub to see it any differently.

Per instructions, I created a .gitattributes file that marks the incorrectly identified file types as text, but it doesn't seem to make any difference for these particular files:

*.ex5 binary
*.ex4 binary
*.mq5 text
*.mq4 text
*.mqh text

The only way to get Git to see those files as text again is to make a new file, copy and paste the code from the old file into the new one, and use that one from then on. It's a workaround, and kind of a PITA, but doable. I have tried saving the files in MetaEditor as ANSI. I don't see any option in MetaEditor to make sure I am not using UTF-16 or UTF-8, which, per some Stack Overflow comments, might cause files to be incorrectly identified as binary. Also, I am not sure what made Git see those files as binary in the first place.

Anyone else notice this?  Is there a better way to manage it? 

 

Yes, unfortunately MetaEditor by default saves files as Unicode UTF-16, and it does not seem that one can change this default behaviour in the settings or configuration file.

But, there is no need to create a new file and copy/paste. Just re-save the file from MetaEditor (Save As) and choose a different encoding.
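To find out which files are affected before re-saving them, a small check like the one below may help. It is only a sketch, written in Python rather than MQL5 so that it can run outside MetaTrader. It assumes that the "binary" verdict comes from the NUL bytes that UTF-16 text contains (as far as I know, Git looks for a NUL byte in roughly the first 8000 bytes of a file; GitHub's own display logic may differ). The function names are made up for this example.

```python
import codecs

# Byte-order marks that identify a Unicode encoding at the start of a file.
# UTF-32 is left out on the assumption that it is not relevant here.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def looks_binary(data: bytes, window: int = 8000) -> bool:
    """Approximate the heuristic: a NUL byte near the start means 'binary'."""
    return b"\x00" in data[:window]


def report(path: str) -> str:
    """Summarise one file: which BOM it has and whether it would look binary."""
    with open(path, "rb") as handle:
        data = handle.read()
    return "{}: bom={}, looks_binary={}".format(
        path, sniff_bom(data), looks_binary(data)
    )
```

Calling report() on each .mq5/.mqh file in the repository should show the UTF-16 ones with looks_binary=True; those are the candidates for Save As.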



 

Also, don't use any Unicode or ANSI characters in your code, only plain ASCII and you should be fine.

If you need to use Unicode text, just use the character code instead. For example, I use "\x00A9" instead of the "©" Copyright symbol.

#define MCopyright "Copyright \x00A9 2022, Fernando M. I. Carreiro, All rights reserved"
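If a file already contains many such characters, the replacement can be automated. The following is a rough Python sketch (not MQL5), assuming the four-digit "\xNNNN" form shown above is what the compiler expects; escape_non_ascii is a name invented for this example. Characters outside the Basic Multilingual Plane are rejected rather than guessed at.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a four-digit \\xNNNN escape."""
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII is copied through unchanged.
            pieces.append(ch)
        elif code <= 0xFFFF:
            # Same form as the "\x00A9" example in the post above.
            pieces.append("\\x{:04X}".format(code))
        else:
            # Would need a surrogate pair; left out of this sketch.
            raise ValueError("character U+{:X} is outside the BMP".format(code))
    return "".join(pieces)
```

One caveat that I have not verified for MQL5: if the escaped character is directly followed by a letter or digit that is also a valid hex digit, the compiler might read it as part of the escape, so check the result in MetaEditor.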
 
Fernando Carreiro #:

Also, don't use any Unicode or ANSI characters in your code, only plain ASCII and you should be fine.

If you need to use Unicode text, just use the character code instead. For example, I use "\x00A9" instead of the "©" Copyright symbol.

Thank you for the kind replies.

FYI, I found out the hard way (thankfully on a test file) that if I save over the file as ANSI, it is permanently mangled: it is then read back as a two-byte document and the code shows up as Chinese characters. I won't make that mistake again.

After converting to UTF-8, the file now shows up in GitHub as normal text.

BUT - it also seems that I must change the name of the file, or else GitHub will "revert" back to showing it as a binary.  Crazy.


But, thank you Fernando!  This will help

 
Talky_Zebra #: But, thank you Fernando!  This will help

You are welcome!

 
Here is the PS script to convert all UTF-16 files to UTF-8. I had the same problem using github and having a lot of UTF-16 source files - this script did the trick. Use at your own risk.


function ConvertToUTF8 {
    param(
        [Parameter(Mandatory = $false)]
        [string]$path = "C:\path\to\your\files\",

        [Parameter(Mandatory = $false)]
        [string[]]$extensions = @('*.txt', '*.md'),

        [Parameter(Mandatory = $false)]
        [string]$logPath = "C:\path\to\your\log.txt",

        [Parameter(Mandatory = $false)]
        [bool]$preserveOldFiles = $true
    )

    function Test-IsUTF8 {
        param(
            [Parameter(Mandatory = $true)]
            [string]$file
        )

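        # With the third argument $true, a BOM (if present) overrides the default encoding.
        # A file without a BOM is reported with the default encoding passed in here.
        $reader = [System.IO.StreamReader]::new($file, [System.Text.Encoding]::Default, $true)
        try {
            [void]$reader.Peek()  # The actual encoding is determined once we read something
            return $reader.CurrentEncoding.BodyName -eq "utf-8"
        }
        finally {
            $reader.Close()  # Release the file handle even if the read fails
        }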
    }

    $files = Get-ChildItem -Path $path -Recurse -Include $extensions -File

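    foreach ($file in $files) {
        if (-not (Test-IsUTF8 -file $file.FullName)) {
            # Read the whole file as one string before the original is renamed or removed
            $content = Get-Content -Path $file.FullName -Raw -Encoding Default
            if ($preserveOldFiles) {
                # Keep the original next to the converted file as <name>.old
                $oldFileName = "$($file.FullName).old"
                Rename-Item -Path $file.FullName -NewName $oldFileName
                $action = "original file renamed to $oldFileName"
            }
            else {
                Remove-Item -Path $file.FullName
                $action = "original file deleted"
            }
            # $file.FullName still holds the original path, so this recreates the file as UTF-8.
            # -NoNewline avoids appending an extra line break to the -Raw content.
            Set-Content -Path $file.FullName -Value $content -Encoding utf8 -NoNewline
            # Log only after the converted file has actually been written
            Add-Content -Path $logPath -Value "Converted $($file.FullName) to UTF-8 and $action"
        }
        else {
            Add-Content -Path $logPath -Value "Skipped $($file.FullName) as it's already in UTF-8 format"
        }
    }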
}

ConvertToUTF8 -path "C:\path\to\your\files\" -extensions @('*.mqh', '*.mq4', '*.mq5') -logPath "C:\path\to\your\log.txt" -preserveOldFiles $true
 
Marcin Madrzak #:
Here is the PS script to convert all UTF-16 files to UTF-8. I had the same problem using github and having a lot of UTF-16 source files - this script did the trick. Use at your own risk.


Nice script!

Thank you.

However, the files converted to UTF-8 contain some errors if the original encoding is something other than UTF-16, for example ANSI.

I suggest the 'Encoding Checker' program, which detects encodings and converts the files accurately. Encoding Checker currently supports over forty charsets.

https://github.com/amrali-eg/EncodingChecker


GitHub - amrali-eg/EncodingChecker: A GUI tool that allows you to validate the text encoding of one or more files. Modified from https://encodingchecker.codeplex.com/
File Encoding Checker is a GUI tool that allows you to validate the text encoding of one or more files. The tool can display the encoding for all selected files, or only the files that do not have the encodings you specify. File Encoding Checker requires Microsoft .NET Framework 4 to run. Fixed issues Sorting the results by clicking a column...
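For anyone who prefers a script, the idea of checking the encoding first and only then converting can be sketched as follows. This is a Python sketch, not the Encoding Checker tool and not a replacement for it: it recognises only UTF-8 and UTF-16 byte-order marks, then tries plain UTF-8, and finally falls back to a single ANSI code page. The fallback "cp1252" and both function names are assumptions made for this example; use the code page your files were actually saved in.

```python
import codecs


def decode_source(raw: bytes, fallback: str = "cp1252"):
    """Decode raw file bytes; return (text, name of the encoding that was used)."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le"), "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be"), "utf-16-be"
    try:
        # No BOM: plain ASCII and BOM-less UTF-8 both decode cleanly here.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumed ANSI code page; raises if a byte is undefined in that page.
        return raw.decode(fallback), fallback


def convert_file_to_utf8(path: str, fallback: str = "cp1252") -> str:
    """Rewrite the file in place as UTF-8 without a BOM; return the source encoding."""
    with open(path, "rb") as handle:
        raw = handle.read()
    text, source_encoding = decode_source(raw, fallback)
    with open(path, "wb") as handle:
        # Bytes are written as-is, so the original line endings are preserved.
        handle.write(text.encode("utf-8"))
    return source_encoding
```

Since convert_file_to_utf8 overwrites the file, it is safer to run it on a copy or on files that are already committed.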