Question for developers - using all computational cores during optimisation - page 3

 
Renat Fatkhullin:
Rebuild of the tester is a priority for us now. A lot of things will be rewritten.

The problem of a rational task manager is solved.

We would like to know the timeline for fixing the error, as soon as possible please... Can you give us an estimate of how long we will have to wait?

 
Maksim Emeliashin:

I have written about this problem many times, but each time I was told to go and read how the genetic algorithm works. I do know how it works; in my 4th year at university I even implemented it myself as a lab assignment.

My situation was even worse, here is a screenshot:


With build 2286 it got better: there is no such obvious bug anymore, but periodically half of the agents still stop working for good. I know how to work around it, but it's a pain in the ass.

Describe the problem!

The older the generation, the fewer cores are needed for the calculation.

How would you keep 18 agents busy with only 3, 4 or 5 unique parameter sets in the next generation?

You say you know how genetics works - give us your suggestions.
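
The question above can be made concrete with a minimal Python sketch. It assumes (this is an assumption for illustration, not documented tester behaviour) that each unique parameter set is one indivisible pass that runs on a single agent. The function name `busy_agents` is made up.

```python
def busy_agents(unique_sets: int, agents: int) -> tuple[int, int]:
    """Return (busy, idle) agent counts for one generation.

    Assumption: one unique parameter set = one indivisible pass,
    so a pass cannot be split across several agents.
    """
    busy = min(unique_sets, agents)
    return busy, agents - busy


if __name__ == "__main__":
    # 18 agents, as in the question; the set counts are illustrative.
    for sets in (3, 4, 5, 64):
        busy, idle = busy_agents(sets, agents=18)
        print(f"{sets:>2} unique sets -> {busy} busy, {idle} idle")
```

Under that assumption, a late generation with 3 to 5 unique sets cannot occupy more than 3 to 5 of the 18 agents, however the jobs are distributed.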

 
Boris Egorov:

We would like to know the timeline for fixing the error, as soon as possible please... Can you give us an estimate of how long we will have to wait?

Which error are you referring to?

Have you read how the genetic optimization algorithm works?

 
Slava:

Describe the problem!


I will describe a solution that does not require knowledge of the algorithm.

1. Disable one processor core at the moment the problem occurs (half of the local or network agents have stopped working). It is important to disable a core that is currently busy.

2. Switch the core back on.

And suddenly all the other local and network agents come back online and work fine until the very end.

 
Maksim Emeliashin:

I will describe a solution that does not require knowledge of the algorithm.

1. Disable one processor core at the moment the problem occurs (half of the local or network agents have stopped working). It is important to disable a core that is currently busy.

2. Switch the core back on.

Suddenly, all other local and network agents come online and work fine until the end.

Yes, I even suspect why the "error" occurs and why this trick "fixes" it. But without the source code of MQ's specific implementation in front of us, it is pointless to speculate about it.

But even with only a black box in front of us, we can assume that the problem lies in how job packages are distributed between agents.
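
The suspicion about job-package distribution can be illustrated with a small Python simulation. Both schemes here are hypothetical and do not describe MQ's actual implementation: `static_makespan` hands each agent one fixed package up front, while `dynamic_makespan` lets every agent pull the next pass as soon as it is free. The workload numbers are made up.

```python
import heapq


def static_makespan(durations, agents):
    """Split passes into equal contiguous packages up front (hypothetical scheme)."""
    size = -(-len(durations) // agents)  # ceiling division
    packages = [durations[i:i + size] for i in range(0, len(durations), size)]
    # The run ends when the agent with the heaviest package finishes.
    return max(sum(p) for p in packages)


def dynamic_makespan(durations, agents):
    """Each agent pulls the next pass as soon as it becomes free."""
    free_at = [0.0] * agents  # min-heap of times at which agents become free
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)       # earliest available agent
        heapq.heappush(free_at, start + d)   # it is busy until start + d
    return max(free_at)


if __name__ == "__main__":
    # Made-up workload: 8 slow passes followed by 24 fast ones.
    passes = [10.0] * 8 + [1.0] * 24
    print("static :", static_makespan(passes, agents=4))
    print("dynamic:", dynamic_makespan(passes, agents=4))
```

With this made-up workload the static split takes 80 time units, because one agent receives all the slow passes while the others sit idle, and the pull-based queue takes 26. That is the kind of picture in which some agents appear to stop while a few keep working.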

 
Slava:

What error are you referring to?

Have you read how the genetic optimization algorithm works?

I don't need to know the algorithm, although I do.

And don't try to act smart, because it doesn't come across that way.

If you have not read the previous posts and have not seen the pictures, do not interfere and do not show your ignorance.

The bug is .... It was not present in previous versions, and that is undeniable.

It amazes me sometimes when some guys show up out of nowhere, read nothing, and write crap as if they were smart.

Slava, read my previous posts with the pictures; everything is described there in detail. I'm a programmer myself, and I don't do such stupid things. What you write about generations is nonsense ... It's useless to explain anything if you don't read the previous posts with the pictures. Besides, I think you don't know the algorithm yourself ...

>The older the generation, the fewer cores are needed for calculations.

>How to use 18 agents for 3-4-5 unique sets of parameters in next generation?

It works like this right from the start, in the second generation; in my case there are another 70-80k variants still to be calculated ... It accepts LOTS of jobs ONLY for local agents and doesn't use network agents at all. In effect all network agents have been disabled completely, optimization is not working at all, the error is CRITICAL and needs to be fixed immediately.

 
Boris Egorov:

I don't need to know the algorithm, although I do.

And don't try to act smart, because it doesn't come across that way.

If you have not read the previous posts and have not seen the pictures, do not interfere and do not show your ignorance.

The bug is .... It was not present in previous versions, and that is undeniable.

It amazes me sometimes when some people show up out of nowhere, read nothing, and write crap as if they were smart.

Slava, read my previous posts with the pictures; everything is described there in detail. I'm a programmer myself, and I don't do such stupid things. What you write about generations is nonsense ... It's useless to explain anything if you don't read the previous posts with the pictures. Besides, I think you don't know the algorithm yourself ...

You showed one screenshot. Without any description, other than "not all cores are loaded".

From this screenshot you can tell that the genetic algorithm is running and the second generation is being calculated. The minimum and maximum execution time per task are unknown. The average execution time is also unknown - the right part of the screenshot is simply covered.

Again a guess - the average execution time is very short. Therefore, the job redistribution mechanism has not yet been activated.

The redistribution mechanism has not changed since previous versions, for at least half a year. It seems that most of the randomly selected parameters are unsuitable for this strategy, so most of the passes ended very quickly.

This is just a diagnosis from one incomplete screenshot. Without any logs provided.

 
Slava:

You showed one screenshot. Without any description other than "not all cores loaded".

From this screenshot you can tell that the genetic algorithm is running and the second generation is being calculated. The minimum and maximum execution time per task are unknown. The average execution time is also unknown - the right part of the screenshot is simply covered.

Again a guess - the average execution time is very short. Therefore, the job redistribution mechanism has not yet been activated.

The redistribution mechanism has not changed since previous versions, for at least half a year. It seems that most of the randomly selected parameters are unsuitable for this strategy, so most of the passes ended very quickly.

This is just a diagnosis from one incomplete screenshot. Without any logs provided.

I use a full enumeration of parameters and clearly wrote that the optimization used to take 3 hours and now takes 11 and a half ... - that is your answer.

>The minimum and maximum execution time per task are unknown. The average execution time is also unknown - the right part of the screenshot is simply covered.

You don't need to know this at all.

>The redistribution mechanism has not changed since previous versions, for at least half a year. It seems that most of the randomly selected parameters are unsuitable for this strategy, so most of the passes ended very quickly.

It all started after the latest updates. I haven't changed the program; I basically only run calculations with different parameters. I'm telling you that the same program (without recompilation) with the same parameters used to take 3 hours to optimize and now takes 11 and a half, and in effect all the network agents are disabled .... So do not say that the distribution mechanism has not changed - it has definitely changed.
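
As a rough back-of-the-envelope check of the figures above, the Python sketch below converts the reported times into a slowdown factor and an implied share of the former throughput. It assumes, as an idealisation, that the workload is identical and that throughput is proportional to the number of working agents; the function name `effective_capacity` is made up.

```python
def effective_capacity(before_hours: float, after_hours: float) -> float:
    """Fraction of the former throughput implied by the slowdown.

    Assumptions: identical workload, and throughput proportional
    to the number of agents that are actually working.
    """
    return before_hours / after_hours


if __name__ == "__main__":
    share = effective_capacity(3.0, 11.5)
    print(f"slowdown: {11.5 / 3.0:.2f}x, effective capacity: {share:.0%}")
```

Going from 3 hours to 11.5 hours is a slowdown of roughly 3.8 times, which under these assumptions corresponds to about a quarter of the former capacity doing useful work.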

 
Boris Egorov:

I use a full enumeration of parameters and clearly wrote that the optimization used to take 3 hours and now takes 11 and a half ... - that is your answer.

>The minimum and maximum execution time per task are unknown. The average execution time is also unknown - the right part of the screenshot is simply covered.

You don't need to know this at all.

>The redistribution mechanism has not changed since previous versions, for at least half a year. It seems that most of the randomly selected parameters are unsuitable for this strategy, so most of the passes ended very quickly.

It all started after the latest updates. I haven't changed the program; I basically only run calculations with different parameters. I'm telling you that the same program (without recompilation) with the same parameters used to take 3 hours to optimize and now takes 11 and a half, and in effect all the network agents are disabled .... So do not say that the distribution mechanism has not changed - it has definitely changed.

You haven't provided any logs.

Why aren't your remote agents calculating? Why are they on build 2214? Is the client terminal also build 2214?

 
Slava:

You have not provided any logs.

Why aren't your remote agents calculating? Why are they on build 2214? Is the client terminal also build 2214?

2286

If you need logs, that is difficult; it's easier to run any Expert Advisor with a large parameter set for optimisation.

But if you tell me where to post the logs, I will try to do it.

I just don't understand why the logs grow beyond any imaginable size after a while, and there seems to be no way to turn them off or limit them, so I have to clean them out.

I can only do it in about 12 hours, when I start a new calculation.

By the way, the advice above to disable one of the working cores does work :-) which confirms a bug in the distribution algorithm.