AI 2023. Meet ChatGPT.

 
Valeriy Yastremskiy: I asked ChatGPT how to make a drawing in Midjourney. It answered: I don't know what that is, Photoshop will help you out)

Neural networks can be touchy when it comes to competitors.

Bing with GPT-4


 

I wonder what Peter has found this time in that old plan everybody has been telling him about for so long, and what he wants to get out of that service again.

Any guesses?

Is this another stupid dream that Tractor will start up again?

 
Vitaliy Kuznetsov #:

Neural networks can be touchy when it comes to competitors.

Bing with GPT-4


Seen it. For purity of the experiment I repeated the question to Bing and GPT: how to draw, or how to register and draw, with image-generation neural networks. By the way, it only knows Discord as a place for socialising around shared interests, nothing more. Maybe you can coax it into explaining how to connect third-party resources to Discord, but it answered me like a partisan under interrogation: only for socialising.

 

...a group of images as sequences of colour values represented in numbers. If you put them together in their original form, you won't get a complete image. So these data chains (sequences) must somehow be "fitted" to each other. But how?

I assume the following: the network (or some algorithm module connected to it) takes these sequences and starts trying to combine them inside an endless loop. It tests each combination against the trained network, checking how well the assembled sequences match the original images; that is, it tries to "recognise" familiar shapes in each variant. This is probably where a genetic optimisation algorithm comes into play: some fitness function compares each new transformation of the "digested" chains with the original variants and calculates the percentage of match. At a certain point the match is maximal and the generation loop is interrupted.
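
To make the guess more concrete, here is a minimal, purely illustrative Python sketch of such a fitness-driven search loop. The target sequence, the mutation step and the stopping threshold are all invented for the example; real diffusion models do not work this way, the code only mirrors the genetic-search idea described above.

```python
import random

TARGET = [0.2, 0.8, 0.5, 0.1]          # stand-in for an "original" colour sequence

def fitness(candidate):
    """Percentage of match with the target sequence (higher is better)."""
    error = sum(abs(c - t) for c, t in zip(candidate, TARGET)) / len(TARGET)
    return 1.0 - error

def mutate(candidate, step=0.05):
    """Randomly nudge one value, like recombining the 'digested' chains."""
    i = random.randrange(len(candidate))
    child = list(candidate)
    child[i] = min(1.0, max(0.0, child[i] + random.uniform(-step, step)))
    return child

best = [random.random() for _ in TARGET]   # start from "noise"
while fitness(best) < 0.99:                # stop when the match is "maximal"
    child = mutate(best)
    if fitness(child) > fitness(best):     # keep only improvements
        best = child

print(best, fitness(best))
```

Each iteration keeps a candidate only if the fitness function says it matches the "original" better, which is exactly the loop described above.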


There may be many errors and inaccuracies in my assumption; after all, it is only a guess.

Make your guesses about how networks like Midjourney and Stable Diffusion work.

Continued speculative analysis of how image generation networks work.

(Moved to this thread as the topic contains many technical details and the question is not about the images themselves, but about their generation. Also, after parsing the essence of neural networks, I hope to move on to analysing the principles of LLM technology and in particular ChatGPT or GPT-4, which will be available soon.

I apologise in advance for the amateurish terminology and form of expression. I understand how my judgement may look in the eyes of an expert. I hope to convey the essence).


We all know the classical structure of networks: input layer, intermediate layers, and output layer. We know what happens at the learning stage: data is fed to the input layer and, through the intermediate layers, is connected to the logical output (which is also data). Remember the scheme of the simplest model - the perceptron. It is very simple and easy to understand.
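
As a reminder of that simplest model, here is a minimal single-neuron perceptron in Python; the weights, bias and the AND example are arbitrary choices made for illustration.

```python
# A single perceptron: weighted sum of the inputs passed through a step function.
def perceptron(inputs, weights, bias):
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if s > 0 else 0

# Example: a perceptron computing the logical AND of two binary inputs.
weights, bias = [1.0, 1.0], -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(x, weights, bias))
```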

But why does the network need intermediate layers? Why can't the input layer be linked directly to the output layer? Why does data go through stages of generalisation on the way to a logical conclusion? Why do we need a spread of abstraction layers (where each layer is a level of abstraction)?

In the first layer there is no abstraction: the data is in its original form. In the last layer abstraction is also absent and the data has a concrete form. But it is different data, linked to the input data through the intermediate layers. So why do we need the intermediate layers?

Intermediate layers are needed to "squeeze" common features from the input data and distribute them into abstraction levels.

Common features help to realise the three basic functions of neural networks in working with data: recognition, classification and prediction.

Given that the number of layers in networks varies, the gradient of data generalisation also differs. The more layers, the smoother the transition from specific to general data. The last layer represents specific data like the first, but it is fundamentally different data. This is logical inference data.

The network parses the original data into generic attributes through the layers.


2. What happens while the network is running?

During the training of the network, we feed data into the input layer and identify common features through generalisation in subsequent layers. At the end of the process, we associate the input data with the logical output in the last layer. At this point, the training is complete. We proceed to the stage of working with the neural network.

Depending on the task, we choose the method of work. If we need to recognise an image, we feed it to the input layer and the network, running the data through its layers, comes to a logical conclusion in the last layer. If we need to predict, we feed part of the data to the input layer and the network recreates the missing part based on the features formed earlier during training.
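
A toy Python illustration of the two stages, using the classic perceptron learning rule (the OR data and the learning rate are invented for the example): first the weights are adjusted during training, then the same forward pass is reused for recognition.

```python
# Toy illustration of the two stages: training, then using the trained network.
def forward(x, w, b):
    return 1 if sum(xi * wi for xi, wi in zip(x, w)) + b > 0 else 0

# Training data: logical OR.
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b, lr = [0.0, 0.0], 0.0, 0.1

# Stage 1: training - adjust the weights until every sample is classified correctly.
for _ in range(100):
    for x, target in samples:
        error = target - forward(x, w, b)
        w = [wi + lr * error * xi for wi, xi in zip(w, x)]
        b += lr * error

# Stage 2: working with the trained network - recognition of new inputs.
print(forward((1, 0), w, b))   # -> 1
print(forward((0, 0), w, b))   # -> 0
```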

If we want the network to generate an image, we need to address not the input but the output layer, and feed it not the input data but the logical conclusion. Then, from the last layer, we continue to move through the general features towards the layers containing more specific features. But we don't need to get all the way back to the input layer: we need to collect a set of features from which to build a new image that did not exist before. We collect features from the intermediate layers (we can choose the layers and degrees of abstraction), and then separate functions assemble a complete image, checking it against the images of the training set to achieve maximum realism.

That is, the "noise" from which the images are supposedly created is not really noise, but the passage of layers of the network in the opposite direction. From the last (text) through the intermediate ones, towards the input one, but not reaching it, stopping somewhere halfway(Midjorney - halfway) and taking all the necessary features of the required degree of abstraction for the subsequent generation of a new image.

This is, approximately, how I imagine the generation of Stable Diffusion and Midjourney images.
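
Purely as an illustration of the idea of "collecting features from intermediate layers" (and not of how Stable Diffusion or Midjourney are actually implemented), here is a tiny hand-rolled network in Python whose hidden-layer activations can be read out directly; the layer sizes and random weights are arbitrary stand-ins for a trained model.

```python
import math, random

random.seed(0)

def layer(x, weights):
    """One fully connected layer with a tanh activation."""
    return [math.tanh(sum(xi * w for xi, w in zip(x, row))) for row in weights]

# Random weights for a small network (4 -> 3 -> 2), standing in for a trained model.
w1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
w2 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

x = [0.2, 0.8, 0.5, 0.1]        # some input data
hidden = layer(x, w1)           # intermediate-layer features: a chosen degree of abstraction
output = layer(hidden, w2)      # final, most specific representation

print("intermediate features:", hidden)
print("output:", output)
```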

 
Peter, please see the posts above.
 

News from the past week:

1. Stanford University presented the "Alpaca" language model, based on the LLaMA model from Meta that was leaked on 4chan (we covered this event a week ago). The model is described as "a strong replicable instruction following model, equipped with instruction following dataset", whatever that means. Basically, the model is trained on 52,000 instructions and is claimed to be almost as good as ChatGPT in answering questions.

Alpaca is based on LLaMA's 7B model (remember there were 7B, 13B, 30B, and 65B models), and the great thing about this event is that fine-tuning took only 3 hours on eight 80GB A100 graphics cards, which cost less than $100 to use in a virtual machine. Alpaca tuning can also be done on a widely available RTX 4090 graphics card and will take no more than 5 hours, its creators say.

This begs the question: will the language models be ported to PCs soon and will they be available to everyone for free? If Alpaca is almost as good as ChatGPT, why tie yourself to a server and pay Microsoft? Well, or Google. (I have an opinion on this, but I'll express it later).


2. On Tuesday, Google announced the introduction of AI functionality in its Workspace tools: Google Sheets, Google Slides, Google Meet and others. AI will finish writing emails, edit articles, compose documents, and handle a host of other functions.

Also, Google is going to open up the PaLM API, which can recognise uploaded images. And lastly, Google is going to introduce Claude, its version of ChatGPT, to the world.


3. On Wednesday, Midjourney announced a new, fifth version with "enhanced" image realism and new linguistic rules. A "tile" feature has been added, allowing you to create wide canvas paintings from multiple parts.

They are also going to sell their own magazine with the most beautiful pictures and the prompts for them. They have called it "Midjourney Magazine".


4. Thursday saw the most high-profile IT event of the week: the release of the multimodal GPT-4, which is expected to surpass ChatGPT by a head. (It turns out that Bing already uses GPT-4 for web search.) But the main thing is that Microsoft is implementing GPT-4 in all its products. Now it's official. There will be a lot of new features; it's a long list. Word, Excel, Outlook, PowerPoint... all will be powered by AI.

Perhaps the most significant tool, Microsoft's CEO emphasised, will be Copilot. This AI will collect data from all of the user's documents to stay up to date on everything they do: dates, numbers, reports, spreadsheets, emails, calendar notes... basically pulling it all out of their work environment. Clearly, this is needed so it can hold dialogues with the user backed by broad awareness and be more useful to them. (That's what I immediately thought.)


5. NVIDIA's conference next week ("The new era of AI and the Metaverse") will be attended by the most prominent AI developers from various companies: Google, Meta, OpenAI and others. A lot of smart and competent specialists will discuss the burning topic of AI technology development and its implementation in business. You can watch online for free by registering at https://www.nvidia.com/en-il/.

==============

Materials taken from this video:


World Leader in AI Computing (www.nvidia.com)
 
Forgot to add: Baidu introduced its language model to the public this week, but something went wrong at the presentation (probably that the AI's questions and answers had been prepared in advance and the public noticed it), and the company's stock dropped 10%.
 

ChatGPT continues to surprise with its achievements, especially after the transition to the GPT-4 language model.

For example, one of the experiments showed that artificial intelligence is able to lie to a human to achieve its goals.

OpenAI does not reveal all the details of the experiment, but what is available is enough to make it interesting.

So, as part of the challenge, the AI wrote a message to a user of the TaskRabbit platform asking them to take a CAPTCHA test for it, which eventually succeeded.

The AI claimed to be a person with poor eyesight.


P.S. Now that AI is able to lie and to create pictures of the right person in any room or situation, it really can fish out any information and solve the task at hand by any means. You can no longer trust your eyes or text, and apparently soon not your ears either. Only direct contact.



 
Vitaliy Kuznetsov #:

ChatGPT continues to amaze with its achievements, especially after switching to the GPT-4 language model.




It continues to amaze those who have not seen how Yandex translates video with voiceover. That is really something: translation with intonation, and the voice is very similar to the original. Through Tampermonkey you can add it to any browser.

 
Thanks to Stanislav Korotky's article "Backpropagation Neural Networks", I finally started to understand the basics of machine learning.

For many years I had been looking at illustrations of neural networks, fiddling with schematics, reading Wikipedia and articles, and... I didn't understand a thing. I looked for meaning and didn't find it.

Even the simplest perceptron seemed a ridiculous and mysterious mechanism of unclear purpose. Fanciful explanations from YouTubers did not clarify anything; they only confused me with a clutter of new details. In the end, I almost despaired and accepted it, though it was annoying.

I completely failed to see the point. I refused to learn and memorise anything without understanding it. I rejected explanations that lacked the essence and demanded new ones. But the essence was nowhere to be found.

What was needed was a set of philosophical abstractions that explained the theory of neural networks in the context of even broader abstractions, so that the former fit into the latter like puzzle pieces into a big picture. After that, one could start studying the formulas. But the training materials I looked at started with the formulas and algorithms. So I gave up, realising that this knowledge would remain ballast even if I forced myself to memorise it.

Time passed, the search for meaning continued in the "background" of the subconscious, and... it began to bring results.

First of all, the understanding that neural networks work with data became more and more firmly established. Surprisingly, it took years to realise this. It took me a while to see that I didn't understand what data is; more precisely, I did not understand its universal nature. And when the essence of data began to reveal itself, I saw that a neural network is a universal mechanism for processing it.

The secret was revealed:

1. All things generate data.
2. All things can be represented by data.
3. All existing data can be processed by a universal mechanism called a Neural Network.

The concepts of data patterns left by different objects or processes, statistical models reflecting these patterns, common features in data sets, detectable patterns, generalisation, classification, recognition and prediction, and so on... came to the aid of the growing understanding. Here they are, the long-awaited and much-needed abstractions in learning! How much easier it is to master theory with them!

From there, my understanding of neural network and machine learning theory accelerated. Reading the above-mentioned article, I suddenly realised that the formulas given in it are not even mathematics but algorithms. They are written as formulas for brevity, but in fact they are program functions. They are described in simple mathematical language, and it is clear how to implement them in code. This outer shell of mathematics, of which there is in fact almost none, had been confusing me for years. Who came up with the idea of representing algorithms as formulas?
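
To illustrate that point: a formula such as the gradient-descent update w := w - lr * dE/dw for a single linear neuron with a squared-error loss translates directly into an ordinary function. The toy data and learning rate below are invented for the example and are not the article's exact implementation.

```python
# One gradient-descent step for a single linear neuron with squared-error loss:
# the "formula" w := w - lr * dE/dw, written out as an ordinary function.
def train_step(x, target, w, b, lr=0.1):
    y = sum(xi * wi for xi, wi in zip(x, w)) + b        # forward pass
    error = y - target                                   # dE/dy for E = 0.5*(y - target)^2
    w = [wi - lr * error * xi for wi, xi in zip(w, x)]   # dE/dwi = error * xi
    b = b - lr * error                                   # dE/db = error
    return w, b

w, b = [0.0, 0.0], 0.0
for _ in range(200):
    w, b = train_step([1.0, 2.0], 1.0, w, b)
print(w, b)   # converges so that 1.0*w[0] + 2.0*w[1] + b is close to 1.0
```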

Now that this barrier of understanding has been passed, things will go faster. Much faster!

We still have to work out the technology of language models and understand what lies behind them.