AI 2023. Meet ChatGPT. - page 73

 

The intellectual power of GPT-4:


 
Sergey Gridnev #:
What's up with the scalper testing?



It doesn't matter. The character's about to start using the field for zeno-wod gossipers! Plus code-wankers.... ;-)

Horror! IMHO.... ;-)

With a complete lack of sane meaningfulness.... ;-)
 
Реter Konow #:

Planned testing of ChatGPT is cancelled as it is rapidly fading into history.

A new intellectual monster is rising: GPT-4.

25K tokens vs. 4K!

Qualitatively new level of text and code + illustrations and image handling.

We will test GPT-4.

No. Changed my mind.

We will test ChatGPT after all. Although it doesn't have much time left, we will have results which we will then use to compare with GPT-4.

This will help us measure the speed of progress and evaluate the near future.

We will start the tests tonight.

 

And then there's this.

Great, here is code you can use to create a Google Chrome extension:


```
// Set the reminder interval in milliseconds (3600000 milliseconds = 1 hour)
const interval = 3600000;

// Function that will be executed every interval of time
function remindToRest() {
  alert("It's time to rest! Pause and take your eyes off the monitor.");
}

// Start the timer for the reminder
setInterval(remindToRest, interval);
```


To create an extension, you need to follow these steps:


1. Create a new folder and name it, for example, "my-extension".

2. In the folder, create a manifest.json file with the following contents:


```
{
  "name": "The name of your extension",
  "version": "1.0",
  "manifest_version": 2,
  "description": "Description of your extension",
  "browser_action": {},
  "permissions": [
    "alarms",
    "notifications"
  ]
}
```


3. Copy the code from the example into the content.js file in the "my-extension" folder.

4. Open Google Chrome, go to chrome://extensions/ and enable "Developer Mode".

5. Click on the "Load unpacked" button and select the "my-extension" folder.

6. The extension is installed! It will now remind you every hour to rest.


You can use the Chrome notifications API (chrome.notifications) to notify the user. To add a notification, add the following code to the remindToRest() function:


```
chrome.notifications.create({
  type: "basic",
  iconUrl: "icon.png",
  title: "It's time to rest!",
  message: "Take a break and take your eyes off the monitor."
});
```


Remember that before you do this, you need an image for the extension icon, saved as "icon.png" in the "my-extension" folder.
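A side note on the generated code: the manifest asks for the "alarms" and "notifications" permissions, but the script only uses setInterval and alert. Below is a rough sketch of how the two requested APIs could be combined. It is not what ChatGPT produced. It assumes the script is registered as a background script (the manifest above does not do this), because chrome.alarms and chrome.notifications are not available to content scripts. The names scheduleRestReminder, ALARM_NAME and the chromeApi parameter are made up for illustration.

```javascript
// Illustrative alarm name (an assumption, not from the original post).
const ALARM_NAME = "rest-reminder";

// Hypothetical helper: schedules a repeating alarm and shows a notification
// each time it fires. `chromeApi` is passed in so the function can also be
// exercised outside the browser; inside an extension, pass the global `chrome`.
function scheduleRestReminder(chromeApi, periodInMinutes = 60) {
  // Repeating alarm managed by the browser instead of setInterval.
  chromeApi.alarms.create(ALARM_NAME, { periodInMinutes });

  chromeApi.alarms.onAlarm.addListener((alarm) => {
    // Ignore alarms that belong to something else.
    if (alarm.name !== ALARM_NAME) return;
    chromeApi.notifications.create({
      type: "basic",
      iconUrl: "icon.png",
      title: "It's time to rest!",
      message: "Take a break and take your eyes off the monitor."
    });
  });
}

// Only runs where the extension APIs actually exist.
if (typeof chrome !== "undefined" && chrome.alarms) {
  scheduleRestReminder(chrome);
}
```

Whether this is better than the setInterval version depends on where the script runs; treat it as a starting point to check against the Chrome documentation.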


 
Vitaliy Kuznetsov #:
... use resources locally rather than from a global data centre.

...

  • The scaling and embedding of models into work and development environments will continue.
  • The current trend of total AI expansion will continue and will require the collection, loading, and processing of unthinkable amounts of data that will be collected from all areas of human activity.
  • Identifying patterns and training neural networks on connections and patterns of work processes requires global computing resources. This allows us to reach new levels of automation.
  • The number of users sending AI queries to the server will continue to grow and will require proportional scaling. All work and development environments will be permanently connected to data centres.


AI cannot be used locally. The more capabilities it has, the more data and resources it requires, and the less likely an offline version becomes.

Perhaps in the future, a local computer will be able to run a version of ChatGPT limited to chatbot functions, but no more than that.


Conclusion: isolated, local and offline use of working software on PCs, without AI support, will be a thing of the past.

Connecting to AI means connecting to a server and I don't see any other way.


 

Current Neural Network technology promises to increase AI intelligence in proportion to the increase in training set data and data centre capacity. It may seem so at first glance. You can even construct a graph. But, somewhere buried here is a limit that has yet to be found.

Increasing data and processing power will continue to increase AI intelligence, but not indefinitely. Somewhere there is a boundary and a plateau. But where? And when?

 
Реter Konow #:
... how to test this devil box that knows EVERYTHING. But my confusion was short-lived.

First of all, it became clear that ChatGPT does not count well. Of course it knows maths, but it solves only easy equations. It also repeatedly showed logical lapses and contradictions in its judgements. It clearly knows elementary logic, but as with equations, it "fails" when the complexity is increased. However, it was not interesting to specially develop tests to reveal the exact level of its mathematical and logical abilities. Clearly, this is the level of the third to fifth grade of primary school.

The previous question remained unanswered: how to define the area of domestic and professional application of AI, and to assess its real commercial potential? And where will the boundary of possibilities be in a month, a year? Of course, it depends on the technology, and who among us here knows it well enough to say for sure? The product is not finished and is constantly evolving.

We can say that at the moment we are dealing with a half-finished product that is both good and bad at the same time. It is unfinished. No one can say what other opportunities the technology will offer at new stages of development, but right now it is like a fashionable suitcase without a handle. It both exceeds expectations and disappoints at the same time.


Nevertheless, we will continue testing and move on to benchmarking: comparative analysis against a reference. We will compare ChatGPT with Wikipedia. The goal is to evaluate AI as a source of complete and reliable information by comparing it with another, proven source. Let's figure out whether AI can be trusted in responsible and important work.


In the next posts I will outline the prepared criteria and tests.


data structure in the table (i.e., the quality of the tables).

Note the last point. The structure and content of a table reflect the quality of the subject data classification. In the case of Wikipedia, the data is classified by humans, but we will be testing the AI (not humans). Let's see how well an advanced language model classifies data (links and distributes it). The goal is to study and evaluate in detail the processing and inference of subject data, since this is the essence of the purpose of conversational AI.

Content, completeness, accuracy and ordering of information are all parameters to be tested.


The disciplines chosen for testing are:

  • Astronomy
  • Physics
  • Zoology

Each of these sciences has both descriptive and tabular content, on the basis of which branching classification models can be built. From this perspective, these fields of knowledge are well suited for our testing.


Conclusion:

I will talk about the test tasks, checking and summarising the results, and the conclusions in the next posts.


 

Let me remind you of the previously approved ChatGPT testing plan:


1. Test breadth of knowledge.

2. Checking the depth of knowledge.


3. Testing the logicality of judgement.

4. A test of mathematical ability.


5. Comparison of ChatGPT and Wikipedia in terms of completeness and orderliness of subject description.

6. Comparison of ChatGPT and Wikipedia in terms of completeness and orderliness of data presentation in tables.


Items 5 and 6 compare the quality of structuring of subject information by humans (while Wikipedia is still written by humans) and by ChatGPT.


Disciplines chosen for testing:

  • Astronomy
  • Physics
  • Zoology

 

Primary objective: to determine the feasibility of using ChatGPT as a source of reliable information in responsible work.

Secondary objective: to determine the technical limitations of an LLM as an AI.

Parameters tested: Content, completeness, accuracy and orderliness of information.

 

Let me remind you of the tests that were conducted earlier and the results:

1. Test for breadth of knowledge.

Result: ChatGPT knows orders of magnitude more than a human. The volume of its theoretical knowledge is roughly equivalent to a large city library.


2. Testing for depth of knowledge.

Result: there is no exact result at the moment. It is difficult to say how well ChatGPT knows the topics it discusses, and impossible to check because of the token limit. Chat can't give out ALL its knowledge on a subject, because as it talks, it will exceed the token limit and "forget" the previously stated material.

Also, Chat always answers in the context of the question and this limits the completeness of the content. We cannot accurately measure ChatGPT's depth of knowledge.
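To make the "forgetting" concrete, here is a toy sketch of a fixed-size context window. It is only an illustration and does not describe how ChatGPT is actually implemented: tokens are approximated by whitespace-separated words, and the names countTokens and fitToWindow are hypothetical.

```javascript
// Crude stand-in for a tokenizer (assumption): counts whitespace-separated words.
function countTokens(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Keep the most recent messages whose combined size fits the token budget.
// Older messages that do not fit are dropped, i.e. "forgotten".
function fitToWindow(messages, maxTokens) {
  const kept = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}

const history = [
  "Part one of a long explanation", // 6 words
  "Part two continues the topic",   // 5 words
  "Part three adds more detail"     // 5 words
];

console.log(fitToWindow(history, 10));
```

With a budget of 10, only parts two and three fit, so part one falls out of the window. The same thing happens on a larger scale when a long answer exceeds the limit.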


3. Logic Check.

Result: Chat is relatively weak in logic. About as good as a junior high school child.


4. Maths aptitude test.

Result: Compared to an adult, Chat is poor at solving complex equations and problems, even though it knows the theory and methods very well.