Machine learning in trading: theory, models, practice and algo-trading - page 2875

 
Maxim Dmitrievsky #:
Try to make the question more specific and I'll ask it.

Maybe ask it for the simplest example of using logistic regression in a classification problem with a variable number of features?

 
Aleksey Nikolayev #:

Maybe ask it for the simplest example of using logistic regression in a classification problem with a variable number of features?

I'll try a bit later. If you have more ideas for questions, send them over :)
 
Maxim Dmitrievsky #:
I'll try a bit later. If you have more ideas for questions, send them over :)

1) Does it itself use a model with a non-fixed number of features? (Seems so)

2) Ask it for the general form of a recursive function that assigns a number to a numeric list of arbitrary length

 
Aleksey Nikolayev #:

Maybe ask it for the simplest example of using logistic regression in a classification problem with a variable number of features?

If the number of features is always different, it may be difficult to apply traditional machine learning algorithms that expect a fixed-size feature vector as input. One solution to this problem is to use a method called "feature hashing," which allows you to represent a variable-length feature set as a fixed-size feature vector.

Feature hashing works by applying a hash function to the features in the feature set and using the hash values as indices in a fixed-size feature vector. This allows you to map a variable-length feature set to a fixed-size vector, which can then be used as input to a machine learning model.

For example, suppose you have a dataset with a variable number of features, and you want to use logistic regression to classify the data. You can apply feature hashing to the features in the dataset, resulting in a fixed-size feature vector that can be used as input to the logistic regression model.

Overall, feature hashing is a useful technique for handling datasets with a variable number of features, and can be used in combination with machine learning algorithms such as logistic regression to solve classification problems.
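The answer above stays abstract, so here is a minimal Python sketch of the idea, using only the standard library. It is an illustration under assumptions, not the method of any particular library: the feature names (`ret_1`, `ret_2`, `vol`), the vector size of 16, the MD5-based hash and the plain SGD training loop are all hypothetical choices made for the example.

```python
import hashlib
import math
from typing import Dict, List, Sequence, Tuple


def hash_features(features: Dict[str, float], dim: int = 16) -> List[float]:
    """Map a variable-length {name: value} dict to a fixed-size vector."""
    vec = [0.0] * dim
    for name, value in features.items():
        # Stable hash of the feature name (built-in hash() is salted per process).
        h = int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:8], "big")
        index = h % dim                         # bucket for this feature
        sign = 1.0 if (h >> 63) == 0 else -1.0  # signed hashing: collisions tend to cancel
        vec[index] += sign * value
    return vec


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logreg(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 200) -> Tuple[List[float], float]:
    """Logistic regression fitted by plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log-loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


# Toy samples with different numbers of features (hypothetical names and values).
samples = [
    {"ret_1": 0.5, "ret_2": 0.3},
    {"ret_1": -0.4},
    {"ret_1": 0.2, "ret_2": 0.1, "vol": 1.0},
    {"ret_1": -0.3, "ret_2": -0.2},
]
labels = [1, 0, 1, 0]

X = [hash_features(s) for s in samples]  # every row now has length 16
w, b = train_logreg(X, labels)
print(predict_proba(w, b, hash_features({"ret_1": 0.4})))
```

Note that hashing only fixes the input size. Different features can land in the same bucket, so the vector size trades memory against collisions.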

 
Aleksey Nikolayev #:

1) Does it itself use a model with a non-fixed number of features? (Seems so)

2) Ask it for the general form of a recursive function that assigns a number to a numeric list of arbitrary length

On 1, it is rather reluctant to talk about itself, but it's a transformer under the hood, so yes.

a general view of a recursive function that maps a number to a numeric list of arbitrary length

A recursive function is a function that calls itself as part of its execution. Here is a general example of a recursive function that maps a number to a numeric list of arbitrary length:

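from typing import List

def map_to_list(n: int, lst: List[int]) -> List[int]:
  # Base case: nothing left to append, return the accumulated list.
  if n == 0:
    return lst
  else:
    # Recursive case: append n, then continue with n-1 (assumes n >= 0).
    lst.append(n)
    return map_to_list(n-1, lst)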

This function takes an integer n and a list lst as input. It then checks if n is equal to 0. If it is, the function returns the list lst. If n is not equal to 0, the function appends n to the end of the list lst and then calls itself with the modified lst and the value n-1. This process continues until n is equal to 0, at which point the function returns the final version of the list.

This is just one example of how a recursive function can be used to map a number to a list. There are many other ways to write such a function, and the specific implementation will depend on the specific requirements and constraints of the problem you are trying to solve.


show this code in R language

Here is the recursive function from the previous example, written in R:

map_to_list <- function(n, lst) {
  if (n == 0) {
    return(lst)
  } else {
    lst <- c(lst, n)
    return(map_to_list(n-1, lst))
  }
}

This function is similar to the previous example, but it uses R syntax and functions. The if statement is used to check if n is equal to 0, and the return function is used to return the final version of the list. The c function is used to append n to the end of the list lst.

As before, this is just one example of how a recursive function can be used to map a number to a list. The specific implementation will depend on the specific requirements and constraints of the problem you are trying to solve.

 
Maxim Dmitrievsky #:

Feature hashing works by applying a hash function to the features in the feature set and using the hash values as indices in a fixed-size feature vector. This allows you to map a variable-length feature set to a fixed-size vector, which can then be used as input to a machine learning model.

Then have it tell us which kinds of hash functions are used for numeric vectors.

 
Maxim Dmitrievsky #:

If the number of features is always different, it may be difficult to apply traditional machine learning algorithms that expect a fixed-size feature vector as input. One solution to this problem is to use a method called "feature hashing," which allows you to represent a variable-length feature set as a fixed-size feature vector.

Feature hashing works by applying a hash function to the features in the feature set and using the hash values as indices in a fixed-size feature vector. This allows you to map a variable-length feature set to a fixed-size vector, which can then be used as input to a machine learning model.

For example, suppose you have a dataset with a variable number of features, and you want to use logistic regression to classify the data. You can apply feature hashing to the features in the dataset, resulting in a fixed-size feature vector that can be used as input to the logistic regression model.

Overall, feature hashing is a useful technique for handling datasets with a variable number of features, and can be used in combination with machine learning algorithms such as logistic regression to solve classification problems.

upd


For data with a variable number of features, some possible representations you can use include:

  1. Fixed-size feature vectors: As mentioned before, you can use feature selection techniques to select a fixed number of features from your data and represent it as a feature vector.

  2. Sparse feature vectors: Instead of using a dense feature vector, you can use a sparse feature vector, which only includes a subset of the possible features. This can be useful if you have a very large number of possible features and only a small number of them are present in any given instance.

  3. Graph data: You can represent your data as a graph, with each instance represented as a node in the graph and the edges between nodes representing the relationships between instances. This can be useful if you are using a machine learning model that is designed to handle graph data, such as a graph convolutional network.

Again, the best representation will depend on the specific requirements and constraints of your problem. It's important to consider the characteristics of your data and the goals of your project when deciding which representation is best for your needs.
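Option 2 in the list above can be sketched in a few lines of Python. This is only an illustration under assumptions: a sample is stored as a hypothetical `{index: value}` dict holding just the features that are present, and a linear model scores it without ever building the dense vector.

```python
from typing import Dict, List


def sparse_dot(x: Dict[int, float], w: List[float]) -> float:
    """Dot product of a sparse sample with a dense weight vector."""
    return sum(value * w[index] for index, value in x.items())


def to_dense(x: Dict[int, float], dim: int) -> List[float]:
    """Expand a sparse sample to a dense vector; absent features become 0."""
    dense = [0.0] * dim
    for index, value in x.items():
        dense[index] = value
    return dense


# Two samples with a different number of present features.
a = {0: 2.0, 3: 1.0}
b = {1: 5.0}
weights = [1.0, 0.5, 0.0, 4.0]
print(sparse_dot(a, weights), sparse_dot(b, weights))
```

The samples differ in how many features they carry, yet both are scored against the same fixed-length weight vector.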

 
Aleksey Nikolayev #:

Then have it tell us which kinds of hash functions are used for numeric vectors.

Here is a list of some common hash functions that can be used to map integers or other data types to lists or other data structures:

  1. Modular hashing: Modular hashing is a simple hash function that maps an integer or other data type to a list or other data structure by taking the modulo of the value at different powers of a prime number.

  2. Polynomial hashing: Polynomial hashing is a hash function that maps an integer or other data type to a list or other data structure by evaluating a polynomial at a specific point. The polynomial is defined by the value and a prime number.

  3. Universal hashing: Universal hashing is a hash function that maps an integer or other data type to a list or other data structure using a randomly generated hash function. The hash function is chosen from a family of hash functions that have a low probability of collision.

  4. Hash tables: Hash tables are data structures that use a hash function to map keys to values. They can be used to map an integer or other data type to a list or other data structure.

  5. Hash maps: Hash maps are data structures that use a hash function to map keys to values. They can be used to map an integer or other data type to a list or other data structure.

  6. Hash sets: Hash sets are data structures that use a hash function to store a set of values. They can be used to map an integer or other data type to a list or other data structure.

These are just a few examples of hash functions and data structures that can be used to map integers or other data types to lists or other data structures. The specific hash function or data structure that you choose will depend on the characteristics of your data


Here are examples of the modular hashing, polynomial hashing, universal hashing, hash table, hash map, and hash set functions in R:

modular_hash <- function(n, length, p) {
  lst <- vector("numeric", length)
  for (i in 1:length) {
    lst[i] <- n %% p^(i+1)
  }
  lst
}

polynomial_hash <- function(n, length, p) {
  lst <- vector("numeric", length)
  for (i in 1:length) {
    lst[i] <- sum(n * p^(i-1)) %% p^(length+1)
  }
  lst
}

universal_hash <- function(n, length, p) {
  a <- sample(p^length, length, replace=TRUE)
  b <- sample(p^length, length, replace=TRUE)
  lst <- vector("numeric", length)
  for (i in 1:length) {
    lst[i] <- sum(a[i] * n + b[i]) %% p^(length+1)
  }
  lst
}

hash_table <- function(n, length, p) {
  lst <- vector("list", p^length)
 
Maxim Dmitrievsky #:

2 a general view of a recursive function that maps a number to a numeric list of arbitrary length

The other way around: a list as input, a number as output. Like building a forecast from quotes)

Then ask which ML algorithms can be used to build such functions.
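The shape being asked for here (list in, number out, defined recursively) can be sketched in Python as follows. This is a minimal illustration under assumptions: the names `fold_recursive` and `ema_step` are hypothetical, and the exponential moving average is just one possible choice of step function.

```python
from typing import Callable, Sequence


def fold_recursive(xs: Sequence[float],
                   step: Callable[[float, float], float],
                   init: float) -> float:
    """f([]) = init;  f(xs) = step(f(xs[:-1]), xs[-1])."""
    if not xs:
        return init
    # Recurse on everything but the last element, then fold the last one in.
    return step(fold_recursive(xs[:-1], step, init), xs[-1])


def ema_step(state: float, x: float, alpha: float = 0.5) -> float:
    """One update of an exponential moving average."""
    return (1.0 - alpha) * state + alpha * x


# Works for a list of any length (limited only by Python's recursion depth).
quotes = [4.0, 4.0]
print(fold_recursive(quotes, ema_step, 0.0))
```

Any function of this form is fixed by two things: the initial value and the step that combines the running result with the next element. The length of the list never enters the definition.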

 
Aleksey Nikolayev #:

The other way around: a list as input, a number as output. Like building a forecast from quotes)

Then ask which ML algorithms can be used to build such functions.

If you want to hash a list of varying length to an integer, you can use a hash function that maps the list to an integer. One way to do this is a modular hash function, which multiplies each element of the list by a successive power of a prime number, sums the results, and takes the sum modulo M.

from typing import List

def modular_hash(lst: List[int], p: int, M: int) -> int:
  # Weight the i-th element by p**i, sum, then reduce modulo M.
  h = 0
  for i, x in enumerate(lst):
    h += x * p**i
  return h % M

This function takes a list lst, a prime number p and a modulus M as input, and returns an integer hash value by summing the elements of the list weighted by successive powers of p and taking the result modulo M.

For example, with the modulus M = 10, the call modular_hash([1, 2, 3], 2, 10) returns 7: the weighted sum is 1*1 + 2*2 + 3*4 = 17, and 17 mod 10 = 7.

You can use a similar approach in R as well. Here is the code for a modular hash function in R that maps a list to an integer:

modular_hash <- function(lst, p, M) {
  h <- 0
  for (i in 1:length(lst)) {
    h <- h + lst[i] * p^(i-1)
  }
  h %% M
}

polynomial_hash <- function(lst, p, M) {
  h <- 0
  for (i in 1:length(lst)) {
    h <- h + lst[i] * p^(i-1)
  }
  h %% M
}

universal_hash <- function(lst, p, M) {
  a <- sample(p^length(lst), length(lst), replace=TRUE)
  b <- sample(p^length(lst), length(lst), replace=TRUE)
  h <- 0
  for (i in 1:length(lst)) {
    h <- h + a[i] * lst[i] + b[i]
  }
  h %% M
}
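
One caveat about the R universal_hash above: it draws new a and b on every call, so the same list can hash to different values from one call to the next. In universal hashing the random coefficients are normally drawn once, when the hash function is chosen, and then reused. A minimal Python sketch of that idea follows. The name `make_list_hash`, the Horner-style accumulation and the choice of the prime 2^31 - 1 are assumptions made for illustration, not a reference implementation.

```python
import random
from typing import Callable, Optional, Sequence

MERSENNE_PRIME = 2**31 - 1  # a known prime, used as the working modulus


def make_list_hash(m: int, seed: Optional[int] = None,
                   p: int = MERSENNE_PRIME) -> Callable[[Sequence[int]], int]:
    """Pick one random hash function for integer lists, with values in [0, m)."""
    rng = random.Random(seed)
    a = rng.randrange(1, p)  # multiplier, drawn once and then fixed
    b = rng.randrange(0, p)  # offset, drawn once and then fixed

    def h(xs: Sequence[int]) -> int:
        acc = b
        for x in xs:
            # Horner-style polynomial evaluation modulo p.
            acc = (acc * a + x) % p
        return acc % m

    return h


h = make_list_hash(m=100, seed=42)
print(h([1, 2, 3]), h([1, 2, 3]))  # the same list gives the same value
```

Fixing the seed makes the choice of function reproducible across runs. Leaving it as None picks a different function each time the hash is constructed.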


Hash functions are not typically generated using machine learning algorithms. Rather, they are typically designed based on the characteristics of the data being hashed and the requirements of the application.

That being said, machine learning algorithms can be used to evaluate the performance of different hash functions, or to optimize the parameters of a hash function for a specific task. For example, you might use a machine learning algorithm to optimize the prime number used in a modular hash function, or to choose the coefficients of a polynomial hash function that performs well on a particular dataset.

Overall, while machine learning algorithms are not typically used to build hash functions, they can be used to evaluate and optimize the performance of hash functions in certain applications.