  1. Pipelining of regressors in scikit-learn
  2. How to aggregate SentiWordNet sentiment for words correctly?
  3. (SVM) Difference between linear kernel and polynomial kernel of degree 1?
  4. Are GPUs or RAM more important?
  5. AdaBoost algorithm question
  6. What is minibatch discrimination?
  7. Is my TensorFlow OOP implementation of an LSTM OK?
  8. How to choose the model hyperparameter after cross-validation when the model fit indices are really similar?
  9. Approach to a multi-view object classification problem
  10. How to propagate uncertainty into the prediction of a neural network?
  11. Which type of machine learning algorithm is suited for data from multiple flights?
  12. How to interpret stable and overlapping learning curves?
  13. What is the meaning of initializing weights from a distribution function?
  14. Confused about the realizability assumption and equations of upper bound
  15. get_dummies vs. categorical data in R for machine learning
  16. How to weight KLD loss vs. reconstruction loss in a variational auto-encoder
  17. Random Forest has almost perfect training AUC compared to other models
  18. What is the correct permutation scheme for assessing the significance of the cross-validated accuracy of a two-class classifier?
  19. Designing a simple Reinforcement Learning themed game?
  20. Do regular machine learning algorithms (not deep learning) like logistic regression/GBDT need more GPU?
  21. The bandwidth (or sigma) for the RBF kernel in M-SVM for classification
  22. Is my model linear or non-linear?
  23. How to determine the optimal range for the important variables
  24. How to calculate accuracy from MAE, RMSE, RAE, RRSE values?
  25. Neural nets: one-hot variable overwhelming continuous?
  26. Intuition behind attribute learning in machine learning
  27. Row linkage for tracking process
  28. Is it true that Bayesians don't need test sets?
  29. How to verify the bias-variance decomposition using a simulated experiment
  30. Logistic regression using splines in Python
  31. Why would anyone use KNN for regression?
  32. DNN train/validate decisions - am I overfitting?
  33. Resources for learning about multiple-target techniques?
  34. Definition of activation of input units xi = si in ADALINE
  35. Definition of "hidden unit" in a ConvNet
  36. Does STL decomposition work when there is no increasing trend?
  37. Standard deviation in regression trees
  38. Should you ever standardise binary variables?
  39. How to process categorical features with many values?
  40. Piecewise constant distribution stationarity test
  41. Linear regression or polynomial regression? - Choice of features
  42. Parsing text data
  43. Feature scaling giving reduced output (linear regression using gradient descent)
  44. CNN fully connected layer confusion
  45. How can I get the optimal perturbation of a trained model?
  46. How to interpret the following GAN training losses?
  47. What is variable importance?
  48. Training with a max-margin ranking loss converges to a useless solution
  49. Logistic regression for random data
  50. Ridge regression - Increase of the error on the training set
  51. Neural network architecture for formation classification based on 2D spatial data
  52. How do I improve sklearn MLP regression output "variance"?
  53. Weak performance of caret bstTree
  54. What is the difference between MIQ and MID in mRMR for feature selection?
  55. Understanding Connectionist Temporal Classification (CTC)
  56. How to repair a neural network's time sequence prediction?
  57. What is the difference among stochastic, batch, and mini-batch learning styles?
  58. General procedures for combined feature selection, model tuning, and model selection?
  59. How to incorporate prior knowledge in GPML?
  60. Accuracy increases on decreasing the percentage of training data, with stable precision, recall, and F-score
  61. Alternative to The Elements of Statistical Learning: Data Mining, Inference, and Prediction
  62. Classification of a categorical value using time-series graphs?
  63. Is random state a parameter to tune?
  64. Does using a kernel function make the data linearly separable? If so, why use a soft-margin SVM?
  65. Conv2D in Keras
  66. Multiclass classifier with a given number of possible hyperplanes
  67. Relationship between high variance in machine learning and statistics
  68. Unscented Kalman filter: negative covariance matrix
  69. Feature engineering with non-fixed-length vectors?
  70. Why do my Feature Importance and Partial Dependence plots not agree?
  71. Observation symbols for training a set of HMMs
  72. Time series decomposition
  73. EM algorithm with constraints
  74. Different regression algorithms with many categorical variables
  75. Setting up an MLP for binary classification with TensorFlow
  76. Reward functions in reinforcement learning
  77. Help understanding how more features than word embeddings are fed into a neural network
  78. Detecting persons in video/frames
  79. If only prediction is of interest, why use lasso over ridge?
  80. Why does an SVM classifier generate bad results with LDA in classification of audio data?
  81. Difference between factorization machines and matrix factorization?
  82. Cross-validation with preprocessing (normalization, discretization, feature selection)
  83. Creating training and validation sets for a churn model
  84. Training threshold vs. validation threshold for better prediction results?
  85. Understanding the sigmoid activation function as the last layer in a network
  86. Relevant statistical features for time in consumer data
  87. More data, to counteract overfitting, results in worse validation accuracy
  88. Deep learning/machine learning to predict function values
  89. Tackling highly skewed features in predictive modelling?
  90. How to use the information in soft (class) labels?
  91. Generalized Pareto distribution (GPD)
  92. Estimating change in probabilities in 2 exponential distributions
  93. What are soft classes?
  94. Naive Bayes: mix unigrams and bigrams for text classification?
  95. Classification for high-dimensional data
  96. K-nearest-neighbours model complexity
  97. Are these methods suitable for predicting a numeric value?
  98. Does boosting cause problems with pink noise? When is normalization not good?
  99. Can better features help with generalization performance (variance)?
  100. Why is the validation accuracy fluctuating?