data-mining

  1. How to get predictions using the Boruta package?

  2. How to find accuracy in Random Forest using Python

  3. The problem of K-Means with non-convex function

  4. Amount of data needed and hypothesis for SVD
  5. KNIME: Scatter Plot

  6. What is the appropriate learning algorithm?
  7. What is the difference between data-driven methods and machine learning?

  8. How to deal with time series which change in seasonality or other patterns?

  9. Keyword extraction using deep learning
  10. Text annotating process, quality vs quantity?
  11. What is LSTM, BiLSTM and when to use it?

  12. A tool like Matlab for NLP?

  13. Conceptual Question about finding relation between one categorical variable and one numeric variable
  14. Reduce data length to train effectively

  15. Multi-touch Attribution Model

  16. Understanding what is going on

  17. Calculate Average Error Rate in KNN

  18. K-means question?
  19. How can I reduce data size using KPCA for larger features than observations?
  20. Meaning of latent features?
  21. What is the definition of Data Scout?

  22. Find a matching pattern in Sequence Pattern Mining Results

  23. Decision tree vs. KNN

  24. Rank terms in a bag-of-words model

  25. Which algorithm to use for predicting late deliveries at a warehouse?
  26. Keyword extraction algorithms

  27. Sample Size in Data Mining Research

  28. How to handle non-stationary data in online neural network based one-class classifier for anomaly detection?
  29. How can I promote my work?
  30. Modeling the influence of events order on probability

  31. Classifier runtime evaluation

  32. What kind of research can be done with an email data set?

  33. Sensitivity analysis in outlier explanation
  34. How to pre-process data for document classification in R when the words are shortened?
  35. Text post-processing
  36. Why didn't I get any significant variables in my logistic model?
  37. Looking for a Good PhD Topic in Predictive Analytics in the context of Data Search and Ranking (Discover/Objectrank)

  38. Identify important less frequent words

  39. How to find datasets for skill tests (like Java, Python, C, C++, etc.)

  40. How can I calculate Kernel matrix K for clustering based Kernel Principal Component Analysis?

  41. What is Hellinger Distance and when to use it?

  42. Feature engineering on distributions
  43. Why do we need XGBoost and Random Forest?
  44. Feedback Analysis
  45. Error in FUN(newX[, i], ...) : argument "Iclo" is missing, with no default
  46. Any case studies using Bayesian Networks for system design trades?
  47. Error in y.predict.trend + y.predict.complement : non-conformable arrays

  48. Implementation of reliable rule learning

  49. What is the standard procedure for Data Analysis?
  50. Different usage of document-level, sentence-level and aspect-level sentiment analysis

  51. How can I weigh observations differently that were provided for a time horizon?
  52. What is the difference between observation and variable?

  53. Error in f1(x) : argument "b" is missing, with no default

  54. Which process step in KDD or CRISP-DM includes labeling of the data?

  55. Python: Handling imbalanced classes in Python machine learning

  56. How Can I Compute Information Gain for Continuous-Valued Attributes

  57. Modeling Grocery Store Transactions

  58. Handling data imbalance and class number for classification

  59. Fitting data with Fourier Series coefficients

  60. New to data science. Which techniques are best to use on a large data set at an insurance company?

  61. Question about calculation of distortion function in LBG (and ELBG) algorithm?
  62. Classification methods using one overlapping feature
  63. Storing and mining medical images and related data

  64. How can you have a directed edit distance less than 1?

  65. Identifying which known groups are the most similar or most dissimilar
  66. Clustering with cosine similarity

  67. Clustering multivariate time-series datasets

  68. Find type of relation between variables

  69. How to evaluate data capability to train a model?
  70. How to combine two CART decision trees learned on the same type of data?
  71. Text annotating process, quality vs quantity?
  72. Orange Changes the data type

  73. How to preprocess data?

  74. K-Means clustering for mixed numeric and categorical data

  75. Construct direct citation networks

  76. Compare two topic modelling sets

  77. Orange3 summarizing data, grouping data values

  78. Non-binary nominal variable in linear regression

  79. Can we apply community detection algorithms for word vector space?

  80. How does the naive Bayes classifier handle missing data in training?

  81. How to control false positives in sequential A/B testing while keeping a low sample size?

  82. What does "likelihood" mean in the image?

  83. Method for finding top-k cosine similarity based closest item on large dataset

  84. What are real world applications of Doc2Vec?
  85. What machine learning technique should I use for a medical problem

  86. How can I collect a data set from social networks like Instagram?
  87. Beginning technology stack for predictive analytics?

  88. Using dates for predicting loans

  89. What is the difference between statistical learning and predictive analytics?

  90. Public dataset for news articles with their associated categories
  91. Preprocessing in data mining?
  92. Clustering for high dimensional data

  93. Questionnaire data analysis

  94. Terms Extraction from distributed NoSQL databases - Hidden Markov Model

  95. What algorithm can help me discover synonyms?
  96. Discovering cross category sales using transactions history (Clustering?)

  97. How to create a good list of stopwords

  98. Ad Hoc Information Retrieval

  99. Detect related sentences

  100. What is required in Affinity Propagation