1. What algorithm can help me discover synonyms?

  2. How to calculate lexical cohesion and semantic informativeness for a given dataset?
  3. Extracting key value pairs from OCR'd text

  4. How can I weigh observations differently that were provided for a time horizon?

  5. Looking for Databases with gender of names and Ethnicity information

  6. Performing Binary classification using binary dataset

  7. Kolmogorov–Smirnov test after splitting a node in a decision tree

  8. SQL database to CSV file
  9. Data transformation or normalization for pre-processing input data vectors including very small values

  10. What are temporal and local data?

  11. KNIME: Scatter Plot
  12. What is the difference between data mining and unsupervised learning?

  13. Find intersection of data between rows and columns
  14. Machine Learning topics and books

  15. Predict class having only class proportions for every attribute (non labeled data)

  16. Why is the classification accuracy of kNN for the Iris dataset not coming out correct?

  17. How to get data fragments of Blockchain

  18. Looking for a strong PhD Topic in Predictive Analytics in the context of Big Data

  19. Public dataset for news articles with their associated categories
  20. Which algorithm to use for predicting late deliveries at warehouse?

  21. Variable importance in Random Forest Classifier returns only 1 value

  22. What should be the main concerns when generating extra data?

  23. Reduce data length to train effectively
  24. Statistically Robust Distance Measure/Metric for comparing more than two network data series
  25. Multi-touch Attribution Model
  26. An outlier detection for this data
  27. What initial steps should I use to make sense of large data sets, and what tools should I use?

  28. Is there any existing algorithm or formula that can calculate the complexity of a dataset?

  29. How does a Recommender System recommend movies to a New User?
  30. Data Mining - Intent matching and classification of text
  31. Tokyo Stock Exchange Historical Data

  32. Predicting probability of order arriving on or before customer Requested Delivery Date(RDD)
  33. Null-invariant measures of association in R

  34. Python: How to use Multinomial Logistic Regression using sklearn
  35. Modeling the influence of events order on probability
  36. How can I create a "trained" dataset for categorizing news articles?

  37. How would one impute missing values for a Discrete variable?

  38. RMSE in Weka Time Series Forecasting

  39. How to apply ensemble clustering method?

  40. Using Machine Learning techniques for text-analysis

  41. Orange Changes the data type
  42. XGBoost outputs tend towards the extremes

  43. How is the squared Euclidean distance an example of a non-metric function?
  44. How to collect data that lies at the centre or border of a cluster?

  45. How can I calculate Kernel matrix K for clustering based Kernel Principal Component Analysis?

  46. Feature engineering on distributions

  47. Shape of a distribution as a feature
  48. Rank terms in a bag-of-words model
  49. Preparing data for a live prediction engine
  50. Boxplots or violinplots?
  51. How to work with a big matrix on MDS using cmdscale
  52. Getting count of frequent itemsets in Python mlxtend

  53. Including the dependent variable in your data to perform principal component analysis?

  54. Text Mining with Naive Bayes

  55. Which classifier should I use for sparse boolean features?

  56. How can I reduce dimensionality using a Deep Belief Network?
  57. Continuous MDP (reinforcement learning)

  58. Need suggestions to acquire a job in data mining

  59. Compare two lists

  60. Is it possible to train neural networks for programming?
  61. Re: Missing Value

  62. Are we allowed to assign weight values to an SVM's predictors?

  63. Decision Tree used for Calculating Precision, Accuracy, and Recall, class breakdown question

  64. Error in FUN(newX[, i], ...) : argument "Iclo" is missing, with no default
  65. FP-Growth - find ALL patterns containing a specific item(s) only

  66. I have 50 videos. I ask a customer 10 questions. Based on their answers, I send them a set of videos. How do I do it?

  67. KMeans for crime patterns

  68. Find similarities or patterns between parameters before an event happens

  69. Hadoop - Capacity Scheduler and Fifo scheduler

  70. Faced problem while applying OneHotEncoder

  71. How to scrape an IMDb webpage?

  72. Text mining for text matching
  73. Filter row depending on specific object value and delete those instances

  74. Which algorithm to use to match two categories with n dimensions

  75. Where to start with Data Science

  76. Find similar observations in two datasets

  77. Google trends: Indexing, normalizing and rescaling

  78. Datasets in NLP research papers

  79. How can I fit categorical data types for random forest classification?
  80. Is there a standard data science epistemology?
  81. Why are accuracy and its median very different?

  82. TensorFlow in production

  83. How to get an optimal Neural Network algorithm using the neuralnet package in R?

  84. Is it legal to scrape YouTube videos for training data?

  85. How to prepare the data for a Link Analysis project?
  86. What is the best Sequence Mining Algorithm to use for a Handwritten Digits Recognition system?

  87. Can we apply community detection algorithms for word vector space?

  88. Amount of data needed and hypothesis for SVD

  89. Warning message in randomForest

  90. Map similar clusters in two different cluster sets

  91. Couldn't get significant variables in my model even using Boruta package
  92. How to get predictions using the Boruta package?
  93. How to find accuracy in Random Forest using Python
  94. The problem of K-Means with non-convex function

  95. What is the difference between data-driven methods and machine learning?

  96. How to deal with time series which change in seasonality or other patterns?

  97. Keyword extraction using deep learning

  98. Text annotating process, quality vs quantity?

  99. A tool like Matlab for NLP?
  100. Conceptual Question about finding relation between one categorical variable and one numeric variable