ANDA'S IT LIBRARY
DATA EXAMPLES (chronological)
Reference Document by Anda Vitols
Here I explore tidyverse, piping, and other tools that have made wrangling tasks more intuitive and efficient. I also delve deeper into modeling and mining. R is now fully integrated with non-proprietary apps and has a whole set of publishing apps for document and application creation.
ON THIS PAGE
section 4: DATA MINING (advanced data exploration using ML)
  • About Variables
  • About Cases
  • About other types of datasets

section 4: DATA MINING (2023-2024)
ABOUT VARIABLES - exploring dimension data
  • conducting - a principal component analysis (PCA)
  • conducting - an LDA
  • conducting - a t-SNE
ABOUT CASES - clustering & classifying cases
clustering
  • grouping - cases - with hierarchical clustering
  • grouping - cases - with k-means clustering
  • grouping - cases - with DBSCAN
classifying
  • classifying - cases - with k-nearest-neighbors (k-nn)
  • classifying - cases - with naive bayes
  • classifying - cases - with decision-trees
ABOUT OTHER TYPES OF DATA STRUCTURES
predicting - classification behaviour - with association analysis
  • Apriori
  • Eclat
  • CBA
decomposing - dimension behaviour over time - with time-series analysis
  • decomposition
  • ARIMA
  • MLP
grouping word patterns - with text mining
  • sentiment analysis
    • binary classification
    • sentiment scoring
  • visual word pairs
DATA MINING - finding patterns in the noise
Database datasets
List dataset
  • looking for behavioural associations: IF someone buys this THEN they likely will buy that
  • ASSOCIATION ANALYSIS
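As a rough illustration of the IF/THEN idea above, the sketch below scores one rule (IF bread THEN butter) by hand in base R. The baskets are made-up toy data and the helper `contains()` is a hypothetical name introduced only for this example; in practice, a package such as arules (its `apriori()` function) would search for rules like this automatically.

```r
# Toy market baskets (made-up data, for illustration only)
baskets <- list(
  c("bread", "butter", "milk"),
  c("bread", "butter"),
  c("bread", "jam"),
  c("milk", "jam"),
  c("bread", "butter", "jam")
)

# Hypothetical helper: TRUE for each basket that contains every item in `items`
contains <- function(items) {
  vapply(baskets, function(b) all(items %in% b), logical(1))
}

lhs <- "bread"   # the IF part of the rule
rhs <- "butter"  # the THEN part of the rule

# Support: share of all baskets that contain both sides of the rule
support <- mean(contains(c(lhs, rhs)))

# Confidence: of the baskets with the IF item, the share that also have the THEN item
confidence <- support / mean(contains(lhs))

# Lift: confidence relative to how common the THEN item is overall;
# values above 1 suggest the items occur together more often than chance
lift <- confidence / mean(contains(rhs))

print(c(support = support, confidence = confidence, lift = lift))
```

Support shows how common the pairing is, confidence how reliable the THEN part is once the IF part holds, and lift whether the pairing beats what the item frequencies alone would predict.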
Time series datasets
Text datasets
