R for Data Science (R4DS)
A widely used book that teaches data science with R, built around the `tidyverse` collection of packages. It covers data import, tidying, transformation, visualization, and modeling. Written by Hadley Wickham and Garrett Grolemund, it is considered a foundational text for modern R programming in data science. The entire book is available for free online.
More resources on Statistical Software (R, Python)
Scikit-learn Official Documentation
Scikit-learn is a widely used open-source machine learning library for Python. Its documentation provides extensive user guides, tutorials, and examples for various classification, regression, clustering, and dimensionality reduction algorithms.
NumPy Official Documentation
NumPy is the fundamental package for scientific computing with Python, providing support for large, multi-dimensional arrays and matrices, along with a collection of high-level mathematical functions. Its official documentation is crucial for understanding array operations and numerical computing in Python.
Pandas Official Documentation
The official documentation for pandas, a widely used Python library for data manipulation and analysis. It includes user guides, an API reference, and tutorials for working with DataFrames and Series.
RStudio Cheat Sheets
RStudio (now Posit) publishes a collection of cheat sheets for R packages and common tasks, including `ggplot2`, `dplyr`, data import, and R Markdown. They work well as quick-reference guides for R users.
The Comprehensive R Archive Network (CRAN)
The official repository for R, offering not only the R software itself but also extensive documentation, manuals, FAQs, and a vast collection of packages contributed by the R community. It's the authoritative source for R-related information.
MarinStatsLectures
A YouTube channel offering a vast library of video lectures on statistics and R programming. It covers a wide range of statistical topics from basic descriptive statistics to advanced inferential methods, often demonstrating their application in R.
