
// the find

fengdu78/Data-Science-Notes

★ 8,566 · Jupyter Notebook · updated Aug 2021

Data science notes and collected resources

A Chinese-language collection of Jupyter notebooks covering the standard data science stack: math foundations, Python, NumPy, pandas, scikit-learn, and basic deep learning. It's aimed at Chinese-speaking beginners who want worked examples alongside the theory from books like Li Hang's Statistical Learning Methods.

The coverage is logically sequenced (math, then Python, then ML), so a beginner can follow it top to bottom without jumping around. The feature engineering section is notably practical, translating a full book into runnable notebooks with real datasets. Including the CS229 linear algebra and probability notes as starting material is a good call; most similar repos skip the math entirely. The numpy-100 exercises, with hints and solutions in the same directory, make a useful self-contained drill set.

The repo hasn't been touched since August 2021, so the PyTorch and scikit-learn notebooks target versions that have since seen deprecations and API removals; expect some cells to fail on a current install unless you pin older dependencies. The only environment spec is a single environment.yml buried in the scikit-learn folder, with no requirements files anywhere else, so reproducing the notebooks is a manual dependency hunt. The deep learning section is thin: four PyTorch intro notebooks and a word2vec visualization don't get you far. The content is almost entirely in Chinese with no English translations, which limits its reach to a fraction of its apparent audience on GitHub.
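One way to shorten that dependency hunt is to scan the notebooks for their imports before installing anything. The following is a rough sketch, not something the repo ships: the function names are made up for illustration, it assumes the notebooks are nbformat v4 JSON with a top-level "cells" list, and it only reports import names (such as sklearn), which you would still need to map to package names and versions yourself.

```python
from __future__ import annotations

import ast
import json
from pathlib import Path


def imports_in_source(source: str) -> set[str]:
    """Return the top-level module names imported by one code cell."""
    # Drop IPython magics (%...) and shell escapes (!...), which are not valid Python.
    lines = [ln for ln in source.splitlines()
             if not ln.lstrip().startswith(("%", "!"))]
    try:
        tree = ast.parse("\n".join(lines))
    except SyntaxError:
        return set()  # skip cells that still do not parse
    found: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return found


def imports_in_notebook(nb: dict) -> set[str]:
    """Collect imports from every code cell of a parsed notebook (assumed nbformat v4)."""
    found: set[str] = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        if isinstance(src, list):  # the source may be stored as a list of lines
            src = "".join(src)
        found |= imports_in_source(src)
    return found


def imports_in_repo(root: str) -> set[str]:
    """Walk a local checkout and merge the imports of all notebooks under it."""
    found: set[str] = set()
    for path in Path(root).rglob("*.ipynb"):
        if ".ipynb_checkpoints" in path.parts:
            continue
        try:
            nb = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # unreadable notebook; ignore rather than abort the scan
        found |= imports_in_notebook(nb)
    return found
```

Calling `sorted(imports_in_repo("Data-Science-Notes"))` on a local clone (the path is an assumption) would give a starting list for a requirements file. The list includes standard-library modules alongside third-party ones, and choosing versions close to the August 2021 releases is still a manual step.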

