Developer Archetype
The Data Scientist Wannabe
"Jupyter notebooks as far as the eye can see."
Vibe
Talks about "the model" like it's sentient. Has never deployed one.
Python, pandas, and a folder full of .ipynb files that tell the story of three abandoned Kaggle competitions and a correlation you swear is significant. The model is never quite prod-ready.
Typical stack
Python
Jupyter
pandas
scikit-learn
matplotlib
Kaggle
Known examples
Early fast.ai students
Jeremy Howard made this archetype legitimate; the students who took their notebooks further became real ML engineers
The Kaggle leaderboard grinder
Top 1% on toy datasets, 0 production deployments
Signature traits
- GitHub littered with .ipynb files that only run locally
- Has trained at least one model that "achieves 94% accuracy" on the training set
- README includes a confusion matrix screenshot
- Knows scikit-learn better than the Python standard library
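Why a training-set accuracy figure says so little can be shown with a small standard-library sketch. Everything here is hypothetical: the toy data uses coin-flip labels (so there is nothing real to learn), and `train_memorizer` and `accuracy` are illustrative helpers, not part of any library.

```python
import random


def train_memorizer(rows):
    """'Train' by memorizing every (features -> label) pair seen."""
    return {features: label for features, label in rows}


def accuracy(model, rows, default=0):
    """Fraction of rows whose label the model predicts correctly.

    Unseen feature tuples fall back to the `default` label.
    """
    hits = sum(model.get(features, default) == label for features, label in rows)
    return hits / len(rows)


rng = random.Random(0)

# Hypothetical toy data: unique feature tuples with coin-flip labels,
# so no genuine pattern links features to labels.
data = [((i, rng.random()), rng.randint(0, 1)) for i in range(1000)]
train, held_out = data[:800], data[800:]

model = train_memorizer(train)
train_acc = accuracy(model, train)        # perfect: every row was memorized
held_out_acc = accuracy(model, held_out)  # roughly a coin flip

print(f"train accuracy:    {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

The memorizer scores perfectly on the rows it has seen and around chance on the rows it has not, which is why the number worth quoting is the held-out one.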
Strengths
- Comfortable with data manipulation and statistical thinking
- Can explore and visualize datasets quickly
- Understands the ML pipeline end-to-end in theory
Watch out for
- Notebooks ≠ software: nothing is production-deployable
- Data leakage and overfitting are politely ignored
- Software engineering fundamentals often lacking
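One common form of data leakage is computing preprocessing statistics over the full dataset before splitting it. The following is a minimal sketch of the difference, assuming a made-up one-column dataset; `fit_scaler` and `transform` are hypothetical helpers written for this illustration, not a library API.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn centering and scaling statistics from the given values only."""
    return mean(values), pstdev(values)


def transform(values, scaler):
    """Standardize values using previously learned statistics."""
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]


train = [1.0, 2.0, 3.0, 4.0]
test = [100.0]  # hypothetical outlier that only shows up at evaluation time

# Leaky: the test row influences the statistics used to preprocess training data.
leaky_scaler = fit_scaler(train + test)

# Clean: statistics come from training rows only, then get reused on the test row.
clean_scaler = fit_scaler(train)
train_scaled = transform(train, clean_scaler)
test_scaled = transform(test, clean_scaler)

print("leaky mean:", leaky_scaler[0])
print("clean mean:", clean_scaler[0])
```

In scikit-learn, the usual way to get the clean behavior is to put the preprocessing step and the model in one `Pipeline` and fit it on the training split only.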