Language Models
- GPT-2
- Llama 2
- Llama 3
- Llama 3.1
- LSTM

Machine Learning Models
- Linear Regression
- Logistic Regression
- Multinomial Logistic Regression
- K-Nearest Neighbors (KNN)
- K-Means Clustering
- MLP (Neural Network from Scratch)

Attention Mechanisms
- Self Attention
- Naive Multi-Head Attention
- Multi-Head Attention
- Multi-Query Attention
- Grouped-Query Attention

More implementations coming soon!