Machine Learning
Full Title or Meme
What you need to get a deep intuition about current ML architectures and their underlying mathematics.
This page is about the learning process. For a discussion on the use of the resulting model see the page on Artificial Intelligence.
Problems
- The big problem is keeping people's private data and protected content from leaking through to the results.
- The NIST post Protecting Model Updates in Privacy-Preserving Federated Learning[1] describes attacks on models, the concepts of input privacy and output privacy, and the problems in providing input privacy.
- There is some question, starting in 2023, about whether machines will be programmed or educated. Should the model be treated as a general library or as an instance of a particular session? Should the model remember past sessions, or immediately wipe them out?
- Human bias and other weaknesses are embedded in the training data for AI like GPT. It is unclear how to correct for that without just introducing other biases.
- Johns Hopkins researchers found that algorithms trained on manufactured (synthetic) data can perform even better than those trained on real data for important surgical tasks like X-ray image analysis or giving a robot the ability to detect medical instruments during procedures. [2]
Probabilistic Responses
LLMs are not even trying to find a “best” answer – they are trying to guess at what a human would most likely say, which is inherently probabilistic. LLMs estimate a probability distribution, and then sample from this distribution when asked to provide a single response.
In this regard Machine Learning is like Quantum Mechanics: you can create a probability cloud, but the actual response (or measurement) is not knowable in advance.
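The estimate-then-sample idea can be sketched in a few lines of Python. This is only an illustration, not how any particular LLM is implemented: the candidate words and their scores are made up, and the softmax function and temperature parameter are assumptions introduced here to turn scores into a probability distribution.

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution (assumed scoring scheme)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, scores, rng, temperature=1.0):
    """Draw one token at random, weighted by its softmax probability."""
    probs = softmax(scores, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical next-word candidates and scores, made up for illustration.
tokens = ["blue", "cloudy", "falling"]
scores = [2.0, 1.0, 0.1]

probs = softmax(scores)
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_token(tokens, scores, rng) for _ in range(1000)]
counts = {t: samples.count(t) for t in tokens}

print("probabilities:", probs)
print("counts over 1000 samples:", counts)
```

The distribution is fixed once the scores are known, but each individual draw is not: the most likely word wins most often, yet any single response may be one of the less likely ones.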
Creating an ML
For the math portions, you can check out Harsh's other article or this excellent post by YC on the topic. His advice: learn enough Linear Algebra, Stats, Probability, and Multivariate Calculus to feel good about yourself, then dive into other topics as the need arises.
Elements of Statistical Learning
Prioritize Chapters 1–4 and Chapters 7–8.[3] This covers supervised learning, linear regression, classification, Model Assessment and Inference. It’s okay if you don’t understand it at first; absolutely nobody does. Keep reading it and learning whatever math you need to until you get it. If you want, knock the whole book out; you won’t regret it.
If Elements is really just too hard, you can start with Introduction to Statistical Learning, which shares several authors with Elements. The book sacrifices some mathematical explanation and focuses on a subset of the problems in Elements, but is a good on-ramp to the material. There is an excellent accompanying course provided by Stanford for free.
Both books focus on R, which is worth learning.
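To give a taste of the linear regression and model assessment material, here is a minimal sketch of ordinary least squares for a single predictor, written in plain Python rather than R so it runs without extra packages. The data points are made up, and the function names fit_line and mean_squared_error are hypothetical helpers, not taken from either book.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def mean_squared_error(xs, ys, intercept, slope):
    """Average squared gap between predictions and observed values."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

# Made-up training data lying exactly on y = 1 + 2x.
train_xs = [0.0, 1.0, 2.0, 3.0, 4.0]
train_ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# Made-up held-out data, used only to assess the fitted model.
test_xs = [5.0, 6.0]
test_ys = [11.0, 13.0]

intercept, slope = fit_line(train_xs, train_ys)
test_mse = mean_squared_error(test_xs, test_ys, intercept, slope)

print("intercept:", intercept, "slope:", slope)
print("held-out mean squared error:", test_mse)
```

Fitting on one set of points and measuring error on another is the simplest form of model assessment; real data will not sit exactly on a line, so the held-out error will be larger than zero.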
Stanford CS 229
Once you’ve finished Elements, you’re in a great position to take Stanford’s ML course,[4] taught by Andrew Ng. You can think about this like the mathematically rigorous version of his popular Coursera course. Going into this course, make sure to refresh your Multivariate Calculus and Linear Algebra skills, as well as some probability. They provide some handy refresher guides on the course page.
Do all the exercises and problem sets, and try doing the programming assignments in both R and Python. You’ll thank me later.
You can again opt to go for a slightly easier route in Andrew Ng’s Coursera course, which is focused more on implementation and less on underlying theory and the math. I would really just do all the programming assignments from there as well. You don’t have to do them in Octave/Matlab; you can do R and Python versions. There are plenty of repos to compare to on GitHub.
Deep Learning Book
At this point, you’re starting to get formidable. You have a fundamental mathematical understanding of many popular, historic techniques in Machine Learning, and can choose to dive into any vertical you want. Of course, most people want to go into Deep Learning because of its significance in industry.
Go through the DL book. It will refresh you on a lot of math and also fundamentally explain much of modern Deep Learning well. You can start messing around with implementations by spinning up a Linux box and doing cool shit with CNNs, RNNs and regular old feed forward neural networks. Use TensorFlow and PyTorch, and start to get a sense of how awesome some of these libraries are for abstracting a lot of the complexity you learned.
I’ve also heard the DeepLearning.ai courses by Andrew Ng and co are worth it. They are not nearly as comprehensive as the textbook by Goodfellow et al., but seem to be a useful companion.
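Before reaching for the libraries, it can help to see how little a regular old feed forward network amounts to. The sketch below is a forward pass only (no training), in plain Python, with made-up layer sizes and random weights. The helper names dense, relu, softmax, init_params and forward are hypothetical and are not the TensorFlow or PyTorch API.

```python
import math
import random

def relu(values):
    """Elementwise max(0, v)."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully connected layer: weights[j] holds the input weights of output unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Normalize scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_params(n_in, n_hidden, n_out, rng):
    """Small random weights and zero biases (an arbitrary choice for illustration)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {
        "w1": matrix(n_hidden, n_in), "b1": [0.0] * n_hidden,
        "w2": matrix(n_out, n_hidden), "b2": [0.0] * n_out,
    }

def forward(x, params):
    """Input -> hidden layer with ReLU -> output layer with softmax."""
    hidden = relu(dense(x, params["w1"], params["b1"]))
    return softmax(dense(hidden, params["w2"], params["b2"]))

rng = random.Random(0)  # fixed seed so the run is repeatable
params = init_params(n_in=4, n_hidden=8, n_out=3, rng=rng)
output = forward([0.5, -1.2, 3.0, 0.0], params)
print("class probabilities:", output)
```

Libraries like TensorFlow and PyTorch wrap these same steps in tensor operations and add automatic differentiation, which is the part that makes training practical.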
arXiv and Google Scholar
If you’ve made it this far, congratulations, you’re probably in an excellent place to make sense of the latest papers in the field. Just go onto arXiv and Google Scholar and look at both seminal papers and recent papers that are popular. Remember that ML is a fast moving field and the literature changes, so keep checking back in every few months.
If you’re feeling particularly bold or find something cool, try implementing it yourself. The learning process will be invaluable.
Padding your resume and getting hired
You’ve probably reached the point by now where you can get hired at most places and/or get into grad school. If you want to fill out your resume, you can continue to implement new architectures, or even do Kaggle Competitions.
If you want to do the latter, but feel that your actual implementation skills aren’t totally up to par, take Fast.ai courses 1 and 2. They focus on cohesively applying all the shit you’ve learned over the past few months using popular libraries and tooling.
There are a lot of AI residency programs popping up at OpenAI, Microsoft, Google, Facebook, Uber, and a few other places. At this point you are probably a pretty good candidate.
If you get this far, well done. The journey is never over, but you’re in an excellent place and you understand ML as well as many experts.
Educating an ML
- Prior to 2023, training data was mostly taken from huge, uncurated data sets on the web.
- Experiments like that at Johns Hopkins[2] tell us that curated data has a better chance of success. This is the equivalent of saying that some books are better for learning than others. Not surprising.
- An article https://arxiv.org/abs/2303.12712 claims that what GPT-4 does is not very human-like. Calling it intelligence is probably messing with our minds in strange ways.
- I claim that our interactions with any Artificial Intelligence are more like those of a parent than like those of a programmer.
References
- [1] Joseph Near and David Darais, Protecting Model Updates in Privacy-Preserving Federated Learning, NIST (2024-03-21) https://www.nist.gov/blogs/cybersecurity-insights/protecting-model-updates-privacy-preserving-federated-learning
- [2] Catherine Graham, Synthetic Data for AI Outperform Real Data in Robot-Assisted Surgery (2023-03-20) https://hub.jhu.edu/2023/03/20/synthetic-data-outperform-real-data-robot-assisted-surgery/
- [3] Elements of Statistical Learning https://web.stanford.edu/~hastie/ElemStatLearn/
- [4] Stanford CS 229 https://see.stanford.edu/Course/CS229
Other Material
- See wiki page on Artificial Intelligence
- Harsh Sikka, Harvard. https://medium.com/technomancy/the-blunt-guide-to-mathematically-rigorous-machine-learning-c53263d45c7b