Advice for AI aspirations?
8 points by matmann2001 on Feb 4, 2011 | 6 comments
I'm an undergrad in Computer Engineering, and I find the fields of AI, Machine Learning, and HCI EXTREMELY interesting.

Do you have any advice for important skills or things to learn, people to talk to, or things to read that would better prepare me for this field? Any advice is welcome.



You can't take enough maths and statistics classes. Machine Learning - these days at least - is very maths and statistics oriented. Linear Algebra is big, so make sure you have that covered.

If you want to get your toes in the water a bit with ML, there are some great ML libraries that encapsulate some of the popular algorithms. Mahout[1], Weka[2], and Mallet[3] are popular in the Java world.

A lot of folks use Python for ML as well, and there are some good libraries there.
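As a small illustration of what that looks like in practice, here is a minimal sketch using scikit-learn (my example of one such Python library, not one the parent named; assumed to be installed):

    # Minimal classification sketch with scikit-learn (assumed installed).
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    # Built-in toy dataset, split into train and test portions.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # Fit a k-nearest-neighbours classifier and check held-out accuracy.
    clf = KNeighborsClassifier(n_neighbors=3)
    clf.fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))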

The R language is also popular in ML circles, as is C++. If you learn some combination of Java, Python, C++, and/or R, you'll be in good shape from a programming language standpoint.

Check out http://mloss.org/software/ also.

Some good books to get started with include:

Algorithms of the Intelligent Web[4]

Programming Collective Intelligence[5]

Collective Intelligence In Action[6]

Stanford makes a great series of lectures[7] available online that you might find useful.

[1]: http://mahout.apache.org/

[2]: http://www.cs.waikato.ac.nz/ml/weka/

[3]: http://mallet.cs.umass.edu/

[4]: http://www.manning.com/marmanis/

[5]: http://www.amazon.com/gp/product/0596529325

[6]: http://www.amazon.com/Collective-Intelligence-Action-Satnam-...

[7]: http://see.stanford.edu/see/lecturelist.aspx?coll=348ca38a-3...


If you really want to learn the fundamental underpinnings of machine learning, you will need a strong background in probability and stochastic processes. I would suggest Python (or MATLAB if you can get access to it) for learning how different methods work. That way you can separate mathematical issues from programming issues. As far as courses go, you should be looking for courses in Linear Algebra, Numerical Computation/Optimization (Convex, Nonlinear), Statistical Inference, and Stochastic Processes.
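To illustrate the "learn how the methods work" point, here is a rough sketch (using numpy, my choice of library, assumed installed) of ordinary least squares solved directly from the normal equations, so the linear algebra stays visible:

    # Sketch: ordinary least squares via the normal equations, using numpy.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))              # design matrix
    true_w = np.array([2.0, -1.0, 0.5])
    y = X @ true_w + 0.1 * rng.normal(size=100)

    # Solve (X^T X) w = X^T y rather than forming an explicit inverse.
    w_hat = np.linalg.solve(X.T @ X, X.T @ y)
    print(w_hat)                               # should be close to true_w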

Good references:

1) Elements of Statistical Learning - Hastie, Tibshirani and Friedman

2) Pattern Classification - Duda, Hart and Stork

3) Pattern Recognition - Theodoridis, Koutroumbas

4) Machine Learning - Tom Mitchell

5) http://videolectures.net/Top/Computer_Science/Machine_Learni...


Beyond just learning theory (although this is crucial), make sure to get your hands dirty implementing something, talking to knowledgeable people (professors, researchers in industry, classmates with common interests), and finding out what actually gets used in practice and why. Sometimes they don't tell you these things in classes.
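As one concrete "get your hands dirty" exercise, you could code a classic algorithm from scratch, for example the perceptron. A rough sketch in plain numpy (illustrative names, nothing beyond numpy assumed):

    # From-scratch perceptron sketch; labels y are in {-1, +1}.
    import numpy as np

    def perceptron(X, y, epochs=10, lr=1.0):
        w = np.zeros(X.shape[1])
        b = 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                if yi * (xi @ w + b) <= 0:     # misclassified point, so update
                    w += lr * yi * xi
                    b += lr * yi
        return w, b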

When you are somewhat comfortable with some basic concepts, it might be interesting to form a reading group with some like-minded classmates and scrutinize some papers, from classics, to the more recent research.


I've found this set of tutorials to be a great resource for picking up many machine learning methods

http://www.autonlab.org/tutorials/

Somewhat complementary is this set of implementations (in Python) of many common machine learning techniques

http://www-ist.massey.ac.nz/smarsland/MLbook.html


Take classes in or read about biology/neurobiology, human development (how the brain develops and learns), and psychology (how the brain reacts and responds). I've done AI work and come at it from a unique perspective because of my biology/engineering education. As a result, I often come up with unique solutions and products for problems that can't be solved by conventional means.


Let's roll back to some simple things that actually make sense. Suppose we are given a triangle ABC with sides a, b, c, that is, side a is opposite angle A, etc. Then from some simple trigonometry, we know the law of cosines:

c^2 = a^2 + b^2 - 2ab cos(C)

Then if C is a right angle, cos(C) = 0 and we have the Pythagorean theorem

c^2 = a^2 + b^2

So, given the triangle we have the law of cosines. If in addition we know that angle C is a right angle, then we have the Pythagorean theorem.
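As a quick numeric check of the right-angle case (a hypothetical 3-4-5 triangle, standard library only):

    # Law of cosines with a right angle at C reduces to the Pythagorean theorem.
    import math

    a, b, C = 3.0, 4.0, math.pi / 2
    c = math.sqrt(a**2 + b**2 - 2*a*b*math.cos(C))
    print(c)   # 5.0 (up to floating point), the 3-4-5 right triangle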

Well, to conclude the Pythagorean theorem, it is crucial to have a suitable assumption, namely that angle C is a right angle. Otherwise the Pythagorean theorem does not hold.

We know these things.

'Machine learning' is a four letter word -- junk. Here is why: they have a 'paradigm': get an 'algorithm' from somewhere, put it in a black box, and say "Try this!". By analogy, this is mixing up snake oil instead of doing good chemistry.

The big, huge problem is that, with their nonsense paradigm, there is no good reason to believe that the black box has any value. By analogy, they take the Pythagorean theorem and apply it to all triangles, ignoring whether or not there is a right angle.

There is a rational sickness ubiquitous in 'computer science': Take results from mathematics and with great determination ignore the assumptions. Machine learning is one of the sickest.

Bluntly put, done with any rational care, machine learning is a topic in applied math, but the workers in the field know far too little math, make gross mistakes at nearly every chance, and, in particular, ignore the mathematical assumptions and, thus, proceed with no rational support.

Computing was not always this way: The origins of electronic digital computing were in scientific calculations in WWII. D. Knuth's books on 'The Art of Computer Programming' are fairly careful mathematically.

At the time of the first editions of Knuth's books, hot topics were 'algorithms', especially for sorting and searching. So, there, it was fairly clear what such an algorithm was supposed to do.

Then the definition of an 'algorithm' grew to include just about any piece of code that ran and, hopefully, usually stopped, and the properties of the results were not emphasized. So, we have gone some years with a programmer announcing that they have an algorithm to tell people, say, what they should eat for breakfast. They call a venture partner who says, "So, you have an algorithm." Gee, anyone can have an algorithm for anything; the question is, what properties does the algorithm have? As long as computing wants to ignore the properties and the assumptions, it will have a tough time doing better than snake oil.

Part of this very relaxed attitude toward algorithms came from early work in 'artificial intelligence' (AI) where the criteria were:

"Do you have running code? Does the code produce something that at first glance looks like something a human might do?".

Then these criteria opened the doors wide to just about any intuitive, heuristic, or black box technique. In particular, nearly anything can now be called 'machine learning'.

Actually just guessing, as from intuitive and heuristic efforts, is a tough way to find something powerful for a challenging problem. Instead, starting with some assumptions and doing some derivations is much more powerful, e.g., as for the law of cosines and the Pythagorean theorem.

The problem for AI, machine learning, etc. in computer science proceeding mathematically is that the corresponding math is often a bit advanced, and the profs didn't take the right courses in grad school and don't know the math.

Actually, for 'learning', there is a lot that is quite solid in parts of mathematical statistics. There the math is done with good care and is basically correct, and someone with a better math background than many of the statistics profs can make the math rock solid. There is more for 'learning' in optimization, control theory, and stochastic optimal control.

So, much of AI and machine learning is a C- student's rip-off of some Cliffs Notes summary of mathematical statistics, applied while ignoring the assumptions. Bummer.

All that said, if you want to see AI approach the singularity where there is something like actual intelligence, then it does appear that you will need some very new approaches and ideas. Even done very well, not all of this work will be solid mathematics.

In the meanwhile, broadly the good stuff is in selected topics in mathematics. I suggest you emphasize linear algebra, mathematical analysis, and optimization and then relatively advanced approaches to probability, stochastic processes, and mathematical statistics.



