Aug 02, 2020
Training a puppy is a fun experience. Our little Poke is turning 1 year old, and turning into a cutie-pie.
These thoughts emerged while training my own dog. There are many examples that reflect the fundamental difference between how humans learn and how animals do. One of the key observations is the lack of communication. Since an animal cannot understand or reason about what a human says, it often has to learn by brute force. For instance, Poke LOVES to scratch the wooden floor when she's bored. Many times we got up late to find she had destroyed the tiles in the kitchen. It's really hard to teach her not to, since we can never catch her doing it (well, partly because being destructive is in her nature). We decided not to feed her breakfast if she breaks the floor tiles. But this is an annoying process: we don't want to torture her, but on the other hand, without enough of these "cases" she wouldn't be able to connect "break tile" -> "no food". Think about it: we are never going to catch her at the scene, considering it's around 12 hours overnight, and a lot could be going on in those 12 hours. It could be the fluffy toy she tore apart, the water plate she flipped over, the kitchen drawers she pulled open, many, many things she could potentially relate to the ultimate "no food" signal, the strongest supervision signal she can get as a dog. She would have to be fed or not fed enough times, with all these different "noticeable" behaviors occurring in different combinations, until she correctly correlates "breaking tiles" -> "no food". This is exactly the problem when we train neural networks: there are so many variables in the environment (features), and so little supervision (feed), that we need many, many examples before the model converges.
Think about a simplified model: there are N behaviors noticeable by Poke, each modeled as a boolean variable x_1 ... x_N. Suppose Poke needs to learn a "denoising" boolean function y (no food) = x_{break tile} through an ML model. The posterior distribution is:
P(y=1 | x_1=0, x_2=0, x_3=1, ..., x_N=0). Here we need to flip each bit x_i at least once to obtain the marginal distribution of y given x_i, P(y | x_i), meaning we need at least 2N examples to learn P(y | x_i), and obviously the more we have, the more accurate the estimate of P(y | x_i) becomes. In the case of training an MLP, which can supposedly represent any Lipschitz-continuous function including the XOR gate, 2N examples is far from enough, because the weight matrices alone contain on the order of N*M parameters (M being the hidden-layer size).
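To make that concrete, here is a toy simulation of the setup above (the behavior count, the probabilities and the sample sizes are all made up for illustration): only the "break tile" bit actually drives the "no food" signal, everything else is noise, and you can watch how many nights it takes before the marginal estimates separate.

```python
# Toy sketch of the "dog supervision" model above. Nothing here comes from a real
# experiment; the indices and sample counts are invented for illustration only.
import random

N = 8                      # number of noticeable overnight behaviors
BREAK_TILE = 3             # index of the one behavior that actually causes "no food"

def sample_night():
    """One night: each behavior happens independently with probability 0.5."""
    x = [random.random() < 0.5 for _ in range(N)]
    y = x[BREAK_TILE]      # the "denoising" target: no food iff tiles were broken
    return x, y

def estimate_marginals(num_nights):
    """Estimate P(y=1 | x_i=1) for every behavior from num_nights observations."""
    hits = [0] * N         # nights where x_i=1 and y=1
    seen = [0] * N         # nights where x_i=1
    for _ in range(num_nights):
        x, y = sample_night()
        for i in range(N):
            if x[i]:
                seen[i] += 1
                hits[i] += y
    return [hits[i] / seen[i] if seen[i] else float("nan") for i in range(N)]

for num_nights in (2 * N, 200):
    est = estimate_marginals(num_nights)
    print(f"{num_nights:4d} nights:", ["%.2f" % p for p in est])
# With only ~2N nights the estimates are noisy; with more nights the break-tile
# behavior stands out near 1.00 while the others settle around the base rate 0.50.
```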
Training a dog is just like training a neural network: there are so many variables, so much training data is needed, and obtaining that training data is a time-consuming (costly) process. Why isn't it so when training humans? This goes back to my initial point: communication. If language is not a barrier, say I'm training a Chinese- or English-speaking worker, I'd just need to tell her. She'd understand the boolean function, presumably without seeing a single example. However, if I were training a Hungarian worker, I might end up doing the same thing as with Poke, because there is no way I can communicate this function to him: don't do this, otherwise that will happen.

In the popular book Thinking, Fast and Slow, the author points out that humans have two branches of thinking. One is like intuition: you perceive things and correlate them to concepts; cognitive tasks such as image recognition and speech recognition belong to it. The other branch is the ability to "reason", the distinctly "slow" thinking, because this route in the brain needs more power and more latency to resolve and cannot be handled by the fast "intuition" reflex. This is exactly what imperative programming does: you write logical instructions and they execute logically. The crux of learning the "no-tile-breaking" task for a human is communication: instead of training a neural network in the brain, the human "reprograms" the logical part of the brain to deterministically execute this rule.

The most exciting part is how this "reprogramming" procedure works; I don't believe anyone has put much research into it. In a typical ML system, retraining a neural network can be completely automated as long as there is a continuous stream of learnable data. It does involve a human in the loop when new structure (variables/features) is added, but with enough building blocks and more models, we can successfully remodel these tasks with new data. But how do we reprogram deterministic logic? First we need something that executes boolean logic; in this case we have imperative programming, cool, but coding is a completely human-in-the-loop process, so how do you tell a computer to code? I imagine some high-level tasks need to be defined first, and then the bot, with some new kind of intelligence enabled, can reprogram the code through finer-grained tasks it defines itself.
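As a minimal sketch of that contrast (the rule and the observation keys are my own toy names, not anything real), here is what the "reprogrammed" route looks like next to the learned one: the communicated rule is a single deterministic line, while the dog's route needs the pile of supervised nights from the previous sketch.

```python
# Hypothetical illustration of "reprogramming via communication" vs. learning.
# Route 1: the rule is communicated in language and written down directly.
def should_feed(night):
    # "If she breaks a tile, no breakfast." One sentence, zero training examples.
    return not night["broke_tile"]

# Route 2 (the dog's situation): the same rule must be induced from supervision
# alone, i.e. many (night, fed / not fed) pairs fed to the marginal estimator or
# an MLP with on the order of N*M weights, as in the earlier sketch.

night = {"broke_tile": True, "tore_toy": False, "flipped_water_plate": True}
print(should_feed(night))   # False: deterministic, no statistics involved
```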
Many questions come out of this assumption. What does the "high-level" task look like? First, it needs to be highly abstract, yet flexible enough to rearrange logic at the same time, so DSLs, those highly structured but inflexible frameworks, are out of the question (let alone the trouble they bring to programmers, who can barely control or understand them). At the same time, it has to carry a lot of prior information; it may need to connect to security (don't hurt somebody! don't touch my grass!) and privacy (don't pull this out! don't show this to people!), all the "commonsense" that only humans master, so end supervision (feed / no feed) alone is not sufficient. I don't think any meta-learning direction in the current research world is tackling this problem seriously. The second assumption is about "reprogramming": what kind of program is that? C? C++? How do you make this bot understand pointers and references? Taking a step back, how does the bot even understand integer, float and string primitives? Wouldn't we have solved the intelligence problem already if the bot were smart enough to really know how to code? Is it computer programs, or something we haven't discovered yet? Obviously humans use something. And without building such a programming language, we won't know how to program it.
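Purely as a hypothetical illustration (nothing like this exists, and every field name here is invented), a "high-level task" might look less like a program and more like a bundle of an abstract goal, commonsense constraints, and a weak end-supervision signal:

```python
# Hypothetical sketch only: what a "high-level task" handed to such a bot might
# contain. It concretizes the claim that the task language needs commonsense
# priors (safety, privacy) baked in, not just the end supervision signal.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskSpec:
    goal: str                                        # abstract objective, not code
    constraints: List[str] = field(default_factory=list)  # commonsense priors
    reward: str = ""                                 # the weak end supervision

spec = TaskSpec(
    goal="keep the kitchen floor intact overnight",
    constraints=["don't hurt anybody", "don't touch the neighbor's grass",
                 "don't expose private data"],
    reward="feed / no feed",
)
print(spec)
```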
I don't mean to pour too much pessimism on AI research. The fact is, the more I work in AI, the more I am in awe of human intelligence, and the more I feel our intelligence is limited when it comes to understanding intelligence. A philosophical question, then: is it even possible for intelligence itself to understand intelligence? One possible direction is to just give up the understanding part. I mean, humans can give birth to humans, train them, and put them to work, and have done so for thousands of years; humans are certainly entitled to "create" intelligence and make use of it. It's just a matter of how fast and how efficient this creation is, and how tight a control we can keep on these "intelligent" creatures.