Darwin is a hacker

rik · May 13, 2010, 9:18am

3evolve.pl_.txt (12651Bytes)
4evolve.pl_.txt (13631Bytes)
5evolve.pl_.txt (15007Bytes)
6evolve.pl_.txt (17455Bytes)
7evolve.pl_.txt (17642Bytes)
8evolve.pl_.txt (19965Bytes)
1population_management.pl_.txt (5803Bytes)

How many monkeys on type writers does it take to write the collected works of Shakespeare? Or Asimov for that matter? It has been claimed, but never tested, that a limited amount of monkeys will indeed produce such a work.

Given enough time...

I think I can emulate a finite amount of monkeys hitting random keys. I just did so. It's a perl program of about 300 lines that produces Picaxe Basic. Or rather, it produces absolute nonsense, and then feeds it to the picaxe compiler for a quick syntax check.

Every bit of gibberish that makes it through the test is relabelled as "viable" picaxe code. It is stored somewhere safe for future retrieval. And this is where my monkeys get a bit smarter. They copy pieces of existing picaxe code. The better the code, the more likely they are to reuse it. And add to it. Or delete parts from it. Or paste it into their own. But most of all, they will combine it with other pieces of existing code. And thus produce new code to be tested by the compiler.

Now, admittedly, I am a cheater. I gave my army of idiot programmers a few snippets of code. The very minimum lines like "b0=0" and "pause 1". And it shows. These snippets are reappearing all over the place. In about an hour, my collection of viable programs has grown from 5 to 500. One combination for example now reads "pause b0".

This is not enough to me. I want a real picaxe to start using this code. And I want some way of telling the monkeys which pieces did something really cool while uploaded to the micro controller. I am still without such a feedback system. It could be all kinds of stuff, but most of all, it needs to be dead simple. The feedback system is taking shape.

And it need to be something visible, so I can broadcast the experiments on live internet TV.

So, my monkeys… What kind of ideas will you copy and paste together for this cool experiment?

Changelog

# 201005121529 rik (1069) letsmakerobots.com
# artificial evolution of micro controller code
# Picaxe Basic in my case
# Warning: this code is highly experimental - might just be mental

# 201005121611 let’s introduce point mutations
# 201005121611 let’s introduce rudimentary evaluation functions
# 201005121809 let’s introduce sort by_fitness into select_creature
# 201005121950 let’s introduce specific picaxe basic genome cleaners
# 201005131258 let’s cleanup not-dead code upon loading
# 201005131818 let’s cleanup code before determining id (plus other major improvements in quality)
# 201005131819 let’s prevent two identical pieces of code to exist in the db
# 201005132025 let’s try and rate the quality of a compilation
# 201005141604 let’s remove some non lethal bugs
# 201005141607 let’s reuse make_creature in the main loop
# 201005141621 let’s cleanup accolades (curly braces) and reduce labels to one alpha char
# 201005141717 let’s not shorten labels, it destroys potentially interesting dna
# 201005141723 let’s dump all genomes in one huge logfile for online monitoring
# 201005141738 let’s expand the fitness range for dead genome: 1-9 depending on location of syntax error in the code
# 201005141756 let’s give small viable programs a small advantage when memused is just over 4 bytes
# 201005141856 let’s compile code with newlines replaced by spaces so that we know the exact length of the code
# 201005141948 let’s no longer compile code with newlines replaced by spaces, it’s another bad idea, the code appears to be functionally different in some cases
# 201005142036 let’s cleanup backslash-doublequote combo’s
# 201006011238 let’s allow for father and mother to be the same creature (!)
# 201006011441 let’s rename “retired” into “buried”
# 201006011824 let’s stop replacing empty genome by an arbitrary string (a:) and start evaluating it as fitness = 1
# 201006011929 let’s move all the sorting and counting of %creature to the top of the script, saving tons of CPU cycles
# 201006022235 let’s cleanup trailing newlines
# 201006032139 let’s push any new ids to the array @ids, without bothering with expensive sorts
# 201006032154 let’s no longer worry about genomes shorter than 2 chars, they too have a purpose in the gene pool

Continue reading here.

telefox · May 13, 2010, 5:52pm

Ehehe, I’m working on a

Ehehe, I’m working on a related project at the moment, but from a neural networking perspective. Once I finish my current prototypes I’ll be able to release a model “into the wild” and see how it develops.

The one thing you’re missing, as you’ve obviously noticed, is that your monkeys lack a goal.
In biological evolution the goal has always been simply survival, but your freshly spawned programs have no predators. The only way they can be culled is if the program fails to compile - essentially this is analagous to a critical genetic fault.

Now you have to choose whether or not your creations will be subject to the Hand of God (manual selection by you) or the Survival of the Fittest (‘natural’ selection in software)…

isotope · May 14, 2010, 12:58am

This is really interesting!

This is really interesting! For feedback you could have say, a counter with red and green buttons. This counter would represent some kind of rating. For example the default is 10 in the begining. During execution of program, if you liked something - push green button and increase rating. Same if you dislike - push red and decrease. In the end of execution you would have some kind of rating you could pass down to your monkeys

rik · May 14, 2010, 6:20am

Excellent work, my monkeys!

Glad to see that you are thinking along! Here is some progress so far.

Last night, after improving the selection mechanism a lot, I started a new run. Without injecting any “artificial DNA”. Not a single Picaxe word was released into the primordial soup. It ran for 12 hours and completely failed to come up with even a single “goto”. The largest viable piece of code reads:
uq:}
pd:
j3:
}
p9:
ch:
wt:
zj:
}
ju:
pu:
zt:
z:
zm:}}
}

Sure it will compile, but it won’t do anything. It’s just labels and formatting code. I will alter my DNA repair-code to minimize these.

Until now, the nightly work is done by a repeated batch process. I run 600 iterations and then cull 10% of the dead bodies at random. Each “body” is a file on disk, so that is easy to script. The viable bodies remain. And therein lies the risk Telefox is talking about. There is no mechanism to make any distinction between stupid and smart programs. There is no distinction whatsoever. All viable bodies (I call them genomes) are rated a fitness of “1”.

I tried to change the lack of distinction in this phase of the evolution. My latest code (version 5) is now optimizing for code size after compilation. Remember this message from your Picaxe programmer:
Compiled successfully.
Memory used = 9 out of 4096 bytes.

My evaluation routine is aiming for 40% memory use. That is a very arbitrary percentage and I might change it one day. It does not really matter. Because now, the fitness rating ranges from low to high (from 1 to 99 in version 5). Any piece of code that takes up 8 bytes in RAM or more will win from smaller programs. And even larger ones will win from the 8 bytes code, etcetera. This will hopefully separate the “empty but long” genomes from the “meaningful” ones, without actually testing them one by one in a real Picaxe.

The example genome above scores “1”. That is equally low as a minimalistic piece that reads “ma:” and they will both have equal chances of ever spreading their DNA through the gene pool. Without any proper competition, this means the chances are equally “high”. As soon as one genome appears with a higher fitness score, all bets are off. My routine selects parents for the next generation at random, but is preferring the better ones. By far. Without forgetting the poor little stupid ones.

My next step will probably be to increase the need for survival. By randomly culling viable genomes. Starting with the stupidest ones.

rik · May 14, 2010, 6:28am

Predation

I would love to read about your NN bots on LMR. I did not miss that article, did I? Even an unfinished prototype would be interesting. You know. It’s LMR. We love unfinished prototypes.

You are right about the goal. I think I introduced it now. Now, the only way to prevent (random) culling is to advance. This is not really analogous to predation. Rather starvation in a limited ecosystem.

But some day, my algorithm will produce thousands of genomes that are all optimized (all fit 40% of my Picaxe’s RAM). And I will need to feed them into real Picaxes. And judge them for their real world behaviour.

I prefer automated selection all the way. Or perhaps crowd sourced selection.

telefox · May 14, 2010, 6:48am

LMR may love prototypes, but

LMR may love prototypes, but I only love other peoples prototypes =)
They’ll turn up sooner rather than later, if progress continues as planned.

Had a browse through version 5, and I may not have much experience with PERL but this experiment is starting to look very promising. As you mentioned, it’ll just take one piece of minimally functional code to really get things rolling.
Starvation-induced competition is pretty much the same as predation as far as statistics are concerned, so with a bit of luck it won’t be long before we start to see something a little more animated squelch out of the digital primordial ooze…

rik · May 14, 2010, 7:26am

Prototype

I’ll show you mine if you show me your’s!

telefox · May 14, 2010, 7:29am

Haha, but I can already see

Haha, but I can already see yours! Why buy the cow when you can get the milk for free?

rik · May 14, 2010, 10:51am

Interactive feedback providers

This is what I call interactive feedback. We would need people to push those buttons. In order to process gazilions of genomes, we’d have to have many people spending many hours watching and judging the end result of the program (the phenotype).

I surely am not going to do all that work by myself! But if I can make the whole contraption Internet enabled somehow, maybe I could get everybody in the whole wide world involved. Well, everybody on LMR anyway.

So my aspirant judges, tell me; what would be a simple yet interesting machine to look at? Just an LED or two? When the LED goes crazy, would you assign more or less points? Or perhaps a simple motor with an stick attached. Hoping it will evolve to wave at you?

rik · May 14, 2010, 11:02am

Cheating killed my critters!

I cheated and I got punished for it.

My latest algorithm further cleans up any genome that was randomly generated. One of the cleaners removes labels that:
* are longer than one character
* uses anything but an alphabetic character
So now my viable code only produces nonsense like
v:
k:
t:
and the dead genomes are starting to look eerily similar. There is no chance at all in this system to ever produce anything as “complicated” as “goto k:”. The word “goto” simply is too long and will get cleansed before ever finding its way into the viable gene pool.

8-(

rik · May 14, 2010, 2:16pm

Picaxe Basic is stranger than you think

Stranger even than that!

Random code can lead to new knowledge. Did you know that this:
""
compiles just fine?

But what is worse, near such a backslash-doublequote monstrosity, the compiler will accept all kinds of gibberish!
“0a:r@zkxa:rr@f]s”+fz=qfa=64+
Compiles just fine!

Don’t they understand that this is a gateway to evolutionary hell?!

benbo231 · May 14, 2010, 3:42pm

Skynet

Sentient A.I…Now developing smileys!

:}

rik · May 14, 2010, 4:21pm

OMG, I am starting to hate astronomy now!

I just did a quick “back of an envelope” calculation. And it turns out that the odds of finding a word (any word) that exists in the picaxe manuals through combining random characters are ASTRONOMICAL.

That will not do!

So I did some major cheating and injected the entire index of manual 2, containing some 1500 discrete words. Now it turns out my algorithm does not respect that either. It is chomping on infinite variations of the word “disconnect”.

disconnect
fcf4:disconnect
disisconnconnsconnconnecnnec4:disconnect
ec4:disconnect
fisconnconnecnnec4:disconnect
fcnnfc24:disconnect

I get the hint. I’m off to bed.

philbot · May 15, 2010, 7:55am

Halting theorem

Rik, are you familiar at all with the halting theorem?

I get the feeling that if not, you’re going to encounter it pretty soon. Have fun with that.

The other thing is that I can see a flaw in your test: that is, the code you produce (if executed on a picaxe) could actually result in a large number of different outcomes, depending on how the chip is wired up and what to. The same program could potentially perform many different tasks, all dependent on the hardware the chip is plugged into.

Also, I’m not sure how many keywords there are in picaxe code, but even if you were to start with a language of only valid keywords, there are billions of possible combinations, and only a very few of them will actually do anything interesting.

rik · May 15, 2010, 9:25am

Nah, problems exist to be ignored

I never intended my experimental robot to run the program until the program decided for itself to halt. I planned on giving it like 60 seconds of running time before I (automatically) upload the next code for testing.

This is the beauty about my kind of experimenting. It’s just for fun and glory (thanks TinHead).

However, the problem of the astronomical odds is very real. I have yet to find a way to circumvent it. I already learned that the evolution algorithm needs to be able to recognize even the smallest advances in fitness. And I am not sure how I will implement that once I am ready (if ever) to start uploading genomes into my arbitrary robot.

rik · June 3, 2010, 4:31pm

This project just won’t die

My population manager is killing creature by the millions, but the project won’t die. The physical habitat for the viable creatures is underway! Stay tuned.

[image intentionally obscured - dum dum dummmm!!]