
Notes 12-17-2014

Lee: email Paolo about meeting time next semester.


  • final materials to wrap up independent studies
  • GPEM papers
  • Tim’s talk
  • simplification for generalization
  • epigenetic discussions


  • potential collaboration
  • discussion of work with John Klein
  • “binding” and its various definitions
  • how to incorporate ideas of bonding into pucks

Lexicase selection

  • inverse co-solvability lexicase bias for ordering cases
  • how to order cases to preserve niching, but also solve hard cases
  • What if the age of the individuals were considered when ordering the cases? That would favor cases that are newly solved. Doesn’t everyone want to write more code?
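As a reference point for the case-ordering ideas above, here is a minimal sketch of lexicase selection with uniformly shuffled cases. The function name, the `errors` layout, and the `case_order` argument are assumptions made for illustration; they are not taken from Clojush. The `case_order` argument is only a hypothetical hook showing where a biased ordering could be supplied.

```python
import random

def lexicase_select(population, errors, rng=random, case_order=None):
    """Pick one parent by lexicase selection.

    population: list of individuals (any objects).
    errors: errors[i][c] is the error of individual i on case c (lower is better).
    case_order: optional explicit ordering of case indices (hypothetical hook).
        If None, cases are shuffled uniformly, as in standard lexicase selection.
    """
    if case_order is None:
        case_order = list(range(len(errors[0])))
        rng.shuffle(case_order)
    candidates = list(range(len(population)))
    for c in case_order:
        # Keep only the candidates with the best error on this case.
        best = min(errors[i][c] for i in candidates)
        candidates = [i for i in candidates if errors[i][c] == best]
        if len(candidates) == 1:
            break
    # Any remaining ties are broken at random.
    return population[rng.choice(candidates)]
```

With an explicit `case_order`, the first cases in the list act as the strongest filters, so any ordering bias changes which specialists survive.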


  • In general, results look good. Need to reconcile differences between which settings work for symreg versus program synthesis.

Tom’s talk: machine learning classification of endothelial cell morphologies



Computational Intelligence Laboratory Meeting Notes – 12/10/14


Present: Wren, Tom, Lance, Karthik, Lee, Nic, Bill, Tom, Tim, Mike (Scribe)


Announcements: Meeting next Wednesday 12/17/14 in ASH 221

Meetings next semester at same time in ASH 111 starting January 21st.


Agenda: Uncle Bob Discussion, Lab projects

Deadlines: GECCO Abstracts January 21st or 24th, Papers due February 4th


Project Updates:

Nic:          Working on clustering.

Bill:          Finishing up his paper on EHC, bald eagle project.

Tim:       Working on multivariate clustering of the BC data. Statistics is hard. “Use R” is a good book for learning R.

Post-run simplification and generalization with Tom?

Mike:     Compare GP to other methods of analysis. What does the solution look like? Who would use it and why? For next week: how interpretable are the results? Look into credit card neural networks.

Karthik:  Looking at implementing Tom’s benchmarks for GECCO in Sketch and another system.

Wren:     Working on setting up Bill’s system. Bill is working on getting it running on FLY.

Tom:        “Pounding the Cluster”. EHC work, just added new data to the table. EHC is performing generally worse.

Lexicase paper proof is ready.

Lance:     Working on a chemistry representation for ALife. Recently completed a virtual ribosome.

Uncle Bob (Finally) part one:

                Create an operator that generalizes within the genetic pipeline?


19 November 2014

Attendance: Karthik, Wren, Tom, Bill, Tim, Mike, Beryl, Nic, Lee, Eddie (scribe)

GECCO abstracts due in January. Papers in February.

Pucks bonding in the future. Significance of improved efficiency still under investigation.

Mike has solid crime data.

Beryl is working on building a linear algebra Clojure library that includes complex numbers.



Use post run simplification to generalize.

1. How well it works on training data

2. How well it generalized to test data

3. How well it generalized after simplification

Data for above is available.


Karthik’s Presentation

  • SMT
    • Used for model checking, program verification, and function synthesis given constraints.
  • Semantic code search
    • Using I/O pairs as opposed to keyword searches
  • Can prove generalization of synthesized functions by converting the function into its constraints.
  • Constraints become an effective way to describe a function’s semantics
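To illustrate the idea of searching by I/O pairs rather than keywords, here is a small sketch that filters a toy library of functions by running each one on the given examples. The `search_by_examples` helper and the `library` contents are hypothetical and made up for illustration. This brute-force version only checks the supplied examples; it does not use an SMT solver and proves nothing about generalization.

```python
def search_by_examples(candidates, examples):
    """Return names of candidate functions consistent with every (args, expected) pair."""
    matches = []
    for name, fn in candidates.items():
        try:
            if all(fn(*args) == expected for args, expected in examples):
                matches.append(name)
        except Exception:
            # A candidate that crashes on an example is treated as a non-match.
            continue
    return matches

# Toy library of candidate functions (made up for illustration).
library = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "sub": lambda a, b: a - b,
    "max2": lambda a, b: max(a, b),
}

# One example is ambiguous; a second example narrows the search.
ambiguous = search_by_examples(library, [((2, 2), 4)])
narrowed = search_by_examples(library, [((2, 2), 4), ((3, 1), 4)])
```

Here `ambiguous` contains both `add` and `mul`, while `narrowed` contains only `add`, which shows why the choice of I/O pairs matters for this kind of search.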

12 November 2014 Notes

Meeting times: Wednesday 9-12? In ASH 111.
Jason Moore now at UPenn.
Moshe and Lee grant submitted.
A ridiculously interesting and high volume amount of stuff and exciting graphs.
Uncle Bob.
Karthik giving tutorial.
Tom’s stuff.
Wren’s project. Wisconsin card projects.
Evolving one model for two different conditions.
The individual is just the equation.
Don’t evolve the parameter; instead, sweep the parameter when testing an individual.
Jordan Grafman Northwestern. Data.
Tom: Weed out the crap of simplifying programs. Mixed cross validation and simplification.
1. Niching
2. post run simplification
3. stats stuff that has been on the email list
4. stuff that started with hill-climbing. How do you limit runs?
5. multi-chance lexicase.
Replace-space-with-newline program. Tournament selection does just as well as lexicase selection if you look at a different problem.
Niching- Look at stuff
“I’m not sure what that means”.

29 October 2014

Attendance: Wren, Tom, Bill, Eddie (Scribe), Tim, Mike, Beryl, Nic

Wren’s Project

Hoping to find equation of curve from data averages extracted from videos.

Might try simpler regression techniques before using GP.

EHC in Clojush/Bill’s Results

Non-significant improvement using EHC with lexicase. Very significant improvements using EHC with more traditional GP techniques.

Different number of point evaluations make it hard to compare.

Pagie problem improvements. Added a 1 to the constants, which allowed solutions to be found, so success rates could be compared. Seemed to be an improvement in both baseline tests and EHC tests.

Nic discussed ways of looking at the statistical significance of results of a run. (Pairwise tests)

Tom’s Diversity/Homology Results

Lexicase increases the diversity of the population a lot.

Not clear if lexicase significantly lowers Homology or not.

Higher diversity = individuals are very different in population

Higher homology = population has lower average edit distance

Higher diversity generally results in lower homology. Is this bad?

Program sizes increased much more with lexicase selection.

Graphs become less informative as more runs find solutions. Possibly change opacity or thickness of plot lines as success rates improve.

22 October 2014

Lab Meeting Notes October 22, 2014

Present: Bill, Tom, Lee, Ben (Scribe), Tim, Eddie, Mike, Lance


New library that allows people to integrate Clojure in Unity.

This is awesome for AI and virtual worlds stuff.

Unity can already integrate C#, JavaScript, Boo (?)

Eddie will look more into the Clojure integration in Unity.


Push doesn’t have information protected at all so scaling up might be difficult.

The idea of “Names” became cumbersome.

“Tags” though with inexact matching might help.

Difficult to figure out if “Tags” accomplishes true modularity.

Eventually came up with “Environments” but have not tested it much.

Also developed “Environments” for geometric semantic operators

What happens is that when “environment” is called, the current state gets stored somewhere and the thing you want to run gets run. Then, when returning to the stored state, you can consume the code that was just run and create code that is the result of the run.

Should we be including “Tags”?

Should we be including “Environments”? (thinking about them separately)

Lee: Send Lance Tagspace machine paper.


Lee is updating transactions so that the “bid” and “ask” system can be more strongly developed.

In order for a transaction to occur, each “bid” should satisfy each “ask.”

Lee wants:

Double check the neighborhood implementation.

New complex worlds

New worlds that will be slightly different each run.

Also needs scalability.


Fitting a decision tree to the data

Should we remove duplicates?

Change the current paradigm since data seems to conflict with itself.

Bill’s Results

Results of new hill climbing algorithm and trying to find new efficient methods.

Crossover does seem to be pretty good

Algorithm is not affected by the number of iterations.

For next meeting

Make sure to include some Wren and Tom time

Come back to Uncle Bob.

15 October 2014

Lab meeting notes 15 October 2014

Present: Lee, Lance, Karthik (Scribe), Ben, Tim, Mike


The pucks environment. 

– Testing of the neighbor finding algorithm.

– Scale up the world and check for the efficiency of the newer neighbor finding algorithm.

– Add a potential scaling factor setting?

– Building more worlds that are somewhere in between completely random and completely deterministic.


Clojush documentation – outstanding issue for a very long time.

– Evolving things with say, 3 inputs.

– Use a Wiki for documentation.

– Handling documentation for things that are highly dynamic.


– Talk about environments, either today/sometime soon

– Tim’s decision tree results: Current success rate ~62%, which isn’t necessarily better than just guessing “no” on the problem.

– Parameters that can potentially improve generalization?
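The comparison between the ~62% success rate and always guessing “no” is a majority-class baseline check. A minimal sketch follows, assuming the labels are available as a plain list. The function names are hypothetical and the label counts in the usage note are invented for illustration, not taken from the dataset.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)

def beats_baseline(model_accuracy, labels):
    """True only if the model is strictly better than always guessing the majority label."""
    return model_accuracy > majority_baseline_accuracy(labels)
```

For example, with 60 “no” and 40 “yes” labels the baseline is 0.6, so a model at 0.62 barely beats it; with 70 “no” and 30 “yes” labels the baseline is 0.7 and the same model falls short.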


Uncle Bob piece. 


(Long term) : Presentation on Satisfiability Modulo Theory, constraints, etc.

8 October 2014

Attendance: Lee, Tom, Bill, Kwaku (yay!), Wren, Eddie, Noah, Tim, Karthik, Nic, Mike (scribe)


  • FLY was down due to power outages
  • Pucks – Shooters have been added, discussion of possibly using pixel data for proximity detection
  • Introductions for new people, individual project discussions.
    • Wren has data from her images of pollen tubes!
    • Tim implemented a decision tree on the Bladder Cancer dataset.
    • Bill has been using the UMASS computing cluster, can we use it?
  • Kwaku – Evolution and ecology in software engineering. Learning how diversity occurs. Embedded systems. May collaborate with Karthik?
  • Homology – How much of the population exhibits similarity structurally.
    • Background
      • Only takes structure and sequence into account, not function.
      • Comes out of ULTRA in hopes of finding structurally similar individuals
      • Plush replaced ULTRA
    • Only doing crossover on the same parts of a program
    • Want to measure how homology occurs/changes over time.
    • Tom has been testing out using edit distance to measure homology (Levenshtein) only using instructions.
    • Can we measure homology properly without executing the programs?
      • We want to measure fitness and homology; some changes may not show a benefit until later generations.
    • Should epigenetic markers influence homology? Probably.
      • Could look at the push programs rather than the plush genome.
      • Genomes are more important to look at.
      • Tom wants to start without epigenetic markers for simplicity’s sake.
      • Can we use tags to assist us?
      • Padding is used to normalize.
    • Uncle Bob – SOSIES Paper code transformations, discuss next time.
    • Databases – To implement in Push, we need to figure out the root parent issue or an alternative representation to make Ancestry linear so we can find clades.
      • Find a way to eliminate duplicates?
      • Using ancestry to change crossover parameters.

Link to Uncle Bob Blog Post


1 October 2014

Present: Lee, Nic, Lance, Tom (scribe), Ben, Beryl, Tim, Wren, Eddie

Finishing some Student Project Ideas from Last Week

Eddie – He’s tried some stuff with Brevis and Quill. He might next work some with Pucks.

Ben – Not sure yet. Maybe something A-Life-y?

Tom’s Benchmark Problems

Should we include problems beyond iJava and Yuriy’s paper?

  • We should have a citable source outside our work for the problem — i.e. not “we made it up”. Otherwise, probably “yes”.
    • FizzBuzz (kata) – yes
    • Bowling (kata) – maybe
    • Factorial – maybe (too easy?)
    • WC – yes
    • US Change – maybe (need citable source)
  • Tom will consult rest of committee.

Pollen Tubes

Wren is going to try some things and consult others to try to see if a solution already exists.

Homology of Population

  • Levenshtein distance of pairs of programs seems like good measurement.
  • Likely can’t test all pairs, since this will usually be about 1 million Levenshtein distance computations. This will probably be too slow.
  • Instead, sample random pairs of individuals — maybe on the order of 1000 to 10000.
    • May need to do validation to make sure the sample results really reflect the population.
    • Report: median, quartiles, mean, standard deviation.
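The sampling plan above can be sketched as follows. This is a minimal illustration, assuming each program is represented as a sequence of instruction names. The function names are hypothetical. For an assumed population of 1000, there are 1000 × 999 / 2 = 499,500 unordered pairs (about 1 million ordered pairs), which is why sampling is attractive.

```python
import random
import statistics

def levenshtein(a, b):
    """Edit distance between two sequences (e.g. lists of instruction names)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]

def sample_pair_distances(population, n_pairs, rng=random):
    """Edit distances for n_pairs randomly chosen pairs of distinct individuals."""
    distances = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(population)), 2)
        distances.append(levenshtein(population[i], population[j]))
    return distances

def summarize(distances):
    """Median, quartiles, mean, and standard deviation of the sampled distances."""
    q1, _, q3 = statistics.quantiles(distances, n=4)
    return {
        "median": statistics.median(distances),
        "q1": q1,
        "q3": q3,
        "mean": statistics.mean(distances),
        "stdev": statistics.stdev(distances),
    }
```

One way to validate the sample, as suggested above, would be to run `summarize` on several independent samples (or on all pairs for one small population) and check that the statistics agree closely.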

Nic McPhee Presents: Analysis of Ancestry in GP with a Graph DB

  • Using Neo4j – a graph database – to store ancestry info throughout a GP run.
  • Makes ancestry info easy to collect and query.
  • NeoCons – Clojure with Neo4j
  • Cool stats can be queried from the database!
  • Could be neat to look at common ancestries using different parent selection methods!