Object Oriented Programming:

Genetic Nets

 

BIOL 265/COMP 113

 

Michael Weir and Danny Krizanc

 


Summary

Object Oriented Programming (OOP), as the name suggests, emphasizes defining objects with properties/attributes.  This can lead naturally to definitions of relationships between the objects.  Genetic nets, in which genes 'talk to each other', can be modeled effectively using OOP.  By studying genetic nets, we can move beyond individual genes to begin to discover larger-scale properties of systems of genes working together.  We are going to model a version of genetic nets originally proposed by Stuart Kauffman.

 

Kauffman, S.A. (1969) Metabolic stability and epigenesis in randomly constructed genetic nets.  J. Theor. Biol. 22:437-467. (Wesleyan Electronic Reserve)

 


Kauffman's Model

Let us start by summarizing the essence of Kauffman's model:

 

Start with a network of N genes

Assume genes can be either on (1) or off (0)

Assume digitized time units

Gene expression space consists of 2^N possible states

E.g., for N = 10, states include [1,0,0,1,1,1,0,1,0,0], [0,0,0,1,0,1,0,1,0,0], [1,1,0,0,1,1,0,1,0,0], etc.

For any time = t, expression space assumes one of these states

One of these states is randomly chosen to initiate a simulation

Each gene receives inputs from two other genes in the network – these inputs are assigned randomly at the beginning of the simulation

For each gene g_i, the activities (0 or 1) of the two input genes at time = t determine the activity of g_i at time = t+1, as defined by one of 16 Boolean functions assigned randomly to g_i

Here is an example of one of the 16 Boolean functions

 

       Time = t           Time = t+1
Input 1     Input 2     Gene activity
   0           0              0
   1           0              0
   0           1              0
   1           1              1
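The count of 16 follows from the shape of the table: a two-input function has four input rows, and each row's output can be 0 or 1, giving 2^4 = 16 possible tables. A minimal sketch of that enumeration follows. The row order and the name all_truth_tables are choices made here for illustration, and the order in which the tables are generated is not meant to match the numbering f0 to f15 used in the course code.

```python
from itertools import product

# Input rows in the same order as the table above: (Input 1, Input 2).
INPUT_ROWS = [(0, 0), (1, 0), (0, 1), (1, 1)]

def all_truth_tables():
    """Return every two-input Boolean function as a dict {(in1, in2): out}."""
    tables = []
    for outputs in product([0, 1], repeat=len(INPUT_ROWS)):
        tables.append(dict(zip(INPUT_ROWS, outputs)))
    return tables

tables = all_truth_tables()
print(len(tables))  # number of distinct two-input Boolean functions

# The example table above (output 1 only when both inputs are 1) is one of them.
and_table = {(0, 0): 0, (1, 0): 0, (0, 1): 0, (1, 1): 1}
print(and_table in tables)
```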

 

Kauffman's Result

Before we move on to coding the model, let us briefly summarize Kauffman's result and why it caused such a stir at the time.

 

Despite the fact that the model is set up using

Random start state

Random assignment of gene inputs

Random assignment of Boolean functions

 

Kauffman found a remarkable amount of order in the gene behavior. 

This is what we are about to see . . .

 


Boolean Functions

To begin coding Kauffman nets in Python, let us start by defining Python functions for each of the 16 Boolean functions.

For example, the Boolean function above is coded by

 

def f10(val1, val2):
  # Return 1 only when both inputs are 1; otherwise return 0.
  if val1+val2 == 2:
    return 1
  else:
    return 0

 

You can see code for all 16 Boolean functions at link.

Try representing some of these functions using a table representation (like above).
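One way to check a table you have written by hand is to let Python build it. The sketch below is only an illustration: truth_table is a hypothetical helper that is not part of the course script, and f10 is repeated so that the example runs on its own.

```python
def f10(val1, val2):
    # Same rule as the f10 shown above: 1 only when both inputs are 1.
    if val1 + val2 == 2:
        return 1
    else:
        return 0

def truth_table(func):
    """Return rows (input1, input2, output) for a two-input Boolean function."""
    rows = []
    for val1, val2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        rows.append((val1, val2, func(val1, val2)))
    return rows

print("Input 1  Input 2  Gene activity")
for val1, val2, out in truth_table(f10):
    print("%7d  %7d  %13d" % (val1, val2, out))
```

Passing any of the other Boolean functions to truth_table in place of f10 should produce its table in the same row order.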

 

We now define a function boolEval that applies the appropriate Boolean function to a pair of inputs.  boolEval acts like a case statement: given a Boolean function type and the inputs val1 and val2, it evaluates the function of that type.  [Notice that this function looks up the type in a dictionary to find the matching function, and then applies that function to the inputs val1 and val2.]

 

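def boolEval(type, val1, val2):
  # Map each Boolean function type (0-15) to the Python function that implements it.
  functions = {0: f0, 1: f1, 2: f2, 3: f3,
               4: f4, 5: f5, 6: f6, 7: f7,
               8: f8, 9: f9, 10: f10, 11: f11,
               12: f12, 13: f13, 14: f14, 15: f15}
  # Look up the function for this type and apply it to the two inputs.
  f = functions[type]
  return f(val1, val2)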

 


The Nodes (genes) of the Network

We now define a new class called node that represents a gene. 

 

What properties/attributes would we want each gene to have?

value: State on or off

neigh1 and neigh2: Input genes

type: Boolean function

 

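In order to update the net at each time point, it is convenient to also have an attribute oldvalue.  Each gene's new state depends on the states of its input genes at the previous time point, so oldvalue keeps that previous state available while the new values are being computed.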

 

Look at the code to define the node class with initial values for attributes: two neighbors set to empty; type set to 0; value randomly set to 0 or 1; oldvalue set to value.

 

Also, see the node methods to:

Set each neighbor: setNeigh1 and setNeigh2

Evolve the gene state for the next time point: evolveValue

Return the gene state: getValue
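The description above can be sketched roughly as follows. This is one possible shape for the class, not the course script itself, which may differ in detail. In particular, bool_eval_stub is a hypothetical stand-in for boolEval, and its way of encoding the 16 functions is an assumption that need not match the course's f0 to f15 numbering.

```python
import random

def bool_eval_stub(ftype, val1, val2):
    # Hypothetical stand-in for boolEval: treat ftype (0-15) as a 4-bit truth
    # table and read the bit selected by the two inputs. This encoding is an
    # assumption and need not match the course's f0..f15 numbering.
    return (ftype >> (2 * val1 + val2)) & 1

class node:
    def __init__(self):
        self.neigh1 = None                   # first input gene, set later
        self.neigh2 = None                   # second input gene, set later
        self.type = 0                        # index of the Boolean function
        self.value = random.choice([0, 1])   # current state: on (1) or off (0)
        self.oldvalue = self.value           # state at the previous time point

    def setNeigh1(self, other):
        self.neigh1 = other

    def setNeigh2(self, other):
        self.neigh2 = other

    def evolveValue(self):
        # Compute the next state from the inputs' states at the previous time point.
        self.value = bool_eval_stub(self.type,
                                    self.neigh1.oldvalue,
                                    self.neigh2.oldvalue)

    def getValue(self):
        return self.value

# Tiny usage example: gene c reads genes a and b through a Boolean function.
a, b, c = node(), node(), node()
c.setNeigh1(a)
c.setNeigh2(b)
c.type = 8   # under the stub's encoding, 8 means "1 only when both inputs are 1"
c.evolveValue()
print(a.oldvalue, b.oldvalue, c.getValue())
```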

 


The Network

We are now ready to define a network! 

We do this by defining a new class called net.

 

The attributes of net are:

size: the number of nodes (genes)

nodes: a list of nodes

 

To initialize an instance of a net, each gene is given two random neighbors and a random type (i.e. its Boolean function)

 

We define a net method evolveNet to evolve the net.  A for statement is used to walk through each node in the network.  At the end of this method, all the values of oldvalue are updated to the current state.
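The reason for refreshing oldvalue only at the end can be seen in a small sketch that uses plain lists instead of the node and net classes. The names step, inputs and rules are hypothetical, and each rule here is an ordinary Python function rather than one of the numbered Boolean functions. Every new value is computed from the same snapshot of the previous state, so the order in which the genes are visited does not matter.

```python
def step(state, inputs, rules):
    """Return the next state; every gene reads the same snapshot `state`."""
    new_state = []
    for gene in range(len(state)):
        in1, in2 = inputs[gene]
        new_state.append(rules[gene](state[in1], state[in2]))
    return new_state

def both(a, b):      # 1 only when both inputs are 1
    return 1 if a + b == 2 else 0

def either(a, b):    # 1 when at least one input is 1
    return 1 if a + b >= 1 else 0

# A three-gene net: gene i reads the two genes listed in inputs[i].
inputs = [(1, 2), (0, 2), (0, 1)]
rules = [both, either, either]

state = [1, 0, 1]
history = [state]
for _ in range(3):
    state = step(state, inputs, rules)
    history.append(state)

for s in history:
    print(s)
```

If the loop wrote new values straight into state while still reading from it, later genes would see a mixture of old and new values, which is not what the model specifies.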

 

We also define methods to:

Print the Boolean types for each node in the net: printTypes

Print or return the state of each node: printNet or listvalues, respectively

 


Running a Simulation

In order to run a simulation of the genetic net, we need to think about how we would interpret output.  For each timepoint in the simulation, we can look at the state of each gene (whether on or off). 

 

If the combination of on's and off's at a given timepoint matches the combination seen at a previous timepoint, then we know we have reached a cycle.  This is because the update rules are fixed and deterministic: the same state always leads to the same next state, so every subsequent state must repeat what followed the earlier timepoint.

 

When we run the simulation, we store each state in a growing list.  If the new state matches any previous state, we know we have reached a cycle, and we can print the length of the cycle and the path to that cycle.

 

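def cycle(newstate, states):
  # Compare the new state with every earlier state, oldest first.
  for i in range(len(states)):
    if states[i] == newstate:
      # states[i:] is the repeating cycle; states[:i] is the path leading to it.
      print('cycle found of length:  %d' % (len(states) - i))
      print('path to cycle of length:  %d' % i)
      return 1
  # No earlier state matches, so no cycle has been found yet.
  return 0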
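For large nets or long runs, comparing the new state against every stored state can become slow. One alternative, offered here as a sketch and not as part of the course script, is to remember in a dictionary the time step at which each state first appeared. The function name find_cycle and its return convention are hypothetical.

```python
def find_cycle(newstate, seen, t):
    """Return (cycle_length, path_length) if newstate was seen before, else None.

    `seen` maps each earlier state (as a tuple) to the time step at which it
    first appeared; `t` is the current time step.
    """
    key = tuple(newstate)          # lists cannot be dictionary keys
    if key in seen:
        first = seen[key]
        return (t - first, first)
    seen[key] = t
    return None

# Usage on a hand-written sequence of states.
sequence = [[0, 1], [1, 1], [1, 0], [1, 1]]
seen = {}
result = None
for t, state in enumerate(sequence):
    result = find_cycle(state, seen, t)
    if result is not None:
        break
print(result)
```

Here the state [1, 1] first appears at step 1 and returns at step 3, so the cycle has length 2 and the path leading to it has length 1, the same two numbers the list-based search reports.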

 

We define a function generun that takes the net size as an argument and runs until a cycle is found.

 

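def generun(size):
  n = net(size)
  current = n.listvalues()
  states = []
  # Evolve the net until the current state repeats an earlier one.
  while cycle(current, states) != 1:
    n.evolveNet()
    states += [current]
    current = n.listvalues()
  # Record the repeated state so the printed history shows the cycle closing.
  states += [current]
  for i in states: print(i)
  n.printTypes()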

 

You can try running nets of size 10 by calling

 

generun(10)

 


Unexpected Order

When Kauffman simulated genetic nets, he observed much more order than expected.  After many randomly-initiated simulations, he concluded that a net of size N typically produces sqrt(N) cycles of average length sqrt(N).  All other states lead to one of these cycles.  This is remarkable given that there are 2^N states!
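To get a feel for the gap between these two numbers, the arithmetic can be done directly. The value N = 400 below is just an example size, and the variable names are chosen here for illustration.

```python
import math

N = 400
typical = math.sqrt(N)    # sqrt(N), the scale quoted above for cycle number and length
total_states = 2 ** N     # size of the whole gene expression space

print(typical)
print(len(str(total_states)))   # number of decimal digits in 2**N
```

For N = 400 the square root is 20, while 2^N is a number with 121 decimal digits.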

 


Further Analysis

Below are some ways you might extend this analysis.  Consider modifications to the code that would enhance functionality.

 

For example, you could modify the module so that the generun function also outputs (prints), for each gene, the identities of its two input genes and the Boolean function assigned to it.  [Hint: You may need to add one more attribute to the node class, and a new method in the net class.]

 

Your output would look like:

 

>>> generun(10)

cycle found of length:  2

path to cycle of length:  4

[0, 1, 1, 0, 1, 0, 1, 0, 1, 1]

[1, 0, 1, 1, 0, 1, 0, 1, 1, 0]

[1, 1, 1, 0, 1, 0, 0, 0, 0, 1]

[1, 1, 0, 1, 0, 0, 0, 1, 1, 0]

[1, 1, 1, 0, 1, 0, 0, 0, 1, 0]

[1, 1, 1, 1, 0, 0, 0, 1, 1, 0]

[1, 1, 1, 0, 1, 0, 0, 0, 1, 0]

id   neighb   boolean

0    (9, 1)    15

1    (0, 8)    14

2    (8, 0)    14

3    (5, 4)    13

4    (7, 0)    10

5    (7, 0)    4

6    (6, 5)    10

7    (3, 4)    3

8    (5, 1)    13

9    (7, 1)    11

>>> 

 


Assignment

 

[Please upload your two scripts from the assignment below (emailname_genenet.py and emailname_genenet2.py) to the course Moodle.]

 

1.  Copy the Python script into a new file and save it as emailname_genenet.py.  Modify the module by adding a function net_trials(ntrials, netsize) that runs the generun function ntrials times for a net of netsize genes.  (You may need to make modified versions of the generun and cycle functions, called generun2 and cycle2, that return data instead of printing it.)  net_trials should return the cycle length and path-length-to-cycle for each trial.  It should also calculate the mean and median cycle length over all the trials.
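The mean and median themselves need only a few lines. The sketch below uses a hand-written list of five cycle lengths; mean and median are hypothetical helper names, and collecting the cycle lengths from the trials is left to the exercise.

```python
def mean(values):
    # Arithmetic mean; float() keeps the division exact under Python 2 as well.
    return sum(values) / float(len(values))

def median(values):
    # Middle value of the sorted list, or the average of the two middle values.
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2.0

cycle_lengths = [3, 2, 1, 2, 1]
print(mean(cycle_lengths))
print(median(cycle_lengths))
```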

 

Your output should look like:

 

>>> net_trials(5, 10)

cyclelen leaderlen:

3     2

2     4

1     2

2     3

1     1

mean cycle length:  1.8

median cycle length:  2

>>> 

 

2.  Run net_trials for a large number of trials (e.g. 200) for a single net size (e.g. 400 genes).  What are the median and the mean cycle lengths?

 

3.  

a. Modify your code (save as emailname_genenet2.py) to disallow the f0 and f15 Boolean functions.

b. Why is this a reasonable idea? 

c. Repeat question 2 with the modified code.  Are your results (median cycle length) consistent with the observations in Kauffman's publication?  [Kauffman, S.A. (1969) Metabolic stability and epigenesis in randomly constructed genetic nets.  J. Theor. Biol. 22:437-467. http://eres.olin.wesleyan.edu/eres/coursepage.aspx?cid=1018&page=docs]

 


Copyright 2016 Wesleyan University