An Introduction to
In this lab, you will get a chance to integrate different concepts of GENE
and MAP. You have considered genes variously as genetically mappable units,
information residing on chromosomes, that which underlies phenotypes, scorable
markers, historically discovered mutants, translatable coding regions, clonable
segments of DNA, and sequenced strings of nucleotides. (You can probably add
Today, you will choose one of the markers commonly scored in Drosophila, and follow it through various databases to find
its cytological map position,
its genetic map position,
what is known about alleles of the gene,
what other genes are known to be nearby,
the gene's DNA sequence, and how it would be cut by a restriction enzyme
(useful if you wanted to clone it into a vector),
the gene's protein sequence (by conceptual translation),
whether there are recognizable features in the sequence, and what their likely
functions are,
whether there are similar sequences (implying homologous genes) in other species,
and what is known about these potential homologs.
At the end of this session, you will know more than you probably ever wanted
to know about the gene you picked, but also have a feel for the difference
between kinds of maps, the power of databases, and the strengths and weaknesses
of some biological database searching tools.
Choose a partner and one of the Drosophila genes below, and let's get started.
The major resource
you will use is FlyBase, the on-line incarnation of what used to be known as
The Redbook - a big red book called "Genetic Variations of Drosophila
melanogaster" which listed all known Drosophila mutants, and what was known about them. The FlyBase
database has all of the original Redbook information, plus molecular
information and links to other databases. There are links directly to relevant
sections of the Berkeley Drosophila
Genome Project, which has genomic sequence, cloned and sequenced cDNAs, and
annotation of predicted genes; and to searches of international sequence
databases such as SWISS-PROT and GenBank.
Open FlyBase (www.Flybase.org)
in a new window, and start by searching for your gene in the "genes"
Find your gene among the query results.
Check that the abbreviation matches, so you know you have the right gene.
- If you
are on a page with several genes in a table, click on your gene's symbol
to get to a brief report on the gene. You might bookmark this synopsis
page, since you'll want to come back to it. Information from many sources
is summarized on this page, with links to more details.
About the Gene and its Mutants.
From the top of the synopsis page, you should be able to work out the different
kinds of map information. C. B. Bridges worked out a scheme for naming the
banding patterns when he drew detailed maps of the larval salivary gland
chromosomes in the 1930's. He assigned 20 numbered sections to each major
chromosome arm - 1-20 for the X chromosome, also known as chromosome 1,
21-40 for 2L, 41-60 for 2R, 61-80 for 3L, and 81-100 for 3R, with 101 and
102 leftover for the tiny 4th chromosome. L and R denote the left and
right arms of the metacentric 2nd and 3rd chromosomes. Major bands in each
section are labeled with capital letters, and the smaller bands in between
with numbers. There is often some uncertainty in reading fine bands, so a range
may be listed. For example, 21D1-2 refers to a specific double band near
the left end of chromosome 2.
Genetic map distances are listed in recombinational map units
starting at the left tip of each chromosome.
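The numbering scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this handout, and the parser assumes the short label form used in the example (section number, capital letter, band number, optional "-N" range end). Labels in other formats would need a different pattern.

```python
import re

# Section ranges for each chromosome arm, as listed in the text above.
ARM_RANGES = [
    ("X", 1, 20),
    ("2L", 21, 40),
    ("2R", 41, 60),
    ("3L", 61, 80),
    ("3R", 81, 100),
    ("4", 101, 102),
]

# Assumed label shape: number, capital letter, band, optional "-band".
BAND_RE = re.compile(r"^(\d{1,3})([A-Z])(\d+)(?:-(\d+))?$")


def arm_for_section(section):
    """Return the chromosome arm that contains a numbered section."""
    for arm, first, last in ARM_RANGES:
        if first <= section <= last:
            return arm
    raise ValueError(f"section {section} is outside 1-102")


def parse_band(label):
    """Split a label such as '21D1-2' into arm, section, letter, band range."""
    match = BAND_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized band label: {label!r}")
    section = int(match.group(1))
    first_band = int(match.group(3))
    # A single band is treated as a range of one.
    last_band = int(match.group(4)) if match.group(4) else first_band
    return {
        "arm": arm_for_section(section),
        "section": section,
        "letter": match.group(2),
        "bands": (first_band, last_band),
    }
```

For the example in the text, `parse_band("21D1-2")` reports arm 2L, section 21, letter D, and bands 1 through 2.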
On the right-hand side are different sorts of "available reports."
Use them to determine:
How many alleles of this gene are known?
When was the oldest allele found? (Follow the links back to information about
alleles - the original mutant generally has superscript "1").
The supplier of mutant fly stocks for most labs in this country is the
Bloomington Indiana Stock Center. Under "stocks", could you
order a fly with this mutation from Bloomington (if you had an account
there)? What stock number would you ask for? (There may be many, if the
marker is used in different combinations. Just list one.)
What can you tell about the phenotype of mutants, and where the gene is
- On the schematic map of the gene region, you can see
other genes that have been identified nearby on the same chromosome. Find
the nearest genes to the left and right of your gene. Click on them to
find out more about them. How were they identified? Do they have mutant
phenotypes, or are they cDNAs or genes predicted from the sequencing and annotating
About the Gene Product - the DNA and
Start by retrieving the sequence in usable format. You may be able to get it
under "transcript". Otherwise, set the choices after "Sequence:"
to TRANSCRIPT and FASTA, then click GET. That way, you'll get the
sequence uninterrupted by basepair numbering and other non-nucleotide
annotations. Again, bookmark this page for future reference. (You can also
try specific known transcripts. Just make sure it's all A, T, C, and G.
Proteins will be dealt with later.)
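If you want to check the "all A, T, C, and G" condition by computer rather than by eye, a minimal Python sketch might look like the following. The function names and the example record are invented for illustration, and the sketch assumes a single-record FASTA file with one header line starting with ">".

```python
def read_fasta_sequence(text):
    """Return the sequence from a single-record FASTA string, header removed."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # Drop the ">" header line and any blank lines, then join the rest.
    body = [line for line in lines if line and not line.startswith(">")]
    return "".join(body).upper()


def non_nucleotide_chars(seq):
    """List any characters in seq that are not A, C, G, or T."""
    return sorted(set(seq) - set("ACGT"))


# Hypothetical record, not a real FlyBase transcript.
example = ">hypothetical_transcript\nATGC\ngattaca\n"
sequence = read_fasta_sequence(example)
print(sequence, non_nucleotide_chars(sequence))
```

An empty list from `non_nucleotide_chars` means the sequence is clean. Anything else (digits, N, spaces) tells you what still needs to be stripped before pasting.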
- Open a
new window to Webcutter. This is
just one of many programs that will do an in silico restriction digest for you: that is, will find
by computer the sites that a given restriction enzyme would find and
cleave enzymatically in your sequence. Copy and paste your FASTA sequence
into the box halfway down the page, and choose enzyme conditions. Start
simple, with just EcoRI as the enzyme, to see how many EcoRI sites there
are in your gene. Now you can repeat with another enzyme, or look at many
enzymes at a time.
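To see what such a program does under the hood, here is a minimal Python sketch of an EcoRI digest. It is not Webcutter's actual code. It assumes a linear, uppercase DNA sequence and uses the EcoRI recognition site GAATTC, which is cut after the first G. Because that site is palindromic, searching one strand is enough. The function names and the test sequence are made up for illustration.

```python
def find_sites(seq, site="GAATTC"):
    """Return 0-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping sites are not missed.
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear molecule at each site.

    cut_offset is the number of bases from the start of the site to the
    cut; 1 corresponds to EcoRI cutting between G and AATTC.
    """
    cuts = [pos + cut_offset for pos in find_sites(seq, site)]
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Made-up 20-base sequence with two EcoRI sites.
demo = "AAGAATTCTTTTGAATTCGG"
print(find_sites(demo))        # [2, 12]
print(fragment_lengths(demo))  # [3, 10, 7]
```

Two sites in a linear molecule give three fragments, and the fragment lengths add up to the length of the original sequence.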
Next, look for open reading frames: an ATG followed by translatable codons. The
reading frame closes at the first in-frame stop codon. Paste your sequence
from before into the NCBI open
reading frame (ORF) finder (http://www.ncbi.nih.gov/gorf/orfig.cgi).
It will show you ORFs in all 6 possible reading frames (why 6?). You can
see the DNA->protein translation by double clicking on one of the ORF
shaded boxes (choose the longest ORF). The translation will show up in the
one-letter amino acid abbreviation code. It should match the sequence you
get from "polypeptides" on the synopsis page, but may not. (Can
you think of reasons why not?)
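The ORF definition above (an ATG, then codons up to the first in-frame stop) can be sketched in Python as follows. This is an illustration, not the NCBI ORF finder's algorithm. It assumes an uppercase sequence containing only A, C, G, and T, uses the standard genetic code, and skips any ORF that runs off the end of the sequence without a stop codon. All names are invented for this sketch.

```python
# Standard genetic code, built from the usual TCAG ordering of codons.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon, stopping before the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def find_orfs(seq):
    """Return (strand, frame, start, protein) for each complete ORF."""
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            i = frame
            while i + 3 <= len(s):
                if s[i:i + 3] != "ATG":
                    i += 3
                    continue
                # Scan forward to the first in-frame stop codon.
                j = i
                while j + 3 <= len(s) and CODON_TABLE[s[j:j + 3]] != "*":
                    j += 3
                if j + 3 <= len(s):  # a stop codon was actually found
                    orfs.append((strand, frame + 1, i, translate(s[i:j])))
                i = j + 3
    return orfs


def longest_orf(seq):
    """Return the ORF with the longest protein, or None if there is none."""
    orfs = find_orfs(seq)
    return max(orfs, key=lambda orf: len(orf[3])) if orfs else None
```

The two nested loops, over two strands and three starting offsets, are where the six frames come from. Start positions on the "-" strand are counted along the reverse complement, not along the sequence you pasted in.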
Are there recognized protein domains for your protein listed on the synopsis
page? Do they relate to the presumed function of your gene?
The Gene Ontology section is based on observed conserved protein domains. It
illustrates predicted molecular functions, biological processes, and
cellular components. By following some of the links, you will see the
shared properties predicted for this gene.
You may investigate sequence similarities between the protein product(s) of
your gene and others using BLASTP (http://www.ncbi.nlm.nih.gov/BLAST).
(BLAST is a computer algorithm for sequence alignment.) Copy the amino
acid sequence from FlyBase, then paste it into the appropriate section of
BLAST (Standard protein-protein BLAST [blastp]). The initial output shows
conserved protein domains with links. You have to click the FORMAT button
to view the detailed results of the BLASTP search, and may have to refresh
a couple times if the server is busy. You can mouse down over the matches
found, to discover their sources, and see where the similarities listed in
"protein domains" above come from. (Gene names will be at the
top as you mouse over the alignment bars.)
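To get a feel for what "local sequence alignment" means, here is a small Python sketch that scores the best-matching stretch shared by two sequences using the classic Smith-Waterman dynamic programming method. This is not how BLAST itself works: BLAST uses a faster heuristic search and scores amino acid pairs with a substitution matrix. The simple match, mismatch, and gap values below are arbitrary choices for illustration, and the peptide strings are made up.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b."""
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a fresh alignment here
                prev[j - 1] + pair,  # extend by aligning a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Made-up peptides sharing the stretch "MKTA".
print(local_alignment_score("XXMKTAYY", "ZZMKTAWW"))  # 8
```

The score never drops below zero, so unrelated flanking regions do not penalize a good shared stretch. That is the sense in which the alignment is local.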
leads you to some associated databases with information pertaining to your
gene. Try exploring these links.
[there is a report page to go with
Wesleyan University (mw/lfa)