Transcription Networks
Ildefonso Cases (CNB-CSIC)
Summary

Concepts in Transcription

Transcription Networks

Definition, properties and evolution

Transcription Networks vs Functional Networks

Evolution of Regulatory Structures

Transcription and Adaptation
Regulation of Transcription
• Results from the interaction between proteins and DNA.
• The set of proteins that bind to a gene's promoter region (directly or indirectly) determines its expression:
  - In which tissues
  - At which stage of development
  - Under which environmental conditions
  - etc.
Transcription in Bacteria
Transcription in Bacteria: Sigma Factors
Escherichia coli
  - sigma70/D
  - sigma32/H: heat shock
  - sigma24/E: ECF (extracytoplasmic function)
  - sigma28: flagellum
  - sigma38/S: stationary phase, stress
  - sigma54/N: nitrogen and others
  - fecI: iron
Pseudomonas putida: > 15
Streptomyces: > 30
Transcription in Bacteria
Transcription in Bacteria: Operons
Transcription in Eukaryotes
Transcription in Archaea
• Eukaryotic basal machinery
• Eukaryotic and bacterial regulators
Other Sources of Regulation
Elongation
mRNA stability
etc.
Transcription Networks
Network Properties
Genes regulated per regulator:
p(k) = a k^{-b}
Scale-free networks
• Resistant to error
• Sensitive to attack
[Figure: yeast, log-log plot; x axis: genes (1 to 1000); y axis: 0.01 to 1]
Guelzim et al. 2002 Nature Genet. 31:60-63
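Reading the power law on this slide as p(k) = a k^{-b}, one common way to estimate a and b is a straight-line fit in log-log space. The sketch below is only an illustration under that assumption: the function name `fit_power_law` and the synthetic data are hypothetical and are not the yeast data from Guelzim et al.

```python
import math

def fit_power_law(degrees, frequencies):
    """Least-squares fit of log p = log a - b * log k; returns (a, b)."""
    xs = [math.log(k) for k in degrees]
    ys = [math.log(p) for p in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope of the log-log line is -b, the intercept is log a.
    return math.exp(intercept), -slope

# Synthetic points generated from p(k) = 0.5 * k^-1.5 (illustrative only).
ks = [1, 2, 4, 8, 16, 32]
ps = [0.5 * k ** -1.5 for k in ks]
a, b = fit_power_law(ks, ps)
```

On real degree distributions a plain log-log regression is sensitive to the sparse tail, so the fitted exponent should be treated as a rough estimate.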
Preferential Attachment
[Figure: preferential attachment, steps 1 to 3]
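Preferential attachment can be sketched as a growth process in which each new node links to an existing node with probability proportional to that node's current degree. This is a minimal illustration under that assumption (one link per new node, fixed random seed); the function names are hypothetical and this is not a model taken from the slides.

```python
import random

def preferential_attachment(n_nodes, seed=0):
    """Grow a network node by node; each new node links to one existing node
    chosen with probability proportional to its current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Every node appears in this list once per edge end, so a uniform draw
    # from it is a degree-proportional draw over nodes.
    endpoints = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

def degrees(edges):
    """Count how many edges touch each node."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

edges = preferential_attachment(1000)
deg = degrees(edges)
```

Sorting `deg.values()` typically shows a few highly connected hubs and many nodes of degree 1, the qualitative signature of a scale-free network.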
Network evolution
Duplicated genes are often co-expressed and share regulator binding sites
van Noort et al., 2004 EMBO Rep 5(3):280-4
Binding Site Evolution
Papp et al., 2003. Trends Genet. 19:417
Motifs
Milo et al., 2002. Science 298:824
Motif Profiling
Milo et al., 2004. Science 303:1538-1542
Overlapping Motifs
Bi-fan and FFL (feed-forward loop) motifs often share nodes and edges
Dobrin et al., 2004. BMC Bioinformatics 5:10
Motif Evolution
Conant & Wagner, 2003. Nat. Genet. 34:264
Motif Properties
Shen-Orr et al., 2002. Nat. Genet. 31:64
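As an illustration of the FFL mentioned above, a feed-forward loop can be taken as three genes x, y, z with edges x→y, y→z and x→z. The brute-force sketch below enumerates such triples in a small directed network; the toy edge list and the function name are hypothetical, and real motif detection tools also compare counts against randomized networks, which this sketch does not do.

```python
from itertools import permutations

def feed_forward_loops(edges):
    """Return all (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = {n for e in edges for n in e}
    return sorted(
        (x, y, z)
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    )

# Hypothetical toy network with two FFLs that share the edge X->Z.
toy = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("X", "W"), ("W", "Z")]
ffls = feed_forward_loops(toy)
```

The two loops found in the toy network share nodes X and Z and the edge X→Z, a small example of overlapping motifs.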
Coregulation Network
gamma ≈ -1 (scale-free)
c = 0.6 (small world)
van Noort et al., 2004 EMBO Rep 5(3):280-4
Network Evolution Simulation
In the absence of selection we can reproduce a network with similar properties
van Noort et al., 2004 EMBO Rep 5(3):280-4
Transcription Network Dynamics
Endogenous
Exogenous
Luscombe et al., 2004 Nature 431:308
Combining Networks
Regulatory Networks vs. Functional Networks
Functional Associations
• Protein Complexes
  - Enzymes … Ribosomes
• Information/Biochemical Pathways
• Metabolic Programs
  - Anaerobic … Aerobic Metabolism
• Biological Processes
  - Transcription … Recombination
Relation between functional associations and co-regulation?
“co-regulated genes are functionally associated”
Precedents
• Pairs of interacting proteins are more frequent among co-expressed genes in S. cerevisiae
• 50% of the pairs of co-expressed genes belong to the same biochemical pathway in S. cerevisiae, and more than 30% in C. elegans
• In E. coli and B. subtilis, genes in operons (and thus presumably co-expressed) tend to belong to the same general class of cellular function
EcoCyc
• Protein Complexes and sub-complexes
• Biochemical Pathways
  - Pathways and Super-pathways
• Regulatory information
  - Transcription Units
  - Regulatory Proteins
• Regulons: genes directly regulated by the same protein in the same way
• Super-regulons: also include indirect interactions
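The distinction between regulons (direct targets) and super-regulons (direct plus indirect targets) can be sketched as a one-step lookup versus a graph traversal. This is a simplified illustration: the network and names below are hypothetical, and the sketch ignores the sign of regulation ("in the same way").

```python
from collections import deque

def regulon(regulator, targets_of):
    """Genes directly regulated by `regulator`."""
    return set(targets_of.get(regulator, ()))

def super_regulon(regulator, targets_of):
    """Genes reachable from `regulator` through any chain of regulation."""
    seen = set()
    queue = deque(targets_of.get(regulator, ()))
    while queue:
        gene = queue.popleft()
        if gene in seen:
            continue
        seen.add(gene)
        # Follow the targets of regulated regulators (indirect interactions).
        queue.extend(targets_of.get(gene, ()))
    return seen

# Hypothetical network: R1 regulates geneA and the regulator R2;
# R2 in turn regulates geneB.
targets_of = {"R1": ["geneA", "R2"], "R2": ["geneB"]}
direct = regulon("R1", targets_of)
extended = super_regulon("R1", targets_of)
```

Here `geneB` belongs to the super-regulon of R1 but not to its regulon, because it is reached only through R2.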
Functional Associations
• Complexes
• Pathways
• Superpathways
Regulatory Associations
• Transcription Units
• Regulons
• Super-regulons
Correlated?
Coding functional associations
[Diagram: genes A, B, C, E, F, G grouped by functional association]
     A  B  C  D
A    0  1  1  0
B    1  0  1  0
C    1  1  0  0
D    0  0  0  0
Coding Regulatory associations
[Diagram: genes A, B, C, D grouped by regulatory association]
     A  B  C  D
A    0  1  0  0
B    1  0  1  0
C    0  1  0  0
D    0  0  0  0
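One way to code such associations is a symmetric 0/1 matrix in which two genes get a 1 when they share a group (a complex, a pathway, a transcription unit, a regulon). The sketch below assumes that reading; the group memberships are hypothetical, chosen only because they reproduce the two 4x4 matrices shown on these slides.

```python
from itertools import combinations

def association_matrix(genes, groups):
    """Symmetric 0/1 matrix: entry [i][j] is 1 when genes i and j share a group."""
    index = {g: i for i, g in enumerate(genes)}
    matrix = [[0] * len(genes) for _ in genes]
    for members in groups:
        # Ignore group members that are not in the gene list.
        present = [g for g in members if g in index]
        for g1, g2 in combinations(present, 2):
            matrix[index[g1]][index[g2]] = 1
            matrix[index[g2]][index[g1]] = 1
    return matrix

genes = ["A", "B", "C", "D"]
# Hypothetical groupings: one complex {A, B, C}; two regulatory units {A, B} and {B, C}.
functional = association_matrix(genes, [["A", "B", "C"]])
regulatory = association_matrix(genes, [["A", "B"], ["B", "C"]])
```

Gene D belongs to no group in either case, so its row and column stay at zero.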
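Original Matrices and Reduced Matrices

Gene Network (original):
     A  C  D  E
A    0  1  0  1
C    1  0  1  0
D    0  1  0  1
E    1  0  1  0

Gene Network (reduced to the genes present in both matrices):
     A  C  D
A    0  1  0
C    1  0  1
D    0  1  0
Ia = 2

Functional Assoc. (original):
     A  B  C  D
A    0  0  1  1
B    0  0  1  0
C    1  1  0  1
D    1  0  1  0

Functional Assoc. (reduced):
     A  C  D
A    0  1  1
C    1  0  1
D    1  1  0
Ib = 3

Associations present in both reduced matrices:
     A  C  D
A    0  1  0
C    1  0  1
D    0  1  0
Iab = 2

Iab/Ia = 2/2 = 100%
Iab/Ib = 2/3 ≈ 67%
Ice = Ia*Ib/(N*(N-1)/2) (overlap expected by chance, with N the number of genes in the reduced matrices)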
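The worked example can be reproduced with a short sketch: restrict both matrices to the genes they share, count associated pairs in each (Ia, Ib), count pairs associated in both (Iab), and compute Ice from the formula on the slide. Taking N to be the number of shared genes is an assumption, and the function names are hypothetical.

```python
from itertools import combinations

def associated_pairs(genes, matrix, shared):
    """Pairs of shared genes that are associated (entry 1) in `matrix`."""
    idx = {g: i for i, g in enumerate(genes)}
    return {
        (g1, g2)
        for g1, g2 in combinations(sorted(shared), 2)
        if matrix[idx[g1]][idx[g2]]
    }

def overlap(genes_a, matrix_a, genes_b, matrix_b):
    """Return (Ia, Ib, Iab, Ice) for two association matrices."""
    shared = set(genes_a) & set(genes_b)
    pairs_a = associated_pairs(genes_a, matrix_a, shared)
    pairs_b = associated_pairs(genes_b, matrix_b, shared)
    n = len(shared)  # assumes at least two shared genes
    ia, ib, iab = len(pairs_a), len(pairs_b), len(pairs_a & pairs_b)
    ice = ia * ib / (n * (n - 1) / 2)
    return ia, ib, iab, ice

# Matrices from the slide example.
gene_network = (["A", "C", "D", "E"],
                [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
functional = (["A", "B", "C", "D"],
              [[0, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 1], [1, 0, 1, 0]])
ia, ib, iab, ice = overlap(*gene_network, *functional)
```

The ratio Iab/Ice then gives the "times more than expected" figure quoted on the following slides.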
Complexes vs. Transcription Units
282 genes, 87% and 85%, 80 times more than expected
Exceptions
GatC
MtlA
GatB
GatA
PtsH
PtsI
Exceptions
Evolutionary Implications?
Pathways vs. Transcription Units
330 genes, 94% and 26%, 35 times more than expected
Transcription Units per Pathway
[Figure: histogram of the number of transcription units per pathway; x axis: 1 to 27; y axis: 0 to 0.45]
[Diagram: pathway genes A, B, C, E, F, G grouped into transcription units in two arrangements, labeled 26% and 66%]
Complexes vs. Regulons
209 genes, 10% and 97%, 7 times more than expected
Pathways vs. Regulons
258 genes, 18% and 77%, 4 times more than expected
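Functional association vs. Gene Network
C = Complexes, P = Pathways, SP = Superpathways; TU = Transcription Units, RE = Regulons, SR = Super-regulons

Regulatory associations that are also functional (Iab/Ia), with fold over expected in parentheses:
      C             P             SP
TU    87% (79.2)    94% (35.4)    94% (28.0)
RE    10% (6.9)     18% (4.2)     20% (3.8)
SR    7% (4.7)      15% (3.5)     16% (3.1)

Functional associations that are also regulatory (Iab/Ib):
      C      P      SP
TU    85%    26%    20%
RE    97%    77%    71%
SR    97%    86%    78%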
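[Diagram: genes A, B, C, D assigned to RE and GR]

Regulatory associations that are also functional, with fold over expected in parentheses:
      C            P            SP
RE    10% (6.9)    18% (4.2)    20% (3.8)
GR    6% (4.1)     13% (3.2)    15% (2.8)

Functional associations that are also regulatory:
      C      P      SP
RE    97%    77%    71%
GR    97%    87%    80%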
[Figure: two stacked bar charts, 0% to 100%; the underlying counts are tabulated below]

              Complex   Pathway   Superpathway   ALL
NO            3         40        12             55
Super-reg.    0         67        0              67
Regulon       7         422       20             449
TU            164       24        0              188

              TU    Regulon   Superregulon   ALL
NO            0     705       460            1165
Super-path.   0     20        9              29
Pathway       24    422       58             504
COMPLEX       164   7         0              171
Conclusions
• Subunits of protein complexes are often in the same transcription unit
• Pathways are spread over several transcription units, which contain linear sub-pathways and are often co-regulated
• Expression of pathway branches is often coordinated
• The tighter the functional association, the tighter the mechanism of co-regulation
Evolution of regulons
Regulatory structures make functional sense
How are regulons assembled during evolution?
[Figure: Genomes A, B, C, D, F, G, H]
Sigma54
Sigma54 regulon:
“relatively easy” to predict
well distributed in the bacterial tree
good number : 10-100 per genome
Distribution of sigma54
Aquifex aeolicus
S. meliloti
M. loti
A. tumefaciens
B. melitensis
C. crescentus
R. prowazekii
R. conorii
Actinobacteria
D. radiodurans
N. meningitidis
R. solanacearum
H. pylori
C. jejuni
T. maritima
E. coli
S. typhi
Y. pestis
V. cholerae
P. aeruginosa
Buchnera sp.
H. influenzae
P. multocida
T. pallidum
B. burgdorferi
C. trachomatis
B. subtilis
L. innocua
S. aureus
S. pyogenes
M. pneumoniae
conserved sigma54-regulation
COG0174 Glutamine synthase
COG0347 Nitrogen regulatory protein PII
COG0642 Signal transduction histidine kinase
COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component
COG0834 ABC-type amino acid transport system, periplasmic component
COG1301 Na+/H+-dicarboxylate symporters
COG1815 Flagellar basal body protein
COG2513 PEP phosphonomutase and related enzymes
COG4992 Ornithine/acetylornithine aminotransferase
phylogenetic profiles
GlnA
GlnK
His-Ki
LivK
HisJ
GltP
FlgB
PrpB
ArgD
Evolution of the Sigma54 regulon
Sigma54 regulon is very dynamic
Expression of genes transcribed from sigma54 promoters is coupled to physiological conditions
Are the genes that need to be coupled to physiological conditions different in different bacterial species?
How does regulation reflect lifestyle?
Bacteria Lifestyles
• Enrichment in Transcriptional Regulators of the Pseudomonas aeruginosa Genome
Cellular Processes and Bacterial Lifestyle
• Transport, Metabolism and Transcription
• Three sets of proteins from E. coli
  - 396 Transcription-associated proteins as annotated in Swissprot
  - 548 Small-molecule Metabolism Enzymes from EcoCyc
  - 647 Transporters from EcoCyc
• Blast against all available sequenced genomes classified by lifestyle
60 genomes
15 Obligate intracellular pathogens and endosymbionts:
Buchnera sp. APS; Chlamydia pneumoniae AR39; Chlamydia pneumoniae CWL029; Chlamydia pneumoniae J138; Chlamydia trachomatis MoPn; Chlamydia trachomatis serovar D; Mycoplasma genitalium G-37; Mycoplasma pulmonis UAB CTIP; Mycobacterium leprae TN; Mycobacterium tuberculosis CDC1551; Mycobacterium tuberculosis H37Rv; Rickettsia conorii Malish 7; Rickettsia prowazekii Madrid E; Ureaplasma urealyticum serovar 3
29 Pathogens (all organisms reported to produce a disease in plants or animals):
Pseudomonas aeruginosa PAO1; Pasteurella multocida Pm70; Ralstonia solanacearum; Staphylococcus aureus Mu50; Staphylococcus aureus N315 2624; Salmonella enterica serovar Typhi CT18; Salmonella enterica serovar Typhimurium LT2; Streptococcus pneumoniae TIGR4; Streptococcus pneumoniae R6; Streptococcus pyogenes M18 MGAS8232; Streptococcus pyogenes M1 SF370; Vibrio cholerae El Tor N16961; Xylella fastidiosa 9a5c; Yersinia pestis CO92; Treponema pallidum Nichols; Agrobacterium tumefaciens C58; Borrelia burgdorferi B31; Brucella melitensis M16; Campylobacter jejuni NCTC 11168; Clostridium perfringens str. 13; Escherichia coli O157:H7 EDL933; Escherichia coli O157:H7 RIMD0509952; Fusobacterium nucleatum ATCC 25586; Haemophilus influenzae KW20; Helicobacter pylori 26695; Helicobacter pylori J99; Listeria monocytogenes EGD-e; Neisseria meningitidis MC58; Neisseria meningitidis Z2491
12 Free-living organisms:
Anabaena sp. strain PCC 7120; Bacillus subtilis 168; Caulobacter crescentus CB15; Clostridium acetobutylicum ATCC 824; Corynebacterium glutamicum; Escherichia coli MG1655; Lactococcus lactis IL1403; Listeria innocua CLIP 11262; Mesorhizobium loti MAFF303099; Sinorhizobium meliloti strain 1021; Streptomyces coelicolor A3(2); Synechocystis sp. PCC6803
4 Extremophiles:
Deinococcus radiodurans R1; Aquifex aeolicus VF5 1553; Thermotoga maritima MSB8; Bacillus halodurans C-125
The problem of phylogenetic distances
• 30 sets of randomly selected proteins
S = log2( (Hits / Hits of random set) / (∑Hits / ∑Hits of random set) )
• Negative values = UNDERREPRESENTATION
• Positive values = OVERREPRESENTATION
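Reading the score as a log2 ratio of ratios (hits for the protein set in one genome, normalized by hits for the random sets in that genome, relative to the same quantity summed over all genomes), a minimal sketch looks like this. The reading of the formula, the function name and the numbers are assumptions for illustration only.

```python
import math

def lifestyle_score(hits, random_hits, total_hits, total_random_hits):
    """log2 of (hits / random hits) in one genome over the same ratio across all genomes."""
    genome_ratio = hits / random_hits
    global_ratio = total_hits / total_random_hits
    return math.log2(genome_ratio / global_ratio)

# Hypothetical counts: the protein set finds half as many hits as the random
# sets in this genome, while across all genomes the two are equal.
under = lifestyle_score(50, 100, 1000, 1000)
over = lifestyle_score(200, 100, 1000, 1000)
```

Normalizing by the random sets is what corrects for phylogenetic distance: a genome far from E. coli yields fewer hits for any protein set, so only the excess or deficit relative to random proteins is scored.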
Transport
[Figure: histogram of Transport scores by lifestyle (free-living organisms, pathogens, extremophiles, intracellular); x axis: score bins from -1.2 to 1.2; y axis: 0 to 0.6]
Small-molecule Metabolism
[Figure: histogram of Small-molecule Metabolism scores by lifestyle (free-living organisms, pathogens, extremophiles, intracellular); x axis: score bins from -1.2 to 1.2; y axis: 0 to 0.6]
Intracellular pathogens and symbionts are enriched in small-molecule metabolism enzymes!
Transcription
[Figure: histogram of Transcription scores by lifestyle (free-living organisms, pathogens, extremophiles, intracellular); x axis: score bins from -1.2 to 1.2; y axis: 0 to 0.6]
Free-living bacteria require more regulators since they face more diverse conditions
Predictive power?
• Can we use these parameters to classify bacterial species?
Combining TRANSC & SMMB Scores
[Figure: scatter plot of SMMB Score (y axis, -1 to 1) against TRANSC Score (x axis, -1 to 1) for intracellular organisms, free-living organisms and pathogens; labeled points: HPYL, NMEN, SAUR, ECOL, PAER, SENT]
Conclusions
• Effects of bacterial lifestyle can be observed even at low resolution
• Metabolism and transcription-related protein content can be used as lifestyle descriptors to differentiate SPECIALIST and GENERALIST bacteria
Convergence between Extremophiles and Endosymbionts
[Figure: scatter plot of SMMB Score (y axis, -1 to 1) against TRANSC Score (x axis, -1 to 1) for extremophiles and intracellular organisms]
Does it hold with 114 Genomes?
June 2002: 60
June 2003: 114
[Figure: genomes by lifestyle (pathogens, intracellular, extremophiles, free living)]
• Broader phylogenetic distribution
• Broader ecological distribution
Transcription
[Figure: histogram of Transcription scores by lifestyle (free-living organisms, pathogens, extremophiles, intracellular); x axis: score bins from -1.2 to 1.2; y axis: 0 to 0.4]
Small Molecule Metabolism
[Figure: histogram of Small Molecule Metabolism scores by lifestyle (free-living organisms, pathogens, extremophiles, intracellular); x axis: score bins from -1.2 to 1.2; y axis: 0 to 0.4]
[Figure: scatter plot of Enz Score (y axis, -1 to 1) against Taps Score (x axis, -1 to 1) for pathogens, extremophiles, intracellular organisms and free-living organisms]
Sargasso Sea Metagenome
Venter et al.,2004. Science Apr 2;304(5667):66-74
1,045 Mb
1.2 million new ORFs from ~1400 different species
~140 new
metabolism: 184850 (15%)
information: 25965 (2%)
Thanks
• Adrià Garriga
• Guillermo Carbajosa
• Victor de Lorenzo (CNB)
• Christos Ouzounis (EBI-EMBL, UK)

Transcription in Bacteria: A Genomic Perspective