Transcript
Page 1: XIV. Yeast sequencing reports. Organization of the centromeric region of chromosome XIV in Saccharomyces cerevisiae

YEAST VOL. 10: 523-533 (1994)

Yeast Sequencing Reports

Organization of the Centromeric Region of Chromosome XIV in Saccharomyces cerevisiae DOMINIQUE LALO, SOPHIE STETTLER, SYLVIE MARIOTTE, EMMANUEL GENDREAU AND PIERRE THURIAUX*

Service de Biochimie et Génétique Moléculaire, Département de Biologie Cellulaire et Moléculaire, Commissariat à l'Energie Atomique, Centre d'Etudes de Saclay, Bât. 142, F 91191 Gif sur Yvette, France

Received 20 July 1993; accepted 20 October 1993

A 15.1 kb fragment of the yeast genome was allocated to the centromeric region of chromosome XIV by genetic mapping. It contained six bona fide genes, RPC34, FUN34, CIT1 (Suissa et al., 1984), RLP7, PET8 and MRP7 (Fearon and Mason, 1988) and two large open reading frames, DOM34 and TOM34. RPC34 and RLP7 define strictly essential functions, whereas CIT1, PET8 and MRP7 encode mitochondrial proteins. The PET8 product belongs to a family of mitochondrial carrier proteins. FUN34 encodes a putative transmembrane protein that is non-essential as judged from the normal growth of the fun34-::LUK18(URA3) allele, even on respirable substrates. TOM34 codes for a putative RNA binding protein, and DOM34 defines a hypothetical polypeptide of 35 kDa, with no significant homology to known proteins. The region under study also contains two divergently transcribed tDNAs, separated only by a chimeric transposable element. This tight tDNA linkage pattern is commonly encountered in yeast, and a general hypothesis is proposed for its emergence on the Saccharomyces cerevisiae genome. RPC34, RLP7, PET8 and MRP7 are unique on the yeast genome, but the remaining genes belong to an extant centromeric duplication between chromosomes III and XIV. The sequences have been deposited in the EMBL/GenBank data libraries under Accession Numbers L11277, L19167, M11344, M22116, V02536, X00782 and X63746.

KEY WORDS - Mitochondrial carriers; duplication; citrate synthase; RNA binding; ribosomes.

INTRODUCTION

We have cloned a 15.1 kb genomic fragment of Saccharomyces cerevisiae that was assigned to the centromeric region of chromosome XIV. This region has a marked similarity to the centromeric region of chromosome III, indicating an extant centromeric duplication of these chromosomes (Lalo et al., 1993a). We report here its complete DNA sequence, which contained two tDNAs, six genes, two open reading frames (ORFs) and one chimeric transposable element. Four of the genes (RPC34, CIT1, RLP7 and MRP7) were described in previous publications (Stettler et al., 1992; Suissa et al., 1984; Lalo et al., 1993b; Fearon and Mason, 1988) and will not be specifically dealt with in this paper.

*Addressee for correspondence.

MATERIALS AND METHODS Plasmids and sequencing strategy

YCp34-1 and YCp34-2 were isolated from a genomic library of strain GRF88 (Rose et al., 1987) by colony hybridization with a 1.1 kb RPC34 NcoI-NdeI probe (see Figure 1 and Stettler et al., 1992). Restriction fragments subcloned from these vectors by standard ligation techniques generated the following plasmids: pC34-4 (a 2.6 kb BamHI-EcoRI RPC34 fragment cloned in

CCC 0749-503X/94/040523-11 © 1994 by John Wiley & Sons Ltd


[Figure 1, panel A: map of the YCp34-1 and YCp34-2 inserts and of the subclones pPET8-5, pPET8-12, pC34-4, pBS34R, pPET8-10, pPET8-16 and pRL7-2, with a 1 kb scale bar and the LUK insertions 28, 11, 39 and 18.]

Figure 1. General organization of the centromeric region of chromosome XIV. (A) Subcloning. Horizontal lines symbolize the inserts of the principal plasmids used in this study. YCp34-1 and YCp34-2 are YCp50 derivatives isolated from strain GRF88 (Rose et al., 1987) by colony hybridization in E. coli, using a 1.1 kb NcoI-NdeI probe denoted by the double arrow. The relative positions of the ClaI and SalI sites of YCp50 are indicated in brackets, to give the orientation of the inserts relative to that vector. The inserts of pC34-4, pBS34R, pPET8-5, pPET8-10, pPET8-12, pPET8-16 and pRL7-2 were subcloned from YCp34-1 and YCp34-2 as described in Materials and Methods. (B) Mutational map. Triangles above the restriction map indicate the approximate positions of the rpc34-::LUK28(URA3), rpc34-::LUK11(URA3), rpc34-::LUK39(URA3) and fun34-::LUK18(URA3) insertions. Boxes below the restriction map correspond to the rpc34-Δ::HIS3 (Stettler et al., 1992), rlp7-Δ::HIS3 (Lalo et al., 1993b), cit1-Δ::LEU2 (Suissa et al., 1984) and mrp7-Δ::URA3 (Fearon and Mason, 1988) deletions. Black boxes or triangles denote lethal alleles, whilst the dotted triangle defines the viable lacZ fusion allele fun34-::LUK18. The suf10 (Cummins et al., 1981) and pet8 (Mortimer et al., 1992) mutations are indicated for the sake of completeness, but were not precisely located on the physical map. Only relevant restriction sites are indicated. The map is oriented according to the conventional orientation of the genetic map of chromosome XIV (Mortimer et al., 1992). (C) Functional map. Shaded boxes correspond to genes, open boxes to ORFs, filled boxes to tDNAs and striped boxes to transposable elements. The oval box denotes the centromere (Neitz and Carbon, 1985). Arrows indicate the transcriptional orientations. Two potential origins of replication (11/11 match with a DTTWATVTTTH consensus, see Williamson, 1985) are indicated by a star (W=A or T; D=non-C; V=non-T and H=non-G). The corresponding sequences are (from left to right): X63746, X00782, L11277, L19167, V02536 and M22116.
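The consensus search mentioned in the Figure 1 legend can be sketched in a few lines. This is only an illustration of how a degenerate consensus such as DTTWATVTTTH might be scanned on both strands; the function names and the toy sequence are hypothetical, and the paper does not describe the procedure it actually used.

```python
# Degenerate codes as defined in the Figure 1 legend, plus the plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT",   # A or T
    "D": "AGT",  # non-C
    "V": "ACG",  # non-T
    "H": "ACT",  # non-G
}
CONSENSUS = "DTTWATVTTTH"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_matches(window, consensus=CONSENSUS):
    """Count positions of the window that agree with the consensus."""
    return sum(base in IUPAC[code] for base, code in zip(window, consensus))


def scan(seq, consensus=CONSENSUS, min_matches=11):
    """Return (start, strand, matches) for every window on either strand
    that reaches min_matches; start is given in forward coordinates."""
    seq = seq.upper()
    n = len(consensus)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - n + 1):
            matches = count_matches(s[i:i + n], consensus)
            if matches >= min_matches:
                start = i if strand == "+" else len(s) - n - i
                hits.append((start, strand, matches))
    return hits


# Hypothetical toy sequence containing one perfect (11/11) match.
toy = "GGC" + "ATTTATATTTA" + "CCG"
print(scan(toy))
```

Lowering `min_matches` to 10 would also report near-matches, which is how partial ARS consensus hits are often listed.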

pUC19); pPET8-10 (a 5.6 kb SalI-ClaI FUN34-CIT1 fragment cloned in the centromeric vector pUN75 of Elledge and Davis (1988), where the SalI border originates from the YCp34-2 vector); pBS34R (a 3.4 kb EcoRI fragment overlapping FUN34 and CIT1, cloned in the pBS+ vector from Stratagene); pPET8-5 (a 2.9 kb ClaI-SalI CEN14-DOM34 fragment cloned in pUN75); pRL7-2 (a 4.8 kb HindIII RLP7-PET8 fragment cloned in pUC19); pPET8-12 (a 6.4 kb SalI-ClaI PET8-TOM34-MRP7 fragment cloned in pUN75, where the ClaI border originates from the YCp34-2 vector) and pPET8-16 (a 3.6 kb XhoI PET8-TOM34 fragment in pUN75). Note that YCp34-1, YCp34-2 and pPET8-5 are dicentric and therefore not suitable for amplification in S. cerevisiae. These overlapping inserts were shortened by further subcloning, or by nested deletion using


Table 1. Yeast strains

Strain      Genotype                                                                          Source
YNN281*     MATa ura3-52 his3-Δ200 trp1-Δ1 lys2-801 ade2-101                                  Hieter et al. (1985)
YNN282*     MATa ura3-52 his3-Δ200 trp1-Δ1 lys2-801 ade2-101                                  Hieter et al. (1985)
CD34-6a*    MATa fun34::LUK18 ura3-52 his3-Δ200 trp1-Δ1 lys2-801 ade2-101                     This work
CD34-6a*    MATa fun34::LUK18 ura3-52 his3-Δ200 trp1-Δ1 lys2-801 ade2-101                     This work
CD34.5a*    MATa ade2-101 lys2-801 his3-Δ200 trp1-Δ1 ura3-52 rpc34::LUK39(URA3)               This work
CD34.8a*    MATa ura3-52 his3-Δ200 trp1-Δ1 lys2-801 ade2-101 rpc34-::LUK28(URA3)              This work
CD34.7A*    MATa/MATα trp1-Δ1 his3-Δ200 ura3-52 ade2-101 lys2-801 RPC34+/rpc34::LUK44(URA3)   This work
CD34.1A*    MATa/MATα ade2-101 lys2-801 trp1-Δ his3-Δ200 ura3-52 RPC34/rpc34::HIS3            Stettler et al. (1992)
GRF88       MATa gal2 his4-18                                                                 Rose et al. (1987)
K393-27c    MATa ura3 his2 leu1 lys1 met4 pet8                                                Klapholz and Easton-Esposito (1982b)
DJY36       MATa prp2-1 ade1 ade2 ura3                                                        Lee et al. (1984)
D81-5c      MATa prp2-1 fun34::LUK18(URA3) pet8 ade1 ura3                                     Offspring of DJY36 x CD34-6a

*Isogenic strains.

exonuclease III digestion (Henikoff, 1984). The corresponding plasmids, amplified in the Escherichia coli strain DH5α (endA1, hsdR17, supE44, thi-1, recA1, gyrA96, relA1, ΔlacU169 (φ80lacZΔM15)), were sequenced on both strands of alkali-denatured DNA, using the dideoxynucleotide chain-termination method (Sanger et al., 1977) with a modified T7 DNA polymerase (Sequenase II, USB). This sequence partly overlaps the X00782 (Suissa et al., 1984), M11344 (Neitz and Carbon, 1985) and M22116 (Fearon and Mason, 1988) sequences. X00782 derives from strain FL100 (Suissa et al., 1984; F. Lacroute, personal communication), M11344 from strain AB320, and M22116 from strain X2180 (Snyder and Davis, 1985). The sequences were analysed with the Fasta (Pearson, 1990) and DNA Strider (Marck, 1988) software.

Yeast strains and genetic techniques

Strains are listed in Table 1. The rpc34-::LUK28(URA3), rpc34-::LUK11(URA3), rpc34-::LUK39(URA3) and fun34-::LUK18(URA3) alleles were generated on pC34-4 by in vivo mutagenesis with the URA3 Tn10LUK chimeric transposon of Huisman et al. (1987). Rpc34-::LUK28(URA3) and rpc34-::LUK39(URA3) are viable alleles corresponding respectively to a Tn10LUK insertion immediately upstream and downstream of the RPC34 coding sequence. Rpc34-::LUK11, an insertion in the middle of the coding sequence, has a lethal phenotype. Rpc34-Δ::HIS3 and rlp7-Δ::HIS3 alleles were constructed by inserting a 1.7 kb BamHI HIS3 cassette within partial deletions of RPC34 or RLP7 (Stettler et al., 1992; Lalo et al., 1993b). These constructions were transferred as heterozygous alleles to YNN281 x YNN282 diploid cells as described by Rothstein (1983). Microdissection was performed with a de Fonbrune micromanipulator. Growth media were previously described (Stettler et al., 1992). β-galactosidase assays were done on permeabilized cells (Huisman et al., 1987) grown on YPD or YPG plates with 40 mg/l of 5-bromo-4-chloro-3-indolyl β-D-galactoside.

RESULTS AND DISCUSSION Physical and genetic map

YCp34-1 and YCp34-2 delineate an insert of 15.1 kb centred on CEN14 (Figure 1). The 295 bp CEN14 fragment (M11344), previously sequenced by Neitz and Carbon (1985) from strain AB320, differed by a 3 bp change (AAA/TTT) in the present work, presumably reflecting strain polymorphism. The region also included two hitherto unmapped genes, MRP7 (M22116, Fearon and Mason, 1988), encoding a protein of the large mitochondrial ribosomal subunit, and CIT1 (X00782, Suissa et al., 1984), encoding the mitochondrial citrate synthase (E.C.4.1.3.7). Two putative ARS elements were identified, one very close


Figure 2. Similarity of the centromeric regions of chromosomes III and XIV. Filled boxes correspond to homologous genes, ORFs or pseudogenes (XYZ3 on chromosome III). Dotted boxes correspond to genes or ORFs present on one chromosome only. Arrows give transcriptional orientations (virtual in the case of the truncated delta34 element). The nomenclature of chromosome III ORFs was taken from Oliver et al. (1992). In contrast to Figure 1, chromosome XIV was inverted relative to the orientation of the genetic map (Mortimer et al., 1992), in order to align it directly with the chromosome III sequence.

to the centromere and one upstream of RPC34 (Figure 1). The latter is probably functional, since it is included in a 2.6 kb EcoRI-BamHI fragment which, when cloned into an integrative plasmid, converted it into a replicative one (data not shown). DNA sequence and mutational analysis revealed a total of six genes (RPC34, FUN34, CIT1, RLP7, PET8, MRP7), two ORFs (DOM34 and TOM34), two tDNAs and one chimeric transposable element (tau34/delta34). The codon adaptation index (Sharp and Cowe, 1991) of the eight genes and ORFs ranged between 0.11 (PET8) and 0.25 (CIT1), suggesting that the corresponding proteins were expressed at a low or moderate level. Finally, a genomic insert containing the SIS1 gene (homologous to dnaJ in E. coli, Luke et al., 1991) overlapped the region analysed in the present work (K. Arndt, personal communication), putting SIS1 approximately 1.5 kb downstream of MRP7 (data not shown).
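For readers unfamiliar with the codon adaptation index, the sketch below shows its usual definition as the geometric mean of per-codon relative adaptiveness values. The weight table is a hypothetical four-codon toy, not the yeast reference set of Sharp and Cowe (1991), and the function name is an assumption made for illustration.

```python
import math


def codon_adaptation_index(cds, weights):
    """Geometric mean of the relative adaptiveness of each codon.
    Codons missing from the weight table are skipped, as is commonly
    done for codons without synonymous alternatives."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon of the sequence is in the weight table")
    return math.exp(sum(logs) / len(logs))


# Hypothetical weights: 1.0 for the preferred codon, 0.25 for the rare one.
toy_weights = {"AAG": 1.0, "AAA": 0.25, "GAA": 1.0, "GAG": 0.25}

print(codon_adaptation_index("AAGGAA", toy_weights))  # only preferred codons
print(codon_adaptation_index("AAGGAG", toy_weights))  # one rare codon out of two
```

With a real reference table, values near 0.1 to 0.25 such as those reported here would indicate a coding sequence that uses preferred codons only sparingly.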

We have previously noted a strong homology in the centromeric regions of chromosomes III and XIV, implying that these regions are related by an extant duplication (Lalo et al., 1993a). The organization of the two regions is summarized in Figure 2. The presence of unique genes on chromosome III (e.g. CDC10 and RVS161) suggests that homologous genes were removed from chromosome XIV by deletions, or that insertional events led to a local gene transfer on chromosome III. A pericentric inversion is also required to explain the opposite transcriptional orientation of TOM34 and YCL11C.

To map fun34-::LUK18, strain CD34-6a (MATa ura3 trp1 fun34-::LUK18) was crossed to the MATa strains K393-27c (ura3 pet8) and DJY36 (ura3 prp2-1, see Table 1 for the complete genotype of these strains), and tetrads were analysed after sporulation of the corresponding diploids. The segregation of fun34-::LUK18 (monitored by its uracil prototrophy in a ura3 background) was assessed relative to TRP1 (a tight centromeric marker, Mortimer et al., 1992) and to the PET8 and PRP2 genes of chromosome XIV. The high rate of prereductional segregation of fun34-::LUK18 relative to TRP1 (33 parental ditypes, 34 recombinant ditypes and only one tetratype) indicated a strong centromeric linkage, in keeping with the physical map of Figure 1. Fun34-::LUK18 almost invariably cosegregated with pet8-1 (68 parental ditypes and one recombinant tetrad due to a 3:1 gene conversion of fun34-::LUK18), which corresponded to a genetic distance of 0.3 centimorgans. Furthermore, its

Figure 3. (A) Hydrophobicity plots of the FUN34 and YCR10C coding sequences. The distribution of hydrophobic (positive values) and hydrophilic (negative values) residues was determined using the Strider software (Marck, 1988). It is consistent with a transmembrane organization, where the two hydrophilic regions (A and C) are on opposite sides of the hypothetical membrane. The two genes are characterized by N-terminal and C-terminal hydrophilic ends (A and E), with a small internal hydrophilic region (C) between two large hydrophobic domains (B and D). The latter two domains contain stretches of amino acids that are compatible with transmembrane α helices. (B) Alignment of the predicted amino acid sequence of FUN34 and YCR10C. The alignment was generated using the Fasta software (Pearson, 1990). Identities between residues are shown by vertical lines and conservative changes by two dots. Note the less extensive similarity of the three hydrophilic domains A, C and E.


[Figure 3. (A) Hydrophobicity plots of FUN34 and YCR10C, with regions A to E marked. (B) Alignment of the predicted FUN34 (282 residues) and YCR10C (283 residues) sequences.]


[Figure 4A, table]

Protein  Description                                               Length of homology  Identity (%)  Similarity (%)  lFasta optimal score  rdf2 score
UCP      Mitochondrial uncoupling protein (H. sapiens)             286 AA              24.6          67.5            204                   24.3
MRS4     Mitochondrial RNA splicing protein (S. cerevisiae)        292 AA              28.8          61.6            202                   23.6
YMC1     Putative mitochondrial carrier protein (S. cerevisiae)    284 AA              25.4          64.4            193                   19.4
MSCH     Mitochondrial solute carrier protein (H. sapiens)         179 AA              27.4          65.9            138                   11.8
MPCP     Mitochondrial phosphate carrier protein (S. cerevisiae)   288 AA              20.1          63.9            127                   8.7
ADT2     ADP/ATP carrier protein (S. cerevisiae)                   131 AA              19.9          65.8            124                   12.5

[Figure 4B: hydrophobicity profiles plotted against residue number.]

Figure 4. (A) Similarity between the PET8 gene product and mitochondrial carrier proteins. Homology scores as calculated by the lFASTA algorithm (Pearson, 1990). Rdf2 scores give the number of standard deviations. Sources: UCP: Cassard et al., 1990; MRS4: Wiesenberger et al., 1991; YMC1: Graf et al., 1993; MSCH: Zarrilli et al., 1989; MPCP: Murakami et al., 1990 and Phelps et al., 1991; ADT2: Kolarov et al., 1990. (B) Similarity of hydrophobicity profiles. These profiles were determined by the algorithm of Kyte and Doolittle (1982), using the DNA Strider software (Marck, 1988).
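A hydrophobicity profile of the kind cited in the legends of Figures 3 and 4 can be approximated with a sliding-window average over the Kyte and Doolittle (1982) hydropathy scale. The sketch below is not the DNA Strider implementation; the window length of 9 is an assumption, and the test fragment is a short stretch shared by FUN34 and YCR10C in the Figure 3 alignment, copied from the scanned text and therefore possibly imperfect.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean hydropathy of each window of consecutive residues.
    Positive values are hydrophobic, negative values hydrophilic."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


fragment = "WGCAMFYGGLVQLIAGIWEIALENTFGGTAL"  # from the scanned alignment
profile = hydropathy_profile(fragment)
print([round(value, 2) for value in profile])
```

Stretches of roughly 20 residues with a consistently positive average are the usual hint of a membrane-spanning α helix, which is the reading applied to domains B and D in Figure 3.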

linkage to prp2-1 was 8 centimorgans (44 parental ditypes and 8 tetratypes), confirming previous mapping data (Mortimer et al., 1992).

FUN34

FUN34 encodes a polypeptide of 282 amino acids, with a predicted molecular weight of 31 kDa and a hydrophobicity profile strongly suggestive of a transmembrane protein (Figure 3). Insertional mutagenesis yielded the viable allele fun34-::LUK18(URA3), in which the insertion occurs halfway into the coding sequence, generating an in-frame fusion to the lacZ sequence of the transposon. The β-galactosidase fusion protein was

Figure 5. Properties of the TOM34 (chromosome XIV) and YCL11C (chromosome III) open reading frames. (A) Hydrophilicity profiles and distribution of charged amino acids. The shaded area corresponds to the hydrophilic amino-terminal domain. Double-headed arrows define the hypothetical RNA binding domains I, II and III. The distribution of acidic (A) and basic (B) residues was determined using the DNA Strider software (see Marck, 1988, for the symbols used). (B) Alignment of the three hypothetical RNA-binding domains, as generated by the Fasta software (Pearson, 1990). Identities between residues are shown by vertical lines and conservative changes by two dots. (C) Dot matrix alignment of the YCL11C and TOM34 coding sequences. The DNA sequences were aligned using the DNA Strider software (Marck, 1988) at a stringency level of 15 identities in windows of 23 residues. The roman numbers I, II and III denote the three internally repeated domains. (D) Alignment of the predicted products of TOM34 and of YCL11C (Oliver et al., 1992).
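The dot matrix stringency quoted in the Figure 5 legend (15 identities in windows of 23 residues) corresponds to a simple windowed comparison. The following is a minimal sketch of that idea, not the DNA Strider code; the function name and the repetitive toy sequence are hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=23, min_identities=15):
    """Return the (i, j) window start positions where seq_a[i:i+window]
    and seq_b[j:j+window] share at least min_identities identical bases."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            identities = sum(
                x == y
                for x, y in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if identities >= min_identities:
                dots.append((i, j))
    return dots


# Toy input: an 8-base unit repeated four times, compared with itself.
# Dots appear on the main diagonal and on diagonals offset by the repeat.
toy = "ACGTTGCA" * 4
dots = dot_matrix(toy, toy)
print(dots[:5])
```

Internally repeated domains such as I, II and III of TOM34 show up in this kind of plot as short diagonals parallel to the main one, offset by the spacing between the repeats.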


expressed on fermentative (YPD) or respiratory (YPG) growth media, without detectable adverse growth effect on either medium. This expression implies that the FUN34 ORF is an authentic gene, and suggests that its product is probably not a mitochondrial protein since it is not glucose-repressed. There is a very similar coding sequence, YCR10C (Skala et al., 1992), near the centromere of chromosome III. Figure 3 shows that the differences in amino acid sequences are essentially restricted to the amino-terminal end, possibly reflecting localization to distinct but so far unidentified cellular compartments. The citrate synthases encoded by the neighbouring duplicated gene pair CIT1/CIT2 are located in two distinct compartments, the mitochondrion (Suissa et al., 1984) and the peroxisomes (Lewin et al., 1990). It would be interesting to examine if the FUN34 and YCR10C products had the same distinct cellular localization.
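The tetrad counts reported for the fun34-::LUK18 crosses can be converted to map distances with the standard Perkins formula, cM = 100 (T + 6 NPD) / 2 (PD + NPD + T). The sketch below assumes this is the relation behind the quoted prp2-1 distance; the paper does not state its formula, and the function names are illustrative only.

```python
def perkins_map_distance(pd, npd, tt):
    """Map distance in centimorgans from counts of parental ditype (pd),
    non-parental ditype (npd) and tetratype (tt) tetrads."""
    total = pd + npd + tt
    if total == 0:
        raise ValueError("no tetrads scored")
    return 100.0 * (tt + 6 * npd) / (2 * total)


def tetratype_frequency(pd, npd, tt):
    """Fraction of tetratype tetrads; a low value for two unlinked markers
    indicates that both are close to their centromeres."""
    total = pd + npd + tt
    if total == 0:
        raise ValueError("no tetrads scored")
    return tt / total


# fun34-::LUK18 x prp2-1: 44 parental ditypes, 8 tetratypes.
print(f"{perkins_map_distance(44, 0, 8):.1f} cM")

# fun34-::LUK18 x TRP1 (unlinked centromere marker): 33 PD, 34 NPD, 1 T.
print(f"{tetratype_frequency(33, 34, 1):.3f}")
```

The first value, about 7.7 cM, is in line with the 8 centimorgans quoted for the prp2-1 interval. The second shows that only about 1.5% of tetrads were tetratypes relative to TRP1, which is what a tight centromere linkage predicts.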

DOM34

DOM34 is defined by an ORF potentially encoding a 311 amino acid polypeptide (35 kDa). A PAC box (Dequard-Chablat et al., 1991) is present in inverted orientation 439 bp upstream of the initiator ATG. The PAC box (TGAGATGAG) was found in front of several genes encoding RNA polymerase I and III subunits, including RPC34 (Dequard-Chablat et al., 1991), but its functional significance (if any) has not been directly established. DOM34 shows no homology to any protein currently released in databanks, but a related pseudogene (XYZ3) lies near the centromere of chromosome III (Lalo et al., 1993a). Because of its genetic location, DOM34 might correspond to the centromere-linked spo1 mutation, characterized by a defective meiosis (Klapholz and Easton-Esposito, 1982a).

PET8

The tight genetic linkage between FUN34 and PET8 suggested that the latter gene was in the region under study. This prediction was confirmed by showing that pRL7-2, pPET8-16 and pPET8-12 complement the pet8 mutation. As can be seen from Figure 1, this allows us to identify the PET8 ORF. These complementation data also show that the pet8 cells are ρ+ (since otherwise there would have been no complementation) and thus have an intact protein synthesis machinery (Myers et al., 1985). PET8 is a single-copy gene encoding a largely hydrophobic 284 amino acid product (31 kDa) with a central hydrophilic domain. It belongs to a set of proteins which, as far as analysed, are all mitochondrial carriers located in the inner membrane. Their sequence homology, although not very pronounced, is statistically significant and leads to conserved hydrophobicity profiles (Figure 4). Eight other genes of this family have been identified so far in S. cerevisiae (Lawson and Douglas, 1988; Kolarov et al., 1990; Murakami et al., 1990; Phelps et al., 1991; Wiesenberger et al., 1991; Colleaux et al., 1992; Graf et al., 1993). They presumably have an important, but still poorly understood, role in the mitochondrial transport of ions, metabolites or proteins.

TOM34

TOM34 potentially encodes a 40 kDa hydrophilic polypeptide of 429 amino acids (Figure 5). The first 100 amino-terminal residues delineate a predominantly basic region rich in arginyl and glycyl residues and also containing tyrosyl residues, which is somewhat reminiscent of the 'GAR' motif of nucleolar proteins (Lapeyre et al., 1987). Three internal repeats of ca. 70 residues have similarity to a domain shared by several RNA-binding proteins and thought to form an RNA-binding site (Bandziulis et al., 1989). This tentatively suggests an RNA-binding protein able to unwind RNAs through its basic amino-terminal domain, and possibly involved in nucleolar RNA metabolism. The polypeptide predicted from the YCL11C ORF on chromosome III is highly homologous to TOM34 (47% identical and 87% similar residues), suggesting a similar function.

A DNA alignment of the TOM34 and YCL11C ORFs mirrors the modular organization predicted for their gene products (Figure 5D). Thus, mutations accumulated in the amino-terminal region, but retained its overall hydrophilic and basic character. Mutations were also abundant in the regions of ca. 100 bp separating the three repeated domains, suggesting that the corresponding amino acids merely operate as peptidic 'hinges' between these domains. In contrast, the DNA sequences corresponding to the three internally repeated motifs were highly conserved. Remarkably, some of the mutations accumulated in the inter-domain regions were multisite events since they resulted in the insertion or deletion of amino acid stretches between the predicted polypeptides of YCL11C and TOM34.


Clusters of divergently transcribed tDNAs: a consequence of recurrent deletions between transposable elements?

The tDNAAsn gene (with a 3'TTG5' anticodon site) on chromosome XIV was identified by its sequence identity to a yeast tRNAAsn gene (Keith and Pixa, 1984), whereas the closely linked tDNAPro gene (with a 3'GGA5' anticodon site) was identical to the wild-type allele of the tRNAPro-encoding SUF2 gene on chromosome III. A frameshift suppressor allele of SUF2 bears a GGA to GGGA insertion at the anticodon site (Cummins et al., 1982). By extension, the tDNAPro gene of chromosome XIV may correspond to the SUF10 frameshift suppressor allele tightly linked to the centromere of chromosome XIV (Cummins et al., 1981). The tDNAAsn and tDNAPro genes are separated by a chimeric transposable element, tau34/delta34, consisting of a 370 bp tau sequence (Chisholm et al., 1984) located immediately upstream of the last 100 bp of a delta sequence. Tau is the long terminal repeat (LTR) of the TY4 transposon, and delta is the LTR of TY1 and TY2. Their junction in tau34/delta34 corresponds to the 5' end of the major TY1/TY2 transcript. Such nested insertions are quite common for yeast transposable elements (see Boeke and Sandmeyer, 1991 and references therein). Unlike the canonical tau elements, tau34 is not flanked by a tandem duplication of its 5 bp target sequence, which suggests that it is a chimeric sequence produced by an unequal crossover recombining two taus inserted at different target sites.

The tight linkage of the divergently transcribed tDNAAsn and tDNAPro genes, separated by a transposable element, is shared by about half of the yeast tDNA genes (Hauber et al., 1987; Oliver et al., 1992). This highly non-random gene distribution invites the speculation that deletions frequently remove the genomic regions comprised between divergently transcribed tDNAs during the evolution of the yeast genome. The strict (TY3) or loose (TY1) targeting of yeast transposons upstream of genes transcribed by RNA polymerase III (Boeke and Sandmeyer, 1991; Ji et al., 1993 and references therein) suggests a possible mechanism, where the insertions of two transposons create an opportunity for excision by unequal crossovers between tandemly repeated LTRs. Figure 6 illustrates how such events may favour the accumulation of closely linked pairs of divergently transcribed tDNAs on the yeast genome. A single


Figure 6. A hypothesis to explain the frequent occurrence of linked tDNAs with divergent transcriptional orientation on the yeast genome. Black arrows denote tDNAs and their transcriptional orientations. Open rectangles correspond to transposons, with LTR as shaded boxes. Thin arrows give the transcriptional orientation of the transposon. The small filled and open circles indicate the small 5 bp tandem duplications of the target sequence generated by the transposition event. The diagram illustrates the consequences of unequal crossover between transposons inserted upstream of tDNAs located on the same chromosome and having a tail-to-tail (A), head-to-head (B) or head-to-tail (C) transcriptional organization. Linked pairs of divergently transcribed tDNAs will be formed on the yeast genome (A), provided that the deletion of the corresponding region is not counter-selected, that transposons are preferably inserted upstream of tDNAs, and that their insertion can occur in either orientation. The latter two properties are well documented in S. cerevisiae (Boeke and Sandmeyer, 1991 and references therein). Ty3 transposons are invariably and precisely targeted at RNA polymerase III initiation sites, and Ty1/Ty2 transposons are also preferentially inserted in the upstream region of tDNAs (Boeke et al., 1993). The figure illustrates the simple situation where two transposons with identical LTRs are inserted upstream of two tDNAs initially separated by a large chromosome fragment, and where unequal crossover occurred between the LTRs to generate a chimeric solo element. However, recombination could also occur between the unique central domain of the two transposons, generating a full-sized chimeric transposon. More complex events may occur if, as is often the case, the transposons were implicated in transposition cycles leading for example to nested insertion of transposable elements within each other.

Page 10: XIV. Yeast sequencing reports. Organization of the centromeric region of chromosome XIV in Saccharomyces cerevisiae

532 D. LALO ETAL.

transposon or solo LTR should be reconstituted by the unequal crossover event, but this chimeric element should have no tandem duplication of its 5 bp target sequence, as is indeed the case for the tau34 element separating the two tDNAs on chromosome XIV.
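The unequal-crossover hypothesis of Figure 6 can be made concrete with a small toy model. The sketch below is only an illustration under stated assumptions: the chromosome is reduced to a list of labelled features, the two transposons are assumed to lie in direct orientation so that recombination between their LTRs deletes the intervening region, and the tDNA names (`A`, `B`) and the 5 bp target sequences (`GATCA`, `CTTAG`) are hypothetical placeholders, not data from this study.

```python
# Toy model of unequal crossover between two Ty elements inserted upstream of tDNAs.
# All names and 5 bp sequences below are hypothetical illustrations.

def ty_element(tsd):
    """Full-length element (LTR-core-LTR) flanked by its 5 bp target-site duplication."""
    return [f"tsd:{tsd}", "LTR", "core", "LTR", f"tsd:{tsd}"]


def tdna_with_upstream_ty(name, strand, tsd):
    """Place a Ty element in the 5' region of a tDNA.

    strand '>' means the tDNA is transcribed left to right, so its upstream
    region (and the Ty element) lies on its left; '<' is the mirror image.
    """
    gene = [f"tDNA:{name}{strand}"]
    return ty_element(tsd) + gene if strand == ">" else gene + ty_element(tsd)


def build_chromosome(left_strand, right_strand):
    """Two tDNAs, each with an upstream Ty, separated by a large spacer region."""
    return (tdna_with_upstream_ty("A", left_strand, "GATCA")
            + ["spacer"]
            + tdna_with_upstream_ty("B", right_strand, "CTTAG"))


def unequal_crossover(chrom):
    """Recombine the leftmost and rightmost LTRs (assumed to be direct repeats).

    Everything between them is deleted and a single chimeric solo LTR remains.
    """
    first = chrom.index("LTR")
    last = len(chrom) - 1 - chrom[::-1].index("LTR")
    return chrom[:first] + ["LTR"] + chrom[last + 1:]


def surviving_tdnas(chrom):
    """List the tDNAs still present on the chromosome."""
    return [f for f in chrom if f.startswith("tDNA:")]


def solo_ltr_flanks(chrom):
    """Return the two 5 bp sequences flanking the (first) LTR."""
    i = chrom.index("LTR")
    return chrom[i - 1], chrom[i + 1]


if __name__ == "__main__":
    arrangements = {"divergent": ("<", ">"),
                    "convergent": (">", "<"),
                    "tandem": (">", ">")}
    for label, (left, right) in arrangements.items():
        product = unequal_crossover(build_chromosome(left, right))
        print(label, surviving_tdnas(product), solo_ltr_flanks(product))
```

In this simplified model only the divergent arrangement retains both tDNAs after the deletion, and the resulting solo LTR is flanked by two different 5 bp sequences, one inherited from each parental insertion, which corresponds to the absence of a tandem target duplication noted for tau34.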

ACKNOWLEDGEMENTS

We thank Catherine Doira for oligonucleotide synthesis, K. Arndt for communicating results prior to publication, Catherine Jackson for carefully correcting our English and André Sentenac and Piotr Slonimski for stimulating discussions.

REFERENCES

Bandziulis, R. J., Swanson, M. S. and Dreyfuss, G. (1989). RNA binding domains of developmental factors. Genes Dev. 3, 431-437.

Boeke, J. D. and Sandmeyer, S. B. (1991). Yeast transposable elements. In Broach, J. R., Pringle, J. and Jones, E. (Eds), The Molecular Biology of the Yeast Saccharomyces. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 193-262.

Cassard, A. M., Bouillaud, F., Mettei, M. T., Hentz, E., Raimbault, S., Thomas, M. and Ricquier, D. (1990). Human uncoupling protein gene: structure, comparison with rat gene, and assignment to the long arm of chromosome 4. J. Cell. Biochem. 43, 255-264.

Chisholm, G. E., Genbauffe, F. S. and Cooper, T. G. (1984). Tau, a repeated DNA sequence in yeast. Proc. Natl Acad. Sci. USA 81, 2965-2969.

Colleaux, L., Richard, G. F., Thierry, A. and Dujon, B. (1992). Sequence of a segment of yeast chromosome XI identifies a new mitochondrial carrier, and a protein with the PAAKK motif of the H1 histones. Yeast 8, 325-336.

Cummins, C. M., Gaber, R. F., Culbertson, R. F., Mann, R. and Fink, G. R. (1981). Frameshift suppression in Saccharomyces cerevisiae. Isolation and genetic properties of group III suppressors. Genetics 95, 855-875.

Cummins, C. M., Donahue, T. F. and Culbertson, M. R. (1982). Nucleotide sequence of the SUF2 frameshift suppressor gene of Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA 79, 3565-3569.

Dequard-Chablat, M., Riva, M., Carles, C. and Sentenac, A. (1991). RPC19, the gene for a subunit common to yeast RNA polymerases A(I) and C(III). J. Biol. Chem. 266, 15300-15307.

Elledge, S. J. and Davis, R. W. (1988). A family of versatile centromeric vectors designed for use in the sectoring-shuffle mutagenesis assay in Saccharomyces cerevisiae. Gene 70, 303-312.

Fearon, K. and Mason, T. L. (1988). Structure and regulation of a nuclear gene in Saccharomyces cerevisiae that specifies MRP7, a protein of the large subunit of the mitochondrial ribosome. Mol. Cell. Biol. 8, 3636-3646.

Graf, R., Baum, B. and Braus, G. H. (1993). YMC1, a yeast gene encoding a new putative mitochondrial carrier protein. Yeast 9, 289-294.

Hauber, J., Stucka, R., Krieg, R. and Feldmann, H. (1987). Analysis of yeast chromosomal regions carrying members of the glutamate tRNA gene family: various transposable elements are associated with them. Nucl. Acids Res. 16, 10624-10633.

Henikoff, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.

Hieter, P., Mann, C., Snyder, M. and Davis, R. W. (1985). Mitotic stability of yeast chromosomes: a colony color assay that measures nondisjunction and chromosome loss. Cell 40, 381-392.

Huisman, O., Raymond, W., Froehlich, K. U., Errada, P., Kleckner, N., Botstein, D. and Hoyt, A. (1987). A Tn10-lacZ-kanR-URA3 gene fusion transposon for insertion mutagenesis and fusion analysis of yeast and bacterial genomes. Genetics 116, 191-199.

Ji, H., Moore, D. P., Blomberg, M. A., Braiteman, L. T., Voytas, D. F., Natsoulis, G. and Boeke, J. D. (1993). Hotspots for unselected Ty1 transposition events on yeast chromosome III are near tRNA genes and LTR sequences. Cell 73, 1007-1018.

Keith, G. and Pixa, G. (1984). The nucleotide sequence of asparagine tRNA from brewer's yeast. Biochimie 66, 639-643.

Klapholz, S. and Easton-Esposito, R. (1982a). Chromosomes XIV and XVII of Saccharomyces cerevisiae constitute a single linkage group. Mol. Cell. Biol. 2, 1399-1409.

Klapholz, S. and Easton-Esposito, R. (1982b). A new mapping method employing a meiotic rec mutant of yeast. Genetics 100, 387-412.

Kolarov, J., Kolarova, N. and Nelson, N. (1990). A third ADP/ATP translocator gene in yeast. J. Biol. Chem. 265, 12711-12716.

Kyte, J. and Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105-132.

Lalo, D., Stettler, S., Mariotte, S., Slonimski, P. and Thuriaux, P. (1993a). Two yeast chromosomes are related by a fossil duplication of their centromeric regions. C. R. Acad. Sci. Paris, Life Sciences 316, 367-373.

Lalo, D., Mariotte, S. and Thuriaux, P. (1993b). Two distinct yeast proteins are related to the mammalian ribosomal polypeptide L7. Yeast 9, 1085-1091.

Lapeyre, B., Bourbon, H. M. and Amalric, F. (1987). Nucleolin, the major nucleolar protein of growing eukaryotic cells: an unusual protein structure revealed by the nucleotide sequence. Proc. Natl Acad. Sci. USA 84, 1472-1476.


Lawson, J. E. and Douglas, M. G. (1988). Separate genes encode functionally equivalent ADP/ATP carrier protein in Saccharomyces cerevisiae. J. Biol. Chem. 263, 14812-14818.

Lee, M. G., Young, R. A. and Beggs, J. D. (1984). Cloning the RNA2 gene of Saccharomyces cerevisiae. EMBO J. 3, 2825-2830.

Lewin, A. S., Hines, V. and Small, G. M. (1990). Citrate synthase encoded by the CIT2 gene of Saccharomyces cerevisiae is peroxisomal. Mol. Cell. Biol. 10, 1399-1405.

Luke, M. M., Sutton, A. and Arndt, K. T. (1991). Characterization of SIS1, a Saccharomyces cerevisiae homologue of bacterial dnaJ proteins. J. Cell Biol. 114, 623-638.

Marck, C. (1988). DNA Strider: a program for the fast analysis of DNA and protein sequences on the Apple Macintosh family of computers. Nucl. Acids Res. 16, 1829-1836.

Mortimer, R. K., Contopoulou, C. R. and King, J. S. (1992). Genetic and physical maps of Saccharomyces cerevisiae. Yeast 8, 817-902.

Murakami, H., Blobel, G. and Pain, D. (1990). Isolation and characterization of the gene for a yeast mitochondrial import receptor. Nature 347, 488-491.

Myers, A. M., Pape, L. K. and Tzagoloff, A. (1985). Mitochondrial protein synthesis is required for maintenance of intact mitochondrial genome. EMBO J. 8, 2087-2092.

Neitz, M. and Carbon, J. (1985). Identification and characterization of the centromere from chromosome XIV in Saccharomyces cerevisiae. Mol. Cell. Biol. 5, 2887-2893.

Oliver, S. G., et al. (147 authors) (1992). The complete DNA sequence of yeast chromosome III. Nature 357, 38-46.

Pearson, W. R. (1990). Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183, 63-98.

Phelps, A., Schobert, C. T. and Wohlrab, H. (1991). Cloning and characterization of the mitochondrial phosphate transport gene from the yeast Saccharomyces cerevisiae. Biochemistry 30, 248-252.

Rose, M. D., Novick, P., Thomas, J. H., Botstein, D. and Fink, G. R. (1987). A Saccharomyces cerevisiae genomic plasmid bank based on a centromere-containing shuttle vector. Gene 60, 237-243.

Rothstein, L. R. (1983). One-step gene disruption in yeast. Methods Enzymol. 101, 202-210.

Sanger, F., Nicklen, S. and Coulson, A. R. (1977). DNA sequencing with chain terminating inhibitors. Proc. Natl Acad. Sci. USA 74, 5463-5467.

Sharp, P. M. and Cowe, E. (1991). Synonymous codon usage in Saccharomyces cerevisiae. Yeast 7, 657-678.

Skala, J., Purnelle, B. and Goffeau, A. (1992). The complete sequence of a 10.8 kb segment distal of SUF2 on the right arm of chromosome III from Saccharomyces cerevisiae reveals seven open reading frames including the RVS161, ADP1 and PGK genes. Yeast 8, 409-417.

Snyder, M. and Davis, R. W. (1985). In Springer, T. (Ed.), Hybridomas in the Biosciences and Medicine. Plenum Press, New York, p. 397.

Stettler, S., Mariotte, S., Riva, M., Sentenac, A. and Thuriaux, P. (1992). An essential and specific subunit of RNA polymerase III(C) is encoded by gene RPC34 in Saccharomyces cerevisiae. J. Biol. Chem. 267, 21390-21395.

Suissa, M., Suda, K. and Schatz, G. (1984). Isolation of the nuclear yeast genes for citrate synthase and fifteen other mitochondrial proteins by a new screening method. EMBO J. 3, 1773-1781.

Wiesenberger, G., Link, T. A., Von Ahsen, U., Waldherr, M. and Schweyen, R. J. (1991). MRS3 and MRS4, two suppressors of mtDNA splicing defect in yeast, are new members of the mitochondrial carrier family. J. Mol. Biol. 217, 23-37.

Williamson, D. H. (1985). The yeast ARS element, six years on: a progress report. Yeast 1, 1-14.

Zarrilli, R., Oates, E. L., McBride, O. W., Lerman, M. I., Chan, J. Y., Santisteban, P., Ursini, M. V., Notkins, A. L. and Kohn, L. D. (1989). Sequence and chromosomal assignment of a novel complementary DNA identified by immunoscreening of a thyroid expression library. Similarity to a family of mitochondrial solute carrier proteins. Mol. Endocrinol. 3, 1498-1508.

