Identification of new human cadherin genes using a combination of protein motif search and gene finding methods

JC Höng, NV Ivanov, P Hodor, M Xia, N Wei, R Blevins, D Gerhold, M Borodovsky, Y Liu
Journal of Molecular Biology, 2004, Elsevier
We have combined protein motif search and gene finding methods to identify genes encoding proteins that contain specific domains. In particular, we focused on finding new human genes encoding proteins of the cadherin superfamily, a major group of cell–cell adhesion receptors that contribute to embryonic neuronal morphogenesis. Models for three cadherin protein motifs were generated from over 100 previously annotated cadherin domains and used to search the complete translated human genome. The genomic sequence regions containing motif “hits” were analyzed by eukaryotic GeneMark.hmm to identify the exon–intron structure of new genes. Three new genes, CDH-J, PCDH-J and FAT-J, were found. The predicted proteins PCDH-J and FAT-J were classified into the protocadherin and FAT-like subfamilies, respectively, based on the number and organization of their cadherin domains and the presence of subfamily-specific conserved amino acid residues. Expression of FAT-J was detected in almost all tissues tested. The exon–intron organization of CDH-J was verified experimentally by PCR with specifically designed primers, and its tissue-specific expression was demonstrated. The described methodology can be applied to discover new genes encoding proteins from families with well-characterized structural and functional domains.
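The first stage of the approach, searching a translated genome for protein motif hits, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the three motif models were built, so a plain regular expression stands in for them here. The pattern `D.NDN` is a toy placeholder, and the function names (`translate`, `six_frame_hits`) are hypothetical. The sketch translates a DNA string in all six reading frames and reports each motif match in forward-strand coordinates; in a pipeline like the one described, such coordinates would delimit the genomic regions handed to a gene finder.

```python
import re

# Standard genetic code, built from the conventional TCAG codon ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO)
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def translate(dna: str, frame: int) -> str:
    """Translate one reading frame; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def six_frame_hits(dna: str, pattern: str):
    """Find regex motif matches in all six frames.

    Returns tuples (strand, frame, start, end, peptide), where start/end
    are 0-based, end-exclusive coordinates on the forward strand.
    """
    dna = dna.upper()
    n = len(dna)
    motif = re.compile(pattern)  # toy stand-in for a real motif model
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq, frame)
            for m in motif.finditer(protein):
                start = frame + 3 * m.start()
                end = frame + 3 * m.end()
                if strand == "-":
                    # Map reverse-strand coordinates back to the forward strand.
                    start, end = n - end, n - start
                hits.append((strand, frame, start, end, m.group()))
    return hits


if __name__ == "__main__":
    # Hypothetical fragment: "CC" + codons for D-A-N-D-N + "TTT".
    fragment = "CC" + "GATGCTAATGATAAT" + "TTT"
    for hit in six_frame_hits(fragment, "D.NDN"):
        print(hit)
```

A regular expression only captures a fixed consensus; motif models trained on many annotated domains, as described above, would score position-specific variation instead, so this sketch shows the translate-and-scan workflow rather than the sensitivity of the real search.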