Gene Information

Name : HMPREF4655_21077 (HMPREF4655_21077)
Accession : YP_005770164.1
Strain : Helicobacter pylori 35A
Genome accession: NC_017360
Putative virulence/resistance : Virulence
Product : cag pathogenicity island protein
Function : -
COG functional category : -
COG ID : -
EC number : -
Position : 1059362 - 1064755 bp
Length : 5394 bp
Strand : +
Note : COG: COG2948; Pfam: PF07337, PF03743; InterPro: IPR010853
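
The Position and Length fields above can be cross-checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (the usual convention for genome annotations) and that the final codon is a stop codon; the variable names are illustrative and not part of the record.

```python
# Values copied from the Position field of this record.
start, end = 1059362, 1064755  # assumed 1-based, inclusive genome coordinates

# Inclusive coordinates: length = end - start + 1.
gene_length_bp = end - start + 1

# A complete coding sequence should be a whole number of codons.
remainder = gene_length_bp % 3
codons = gene_length_bp // 3

# Assuming the last codon is a stop codon, it does not encode a residue.
protein_length_aa = codons - 1

print(gene_length_bp)     # 5394
print(remainder)          # 0
print(protein_length_aa)  # 1797
```

The computed 5394 bp agrees with the Length field, and 1797 residues is the protein length expected if the coding sequence ends in a single stop codon.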

DNA sequence :
ATGAATGAAGAAAACGATAAACTTGAAACTTCTAAAAAAACCCAACAACATTCACCCCAAGATTTATCCAATGAAGAAGC
AACAGAAGCCAATCACTTTGAAGATTCTTCAAAAGAATCCAAAGAAAGATCAGAACATCATCTTGACAACCCCACAGAAA
CTAAAACCAATTTTGATGAATACGAGTCAGAAGAAACCCAAACTCAAATGGATTCTGGAGGTAATGAAACTTCAGAATCT
AGCAATGGCAGTCTAGCAGACAAGTTATTCAAAAAAGCTAGAAAATTAGTTGATAATAAAAGACCTTTCACTCAGCAAAA
GAATTTAGATGAAGAAATCCAAGAACCGAACGAAGAAGACGATCAGGAAAATAATGGGTATCAAGAAGAAACTCAAATGG
ACTTAATTGATGATGAAACTTCTAAAAAAACCCAACAACATTCACCCCAAGATTTATCCAATGAAGAAGCAACAGAAGCC
AATCACTTTGAAGATTCTTCAAAAGAATCCAAAGAAAGCTCAGAACATCATCTTGACAACCCCACAGAAACTAAAACCAA
TTTTGATGAATACGAGTCAGAAGAAATAACTAACGATTCTAACGATCAAGAGATTATCAAAGGAAGCAAAAAGAAATACA
TTATTGGTGGCATTGTAGTCGCTGTTCTTATCGTGATTATTTTATTTTCTAGAAGCATTTTTCACTACTTCATGCCTTTG
GAAGATAAAAGCTCTCGTTTTAGCAAAGATAGGAATCTTTATGTCAATGATGAAATCCAAATAAGGCAAGAGTATAACCG
ATTGCTGAAAGAACGGAATGAAAAAGGCAATATGATCGATAAGAATCTTTTCTTCAATGACGATCCCAATAGAACCTTAT
ACAACTATTTGAATATTGCAGAAATTGAGGACAAAAACCCATTGAGGGACTTTTATGAATGTATTAGTAATGGTGGCAAC
TATGAAGAATGTTTGAAGCTTATCAAAGACAAAAAACTTCAAGATCAAATGAAAAAGACTTTAGAGGCTTATAATGACTG
CATCAAAAATGCCAAAACTGAAGAAGAAAGGATCAAGTGTTTAGATCTAATCAAAGATGAAAACCTGAAAAAAAGCTTAC
TGAACCAACAAAAAGTTCAAGTGGCGCTAGATTGTTTGAAAAACGCTAAAACCGATGAAGAAAGAAACGAGTGCCTAAAA
CTCATAAATGACCCTGAGATTAGAGAGAAATTCCGTAAGGAATTAGGGCTTCAAAAAGAGCTTCAAGAGTATAAGGATTG
TATCAAAAACGCCAAAACAGAAGCTGAGAAAAACGAATGCTTGAAAGGCTTGTCTAAAGAAGCTATAGAAAGATTGAAAC
AGCAAGCGCTAGATTGTTTGAAAAACGCTAAAACCGATGAAGAACGAAAAGAGTGCTTGAAAAATATTCCCCAAGACTTG
CAAAAAGAACTACTAGCTGATATGAGCGTCAAGGCTTACAAGGATTGCGTATCAAAAGCTAGGAATGAAAAAGAGAAAAA
AGAATGCGAAAAATTACTCACGCCTGAAGCGAAAAAAAAGTTAGAACAACAGGTTCTAGATTGTTTGAAAAACGCTAAAA
CTGATGAAGAACGAAAAAAATGTTTGAAAGATCTCCCTAAAGACTTACAAAGCGATATTTTAGCCAAAGAAAGTCTTAAA
GCTTATAAAGACTGCGTTTCAAGAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGAAGTTACTCACGCCTGAAGCGAA
AAAACTTTTAGAAGAAGAAGCCAAAGAGAGCGTTAAGGCTTATTTGGACTGCGTATCTCAAGCCAAAACTGAAGCTGAGA
AAAAAGAATGCGAGAAATTGCTCACGCCTGAAGCGAAAAAAAAGTTAGAAGAAGCTAAAAAAAGCGTTAAGGCTTATTTG
GATTGCGTATCTCAAGCCAAAACTGAAGCTGAGAAAAAAGAATGCGAGAAATTACTCACGCCTGAAGCAAAAAAGCTTTT
AGAGCGACAAGCGCTAGATTGTTTGAAAAACGCTAAAACTGATGAAGAACGAAAAAAGTGTTTGAAAGATCTCCCTAAAG
ACTTGCAGAAAAAGGTTTTAGCCAAAGAGAGCGTTAAGGCTTACTTGGATTGCGTATCTCAAGCTAAAACTGAAGCTGAG
AAAAAAGAATGCGAGAAATTACTCACCCCTGAAGCGAAAAAGCTTTTAGAAGAAGCTAAAGAAAGTCTTAAAGCTTATAA
AGACTGCGTTTCAAGAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGAAGTTACTCACGCCTGAAGCGAAAAAACTTT
TAGAAGAAGAAGCCAAAGAGAGCGTTAAGGCTTATTTGGACTGCGTATCTCAAGCCAAAACTGAAGCTGAGAAAAAAGAA
TGCGAGAAATTGCTCACGCCTGAAGCGAAAAAAAAGTTAGAAGAAGCTAAAAAAAGCGTTAAGGCTTATTTGGATTGCGT
ATCTCAAGCCAAAACTGAAGCTGAGAAAAAAGAATGCGAGAAATTACTCACGCCTGAAGCAAAAAAGCTTTTAGAGCGAC
AAGCGCTAGATTGTTTGAAAAACGCTAAAACTGATGAAGAACGAAAAAGGTGTGTCAAAGATCTTCCTAAAGACTTGCAG
AAAAAGGTTTTAGCTAAAAAAAGCGTTAAAGCTTATAAAGACTGCGTTTCAAGAGCTAGGAATGAAAAAGAGAAAAAAGA
ATGCGAGAAATTACTCACCCCTGAAGCGAAAAAGCTTTTAGAAGAAGCTAAAGAAAGTCTTAAAGCTTATAAAGACTGCG
TTTCAAGAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGAAATTACTCACCCCTGAAGCGAAAAAGCTTTTAGAAGAA
GCTAAAGAAAGTCTTAAAGCTTATAAAGACTGCGTTTCAAGAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGCAATT
ACTCACCCCTGAAGCGAAAAAGCTTTTAGAGCAACAAGCGCTAGATTGTTTGAAAAACGCTAAAACCGAAGCTGAGAAAA
AAAGGTGTGTCAAAGATCTTCCTAAAGACTTGCAGAAAAAGGTTTTAGCCAAAGAGAGCGTTAAGGCTTATTTGGACTGC
GTTTCAAGAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGCAATTACTCACCCCTGAAGCGAAAAAGCTTTTAGAAGA
AGCTAAAGAGAGTCTTAAAGCTTATAAAGACTGCCTCTCTCAAGCTAGAAATGAAGAAGAAAGGAGAGCTTGCGAGAAAT
TACTCACCCCTGAAGCGAAAAAACTCTTAGAGCAAGAAGTTAAGAAGAGCGTTAAGGCTTATTTGGACTGCGTATCAAAA
GCTAGGAATGAAAAAGAGAAACAAGAATGCGAGAAATTACTCACCCCTGAAGCGAGAAAATTTTTAGCGAAGCAAGTGCT
AAGTTGTTTGGAAAAAGCTAGAAATGAAGAAGAAAGAAAAGCGTGTCTTAAAAATATCCCTAAAGACTTACAGAAAAATG
TTTTAGCTAAAGAGAGTCTTAAAGCTTATAAAGACTGCCTCTCTCAAGCTAGAAATGAAGAAGAAAGGAGAGCTTGCGAG
AAATTACTCACCCCTGAAGCGAGAAAACTCTTAGAGCAAGAAGTTAAGAAGAGCGTTAAGGCTTATTTGGACTGCGTTTC
AAGAGCTAGGAATGAAAAAGAGAAACAAGAATGCGAGAAATTACTCACCCCTGAAGCGAGAAAATTTTTAGCGAAAGAAC
TCCAACAAAAAGATAAAGCGATCAAAGATTGCTTGAAAAACGCCGATCCTAACGACAGAGCGGCTATTATGAAGTGTTTG
GATGGTTTGAGCGATGAAGAGAAGCTCAAATACCTGCAAGAAGCTAGAGAAAAGGCTGTCTTGGATTGTTTGAAAACGGC
TAGGACCGATGAAGAAAAAAGAAAATGTCAAAACCTTTATAGCGATTTGATCCAAGAAATCCAAAATAAAAGAATACAAA
GCAAACAAAATCAATTGAGTAAAACAGAAAGATTGCATCAAGCAAGCGAGTGCTTGGATAACTTAGATGACCCTACTGAT
CAAGAAGCCATAGAACAATGTTTAGAAGGCTTGAGCGATAGTGAAAGGGCGCTAATTCTAGGAATTAAACGACAAGCTGA
TGAAGTGGATCTGATTTATAGCGATCTAAGAAACCGCAAAACCTTTGATAACATGGCGGCTAAAGGTTATCCATTGTTGC
CAATGGATTTCAAAAATGGCGGCGATATTGCCACTATTAACGCCACTAATGTTGATGCGGACAAAATAGCTAGCGATAAT
CCTATTTATGCTTCCATAGAGCCTGATATTACTAAGCAATACGAAACAGAAAAAACCATTAAGGATAAGAATTTAGAAGC
TAAATTAGCTAAGGCTTTAGGTGGCAATAAAAAAGATGACGATAAAGAAAAAAGTAAAAAATCCACAGCAGAAGCTAGAG
TAGAAAGCAATAAGATAGACAAAGATGTCGCAGAAACTGCCAAAAATATCAGTGAAATCGCTCTTAAGAACAAAAAAGAA
AAGAGTGGGGAATTTGTAGATGAAAATGGTAATCCCATTGATGACAAAAAGAAAACAGAAACACAAGATGAAACAAGCCC
TGTCAAACAGGCCTTTATAGGCAAGAGTGATCCCACATTTGTTTTAGCGCAATACACCCCTATTGAAATCACTCTGACTT
CTAAAGTAGATGCCACTCTCACAGGTATAGTGAGTGGGGTTGTAGCCAAAGATGTATGGAACATGAACGGCACTATGATC
TTACTAGACAAAGGCACTAAGGTGTATGGGAATTACCAAAGCGTGAAGGGTGGCACACCCATTATGACACGCTTAATGAT
AGTCTTTACTAAAGCCATTACGCCTGATGGTGTGATAATACCTCTAGCAAACGCTCAAGCAGCAGGCATGCTGGGTGAAG
CAGGGGTAGATGGCTATGTGAATAACCACTTTATGAAGCGCATAGGCTTTGCTGTGATAGCAAGCGTGGTTAATAGCTTC
TTGCAAACTGCGCCTATCATAGCTCTAGATAAACTCATAGGCCTTGGCAAAGGTAGAAGTGAAAGGACACCTGAATTTAA
TTACGCTTTGGGTCAAGCTATCAATGGTAGTATGCAAAGTTCAGCTCAGATGTCTAATCAAATTCTAGGGCAACTGATGA
ATATCCCCCCAAGTTTTTACAAAAATGAGGGCGATAGTATTAAGATTCTCACAATGGACGACATTGATTTTAGTGGCGTG
TATGATGTTAAAATTACCAACAAATCTGTGGTAGATGAAATTATCAAACAAAGCACTAAAACTTTGTCTAGAGAGCATGA
AGAAATCACCACAAGCCCCAAAGGTGGCAATTAA

Protein sequence :
MNEENDKLETSKKTQQHSPQDLSNEEATEANHFEDSSKESKERSEHHLDNPTETKTNFDEYESEETQTQMDSGGNETSES
SNGSLADKLFKKARKLVDNKRPFTQQKNLDEEIQEPNEEDDQENNGYQEETQMDLIDDETSKKTQQHSPQDLSNEEATEA
NHFEDSSKESKESSEHHLDNPTETKTNFDEYESEEITNDSNDQEIIKGSKKKYIIGGIVVAVLIVIILFSRSIFHYFMPL
EDKSSRFSKDRNLYVNDEIQIRQEYNRLLKERNEKGNMIDKNLFFNDDPNRTLYNYLNIAEIEDKNPLRDFYECISNGGN
YEECLKLIKDKKLQDQMKKTLEAYNDCIKNAKTEEERIKCLDLIKDENLKKSLLNQQKVQVALDCLKNAKTDEERNECLK
LINDPEIREKFRKELGLQKELQEYKDCIKNAKTEAEKNECLKGLSKEAIERLKQQALDCLKNAKTDEERKECLKNIPQDL
QKELLADMSVKAYKDCVSKARNEKEKKECEKLLTPEAKKKLEQQVLDCLKNAKTDEERKKCLKDLPKDLQSDILAKESLK
AYKDCVSRARNEKEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYL
DCVSQAKTEAEKKECEKLLTPEAKKLLERQALDCLKNAKTDEERKKCLKDLPKDLQKKVLAKESVKAYLDCVSQAKTEAE
KKECEKLLTPEAKKLLEEAKESLKAYKDCVSRARNEKEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKE
CEKLLTPEAKKKLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEAKKLLERQALDCLKNAKTDEERKRCVKDLPKDLQ
KKVLAKKSVKAYKDCVSRARNEKEKKECEKLLTPEAKKLLEEAKESLKAYKDCVSRARNEKEKKECEKLLTPEAKKLLEE
AKESLKAYKDCVSRARNEKEKKECEQLLTPEAKKLLEQQALDCLKNAKTEAEKKRCVKDLPKDLQKKVLAKESVKAYLDC
VSRARNEKEKKECEQLLTPEAKKLLEEAKESLKAYKDCLSQARNEEERRACEKLLTPEAKKLLEQEVKKSVKAYLDCVSK
ARNEKEKQECEKLLTPEARKFLAKQVLSCLEKARNEEERKACLKNIPKDLQKNVLAKESLKAYKDCLSQARNEEERRACE
KLLTPEARKLLEQEVKKSVKAYLDCVSRARNEKEKQECEKLLTPEARKFLAKELQQKDKAIKDCLKNADPNDRAAIMKCL
DGLSDEEKLKYLQEAREKAVLDCLKTARTDEEKRKCQNLYSDLIQEIQNKRIQSKQNQLSKTERLHQASECLDNLDDPTD
QEAIEQCLEGLSDSERALILGIKRQADEVDLIYSDLRNRKTFDNMAAKGYPLLPMDFKNGGDIATINATNVDADKIASDN
PIYASIEPDITKQYETEKTIKDKNLEAKLAKALGGNKKDDDKEKSKKSTAEARVESNKIDKDVAETAKNISEIALKNKKE
KSGEFVDENGNPIDDKKKTETQDETSPVKQAFIGKSDPTFVLAQYTPIEITLTSKVDATLTGIVSGVVAKDVWNMNGTMI
LLDKGTKVYGNYQSVKGGTPIMTRLMIVFTKAITPDGVIIPLANAQAAGMLGEAGVDGYVNNHFMKRIGFAVIASVVNSF
LQTAPIIALDKLIGLGKGRSERTPEFNYALGQAINGSMQSSAQMSNQILGQLMNIPPSFYKNEGDSIKILTMDDIDFSGV
YDVKITNKSVVDEIIKQSTKTLSREHEEITTSPKGGN
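
The relationship between the DNA and protein sequences above can be spot-checked by translating the first and last codons. This is a minimal sketch, not the method used by the database: it builds the standard codon assignments (which bacterial translation table 11 shares with the standard code for amino acids) and translates two short excerpts copied from the DNA sequence. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard amino-acid assignments for codons enumerated in TCAG order
# (third base varies fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    for i in range(0, usable, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 and last 12 nucleotides of the DNA sequence in this record.
first_30 = "ATGAATGAAGAAAACGATAAACTTGAAACT"
last_12 = "GGTGGCAATTAA"

print(translate(first_30))  # MNEENDKLET
print(translate(last_12))   # GGN
```

Both results match the start (`MNEENDKLET`) and end (`GGN`) of the protein sequence listed above; the final `TAA` is the stop codon.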

• Homologs from PAI DB

Gene    GenBank Accn    Product                                                 Virulence or Resistance  PAI or REI  Alignment Type  E-val  Identity (%)
HP0527  BAD13833.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    98
cagY    YP_005777271.1  cag pathogenicity island protein Y VirB10-like protein  Virulence                cag PAI     Protein         0.0    98
HP0527  BAD14052.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    98
cagY    YP_005774542.1  cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    97
HP0527  BAD13970.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    96
cagY    AGC69792.1      cag pathogenicity island protein Y                      Virulence                cag PAI     Protein         0.0    96
HP0527  NP_207323.1     cag pathogenicity island protein (cag7)                 Virulence                cag PAI     Protein         0.0    95
HP0527  BAD14026.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    94
cagY    AGC69786.1      cag pathogenicity island protein Y                      Virulence                cag PAI     Protein         0.0    94
cagY    AGC69789.1      cag pathogenicity island protein Y                      Virulence                cag PAI     Protein         0.0    94
HP0527  BAD13888.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    93
cagY    YP_005775730.1  cag pathogenicity island protein Y VirB10-like protein  Virulence                cag PAI     Protein         0.0    92
HP0527  BAD13860.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    92
cagY    AGC69787.1      cag pathogenicity island protein Y                      Virulence                cag PAI     Protein         0.0    90
HP0527  BAD13806.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    87
HP0527  BAD13998.1      cag pathogenicity island protein                        Virulence                cag PAI     Protein         0.0    84

• Homologs from VFDB (virulence genes)

Gene              GenBank Accn    Product                           ID of source DB  Alignment Type  E-val  Identity (%)
HMPREF4655_21077  YP_005770164.1  cag pathogenicity island protein  VFG0287          Protein         0.0    95