Reverse output of a DNA sequence: finding inverted repeats in a DNA sequence

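The DNA sequence is fairly long, and its two flanking regions contain inverted-complementary segments at two specific positions.

The input is: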

cgtacacgagtagtcgtagctgtcagtcgatcgtacgtacgtagctgctgtagcactatcgaccccacacgtgtgtacacgatgcacagtcgtctatcacatgctagcgctgcccgtacgGATGGCCAAGGCCATCcgatcgctagctagcgccgcgcgtagcccgatcgagacatgctagcagttgtgctgatgtcgagatagctgtgatgcgatgctagcgccgcctagccgcctcgtgtaggctggatgcgatcgatcgatgctagcggcgcgatcga tgcactagcc gtagcg ct ag ct g at cg at cg ta GATGGCCAAGGCCATCc gc g tag ata c g ac a t c c gg gg gt at a taa

Here is my code:

use strict;
use warnings;

# Read the DNA sequence (first line) from the file named on the command line.
my $input = $ARGV[0];
chomp $input;
open(my $fh_in, "<", $input) or die "Cannot open $input: $!";
my $dna = <$fh_in>;
chomp $dna;

#######################################################################################

# Warn about anything that is not A, C, G or T, then upper-case the sequence.
if ($dna =~ /[^ACGT]/i) {
    print "This is not a valid DNA sequence, it has unknown base(s)\n";
}
$dna =~ tr/acgt/ACGT/;

######################################################################################

print "Minimum length of palindromic sequence?\n";
my $min = <STDIN>;
chomp $min;

print "Maximum length of palindromic sequence?\n";
my $max = <STDIN>;
chomp $max;

print "Minimum length of spacer region?\n";
my $min_spacer = <STDIN>;
chomp $min_spacer;

print "Maximum length of spacer region?\n";
my $max_spacer = <STDIN>;
chomp $max_spacer;

######################################################################################

my $dna_length = length($dna);
my ($length, $offset, $string_1, $string_2);

for ($offset = 0; $offset <= $dna_length - $max - $max - $max_spacer; $offset++) {
    for ($length = $min; $length <= $max; $length++) {
        # Left arm at the current offset, and its reverse complement.
        $string_1 = substr($dna, $offset, $length);
        $string_2 = reverse $string_1;
        $string_2 =~ tr/ACGT/TGCA/;
        if ($dna =~ /(($string_1)([ACGT]{$min_spacer,$max_spacer})($string_2))/) {
            print "IR starts at $offset => $2***$3***$4\n$1\n\n";
        }
    }
}

With the parameters:

min = 6, max = 12, min_spacer = 4, max_spacer = 12

The output I get is:

IR starts at 26 => TCGATCGATGCTAGCGGCGCGATCGA

TCGATCGATGCTAGCGGCGCGATCGA

IR starts at 27 => CGATCGATGCTAGCGGCGCGATCG

CGATCGATGCTAGCGGCGCGATCG

IR starts at 118 => CGGATGGCCAAGGCCATCCG

CGGATGGCCAAGGCCATCCG

IR starts at 118 => CGGATGGCCAAGGCCATCCG

CGGATGGCCAAGGCCATCCG

IR starts at 118 => CGGATGGCCAAGGCCATCCG

CGGATGGCCAAGGCCATCCG

IR starts at 119 => GGATGGCCAAGGCCATCC

GGATGGCCAAGGCCATCC

IR starts at 119 => GGATGGCCAAGGCCATCC

GGATGGCCAAGGCCATCC

IR starts at 120 => GATGGCCAAGGCCATC

GATGGCCAAGGCCATC

IR starts at 136 => CGATCGATGCTAGCGGCGCGATCG

CGATCGATGCTAGCGGCGCGATCG

IR starts at 164 => CGATCGATGCTAGCGGCGCGATCG

CGATCGATGCTAGCGGCGCGATCG

IR starts at 252 => CGATCGATGCTAGCGGCGCGATCG

CGATCGATGCTAGCGGCGCGATCG

IR starts at 254 => ATCGATGCTAGCGGCGCGATCGAT

ATCGATGCTAGCGGCGCGATCGAT

IR starts at 254 => ATCGATCGATGCTAGCGGCGCGATCGAT

ATCGATCGATGCTAGCGGCGCGATCGAT

IR starts at 255 => TCGATCGATGCTAGCGGCGCGATCGA

TCGATCGATGCTAGCGGCGCGATCGA

IR starts at 256 => CGATCGATGCTAGCGGCGCGATCG

CGATCGATGCTAGCGGCGCGATCG

IR starts at 258 => ATCGATGCTAGCGGCGCGATCGAT

ATCGATGCTAGCGGCGCGATCGAT

IR starts at 274 => CGATCGATGCTAGCGGCGCGATCG

CGATCGATGCTAGCGGCGCGATCG

IR starts at 276 => ATCGATGCTAGCGGCGCGATCGAT

ATCGATGCTAGCGGCGCGATCGAT

IR starts at 304 => ATCGATGCTAGCGGCGCGATCGAT

ATCGATGCTAGCGGCGCGATCGAT

IR starts at 304 => ATCGATCGATGCTAGCGGCGCGATCGAT

ATCGATCGATGCTAGCGGCGCGATCGAT

IR starts at 305 => TCGATCGATGCTAGCGGCGCGATCGA

TCGATCGATGCTAGCGGCGCGATCGA

IR starts at 306 => CGATCGATGCTAGCGGCGCGATCG

CGATCGATGCTAGCGGCGCGATCG

IR starts at 314 => GATGGCCAAGGCCATC

GATGGCCAAGGCCATC

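However, when I check the region of my first hit (shown in bold in the original input: the segment tcgatcgatgctagcggcgcgatcga), its offset does not appear to be at position 26. Can someone point out what is wrong with my code? Thanks.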
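One possible explanation, offered as a sketch rather than a confirmed diagnosis: the pattern match against $dna is not anchored to the current offset, so it scans the whole sequence and succeeds wherever the arm, spacer and reverse complement first occur, while the printed number is only the loop counter. The example below avoids this by comparing substrings at fixed positions. The helper names revcomp and find_inverted_repeats and the toy sequence are assumptions made up for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Reverse complement of an upper-case DNA string.
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# Return [offset, left arm, spacer, right arm] for every inverted repeat
# whose left arm starts exactly at the reported offset.
sub find_inverted_repeats {
    my ($dna, $min, $max, $min_spacer, $max_spacer) = @_;
    my @hits;
    my $n = length $dna;
    for my $offset (0 .. $n - 1) {
        for my $len ($min .. $max) {
            for my $gap ($min_spacer .. $max_spacer) {
                my $right_start = $offset + $len + $gap;
                # Longer spacers would run past the end as well.
                last if $right_start + $len > $n;
                my $left  = substr $dna, $offset, $len;
                my $right = substr $dna, $right_start, $len;
                push @hits, [$offset, $left, substr($dna, $offset + $len, $gap), $right]
                    if $right eq revcomp($left);
            }
        }
    }
    return @hits;
}

# Hypothetical toy input: GATGGC + AAAA + GCCATC after a 3-base prefix.
my @hits = find_inverted_repeats('TTTGATGGCAAAAGCCATCTTT', 6, 6, 4, 4);
printf "IR starts at %d => %s***%s***%s\n", @$_ for @hits;
```

On the toy input this reports a single repeat at offset 3, which is where the left arm actually begins. If the regular expression is preferred, another option would be to keep it and print $-[0] (the start of the last successful match) instead of $offset. Whitespace inside the input line would also need to be stripped before searching, since the spacer class [ACGT] does not match spaces.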
