How to extract specific data between 2 strings from a text file, sequentially or in a controlled way, when multiple such instances occur

Posted by rocky in Dev

Rocky

Sample input data file:
================

Session Initiation Protocol (REGISTER)
temp data here
Rocky1
Rocky2
Rocky3
Rocky4
CSeq: 3 REGISTER

Session Initiation Protocol (REGISTER)
temp data here
Jocky1
Jocky2
Jocky3
Jocky4
CSeq: 3 REGISTER

Session Initiation Protocol (REGISTER)
Hello
world
Bye
temp data here
CSeq: 3 REGISTER

For example, in the data above, I want to extract the data between variable 1 -> Session Initiation Protocol (REGISTER) and variable 2 -> CSeq: 3 REGISTER

temp data here

Rocky1
Rocky2
Rocky3
Rocky4

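Now, in the input file, variable 1 and variable 2 occur multiple times, each time with different data in between, so I want control over each occurrence for further processing.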

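Below is the program I use to extract the data. It extracts the data from all occurrences, but it gives me no control when I want to extract only the data up to the first occurrence of variable 1 and variable 2.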

#!/usr/bin/perl

use strict;
use warnings;
my $file = "output.txt";


my $kw1 = "Session Initiation Protocol (REGISTER)";
my $kw2 = "CSeq: 3 REGISTER";   

while (<DATA>) {

   if ( /\Q$kw2\E/ ... /\Q$kw1\E/ ) {
      print;
   }
}

Update: adding my latest attempt here

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

my $kw1 = 'Session Initiation Protocol (REGISTER)';
my $kw2 = 'CSeq: 3 REGISTER';

my $instance_counter;
my @first;
my @next;
my $myfile = "Input.txt";
open my $out_file1, '>', 'hello1.txt' or die "$!";
open my $out_file2, '>', 'hello2.txt' or die "$!";


open DATA, $myfile or die "Can't open file: $!";

while (<DATA>) {
    if (my $match = (/\Q$kw1/ .. /\Q$kw2/)) {
        ++$instance_counter if 1 == $match;

        if (1 == $instance_counter) {
            push @first, $_ if /$kw1/;

        } else {
            @next = @first if 1 == $match;
            shift @next;
            push @next , $_;
        }


    }
    print $out_file1 @first;
    print $out_file2 @next;
}

Let's say my input data is as below:

Session Initiation Protocol (REGISTER)
temp data here
Rocky1
Rocky2
Rocky3
Rocky4
I don't know the text here
CSeq: 3 REGISTER

Session Initiation Protocol (REGISTER)
temp data here
Jocky1
Jocky2
Jocky3
Jocky4
I don't know the text here
CSeq: 3 REGISTER


I want my output to look like this:

output_1.txt
temp data here
Rocky1
Rocky2
Rocky3
Rocky4
I don't know the text here

output_2.txt
temp data here
Jocky1
Jocky2
Jocky3
Jocky4
I don't know the text here


#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

my $kw1 = 'Session Initiation Protocol (REGISTER)';
my $kw2 = 'CSeq: 3 REGISTER';

my $instance_counter;
my @first;
my @next;
my $myfile = "Input.txt";
open my $out_file1, '>', 'hello1.txt' or die "$!";
open my $out_file2, '>', 'hello2.txt' or die "$!";
open my $out_file3, '>', 'hello3.txt' or die "$!";

open DATA, $myfile or die "Can't open file: $!";

while (<DATA>) {
    if (my $match = (/\Q$kw1/ .. /\Q$kw2/)) {
        ++$instance_counter if 1 == $match;

        if (1 == $instance_counter) {
            print $out_file1 $_;
        }
        elsif (2 == $instance_counter) {
            print $out_file2 $_;
        }
        else {
            print $out_file3 $_;
        }
    }
}

I now get each instance in a separate output file. Can I generalize this to any number of instances found in the file?

choroba

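Problem 1: Your range is backwards; it should start with $kw1 and end with $kw2. Also, it's not clear why you used ... instead of .., as the two expressions can never match on the same line.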
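To illustrate the difference between the two forms of the range operator, here is a small sketch. The START and END markers and the sample lines are invented for the demonstration and do not come from the question's data. With two dots the end condition is also tested on the line that opened the range; with three dots it is first tested on the following line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: the first line contains both markers.
my @lines = ("START END\n", "a\n", "END\n", "b\n");

# Two dots: the end condition is checked on the opening line as well,
# so a line matching both markers opens and closes the range at once.
sub two_dots {
    my @kept;
    for (@_) {
        push @kept, $_ if /START/ .. /END/;
    }
    return @kept;
}

# Three dots: the end condition is not checked on the opening line,
# so the range stays open until a later line matches END.
sub three_dots {
    my @kept;
    for (@_) {
        push @kept, $_ if /START/ ... /END/;
    }
    return @kept;
}

print "two dots:\n",   two_dots(@lines);    # keeps only "START END"
print "three dots:\n", three_dots(@lines);  # keeps "START END", "a", "END"
```

For the data in the question the two markers are always on different lines, so both forms would select the same lines there.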

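Note that the range operator returns the sequence number, with E0 appended on the last line of the range, so you can easily detect when the closing expression matches: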

while (<DATA>) {
    if (my $match = (/\Q$kw1/ .. /\Q$kw2/)) {
        print;
        last if $match =~ /E0/;
    }
}
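To make those return values visible, the following sketch collects what the range operator returns for each line of a shortened block. The helper name sequence_numbers and the trimmed sample lines are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $kw1 = 'Session Initiation Protocol (REGISTER)';
my $kw2 = 'CSeq: 3 REGISTER';

# Return the value of the range operator for every line inside a block.
# Lines outside a block yield an empty string and are not collected.
sub sequence_numbers {
    my @seen;
    for (@_) {
        my $match = (/\Q$kw1/ .. /\Q$kw2/);
        push @seen, $match if $match;
    }
    return @seen;
}

# Shortened, hypothetical version of one block from the question.
my @block = (
    "Session Initiation Protocol (REGISTER)\n",
    "temp data here\n",
    "Rocky1\n",
    "CSeq: 3 REGISTER\n",
    "\n",
);

print join(' ', sequence_numbers(@block)), "\n";   # prints: 1 2 3 4E0
```

The value 1 marks the opening line and the value ending in E0 marks the closing line, which is what the checks `1 == $match` and `$match =~ /E0/` rely on.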

So, to compare the later instances against the first one, you can do the following:

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

my $kw1 = 'Session Initiation Protocol (REGISTER)';
my $kw2 = 'CSeq: 3 REGISTER';

my $instance_counter;
my @first;
my @next;

while (<DATA>) {
    if (my $match = (/\Q$kw1/ .. /\Q$kw2/)) {
        ++$instance_counter if 1 == $match;

        if (1 == $instance_counter) {
            push @first, $_ if /ocky\d/;

        } else {
            @next = @first if 1 == $match;
            shift @next if /ocky\d/
                        && substr($_, 1) eq substr $next[0], 1;
        }

        if ($match =~ /E0$/ && $instance_counter > 1) {
            if (@next) {
                say scalar @next, " ockies missing in instance $instance_counter";
            } else {
                say "instance $instance_counter ok";
            }
        }
    }
}

__DATA__
Session Initiation Protocol (REGISTER)
temp data here
Rocky1
Rocky2
Rocky3
Rocky4
CSeq: 3 REGISTER

Session Initiation Protocol (REGISTER)
temp data here
Jocky1
Jocky2
Jocky3
Jocky4
CSeq: 3 REGISTER

Session Initiation Protocol (REGISTER)
Qocky1
Qocky2
Hello
world
Bye
temp data here
CSeq: 3 REGISTER
