https://doi.org/10.5281/zenodo.14747860
27 January 2025, 11:47:22 UTC

    To reference or cite this file, use the following SoftWare Hash IDentifiers (SWHIDs):

    • content: swh:1:cnt:3b5725f3a05e6fef916a635ebd5ba24703c26056
    • directory: swh:1:dir:51ea45d2c2f4e266d673273634cfdb7f0ef41b28
    • snapshot: swh:1:snp:7048fb6efbf07dc1a620fa952e9c40eaff097d6d
    • release: swh:1:rel:185dcfd622bee2de06e2003b221d684d5d3fad1a

    gcms-blank-parser.pl
    #!/usr/bin/perl -w
    
    use strict;
    
    # help: print usage unless a second argument (the data file, whose name
    # must start with "all") was given
    unless(defined($ARGV[1]) && $ARGV[1] =~ /^all/) {
        print "gcms-blank-parser.pl blankfile datafile\n";
        exit;
    }
    
    # get arguments, open the input and output files
    my $blankfile = $ARGV[0];
    my $datafile = $ARGV[1];
    my $outfile = $datafile . ".parsed.tsv";
    open(BLANKFILE, '<', $blankfile) or die "Cannot open $blankfile: $!\n";
    open(DATAFILE, '<', $datafile) or die "Cannot open $datafile: $!\n";
    open(OUTFILE, '>', $outfile) or die "Cannot open $outfile for writing: $!\n";
    print OUTFILE "Infile\tRealRT\tBlankRT\tPotentialBlankRTs\tArea\tBlankArea\tAreavsBlank\tHit1\tHit2\tHit3\n";
    
    # declare data hashes
    my %blankdata;
    my %realdata;
    
    # read blank file into a hash keyed by retention time (RT)
    while(defined(my $blankline = <BLANKFILE>)) {
        next if ($blankline =~ /RT/); # skip the header line (any line containing "RT")
        last unless ($blankline =~ /^\S/); # stop at the trailing blank line
        chomp($blankline); # drop the newline so the last field stays clean
        my @blanklinedata = split(/\t/,$blankline);
        my $blanklinert = $blanklinedata[0];
        # keep column 1 (area) and columns 2, 6 and 10 (the three hit names)
        $blankdata{$blanklinert} = [@blanklinedata[1,2,6,10]];
    }
    # for my $rt (keys %blankdata) {
    #     print "blankdata: $rt: @{$blankdata{$rt}}\n";
    # }
    
    # read data file into a hash keyed by retention time (RT)
    while(defined(my $dataline = <DATAFILE>)) {
        next if ($dataline =~ /RT/); # skip the header line (any line containing "RT")
        last unless ($dataline =~ /^\S/); # stop at the trailing blank line
        chomp($dataline); # drop the newline so the last field stays clean
        my @datalinedata = split(/\t/,$dataline);
        my $datalinert = $datalinedata[0];
        # keep column 1 (area) and columns 2, 6 and 10 (the three hit names)
        $realdata{$datalinert} = [@datalinedata[1,2,6,10]];
    }
    #for my $rt (keys %realdata) {
    #    print "realdata: $rt: @{$realdata{$rt}}\n";
    #} 
    
    # Sort the retention times (hash keys) numerically; a plain string sort
    # would put "10.2" before "9.8". For each real peak, find the blank peaks
    # whose RT is close enough to it, then pull their values from the hash
    # and compare hit names and areas.
    
    my @realrts = sort { $a <=> $b } keys %realdata;
    my @blankrts = sort { $a <=> $b } keys %blankdata;
    my @realhits;
    my @blankhits;
    my $realarea;
    my $blankarea;
    my $arearatio;
    my $blankrt;
    
    foreach my $realrt (@realrts) {
        my $lowerend = $realrt - 0.1;
        my $upperend = $realrt + 0.1;
        my $alreadyhit = 0;
        my $hitcounter = 0;
        my $hitcounterhuman = $hitcounter+1;
        # blank RTs within 0.1 of the real RT; both tests are spelled out
        # because chained comparisons (a < b < c) only parse on Perl 5.32+
        my @timehits = grep { $lowerend < $_ && $_ < $upperend } @blankrts;
        my $nohits = scalar(@timehits);
        @realhits = @{$realdata{$realrt}};
        $realarea = shift(@realhits);
        if ($nohits == 0) {
            print OUTFILE "$datafile\t$realrt\tNA\tNA\t$realarea\tNA\tNA\t$realhits[0]\t$realhits[1]\t$realhits[2]\n";
            print "\n\nNo relevant blank hits at RT $realrt, moving on...\n\n";
        } else {
            print "\n\nFound $nohits potential blank peak(s) at real RT $realrt which is/are @timehits\n";
            # try each candidate blank peak in turn until one shares a hit name
            while($alreadyhit != 1) {
                if($hitcounter < $nohits) {
                    my $timehit = $timehits[$hitcounter];
                    print "Checking peak $hitcounterhuman of $nohits at blank RT $timehit\n";
                    @blankhits = @{$blankdata{$timehit}};
                    $blankarea = shift(@blankhits);
                    if (grep({$_ eq $blankhits[0]} @realhits) or
                        grep({$_ eq $blankhits[1]} @realhits) or
                        grep({$_ eq $blankhits[2]} @realhits)) {
                        # print "Found $blankhits[0] or $blankhits[1] or $blankhits[2] in @realhits\n";
                        $arearatio = $realarea / $blankarea;
                        print OUTFILE "$datafile\t$realrt\t$timehit\t@timehits\t$realarea\t$blankarea\t$arearatio\t$realhits[0]\t$realhits[1]\t$realhits[2]\n";
                        print "Ratio of peaks is $arearatio...\n";
                        $alreadyhit = 1;
                        print "Since we found a match, going on to the next real peak...\n";
                    } else {
                        # print "Didn't see the blank names @blankhits in the real names @realhits\n";
                        print "No matches seen to $realrt for peak $hitcounterhuman\n";
                        $hitcounter++;
                        $hitcounterhuman++;
                    }
                } else {
                    # every candidate blank peak was checked and none shared a hit name
                    print "No hits found at all, going on to the next real peak...\n";
                    print OUTFILE "$datafile\t$realrt\tNA\t@timehits\t$realarea\tNA\tNA\t$realhits[0]\t$realhits[1]\t$realhits[2]\n";
                    $alreadyhit = 1;
                }
            }
        }
    }
    
    close(BLANKFILE);
    close(DATAFILE);
    close(OUTFILE);
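
    The retention-time matching step in the script above (sort the RTs, then look for blank peaks within 0.1 of each real peak) can be tried in isolation. What follows is a minimal sketch under stated assumptions, not part of the archived script: the retention times are made-up values, and `blank_rts_near` is a hypothetical helper name introduced only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the archived script): return the blank
# retention times that fall strictly inside ($rt - $window, $rt + $window),
# in ascending numeric order.
sub blank_rts_near {
    my ($rt, $window, @blankrts) = @_;
    my ($lower, $upper) = ($rt - $window, $rt + $window);
    # Two explicit comparisons joined by && work on every Perl 5 release.
    return sort { $a <=> $b } grep { $lower < $_ && $_ < $upper } @blankrts;
}

# Made-up retention times, for illustration only.
my @blankrts = ("9.95", "10.02", "10.30", "5.47");

my @near_ten = blank_rts_near("10.00", 0.1, @blankrts);
print "near 10.00: @near_ten\n";

my @near_seven = blank_rts_near("7.00", 0.1, @blankrts);
print "near 7.00: ", scalar(@near_seven), " hit(s)\n";

# A default (string) sort would place "10.02" before "5.47";
# the <=> comparator sorts the same keys numerically.
my @sorted = sort { $a <=> $b } @blankrts;
print "sorted: @sorted\n";
```

    Keeping the window test in one small function makes it straightforward to check the boundary behaviour (the comparison is strict, so a blank peak exactly 0.1 away is not counted) before running the full parser on real data.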
    
