https://github.com/nbraffman/CylK-homologs
10 September 2025, 15:48:34 UTC
To reference or cite the objects present in the Software Heritage archive, use permalinks based on SoftWare Hash IDentifiers (SWHIDs):

  • content: swh:1:cnt:c2a24dbecc31f7c3712d2d7e9a736805a3d55c0f
  • directory: swh:1:dir:03ea778787084907598b932648568a69c4d99c91
  • revision: swh:1:rev:732a38ef8ab73988203ea1933158238ea90361b0
  • snapshot: swh:1:snp:c8e163a9a31072f05ae0e334e64580bd4c6b0c6d

Tip revision: 732a38ef8ab73988203ea1933158238ea90361b0 authored by nbraffman on 29 November 2021, 18:59:23 UTC
Update README.md
script_Nterm.sh
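#!/bin/bash
#needs bash rather than plain sh: the loops below use (( )) and ${var:offset:length}
#input_all_homologs.fasta is read as two-line records: one metadata line, then one sequence line
Counter=0
rm -f out_homologs.fasta    #start from an empty output file; -f avoids an error if it does not exist yet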
while read -r line; do
    if [ "$Counter" -eq 0 ]; then #copy first line metadata of fasta file to temp.txt
        metadata=$line          #copy name to variable for log output
        echo "$line" > temp.txt     #quoted so that characters such as * are not expanded by the shell
        echo "$line" > temp.fasta
        Counter=$((Counter+1)) #increase counter index to proceed to second line
    else
        echo "$line" >> temp.txt #copy second line sequence data to temp.txt
        echo "$line" >> temp.fasta #secondary copy for binning
        Counter=$((Counter-1)) #decrease counter index back to 0 to reset for next sequence

        cat licheniforme.fasta >> temp.txt   #append reference metadata and sequence to temp.txt

        muscle -in temp.txt -out temp.afa   #run muscle alignment of sequence against ref, output temp.afa

        Counter2=$(wc -l < temp.afa)    #determine number of lines because it will vary between alignments

        #the parsing below assumes the homolog comes first in temp.afa and that both
        #aligned sequences wrap onto the same number of lines
        Counter3=0
        homolog=""              #define variables homolog and refseq to extract data as strings from alignment
        refseq=""
        while read -r line; do          #-r stops read from interpreting backslashes in the input
            Counter3=$((Counter3+1))

            if [ "$Counter3" -eq 1 ]; then #at first line of alignment file, skip metadata
                echo                                            #prints a blank line in the log
            elif [ "$Counter3" -le $((Counter2/2)) ]; then      #first half of text file (homolog of interest)
                homolog="$homolog$line"                         #concatenate string to remove returns
            elif [ "$Counter3" -gt $((Counter2/2+1)) ]; then    #second half of text file (reference seq), skipping its metadata line
                refseq="$refseq$line"                           #concatenate string to remove returns
            fi
            #echo $homolog > temp2.txt #write extracted strings to temp2.txt and temp3.txt for analysis
            #echo $refseq > temp3.txt
        done < 'temp.afa'

        hash_counter=0                                      #define variable to count breaks (gaps) in alignment from left to right
        for (( i=0; i<${#refseq}; i++ )); do                #iterate through reference sequence (string)
            if [ "$i" -le $((hash_counter+240)) ]; then     #only while the index is <= the number of breaks so far + 240
                if [ "${refseq:$i:1}" = "-" ]; then         #this is how we are defining the N-terminus, containing something between R105 and K240
                    hash_counter=$((hash_counter+1))
                fi
            fi
        done
        echo "Query Sequence: $metadata"
        echo "N-Terminal breaks in ref sequence: $hash_counter"

        hash_counter2=0
        index=0
        for (( i=0; i<${#homolog}; i++ )); do              #repeat with homolog sequence (string) over the same alignment columns as above
            if [ "$i" -le $((hash_counter+240)) ]; then    #NOTE deliberately hash_counter (reference breaks), not hash_counter2
                index=$((index+1))                         #number of alignment columns inside the window
                if [ "${homolog:$i:1}" = "-" ]; then
                    hash_counter2=$((hash_counter2+1))     #but do keep track of hash count in this sequence for analysis
                fi
            fi
        done
        echo "N-terminal breaks in query sequence: $hash_counter2"
        echo "# of residues before K240 equivalent: $((index-hash_counter2))"

        #keep the homolog only if its residues inside the window plus the reference breaks total at least 135
        if [ $((index-hash_counter2+hash_counter)) -ge 135 ]; then
            cat temp.fasta >> out_homologs.fasta
        fi
    fi
done < 'input_all_homologs.fasta'

rm -f temp.fasta temp.afa temp.txt   #clean up temporary files; -f avoids errors if one was never created
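
The gap-counting window used in script_Nterm.sh can be tried out on its own, without running muscle. The sketch below is one possible way to isolate it, not the author's code: the function names count_window_gaps and count_window_residues are hypothetical, and the window limit is passed as an argument (the script hard-codes 240) so that short toy strings can be used. It assumes the aligned sequences have already been joined into single strings, as the script does with homolog and refseq.

```bash
#!/bin/bash
# Sketch only: function names and the toy sequences are illustrative assumptions.

# Count '-' characters in an aligned sequence, scanning columns 0..(gaps + limit).
# The window grows by one column for every gap found, as in the script's first loop.
count_window_gaps() {
    local seq=$1 limit=$2 gaps=0 i
    for (( i = 0; i < ${#seq}; i++ )); do
        if (( i > gaps + limit )); then
            break
        fi
        if [[ ${seq:i:1} == "-" ]]; then
            gaps=$((gaps + 1))
        fi
    done
    echo "$gaps"
}

# Count residues (non-gap characters) of a second aligned sequence over a window
# whose width is fixed by the reference gap count, as in the script's second loop.
count_window_residues() {
    local seq=$1 limit=$2 ref_gaps=$3 cols=0 gaps=0 i
    for (( i = 0; i < ${#seq}; i++ )); do
        if (( i > ref_gaps + limit )); then
            break
        fi
        cols=$((cols + 1))
        if [[ ${seq:i:1} == "-" ]]; then
            gaps=$((gaps + 1))
        fi
    done
    echo $((cols - gaps))
}

# Toy aligned pair (made up for illustration) with a window limit of 2.
ref_gaps=$(count_window_gaps "MK-LV" 2)
residues=$(count_window_residues "M-ALV" 2 "$ref_gaps")
echo "reference gaps: $ref_gaps"
echo "query residues in window: $residues"
echo "filter score: $((residues + ref_gaps))"
```

With a limit of 240, the last value corresponds to the quantity the script compares against 135 when deciding whether to copy a homolog to out_homologs.fasta.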
