https://gitlab.com/mcoavoux/mtgpy-release-findings-2021.git
04 December 2021, 19:36:23 UTC
Permalinks (SWHIDs)
  • content: swh:1:cnt:8d15beb9e55a62a5180f107c99c995d3ffd40a30
  • directory: swh:1:dir:18c8d5ffba1b141866b03a2717f92cafe55d5c89
  • revision: swh:1:rev:c9972219cd75049269d26632d2bb79619d661298
  • snapshot: swh:1:snp:c3b19ab77fec904d36694903d5dade0c8b1c98fc
Tip revision: c9972219cd75049269d26632d2bb79619d661298 authored by mcoavoux on 20 May 2021, 13:04:44 UTC
tree.py

class Token:
    # Leaf of a tree
    header = None
    def __init__(self, token, i, features=None):
        self.token = token
        self.features = features # Currently only holds the POS tag, stored at self.features[0]
        self.i = i
        self.parent = None
    
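    def get_tag(self):
        # Returns the POS tag, or None if no features are set
        if self.features:
            return self.features[0]
        return None

    def set_tag(self, tag):
        # Sets the POS tag, creating the feature list if it is missing or empty
        if self.features:
            self.features[0] = tag
        else:
            self.features = [tag]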

    def is_leaf(self):
        return True

    def get_span(self):
        return {self.i}

    def __str__(self):
        # Uses get_tag() so that a token without a tag does not raise IndexError
        return "({} {}={})".format(self.get_tag(), self.i, self.token)

class Tree:
    def __init__(self, label, children):
        self.label = label
        self.children = sorted(children, key = lambda x: min(x.get_span()))
        self.span = {i for c in self.children for i in c.get_span()}
        self.parent = None
        for c in self.children:
            c.parent = self

    def is_leaf(self):
        assert self.children, "an internal node must have at least one child"
        return False

    def get_span(self):
        return self.span
   
    def get_yield(self, tokens):
        # Appends the leaves under this node to tokens, in place
        for c in self.children:
            if c.is_leaf():
                tokens.append(c)
            else:
                c.get_yield(tokens)
    
    def merge_unaries(self):
        # Collapse unary chains bottom-up: a node whose only child is a
        # non-leaf absorbs it, joining the labels with "@" (X -> Y gives X@Y)
        for c in self.children:
            if not c.is_leaf():
                c.merge_unaries()

        if len(self.children) == 1 and not self.children[0].is_leaf():
            child = self.children[0]
            self.label = "{}@{}".format(self.label, child.label)
            self.children = child.children
            for c in self.children:
                c.parent = self

    def expand_unaries(self):
        # Undo merge_unaries: split each "@"-joined label back into a unary chain
        for c in self.children:
            if not c.is_leaf():
                c.expand_unaries()

        if "@" in self.label:
            split_labels = self.label.split("@")
            # Build the chain from the innermost label outwards
            t = Tree(split_labels[-1], self.children)
            for label in reversed(split_labels[1:-1]):
                t = Tree(label, [t])
            self.label = split_labels[0]
            self.children = [t]
            t.parent = self
    def get_constituents(self, constituents):
        # Adds (label, sorted span) for this node and every non-leaf descendant to constituents, in place
        constituents.add((self.label, tuple(sorted(self.span))))
        for c in self.children:
            if not c.is_leaf():
                c.get_constituents(constituents)

    def __str__(self):
        return "({} {})".format(self.label, " ".join([str(c) for c in self.children]))

def get_yield(tree):
    # Returns list of tokens in the tree (in surface order)
    tokens = []
    tree.get_yield(tokens)
    return sorted(tokens, key = lambda x: min(x.get_span()))

def get_constituents(tree, filter_root=False):
    # Returns a set of constituents in the tree
    # Ignores root labels (from PTB, Negra, and Tiger corpora) if filter_root
    constituents = set()
    tree.get_constituents(constituents)
    if filter_root:
        constituents = {(c, i) for c, i in constituents if c not in {'ROOT', 'VROOT', 'TOP'}}
    return constituents
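
Usage sketch (not part of tree.py): the example below illustrates the unary collapse and expansion round trip and the (label, span) constituent tuples on a small tree with a discontinuous VP. So that it runs on its own, it uses condensed stand-ins named Leaf and Node, which are hypothetical and mirror only the parts of Token and Tree needed here (no parent pointers, no feature list). The sentence, tags, and labels are made up for illustration.

```python
# Condensed stand-ins for Token and Tree (hypothetical names), kept minimal
# so the sketch is self-contained.
class Leaf:
    def __init__(self, token, i, tag):
        self.token, self.i, self.tag = token, i, tag

    def is_leaf(self):
        return True

    def get_span(self):
        return {self.i}

    def __str__(self):
        return "({} {}={})".format(self.tag, self.i, self.token)


class Node:
    def __init__(self, label, children):
        self.label = label
        # Children are ordered by their leftmost token index
        self.children = sorted(children, key=lambda c: min(c.get_span()))
        self.span = {i for c in self.children for i in c.get_span()}

    def is_leaf(self):
        return False

    def get_span(self):
        return self.span

    def merge_unaries(self):
        for c in self.children:
            if not c.is_leaf():
                c.merge_unaries()
        if len(self.children) == 1 and not self.children[0].is_leaf():
            child = self.children[0]
            self.label = "{}@{}".format(self.label, child.label)
            self.children = child.children

    def expand_unaries(self):
        for c in self.children:
            if not c.is_leaf():
                c.expand_unaries()
        if "@" in self.label:
            labels = self.label.split("@")
            t = Node(labels[-1], self.children)
            for label in reversed(labels[1:-1]):
                t = Node(label, [t])
            self.label = labels[0]
            self.children = [t]

    def constituents(self):
        found = {(self.label, tuple(sorted(self.span)))}
        for c in self.children:
            if not c.is_leaf():
                found |= c.constituents()
        return found

    def __str__(self):
        return "({} {})".format(self.label, " ".join(str(c) for c in self.children))


# ROOT -> S is a unary chain; VP covers tokens 0 and 2 (discontinuous).
vp = Node("VP", [Leaf("what", 0, "WP"), Leaf("say", 2, "VB")])
np = Node("NP", [Leaf("you", 1, "PRP")])
root = Node("ROOT", [Node("S", [vp, np])])

before = str(root)
root.merge_unaries()
collapsed = str(root)   # ROOT and S are now a single node labelled ROOT@S
root.expand_unaries()
restored = str(root)    # the unary chain is back

print(before)
print(collapsed)
print(restored == before)
print(sorted(root.constituents()))
```

Note that NP over a single token is not collapsed: only a node whose single child is itself a non-leaf is merged. Because the chain is encoded in the label, the round trip assumes that original labels do not already contain "@".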

