https://github.com/kawu/partage4xmg
Raw File
Tip revision: 4d6e404d08f5fb5feb030480e718710313085663 authored by Jakub Waszczuk on 09 August 2019, 09:13:43 UTC
Update README.md
Tip revision: 4d6e404
README.md
ParTAGe4XMG
===========

**ParTAGe4XMG** is a command-line tool which allows to use [ParTAGe][partage], a
Haskell library dedicated to parsing *tree adjoining grammars* (TAGs), with
[XMG][xmg]-generated TAG grammars.


Installation
------------

It is recommanded to install *ParTAGe4XMG* using the
[Haskell Tool Stack][stack], which you will need to download and install on your
machine beforehand.
Then:
* Create an empty directory, which will be dedicated for the ParTAGe source code,
* Clone the `xmg` branch of the [ParTAGe][partage] repository into this empty directory,
* Clone [this][this] repository into the same directory,
* Run `stack install` in the local copy of the `partage4xmg` repository.

Under linux, you can use the following sequence of commands to set up ParTAGe4XMG:

    mkdir partage-src
    cd partage-src
    git clone -b xmg https://github.com/kawu/partage.git
    git clone https://github.com/kawu/partage4xmg.git
    cd partage4xmg
    stack install

The final command will by default install the `partage4xmg` command-line tool in
the `~/.local/bin` directory.  You can either add this directory to your `$PATH`,
or use the full path to run `partage4xmg`:

    $ ~/.local/bin/partage4xmg --help
    

### Update

To update the *ParTAGe4XMG* tool, use `git pull` in both repositories downloaded
during installation and run `stack install --force-dirty` in the local copy of
`partage4xmg`.
Under linux, assuming that you are in the `partage-src` directory, you can use
the following sequence of commands to perform the update:

    cd partage
    git pull
    cd ../partage4xmg
    git pull
    stack install --force-dirty

The usage of `--force-dirty` ensures that `stack` does not overlook any of the
modifications pulled from the upstream repositories.


Examples of usage
-----------------

Provided that you have a grammar `grammar.xml` file, a lexicon `lemma.xml` file
and a morphology `morph.xml` file, you can retrieve all the derived trees for a
given sentence using the following command:

    echo "a sentence to parse" | partage4xmg parse -g grammar.xml -l lemma.xml -m morph.xml -s S
    
where the argument of the `-s` option specifies the axiom symbol.
Note that in this mode the parser can take some time to read the grammar,
especially if the input `.xml` files are big.
If you have several sentences to parse, you can write them in a single file and
provide it as input for the parser, which will then read and process them one by
one.


### Derivations

If you want the parser to pretty print derivations rather than derived trees,
use the `-d` (`--derivations`) option, as in:

    echo "a sentence to parse" | partage4xmg parse -g grammar.xml -l lemma.xml -m morph.xml -s S -d


### Feature structures

To enable support for feature structures (FSs), use the `-u` (`--use-features`)
option. Note that at the moment the parser reports FSs only for derivations and
not for derived trees.

Note that currently the parser provides support for *flat* FSs only (with the
*top*/*bottom* distinction).


### Tokenization

*At the moment the command-line tool does not implement any smart tokenization
strategies and it assumes that you supply the input sentences with words
separated by spaces.*




[this]: https://github.com/kawu/partage4xmg
[partage]: https://github.com/kawu/partage#partage
[xmg]: http://dokufarm.phil.hhu.de/xmg/
[stack]: http://docs.haskellstack.org "Haskell Tool Stack"
back to top