https://github.com/cran/XML
Tip revision: eb6c6e359b36cd19faaf4374f6db416b3ef1b259 authored by Duncan Temple Lang on 14 September 2006, 00:00:00 UTC
version 0.99-93
Changes
Version 0.99-93
Changes from Martin Morgan
* import normalizePath from utils.
* Changes to configure.win to find 3rd party DLLs in bin/ directory, not lib/
Version 0.99-92
* Fix for setting DTD entity field uncovered by the strict type checking in R internals.
Version 0.99-91
* Added an encoding argument to saveXML(), initially for use in the Sxslt package.
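As a rough illustration of the encoding argument, the sketch below serializes a small internal document and asks for UTF-8. The sample document is made up for the example, and the code assumes a reasonably recent version of the package in which saveXML() returns a string when no file is given.

```r
library(XML)

# Parse a small, made-up document into a C-level (internal) tree.
doc <- xmlTreeParse("<greeting lang='en'>hello</greeting>",
                    asText = TRUE, useInternalNodes = TRUE)

# With no file argument, saveXML() returns the serialized document
# as a character string; encoding names the requested output encoding.
out <- saveXML(doc, encoding = "UTF-8")
cat(out)
```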
Version 0.99-9
* Example of using namespaces in getNodeSet()
* Examples for xmlHashTree().
Version 0.99-8
* Introduced initial version of flat trees for storing the DOM in a
non-hierarchical data structure in R. This allows us to work with
a mutable tree and to perform certain operations across all the
nodes more efficiently, i.e. non-recursively. Importantly, one
can find the parent node of a given node in the tree, which is not
possible with the list-of-lists approach. It does mean more
computation for some common operations, specifically parsing.
Indeed, it can be 25 times slower for a non-trivial file, i.e. one
with. However, for a file with 7700 nodes, it still only takes 2
1/2 seconds. So there is a trade-off. While there are a few
versions in the code, xmlHashTree() is the one to use for speed
reasons. xmlFlatListTree() is another and xmlFlatTree() is
excruciatingly slow. See tests/timings.R for some comparisons.
xmlGetElementsByTagName and other facilities work on these types
of trees.
More functions and methods can and should be provided to work with
these trees if they turn out to be used in any significant way.
* add the R attribute 'namespaces' to an XML node's attributes
vector so that one can differentiate between conflicting attribute
names with different namespaces.
* added parseURI() to return the elements of a URI from a string.
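The flat-tree idea in the first entry of this version can be sketched in base R. This is only a toy model of a non-hierarchical, hash-based tree, not the package's implementation of xmlHashTree(); every name in it (newFlatTree, addNode, parentOf) is hypothetical.

```r
# A toy flat tree: node names, parent links and child lists are kept in
# hashed environments keyed by node id instead of in nested lists.
# All function names here are hypothetical, chosen for illustration.
newFlatTree <- function() {
  list(nodes    = new.env(hash = TRUE, parent = emptyenv()),
       parents  = new.env(hash = TRUE, parent = emptyenv()),
       children = new.env(hash = TRUE, parent = emptyenv()))
}

addNode <- function(tree, id, name, parent = NULL) {
  assign(id, name, envir = tree$nodes)
  assign(id, character(), envir = tree$children)
  if (!is.null(parent)) {
    assign(id, parent, envir = tree$parents)
    kids <- get(parent, envir = tree$children)
    assign(parent, c(kids, id), envir = tree$children)
  }
  invisible(tree)  # environments are mutable, so the tree changes in place
}

parentOf <- function(tree, id) {
  if (exists(id, envir = tree$parents, inherits = FALSE))
    get(id, envir = tree$parents)
  else
    NA_character_
}

tree <- newFlatTree()
addNode(tree, "1", "doc")
addNode(tree, "2", "section", parent = "1")
addNode(tree, "3", "para", parent = "2")

parentOf(tree, "3")               # id of the enclosing node
get("1", envir = tree$children)   # ids of the root's children

# Visit every node without recursion by iterating over the keys.
for (id in ls(tree$nodes))
  cat(id, get(id, envir = tree$nodes), "\n")
```

Finding a parent is a single lookup here, whereas a nested list of lists has no link from a child back to its container.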
Version 0.99-7
* Example of reading HTML tables using XPath and internal nodes in bondsTables.R
* Some additional methods for XMLInternalNode.
Version 0.99-6
* configure no longer requires GNU sed; any version of sed works now that the
use of + in the regular expression has been removed.
Version 0.99-5
* Added append.XMLNode and append.xmlNode to the exported symbols from the NAMESPACE
file.
Version 0.99-4
* Fix for addComment() in xmlOutputDOM().
* Removed all the compilation warnings about interchanging xmlChar* and char*.
Version 0.99-3
* Added support in print methods for XML objects for indent = FALSE,
and tagSeparator, which defaults to "\n". These can be used to print
a faithful representation of an original XML document, but only when
used in combination with
xmlTreeParse( skipBlanks = FALSE, trim = FALSE)
Version 0.99-2
* Problems compiling with libxml2-2.5.11 and libxml2-2.6.{1,2}, so
we now test for a recent version of libxml. The test uses sed -r
which may cause problems. If one really wants to avoid the tests
set the environment variable FORCE_XML2 to any value before running
R CMD INSTALL XML.
* Documentation for getNodeSet() didn't refer to the new namespaces argument.
Version 0.99-1
* getNodeSet() takes a namespaces argument, which is a named character vector of
prefix = URI pairs for the namespaces used in the XPath expression.
* Handlers for xmlEventParse() can include startDocument and endDocument elements
to catch those particular events. Useful for closing connections and general cleanup,
especially in the "pull" data source, i.e. connections or functions.
* xmlEventParse() when called with a function as the data source now doesn't have
a new line appended to each string returned to the parser by the function.
* Passing a connection to xmlEventParse() now uses a regular R function to call
readLines(con, 1) and no longer does this via C code to call readLines().
* Fix to the example in xmlEventParse() using the state variable.
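A small sketch of the namespaces argument to getNodeSet() described in this version. The document and the namespace URI are made up for the example, and the code assumes a reasonably recent version of the package.

```r
library(XML)

# A made-up document whose elements live in a namespace.
txt <- paste0('<r:doc xmlns:r="http://www.r-project.org">',
              '<r:item>a</r:item><r:item>b</r:item></r:doc>')
doc <- xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)

# The prefix used in the XPath expression ("x") need not match the one
# in the document ("r"); only the namespace URI has to agree.
items <- getNodeSet(doc, "//x:item",
                    namespaces = c(x = "http://www.r-project.org"))
values <- sapply(items, xmlValue)
print(values)
```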
Version 0.99-0
* Implementation for the endElement in the xmlEventParse() for saxVersion == 2.
* In xmlEventParse( , saxVersion = 2), the namespaces come as a named vector
in the fourth argument.
Version 0.98-1
* Messages from errors are now more informative. Using saxVersion = 2 in xmlEventParse(), you
get the line and column information about the error.
Version 0.98
* Added saxVersion parameter to xmlEventParse() to control which interface is used at the C level.
This changes the arguments to the startElement handler, adding the namespace for the
element.
* Added xmlValidity() function to set the value of the default validity action. This allows us to do the
setting in the R code. This is currently not exported.
* Added recursive parameter to xmlElementsByTagName() function. This provides functionality
similar to getElementsByTagName() in XML parsing APIs for other languages.
* xmlTreeParse() called with no handlers and useInternalNodes returns a reference to the
C-level xmlDocPtr instance. This is an object of class "XMLInternalDocument". This can be
used in much the same way as the regular "XMLDocument" tree returned by xmlTreeParse,
e.g. xmlRoot, etc.
* Added getNodeSet() to evaluate XPath expressions on an XMLInternalDocument object.
* Added a validate parameter to the xmlEventParse() function.
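The following sketch shows the internal-document workflow from the entries above: parsing with useInternalNodes, taking the root with xmlRoot(), and evaluating an XPath expression with getNodeSet(). The sample XML is made up, and the code assumes a reasonably recent version of the package.

```r
library(XML)

txt <- "<a><b>one</b><c><b>two</b></c></a>"

# No handlers plus useInternalNodes = TRUE gives a reference to the
# C-level document rather than a tree of R objects.
doc <- xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)
print(class(doc))

# The usual accessors still apply.
rootName <- xmlName(xmlRoot(doc))

# XPath finds <b> elements at any depth.
bNodes <- getNodeSet(doc, "//b")
print(sapply(bNodes, xmlValue))
```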
Version 0.97-8
* Fix error where CDATA nodes and potentially other types of nodes (without element names) were being
omitted from the R tree in a simple call to xmlTreeParse("filename") (i.e. with no handlers).
Version 0.97-7
* Documentation updates.
Version 0.97-6
* useInternalNodes added to xmlTreeParse() and htmlTreeParse().
This allows one to avoid the overhead of converting the contents of nodes to
R objects for each handler function call. One can also access parents, siblings,
etc. from within a handler function.
* Included parameterizations for Windows from Uwe Ligges to aid automated-building
and finding the libxml DLL at run time.
Version 0.97-5
* Methods for accessing components of XMLInternalDocument and XMLInternalNode objects,
e.g. xmlName, xmlNamespace, xmlAttrs, xmlChildren
* saveXML.XMLInternalDOM now supports specification of a Doctype (see Doctype).
* saveXML uses NextMethod and arguments are transferred. Identified by Vincent Carey.
* Suppress warnings from R CMD check.
* Change of the output file in saveXML() example to avoid conflict with Microsoft
Windows use of name con.xml.
Version 0.97-4
* Quote URI values in namespace definitions in print.XMLNode.
Version 0.97-3
* Added a method for xmlRoot for HTMLDocument
* Changed the maintainer email address.
Version 0.97-2
* Added cdata to the collection of functions that are used in the handlers
for xmlEventParse(). Omission identified by Jeff Gentry.
* Fixed the maintainer email address to duncan@wald.ucdavis.edu
Version 0.97-1
* Put the correct S3method declarations in the NAMESPACE.
Version 0.97-0
* Using a NAMESPACE for the package
Version 0.96-0
* Using libxml2 by default rather than libxml.
* Fixed typo in PACKAGE when initializing the library.
Version 0.95-7
* When creating a namespace identifier, if the namespace doesn't have an href, then we put
in an <NA> string.
Version 0.95-6
* Documentation updates for synchronization with the code.
Version 0.95-5
* Trivial bug of including extra arguments in call to UseMethod for
dtdElementValidEntry that generated warnings.
Version 0.95-4
* Configuration now tries to find libxml 1, then libxml 2 unless explicitly
instructed to find libxml 2 via --with-libxml2. So the change is to pick
up libxml 2 if libxml 1 is not found rather than signal an error.
Version 0.95-3
* Remove the need to define xmlParserError. Instead, set the value of the error
routine/function pointer to our error handler in the different default handlers
in libxml. We now initialize these default objects when we load the library.
* When setting the environment variables LIBXML_INCDIR and LIBXML_LIBDIR, one
needs to specify the -I and -L prefixes for the compiler and linker respectively
in front of directory names.
* Detect whether the routine for xmlHashScan (in libxml2) provides a return value
or not. This changed in version 2.4.21 of libxml2.
Version 0.95-2
* Configuration detects Darwin and handles multiplicity of xmlParserError
symbol.
Version 0.95-1
* Configuration now supports the specification of the xml-config script
to use via the environment variable XML_CONFIG or the --with-xml-config
as in --with-xml-config=xml2-config
* Recognize file:/// prefix as URL and not switch to treating file name as
XML text.
Version 0.95-0
* Event-driven parsing (SAX) can take a connection object or a function
that is called when the parser needs more input. See the documentation
for xmlEventParse().
* Classes and methods explicitly created during the installation.
This will cause problems with namespaces until the saving of the image
model works with namespaces.
Version 0.94-1
* Minor change to configuration script to avoid -L-L in specification of
directory for XML library (libxml).
Version 0.94-0
* Use registration of C routines
* Added methods for saveXML for XMLNode and XMLOutputStream objects.
Version 0.93-4
* replaceEntities argument for xmlEventParse.
* S4 SAX methods assigned to the correct database.
Version 0.93-3
* Correct support for DTDs and namespaces in the internal nodes
used in xmlTree(). Errors identified by Vincent Carey.
Version 0.93-2
* Bug in trimming white space discovered by Ott Toomet.
Version 0.93-1
* Documentation updates. Included xmlGetAttr.Rd.
Version 0.93-0
* Added toString.XMLNode
* Fixed the printing of degenerate namespaces in an XML node,
i.e. the spurious `:'.
Version 0.92-2
* Fixed C bug caused by using namespace without a suffix,
e.g. xmlns="http:...." assumed prefix was present.
Thanks to David Meyer.
Version 0.92-1
* Display the namespace definitions when printing an XMLNode object.
* New addAttributeNamespaces argument for xmlTreeParse() that controls whether
namespaces are included in attribute names.
Version 0.92-0
* XMLNode class now contains a field for namespace definitions.
The `namespace' field is a character string identifying the prefix's
namespace. The `namespaceDefinition' field contains the full definitions
of each of the namespaces defined within a node.
* Printing of XML nodes displays the namespace.
* xmlName() takes a `full' argument that controls whether the
namespace prefix is prepended to the tag name.
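A brief sketch of the `full' argument to xmlName(). For convenience it uses an internal node from a recent version of the package rather than the R-level XMLNode this entry was written for; that substitution, and the sample document, are assumptions of the example.

```r
library(XML)

txt <- '<r:doc xmlns:r="http://www.r-project.org"><r:item/></r:doc>'
doc <- xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)
root <- xmlRoot(doc)

shortName <- xmlName(root)               # tag name without the prefix
fullName  <- xmlName(root, full = TRUE)  # prefix prepended to the tag name
print(c(shortName, fullName))
```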
Version 0.91-0
* Added a mechanism to the SAX parser to allow a state object
to be passed between the callbacks and returned as the result of
the parsing. This avoids the need for closures. Also, works
with S4 classes and the genericSAXHandlers() methods by allowing
one to write methods for these generic callbacks that dispatch
based on the type of the state object.
* Fix to work properly with the S4 class system.
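The state mechanism for the SAX parser can be sketched as below, counting elements without a closure. The sample XML and the handler name countElements are made up; the handler signature (name, attrs, .state) assumes the default saxVersion in a recent version of the package.

```r
library(XML)

txt <- "<a><b/><c><b/></c></a>"

# The handler receives the current state as .state and returns the
# updated state, which is then passed to the next callback.
countElements <- function(name, attrs, .state) {
  .state + 1
}

# The final state is returned as the result of the parse.
n <- xmlEventParse(txt, asText = TRUE,
                   handlers = list(startElement = countElements),
                   state = 0)
print(n)
```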
Version 0.9-1
* Formatting of the help files to avoid long lines
identified by Ott Toomet
* Addition of `ignoreComments' argument for xmlValue()
* Date in the DESCRIPTION file corrected (thanks to Doug Bates).
Version 0.9-0
* Added addCData() and addPI() to the handlers of the different
XMLOutputStream classes.
Code for XMLInternalDOM (i.e. xmlTree()) from Byron Ellis.
* print() method for XMLProcessingInstruction node has the terminating `?'
as in <?pi-name text ?>.
Version 0.8-2
* Changes to support libxml2-2.4.21 (specifically the issues with
the headers and parse error regarding xmlValidCtxt). Thanks to
Wolfgang Huber for identifying this.
* Ignoring R_VERSION now, so dependency is R >= 1.2.0
Version 0.8-1
* Added an `attrs' argument to the xmlOutputBuffer and xmlTree
functions for specifying the top-level node.
Version 0.8-0
* xmlValue() extended to work recursively if a node has
only one child.
* T and F replaced by TRUE and FALSE
Version 0.7-4
* Support for Windows
Version 0.7-3
* Documents without <DOCTYPE ..> are handled correctly.
* Configuration tweak to set LD_LIBRARY_PATH to handle the case
that the user specifies LIBXML_LIBDIR and it is needed to run the
version test.
* Keyword XML changed to IO.
Version 0.7-2
* Fix for printing XMLNode objects to handle comments and elements
with name "text". Identified by Andrew Schuh.
Version 0.7-1
* Minor fixes for passing R CMD check.
Version 0.7-0
* Generating XML trees using internal libxml structures:
xmlTree(), newXMLDoc(), newXMLNode(), saveXML().
* Support parsing HTML (htmlTreeParse()) using DOM.
Suggestion from Luis Torgo.
* Additional updates for libxml2, relating to DTDs.
Version 0.6-3
* Installation using --with-xml2 now attempts to link against libxml2.so
and the appropriate header files.
* Use libxml's xml-config or xml2-config scripts if these are available.
Version 0.6
* xmlDOMApply for recursively applying a function to each node in a tree.
Version 0.5-1
* simplification of xmlOutputBuffer so that it doesn't put
the namespace definition in each and every tag.
* configuration changes to support libxml2-2.3.6
(look for libxml2, check if xmlHashSize is available)
* now dropping nodes if the handler function returns NULL.
Updated documentation.
* spelling correction in the documentation
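As an illustration of dropping nodes when a handler returns NULL, the sketch below discards comments while building the tree. The sample XML is made up, and the code assumes a recent version of the package in which asTree = TRUE returns the tree rather than the handlers.

```r
library(XML)

txt <- "<a><!-- a note --><b>kept</b></a>"

# Returning NULL from the comment handler removes comment nodes
# from the resulting R-level tree.
doc <- xmlTreeParse(txt, asText = TRUE, asTree = TRUE,
                    handlers = list(comment = function(x, ...) NULL))

kids <- names(xmlChildren(xmlRoot(doc)))
print(kids)
```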
Version 0.5
* xmlOutputBuffer now accepts a connection.
* Fixes for using libxml2, specifically 2.2.12.
Also works for libxml2.2.8
* Enhanced configuration script to determine what features are available.
Version 0.4
* `namespace' handler in xmlTreeParse is called when a namespace
declaration is encountered. This is called before the child nodes
are processed.
* More documentation, in Tour.
* xmlValue, xmlApply, xmlSApply, xmlRoot, xmlNamespace, length, names
* Constructors for different types of nodes: XMLNode, XMLTextNode, XMLProcessingInstruction.
* Methods for print(), subsetting ([ and [[), accessing the fields
in an XMLNode object.
* New classes for the different node types (e.g. XMLTextNode)
* Event driven parsing available via libxml. Expat is not needed but
can be used.
* Document sources can be URLs (ftp and http) when using the libxml parser.
* Examples for processing MathML and SVG files. See examples/ directory.
* Examples for event driven parsing.
* Class of result from xmlTreeParse is XMLDocument.
* Comments, Entities, Text, etc. inherit from XMLNode
in addition to defining their own XML<type> class.