Converting XML To R Dataframes

Kevin Feasel

2018-07-02

R

Joachim Zuckarelli announces a new package:

The new R package flatxml provides functions to easily deal with XML files. When parsing an XML document fxml_importXMLFlat produces a special dataframe that is ‘flat’ by its very nature but contains all necessary information about the hierarchical structure of the underlying XML document (for details on the dataframe see the reference for the fxml_importXMLFlat function). flatxml offers a set of functions to work with this dataframe.

Apart from representing the XML document in a dataframe structure, there is yet another way in which flatxml relates to dataframes: the fxml_toDataFrame function can be used to extract data from an XML document into a dataframe, e.g. to work on the data with statistical functions. Because in this case there is no need to represent the XML document structure as such (it’s all about the data contained in the document), there is no representation of the hierarchical structure of the document any more, it’s just a normal dataframe.

Very interesting.  I’ve struggled a bit more with the xml2 package than I’d care to admit, so I might give this one a try.  H/T R-bloggers

Related Posts

AzureR Packages In Cran

David Smith points out that the Azure packages for R are now in CRAN: The suite of AzureR packages for interfacing with Azure services from R is now available on CRAN. If you missed the earlier announcements, this means you can now use the install.packages function in R to install these packages, rather than having to install from the […]

Read More

R htmlTable Updates

Max Gordon has some updates to the htmlTable package: Even more common than grouping columns is probably grouping data by rows. The htmlTable allows you to do this by rgroup and tspanner. The most common approach is by using rgroupas the first row-grouping element but with larger tables you frequently want to separate concepts into separate sections. Here’s a […]

Read More

Categories

July 2018
MTWTFSS
« Jun Aug »
 1
2345678
9101112131415
16171819202122
23242526272829
3031