Converting XML To R Dataframes

Kevin Feasel

2018-07-02

R

Joachim Zuckarelli announces a new package:

The new R package flatxml provides functions to easily deal with XML files. When parsing an XML document fxml_importXMLFlat produces a special dataframe that is ‘flat’ by its very nature but contains all necessary information about the hierarchical structure of the underlying XML document (for details on the dataframe see the reference for the fxml_importXMLFlat function). flatxml offers a set of functions to work with this dataframe.

Apart from representing the XML document in a dataframe structure, there is yet another way in which flatxml relates to dataframes: the fxml_toDataFrame function can be used to extract data from an XML document into a dataframe, e.g. to work on the data with statistical functions. Because in this case there is no need to represent the XML document structure as such (it’s all about the data contained in the document), there is no representation of the hierarchical structure of the document any more, it’s just a normal dataframe.

Very interesting.  I’ve struggled a bit more with the xml2 package than I’d care to admit, so I might give this one a try.  H/T R-bloggers
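
For anyone who wants to try it, here is a minimal sketch of the two-step workflow the announcement describes: import the document into the flat dataframe, then pull the data out into an ordinary dataframe. It assumes flatxml is installed from CRAN. The sample XML, the file path, and the variable names are made up for illustration, and the column names elem. and elemid. are as I understand them from the package reference, so check the documentation for fxml_importXMLFlat and fxml_toDataFrame before relying on the details.

```r
# Sketch only: assumes the flatxml package is installed from CRAN.
library(flatxml)

# Hypothetical sample document, written to a temporary file for illustration.
xml_path <- tempfile(fileext = ".xml")
writeLines(c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  "<books>",
  "  <book><title>First</title><year>2001</year></book>",
  "  <book><title>Second</title><year>2005</year></book>",
  "</books>"
), xml_path)

# Step 1: import the document as the 'flat' dataframe, which still
# encodes the hierarchy of the XML document.
flat <- fxml_importXMLFlat(xml_path)
head(flat)

# Step 2: pick one <book> element by its ID; that element and its
# siblings become the rows of the extracted dataframe.
# (Column names elem. and elemid. are assumed from the package reference.)
first_book_id <- flat$elemid.[which(flat$elem. == "book")][1]
books_df <- fxml_toDataFrame(flat, siblings.of = first_book_id)
books_df
```

The siblings.of argument is what selects the records. In this sketch it is looked up by tag name rather than hard-coded, so the code does not depend on the exact ID numbering.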

