Multiple Result Sets With ML Services

Kevin Feasel

2017-11-07

R

Dave Mason figures out how to create multiple result sets with SQL Server ML Services:

Of course for this strategy to work, I’d have to know ahead of time how many data frames/HTML tables there are. Hmmm. Can dynamic T-SQL help me here? If I could find out at run time how many data frames there are, and which ones I may or may not want, then why not? Here’s some R code that reads HTML tables into a variable as a list of data frames(line 8), iterates through the list (starting at line 18), decides if the HTML table has any data in it (lines 21, 24), and adds the HTML table number (the element number in the list) to a different data frame (line 27). The output shows us we would want HTML tables 1, 2, and 4. (Yeah, I really didn’t want #4. But that can be fixed by enhancing the R code to be more selective. Let’s just go with it for now.)

The method is a bit disappointing (and it’s arguably worse for inputs); I do hope the ML Services team can improve upon this experience.

Related Posts

ElasticMapReduce And RStudio

Tanzir Musabbir demonstrates how to set up Amazon ElasticMapReduce to include an RStudio edge node: RStudio Server provides a browser-based interface for R and a popular tool among data scientists. Data scientist use Apache Spark cluster running on  Amazon EMR to perform distributed training. In a previous blog post, the author showed how you can install RStudio Server on Amazon […]

Read More

Mutating Data Frames Without dplyr

John Mount points out that there is a built-in function to mutate data frames in R: The notation we used above is the “explicit argument” variation we recommend for readability. What a lot of dplyr users do not seem to know is: base-R already has this functionality. The function is called transform(). To demonstrate this, let’s first detach dplyr to show that […]

Read More

Categories

November 2017
MTWTFSS
« Oct Dec »
 12345
6789101112
13141516171819
20212223242526
27282930