Author: Kevin Feasel

Saving Docker Images

Published 2017-03-08 by Kevin Feasel

Andrew Pruski shows how to save a Docker image:

What I’m going to do now is save my custom image to a location [C:\temp] on my host server. So I’ll run: –

1

docker save -o myfirstimage.tar myfirstimage

N.B. – I’m saving the file as a .tar as this seems to be the format that works best for loading the image. I’ve tried exporting to other formats (.zip) but it always seems to become corrupt. The .tar format has always worked for me so I’ll stick with it.

If I’m understanding Andrew correctly, that’s because docker save will only generate a tar archive. Keep reading for more details, including how to restore a Docker image to another machine.

Comments closed

Using Desired State Configuration To Install SQL Server

Published 2017-03-08 by Kevin Feasel

Chris Lumnah shows how to use Powershell Desired State Configuration to automate installation and configuration of a new SQL Server instance:

So the installation of SQL Server is now fairly straightforward. The wizard does a nice job of guiding you along the way. 2016 even includes best practice suggestions for tempdb and instance file initialization. Along the way, Microsoft as given us ways to automate the installation of SQL Server. You can sysprep an instance, but this does not really automate the installation. It just helps create a template of an instance. At the end of the day, you still need to do things manually. You can also use a configuration file to assist here. This is a great step forward, but it does not allow for all of the things you need to do to configure a SQL server.

Powershell does. Desired State Configuration (DSC) is functionality built into Powershell that allows for the installation and configuration of a SQL Server.

Chris includes his script as well as a link for more information on DSC in case you aren’t familiar with the concept.

Comments closed

Virtual Function Tables

Published 2017-03-07 by Kevin Feasel

Ewald Cress continues his descent into the bowels of SQL Server, this time looking at vftables:

The first and simpler GetData() overload doesn’t show up in a vftable, but the second does. Oddly, the vftable for the second one lives at an offset of +0x1448 into the class instance – you’re going to have to trust me on this one. So the rcx passed into either variation will actually be the same one! But if the virtual version is called, it needs to find its position relative to +0x1448 dynamically, by doing a data lookup. We can confirm that by peeking at what is saved four bytes earlier at +0x1444, and that is indeed the value zero.

Ewald explains how this is vital to multiple inheritance and this post is only guaranteed to make your brain hurt a little bit.

Comments closed

Garbage Collection In Hadoop

Published 2017-03-07 by Kevin Feasel

Ranjan Banerjee explains how the Java garbage collector works, using Hadoop as an example:

The reason why we all love Java is due to the fact that we can be careless with memory creations and the work of cleaning the mess is performed by the JVM. On a high level, Java heap memory is classified into two phases:

1) Young (eden) space

2)Old space

The eden space is where newly created objects goto. There are various algorithms for garbage collection, but all of them try to first free memory from the young space and for those long lasting memory objects, they are transferred to the old space.

One common issue that can be noticed in running Map Reduce Applications are GC overhead limit exceeded.

Read on for more, including where you can find GC logs.

Comments closed

Time Series Errors

Published 2017-03-07 by Kevin Feasel

Alex Smolyanskaya explains some common errors when doing time series analysis:

Non-zero model error indicates that our model is missing explanatory features. In practice, we don’t expect to get rid of all model error—there will be some error in the forecast from unavoidable natural variation. Natural variation should reflect all the stuff we will probably never capture with our model, like measurement error, unpredictable external market forces, and so on. The distribution of error should be close to normal and, ideally, have a small mean. We get evidence that an important explanatory variable is missing from the model when we find that the model error doesn’t look like simple natural variation—if the distribution of errors skews one way or another, there are more outliers than expected, or if the mean is unpleasantly large. When this happens we should try to identify and correct any missing or incorrect model features.

It’s an interesting article, especially the bit about cross-validation, which is a perfectly acceptable technique in non-time series models.

Comments closed

Data Lake 3.0

Published 2017-03-07 by Kevin Feasel

Vinod Kumar Vavilapalli describes the modern data lake:

During the past few years though, end-to-end business use-cases have evolved to another level.

The end-to-end business problems are now mostly solved by multiple applications working together.

As the platform matured, users have increasingly started wanting to solely focus on the business application layers, and getting impatient to get on with developing their main business-logic.

However, YARN, and for that matter any other related platform, hasn’t catered to this evolving need, leaving the users to unwillingly get involved in the painstaking details of wiring applications together, keeping them up, manually scaling them as need arises etc.

Manual plumbing of all these different colored services in tiresome! Further, there is a clear need for seamless aggregate deployment, lifecycle management and application wireup. This is the gap that needs to be bridged between what these end-to-end business use-cases need from the platform and what the platform offers today. If these features are provided, then the business use cases authors can singularly focus on the business logic.

This is a higher-level “where are we at?” kind of post which could be helpful if you’re new to the data lake concept.

Comments closed

Disabling VMware In-Guest Clock Updates

Published 2017-03-07 by Kevin Feasel

David Klee explains when VMware will update the internal clocks for VMs and shows how to disable that:

These time sync actions can move a guest’s time backwards as well as forwards. More details about this conflict of settings are found in VMware KB1189. If the host time is out of sync, such as when a BIOS battery fails, bad things can happen. This action is extremely detrimental to the state of SQL Server high availability features, such as Availability Groups and Failover Cluster Instances, which depend on the in-guest time closely aligning with the Active Directory synchronized time. This action must be explicitly disabled to ensure that these maintenance items do not trigger an unexpected failover of the SQL Server HA solution. To disable this action, perform the following tasks.

I’d imagine that the ideal would be everything being synched to a single NTP source.

Comments closed

Retaining system_health Metrics

Published 2017-03-07 by Kevin Feasel

Taiob Ali looks at the system_health Extended Events session:

What this article does not tell you is your individual file size is 5 MB and number of maximum rollover file is 4. Meaning you will only get 20 MB of data. Once that limit is reached your older data is lost and there is no way to recover.

Often time you have an issue with the application pointing to SQL Server and you want to diagnose the problem with system_health files. You find that files already rolled over and information you are looking for is not available. I will explain how you can solve this problem and retain system_health session files for longer period.

Read on for the solution.

Comments closed

Using Prophet For Stock Price Predictions

Published 2017-03-07 by Kevin Feasel

Marcelo Perlin looks at Facebook’s Prophet to see if it works well for predicting stock price movements:

The previous histogram shows the total return from randomly generated signals in 10^{4} simulations. The vertical line is the result from using prophet. As you can see, it is a bit higher than the average of the distribution. The total return from prophet is lower than the return of the naive strategy in 27.5 percent of the simulations. This is not a bad result. But, notice that we didn’t add trading or liquidity costs to the analysis, which will make the total returns worse.

The main results of this simple study are clear: prophet is bad at point forecasts for returns but does quite better in directional predictions. It might be interesting to test it further, with more data, adding trading costs, other forecasting setups, and see if the results hold.

This is a very interesting article, worth reading. H/T R Bloggers

Comments closed

Copying Azure SQL Databases Between Subscriptions

Published 2017-03-07 by Kevin Feasel

Arun Sirpal shows that it’s pretty easy to copy an Azure SQL Database from one subscription to another:

If you ever need to move a copy of a SQL database in Azure across servers then here is a quick easy way.

So let’s say you need to take a copy of database called [Rack] within Subscription A that is on server ABCSQL1 and name it database [NewRack] within subscription B on server called RBARSQL1 (The SQL Servers are in totally different data centers too).

Read on for the answer.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31