Day: November 3, 2017

Creating A Poker AI In Python

Published 2017-11-03 by Kevin Feasel

Kevin Jacobs has a fairly simple framework for building poker-playing bots:

The bot uses Monte Carlo simulations running from a given state. Suppose you start with 2 high cards (two Kings for example), then the chances are high that you will win. The Monte Carlo simulation then simulates a given number of games from that point and evaluates what percentage of games you will win given these cards. If another King shows during the flop, then your chance of winning will increase. The Monte Carlo simulation starting at that point will yield a higher winning probability since you will win more games on average.

If we run the simulations, you can see that the bot based on Monte Carlo simulations outperforms the always calling bot. If you start with a stack of $100,-, you will on average end with a stack of $120,- (when playing against the always-calling bot).

It’s a start, and an opening for more sophisticated logic and analysis.
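To make the idea concrete, here is a minimal sketch of a Monte Carlo win-probability estimate from a given state. It is not Kevin Jacobs's code: the function names, the card encoding (such as "Ks" for the King of spades), and the single opponent holding random cards are all assumptions made for illustration.

```python
import itertools
import random
from collections import Counter

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]


def score_five(cards):
    """Return a tuple that compares higher for a stronger 5-card hand."""
    ranks = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(ranks)
    # Ranks ordered by (multiplicity, rank): trips and pairs before kickers.
    ordered = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and ranks[0] - ranks[4] == 4
    if set(ranks) == {12, 3, 2, 1, 0}:
        # A-2-3-4-5: the ace plays low, so rank it below a 6-high straight.
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, ordered)


def best_hand(cards):
    """Best 5-card score obtainable from 7 cards."""
    return max(score_five(combo) for combo in itertools.combinations(cards, 5))


def estimate_win_probability(hole, board, n_games, rng):
    """Simulate n_games from the current state; ties count as half a win.

    Assumption: a single opponent whose two hole cards are random.
    """
    remaining = [c for c in DECK if c not in hole + board]
    wins = 0.0
    for _ in range(n_games):
        draw = rng.sample(remaining, 2 + (5 - len(board)))
        opponent, full_board = draw[:2], board + draw[2:]
        mine = best_hand(hole + full_board)
        theirs = best_hand(opponent + full_board)
        if mine > theirs:
            wins += 1.0
        elif mine == theirs:
            wins += 0.5
    return wins / n_games


if __name__ == "__main__":
    rng = random.Random(42)
    print("Two Kings, before the flop:",
          estimate_win_probability(["Ks", "Kh"], [], 2000, rng))
    print("Two Kings, third King on the flop:",
          estimate_win_probability(["Ks", "Kh"], ["Kd", "7c", "2s"], 2000, rng))
```

A bot built on this could compare the estimate against a threshold to decide between folding, calling, and raising; that decision rule is left out here because the quoted excerpt does not describe it.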

Comments closed

The Importance Of Distributions

Published 2017-11-03 by Kevin Feasel

Jocelyn Barker explains distributions using role-playing games as an example:

We see that for the entire curve, our odds of success go down when we add criticals, and for most of the curve, they go up for 3z8. Let’s think about why. We know the guards are more likely to roll a 20 and less likely to roll a 1 from the distribution we made earlier. This happens about 14% of the time, which is pretty common, and when it happens, the rogue has to have a very high modifier and still roll well to overcome it unless they also roll a 20. On the other hand, with the 3z8 system, criticals are far less common and everyone rolls close to average more of the time. The expected value for the rogue is ~10.5, whereas it is ~14 for the guards, so when everyone performs close to average, the rogue only needs a small modifier to have a reasonable chance of success.

It’s a nice spin on a classic statistics lesson.
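The figures in the excerpt can be checked with a few lines of exact arithmetic. The sketch below rests on two assumptions not stated in the excerpt: that "3z8" means the sum of three eight-sided dice numbered 0 through 7, and that there are three guards whose best roll is the one that counts. The function names are hypothetical.

```python
from fractions import Fraction


def sum_distribution(n_dice, faces):
    """Exact distribution of the sum of n_dice dice with the given faces."""
    dist = {0: Fraction(1)}
    for _ in range(n_dice):
        nxt = {}
        for total, p in dist.items():
            for f in faces:
                nxt[total + f] = nxt.get(total + f, 0) + p / len(faces)
        dist = nxt
    return dist


def best_of(dist, n_rollers):
    """Distribution of the highest result among n independent rollers."""
    out, cumulative, previous = {}, Fraction(0), Fraction(0)
    for value in sorted(dist):
        cumulative += dist[value]
        # P(max == value) = P(all <= value) - P(all < value)
        out[value] = cumulative ** n_rollers - previous ** n_rollers
        previous = cumulative
    return out


def mean(dist):
    return sum(value * p for value, p in dist.items())


d20 = sum_distribution(1, list(range(1, 21)))
three_z8 = sum_distribution(3, list(range(0, 8)))  # assumed faces 0..7

guards_d20 = best_of(d20, 3)       # assumed: three guards, best roll counts
guards_3z8 = best_of(three_z8, 3)

print("P(some guard rolls 20 on d20):", float(guards_d20[20]))
print("Rogue mean, d20 vs 3z8:", float(mean(d20)), float(mean(three_z8)))
print("Guards' mean best roll, 3z8:", float(mean(guards_3z8)))
```

Under those assumptions the chance that at least one guard rolls a 20 works out to 1 - (19/20)^3, which is where a figure of roughly 14% would come from.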


BI Announcements At PASS Summit

Published 2017-11-03 by Kevin Feasel

Chris Webb points out some good news released at PASS Summit:

Power BI Report Server

There’s a new release of Power BI Report Server available, and you can read all about it here:

https://powerbi.microsoft.com/en-us/blog/new-version-of-power-bi-report-server-now-available/

The blog post highlights the fact that you can connect to SSRS shared datasets via OData – which is basically what I was talking about here.

This is the big one for me, but there are a few other product updates that Chris covers as well.


Kafka 1.0 Released

Published 2017-11-03 by Kevin Feasel

Neha Narkhede announces Apache Kafka 1.0:

And Kafka 1.0.0 is no mere bump of the version number. The Apache Kafka Project Management Committee and the broader Kafka community have packed a number of valuable enhancements into the release. Let me summarize a few of them:

Since its introduction in version 0.10, the Streams API has become hugely popular among Kafka users, including the likes of Pinterest, Rabobank, Zalando, and The New York Times. In 1.0, the API continues to evolve at a healthy pace. To begin with, the builder API has been improved (KIP-120). A new API has been added to expose the state of active tasks at runtime (KIP-130). The new cogroup API makes it much easier to deal with partitioned aggregates with fewer StateStores and fewer moving parts in your code (KIP-150). Debuggability gets easier with enhancements to the print() and writeAsText() methods (KIP-160). And if that’s not enough, check out KIP-138 and KIP-161 too. For more on streams, check out the Apache Kafka Streams documentation, including some helpful new tutorial videos.
Operating Kafka at scale requires that the system remain observable, and to make that easier, we’ve made a number of improvements to metrics. These are too many to summarize without becoming tedious, but Connect metrics have been significantly improved (KIP-196), a litany of new health check metrics are now exposed (KIP-188), and we now have a global topic and partition count (KIP-168). (That last one sounds so simple, but you’ve wanted it in the past, haven’t you?) Check out KIP-164 and KIP-187 for even more.
We now support Java 9, leading to significantly faster TLS and CRC32C implementations. Over-the-wire encryption will be faster now, which will keep Kafka fast and compute costs low when encryption is enabled.

And there are more where that came from. Congratulations to the Kafka team for hitting this big milestone.


Collation Compatibility And Linked Servers

Published 2017-11-03 by Kevin Feasel

Greg Low points out an important property which can help linked server performance:

The on-premises versions of SQL Server have the ability to connect one server to another via a mechanism called Linked Servers.

Azure-based SQL Server databases can communicate with each other by a mechanism called External Tables. I’ll write more about External Tables soon.

With Linked Servers though, I often hear people describing performance problems and yet there’s a configuration setting that commonly causes this. In Object Explorer below, you can see I have a Linked Server called PARTNER.

Read on for more.


Multiple Data Sets And SQL Server R Services

Published 2017-11-03 by Kevin Feasel

Robert Sheldon has a workaround for SQL Server R Services’ limitation of a single input data set:

Despite the ease with which you can run an R script, the sp_execute_external_script stored procedure has an important limitation. You can specify only one T-SQL query when calling the procedure. Of course, you can create a query that joins multiple tables, but this approach might not always work in your circumstances or might not be appropriate for the analytics you’re trying to perform. Fortunately, you can retrieve additional data directly within the R script.

In this article, we look at how to import data from a SQL Server table and from a .csv file. We also cover how to save data to a .csv file as well as insert that data into a SQL Server table. Being able to incorporate additional data sets or save data in different formats provides us with a great deal of flexibility when working with R Services and allows us to take even greater advantage of the many elements available to the R language for data analytics.

Another option is using the RODBC package to connect back to SQL Server to retrieve more data.


Using Tokens In SQL Agent Jobs

Published 2017-11-03 by Kevin Feasel

Raul Gonzalez builds a SQL Agent job and uses the JOBID token to help him log step output effectively:

It’s usually a good idea to write the output of your SQL Agent jobs to a file, so you can investigate should any issue occur.

But when you define the output file, you need to choose between appending the output to the same file over and over or overwriting it, and overwriting defeats the purpose IMHO.

On the other hand, if you forget to roll over the files, they can grow quite large and then finding any error can become a nightmare.

So some time ago, I wrote a stored procedure that rolls the files for me and places them sorted so it’s easy to find any particular date.

This is a clever solution, but read through to the bottom for a warning.


Finding Queries In Need Of Indexing

Published 2017-11-03 by Kevin Feasel

Jeff Schwartz continues his series on index tuning:

Table 1 shows examples of queries that potentially need tuning based upon the number of executions, total reads, total duration, total CPU time, and average reads per execution. This kind of report immediately focuses attention on the queries that might benefit the most from either index or query tuning. The five queries highlighted in Table 1 underscore these criteria. The ones highlighted in yellow were the worst offenders because their executions collectively performed the most reads with the worst one totaling 3.5 BILLION reads. The ones highlighted in light green and orange accounted for the most CPU time as well as the longest total duration. The one highlighted in slate ran the most times, and the ones highlighted in gray performed the most reads per execution. This information is vital when determining where query and index tuning should be focused.

Jeff walks through some of his data collection and analysis process in this post, making it worth a read.
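As a rough illustration of the kind of ranking the excerpt describes, the sketch below sorts query statistics by several of the named criteria. The class, the function, and the sample rows are all made up for this example; they are not Jeff's data or tooling.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    name: str
    executions: int
    total_reads: int
    total_cpu_ms: int
    total_duration_ms: int

    @property
    def avg_reads(self):
        """Average reads per execution."""
        return self.total_reads / self.executions


def top_offenders(stats, metric, n=1):
    """Return the n queries with the highest value of the given metric."""
    return sorted(stats, key=metric, reverse=True)[:n]


# Illustrative numbers only, not taken from the article.
sample = [
    QueryStats("q_orders", 50_000, 900_000_000, 400_000, 900_000),
    QueryStats("q_report", 12, 60_000_000, 2_500_000, 5_000_000),
    QueryStats("q_lookup", 2_000_000, 40_000_000, 150_000, 300_000),
]

criteria = {
    "most total reads": lambda q: q.total_reads,
    "most CPU time": lambda q: q.total_cpu_ms,
    "most executions": lambda q: q.executions,
    "most reads per execution": lambda q: q.avg_reads,
}

for label, metric in criteria.items():
    worst = top_offenders(sample, metric)[0]
    print(f"{label}: {worst.name}")
```

Different criteria can single out different queries, which is why a report like the one described looks at several of them side by side.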


Security Issue In Oracle Identity Manager

Published 2017-11-03 by Kevin Feasel

Oracle has a security advisory with a CVSS base score of 10.0 (which is pretty awful):

This Security Alert addresses CVE-2017-10151, a vulnerability affecting Oracle Identity Manager. This vulnerability has a CVSS v3 base score of 10.0, and can result in complete compromise of Oracle Identity Manager via an unauthenticated network attack. The Patch Availability Document referenced below provides a full workaround for this vulnerability, and will be updated when patches in addition to the workaround are available.

Due to the severity of this vulnerability, Oracle strongly recommends that customers apply the updates provided by this Security Alert without delay.

Catalin Cimpanu explains:

The affected product is Oracle Identity Manager (OIM), a user management solution that allows enterprises to control what parts of their network employees can access. OIM is part of Oracle’s highly popular Fusion Middleware offering and is one of its most used components.

Oracle describes the issue — tracked under the CVE-2017-10151 identifier — as a “default account” vulnerability, an umbrella term that’s usually used to describe accounts with no password or hardcoded credentials (a.k.a. backdoor accounts).

“This vulnerability is remotely exploitable without authentication, i.e., may be exploited over a network without requiring user credentials,” Oracle said in a security alert.

Oracle has patched this. If you have it installed, please update ASAP.


Azure SQL DB Automatic Tuning FAQ

Published 2017-11-03 by Kevin Feasel

Arun Sirpal has a self-Q&A session regarding Azure SQL Database’s automatic tuning options:

What are the options?

CREATE INDEX that identifies the indexes that may improve performance of your workload, creates the indexes, and verifies that they improve performance of the queries.
DROP INDEX that identifies redundant and duplicate indexes, and indexes that were not used for a long period of time.
PLAN REGRESSION CORRECTION that identifies SQL queries that are using an execution plan that is slower than the previous good plan, and uses the last known good plan instead of the regressed plan.

Very useful information.
