Press "Enter" to skip to content


Protecting Sensitive Data In Docker

Jatin Demla shows how to create Docker secrets:

Managing passwords, access tokens, and private keys in an application is tedious. Any small mistake can accidentally expose all of that secret information. Even storing such things in Docker images leaves them easily accessible: anyone can run the image as an interactive container, and all of your application code is available inside it. Docker provides secrets to protect all of this sensitive data.

This blog post explains the low-level storage details as well as secured access to Docker secrets. So, let’s get started.

Read the whole thing, especially if you’ve gone container-happy.
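
A secret granted to a service shows up inside the container as an in-memory file under /run/secrets. As a rough illustration of the consumption side, here’s a minimal Python sketch of reading one; the secret name db_password and the environment-variable fallback are my own assumptions, not from the post.

    import os
    from pathlib import Path

    def read_secret(name, secrets_dir="/run/secrets"):
        """Read a Docker secret mounted as an in-memory file."""
        secret_file = Path(secrets_dir) / name
        if secret_file.exists():
            return secret_file.read_text().strip()
        # Fallback for local development, where no secret is mounted.
        return os.environ.get(name.upper(), "")

    db_password = read_secret("db_password")  # hypothetical secret name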


Taking A Random Walk

Dan Goldstein describes the basics of Brownian motion:

I was sitting in a bagel shop on Saturday with my 9-year-old daughter. We had brought along hexagonal graph paper and a six-sided die. We decided that we would choose a hexagon in the middle of the page and then roll the die to determine a direction:

1 up (North)
2 diagonal to the upper right (Northeast)
3 diagonal to the lower right (Southeast)
4 down (South)
5 diagonal to the lower left (Southwest)
6 diagonal to the upper left (Northwest)

Our first roll was a six so we drew a line to the hexagon northwest of where we started. That was the first “step.”

After a few rolls we found ourselves coming back along a path we had gone down before. We decided to draw a second line close to the first in those cases.

We did this about 50 times. The results are pictured above, along with kid hands for scale.

Javi Fernandez-Lopez then shows how to generate an animated GIF displaying Brownian motion:

Last Monday we celebrated a “Scientific Marathon” at the Royal Botanic Garden in Madrid, a kind of mini-conference to talk about our research. I was talking about the relationship between fungal spore size and environmental variables such as temperature and precipitation. To make my presentation more friendly, I created a GIF to explain the Brownian Motion model. In evolutionary biology, we can use this model to simulate the random variation of a continuous trait through time. Under this model, we can see how more closely related species tend to maintain more similar trait values due to shared evolutionary history. There is plenty of information about Brownian Motion models in evolutionary biology everywhere!

Another place that this is useful is in describing stock market movements in the short run.
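
If you’d like to replay the experiment without graph paper, here’s a quick Python sketch of the same die-driven hexagonal walk. The direction vectors are one reasonable axial-coordinate mapping of the six neighbors, my assumption rather than anything from the post.

    import random

    # Die roll -> hexagonal neighbor, as (dx, dy) offsets in axial coordinates.
    DIRECTIONS = {
        1: (0, 1),    # North
        2: (1, 0),    # Northeast
        3: (1, -1),   # Southeast
        4: (0, -1),   # South
        5: (-1, 0),   # Southwest
        6: (-1, 1),   # Northwest
    }

    def hex_random_walk(steps=50, seed=None):
        """Roll a six-sided die `steps` times and record each hexagon visited."""
        rng = random.Random(seed)
        x, y = 0, 0
        path = [(x, y)]
        for _ in range(steps):
            dx, dy = DIRECTIONS[rng.randint(1, 6)]
            x, y = x + dx, y + dy
            path.append((x, y))
        return path

    print(hex_random_walk(steps=50, seed=42))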


API Security: Contrasting Token-Based With Key-Based Security

Vincent-Philippe Lauzon contrasts token-based security (like OAuth) with key-based security for locking down APIs:

Instead of giving a nice & neatly formatted pros & cons table where every pro has a corresponding con, let’s just discuss the major aspects: security & complexity.

Basically, in general, OAuth is more secure but more complex for both clients (i.e., consumers) and services.

Why is OAuth more secure? Relying parties never see credentials & secrets in an OAuth authentication scheme. They see a token. Tokens are revoked after a while; often minutes, at most a few hours.

Read on for more.  My preference is OAuth, but it’s not always trivial to set up.
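
To make the contrast concrete, here’s a rough Python sketch of the two styles. Every endpoint, header name, and credential below is a hypothetical placeholder, not a real API.

    import requests

    # Key-based: a long-lived key travels with every single call.
    resp = requests.get(
        "https://api.example.com/v1/orders",
        headers={"Api-Key": "my-long-lived-key"},
    )

    # Token-based (OAuth client credentials): exchange credentials once for a
    # short-lived token, then present only the token to the service.
    token_resp = requests.post(
        "https://login.example.com/oauth2/token",
        data={
            "grant_type": "client_credentials",
            "client_id": "my-client-id",
            "client_secret": "my-client-secret",
            "scope": "orders.read",
        },
    )
    access_token = token_resp.json()["access_token"]  # expires in minutes or hours

    resp = requests.get(
        "https://api.example.com/v1/orders",
        headers={"Authorization": f"Bearer {access_token}"},
    )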


SQL Vulnerability Assessment

Ronit Reger shows off the new SQL Vulnerability Assessment, available in SQL Server 2012 and later:

Not only does VA expose some of the possible security flaws you have in your database system, it also provides remediation scripts to resolve issues within a couple of mouse clicks. In addition, you can accept specific results as your approved baseline state, and the VA scan report will be customized accordingly to expect these values.

Ronit and Alan Yu also mention that it is available via the latest version of Management Studio, 17.4:

The VA service runs a scan directly on your SQL database or server. VA employs a knowledge base of rules that flag security vulnerabilities and deviations from best practices, such as misconfigurations, excessive permissions, and exposed sensitive data. The rule base grows and evolves over time, to reflect the latest security best practices recommended by Microsoft.

Results of the assessment include actionable steps to resolve each issue and provide customized remediation scripts where applicable. An assessment report can be customized for each customer environment and tailored to specific requirements. This process is managed by defining a security Baseline for the assessment results, such that only deviations from the custom Baseline are reported.

VA is supported for SQL Server 2012 and later, and can also be run on Azure SQL Database.

This looks like a good reason to upgrade SSMS.


More Collation Comparison Fun

Solomon Rutzky wraps up his collation comparison series:

Just in case you were wondering: No, the test directly above does not prove the documentation (as quoted at the top of Part A) correct. Yes, the documentation did state that characters would be converted to the Code Page specified by the Collation of the Database or Column, which does appear to be what is going on here. But, the differences are:

  1. The documentation states that the transformation happens if you reference a Unicode datatype, but what we just saw in the most recent test is the exact opposite:

    • only the NVARCHAR columns of the “Subscript 2” row match because they are still the “Subscript 2” character, while the NVARCHAR columns of the other two rows do not match due to being either “2” or “?”.
    • transformation did occur in the Latin1 and Hebrew VARCHAR columns, which is how it matched both rows with “2” in the Latin1 column and both rows with “?” in the Hebrew column.
  2. In the scenario involving another column where it would help to prefix the string literal with a capital-“N” (i.e. a VARCHAR column using a Collation that specifies a different Code Page than the Database’s Code Page), the Collation of the Database is not used for the transformation; it is only the referenced column’s Collation. The assumption here is that the string literal without the capital-“N” is being used in a Database where the Collation specifies a Code Page that has all of the characters.

  3. In the scenarios where the Database’s Collation, via its specified Code Page, did transform a string literal that was not prefixed with a capital-“N” (the two tests in the previous post), there would have already been unintended behavior no matter how the string literal was used.

I still think it’s a duck.
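
You can poke at the code-page mechanics outside of SQL Server, too. A quick Python illustration: Python’s codecs use strict round-trip tables rather than the Windows “best-fit” mappings SQL Server relies on, and that best-fit behavior is exactly what turns “Subscript 2” into “2” under the Latin1 code page.

    # U+2082 (Subscript Two) exists in neither Windows-1252 (Latin1)
    # nor Windows-1255 (Hebrew).
    ch = "\u2082"

    for codepage in ("cp1252", "cp1255"):
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            print(f"{codepage}: no round-trip mapping for {ch!r}")

    # With replacement, both code pages yield '?', matching the Hebrew column.
    # SQL Server's best-fit mapping is what produces '2' on the Latin1 side.
    print(ch.encode("cp1252", errors="replace"))  # b'?'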


Using List.Accumulate

Reza Rad shows off a Power Query function which is not available in the UI:

List.Accumulate is a function that can easily save a number of steps in your Power Query transformations: instead of applying multiple steps, you can simply use List.Accumulate to accomplish what you want. The List.Accumulate function loops through the list and accumulates a value as a result. It usually needs three parameters: the list itself, the seed, and the accumulator.

The best way to learn about the seed and accumulator is through some examples, so let’s apply some transformations with List.Accumulate and see how these two parameters work.
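
If it helps to see the pattern outside of M: List.Accumulate is a fold, the same shape as functools.reduce in Python, with the seed as the initial state and the accumulator as the combining function. A minimal sketch:

    from functools import reduce

    numbers = [1, 2, 3, 4, 5]

    # Seed = 0; the accumulator folds each element into the running state.
    total = reduce(lambda state, current: state + current, numbers, 0)
    print(total)  # 15

    # The state can be any structure; here we build a comma-separated string.
    csv = reduce(
        lambda state, current: f"{state},{current}" if state else str(current),
        numbers,
        "",
    )
    print(csv)  # 1,2,3,4,5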

Read on to see how to use it.


Optimized Bitmaps On Columnstore Indexes

Joe Obbish digs into bitmap filters and clustered columnstore indexes:

The position of the bitmap has changed so that it’s evaluated after the key lookup. That makes sense because the key lookup returns the column to be filtered against. However, the bitmap filter still reduces the estimated number of key lookups from 3000000 to 3000. This is impossible. The filter can only be applied after the key lookup, so it does not make sense for the bitmap to reduce the number of estimated executions of the key lookup.

Performance is significantly worse, with the query now requiring 12199107 logical reads from the rowstore table and 13406 ms of CPU time overall. We can see that the query did three million key lookups:

This is a fairly deep post, so you’ll probably want to check out the Paul White post on bitmaps first.


Units Of Measure In The ShowPlan Schema

Grant Fritchey shows off the ShowPlan Schema:

Because the showplan schema contains notes throughout stating what the units of measure are and what each of the values means. For instance, I can explain why SerialDesiredMemory, DesiredMemory, and RequestedMemory are identical:

…Provide memory grant estimate as well as actual runtime memory grant information. Serial required/desired memory attributes are estimated during query compile time for serial execution. The rest of attributes provide estimates and counters for query execution time considering actual degree of parallelism. SerialRequiredMemory: Required memory in KB if the query runs in serial mode. The query will not start without this memory. SerialDesiredMemory: Memory estimated to fit intermediate results in KB if the query runs in serial mode. RequiredMemory: Required memory in KB for the chosen degree of parallelism. If the query runs in serial mode, this is the same as SerialRequiredMemory. …

That’s taken directly from the 2017 schema. The units of measure are KB.

I’d never seen this before, so that’s going on my to-read list.
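
If you want to put the schema to work, here’s a small Python sketch that pulls those memory-grant attributes out of a saved execution plan. The file name plan.sqlplan is a placeholder; the namespace is the standard showplan one.

    import xml.etree.ElementTree as ET

    SHOWPLAN_NS = "http://schemas.microsoft.com/sqlserver/2004/07/showplan"

    tree = ET.parse("plan.sqlplan")  # placeholder file name
    for grant in tree.getroot().iter(f"{{{SHOWPLAN_NS}}}MemoryGrantInfo"):
        for attr in ("SerialRequiredMemory", "SerialDesiredMemory",
                     "RequiredMemory", "DesiredMemory", "RequestedMemory"):
            if attr in grant.attrib:
                # Per the schema notes, these values are in KB.
                print(f"{attr}: {grant.attrib[attr]} KB")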


Kafka Streams And Time-Based Batching

Vladimir Vajda provides a warning for people using Kafka Streams:

To completely understand the problem, we will first go into detail on how ingestion and processing occur by default in Kafka Streams. For example purposes, the punctuate method is configured to occur every ten seconds, and in the input stream, we have exactly one message per second. The purpose of the job is to parse input messages, collect them, and, in the punctuate method, do a batch insert in the database, then to send metrics.

After running the Kafka Streams application, the Processor will be created, followed by the init method. Here is where all the connections are established. Upon successful start, the application will listen to the input topic for incoming messages. It will remain idle until the first message arrives. When the first message arrives, the process method is called: this is where transformations occur and where the result is stored for later use. If no messages are in the input topic, the application will go idle again, waiting for the next message. After each successful process, the application checks if punctuate should be called. In our case, we will have ten process calls followed by one punctuate call, with this cycle repeating indefinitely as long as there are messages.

A pretty obvious behavior, isn’t it? Then why is one of those sentences bolded in the original post?

Read on for more, including how to handle this edge case.
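
Kafka Streams is a Java API, but the pitfall is language-agnostic. Here’s a deliberately naive Python-style consumer loop showing the same failure mode; parse and insert_into_db are stand-ins, and the loop is my sketch of the pattern rather than Kafka Streams itself.

    import time

    def parse(message):
        return message  # stand-in for real message parsing

    def insert_into_db(batch):
        print(f"inserting {len(batch)} rows")  # stand-in for a batch insert

    def consume_with_time_batching(consumer, flush_interval_s=10.0):
        """Flush a batch roughly every flush_interval_s seconds.

        Same trap the post describes: the flush check only runs when a
        message arrives, so if the topic goes quiet, the final partial
        batch sits unflushed indefinitely.
        """
        batch = []
        last_flush = time.monotonic()
        for message in consumer:          # blocks while the topic is idle
            batch.append(parse(message))
            if time.monotonic() - last_flush >= flush_interval_s:
                insert_into_db(batch)
                batch.clear()
                last_flush = time.monotonic()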


Kafka And GDPR

Ben Stopford has some ideas for using Kafka in a GDPR world:

The simplest way to remove messages from Kafka is to simply let them expire. By default, Kafka will keep data for two weeks, and you can tune this to arbitrarily large periods of time as required. There is also an Admin API that lets you delete messages explicitly if they are older than some specified time or offset. But what if we are keeping data in the log for a longer period of time, say for Event Sourcing architectures or as a source of truth? For this, you can make use of compacted topics, which allow messages to be explicitly deleted or replaced by key.

Data isn’t removed from compacted topics in the same way as in a relational database. Instead, Kafka uses a mechanism closer to those used by Cassandra and HBase, where records are marked for removal and then later deleted when the compaction process runs. Deleting a message from a compacted topic is as simple as writing a new message to the topic with the key you want to delete and a null value. When compaction runs, the message will be deleted forever.

Click through for more information.
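
As a quick illustration of that last point, here’s a sketch using the kafka-python client. The topic name user-events and the key user-42 are hypothetical, and the topic is assumed to be configured with cleanup.policy=compact.

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    # A tombstone: the key to forget, paired with a null value. Once log
    # compaction runs, all prior messages with this key are removed.
    producer.send("user-events", key=b"user-42", value=None)
    producer.flush()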
