The idea behind this method is to store the latest ETL run time in a config or log table, and then in the next ETL run just load records from the source table that have modified (with their modified date greater than or equal to) after the latest ETL run datetime. This will create the change set for the data table. The change set might contains inserted, updated, or deleted records. to identify which change happened on the record you need to compare the change set with existing records and separate inserted, updated, and deleted records. This change set with the action on each record can be inserted into staging tables, and then be used to apply on the fact table based on appropriate action.
In my experience, the hardest part about this is making sure people update ModifiedTime when they update rows in the table.
Our customers use the SQL platform to store volumes of high-valued data assets for their organization. With Always Encrypted, we want to deliver additional security while ensuring complete integrity of stored user data. To achieve this, in a regular workflow SQL Server/Azure SQL Database disallows an application to insert data directly into an encrypted column. If the application has not enabled Always Encrypted in the connection string, an insert statement targeting an encrypted column will fail. To insert a value into a column protected with Always Encrypted, the application must connect to the database with Always Encrypted enabled in the connection string and pass a plaintext value of the datatype configured for the target column. Subsequently, the SQL client driver encrypts the value and sends the ciphertext to the database. This ensures plaintext data is encrypted and stored appropriately.
Read the whole thing.
I already had my data flow populated with the lookup for MSA. I set it to full cache and entered a query in the connection to initially populate the fields that would be returned (simply my lookup query without the where clause).
Next, I opened the data flow properties, located Expressions and clicked on the ellipses to open the Property Expression Editor. I found the SQLCommand property for my MSA lookup and set it to my package variable that contained my query.
I had issues in the past with full cached lookups and variables. Fortunately, you can get around a lot of problems with expressions.
In my previous post, I provided the design pattern and BIML for a pure Type 2 Slowly Changing Dimension (SCD). When I say “pure Type 2 SCD”, I mean an ETL process that adds a new row for a change in any field in the dimension and never updates a dimension attribute without creating a new row. In practice, I tend to create more hybrid Type 2 SCDs where updates to some attributes require a new row and others update the value on the existing rows. A similar pattern that I find I implement more often than a pure Type 2 is a Type 6 SCD. A Type 6 SCD builds on the Type 2 technique by adding current attributes alongside the historical attributes so related measures can be grouped by the historical or current dimension attribute values. The only difference between what I call a hybrid Type 2 and a Type 6 is that in the Type 6, there are no Type 1 attributes in the dimension that do not also have a Type 2 version in the dimension to capture the historical values.
Dear Mr. President: there are too many types these days. Please eliminate three. I am NOT a crackpot.
“In an age of the SSIS Catalog, why would one ever employ this kind of metadata collection, Andy?” That’s a fair question. The SSIS Catalog is an awesome data integration execution, logging, and externalization engine. There are a handful of use cases, though, where enterprises may opt to continue to execute SSIS packages from the file system or the MSDB database. Perhaps the biggest reason to do so is that’s the way the enterprise is currently executing SSIS. When SSDT-BI converts pre-Catalog-era (2005, 2008, 2008 R2) SSIS packages to current, it imports these packages in a “Package Deployment Model” SSIS Project. This allows developers to upgrade the version of their SSIS project to SSIS 2016 (and enjoy many benefits for so doing) while continuing to execute SSIS packages in the file system. Kudos to the Microsoft SSIS Development Team for this backwards compatibility!
Andy asks the question I wanted to ask and gives a good answer.
The way MergeUi used to work was that it enumerated the schemas and tables in a project and let you create a merge statement in the post-deploy script. The problem with that is that you may want to have different versions of the table for different environments or you may want to put the merge in a different script and either reference it using :r imports or not reference it at all.
The new way it works is that instead of enumerating tables it now enumerates script files (pre, post, included, not included etc) and lets you choose which table to add whether or not the table has been added before.
I’ve not used this tool before, but it’s good to know that it’s available via Github.