This means that I could have multiple email accounts that I have to use in order to sign into the portal. Using a password manager such as 1Password, not usually a big deal and more of an annoyance rather than a headache.
Within the past month or so, Microsoft has updated the portal to allow me to easily switch accounts. Previously you had to log out of the portal and then log back in.
This is quite convenient. Prior to this change, switching to a different account could goof with other sites I had open (like if I was sending an Outlook e-mail through one account, switching the Azure Portal signed-in account would log me out from Outlook). It’s still not a perfect experience but it’s a lot better.
Or you can go with Amazon RDS (Relational Database Service). This is more of a managed service where Amazon looks after some aspects of your database server for you. In return you give up some of the control you would have with your own server or VM. You can still pick the version of SQL Server you want installed, usually down to which cumulative update you want – though note that RDS normally lags behind the latest box version of SQL by 3 months or so. RDS is what’s known as a PaaS offering (Platform as a Service).
So, what do you give up and what do you gain? Here’s a quick summary of a few things I’ve noticed. This is not intended to be comprehensive and please bear in mind that AWS is a fast-moving beast – changes happen regularly.
There are some good tips here, so check them out.
The major problem of the Lambda architecture is that we have to build two separate pipelines, which can be very complex, and, ultimately, difficult to combine the processing of batch and real-time data, however, it is now possible to overcome such limitation if we have the possibility to change our approach.
Databricks Delta delivers a powerful transactional storage layer by harnessing the power of Apache Spark and Databricks File System (DBFS). It is a single data management tool that combines the scale of a data lake, the reliability and performance of a data warehouse, and the low latency of streaming in a single system. The core abstraction of Databricks Delta is an optimized Spark table that stores data as parquet files in DBFS and maintains a transaction log that tracks changes to the table.
It’s an interesting contrast and I recommend reading the whole thing.
When creating a new PowerApp using the Power BI integration, you get an additional data source – PowerBIIntegration that serves as the connection to the Power BI report. Whenever a filtering action occurs in the Power BI report, this information is available in this property.
During the PowerApps creation action I selected the action to add a new form which in the next step needs to get a connection to the Article table (which holds the additional article details).
Check out the entire series too.
Cloud Shell is a lightweight way to run scripts using either Bash or PowerShell. You can run scripts in a browser using the Azure portal or shell.azure.com, with the Azure mobile app, or using the VS Code Azure Account extension. If you have seen the “Try it now” links in Azure documentation pages, that will direct you to use Cloud Shell.
The rest of this post focuses on using PowerShell with Cloud Shell.
Click through for the demo.
As we’ve discussed many times, the performance of the storage layer has an outsized impact on the total cost of ownership (TCO) for your complete analytics pipeline. This is due to the fact that every percentage point improvement in storage performance results in that same percentage reduction in the requirement for the very expensive compute layer. Given that the disaggregated storage model allows us to scale compute and storage independently, that percentage reduction in compute requirement results in almost the same (compute typically equates to 90 percent of the TCO) reduction in TCO.
So, when I say that ADLS Gen2 provides performance improvements ranging from 10-50 percent, depending on the nature of the workload over existing storage solutions, this equates to VERY significant reductions in the monthly analytics spend. It also has the added benefit of providing your insights sooner!
Check out all of the changes.
Google is doing more SQL, or at least shifting towards relational SQL databases as a way of storing data. At least, some of their engineers see this as a better way to store data for many problems. Since I’m a relational database advocate, I found this to be interesting.
When Google first started to publish information on BigTable and other new ways of dealing with large amounts of data, I felt that these weren’t solutions I’d use or problems that many people had. The idea of Map Reduce is interesting and certainly applicable to the problem space Google had of a global database of sites, but that’s not a problem I’ve ever encountered. Instead, most of the struggles I’ve had with relational systems are still better addressed in a relational system.
Read the whole thing. Note that this is slightly different than Feasel’s Law, as Steve is focusing more on the consistency side of things rather than the interface.
Also, just going to leave this here:
In this post, we’ll discuss how to prevent or mitigate compromise of credentials due to certain classes of vulnerabilities such as Server Side Request Forgery (SSRF) and XML External Entity (XXE) injection. If an attacker has remote code execution (RCE) or local presence on the AWS server, these methods discussed will not prevent compromise. For more information on how the AWS services mentioned work, see the Background section at the end of this post.
Protecting Your Credentials
There are many ways that you can protect your AWS temporary credentials. The two methods covered here are:
Enforcing where API calls are allowed to originate from.
Protecting the EC2 Metadata service so that credentials cannot be retrieved via a vulnerability in an application such as Server Side Request Forgery (SSRF).
Read the whole thing if you’re an AWS user.
We are happy to announce General availability of Business Critical tier in Azure SQL Managed Instance – architectural model built for high-performance and IO demanding databases.
Azure SQL Managed Instance Business Critical tier is built for high performance databases and applications that require low IO latency of 1-2ms in average with up to 100K IOPS that can be achieved using fast local SSD that this tier uses to place database files.
Click through to see what Business Critical tier in particular has to offer.
For me, the more people use Apache Kafka, the more business I get. As I teach Apache Kafka online on Udemy (links at https://kafka-tutorials.com/), the prospect of having an entire user base from AWS wanting to learn Apache Kafka is exciting! And as an Apache Kafka consultant, it’s always more fun to spend time deploying data pipelines than deploying infrastructure.
Unfortunately what AWS released today misses the mark. I think it’s reminiscent of managed services of open source software in AWS overall: they’re released early and lack features that I think should be MVP. In my opinion this will deter future users.
Based on Stephane’s reading, this is a product which should have sat in development for another 3-6 months to flesh out the features, upgrade the version of Kafka used, etc. Definitely read this before jumping on AWS MSK.