Press "Enter" to skip to content

Comparing Numeric Data Types between MySQL and PostgreSQL

Aisha Bukar compares data types:

Numeric data types serve as a foundation in a DBMS. They help represent numbers in different forms, which helps in storing numerical data and performing arithmetic operations on it.

When dealing with numeric data types in databases like MySQL and PostgreSQL, it’s important to understand key concepts like range, precision (fixed-point and floating-point numbers), and storage. This helps you understand how data is stored, processed, and retrieved.

Read on for quick descriptions of the three primary properties of numeric data types (range, precision, and storage), as well as how the two relational database platforms handle numeric data.
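
As a quick illustration of the fixed-point versus floating-point distinction (mine, not from the article), here is a sketch that should behave similarly on both platforms; CAST to FLOAT requires MySQL 8.0.17 or later, and the exact floating-point output varies by platform:

    -- Fixed-point: DECIMAL stores the value exactly, so the math is exact
    SELECT CAST(0.1 AS DECIMAL(10,2)) + CAST(0.2 AS DECIMAL(10,2));  -- 0.30
    -- Floating-point: FLOAT is approximate, so 0.1 + 0.2 picks up
    -- representation error
    SELECT CAST(0.1 AS FLOAT) + CAST(0.2 AS FLOAT);  -- roughly 0.30000000000000004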

Glyphs and Character Encodings in PostgreSQL

Christophe Pettus has a two-parter (so far). First up is an explanation of several linguistic terms:

This is part one of a series on PostgreSQL and collations, and how to use them without tears. This is an introduction to the general concepts of glyphs, character encodings, collations, and locales.

Part two covers character encodings in PostgreSQL:

From the point of view of the computer, a character string is just a sequence of bytes (maybe terminated by a zero byte, maybe with a length). If the only thing PostgreSQL had to do with character strings was store them and return them to the client, it could just ignore that character encodings even exist.

However, databases don’t just store character strings: they also compare them, build indexes on them, change them to upper case, do regex searches on them, and other things that mean they need to know what the characters are and how to manipulate them. So, PostgreSQL needs to know what character encoding applies to the text it is storing.

Read on for a detailed explanation for PostgreSQL. A lot of this also applies to SQL Server, though there are some inevitable differences that pop up.
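
If you want a small taste of why the encoding matters, here is a minimal sketch against a PostgreSQL database (the comments assume the common UTF8 server encoding):

    -- The server encoding determines how strings become bytes on disk
    SHOW server_encoding;                              -- e.g. UTF8
    -- Byte length and character length diverge for multi-byte characters
    SELECT octet_length('café') AS bytes,              -- 5 under UTF8
           char_length('café')  AS characters;         -- 4
    -- The same characters re-encoded produce different bytes
    SELECT convert_to('café', 'LATIN1');               -- 4 bytes instead of 5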

Thoughts on Blobs in PostgreSQL

Stefanie Janine Stölting shares some information on using blobs in PostgreSQL:

PostgreSQL does not have a BLOB data type as specified in the SQL standard. The nearest implementation is the data type BYTEA. Since PostgreSQL 9.0, it handles this data as hexadecimal by default.

Click through to learn more about how that data type works, some of the limitations around it, and why it’s not a smart idea to store blob-like data in PostgreSQL. The arguments are very similar to the ones against doing this in SQL Server, and the counter-arguments, where they exist, are similarly exceptional in nature.
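
To make the hexadecimal handling concrete, here is a quick sketch (assuming the default bytea_output setting of hex):

    -- Hex is both the default input and output format for BYTEA
    SELECT '\xDEADBEEF'::bytea;                    -- \xdeadbeef
    SELECT decode('DEADBEEF', 'hex');              -- the same value via decode()
    SELECT encode('\xDEADBEEF'::bytea, 'base64');  -- re-encode for transport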

Dealing with the Lack of Identity Columns in Microsoft Fabric

Nikola Ilic forges a new identity:

If you’ve ever worked with traditional relational database management systems (RDBMS) and/or data warehouses, and you’re now trying to be a “modern data platform professional” and apply your skills in Microsoft Fabric, you may find yourself in uncharted territory. Not only because of the SaaS-ification of the environment, but also due to many puzzling “solutions”, or maybe it’s better to say – lack of the features that we were taking for granted in the “previous” (pre-Fabric) life.

The goal of this article is to introduce you to different approaches for overcoming the lack of identity columns in Microsoft Fabric. Please keep in mind that all of these approaches are considered workarounds, and it may happen that Microsoft provides an out-of-the-box solution in the future.

Missing the identity column attribute can be a bit annoying when building out dimensions, so Nikola provides a few tips on how to emulate this functionality.
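
One common workaround (a sketch of the general pattern, not necessarily one of Nikola’s approaches; the table and schema names are hypothetical) is to offset ROW_NUMBER() by the current maximum surrogate key:

    -- Generate surrogate keys by hand: current max key plus a row number
    INSERT INTO dbo.DimCustomer (CustomerKey, CustomerName)
    SELECT
        COALESCE((SELECT MAX(CustomerKey) FROM dbo.DimCustomer), 0)
            + ROW_NUMBER() OVER (ORDER BY stg.CustomerName) AS CustomerKey,
        stg.CustomerName
    FROM staging.Customer AS stg;

Note that this pattern is only safe when a single process loads the dimension at a time.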

Data Types and Arithmetical Calculations in DAX

Marco Russo and Alberto Ferrari do some counting:

The VertiPaq engine is basically data type-independent. This means that it does not matter whether a column is a string, a floating point, or a date: because of the dictionary encoding happening inside VertiPaq, all these data types use around the same amount of memory and perform at nearly the same speed.

However, when mixing different data types in the same expression, DAX will likely need to perform conversions between data types. Some of these conversions are nearly free, whereas others require the intervention of the formula engine, with a related performance impact.

We have already written about possible errors occurring during data type conversion here: Understanding numeric data type conversions in DAX and here: Rounding errors with different data types in DAX. The issue with conversion errors is mostly due to the fact that the precisions of fixed decimals (also known as Currency) and decimals (also known as floating point) are different. This article starts with a focus on performance.

Read on to see what Marco and Alberto have for us this time.

Fixing Implicit Conversion without Changing Queries

Vlad Drumea solves a challenge:

Why wouldn’t you be able to change the query?

The two most common scenarios I’ve run into are:

  • the software vendor does not want to change the code
  • a legacy application that’s no longer maintained and nobody has access to the code base

Read on for Vlad’s solution to a fairly common problem. The real fix, of course, is to use NVARCHAR everywhere and not have to worry about VARCHAR to NVARCHAR conversion. The secondary fix is to get your queries right and make sure your data types are consistent.
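
To see the problem in miniature (table and column names are hypothetical), compare an NVARCHAR parameter against a VARCHAR column, and then a parameter whose type matches:

    -- NVARCHAR parameter vs. VARCHAR column: SQL Server converts the column
    -- side, which can turn an index seek into a scan
    DECLARE @Email NVARCHAR(320) = N'user@example.com';
    SELECT CustomerID FROM dbo.Customer WHERE Email = @Email;

    -- Matching types: no CONVERT_IMPLICIT on the column, and the seek is back
    DECLARE @EmailMatched VARCHAR(320) = 'user@example.com';
    SELECT CustomerID FROM dbo.Customer WHERE Email = @EmailMatched;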

Implicit Conversions in SQL Server

Vlad Drumea explains what it means implicitly to convert:

If you’re here, you most likely know what a data type conversion is, but, in short, it’s the operation of converting a value from one data type to another.

There are two types of conversions in SQL Server:

  • explicit – which are done by explicitly applying the CAST and CONVERT functions on a column, variable, or value.
  • implicit – when CAST and CONVERT are not used explicitly, but SQL Server ends up doing the conversion behind the scenes due to two distinct data types being compared.

Read on to learn more about which types of implicit conversion are relevant for performance and what you can do instead.
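
As a quick sketch of the two flavors (dbo.Orders is a hypothetical table with an integer OrderID):

    -- Explicit conversions: CAST and CONVERT are spelled out in the query
    SELECT CAST('20250101' AS DATE);
    SELECT CONVERT(INT, '42');

    -- Implicit conversion: OrderID is an INT, so SQL Server quietly converts
    -- the string literal before comparing
    SELECT OrderID FROM dbo.Orders WHERE OrderID = '42';

In this example the literal gets converted, which is harmless; the performance trouble starts when the column side has to be converted, as in the prior entry.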

Data Type Changes in Snowflake

Kevin Wilkie makes some changes:

When working with data, I usually have an idea of what type of data I will push into a field. Sometimes, for whatever reason, it is decided to change the type of data allowed in the field. Today, I want to show how that’s done in Snowflake.

Click through to learn how, and how it’s not quite the same as SQL Server.
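
As a rough sketch of the syntax difference (table and column names are hypothetical, and Snowflake only allows certain changes, such as widening a VARCHAR):

    -- Snowflake
    ALTER TABLE dim_product ALTER COLUMN description SET DATA TYPE VARCHAR(500);

    -- SQL Server equivalent
    ALTER TABLE dim_product ALTER COLUMN description VARCHAR(500);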

Choosing between Data Types

Ben Johnston shares some advice:

An entire post on choosing the correct data types either seems like overkill or much needed and overdue. The perspective might vary based on the databases you’ve worked with recently. I decided to write this after seeing some code with data type decisions that I would classify as questionable. There are many decisions in technology that can be ambiguous, but the correct data type should be based on business rules and a set of technical guidelines. I am going to share my thought process for deciphering the correct type here.

Selecting a data type is an important part of database and table design. The column represents an actual business attribute, is used to support data integrity, or is used for performance considerations. Care should be used when selecting the definition for each column. Choosing the wrong type can impact each of these areas, makes the system difficult to work with, and makes integrations harder than necessary.

Read on for Ben’s selection criteria and further thoughts. Most of it I find quite sensible.

I do, however, strongly disagree on part of Ben’s Unicode character strings recommendation, as I am an NVARCHAR Everywhere kind of guy. My counter-recommendation is always to use NVARCHAR over VARCHAR (I tend to be a bit more flexible about NCHAR vs CHAR, as those are typically for flags versus user-relevant data), and use row-level or page-level compression on indexes wherever it makes sense. If you don’t have any characters outside of your codepage in any row of that data, the size will be the same as with VARCHAR. If you do have the need for special characters, you don’t need to rebuild everything from scratch as part of a half-year (or longer) internationalization program. And if you’re consistent about always using NVARCHAR, you also eliminate implicit conversion risk.
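
For what that looks like in practice, here is a minimal sketch (hypothetical table); row and page compression include Unicode compression, so NVARCHAR data that stays within one code page takes about the same space as VARCHAR would:

    -- NVARCHAR columns with row compression enabled at creation time
    CREATE TABLE dbo.Contact
    (
        ContactID   INT           NOT NULL,
        ContactName NVARCHAR(200) NOT NULL,
        CONSTRAINT PK_Contact PRIMARY KEY CLUSTERED (ContactID)
    )
    WITH (DATA_COMPRESSION = ROW);

    -- Or rebuild an existing index with compression instead
    ALTER INDEX PK_Contact ON dbo.Contact REBUILD
        WITH (DATA_COMPRESSION = PAGE);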
