Generating Realistic-Looking Data With Markov Chains

Phil Factor shows how to use Markov chain generation in T-SQL to generate realistic-looking country names:

How did we do this? We started with a table that took each word, added two spaces at the beginning and a |, followed by two subsequent spaces, at the end. This allowed us to map the frequency of each three-letter combination in a collection of words. Any language is made up of common combinations of characters with a few wild exceptions. For words to look right, they must follow this distribution. This distribution will change in various parts of a word, so you need all this information.

So what would happen if, instead of feeding the name of countries into the batch, we do the names of people?

My favorite name from the list was Kuwatian Samoa.

Related Posts

Power BI And Many-To-Many Joins

Teo Lachev notes a big improvement to Power BI’s data modeling story: The July 2018 preview of Power BI Desktop delivers two killer preview features that solidify the Power BI position as the best data modeling tool on the market. First, Microsoft relaxes the Power BI relationship limitations by letting you create M:M joins between two tables. […]

Read More

Closure Tables: Graph Data In Relational Form

Phil Factor shows how to use the concept of closure tables to represent graph-style data in a relational database: Closure tables are plain ordinary relational tables that are designed to work easily with relational operations. It is true that useful extensions are provided for SQL Server to deal with hierarchies. The HIERARCHYID data type and the common […]

Read More

Categories

July 2018
MTWTFSS
« Jun Aug »
 1
2345678
9101112131415
16171819202122
23242526272829
3031