Using FlashText Instead Of RegEx

Kevin Feasel

2018-06-21

Tools

Leona Zhang compares the FlashText Python library to using regular expressions:

If you have done any text/data analysis, you might already be familiar with Regular Expressions (RegEx). RegEx evolved as a necessary tool for text editing. If you are still using RegEx to deal with text processing, then you may have some problems to deal with. Why? When it comes to large-sized texts, the low efficiency of RegEx can make data analysis unacceptably slow.

In this article, we will discuss how you can use FlashText, a Python library that is 100 times faster than RegEx to perform data analysis.

Learn more on the GitHub repo. I haven’t used this before, but I could see it being handy.
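For a rough sense of what the comparison looks like in practice, here is a minimal sketch that does the same keyword replacement both ways. It assumes the `flashtext` package is installed (`pip install flashtext`) and uses its `KeywordProcessor` class. The helper names `replace_with_regex` and `replace_with_flashtext`, along with the sample keywords, are made up for illustration and do not come from the linked article. It makes no attempt to measure speed.

```python
import re

try:
    # Third-party package; assumed to be installed via `pip install flashtext`.
    from flashtext import KeywordProcessor
except ImportError:
    KeywordProcessor = None


def replace_with_regex(text, mapping):
    """Replace each keyword in `mapping` using one compiled alternation."""
    # Longest keywords first so a longer phrase wins over a shorter prefix.
    keywords = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in mapping.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


def replace_with_flashtext(text, mapping):
    """Replace each keyword in `mapping` using FlashText's KeywordProcessor."""
    if KeywordProcessor is None:
        raise RuntimeError("flashtext is not installed")
    processor = KeywordProcessor()  # case-insensitive unless told otherwise
    for keyword, replacement in mapping.items():
        processor.add_keyword(keyword, replacement)
    return processor.replace_keywords(text)


if __name__ == "__main__":
    # Hypothetical sample data, purely for illustration.
    sample_mapping = {"Big Apple": "New York", "Bay Area": "San Francisco"}
    sample_text = "I love the Big Apple and the Bay Area."
    print(replace_with_regex(sample_text, sample_mapping))
    if KeywordProcessor is not None:
        print(replace_with_flashtext(sample_text, sample_mapping))
```

With only two keywords, any difference between the approaches would be negligible. The quoted article's speed claim concerns large texts, so timing both functions on your own data is the only way to know whether the switch is worthwhile for you.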

