Non-English Natural Language Processing

The folks at BNOSAC have announced a new natural language processing toolkit for R:

BNOSAC is happy to announce the release of the udpipe R package (https://bnosac.github.io/udpipe/en) which is a Natural Language Processing toolkit that provides language-agnostic ‘tokenization’, ‘parts of speech tagging’, ‘lemmatization’, ‘morphological feature tagging’ and ‘dependency parsing’ of raw text. Next to text parsing, the package also allows you to train annotation models based on data of ‘treebanks’ in ‘CoNLL-U’ format as provided at http://universaldependencies.org/format.html.

The package provides direct access to language models trained on more than 50 languages.

Click through to check it out.

Related Posts

wrapr 1.5.0 Now On CRAN

John Mount announces wrapr 1.5.0: wrapr includes a lot of tools for writing better R code: let() (let block) %.>% (dot arrow pipe) build_frame() / draw_frame() ( data.frame builders and formatters ) qc() (quoting concatenate) := (named map builder) %?% (coalesce) NEW! %.|% (reduce/expand args) NEW! uniques() (safe unique() replacement) NEW! partition_tables() / execute_parallel() NEW! DebugFnW() (function debug wrappers) λ() (anonymous function builder) John also includes an example using the coalesce operator %?%.

Read More

Methods For Detecting Anomalies In Business Metrics

Sergey Bryl’ gives us four methods for detecting anomalies in business data: In this article, by  business metrics, we mean numerical indicators we regularly measure and use to track and assess the performance of a specific business process. There is a huge variety of business metrics in the industry: from conventional to unique ones. The latter […]

Read More

Categories

January 2018
MTWTFSS
« Dec Feb »
1234567
891011121314
15161718192021
22232425262728
293031