Non-English Natural Language Processing

The folks at BNOSAC have announced a new natural language processing toolkit for R:

BNOSAC is happy to announce the release of the udpipe R package (https://bnosac.github.io/udpipe/en) which is a Natural Language Processing toolkit that provides language-agnostic ‘tokenization’, ‘parts of speech tagging’, ‘lemmatization’, ‘morphological feature tagging’ and ‘dependency parsing’ of raw text. Next to text parsing, the package also allows you to train annotation models based on data of ‘treebanks’ in ‘CoNLL-U’ format as provided at http://universaldependencies.org/format.html.

The package provides direct access to language models trained on more than 50 languages.

Click through to check it out.

Related Posts

Loops Versus Apply: Speed Comparison

Mike Spencer compares lapply (single core and its multi-core version) versus a for loop in R: But how fast were they? Can we get faster? Thankfully R provides `system.time()` for timing code execution. In order to get faster, it makes sense to use all the processing power our machines have. The ‘parallel’ library has some […]

Read More

Quoted Concatenation In R

John Mount has a quick tip for R users: Here is an R tip. Need to quote a lot of names at once? Use qc(). This function is part of wrapr.

Read More

Categories

January 2018
MTWTFSS
« Dec Feb »
1234567
891011121314
15161718192021
22232425262728
293031