Introduction to NLP Course

A free, hands-on course that describes and implements (in Python) several Natural Language Processing (NLP) algorithms and techniques, using several modern platforms and libraries.

Although it is not intended to have the formal rigor of a book, we have tried to be as faithful as possible to the original algorithms and methods, adding variants only when they were necessary for didactic purposes.

Quick Start

The best way to get the most out of this course is to read each selected problem carefully, think of a possible solution (language-independent), and then study the proposed Python code and try to reproduce it in your favorite IDE. If you already know Python, you can go straight to programming your own solution and then compare it with the one proposed in the course.

If you want to play with these notebooks online without having to install any library or configure hardware, you can use the following service:

What is NLP?

This is a Natural Language Processing project built with Python frameworks. NLP is a discipline at the intersection of computer science, artificial intelligence, and cognitive logic, whose objective is for machines to read and understand our language for decision making.

Contents

1. NLP with spaCy
2. Semantic Enrichment of Entities
3. Spell Checker/Corrector
4. Word Embedding with Gensim
5. Relationship between Words
6. Introduction to Stanza (Stanford CoreNLP)

Data

Books in plain text, in both English and Spanish. The entities are enriched with data from DBpedia.
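
As a rough illustration of what enriching an entity from DBpedia can look like, the sketch below builds a SPARQL query that looks up the abstract of a resource by its label. It is not the course's actual implementation: the function names `build_abstract_query` and `fetch_abstract`, the choice of `dbo:abstract` as the enrichment property, and the public endpoint URL are assumptions made for this example.

```python
def build_abstract_query(entity_label, lang="en"):
    """Build a SPARQL query for the DBpedia abstract of a labelled resource.

    Hypothetical helper for illustration; the property (dbo:abstract)
    is an assumed choice, not necessarily what the course notebooks use.
    """
    # Escape backslashes and double quotes so the label is a valid literal.
    escaped = entity_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?resource ?abstract WHERE {\n"
        f'  ?resource rdfs:label "{escaped}"@{lang} ;\n'
        "            dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}\n"
        "LIMIT 1"
    )


def fetch_abstract(entity_label, lang="en", endpoint="https://dbpedia.org/sparql"):
    """Run the query against a SPARQL endpoint (requires network access).

    Returns the abstract text, or None when nothing matches.
    """
    # Imported lazily so the query builder works without SPARQLWrapper installed.
    from SPARQLWrapper import SPARQLWrapper, JSON

    client = SPARQLWrapper(endpoint)
    client.setQuery(build_abstract_query(entity_label, lang))
    client.setReturnFormat(JSON)
    bindings = client.query().convert()["results"]["bindings"]
    return bindings[0]["abstract"]["value"] if bindings else None


if __name__ == "__main__":
    print(build_abstract_query("Don Quixote"))
```

Calling `fetch_abstract("Don Quixote")` would send the query to the endpoint. An exact label match is a simplification, since entities extracted from text often need normalization before they match a DBpedia label.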

Python Dependencies

    conda install -c conda-forge spacy
    python -m spacy download en_core_web_sm
    python -m spacy download es_core_news_sm
    conda install -c conda-forge sparqlwrapper
    pip install pyspellchecker
    conda install -c anaconda gensim
    conda install -c conda-forge wordcloud
    conda install -c conda-forge stanza
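
After running the commands above, a quick way to confirm that the environment is ready is to check that each library can be located by Python. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the mapping from install names to import names (for example, `pyspellchecker` is imported as `spellchecker`) is stated here as an assumption worth verifying in your own environment.

```python
from importlib.util import find_spec

# Import names for the course dependencies. Some differ from the install
# names (assumed mapping): pyspellchecker -> spellchecker,
# sparqlwrapper -> SPARQLWrapper.
COURSE_MODULES = [
    "spacy",
    "SPARQLWrapper",
    "spellchecker",
    "gensim",
    "wordcloud",
    "stanza",
]


def missing_modules(names=COURSE_MODULES):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All course dependencies were found.")
```

This only checks that the packages are installed. It does not verify that the spaCy models (`en_core_web_sm`, `es_core_news_sm`) have been downloaded.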

Software Version

Contributing and Feedback

Any kind of feedback or suggestion is greatly appreciated (algorithm design, documentation, improvement ideas, spelling mistakes, etc.). If you want to contribute to the course, you can do so through a pull request (PR).

Author

License

This project is licensed under the terms of the MIT license.

Acknowledgments

I would like to thank Project Gutenberg for sharing the books in English and Peter Norvig for the spell checker algorithm.