If you’ve ever wanted to know what a document or piece of text is about without reading the entire thing, you’ll be glad to know you can do so using keywords. Keywords, in this context, are words or short phrases that concisely describe the contents of a larger text. This post explains how Rapid Automatic Keyword Extraction (RAKE), a relatively new approach to automatically generating keywords from a given document, works.
(Level: Intermediate)
We’re back! In the previous regex tutorial, we covered character classes and anchors in some detail. We also explained the use of raw strings when defining regular expressions in Python. Today’s post discusses quantifiers in detail and introduces the ideas of alternation and grouping, which are explained by building our own URL regex. It also has several regex challenges based on the concepts covered so far that will test your regex-building skills. Let’s begin.
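As a rough illustration of how quantifiers, alternation, and grouping fit together, here is a minimal sketch of a URL pattern in Python. It is a deliberately simplified, hypothetical pattern (the names `URL_PATTERN` and `find_urls` are ours, not taken from the post) and it will not match every valid URL.

```python
import re

# Hypothetical, simplified URL pattern for illustration only.
# Adjacent raw strings are concatenated into one pattern.
URL_PATTERN = re.compile(
    r"(https?|ftp)://"      # grouping + alternation: http, https, or ftp
    r"([A-Za-z0-9-]+\.)+"   # quantifier +: one or more dot-terminated labels
    r"[A-Za-z]{2,}"         # quantifier {2,}: a final label of 2+ letters
    r"(/\S*)?"              # quantifier ?: an optional path
)


def find_urls(text):
    """Return every substring of text that matches URL_PATTERN."""
    return [match.group(0) for match in URL_PATTERN.finditer(text)]


print(find_urls("See https://example.com/docs and ftp://files.example.org now"))
```

Note that `finditer` is used rather than `findall`, because `findall` returns only the captured groups when a pattern contains them.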
(Level: Beginner to Intermediate)
This week’s post is about stemming. It’s a little different from our previous articles because we’ll discuss some English grammar before getting into the technicalities and coding. But even before that, try out this quick experiment (10 seconds, max) and you’ll probably immediately understand what stemming is all about.
(Level: Beginner)
“The beginning of wisdom is to call things by their proper name.” – Confucius
Hello there, we apologize for the delay in publishing this article. The last two weeks have been pretty hectic.
Now that you are equipped with the basics of text processing, it is time to move on to some NLP-specific concepts. This week’s article is about Named Entities, as the title suggests. You will understand what they are, why they are important, and how to identify them.
(Level: Beginner)
In the first regex post, we discussed the concept of regular expressions and some of their applications, and wrote a small program to extract years from a text. We also looked at some important character classes like uppercase letters, word characters, digits and whitespace. In this regex tutorial, we will learn in greater depth about character classes and anchors.
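To make the idea of character classes and anchors concrete, here is a small sketch in Python that revisits the year-extraction task. The patterns and function names are our own assumptions for illustration, not the code from the earlier post, and they treat any four-digit number starting with 1 or 2 as a year.

```python
import re

# Character classes: [12] matches a 1 or a 2, \d matches a digit.
# \b is a word-boundary anchor, so the digits must stand alone.
YEAR_IN_TEXT = re.compile(r"\b[12]\d\d\d\b")

# ^ and $ anchor the pattern to the start and end of the string.
ONLY_A_YEAR = re.compile(r"^[12]\d\d\d$")


def extract_years(text):
    """Return all standalone four-digit years found in text."""
    return YEAR_IN_TEXT.findall(text)


def is_year(candidate):
    """Return True if the string consists of a four-digit year."""
    return ONLY_A_YEAR.match(candidate) is not None


print(extract_years("Born in 1990, moved in 2005; room 12345."))
print(is_year("1999"), is_year("year 1999"))
```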
(Level: Beginner)
“Constantly talking isn’t necessarily communicating.” – Charlie Kaufman
So far, we have covered the basics of regular expressions and tokenization. It must be evident by now how simple, yet fundamental, these concepts are. Today’s lesson covers another concept that is essential to almost any NLP task: stopword filtering. You will understand what stopwords are, why we need to filter them, and how to remove them.
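As a minimal sketch of the idea, stopword filtering can be as simple as dropping every token that appears in a fixed set. The tiny `STOPWORDS` set and the `remove_stopwords` helper below are hypothetical examples; real stopword lists are much longer and depend on the task.

```python
# A tiny, illustrative stopword set (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on"}


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Return the tokens that are not stopwords, ignoring case."""
    return [token for token in tokens if token.lower() not in stopwords]


tokens = "The cat is on the mat".split()
print(remove_stopwords(tokens))
```

Using a set rather than a list keeps each membership check fast, which matters once the stopword list and the text grow.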
(Level: Beginner)
“A computer is only as smart as its programmer.” – Unknown
Last week’s tutorial covered the basics of regular expressions, or regexes, along with some sample code for your understanding. This week is about tokenization. Sounds fancy? It is easy to understand, yet extremely powerful. By the end of this tutorial, you’ll understand what it is, why you will need it, and how you can build your own tokenizer.
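For a feel of what a home-made tokenizer might look like, here is a minimal regex-based sketch in Python. It is an assumption about one reasonable design, not the tokenizer built in the tutorial: words (optionally with a straight-apostrophe contraction) become tokens, and each punctuation mark becomes its own token.

```python
import re

# \w+(?:'\w+)?  a word, optionally followed by an apostrophe part ("don't")
# [^\w\s]       any single character that is neither a word char nor whitespace
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)?|[^\w\s]")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Don't panic, it's easy!"))
```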
(Level: Beginner)
This is our first post in which we’ll really get our hands dirty with some coding. Today’s concept is an extremely useful one: regular expressions, or regex. Once you get started with regex, there’s no turning back. It is an extremely powerful tool that can be used for things like batch-renaming files, checking whether a given bit of text is a valid phone number, scraping useful information from a webpage, correcting a mistake you made repeatedly in a file (or tens, hundreds, even thousands of files at a time), and MUCH more.
Greetings!
Most journeys have a destination. Having one helps us plan the journey in advance, so that we get the best out of the experience. That’s what today’s post is about: what we plan on doing over the course of the next few months, how we plan to do it, and what we’ll need to get there. Exploring new places is an exciting activity. Traveling helps open your mind to so many different things that you may not have observed previously. More importantly, it helps you understand a great deal about yourself. If you haven’t tried it already, go ahead and explore a new place as soon as you can. You might have often heard people say that the journey itself is more important than the destination. We strongly believe in that idea, and that’s why we hope to make the journey an exquisite experience for you. Remember, we’re in this together.
This blog is run by two curious people who have always had a fascination with computer science. Over the years, we have explored diverse areas ranging from game development to artificial intelligence, always seeking to expand our knowledge and never being satisfied with incomplete explanations, boring textbooks and uninspired teachers. Learning is supposed to be an enjoyable activity, and that is what this blog aims to achieve.