You might have come across innumerable claims and statements involving numbers, especially in marketing campaigns and ads. “9 out of 10 doctors recommend Colgate toothpaste”, or “Dettol kills 99.9% of bacteria” are classic examples of numerical claims. The statistical validity of such statements regarding a parameter can be tested if…
Generate word cloud from top results for a Google search query
When you search for something on Google, millions of results get thrown at you, of which you are likely to go through only the top few relevant ones. What if you could get a snapshot of what has been written in the top results for a search query in the form of a word cloud…
Python modules containing built-in datasets and ways to access them
Built-in datasets prove very useful when you are practicing ML algorithms and need some random, yet sensible, data to apply the techniques to and get your hands dirty. Many modules in Python house some…
Text cleaning as part of preprocessing for Text Analytics
Removing punctuation is a common step in cleaning text data before performing text analytics. Python offers numerous ways to deal with punctuation. Below is a simple implementation using the ‘re’ and ‘string’ modules.
import re
import string
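The punctuation attribute of the ‘string’ module serves as the reference list of punctuation characters to look for in the text data. The characters are wrapped in square brackets to form a regular-expression character class, and the sub (substitute) function from ‘re’ then replaces every match in the target string with an empty string. Passing the characters through re.escape first keeps the backslash, ‘]’, ‘^’ and ‘-’ from being read as regex syntax; without it, the backslash in string.punctuation escapes the ‘]’ that follows it and is never removed itself.
s = "A@p,p!!le#"
punctuation = '[' + re.escape(string.punctuation) + ']'
re.sub(punctuation, '', s)
The output of the last line above is:
'Apple'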
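Since Python offers several ways to deal with punctuation, here is a minimal sketch of one alternative that avoids regular expressions altogether, using str.maketrans and str.translate. The helper name strip_punctuation is hypothetical and chosen only for illustration. Note that string.punctuation covers ASCII punctuation only, so characters such as curly quotes are left untouched by this approach.

```python
import string

# Translation table mapping every ASCII punctuation character to None,
# which tells str.translate to delete it.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(text: str) -> str:
    """Return text with all ASCII punctuation characters removed.

    Hypothetical helper for illustration; not part of any library.
    """
    return text.translate(PUNCT_TABLE)


print(strip_punctuation("A@p,p!!le#"))                 # Apple
print(strip_punctuation(r"back\slash [and] brackets"))  # backslash and brackets
```

Because no pattern is compiled, there is nothing to escape here, and the backslash is removed along with the rest of the punctuation.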
Thank you! Stay tuned for more interesting things you can do with Python!
Anjana K V
Data Science Professional | 7+ years of experience in data science & analytics across various domains — retail, insurance, finance and digital marketing