If you’ve followed all the steps above, you should see your server running. Enter the address printed in the terminal in your browser's address bar to connect to the application. You can try other SMS examples to see the outcome.

With that, we come to the end of this tutorial. I’m sure you can already think of many possibilities and use cases for this new knowledge.
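The supporting predict.html file in the templates directory has the page title "Prediction", a header comment giving the author (Aboze Brain John Jnr) and the project (SMS spam detection system using Machine Learning, Python, Flask and Vonage API), and a "Machine Learning Prediction" heading.

At the end of the Python file, add the code that starts a local server. In Flask this is typically a call to the application object's run() method under an if __name__ == '__main__': guard.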
The spam dataset for this project can be downloaded here. The dataset contains 5,574 messages, each labeled as spam or ham (legitimate). More about the dataset can be found here. With this data, we will train a machine learning model that can correctly classify SMS messages as ham or spam. These procedures will be carried out in a Jupyter notebook, which in our file directory is named 'project_notebook'.

Exploratory Data Analysis (EDA)

Here, we will apply a variety of techniques to analyze the data and get a better understanding of it. The necessary libraries for this project are imported at the top of project_notebook.ipynb.

Before modeling, the message text is cleaned: email addresses are replaced with 'emailaddress', URLs are replaced with 'webaddress', and stop words are removed by splitting each message into terms and keeping only those not in stop_words. The cleaned text is then transformed with a TF-IDF vectorizer (tfidf_model). In the Flask application, the model's predict method is called on tfidf_data, and the result is passed to the template with render_template('predict.html', prediction=my_prediction).
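The cleaning steps (replacing email addresses with 'emailaddress', URLs with 'webaddress', and dropping stop words) could be sketched roughly as follows. This is only an illustration under stated assumptions: the regular expressions, the tiny `STOP_WORDS` set, and the `clean_message` helper are hypothetical stand-ins, not the tutorial's own code, which applies its patterns to a DataFrame column and takes its stop words from nltk.

```python
import re

# Hypothetical patterns: the tutorial's own regular expressions are not shown,
# so these simple stand-ins only catch common cases.
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")

# Tiny illustrative stop-word set (an assumption); nltk's English list is far larger.
STOP_WORDS = {"a", "an", "the", "to", "is", "at", "for", "and", "or", "your", "you"}


def clean_message(text: str) -> str:
    """Replace emails and URLs with placeholder tokens, then drop stop words."""
    # Replace email addresses with 'emailaddress'
    text = EMAIL_PATTERN.sub("emailaddress", text)
    # Replace URLs with 'webaddress'
    text = URL_PATTERN.sub("webaddress", text)
    # Keep only the terms that are not stop words
    terms = [term for term in text.split() if term not in STOP_WORDS]
    return " ".join(terms)


sample = "Claim your prize at http://win.example.com or mail bob@example.com"
print(clean_message(sample))
```

Replacing addresses and links with fixed tokens lets the model learn that "a message contains a link" matters without memorizing each individual URL.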
This opens the popular JupyterLab interface in your web browser, where you are going to carry out some interactive data exploration and model building.

Build and Train the SMS Detection Model

Now that your environment is ready, you’re going to download the SMS training data and build a simple machine learning model to classify the SMS messages.
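As a first step after downloading the data, a minimal sketch of loading the labeled messages and counting each class might look like this. It assumes a tab-separated file with the label first and the message second; that layout, the in-memory `SAMPLE` text, and the helper names `load_messages` and `label_counts` are assumptions for illustration, so adjust the delimiter and column order to match your copy of the dataset.

```python
import csv
import io
from collections import Counter

# Stand-in for the downloaded file (assumed layout: label<TAB>message per line).
SAMPLE = (
    "ham\tSee you at lunch?\n"
    "spam\tWIN a free prize now!\n"
    "ham\tRunning late, sorry.\n"
)


def load_messages(handle):
    """Read (label, message) pairs from a tab-separated text stream."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def label_counts(messages):
    """Count how many messages carry each label."""
    return Counter(label for label, _ in messages)


# With a real file, pass open(path, encoding="utf-8") instead of the StringIO.
messages = load_messages(io.StringIO(SAMPLE))
print(label_counts(messages))
```

Checking the class balance early is useful because spam datasets tend to contain far more ham than spam, which affects how the model's accuracy should be read.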
Here are some details about these packages:

- jupyterlab is for model building and data exploration.
- flask is for creating the application server and pages.
- lightgbm provides the machine learning algorithm used to build our model.
- nexmo is a Python library for interacting with your Vonage account.
- matplotlib, plotly, and plotly-express are for data visualization.
- python-dotenv is a package for managing environment variables such as API keys and other configuration values.
- nltk is for natural language operations.
- pandas is for manipulating and wrangling structured data.
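Once the packages are installed, a quick way to confirm that they are all importable is a small check like the one below. It is a sketch rather than part of the tutorial: the `IMPORT_NAMES` mapping and the `missing_packages` helper are hypothetical names, and the mapping assumes python-dotenv is imported as `dotenv` and plotly-express as `plotly_express`.

```python
from importlib.util import find_spec

# Install name -> import name for each package used in this project.
IMPORT_NAMES = {
    "jupyterlab": "jupyterlab",
    "flask": "flask",
    "lightgbm": "lightgbm",
    "nexmo": "nexmo",
    "matplotlib": "matplotlib",
    "plotly": "plotly",
    "plotly-express": "plotly_express",
    "python-dotenv": "dotenv",
    "nltk": "nltk",
    "pandas": "pandas",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the install names whose modules cannot be found, sorted."""
    return sorted(
        package
        for package, module in import_names.items()
        if find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Running this inside the project's virtual environment shows at a glance whether anything still needs to be installed before opening the notebook.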