In this article, we will explore the process of building a Sales Forecasting App using Streamlit. Streamlit is a powerful tool for building interactive, data-driven web applications with Python. We will walk through the steps involved in integrating a sales forecasting model with Streamlit to create an interactive, user-friendly application.
1.1 Why Sales Forecasting?
Sales forecasting is a critical component of any business strategy. Accurate sales forecasts enable companies to make informed decisions about inventory, staffing, and other key resources. Forecasts also help businesses identify potential challenges and opportunities in their sales pipeline so they can adjust their strategy accordingly. However, sales forecasting can be a complex and time-consuming process. This is where machine learning (ML) can help: by leveraging ML algorithms, businesses can develop more accurate and efficient sales forecasting models.
1.2 Introduction to Streamlit
1.2.1 Overview of Streamlit as an ML app-building tool.
Streamlit is an open-source platform for building ML applications quickly and easily. With Streamlit, developers can create interactive data visualizations, user interfaces, and machine learning models in a matter of minutes. Streamlit provides a simple and intuitive way to build ML apps, without the need for complex code or specialized expertise. Streamlit is written in Python and integrates easily with popular ML libraries such as TensorFlow, PyTorch, and Scikit-Learn.
1.2.2 Streamlit capabilities and features.
Streamlit provides a range of capabilities and features that make it an ideal platform for building ML apps, including:
- Easy-to-use interface: Streamlit’s simple and intuitive interface makes it easy for developers to build ML apps quickly and efficiently.
- Built-in components: Streamlit provides a range of pre-built components such as sliders, dropdowns, and buttons that can be used to create interactive user interfaces.
- Data visualization: Streamlit makes it easy to create interactive data visualizations, allowing users to explore data in new and meaningful ways.
- Integration with popular ML libraries: Streamlit integrates easily with popular ML libraries such as TensorFlow, PyTorch, and Scikit-Learn, allowing developers to create powerful ML models with minimal effort.
- Deployment: Streamlit provides a simple and efficient way to deploy ML apps, making it easy to share and collaborate with others.
1.2.3 Benefits of using Streamlit for building ML apps
There are many benefits to using Streamlit for building ML apps, including:
- Faster development: Streamlit’s simple and intuitive interface allows developers to build ML apps quickly and efficiently, reducing development time and costs.
- Improved collaboration: Streamlit makes it easy to share and collaborate on ML apps, allowing teams to work together more effectively.
- Enhanced user experience: Streamlit’s interactive data visualizations and user interfaces create a more engaging and intuitive user experience.
- Better decision-making: By putting ML-powered sales forecasts behind an accessible interface, Streamlit apps help businesses act on model predictions and make more informed decisions.
2. Preparing the Model for Deployment
2.1 Overview of the existing sales forecasting model
The model that will be embedded in this app is a regression model used for forecasting sales. The model utilizes 12 features to make a prediction.
2.2 Converting the sales forecasting model, pipeline and other features to a format that can be deployed with Streamlit
Once the model was trained and fine-tuned, the best version was chosen based on its root mean squared logarithmic error (RMSLE). This means this particular model made more accurate predictions on the target variable than the other candidates. The RMSLE measures the difference between the predicted and actual values on a logarithmic scale, so a lower RMSLE indicates that the predicted values were, on average, closer to the actual values. The final model, along with its pipeline and other associated components, was saved as pickle files, which are later used to build the app.
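To make the metric concrete, RMSLE can be computed with scikit-learn; a minimal sketch using made-up actual and predicted sales values:

```python
import numpy as np
from sklearn.metrics import mean_squared_log_error

# Made-up actual and predicted sales values, for illustration only
y_true = np.array([120.0, 340.0, 560.0, 80.0])
y_pred = np.array([100.0, 360.0, 500.0, 90.0])

# RMSLE is the square root of the mean squared logarithmic error;
# lower values mean predictions are closer to the actuals on a log scale
rmsle = np.sqrt(mean_squared_log_error(y_true, y_pred))
print(f"RMSLE: {rmsle:.4f}")
```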
Dictionaries were created to hold the components of interest:
# Dictionaries holding the encoders, pipelines, and model
ml_components_1 = {}
ml_components_2 = {}
ml_components_1['family'] = family
ml_components_1['Holiday_city'] = Holiday_city
ml_components_1['Store_city'] = Store_city
ml_components_1['Store_state'] = Store_state
ml_components_1['Store_type'] = Store_type
ml_components_1['Cluster'] = Cluster
ml_components_1['Holiday_level'] = Holiday_level
ml_components_1['Type_of_day'] = Type_of_day
ml_components_1['num_cols'] = numerical_attributes
ml_components_1['cat_cols'] = categorical_attributes
ml_components_1['columns'] = train_copy_.columns
ml_components_2['numerical_pipeline'] = num_pipeline
ml_components_2['categorical_pipeline'] = cat_pipeline
ml_components_2['model'] = dec_reg
Saved the dictionaries to pickle files:
# saving files
import pickle

filename = 'ml_components_1.pkl'
pickle.dump(ml_components_1, open(filename, 'wb'))
filename = 'ml_components_2.pkl'
pickle.dump(ml_components_2, open(filename, 'wb'))
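A quick round-trip check, with a stand-in dictionary, confirms that saved components can be restored intact:

```python
import pickle

# Stand-in components dictionary (the real ones hold encoders, pipelines, and the model)
ml_components_demo = {'num_cols': ['onpromotion', 'oil_price'],
                      'cat_cols': ['family', 'Store_city']}

# Save to a pickle file, then load it back
with open('ml_components_demo.pkl', 'wb') as f:
    pickle.dump(ml_components_demo, f)

with open('ml_components_demo.pkl', 'rb') as f:
    restored = pickle.load(f)

print(restored == ml_components_demo)  # True: the dictionary round-trips unchanged
```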
2.3 Exporting model/app dependencies and requirements.
Certain modules and frameworks were employed in the creation of the regression model. To ensure smooth integration of the model into the app and avoid any version conflicts between these modules and frameworks, the identical version of these modules and frameworks must be installed within the same environment. In order to obtain the precise names and versions of these frameworks and modules, the session_info module was utilized to extract this information from our Jupyter notebook. Similarly, pip freeze can also be used to achieve the same purpose.
First let’s install session_info.
pip install session_info
Secondly let’s import and show the modules/frameworks.
# import and use session_info
import session_info
session_info.show()
The session_info.show() command generates a list of module names and versions, which can be easily copied, edited, and saved as a requirements.txt file. This file will be subsequently utilized to install the identical versions of these modules and frameworks into the environment needed to build the app.
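The pip freeze alternative mentioned above can be run directly from the activated environment:

```shell
# Capture the exact versions of every installed package into a requirements file
pip freeze > requirements.txt
```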
3. Building the App
In this section, we will discuss the process of building the sales forecasting app using Python and Streamlit. We will cover the necessary steps, including setting up the environment, designing the interface, and embedding the trained model. By the end of this section, you will have a solid understanding of how to build a functional, user-friendly app for sales forecasting.
3.1 Setting up an Environment
Building machine learning applications with Streamlit requires creating a Python environment as a crucial step. The Python environment functions as a self-contained space where specific Python packages and dependencies can be installed without affecting the global Python installation on the system. This helps to ensure that the application can access the required libraries and dependencies with the correct versions.
To create the Python environment, the venv module in Python was used to create an environment named streamlit_venv. Before creating the environment, it is necessary to navigate to the folder that contains the application in the terminal. This allows the user to specify the exact location for the environment to be created and ensures that the environment is set up correctly.
To create a Python environment on Windows/macOS/Linux:
python -m venv streamlit_venv
To activate the environment on Windows:
streamlit_venv\Scripts\activate
To activate the environment on Linux or macOS:
source streamlit_venv/bin/activate
3.2 Installing dependencies and requirements
Once the Python environment has been successfully set up and activated, the next step is to install the requirements and dependencies for the application. This ensures that the model has access to the exact dependencies and requirements that were used during its training phase.
Installing these dependencies and requirements is a critical step in building a reliable and reproducible machine learning application. With the same dependencies and requirements installed, the application will produce consistent and accurate results, even when used on different systems or environments. Once they are installed, the application can be run within the Python environment with confidence that it will perform as intended. First, we install Streamlit using pip.
pip install streamlit
Secondly, let's install the requirements and dependencies saved in the requirements.txt file.
pip install -r requirements.txt
3.3 Creating the App Interface
Creating the app interface with Streamlit involves writing Python code that utilizes Streamlit’s features such as columns, select boxes, sliders, and buttons. The interface is designed to allow the user to input data for sales forecasting. The resulting interface is user-friendly and provides an intuitive way for users to interact with the app.
The interface includes a date input, select boxes for item family, store city, and store state, a number input for crude oil price, a select box for day type, a radio button for store type, and sliders for store number, store cluster, and number of items on promo. It also includes a button for making predictions.
The interface is created using Streamlit’s column layout and expander features.
import datetime
import streamlit as st
from PIL import Image

# Creating interface in an expander
image = Image.open('images/justin-lim-JKjBsuKpatU-unsplash.jpg')
st.image(image, caption=None, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
st.title('Demo Sales Forecasting :red[App]')

# create a three-column layout
col1, col2, col3 = st.columns(3)

# create a date input to receive the date
date = col1.date_input(
    "Enter the Date",
    datetime.date(2019, 7, 6))

# create a select box to select an item family
# (option lists such as family and Store_city come from the saved components)
item_family = col2.selectbox('What is the category of item?', options=family)

# create a select box for store city
store_city = col3.selectbox("Which city is the store located?", options=Store_city)
3.4 Embedding the model into the interface
Embedding the model into the interface ensures that end-users can access and use it easily, without any technical knowledge.
3.4.1 Collecting and preparing inputs
The app inputs were gathered and transformed into a dataframe, following the format of the training data and using matching column names. The datatypes of the features were verified and converted to their correct format. This ensures that the data is consistent and can be effectively processed by the app’s algorithms.
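This gathering step can be sketched as follows; the values and column names here are illustrative stand-ins for the app's actual widget inputs:

```python
import pandas as pd

# Illustrative values as they might come from the Streamlit widgets
inputs = {
    'date': ['2019-07-06'],
    'family': ['GROCERY I'],
    'Store_city': ['Quito'],
    'onpromotion': [12],
    'oil_price': [57.5],
}

# Build a one-row dataframe whose columns mirror the training data
data = pd.DataFrame(inputs)

# Verify and coerce datatypes to match the training schema
data['date'] = pd.to_datetime(data['date'])
data['onpromotion'] = data['onpromotion'].astype(int)
data['oil_price'] = data['oil_price'].astype(float)
print(data.dtypes)
```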
3.4.2 Feature engineering
Following data collection and preparation, the app generated new features in the same manner as during model training. Most of these new features were date extracts, including day of week, month, quarter, and year, which were created in the same way as during model training. This step is crucial to ensure that the model’s prediction is based on the same data features as those used during training, maintaining consistency between the two processes.
# Creating date extracts (the dataframe is indexed by the date column)
data['Year'] = data.index.year
data['Month'] = data.index.month
data['DayOfMonth'] = data.index.day
data['DaysInMonth'] = data.index.days_in_month
data['DayOfYear'] = data.index.day_of_year
data['DayOfWeek'] = data.index.dayofweek
data['Is_month_end'] = data.index.is_month_end.astype(int)

# Creating the payday column: paydays fall on the 15th and the last day of the month
def payday(row):
    if row.DayOfMonth == 15 or row.Is_month_end == 1:
        return 1
    return 0
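Run on a single input row, the date extracts behave as follows; a self-contained sketch using an illustrative date:

```python
import pandas as pd

# One-row frame indexed by date, as in the app after set_index('date')
data = pd.DataFrame(index=pd.to_datetime(['2019-07-06']))
data['Year'] = data.index.year
data['Month'] = data.index.month
data['DayOfMonth'] = data.index.day
data['DaysInMonth'] = data.index.days_in_month
data['DayOfYear'] = data.index.day_of_year
data['DayOfWeek'] = data.index.dayofweek
data['Is_month_end'] = data.index.is_month_end.astype(int)

# Payday flag: the 15th or the last day of the month
data['Is_payday'] = ((data['DayOfMonth'] == 15) | (data['Is_month_end'] == 1)).astype(int)
print(data.iloc[0].to_dict())
```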
3.4.3 Transforming data using pipelines
The pickled pipelines were loaded and used to convert the collected and prepared data into a format compatible with our model. There were two pipelines: a numerical pipeline and a categorical pipeline.
# helper function to load a pickle file
def load_pickle(filename):
    with open(filename, 'rb') as file:
        data = pickle.load(file)
    return data

# load the pickle file
ml_compos_2 = load_pickle('ml_components_2.pkl')

# loading the pipelines
categorical_pipeline = ml_compos_2['categorical_pipeline']
numerical_pipeline = ml_compos_2['numerical_pipeline']
The function process_data takes in the data, the categorical and numerical pipelines, and the categorical and numerical column names. It sets the date column as the index and extracts date features from it. It creates a new feature 'Is_payday' based on the values of 'DayOfMonth' and 'Is_month_end'. It then transforms the categorical and numerical columns using the respective pipelines and returns the processed data.
def process_data(data, categorical_pipeline, numerical_pipeline, cat_cols, num_cols):
    processed_data = data.set_index('date')
    # date extracts such as Year, DayOfMonth, and Is_month_end are created here,
    # exactly as in the feature engineering step above
    processed_data['Is_payday'] = processed_data[['DayOfMonth', 'Is_month_end']].apply(payday, axis=1)
    processed_data[cat_cols] = categorical_pipeline.transform(processed_data[cat_cols])
    processed_data[num_cols] = numerical_pipeline.transform(processed_data[num_cols])
    return processed_data
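To illustrate what the two pipelines do inside process_data, here is a toy version with stand-in pipelines fitted on dummy data (the app itself loads the already-fitted pipelines from the pickle file):

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Dummy training data used to fit the stand-in pipelines
train = pd.DataFrame({'family': ['GROCERY I', 'BEVERAGES', 'GROCERY I'],
                      'onpromotion': [0, 5, 12]})
cat_cols, num_cols = ['family'], ['onpromotion']

categorical_pipeline = Pipeline([('encode', OrdinalEncoder())]).fit(train[cat_cols])
numerical_pipeline = Pipeline([('scale', StandardScaler())]).fit(train[num_cols])

# Transform a single row of app input, as process_data does
row = pd.DataFrame({'family': ['BEVERAGES'], 'onpromotion': [5]})
row[cat_cols] = categorical_pipeline.transform(row[cat_cols])
row[num_cols] = numerical_pipeline.transform(row[num_cols])
print(row)
```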
3.4.4 Making a prediction
The app integrated the model component in the pickle file by loading it, which enabled the use of the predict method on the model to generate sales predictions.
model = ml_compos_2['model']

# Making a prediction when the button is clicked
if st.button('Predict'):
    st.balloons()
    st.metric('Predicted Sale', value=round(float(model.predict(processed_data)[0]), 2))
The provided code checks if the “button” is clicked and if so, it executes two functions. The first function displays a balloon animation on the screen using the st.balloons() function. The second function uses the trained model to predict sales on the processed data and displays the result as a metric using the st.metric() function.
4. Testing the app
Continuous testing and debugging of the app are essential to ensure that it is running smoothly and efficiently. Additionally, user testing and feedback can provide valuable insights into the user experience and identify any areas that need improvement. Therefore, it is crucial to conduct thorough and frequent testing to identify and fix any bugs or issues, as well as to evaluate the user experience and make necessary adjustments to enhance it.
4.1 Running the App Locally
The Streamlit app was executed locally by running the streamlit run command followed by the name of the Python script containing the app definition, app.py in this case:
streamlit run app.py
4.2 Debugging and Troubleshooting the App
During the development of my Streamlit app, I faced some technical issues and bugs, mostly related to the code I wrote, but I was able to resolve most of them. To debug the app, I mainly used print statements to output debugging information to the console. I also ran the app with Streamlit's logger level set to debug, which displayed detailed error messages and stack traces and helped me identify and fix issues as they arose.
Building an app also involves addressing non-technical bugs that can significantly impact the user experience. A few I encountered:
The theme — The app's default theme color was plain, which made some of the widgets hard to see or easy to overlook.
Input history — Another issue was that users had no way to download their input history.
Insensitive parameters — Additionally, some parameters did not significantly affect the predicted sales, leading to confusion and inaccurate predictions. This issue stems from the trained model rather than the app itself.
A darker theme was added to make the widgets more visible, reduce eye strain, and make content easier to read in low-light settings.
The feature to download input history was added for a better user experience.
4.3 Testing the App with Real-World Data
After developing the sales forecasting app, it was tested with real data to ensure it works properly and to catch any overlooked issues. During testing, the user experience and the accuracy of the predictions were checked, and necessary adjustments, such as the theme color, were made. The app predicts one sale at a time, making it suitable for small business owners and individuals making data-driven sales decisions. While not ideal for large-scale sales forecasts, it provides a user-friendly way to obtain sales predictions.
Photos of the App
This article provides a comprehensive guide on how to build a sales forecasting app using Streamlit, emphasizing the importance of testing with real-world data, addressing non-technical bugs, and embedding the model into the interface.
The complete code can be found on my GitHub page. I would love to get feedback, suggestions and corrections on this article. Thank you.