There are two ways to set up an ML environment: locally, or in the cloud. Let’s go over both:
Our local machines often don’t have GPUs (Graphics Processing Units), which can make working with images on them slow and difficult. Moreover, if you are not that seasoned with GitHub yet (but don’t worry, you will be by the end of this Summer xD!), cloud notebooks make collaboration smooth and seamless (just like sharing a Google Docs link). Even better, you can add comments and pictures alongside your code to document what you are doing, while breaking the code into separate cells and testing each function individually (no more looking through 20 error messages to find that one missing semicolon!). For the purpose of this task, we recommend using Google Colab.
To see how you can set this up, have a look here for a text tutorial, and at this amazing playlist for a comprehensive video tutorial.
If setting up Colab felt like easy mode, and you feel like local development (aka running Jupyter Notebooks on your own computer instead of Google’s servers) is the better fit for you (because you have more RAM, or because you’re tired of uploading your datasets to Google Drive every time you want to do a project), check out this for a tutorial on how you can do it.
NOTE: For SOC’23, we highly recommend that you use Google Colab.
In today’s digital era, online marketplaces have become an integral part of our lives. With the convenience of browsing through a vast array of products and making purchases from the comfort of our homes, online shopping has grown rapidly. These virtual shopping havens offer a seemingly endless array of products at our fingertips. However, amidst this vast ocean of choices, finding the perfect product at the best price can feel like searching for a needle in a haystack. This assignment sets out on a journey to create a companion for that search: a recommendation and price prediction model for online shopping.
The aim is to build a model that can give recommendations and predict prices for various products on online marketplaces by utilising price data from existing online platforms.
Exploratory Data Analysis (EDA) (Bonus Task): Explore the scraped data to gain insights and understand the distribution of prices. Calculate summary statistics such as the mean, median, minimum, and maximum prices. To spot patterns or outliers, visualize the data with histograms, box plots, and scatter plots.
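As a rough illustration of the EDA step above, here is a minimal sketch that uses only Python’s standard library. The `prices` list and the helper names `summarize` and `iqr_outliers` are made up for this example; your scraped data will look different. Outliers are flagged with the usual box-plot rule (1.5 times the interquartile range), which is one common choice, not the only one.

```python
import statistics

# Hypothetical scraped prices (all in one currency); replace with your own data.
prices = [499.0, 549.0, 525.0, 510.0, 2999.0, 530.0, 515.0, 505.0]


def summarize(values):
    """Return basic summary statistics for a list of prices."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR], the usual box-plot rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(prices)
outliers = iqr_outliers(prices)
print(summary)
print("Possible outliers:", outliers)
```

Here the single very expensive item pulls the mean far above the median, which is a quick hint that the price distribution is skewed. In a Colab notebook you could then plot the same list, for example with matplotlib’s `plt.hist(prices)` or `plt.boxplot(prices)`.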
Model Selection and Training: Choose a suitable machine learning model for price prediction, such as linear regression, decision trees, or random forests. Split the data into training and testing sets. Then train the selected model on the training data, using the product characteristics as inputs and the corresponding prices as the target variable.
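To make the split-then-train idea concrete, here is a small standard-library sketch of a train/test split followed by a one-feature linear regression fitted with ordinary least squares. Everything in it is an assumption for illustration: the data is synthetic (generated from a known straight line, so the fit is easy to check), and the function names `train_test_split` and `fit_linear` are defined here, not imported. For a real dataset with several features you would more likely reach for a library such as scikit-learn.

```python
import random

# Hypothetical data: (feature, price) pairs, e.g. storage in GB vs. price.
# Generated from price = 100 + 2.5 * feature, so the true line is known.
data = [(x, 100 + 2.5 * x) for x in range(16, 272, 16)]


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def fit_linear(rows):
    """Ordinary least squares for one feature: price = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


train, test = train_test_split(data)
intercept, slope = fit_linear(train)  # fit on the training rows only

# Predict prices for the held-out test rows.
predictions = [intercept + slope * x for x, _ in test]
print(f"price = {intercept:.2f} + {slope:.2f} * feature")
```

The key point is that the model only ever sees `train` while fitting; `test` is kept aside so that the evaluation reflects data the model has not seen.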
Model Evaluation: Evaluate the trained model’s performance using appropriate metrics like mean squared error (MSE), root mean squared error (RMSE), or mean absolute error (MAE). To assess the model’s accuracy and reliability, compare its predicted prices against the actual prices in the testing data.
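The three metrics named above are simple enough to write by hand, which can help in understanding what they measure. The sketch below is only illustrative: the `actual` and `predicted` lists are made-up numbers, and the function names are defined here for the example.

```python
import math


def mse(actual, predicted):
    """Mean squared error: average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of MSE, in the same unit as price."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical test-set prices and the model's predictions for them.
actual = [500.0, 750.0, 1200.0, 300.0]
predicted = [520.0, 700.0, 1150.0, 330.0]

print("MSE :", mse(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MAE :", mae(actual, predicted))
```

Because MSE squares each error, a few large misses dominate it, whereas MAE weighs all errors equally. RMSE and MAE are both in the same unit as the prices, so they are usually easier to interpret than MSE.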