💼 Internship Project Report

🌟 Project Title

Employee Salary Prediction System


🎓 Internship Details

  • Internship Program: IBM SkillsBuild Virtual Internship
  • Supported by: Edunet Foundation
  • Platform: IBM SkillsBuild
  • Duration: 6 Weeks (June 18, 2025 – July 30, 2025)
  • Domain: Artificial Intelligence & Machine Learning
  • Intern: Durga Prasad Papugani

Objective of the Project

Build an interactive web application that predicts employee salaries from key factors such as experience, education, skills, and job role. The app is intended to help HR teams and employers make data-driven compensation decisions using ML models and dashboards.


What I Built

An end-to-end Streamlit-based ML application featuring:

  • 🔍 Real-time salary prediction using multiple ML models
  • 🧼 Complete data preprocessing pipeline (cleaning, encoding, scaling)
  • 📈 Model evaluation: R², RMSE, MAE, CV scores
  • 🗓 Trend analysis based on date/time features
  • 📊 Box plots, Violin plots, Outlier detection
  • 🧪 Statistical testing (T-tests, correlation heatmaps)
  • 📅 PDF report generation with predicted results
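
The evaluation metrics listed above (R², RMSE, MAE) can be illustrated with a small dependency-free sketch. This is not the app's own code: the function name `regression_metrics` and the sample numbers are assumptions made for illustration, and scikit-learn's `r2_score`, `mean_squared_error`, and `mean_absolute_error` compute the same quantities.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², RMSE and MAE for paired lists of actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute error, in the same units as the target
    mae = sum(abs(e) for e in errors) / n
    # RMSE: square root of the mean squared error, penalises large misses more
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # R²: share of the target's variance explained by the predictions
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"r2": r2, "rmse": rmse, "mae": mae}


# Hypothetical salaries (in thousands) and model predictions
actual = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 63.0, 69.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE and MAE are reported in the units of the target, so they are easy to explain to non-technical users, while R² gives a unit-free view of fit quality.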

🤖 Machine Learning Models Used

  • Linear Regression
  • Random Forest Regressor
  • XGBoost Regressor
  • Hyperparameter tuning: GridSearchCV
  • Validation: K-Fold Cross-Validation (for robustness)
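
As a rough illustration of K-fold cross-validation, the sketch below splits a toy dataset into five folds, fits a one-feature least-squares line on four of them, and scores the held-out fold. The helper names (`fit_line`, `k_fold_rmse`) and the data are hypothetical; the project itself lists scikit-learn, whose `KFold`, `cross_val_score`, and `GridSearchCV` cover the same idea for real models.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Shuffle the rows, split them into k folds, and return each fold's held-out RMSE."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return scores


# Hypothetical data: salary (in thousands) rising linearly with years of experience
years = list(range(1, 21))
salary = [30 + 2.5 * y for y in years]
fold_scores = k_fold_rmse(years, salary, k=5)
print(fold_scores)
```

Because every row is held out exactly once, the spread of the per-fold scores gives a sense of how stable the model is across different subsets of the data.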

📁 Dataset Info

  • Dataset Name: EMPLOYEE_DATASET
  • Source: Kaggle
  • Size: 500+ rows
  • Key Features: Department, Education, KPIs, Experience, Number of Workers, Target Productivity (used as salary proxy)
  • Preprocessing: Outlier removal, encoding, feature scaling, date parsing
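
The preprocessing steps named above (outlier removal, encoding, feature scaling, date parsing) might look roughly like the following standard-library sketch. The column names, date format, sample rows, and the choice of the 1.5 × IQR rule are all assumptions for illustration; the README does not state which outlier rule or encoders the app uses.

```python
import statistics
from datetime import datetime


def remove_outliers_iqr(rows, key, factor=1.5):
    """Keep rows whose `key` value lies within [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles([r[key] for r in rows], n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [r for r in rows if low <= r[key] <= high]


def standardize(values):
    """Z-score scaling: subtract the mean and divide by the population std deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std else 0.0 for v in values]


def one_hot(values):
    """One-hot encode a categorical column; returns (encoded rows, category order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories


# Hypothetical rows; names and values are illustrative, not taken from EMPLOYEE_DATASET
rows = [
    {"date": "1/5/2015", "department": "sewing", "experience": 1},
    {"date": "1/12/2015", "department": "finishing", "experience": 2},
    {"date": "2/3/2015", "department": "sewing", "experience": 3},
    {"date": "2/17/2015", "department": "finishing", "experience": 4},
    {"date": "3/1/2015", "department": "sewing", "experience": 5},
    {"date": "3/9/2015", "department": "finishing", "experience": 6},
    {"date": "3/20/2015", "department": "sewing", "experience": 7},
    {"date": "3/25/2015", "department": "sewing", "experience": 40},  # obvious outlier
]

clean = remove_outliers_iqr(rows, "experience")
months = [datetime.strptime(r["date"], "%m/%d/%Y").month for r in clean]  # date feature
scaled = standardize([r["experience"] for r in clean])
encoded, categories = one_hot([r["department"] for r in clean])
print(len(clean), months, categories)
```

Removing outliers before scaling matters here: a single extreme value would otherwise inflate the standard deviation and compress every other scaled value toward zero.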

🗂️ Tools & Technologies Used

  • Frontend: Streamlit, CSS, HTML
  • Backend/ML: Python, Pandas, Scikit-learn, XGBoost, SHAP
  • Visualization: Matplotlib, Seaborn, Plotly
  • PDF/Resume: PyMuPDF, FPDF, Regex

🎯 Achievements

  • Deployed a complete ML web app from scratch
  • 📋 PDF report generation for predictions
  • 💡 Designed a clean and user-friendly UI
  • 🧠 Tuned and validated models with proper evaluation
  • ☁️ Deployed the project to Streamlit Cloud

🌍 Live Demo

🔗 https://smartsalarypredictor.onrender.com/


👨‍💼 My Role

  • Designed the overall system and workflows
  • Handled all data preprocessing and model training
  • Developed frontend and backend logic
  • Built dashboards and PDF generator
  • Managed deployment and UI styling

📚 What I Learned

  • Building and deploying ML apps end-to-end
  • Real-world data preprocessing and feature engineering
  • Data visualization and statistical analysis
  • Importance of user experience in AI tools

🏅 Internship Completion

  • Certificate: Issued via IBM SkillsBuild & Edunet Foundation

🛠️ Source Code

git clone https://github.com/DurgA-5/SalaryPredictor.git

🔮 Future Enhancements

Work in progress - Streamlit deployment of the Employee Salary Prediction app. View App: https://employeepredictor.streamlit.app/

Planned - ChatBot SmartSalaryPredictor

Researching - Expanding the dataset to international markets (US, UK, Canada).


🙌 Acknowledgements

Special thanks to Edunet Foundation and IBM SkillsBuild for the valuable opportunity. This project helped enhance my AI/ML, software engineering, and deployment skills in a practical, impactful way.


📘 About This Repository

  • Language: Python & Jupyter Notebooks
  • Technologies: Flask, HTML, CSS, JS, Scikit-learn, XGBoost, SHAP, PyMuPDF
  • Topics: Salary Prediction, ML Deployment, Resume Parsing, Explainable AI
