IS5126 Hands-on with Applied Analytics - Group Assignment

Course: IS5126 Hands-on with Applied Analytics

Academic Year: 2025/2026 Sem 2

Assignment Structure: 2-Part Assignment

Total Weight: 30% of course grade (Assignment 1: 15% + Assignment 2: 15%)

Dataset: Hotel Reviews (50,000-80,000+ reviews)

Assignment Overview

Business Context

You are a group of Data Science Consultants hired by HospitalityTech Solutions, a SaaS company providing analytics platforms to hotels. They need you to build an intelligent analytics platform that helps hotel managers:

  1. Understand customer satisfaction drivers
  2. Identify improvement opportunities
  3. Benchmark against competitors
  4. Predict future trends
  5. Optimize resource allocation

Your task is to develop a working product that hotel managers can use to make data-driven decisions.

Assignment 1: Data Foundation & Exploratory Analytics (15%)

Duration: Weeks 2-6

Submission Deadline: See Canvas

Weight: 15% of final course grade

Objectives

Build a foundational analytics system with:

Deliverables

1. GitHub Repository Structure

student-name-hotel-analytics/
├── README.md                          # Setup and usage instructions
├── requirements.txt                   # Python dependencies
├── .gitignore
├── data/
│   ├── reviews_sample.db              # SQLite with 5000+ sample reviews
│   └── data_schema.sql                # (Optional) Schema documentation
├── notebooks/
│   ├── 01_data_preparation.ipynb
│   ├── 02_exploratory_analysis.ipynb
│   ├── 03_competitive_benchmarking.ipynb
│   └── 04_performance_profiling.ipynb
├── src/
│   ├── data_processing.py
│   ├── benchmarking.py
│   └── utils.py
├── app/
│   └── streamlit_app.py               # Dashboard application
├── profiling/
│   ├── query_results.txt              # Query profiling outputs
│   └── code_profiling.txt             # Code profiling results
└── reports/
    └── assignment1_report.pdf         # 8-10 pages (excluding cover, references, appendices)

2. Data Requirements

Why 50K-80K+ reviews?

This volume ensures that database optimization becomes meaningful and measurable. With smaller datasets, indexing and query optimization have minimal impact. Production-scale data allows you to demonstrate real performance improvements.
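
As an illustration of how an indexing improvement might be measured, the sketch below times one query before and after adding an index and records SQLite's query plan each time. It is a minimal example under stated assumptions: the `reviews` table, its columns, the index name, and the synthetic rows are all hypothetical and stand in for your own schema and data.

```python
import sqlite3
import time

# Hypothetical schema and synthetic rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (id INTEGER PRIMARY KEY, hotel_id INTEGER, rating REAL)"
)
rows = [(i, i % 500, 1 + (i % 5)) for i in range(60_000)]
conn.executemany("INSERT INTO reviews VALUES (?, ?, ?)", rows)

QUERY = "SELECT AVG(rating) FROM reviews WHERE hotel_id = ?"


def query_plan(sql, params):
    """Return SQLite's plan description for a query as one string."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in plan)  # column 3 holds the detail text


def time_query(sql, params, repeats=50):
    """Average wall-clock seconds per execution over several repeats."""
    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(sql, params).fetchall()
    return (time.perf_counter() - start) / repeats


plan_before = query_plan(QUERY, (42,))
t_before = time_query(QUERY, (42,))

conn.execute("CREATE INDEX idx_reviews_hotel ON reviews (hotel_id)")

plan_after = query_plan(QUERY, (42,))
t_after = time_query(QUERY, (42,))

print("before:", plan_before, f"({t_before * 1e6:.0f} us per query)")
print("after: ", plan_after, f"({t_after * 1e6:.0f} us per query)")
```

Reporting the plan alongside the timing shows why the query got faster (a full table scan replaced by an index search), which is usually more convincing than a timing difference alone.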

3. Technical Report (8-10 pages; Submit on Canvas)

Format: PDF, excluding cover page, references, and appendices

Report Header: GitHub repository URL, student name(s) and ID(s) (starting with A...)

Required Sections:

1. Executive Summary

2. Data Foundation

3. Exploratory Data Analysis

4. Performance Profiling & Optimization

5. Competitive Benchmarking Strategy

Business Context: Hotel managers struggle to identify meaningful improvement opportunities. They ask: "Who are my real competitors? A 5-star beachfront resort shouldn't compare itself to a budget city hotel. How do we systematically identify truly comparable properties? What are similar hotels doing better than us? Where should we focus our limited improvement budget?"

  • Your methodology for identifying comparable hotel groups (justify your approach)
  • Performance analysis across different hotel groups
  • Identification of best practices within comparable groups
  • Specific, actionable recommendations for underperforming hotels
  • Validation of your approach (the method you choose determines which kinds of validation are possible; justify your choice)

Note: This is an open-ended business problem. Propose and implement YOUR solution. There are multiple valid approaches and you need only one with proper justification.
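
To make the idea of "comparable hotels" concrete, here is one possible approach among many: treat each hotel's peers as its nearest neighbours in a standardized feature space. This is a sketch, not the expected solution. The hotel IDs, the three features (`hotel_class`, `n_reviews`, `pct_business`), and all values are hypothetical; your dataset may not contain them, and clustering or rule-based segmentation would be equally valid with proper justification.

```python
import math
from statistics import mean, pstdev

# Hypothetical per-hotel features; real feature choice depends on your data.
hotels = {
    "H1": {"hotel_class": 5.0, "n_reviews": 900, "pct_business": 0.10},
    "H2": {"hotel_class": 5.0, "n_reviews": 1000, "pct_business": 0.12},
    "H3": {"hotel_class": 4.5, "n_reviews": 950, "pct_business": 0.15},
    "H4": {"hotel_class": 2.0, "n_reviews": 300, "pct_business": 0.60},
    "H5": {"hotel_class": 2.5, "n_reviews": 350, "pct_business": 0.55},
    "H6": {"hotel_class": 2.0, "n_reviews": 280, "pct_business": 0.65},
}


def standardize(table):
    """Convert each feature to z-scores so no single scale dominates distance."""
    keys = list(next(iter(table.values())))
    stats = {}
    for k in keys:
        column = [row[k] for row in table.values()]
        stats[k] = (mean(column), pstdev(column) or 1.0)  # guard zero variance
    return {
        h: [(row[k] - stats[k][0]) / stats[k][1] for k in keys]
        for h, row in table.items()
    }


def peers(hotel_id, table, k=2):
    """Return the k hotels closest to hotel_id in standardized feature space."""
    z = standardize(table)
    dists = [(math.dist(z[hotel_id], v), h) for h, v in z.items() if h != hotel_id]
    return [h for _, h in sorted(dists)[:k]]


print(peers("H1", hotels))
print(peers("H5", hotels))
```

Note that the features used to define comparability deliberately exclude the outcome being benchmarked (ratings); otherwise hotels would be grouped by how well they perform rather than by what kind of hotel they are.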

6. System Architecture & Dashboard

  • User interface with rationale
  • Key features and how they address business problems

7. Conclusion (keep it short)

  • Key observations, deliverables, and any limitations
  • Future enhancements (your thoughts as a team, in 2-3 sentences)

For Group Submissions: Include a member contribution summary table at the beginning

4. Working Dashboard (Streamlit)

  • Functional dashboard with web interface
  • 3-5 core features that solve business problems
  • User documentation in README (GitHub)

Grading Rubric (15% Total)

Component          Weight   Evaluation Focus
Technical Report   8%       - Data foundation and schema quality
                            - Data volume compliance (50K-80K reviews)
                            - Exploratory analysis depth and insights
                            - Performance profiling with quantified improvements
                            - Competitive benchmarking strategy and validation
                            - Writing quality and clarity
Working Product    5%       - Functionality and features (3%)
                            - User experience and interface (1%)
                            - Error handling and robustness (1%)
Code Repository    2%       - Structure and organization (0.5%)
                            - Code quality and documentation (0.5%)
                            - Reproducibility (1.0%)
TOTAL              15%

Assignment 2: Production System & Advanced Analytics (15%)

Duration: Weeks 7-12

Submission Deadline: See Canvas

Weight: 15% of final course grade

Enhanced Business Problems

Your foundational system was successful. Now clients want advanced capabilities. Choose and solve 3 out of these 4 problems:

Problem 1: Predictive Intelligence (attempt any 2 out of the 3 bullets below)

  • Forecast hotel ratings for the next 3-6 months (time-series forecasting with LSTMs/RNNs)
  • Predict which reviews will be most helpful to customers (classification; compare the techniques used and select the best one for client usage)
  • Predict overall rating from aspect ratings (Numeric value prediction; Compare the techniques used to select the best one for client usage)
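
The "compare the techniques" requirement above can be sketched as a small harness that scores every candidate on the same held-out split with the same metric. The example below is a minimal illustration on synthetic data: the two candidates (a global-mean baseline and an unweighted mean of aspect ratings) are deliberately trivial placeholders, and the data generator, split sizes, and choice of MAE are assumptions. In practice the candidates would be real models such as those in scikit-learn.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic rows: four aspect ratings (1-5) and an overall rating. Not real data.
rows = []
for _ in range(300):
    aspects = [rng.randint(1, 5) for _ in range(4)]
    overall = mean(aspects) + rng.gauss(0, 0.3)
    rows.append((aspects, overall))

# Simple ordered split; the held-out rows are never used for fitting.
train, test = rows[:240], rows[240:]

train_mean = mean(y for _, y in train)

# Placeholder candidates: each maps a list of aspect ratings to a prediction.
candidates = {
    "global_mean": lambda aspects: train_mean,
    "mean_of_aspects": lambda aspects: mean(aspects),
}


def mae(predict, data):
    """Mean absolute error of a predictor over (aspects, overall) pairs."""
    return mean(abs(predict(x) - y) for x, y in data)


results = {name: mae(fn, test) for name, fn in candidates.items()}
best = min(results, key=results.get)

for name, score in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:16s} MAE = {score:.3f}")
print("selected:", best)
```

Keeping a naive baseline in the comparison makes it clear how much a more complex model actually adds, which supports the justification asked for in the report.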

Problem 2: Text Intelligence & Review Analytics (attempt any 3 out of the 5 bullets below)

  • Automatically extract key themes and topics from review text (topic modeling, LDA, BERTopic)
  • Identify specific aspects customers mention most (aspect extraction, NER)
  • Predict review helpfulness based on text content (NLP + classification)
  • Analyze which specific phrases correlate with high/low ratings
  • (Advanced/Optional) Automated review summarization or periodic reporting to management
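
For the phrase-versus-rating bullet above, one simple starting point is to compare the mean rating of reviews containing a phrase with the overall mean. The sketch below does this for word bigrams on a tiny made-up sample; the reviews, the `min_count` threshold, and the tokenization are all illustrative assumptions, and a real analysis would need stop-word handling and a significance check.

```python
import re
from collections import defaultdict

# Tiny made-up sample of (review_text, overall_rating); not real data.
reviews = [
    ("Friendly staff and clean room", 5),
    ("Clean room but noisy street", 3),
    ("Rude staff and dirty bathroom", 1),
    ("Dirty bathroom, noisy street", 2),
    ("Friendly staff, great breakfast", 5),
]


def bigrams(text):
    """Set of adjacent lowercase word pairs in a review."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return set(zip(tokens, tokens[1:]))


def phrase_rating_gap(data, min_count=2):
    """Mean rating of reviews containing each bigram, minus the overall mean."""
    overall = sum(rating for _, rating in data) / len(data)
    ratings = defaultdict(list)
    for text, rating in data:
        for pair in bigrams(text):
            ratings[" ".join(pair)].append(rating)
    return {
        phrase: sum(vals) / len(vals) - overall
        for phrase, vals in ratings.items()
        if len(vals) >= min_count  # ignore phrases seen too rarely
    }


gaps = phrase_rating_gap(reviews)
for phrase, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{phrase:16s} {gap:+.2f}")
```

On this sample, uninformative bigrams such as "staff and" also pass the count threshold, which shows why filtering and validation matter before presenting phrases to a client.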

Problem 3: Intelligent Automation (attempt any 3 out of the 4 bullets below)

  • Identify upward and downward trends in hotel performance over any given time period
  • Highlight emerging issues (Performance drift detection and alerts, from trend monitoring above)
  • Recommend cross-learning opportunities from high performers to low performers (for comparable hotels)
  • Real-time review analysis system (streaming ML pipeline)
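
The trend and drift bullets above could start from something as simple as a least-squares slope over monthly mean ratings, plus a comparison of a recent window against earlier history. The sketch below assumes ratings have already been aggregated to one mean per month; the function names, thresholds (`0.05`, `0.3`), and window length are hypothetical and would need tuning and justification on real data.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to estimate a slope")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def classify_trend(monthly_means, threshold=0.05):
    """Label a series as upward, downward, or flat based on its slope."""
    s = slope(monthly_means)
    if s > threshold:
        return "upward"
    if s < -threshold:
        return "downward"
    return "flat"


def drift_alert(monthly_means, window=3, drop=0.3):
    """True when the recent-window mean is at least `drop` below the earlier mean."""
    if len(monthly_means) <= window:
        return False  # not enough history to compare against
    recent = monthly_means[-window:]
    earlier = monthly_means[:-window]
    return sum(earlier) / len(earlier) - sum(recent) / len(recent) >= drop


# Made-up monthly mean ratings for one hotel.
series = [4.5, 4.4, 4.2, 4.0, 3.7, 3.5]
print(classify_trend(series), drift_alert(series))
```

A slope summarizes the whole period, while the window comparison reacts to recent change, so the two can disagree; reporting both gives managers a fuller picture.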

Problem 4: Knowledge Graphs & Executive QA System (attempt any 2 out of the 3 bullets below)

  • Build knowledge graph from hotel reviews (entities: hotels, aspects, sentiments, locations) [do think of a strategy before doing this...]
  • RAG-based QA system for CEO queries (e.g., "What are guests saying about our breakfast service compared to competitors?")
  • Natural language interface for executive queries
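
For the RAG-based QA bullet above, the retrieval step can be prototyped before any language model is involved. The sketch below ranks review snippets against a question using cosine similarity over raw word counts. It covers retrieval only, the corpus is made up, and raw counts are a stand-in: TF-IDF weighting or embedding-based search would normally replace them.

```python
import math
import re
from collections import Counter

# Made-up review snippets standing in for an indexed corpus.
corpus = [
    "Breakfast buffet was fresh with plenty of choice",
    "The pool was closed during the week",
    "Breakfast service was slow and the coffee was cold",
    "Check-in took a long time at the front desk",
]


def vectorize(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, docs, k=2):
    """Return up to k documents most similar to the question, best first."""
    q = vectorize(question)
    scored = sorted(((cosine(q, vectorize(d)), d) for d in docs), reverse=True)
    return [d for score, d in scored[:k] if score > 0]


top = retrieve("What are guests saying about our breakfast service?", corpus)
for snippet in top:
    print("-", snippet)
```

In a full pipeline the retrieved snippets would be passed to a generator as context; returning an empty list when nothing matches lets the system say "no evidence found" instead of answering without support.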

Deliverables

1. Updated GitHub Repository

student-name-hotel-analytics/
├── notebooks/
│   ├── 05_predictive_modeling.ipynb       # Regression, Classification
│   ├── 06_deep_learning.ipynb             # Neural networks
│   ├── 07_time_series_forecasting.ipynb   # (if applicable)
│   ├── 08_knowledge_graph.ipynb           # (if applicable)
│   └── 09_optimization.ipynb              # (if applicable)
├── models/
│   ├── neural_net.pt                      # Saved models
│   ├── lstm_forecaster.pt                 # (if applicable)
│   └── model_metadata.json
├── api/
│   ├── main.py                            # API service (optional)
│   └── routes/
├── app/
│   └── streamlit_app.py                   # Enhanced dashboard
├── tests/
│   └── test_models.py                     # (Optional)
└── reports/
    └── assignment2_report.pdf             # 8-10 pages

2. Technical Report (8-10 pages)

Format: PDF, excluding cover page, references, and appendices

Report Header: GitHub repository URL, student name(s) and ID(s)

Required Sections:

1. Executive Summary

  • Problems addressed (which 3 of 4?)
  • Technical approach summary
  • Business impact quantification

2. Advanced Analytics Implementation

For each of the 3 problems solved:

  • Business Justification: Why this problem matters
  • Technical Approach: Method chosen and why
  • Architecture: System design and components
  • Implementation: Key technical decisions
  • Evaluation: Performance metrics and results
  • Business Impact: How it creates value

3. System Architecture

  • Overall system design and component interactions
  • Data flow
  • API design (if applicable)
  • Deployment strategy (if applicable)

4. Business Impact Analysis

  • Realistic usage scenario and workflow
  • Quantified business value with calculations
  • ROI analysis

5. Conclusion

  • Technical achievements and challenges
  • System limitations

For Group Submissions: Include a member contribution summary table at the end

3. Production System

  • Deployable application (not just notebooks)
  • Advanced features integrated
  • Comprehensive error handling
  • Production-quality code

Grading Rubric (15% Total)

Component          Weight   Evaluation Focus
Technical Report   8%       - Advanced analytics quality (3 problems mandatory)
                            - Appropriate use of deep learning/advanced ML
                            - Model evaluation rigor
                            - Business problem-solution fit
                            - Quantified business impact (mandatory)
                            - Writing quality and clarity
Working Product    5%       - Advanced features functionality (3%)
                            - System architecture quality (1%)
                            - Production readiness (1%)
Code Repository    2%       - Structure and organization (0.5%)
                            - Code quality and documentation (0.5%)
                            - Reproducibility (1.0%)
TOTAL              15%

Technical Requirements

Data & Tools

  • Timeframe: Latest 5 years
  • Volume: 50,000-80,000 reviews after filtering
  • Storage: SQLite database
  • Languages: Python 3.8+
  • Core Libraries: pandas, numpy, scikit-learn, matplotlib/plotly/seaborn
  • Assignment 2 Additional: PyTorch or TensorFlow (for advanced models)
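
The timeframe, volume, and storage requirements above might be applied along these lines. This is a sketch under assumptions: "latest 5 years" is interpreted here as the five years up to the newest review in the dataset, and the record layout, table name, and column names are hypothetical.

```python
import sqlite3
from datetime import date

# Hypothetical raw records: (review_id, hotel_id, review_date as ISO string, rating).
raw = [
    (1, 10, "2012-05-01", 4.0),
    (2, 10, "2009-03-15", 3.0),
    (3, 11, "2011-11-20", 5.0),
    (4, 11, "2006-01-02", 2.0),
]


def years_before(d, years):
    """Same calendar day `years` earlier, falling back to 28 Feb for leap days."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:  # 29 Feb when the target year is not a leap year
        return d.replace(year=d.year - years, day=28)


def latest_window(records, years=5):
    """Keep records dated within `years` of the newest review in the data."""
    newest = max(date.fromisoformat(r[2]) for r in records)
    cutoff = years_before(newest, years)
    return [r for r in records if date.fromisoformat(r[2]) > cutoff]


kept = latest_window(raw)

conn = sqlite3.connect(":memory:")  # use a file path for the real database
conn.execute(
    "CREATE TABLE reviews ("
    "review_id INTEGER PRIMARY KEY, hotel_id INTEGER, review_date TEXT, rating REAL)"
)
conn.executemany("INSERT INTO reviews VALUES (?, ?, ?, ?)", kept)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
meets_volume = count >= 50_000  # the toy sample is far below the required volume
print(f"{count} reviews after filtering; volume requirement met: {meets_volume}")
```

Checking the row count after filtering, rather than before, matches the requirement that the volume applies to the filtered dataset.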

Version Control

  • Platform: GitHub (public or provide TA access)
  • Documentation: Comprehensive README with setup instructions
  • Reproducibility: All notebooks must run without errors

Submission Instructions

See Canvas for detailed submission instructions and deadlines

General requirements:

  • Submit ZIP file named: GroupName_AssignmentX.zip
  • PDF report should clearly list out all group members with their IDs (start with A...)
  • ZIP must contain: Technical report PDF with GitHub URL on cover page
  • For groups: Include member contribution summary in report (after cover page)
  • GitHub repository must be accessible with sample database

Evaluation Philosophy

We Grade Based On:

  • Problem-Solution Fit: Does your approach match business needs?
  • Technical Soundness: Is implementation correct and robust?
  • Practical Value: Can hotels actually use this?
  • Learning Demonstration: Do you understand what you built?

We DON'T Penalize:

  • Different technical choices (if justified)
  • Simpler approaches (if effective)
  • Not using every technique taught (or out there)
  • UI/UX choices

Academic Integrity

  • All work must be your team's original contribution
  • Properly cite any adapted code or ideas from external sources
  • AI tools (ChatGPT, Copilot) are allowed for assistance, but you must understand and be able to explain all submitted work. Declare any such usage as briefed in the Day 1 'Course Introduction' slides
  • Plagiarism results in zero marks and academic misconduct proceedings
  • Groups may discuss approaches but must implement independently

Getting Help

Use Canvas to discuss questions with classmates, TAs/Instructor. Post questions on the course forum.

Build something you're proud to showcase in job interviews!

Last Updated: January 2026
Version: 2.0