🎯 Mock Interview — Full Simulation
15 Questions Across All Rounds (60 Minutes)
Instructions: Set a timer. Answer each question aloud or in writing before reading the model answer. This simulates the real interview flow at DecisionTree Analytics.
Round 1: HR Screening (5 minutes)
Q1 (2 min): "Tell me about yourself."
Model Answer:
"I'm [Name], a recent graduate from [University] with a focus on data analytics. I've built hands-on skills in SQL, Python, Tableau, and Power BI through academic projects — including a customer churn prediction model and an interactive sales dashboard. What excites me about data analysis is translating raw numbers into actionable business decisions. DecisionTree's end-to-end analytics approach — from data strategy to GenAI — is exactly the environment where I want to start my career."
Scoring: ✅ Under 90 seconds? ✅ Past-Present-Future structure? ✅ Mentioned the company?
Q2 (3 min): "Why DecisionTree Analytics?"
Model Answer:
"Three specific reasons: (1) Your depth — you go beyond dashboards into ML and GenAI with products like AskNeo. (2) Industry breadth — CPG, Financial Services, E-Commerce means diverse exposure as a fresher. (3) Your culture of 'Innovate. Grow. Belong' — at ~160 employees, I'd learn directly from senior leadership while having structured mentorship."
Round 2: Aptitude (5 minutes)
Q3 (2 min): A product's price increases by 25%. By what percentage must consumption be reduced to keep expenditure constant?
Answer:
Reduction = (Increase / (100 + Increase)) × 100
= (25 / 125) × 100 = 20% ✅
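A quick numeric check of the formula (the price and quantity below are arbitrary illustrative values):

```python
# Verify: if price rises 25%, consumption must fall 20% to keep spend constant.
old_price, qty = 100.0, 10.0                # arbitrary starting values
new_price = old_price * 1.25                # +25% price
reduction = (25 / (100 + 25)) * 100         # formula above -> 20.0
new_qty = qty * (1 - reduction / 100)       # consume 20% less
# Expenditure is unchanged: 100 * 10 == 125 * 8
assert abs(old_price * qty - new_price * new_qty) < 1e-9
print(reduction)  # -> 20.0
```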
Q4 (3 min): In a group of 100 people, 65 like tea, 45 like coffee, and 30 like both. How many like neither?
Answer:
At least one = Tea + Coffee - Both = 65 + 45 - 30 = 80
Neither = 100 - 80 = 20 ✅
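The inclusion-exclusion step can be verified with explicit sets (the member IDs are invented for illustration):

```python
# Inclusion-exclusion check with explicit sets of people (IDs 0-99).
tea = set(range(65))             # 65 tea drinkers: IDs 0..64
coffee = set(range(35, 80))      # 45 coffee drinkers: IDs 35..79
both = tea & coffee              # overlap is IDs 35..64 -> 30 people
assert len(both) == 30
at_least_one = len(tea | coffee)  # 65 + 45 - 30 = 80
neither = 100 - at_least_one
print(neither)  # -> 20
```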
Round 3: SQL (10 minutes)
Q5 (3 min): Write a query to find the top 3 highest-paid employees in each department.
Answer:
WITH ranked AS (
    SELECT name, department, salary,
           ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
)
SELECT name, department, salary
FROM ranked
WHERE rn <= 3;
Q6 (3 min): Find customers who placed orders in January but NOT in February.
Answer:
SELECT DISTINCT customer_id
FROM orders
WHERE MONTH(order_date) = 1   -- assumes one year of data; add a YEAR(order_date) filter otherwise
  AND customer_id NOT IN (
      SELECT customer_id
      FROM orders
      WHERE MONTH(order_date) = 2
  );
Q7 (4 min): Calculate month-over-month revenue growth percentage.
Answer:
WITH monthly AS (
    SELECT DATE_FORMAT(order_date, '%Y-%m') AS month,
           SUM(amount) AS revenue
    FROM orders
    GROUP BY 1
)
SELECT month, revenue,
       LAG(revenue) OVER (ORDER BY month) AS prev_month,
       ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 /
             LAG(revenue) OVER (ORDER BY month), 2) AS growth_pct
FROM monthly;
Round 4: Python (5 minutes)
Q8 (2 min): Given a DataFrame df with columns product, region, sales, find the top 5 products by total sales.
Answer:
top5 = df.groupby('product')['sales'].sum().nlargest(5)
print(top5)
Q9 (3 min): How would you handle a dataset with 15% missing values in the income column?
Answer:
"First, I'd check whether the missing data is random (MCAR) or patterned (MNAR). If it's random and the column is numerical, I'd use median imputation — `df['income'].fillna(df['income'].median())` — because the median is robust to outliers. If there's a pattern (e.g., high earners don't report), I'd create a binary indicator column `income_missing` and use model-based imputation. I wouldn't drop 15% of rows — that's too much data to lose."
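That plan can be sketched in a few lines of pandas (the toy income values are invented for illustration):

```python
import pandas as pd
import numpy as np

# Toy column with missing income values (hypothetical data).
df = pd.DataFrame({"income": [30_000, 45_000, np.nan, 52_000, np.nan, 1_000_000]})

# 1) Flag missingness BEFORE imputing, so the pattern survives for modeling.
df["income_missing"] = df["income"].isna().astype(int)

# 2) Median imputation: robust to the 1,000,000 outlier (the mean is not).
median = df["income"].median()            # 48,500 here
df["income"] = df["income"].fillna(median)
print(df)
```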
Round 5: Tableau (5 minutes)
Q10 (2 min): "What is a Context Filter?"
Answer:
"A Context Filter creates a temporary subset of data before other filters are applied. Without it, Tableau applies all filters independently — so 'Top 10 products in Delhi' might first find Top 10 across all cities, then filter to Delhi, giving wrong results. Making the city filter a Context Filter ensures Tableau isolates Delhi data first, then finds the Top 10 within it."
Q11 (3 min): "Explain the difference between FIXED, INCLUDE, and EXCLUDE LOD expressions."
Answer:
"All three compute at a different granularity than the view. FIXED ignores the view entirely — `{FIXED [Region] : SUM(Sales)}` always gives regional totals. INCLUDE adds a dimension — increases granularity. EXCLUDE removes a dimension — decreases granularity. For example, if my view is at City level and I use `{EXCLUDE [City] : SUM(Sales)}`, I get the country total on every city row."
Round 6: Statistics (5 minutes)
Q12 (2 min): "Explain p-value to a non-technical person."
Answer:
"Imagine you flip a coin 100 times and get 90 heads. The p-value asks: 'If the coin is fair, what's the probability of getting a result this extreme?' That probability is nearly zero — so we conclude the coin is rigged. In analytics, if p < 0.05, we trust that a pattern is real, not random luck."
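The coin example can be computed exactly from the binomial distribution, using only the standard library:

```python
from math import comb

# Exact one-sided p-value: P(at least 90 heads in 100 flips of a FAIR coin).
n, k = 100, 90
p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
print(p_value)  # on the order of 1e-17: effectively zero, so reject "fair coin"
```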
Q13 (3 min): You ran an A/B test. Control: 4.0% conversion (n=5000). Treatment: 4.5% conversion (n=5000). p-value = 0.12. What do you tell the product team?
Answer:
"At p = 0.12, the result is NOT statistically significant at the 95% confidence level. However, the directional improvement of +0.5pp is promising. I'd recommend: (1) extending the test for 2 more weeks to increase sample size, (2) checking if specific segments show stronger effects, and (3) calculating the potential business impact — if each conversion is worth ₹500, the potential gain is ₹12,500/day. I wouldn't roll out yet, but I wouldn't kill it either."
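The arithmetic behind this call can be reproduced with a pooled two-proportion z-test (standard library only; the question doesn't say whether its p = 0.12 is one- or two-sided, and the two-sided version below gives roughly 0.22, but the conclusion, not significant at 0.05, is the same):

```python
from math import sqrt, erf

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# A/B numbers from the question: 4.0% vs 4.5%, n = 5000 per arm.
n_c = n_t = 5000
p_c, p_t = 0.040, 0.045
pooled = (p_c * n_c + p_t * n_t) / (n_c + n_t)          # pooled conversion rate
se = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))  # pooled standard error
z = (p_t - p_c) / se
p_two_sided = 2 * (1 - norm_cdf(abs(z)))
print(round(z, 2), round(p_two_sided, 2))  # z ≈ 1.24: below the 1.96 cutoff
```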
Round 7: ML / Decision Trees (5 minutes)
Q14 (5 min): "Explain how a Decision Tree works and why Random Forest is better."
Answer:
"A Decision Tree recursively splits data by asking questions — it picks the feature that reduces impurity the most (measured by Gini or Entropy). For example, 'Is monthly spend > ₹5000?' splits customers into groups with different churn rates. It keeps splitting until leaves are pure or a stopping condition is met.
The problem is overfitting — a deep tree memorizes training data including noise. Random Forest solves this by building hundreds of trees, each on a random subset of data and features. Individual tree errors cancel out through majority voting. The result is higher accuracy and better generalization. The trade-off: Random Forest is less interpretable, but for production predictions, accuracy matters more."
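The split-selection step can be made concrete with a pure-Python Gini calculation (the customer spend/churn pairs are invented for illustration):

```python
# How a tree scores a candidate split: weighted Gini impurity.
def gini(labels):
    """Gini impurity of a set of binary labels (1 = churned)."""
    p = sum(labels) / len(labels)
    return 1 - p**2 - (1 - p)**2

# Hypothetical customers: (monthly_spend, churned)
data = [(6200, 0), (7100, 0), (5400, 0), (5600, 0),
        (3100, 1), (2800, 1), (4900, 1), (4200, 0)]

# Candidate split: "Is monthly spend > 5000?"
left  = [y for x, y in data if x > 5000]    # high spenders
right = [y for x, y in data if x <= 5000]   # low spenders

parent = gini([y for _, y in data])
weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(data)
print(parent, weighted)  # impurity drops from ~0.47 to ~0.19: an informative split
```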
Rounds 8-10: Case Study + Behavioral (20 minutes)
Q15 (8 min): "A retail client sees 15% revenue decline despite 20% increase in website traffic. Diagnose the problem."
Answer:
Framework: Revenue = Traffic × Conversion Rate × Average Order Value
Given: Traffic ↑ 20%, Revenue ↓ 15%
Step 1: Calculate implied CR × AOV change
Revenue = Traffic × CR × AOV
0.85R = 1.20T × CR_new × AOV_new
CR_new × AOV_new = 0.85/1.20 = 0.708 of original
→ Combined CR × AOV dropped by ~29%
Step 2: Diagnose - which dropped?
Check Conversion Rate separately:
- If CR dropped: new traffic may be low-quality (paid ads with poor targeting, bot traffic, irrelevant keywords)
- If AOV dropped: customers buying cheaper items, heavy discounting, product mix shift
Step 3: Likely scenario
Traffic up 20% from a new campaign → low-intent visitors
These visitors browse but don't buy (low CR)
The few who do buy get attracted by discounts (low AOV)
Step 4: Recommendations
1. Segment traffic: organic vs paid — check CR for each
2. Audit campaign targeting — are we reaching the right audience?
3. Check discount strategy — are we cannibalizing revenue?
4. Create a cohort analysis: new vs returning customer behavior
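Step 1's arithmetic can be checked in two lines:

```python
# Back out the implied CR x AOV change: Revenue = Traffic x CR x AOV.
traffic_mult = 1.20                       # traffic up 20%
revenue_mult = 0.85                       # revenue down 15%
cr_aov_mult = revenue_mult / traffic_mult
print(round(cr_aov_mult, 3), round((1 - cr_aov_mult) * 100, 1))
# CR x AOV is ~0.708 of its old level, i.e. a ~29.2% combined drop
```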
Scoring Guide
| Area | Score (1-5) |
|---|---|
| HR answers — structured, under time limit | __ / 5 |
| Aptitude — correct and quick | __ / 5 |
| SQL — correct syntax, efficient queries | __ / 5 |
| Python — clean code, explained approach | __ / 5 |
| Tableau — concept clarity | __ / 5 |
| Statistics — accurate, business-oriented | __ / 5 |
| ML — explained with examples | __ / 5 |
| Case Study — structured framework | __ / 5 |
| Total | __ / 40 |
35+ → You're ready. Walk in with confidence.
25-34 → Good base. Revise weak areas one more time.
Below 25 → Go through the round-specific guides again. Focus on theory first.