Skip to main content

🚀 Day-Before Cheatsheet — Rapid Revision

Everything in One Page — Read This the Night Before


SQL — Top 20 Patterns

#PatternSyntax
1Select specific columnsSELECT col1, col2 FROM table
2Filter rowsWHERE col = 'value'
3Multiple conditionsWHERE a = 1 AND (b = 2 OR c = 3)
4Pattern matchWHERE name LIKE 'A%'
5Null checkWHERE col IS NULL / IS NOT NULL
6Sort resultsORDER BY col DESC
7Limit rowsLIMIT 10 / TOP 10
8AggregateSELECT dept, COUNT(*), AVG(sal) FROM t GROUP BY dept
9Filter groupsGROUP BY dept HAVING COUNT(*) > 5
10Inner JoinFROM a JOIN b ON a.id = b.id
11Left JoinFROM a LEFT JOIN b ON a.id = b.id
12Self JoinFROM emp e JOIN emp m ON e.mgr_id = m.id
13Subquery in WHEREWHERE sal > (SELECT AVG(sal) FROM emp)
14Subquery in FROMFROM (SELECT ... ) AS sub
15CASE statementCASE WHEN x > 0 THEN 'Yes' ELSE 'No' END
16Window — RANKRANK() OVER (PARTITION BY dept ORDER BY sal DESC)
17Window — Running TotalSUM(amt) OVER (ORDER BY date)
18Window — LAG/LEADLAG(amt) OVER (ORDER BY date)
19CTEWITH cte AS (SELECT ...) SELECT * FROM cte
20Delete duplicatesDELETE FROM t WHERE id NOT IN (SELECT MIN(id) FROM t GROUP BY col)

DAX — Top 10 Formulas

#FormulaUse
1SUM(Sales[Revenue])Basic total
2COUNTROWS(Sales)Count rows
3DIVIDE(A, B, 0)Safe division
4DISTINCTCOUNT(Sales[CustID])Unique count
5CALCULATE(SUM(...), filter)Change filter context
6TOTALYTD(SUM(...), Date[Date])Year-to-date
7SAMEPERIODLASTYEAR(Date[Date])Previous year comparison
8ALL(Table)Remove all filters
9SUMX(table, expr)Row-by-row calculation
10CALCULATE + ALL → % of TotalPercentage analysis

Tableau — 5 Must-Know Concepts

  1. Dimensions vs Measures: Blue = categorical (creates headers). Green = numerical (gets aggregated).
  2. Context Filter: Creates a temp table BEFORE other filters. Use for "Top N within a group."
  3. LOD — FIXED: {FIXED [Region] : SUM([Sales])} — calculates at specified level, ignores the view.
  4. Filter Order: Extract → Data Source → Context → Dimension → Measure → Table Calc.
  5. Live vs Extract: Extract = faster (local copy, .hyper file). Live = real-time but slower. Use Extract 90% of the time.

Statistics — 5 Core Concepts

  1. Mean vs Median: Mean = sum/count (sensitive to outliers). Median = middle value (robust). Use Median for skewed data.
  2. 68-95-99.7 Rule: In a normal distribution, 68% within ±1σ, 95% within ±2σ, 99.7% within ±3σ.
  3. p-value: If p < 0.05 → result is statistically significant → reject H₀.
  4. Type I vs II: Type I = false positive (wrongly rejected H₀). Type II = false negative (missed real effect).
  5. Correlation ≠ Causation: r measures linear relationship. To prove causation → need controlled experiment (A/B test).

Python — 5 One-Liners

# 1. Read CSV and show shape
df = pd.read_csv('data.csv'); print(df.shape)

# 2. Missing values per column
df.isnull().sum()

# 3. Group by and aggregate
df.groupby('city')['revenue'].agg(['sum', 'mean', 'count'])

# 4. Top 10 rows by a column
df.nlargest(10, 'revenue')

# 5. Quick correlation matrix
df[['revenue', 'profit', 'quantity']].corr()

Excel — 5 Key Formulas

FormulaWhat It Does
=INDEX(C:C, MATCH("X", A:A, 0))Better VLOOKUP — looks any direction
=SUMIFS(D:D, B:B, "Delhi", C:C, ">1000")Conditional sum
=COUNTIFS(A:A, "Delhi", B:B, ">50000")Conditional count
=IFERROR(VLOOKUP(...), "Not Found")Handle lookup errors
=TEXT(A1, "MMM-YY")Format dates

Interview Mantras — Say These Lines

TopicPower Line
Outliers"I'd use Median, not Mean, because Mean is sensitive to outliers."
A/B Test"I'd ensure sufficient sample size with a power analysis before starting."
Accuracy Paradox"98% accuracy is meaningless with 2% fraud — I'd use F1 Score and AUC."
Context Filter"It creates a temporary table — subsequent filters work within that subset."
CALCULATE"It modifies the filter context of a DAX expression."
Correlation"Correlation suggests a relationship but doesn't prove causation."
Decision Tree"Easy to interpret but prone to overfitting. Random Forest fixes this via ensemble averaging."
Star Schema"Fact table = events. Dimensions = context. Filters flow dim → fact."