Methodology & reading guide

This page is the long-form reading guide for the simulator. It explains in calm, descriptive language what the simulation does, why it does it, and where its limits are. If something is unclear in the lab, the answer is most likely here.

1. What the simulation is

A model. More precisely: an agent-based model with around 2 000 synthetic profiles meeting on a virtual dating marketplace. You take the architect perspective. You are not on the app yourself; you set the rules. Sliders on the left control selectivity, gender ratio, and population size. One click swaps the recommendation algorithm. You can toggle dark patterns taken from real platform practice. The right pane shows in real time what your changes do.

Everything runs in your browser. No server, no telemetry of your slider moves, no storage beyond theme and language preference. Closing the tab wipes the simulation. Shared seeds in URLs reproduce the same run on another visit.

2. Why it exists

Dating platforms are socio-technical experiments running on millions of people. Their internal rules are not public. What we know comes from aggregated industry reports, whistleblower data, court records, and a few research papers. A simulation can turn this outside view into a traceable mechanism model. You are not seeing a perfect copy of Tinder or Hinge. You are seeing which effects certain architectures structurally enforce.

Goal: after an hour with this lab you no longer think "algorithms are only for mathematicians", but rather "I understand why Hinge's SMR looks the way it does, and which business decision sits behind it."

3. The population

2 000 agents by default (slider 500-10 000). Each agent carries a vector:

Gender - binary 0/1. This is a simplification; the underlying studies record gender as a binary variable. Real user populations are more diverse.
Attractiveness - latent, normally distributed around 0.5. Not a real value, but an abstract construct for "how often will this profile be marked positive when seen".
Selectivity - how often the agent right-swipes others, so a higher value means more likes sent. Male agents start at 46 % in the sim (empirically 33-53 %; the 46 % figure is contested across sources, see section 5), female agents at 5.5 % (with noise).
Interests - a 32-bit bitmask, 4-8 random bits set. Used for Jaccard similarity in FAIR-MATCH.
Trait - abstract feature that in reality often represents ethnicity. Only takes effect when the "trait bias" dark-pattern toggle is on. Around 25 % of agents receive the flag (see population.ts).
Counters - likes sent/received, matches, messages, days active, frustration.

Gender split 67 % male / 33 % female reflects aggregated platform data 2024/2025. This asymmetry is the precondition for every effect that follows.
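The agent vector can be sketched in TypeScript as follows. This is a minimal illustration, not the contents of population.ts: the field names, the gender encoding, the standard deviations (0.15 and 0.01), and the small seeded generator are all assumptions made for the example. Counters are omitted because they all start at zero.

```typescript
// Hypothetical agent shape; field names are illustrative, not taken from population.ts.
interface Agent {
  gender: 0 | 1;          // assumed encoding: 1 = male, 0 = female
  attractiveness: number; // latent score centred on 0.5
  selectivity: number;    // probability of right-swiping a profile
  interests: number;      // 32-bit bitmask with 4-8 bits set
  trait: boolean;         // set for around 25 % of agents
}

// Seeded linear congruential generator: same seed, same sequence.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // uniform in [0, 1)
  };
}

// Box-Muller transform: one standard-normal draw from two uniform draws.
function standardNormal(rng: () => number): number {
  const u = 1 - rng(); // in (0, 1], so Math.log(u) is finite
  const v = rng();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Picks 4 to 8 distinct bit positions out of 32.
function randomInterests(rng: () => number): number {
  const target = 4 + Math.floor(rng() * 5);
  let mask = 0;
  let bitsSet = 0;
  while (bitsSet < target) {
    const bit = 1 << Math.floor(rng() * 32);
    if ((mask & bit) === 0) {
      mask |= bit;
      bitsSet++;
    }
  }
  return mask >>> 0;
}

function createAgent(rng: () => number): Agent {
  const gender: 0 | 1 = rng() < 0.67 ? 1 : 0; // 67 % male / 33 % female
  const baseRate = gender === 1 ? 0.46 : 0.055;
  return {
    gender,
    attractiveness: clamp01(0.5 + 0.15 * standardNormal(rng)), // spread is an assumption
    selectivity: clamp01(baseRate + 0.01 * standardNormal(rng)), // noise size is an assumption
    interests: randomInterests(rng),
    trait: rng() < 0.25,
  };
}

function createPopulation(seed: number, size: number): Agent[] {
  const rng = makeRng(seed);
  const agents: Agent[] = [];
  for (let i = 0; i < size; i++) agents.push(createAgent(rng));
  return agents;
}
```

Calling createPopulation(42, 2000) twice yields the same agents, which is the property the shared-seed URLs rely on.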

4. The five engines - what they do, why they differ

Engine A - ELO

Idea: borrowed from chess. Every agent has a hidden score (start 1500). After a match, both scores shift by an amount that depends on the pre-match difference: being liked by someone with a much higher score earns you more points; being liked by someone much lower earns you almost nothing.

Effect: hierarchies form fast. An elite circulates among itself, the majority loses visibility. Tinder used a system like this in its early phase and has officially moved away from it. Residues still shape behavior.

Engine B - Gale-Shapley

Idea: a Nobel-prize-winning algorithm for stable matchings (the classic "Stable Marriage Problem"). Each iteration: all available men propose, all women decide. A matching is stable if no two people would prefer each other over their assigned partners.
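The classic algorithm can be sketched as below. This is a textbook version under stated assumptions, not the simulator's code: both sides have the same size, every agent ranks everyone on the other side, and proposals are processed one at a time rather than in rounds (the resulting matching is the same). All function and parameter names are hypothetical.

```typescript
// prefs[i] lists partner indices from most to least preferred.
// Returns engagedTo, where engagedTo[receiver] = proposer.
function galeShapley(proposerPrefs: number[][], receiverPrefs: number[][]): number[] {
  const n = proposerPrefs.length;
  // rank[r][p] = position of proposer p in receiver r's list (lower is better)
  const rank: number[][] = receiverPrefs.map((prefs) => {
    const positions: number[] = new Array(n);
    prefs.forEach((p, pos) => { positions[p] = pos; });
    return positions;
  });
  const nextChoice: number[] = proposerPrefs.map(() => 0);
  const engagedTo: number[] = receiverPrefs.map(() => -1);
  const free: number[] = proposerPrefs.map((_, i) => i);

  while (free.length > 0) {
    const p = free.pop() as number;
    const r = proposerPrefs[p][nextChoice[p]];
    nextChoice[p]++;
    const current = engagedTo[r];
    if (current === -1) {
      engagedTo[r] = p;            // receiver was unmatched: accept
    } else if (rank[r][p] < rank[r][current]) {
      engagedTo[r] = p;            // receiver trades up
      free.push(current);          // previous partner is free again
    } else {
      free.push(p);                // rejected: p will try the next choice
    }
  }
  return engagedTo;
}

// A matching is stable if no proposer/receiver pair prefer each other
// over their assigned partners.
function isStable(proposerPrefs: number[][], receiverPrefs: number[][], engagedTo: number[]): boolean {
  const n = proposerPrefs.length;
  const partnerOf: number[] = new Array(n);
  engagedTo.forEach((p, r) => { partnerOf[p] = r; });
  for (let p = 0; p < n; p++) {
    for (const r of proposerPrefs[p]) {
      if (r === partnerOf[p]) break; // everyone after this is less preferred
      if (receiverPrefs[r].indexOf(p) < receiverPrefs[r].indexOf(engagedTo[r])) return false;
    }
  }
  return true;
}

const exampleProposerPrefs = [[0, 1, 2], [0, 2, 1], [1, 0, 2]];
const exampleReceiverPrefs = [[1, 0, 2], [0, 1, 2], [0, 1, 2]];
const exampleMatching = galeShapley(exampleProposerPrefs, exampleReceiverPrefs);
```

In this example proposer 2 ends up with a last choice, yet the matching is stable: nobody that proposer prefers would switch.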

Quirk in this sim: agents engage in "aspirational pursuit". They reject proposers whose attractiveness is well below their own aspiration tier. Result: many matches fail because both sides reject profiles outside their league, even though the algorithm identified the pairing as "stable".

Hinge uses this logic for its "Most Compatible" feature.

Engine C - Collaborative filtering

Idea: "people who click like you have similar taste", taken from e-commerce ("customers who bought this also bought...").

Problem: amplifies majority preferences exponentially. Profiles that already have many likes get shown to even more people; profiles with few likes disappear. This is the documented "popularity bias". Marginalized or non-mainstream profiles get systematically under-recommended.
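A toy user-based collaborative filter makes the mechanism visible. The sketch below is an assumption-laden illustration, not Engine C itself: similarity is simply the number of shared likes, the like matrix is invented, and all names are hypothetical.

```typescript
// likes[u][c] is 1 if user u liked profile c, else 0.
function sharedLikes(a: number[], b: number[]): number {
  let shared = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] === 1 && b[i] === 1) shared++;
  }
  return shared;
}

// Ranks the profiles the user has not liked yet by similarity-weighted votes.
function recommend(likes: number[][], user: number): number[] {
  const profileCount = likes[user].length;
  const scores: number[] = [];
  for (let c = 0; c < profileCount; c++) scores.push(0);

  likes.forEach((row, other) => {
    if (other === user) return;
    const similarity = sharedLikes(likes[user], row);
    for (let c = 0; c < profileCount; c++) scores[c] += similarity * row[c];
  });

  const candidates: number[] = [];
  for (let c = 0; c < profileCount; c++) {
    if (likes[user][c] === 0) candidates.push(c);
  }
  return candidates.sort((x, y) => scores[y] - scores[x]);
}

// Invented data: profile 0 already has two likes, profile 1 has one, profile 3 has none.
const toyLikes = [
  [0, 0, 1, 0], // user 0 (the one we recommend for)
  [1, 0, 1, 0],
  [1, 0, 1, 0],
  [0, 1, 1, 0],
];
const toyRanking = recommend(toyLikes, 0);
```

Profile 0 is ranked first because it already has the most likes among similar users, and profile 3, with no likes, comes last. Each new like from such a recommendation strengthens the same ordering on the next pass.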

Engine D - FAIR-MATCH

Idea: reciprocal recommender. Instead of one-sided "does X fit Y?", it takes the harmonic mean of two directed probabilities: does X fit Y and does Y fit X. The harmonic mean has a key property: if either value is small, the overall result drops sharply. Asymmetric interest is automatically devalued.

Effect: popularity bias is reduced. Gini drops. But: fewer total matches, because the engine ranks more conservatively.

Model note: The 0.4-weight term J(network) in s(x, y) = 0.6 · J(interests) + 0.4 · J(network) is currently computed using the same interest-Jaccard value as the 0.6 term. The "network" signal is therefore a proxy for shared interests, not an independent signal. The 0.6/0.4 split is cosmetic in the current implementation.

Engine E - FAIR-Honest

Idea: the open-source ideal. Engine E is FAIR-MATCH without any dark-pattern layer. Concretely: the applyDarkPatterns step is skipped entirely. The reciprocal recommendation logic (harmonic mean, same formulas as Engine D) runs unchanged.

Purpose: a reference baseline showing how the system would behave without commercial interventions. No surveillance pricing, no artificial match scarcity, no amplified trait bias.

Limited comparability: the dark-pattern layer is not merely a degradation mechanism - it also drives part of the baseline frustration, churn rate, and stable-pair exit events. Engine E skips this layer entirely. Its burnout, active-user, and exit KPIs are therefore structurally not directly comparable to those of Engines A-D. Differences can reflect the absence of dark patterns, but also the absence of frustration drivers that in reality exist independently of dark patterns. Use Engine E as a directional reference, not a quantitative benchmark.

5. Reading the KPIs

This is a model. The empirical benchmarks below come from external sources (Tinder/Hinge reports, research literature). They serve as a plausibility check for the simulation's output, not as proof that the simulation correctly reproduces reality.

EMPIRICAL - external measurement from research or industry report
SIM - sim behavior or guidance value for this simulation

SMR - Swipe-to-Match-Ratio

SMR = matches ÷ likes sent. The hard indicator of "market value" on the app.

EMPIRICAL Men: average ~5.3 %, median 2 %. Top 10 % > 12.5 %.
EMPIRICAL Women: average ~44 %, median 41 %. Bottom 10 % around 15.8 %.

SIM If your sim shows a male SMR of 30 %, you have either pushed female selectivity to atypical highs, or you are running a tiny population on lucky rolls. The defaults are the realistic values, which is why they are the defaults.

MRR - Message Response Rate

Share of your opening messages that get a reply. Measures whether your conversation even starts. We split MRR into "you initiated" and "they initiated". EMPIRICAL The asymmetry is huge: men initiate 79 % of conversations with 12-char openers; women initiate 21 % with 120+ chars.

MDC - Match-to-Date Conversion

Share of matches that turn into a simulated real-world date. EMPIRICAL 3-10 % in reality. SIM If your value is lower than that, you likely have dark patterns active or a very uneven distribution.

Gini coefficient

Borrowed from economics. Describes how unequal a distribution is. 0 = everyone gets the same, 1 = one person has it all. EMPIRICAL Hinge sits at around 0.58, more inequality than most national economies show. SIM Our default sim reproduces this value approximately.

Top 10 % share

Share of all female likes that goes to the top 10 % of male profiles. EMPIRICAL Around 58 %. SIM If this climbs to 80 %, Engine C (collaborative) is probably active, or you have enabled trait bias.
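As a sketch of how this share can be computed from a list of likes received per profile: the function name, the rounding rule for the group size, and the sample numbers are assumptions for illustration only.

```typescript
// Share of all likes that goes to the top `fraction` of profiles.
function topShare(likesReceived: number[], fraction: number): number {
  const sorted = likesReceived.slice().sort((a, b) => b - a); // descending
  const total = sorted.reduce((sum, v) => sum + v, 0);
  if (total === 0) return 0;
  // At least one profile counts as "top"; rounding to nearest is an assumption.
  const groupSize = Math.max(1, Math.round(sorted.length * fraction));
  const top = sorted.slice(0, groupSize).reduce((sum, v) => sum + v, 0);
  return top / total;
}

// Invented sample: ten profiles, 100 likes in total, 58 of them on one profile.
const sampleLikes = [58, 10, 8, 6, 5, 4, 3, 3, 2, 1];
const sampleTopShare = topShare(sampleLikes, 0.1); // 58 / 100
```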

Burnout index

Aggregate "frustration" across active agents. Frustration grows when likes go unmatched, when conversations break, when dark patterns degrade UX. Above 75 % the end state triggers: a modal noting simulated mass attrition. EMPIRICAL 79 % of Gen Z, 80 % of millennials report app burnout.

Male selectivity - note on the default value

The default starting value for male selectivity in this simulation is 46 %. EMPIRICAL This figure is contested across sources: estimates range from 33 % to 53 % depending on study, platform, and measurement year. The default sits within that reported range but should not be read as a settled single-point figure.

6. Dark patterns

Dark patterns are design choices that steer user behavior against user interest. They are explicitly named in this sim, off by default, and one click away. You should see what they do, not because they are cool, but because they exist.

Surveillance pricing

Real platforms have been documented varying premium prices by user profile. The best-known example is the 2018 class action against Tinder (California court): users over 30 were charged twice what under-30s paid for the same premium features, i.e. age-based demographic price discrimination. Settlement: USD 60.5 m.

Model note: In this simulation, surveillance pricing is triggered by tenure and match rate (condition: more than 5 days active and fewer than 3 matches). Demographic or time-of-day pricing, as documented in some real cases, is not implemented in this simulation. The connection to the Tinder age case serves as a real-world reference example, not as a description of the model's behavior.
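The trigger condition from the model note can be written as a one-line predicate. The interface and function names below are hypothetical; what happens once the condition holds is not specified here, so the sketch only covers the check.

```typescript
// Hypothetical slice of the agent counters needed for the check.
interface PricingCounters {
  daysActive: number;
  matches: number;
}

// True when the agent has been active for more than 5 days with fewer than 3 matches.
function isSurveillancePricingTarget(agent: PricingCounters): boolean {
  return agent.daysActive > 5 && agent.matches < 3;
}

const longTenureFewMatches = isSurveillancePricingTarget({ daysActive: 6, matches: 2 });
const tooNew = isSurveillancePricingTarget({ daysActive: 5, matches: 0 });
```

Both comparisons are strict, so an agent on day 5 or with exactly 3 matches is not targeted.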

Artificial match scarcity

Highly compatible pairs are withheld and revealed only after a premium purchase. The algorithm knows that X and Y would likely match and only surfaces the suggestion when someone pays. Documented in industry reporting (Groundwork Collaborative 2024).

Trait bias

An abstract profile trait is algorithmically devalued. In this simulation, the effect applies exclusively to the Collaborative Filtering engine (Engine C). Under ELO, Gale-Shapley, and FAIR-MATCH, the trait flag has no effect on candidate selection. Around 25 % of agents receive the trait flag.

We model this as an abstract trait, not ethnicity. Real research (Cornell 2018, OkCupid data) shows exactly this pattern is measurable on around 65 % of studied platforms.

7. Formulas

Gini coefficient

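G = 1 - 2 · (Σ cumulative values) / (n · Σ values) + 1/n

where the n values are sorted in ascending order and the cumulative values are their running totals.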
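A direct transcription of the formula, as a sketch: the function name is hypothetical, and returning 0 for an empty or all-zero input is a convention chosen for the example.

```typescript
function gini(values: number[]): number {
  const n = values.length;
  const sorted = values.slice().sort((a, b) => a - b); // ascending order is required
  const total = sorted.reduce((sum, v) => sum + v, 0);
  if (n === 0 || total === 0) return 0; // convention for degenerate input

  let running = 0;         // running total up to the current value
  let sumOfCumulative = 0; // Σ cumulative values
  for (const v of sorted) {
    running += v;
    sumOfCumulative += running;
  }
  return 1 - (2 * sumOfCumulative) / (n * total) + 1 / n;
}

const giniEqual = gini([5, 5, 5, 5]);      // everyone gets the same: 0
const giniOneTakesAll = gini([0, 0, 0, 10]); // one of four has it all: 1 - 1/4
```

With a finite population the maximum is 1 - 1/n rather than exactly 1, which is why the second example gives 0.75.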

Jaccard coefficient

J(A, B) = popcount(A ∧ B) / popcount(A ∨ B)
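On 32-bit interest bitmasks this is two bitwise operations and two bit counts. The sketch below uses hypothetical function names; returning 0 when both masks are empty is an assumption (it cannot occur when every agent has 4-8 bits set).

```typescript
// Counts set bits by repeatedly clearing the lowest one.
function popcount(mask: number): number {
  let v = mask >>> 0; // treat the input as an unsigned 32-bit value
  let count = 0;
  while (v !== 0) {
    v = (v & (v - 1)) >>> 0;
    count++;
  }
  return count;
}

function jaccard(a: number, b: number): number {
  const unionSize = popcount(a | b);
  return unionSize === 0 ? 0 : popcount(a & b) / unionSize;
}

// 0x0f has bits 0-3 set, 0x06 has bits 1-2 set: 2 shared bits out of 4 in the union.
const jaccardExample = jaccard(0x0f, 0x06);
```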

Harmonic mean (FAIR-MATCH)

r(x, y) = 2 / (1/s(x,y) + 1/s(y,x))

with s(x, y) = 0.6 · J(interests) + 0.4 · J(network) + 0.001

Note: J(network) is currently a proxy for J(interests), not an independent network signal.
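The two formulas translate into the following sketch. Function names are hypothetical, and the input values in the example are invented to show how the harmonic mean treats asymmetric scores.

```typescript
// s(x, y): directed score. The + 0.001 keeps the value positive, so 1/s is always defined.
function directedScore(jInterests: number, jNetwork: number): number {
  return 0.6 * jInterests + 0.4 * jNetwork + 0.001;
}

// r(x, y): harmonic mean of the two directed scores.
function reciprocalScore(sxy: number, syx: number): number {
  return 2 / (1 / sxy + 1 / syx);
}

const symmetric = reciprocalScore(0.5, 0.5);  // equal inputs: the mean equals the input, 0.5
const asymmetric = reciprocalScore(0.9, 0.1); // about 0.18, far below the arithmetic mean of 0.5
```

The second case shows the devaluation of one-sided interest: the same pair of scores averages to 0.5 arithmetically but to about 0.18 harmonically.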

ELO update

e(A) = 1 / (1 + 10^((r_B - r_A) / 400)), then r_A <- r_A + K · (1 - e(A)) on a match.
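As a sketch of the update, with hypothetical function names. The value of K is not stated here; K = 32 in the example is an assumption borrowed from common chess practice.

```typescript
// e(A): expected score of A against B.
function expectedScore(rA: number, rB: number): number {
  return 1 / (1 + Math.pow(10, (rB - rA) / 400));
}

// New rating of A after being liked by (matching with) B.
function ratingAfterMatch(rA: number, rB: number, k: number): number {
  return rA + k * (1 - expectedScore(rA, rB));
}

const K = 32; // assumed value, not taken from the simulator
const likedByEqual = ratingAfterMatch(1500, 1500, K);  // e(A) = 0.5, gain of 16
const likedByHigher = ratingAfterMatch(1500, 1900, K); // e(A) = 1/11, gain of about 29.1
const likedByLower = ratingAfterMatch(1900, 1500, K);  // e(A) = 10/11, gain of about 2.9
```

The three cases reproduce the behavior described for Engine A: a like from a much higher-rated agent is worth roughly ten times as much as one from a much lower-rated agent.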

8. What it cannot do

It says nothing about specific individuals. The agents are synthetic.
It is not an empirical model of Tinder/Hinge/Bumble. Internal algorithms are not public.
It does not reproduce rare events (romance scams, harassment, violence risks). These belong in the Safety & Trust chapter.
It is not therapy, counseling, or self-help.

9. Reproducibility

Every run is reproducible via URL query. Example: ?seed=42&pop=2000&ms=0.67&ml=0.46&fl=0.055&engine=fair-match&dp=sp,as. Same URL = bit-identical trajectory. Share it to discuss concrete findings.
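Reading such a query string back into a run configuration could look like the sketch below. The interface, the function name, and the comments on what each key means are assumptions inferred from the default values elsewhere on this page (0.67, 0.46, 0.055); the simulator's own parser may differ. Missing numeric keys come back as NaN rather than being defaulted.

```typescript
// Hypothetical run configuration; key meanings are inferred, not confirmed.
interface RunConfig {
  seed: number;
  pop: number;    // population size
  ms: number;     // presumably the male share
  ml: number;     // presumably the male like rate
  fl: number;     // presumably the female like rate
  engine: string;
  dp: string[];   // enabled dark-pattern codes
}

function parseRunQuery(query: string): RunConfig {
  const raw: { [key: string]: string } = {};
  const body = query.charAt(0) === "?" ? query.slice(1) : query;
  for (const pair of body.split("&")) {
    const eq = pair.indexOf("=");
    if (eq > 0) {
      raw[decodeURIComponent(pair.slice(0, eq))] = decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return {
    seed: Number(raw["seed"]),
    pop: Number(raw["pop"]),
    ms: Number(raw["ms"]),
    ml: Number(raw["ml"]),
    fl: Number(raw["fl"]),
    engine: raw["engine"] || "",
    dp: raw["dp"] ? raw["dp"].split(",") : [],
  };
}

const exampleRun = parseRunQuery(
  "?seed=42&pop=2000&ms=0.67&ml=0.46&fl=0.055&engine=fair-match&dp=sp,as"
);
```

Feeding the parsed seed into a seeded random number generator, instead of Math.random, is what makes two visits with the same URL follow the same trajectory.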

10. Sources

Defaults and benchmark values come from the lexicon sources (see the bibliographies in articles 1-10) and the AlgoMatch source document. Where lexicon and simulator disagree, the lexicon wins on substantive claims and the simulator wins on its own technical reality (e.g. default parameters).