Why Your MBTI Type Keeps Changing — The Science of Test-Retest Reliability
Published April 27, 2026 · 11 min read
You took an MBTI test two years ago and got INFJ. You took it last summer and got INFP. You took it this morning and got INTP. Now you don't know what to believe and you're starting to wonder if any of this is real. This is the single most common question MBTI users ask, and the answer is more interesting than "the test is broken." Your type is shifting for specific, predictable reasons, and once you understand them, you can read your own results far more accurately. Here is the actual science.
The headline number: about 50% change at least one letter
The most-cited finding in MBTI psychometrics is that roughly half of people who retake the test within about five weeks come back with at least one letter changed. The figure traces back to Howes and Carskadon's 1979 test-retest study, and later critical reviews (notably Pittenger's) report the same pattern across different MBTI versions and populations.
That number can sound damning, but it actually tells us something specific. Most of those flips are along the dichotomies where the person was scoring close to 50/50 in the first place. Almost nobody flips from a strong Extraversion preference to a strong Introversion preference. What does happen is somebody who scored 53% Thinking and 47% Feeling one week scores 49% Thinking and 51% Feeling the next week, and the test labels them differently. The underlying personality didn't move much. The categorization rule did.
Your type isn't really changing. The test is forcing a binary answer to a question that's actually a slider, and you happen to be sitting near the middle of the slider.
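A quick arithmetic check shows why a roughly 50% whole-type change rate does not require wildly unstable letters. The sketch below assumes, purely for illustration, that each of the four letters is retained independently and with the same probability. Real dichotomies differ in stability and are not fully independent, and the function name is hypothetical.

```python
def whole_type_retention(per_letter_retention: float, letters: int = 4) -> float:
    """Probability that every letter stays the same across two sittings.

    Assumption (not from the MBTI literature): letters are retained
    independently, each with the same probability.
    """
    return per_letter_retention ** letters


for p in (0.95, 0.90, 0.84, 0.75):
    same = whole_type_retention(p)
    print(f"per-letter retention {p:.2f} -> same type {same:.2f}, "
          f"at least one letter changed {1 - same:.2f}")
```

Under these assumptions, a per-letter retention of about 0.84 is already enough to produce the headline figure, because 0.84 multiplied by itself four times is about 0.50.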
Reason 1: You're near the midpoint on a dichotomy
This is the dominant cause and worth understanding deeply. MBTI treats each of the four dimensions as a binary choice. But the underlying psychological traits are continuous — they distribute along a roughly normal curve where most people cluster near the middle. If you're near the midpoint on, say, the Thinking/Feeling dimension, you have a roughly equal chance of being labeled T or F on any given administration, because tiny day-to-day variation in your answers swings you across the cutoff.
This isn't a flaw in you. It's a flaw in any system that forces continuous data into binary categories. The fix isn't to retake the test until you "get the real answer" — the fix is to accept that you have weak preferences on that dimension, which itself is a valid and useful piece of self-knowledge. People with strong preferences (above 70% on a dichotomy) almost never flip. People near the middle flip constantly.
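The midpoint effect can be sketched with a simple noise model. The code below assumes that your observed score on a dichotomy is your true score plus normally distributed day-to-day noise. The 5-point noise standard deviation is a made-up illustrative value, not an estimate from any MBTI study, and the function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def flip_probability(true_score: float, noise_sd: float = 5.0,
                     cutoff: float = 50.0) -> float:
    """Chance that two independent sittings produce different letters.

    Assumption: observed score = true_score + Normal(0, noise_sd).
    """
    p_above = 1.0 - normal_cdf((cutoff - true_score) / noise_sd)
    # The letters differ when exactly one of the two sittings lands above the cutoff.
    return 2.0 * p_above * (1.0 - p_above)


for score in (50, 53, 60, 70, 85):
    print(f"true score {score}% -> flip probability {flip_probability(score):.3f}")
```

In this model the flip probability is 0.5 exactly at the midpoint and falls off quickly as the true score moves away from the cutoff, which matches the pattern described above: strong preferences rarely flip, weak ones flip often.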
Reason 2: Mood and context shift your answers
Personality tests are self-report instruments. You're not measuring an objective trait directly — you're measuring your current perception of yourself, filtered through your mood, recent experiences, and the framing you bring to each question.
If you take the test after a productive week at work where you crushed your goals, you'll skew Judging. If you take it Sunday night after a chaotic weekend, you'll skew Perceiving. If you just had a great conversation with a friend, you'll skew Feeling. If you just lost an argument and felt frustrated by emotional reasoning, you'll skew Thinking. These shifts are small but they're enough to flip your letter when you're already near the midpoint.
- Stressful periods inflate Introversion scores (you crave alone time more than usual).
- Demanding work periods inflate Judging and Thinking (you're in execution mode).
- Relationship conflict inflates Feeling (your emotional bandwidth is loaded).
- Creative or curious phases inflate Intuition (you're scanning for possibilities).
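To see why a small mood-driven shift only matters near the midpoint, here is a minimal sketch. The scores, the size of the shift, and the `type_code` helper are all hypothetical. Each score is the percent preference toward the first letter of its pair.

```python
# Letter pairs in the usual four-letter order.
AXES = (("E", "I"), ("S", "N"), ("T", "F"), ("J", "P"))


def type_code(scores: dict) -> str:
    """Collapse four percentage scores into a four-letter type (cutoff at 50)."""
    return "".join(first if scores[first + second] >= 50 else second
                   for first, second in AXES)


# Made-up baseline: clear I, N and T preferences, but J/P close to the middle.
baseline = {"EI": 22.0, "SN": 30.0, "TF": 71.0, "JP": 52.0}

# A chaotic weekend nudges answers 4 points toward Perceiving (illustrative value).
sunday_night = {**baseline, "JP": baseline["JP"] - 4.0}

# The same 4-point nudge on a strong preference changes nothing.
after_argument = {**baseline, "TF": baseline["TF"] - 4.0}

print(type_code(baseline))        # INTJ
print(type_code(sunday_night))    # INTP
print(type_code(after_argument))  # INTJ
```

The same-sized nudge flips the letter on the near-midpoint J/P score but leaves the strong T preference untouched.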
Reason 3: You actually grew
Personality is more stable than people assume but more changeable than personality psychologists used to claim. Big Five longitudinal research shows meaningful drift over years — people tend to become more conscientious and agreeable through their twenties and thirties, with neuroticism gradually decreasing for most. MBTI captures echoes of these same changes.
If you got a different type at 18 vs 28 vs 38, some of that genuinely reflects who you became. A formerly anxious Perceiver who developed real productivity habits in their thirties might now legitimately answer Judging questions differently. A formerly closed-off Thinker who went to therapy and learned to identify emotions might now legitimately answer some Feeling questions differently. These shifts are real growth, not noise.
Reason 4: You're answering aspirationally
This is the trickiest source of variance. When you read questions like "I usually plan ahead in detail," you might answer based on the planner you'd like to be rather than the planner you actually are. We'll call this aspiration bias; it is closely related to what psychologists call social desirability bias, and it's one reason people who say they're highly conscientious on questionnaires often don't show conscientious behavior in observable measures.
On any given test administration, your aspiration level fluctuates. The morning after a goal-setting conversation, you'll answer aspirationally. After a week where reality humbled you, you'll answer more accurately. The fix is to answer based on how you actually behaved last week, not how you wish you behaved. We try to phrase our questions to surface real behavior over self-image, but the bias is genuinely hard to eliminate completely.
Before each question, imagine a specific recent week of your life. Answer based on what you actually did that week, not what you generally aspire to. Aspiration bias drops dramatically.
The cognitive function stack is more stable than the four-letter code
Here's a useful escape hatch. If your letters keep flipping, especially between two specific types (INTJ and INTP, ENFP and ENFJ, INFJ and INFP), shift your attention from the four-letter code to the underlying cognitive function stack.
Each MBTI type corresponds to a specific stack of four cognitive functions in a specific order. An INTJ uses Ni (Introverted Intuition) as dominant, Te (Extraverted Thinking) as auxiliary, Fi (Introverted Feeling) as tertiary, and Se (Extraverted Sensing) as inferior. An INTP uses Ti-Ne-Si-Fe. Those are radically different mental architectures despite differing by only one letter on the surface.
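The mapping from a four-letter code to its function stack follows a mechanical rule, sketched below. This uses the common type-dynamics convention in which the tertiary function takes the same attitude as the dominant; other conventions exist, and the function name is hypothetical. It reproduces the INTJ and INTP stacks given above.

```python
OPPOSITE = {"N": "S", "S": "N", "T": "F", "F": "T"}
FLIP = {"e": "i", "i": "e"}


def function_stack(type_code: str) -> list:
    """Return [dominant, auxiliary, tertiary, inferior] for a four-letter type."""
    attitude, perceiving, judging, orientation = type_code.upper()

    # J types extravert their judging function (T or F);
    # P types extravert their perceiving function (N or S).
    if orientation == "J":
        extraverted, introverted = judging, perceiving
    else:
        extraverted, introverted = perceiving, judging

    # Extraverts lead with the extraverted function, introverts with the introverted one.
    if attitude == "E":
        dominant, auxiliary = extraverted + "e", introverted + "i"
    else:
        dominant, auxiliary = introverted + "i", extraverted + "e"

    # Tertiary: opposite of the auxiliary, in the dominant's attitude (assumed convention).
    tertiary = OPPOSITE[auxiliary[0]] + dominant[1]
    # Inferior: opposite of the dominant, in the opposite attitude.
    inferior = OPPOSITE[dominant[0]] + FLIP[dominant[1]]
    return [dominant, auxiliary, tertiary, inferior]


print("INTJ:", "-".join(function_stack("INTJ")))  # Ni-Te-Fi-Se
print("INTP:", "-".join(function_stack("INTP")))  # Ti-Ne-Si-Fe
```

Note that the INTJ and INTP stacks have no function in common, even though the codes differ by a single letter.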
When you read about how each stack actually operates, one usually fits noticeably better than the other. People who can't decide between INTJ and INTP often resonate strongly with either Ni or Ti once they understand what each function actually feels like to use. The function stack is the more stable layer of identity beneath the noisy four-letter code.
If you want a more reliable measurement
There are two ways to reduce flipping. The first is to use a test format that captures gradient rather than forcing binaries. Our MBTI test uses a 7-point Likert scale per item, which reduces the rounding error you get from two-option agree/disagree formats. You'll still get a four-letter type at the end, but the underlying data will be more nuanced and the result more stable across retakes.
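To make the idea of capturing a gradient concrete, here is a minimal sketch of how 7-point answers could be turned into a percentage for one dichotomy. It is an illustrative scoring scheme with made-up answers, not the scoring actually used by any particular test, and the function and parameter names are hypothetical.

```python
def dichotomy_percent(responses, reverse_keyed=()):
    """Convert 7-point answers (1 = strongly disagree, 7 = strongly agree)
    into a 0-100 preference score for the first pole of a dichotomy.

    reverse_keyed holds the indexes of items worded toward the opposite pole.
    """
    if not responses:
        raise ValueError("at least one response is required")
    total = 0
    for index, answer in enumerate(responses):
        if not 1 <= answer <= 7:
            raise ValueError(f"answer {answer} is outside the 1-7 scale")
        # Flip reverse-keyed items so that a high value always means the first pole.
        total += (8 - answer) if index in reverse_keyed else answer
    lowest, highest = len(responses), 7 * len(responses)
    return 100.0 * (total - lowest) / (highest - lowest)


thinking_items = [5, 4, 6, 3, 4]  # made-up answers; item 3 is reverse-keyed
score = dichotomy_percent(thinking_items, reverse_keyed={3})
print(f"Thinking {score:.0f}% / Feeling {100 - score:.0f}%")
```

Reporting the percentage alongside the letter keeps the information about how close you sit to the cutoff, which the letter alone throws away.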
The second is to use the Big Five. Validated Big Five inventories typically show retest reliability coefficients above 0.8 over multi-week intervals, which is meaningfully higher than what MBTI-style tests achieve. You won't get a memorable four-letter label, but you will get a profile that doesn't reorganize itself every time you take the test.
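A retest reliability coefficient is usually computed as the correlation between scores from two sittings of the same test. The sketch below computes a Pearson correlation by hand. The two score lists are invented for illustration and do not come from any study.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return covariance / math.sqrt(var_a * var_b)


# Made-up trait scores for the same eight people, a few weeks apart.
week_0 = [62, 45, 71, 38, 55, 80, 49, 66]
week_6 = [58, 49, 75, 35, 57, 77, 44, 70]
print(f"test-retest r = {pearson_r(week_0, week_6):.2f}")
```

Because the coefficient is computed on continuous scores, small wobbles around the middle lower it only slightly, whereas the same wobble can change a letter outright.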
So what type are you really?
The most useful framing: you probably have a "primary type" that fits you most of the time, and one or two "neighbor types" you flip into under specific conditions. An INTJ who flips to INTP under stress isn't confused; they have a primary INTJ profile with a J/P score near the midpoint. An ENFP who reads as ENFJ in years where they have heavy social responsibility isn't wrong; their J/P preference is contextual rather than rigid.
Hold your type loosely. It's a useful descriptor, not a permanent ID card. Read about the two types you flip between, notice which one matches your behavior across the broadest range of situations, and let that be your working answer. Then check yourself against the function stack and against a Big Five profile for a second opinion. The convergent answer across all three is much more trustworthy than any single test result.
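If you have kept your past results, the primary-plus-neighbors framing can be sketched as a simple tally. The result history below is made up, the helper names are hypothetical, and "neighbor" is taken to mean a type that differs from the primary by exactly one letter.

```python
from collections import Counter


def letters_different(a: str, b: str) -> int:
    """Count the positions at which two four-letter codes disagree."""
    return sum(x != y for x, y in zip(a, b))


def summarize_retakes(results):
    """Return (primary type, sorted one-letter neighbors seen in the history)."""
    if not results:
        raise ValueError("at least one result is required")
    counts = Counter(result.upper() for result in results)
    # On a tie, most_common keeps the type that appeared first in the history.
    primary, _ = counts.most_common(1)[0]
    neighbors = sorted(t for t in counts
                       if t != primary and letters_different(t, primary) == 1)
    return primary, neighbors


history = ["INTJ", "INTP", "INTJ", "INTJ", "ENTJ", "INTP"]  # made-up retakes
primary, neighbors = summarize_retakes(history)
print(f"primary: {primary}, neighbors: {neighbors}")
```

A tally like this is only a starting point; the function stack and a Big Five profile remain the second opinions.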
Frequently asked questions
Is it normal for my MBTI type to keep changing?
Completely normal. In the most-cited studies, around 50% of test takers retest with at least one letter changed within five weeks. The most common flips are along dichotomies where you score near the midpoint, usually T/F or J/P. It doesn't mean the test is broken or that you're confused about yourself; it means you're a continuous human being squeezed into a binary slot.
Which MBTI dichotomies change the most?
Thinking vs Feeling and Judging vs Perceiving are by far the most volatile across retakes, because most people score closer to the midpoint on these than on Extraversion/Introversion or Sensing/Intuition. Mood and recent life events affect T/F answers especially strongly. E/I tends to be the most stable letter because most people have clearer preferences about social energy.
Does this mean MBTI is fake?
No, it means MBTI is a blunt instrument. The underlying patterns it measures are real, but cramming continuous traits into binary categories produces avoidable noise. The Big Five framework addresses this by reporting your scores as percentiles along a spectrum rather than collapsing them into a type. Use MBTI as a useful heuristic, not a permanent identity.
Should I just pick the type that feels most like me?
Often, yes. If you've taken the test multiple times and consistently bounce between two specific types (say INFJ and INFP), reading both type descriptions carefully and asking trusted friends which fits you better is often more accurate than another retake. The cognitive function stack of each candidate type is also useful here — one usually fits noticeably better than the other.
What's the most reliable personality test?
By scientific standards, validated Big Five inventories like the NEO-PI-R and public-domain equivalents have among the best test-retest reliability of any widely used personality framework. They typically show retest reliability coefficients above 0.8 over multi-week intervals, substantially better than MBTI-style instruments. If reliability is your priority, Big Five is the right tool.