
Excel Power Query Fuzzy Merge Not Working? Here's What's Actually Going Wrong

May 1, 2026 · 9 min read

Power Query's fuzzy merge feature sounds like exactly what you need. You've got two spreadsheets with company names that don't quite match, and Microsoft added a "Use fuzzy matching to perform the merge" checkbox right there in the Merge dialog.

You check the box. Click OK. And the results are... terrible.

Half the matches are wrong. Companies that are obviously the same aren't matched. Others are paired with completely unrelated records. You fiddle with the similarity threshold, try again, and somehow the results get worse.

I've seen this exact frustration dozens of times. Power Query's fuzzy merge is genuinely useful in certain scenarios, but for the thing most people actually need it for — matching messy company names across two lists — it fails more often than it works.

Here's why, and what to do instead.

How Power Query Fuzzy Merge Actually Works

When you check "Use fuzzy matching," Power Query uses a Jaccard similarity algorithm on character pairs (bigrams). It breaks each text string into overlapping pairs of characters, then measures what fraction of pairs are shared between two strings.

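For "Acme Corp", the bigrams are: "Ac", "cm", "me", "e ", " C", "Co", "or", "rp".

For "ACME Corporation": "AC", "CM", "ME", "E ", " C", "Co", "or", "rp", "po", "or", "ra", "at", "ti", "io", "on".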

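Already there's a potential problem. Compared literally, "Ac" and "AC" are different bigrams. Power Query's fuzzy matching options include an "Ignore case" setting, and it is on by default, so this only bites if the option has been cleared or the text is compared some other way. It's still worth confirming the setting, or normalizing case with a transform step before merging, because most people never open the options panel.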
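To make the arithmetic concrete, here is a minimal Python sketch of the simplified model described above: Jaccard similarity over sets of character bigrams. It illustrates the idea only and is not Power Query's actual implementation; the function names `bigrams` and `jaccard` are made up for this example.

```python
def bigrams(text: str) -> set[str]:
    """Return the set of overlapping two-character slices of text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a: str, b: str) -> float:
    """Shared bigrams divided by all distinct bigrams across both strings."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)


# Literal comparison: "Ac" and "AC" count as different bigrams.
raw = jaccard("Acme Corp", "ACME Corporation")

# Same pair after lowercasing both sides.
folded = jaccard("acme corp", "acme corporation")

print(f"case-sensitive: {raw:.2f}")     # 4 shared out of 18 distinct bigrams
print(f"case-folded:    {folded:.2f}")  # 8 shared out of 14 distinct bigrams
```

Under this model, even the case-folded score stays well under an 0.80 threshold, because "Corporation" contributes six bigrams that "Corp" does not have.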

Even after case normalization, the bigram approach fundamentally struggles with company names because:

Problem 1: Short Names Get Bad Scores

Company names are often short. "EY" has one bigram: "EY". "Ernst & Young" has 12 bigrams, and none of them is "EY". Their Jaccard similarity is zero. But they're the same company.

Same thing with "IBM" and "International Business Machines." Or "GE" and "General Electric." Or "JPM" and "JPMorgan Chase."

Any time a company uses an abbreviation or acronym in one list and the full name in another, the similarity score collapses and Power Query misses the match. No amount of threshold adjustment fixes this, because the two strings share almost no character pairs for the algorithm to compare.
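A quick way to see how low these scores go is to run the acronym pairs through a bigram Jaccard function. This is a sketch of the simplified model from earlier in the article, not Power Query's real scoring code, and `bigram_jaccard` is a name invented for the example.

```python
def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard similarity over case-folded character bigrams."""
    def grams(s: str) -> set[str]:
        s = s.casefold()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    x, y = grams(a), grams(b)
    return len(x & y) / len(x | y) if (x | y) else 0.0


pairs = [
    ("EY", "Ernst & Young"),
    ("IBM", "International Business Machines"),
    ("GE", "General Electric"),
    ("JPM", "JPMorgan Chase"),
]

# Score each short form against its full name.
scores = {short: bigram_jaccard(short, full) for short, full in pairs}
for short, score in scores.items():
    print(f"{short:>4}: {score:.2f}")
```

All four pairs land below 0.2 in this model, far under any threshold you would dare use on real data.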

Problem 2: Common Words Dominate the Score

"Johnson Controls International" and "Johnson & Johnson" share the word "Johnson." Power Query sees significant bigram overlap and might match them. They're completely different companies.

"American Express" and "American Airlines" share "American." "First National Bank" and "First Republic Bank" share "First" and "Bank." These are the kinds of false positives that make fuzzy merge results unreliable for company data.

The algorithm doesn't understand that "Johnson" is a common word while "Controls International" is the distinguishing part. It's just counting character pairs.

Problem 3: The Threshold Is a Trap

Power Query's similarity threshold defaults to 0.80 (80%). Most people either leave it at default or try random values until something looks right.

The problem: there's no single threshold that works for company names.

You'd need a different threshold for almost every pair. Raising it reduces false positives but misses more real matches. Lowering it catches more matches but introduces junk. There's no sweet spot.
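The trade-off is easy to reproduce with a handful of labeled pairs. The sketch below reuses the simplified bigram Jaccard model (an approximation, not Power Query's implementation) on six hand-picked pairs from this article and counts errors at a few thresholds. The pair labels and the name `bigram_jaccard` are assumptions made for illustration.

```python
def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard similarity over case-folded character bigrams."""
    def grams(s: str) -> set[str]:
        s = s.casefold()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    x, y = grams(a), grams(b)
    return len(x & y) / len(x | y) if (x | y) else 0.0


# (left name, right name, are they really the same company?)
labeled = [
    ("EY", "Ernst & Young", True),
    ("GE", "General Electric", True),
    ("Acme Corp", "ACME Corporation", True),
    ("American Express", "American Airlines", False),
    ("First National Bank", "First Republic Bank", False),
    ("Johnson Controls International", "Johnson & Johnson", False),
]
scored = [(bigram_jaccard(a, b), same) for a, b, same in labeled]

for threshold in (0.2, 0.4, 0.6, 0.8):
    # A false positive is a different-company pair that clears the threshold.
    false_pos = sum(1 for s, same in scored if s >= threshold and not same)
    # A miss is a same-company pair that falls below it.
    missed = sum(1 for s, same in scored if s < threshold and same)
    print(f"threshold {threshold:.1f}: {false_pos} false positives, {missed} missed")
```

On this small sample, some wrong pairs score higher than some right pairs, so no cutoff can bring both counts to zero at once.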

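Problem 4: Alternatives Are Hard to Review

Left at its defaults, Power Query's fuzzy merge returns every row that clears the threshold. If the "Maximum number of matches" option is set to 1, you get one match per row, the one with the highest similarity score, and if the correct match is the second-highest you'll never see it. Without that cap you do get the alternatives, but they arrive as extra expanded rows with no ranking unless you also add a similarity score column (the SimilarityColumnName option in the underlying Table.FuzzyNestedJoin call).

This is a long way from a proper deduplication workflow, where you want to see all potential matches above a threshold, side by side with their scores, and then decide which ones are correct.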

When Fuzzy Merge Actually Works

To be fair, Power Query fuzzy merge is fine for typo correction and minor formatting differences.

These are cases where the strings are very similar — just a character or two off. The bigram approach handles these well.

It's specifically company name matching across different data sources where it falls apart, because company names have abbreviations, acronyms, word reordering, and missing words that go beyond simple typos.

What to Try Before Giving Up on Power Query

If you want to stay in Power Query, you can improve results with preprocessing:

Normalize Before Matching

Add a custom column in Power Query that cleans the company name:

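// Uppercase first so the replacements also catch "inc.", "Inc." and "INC."
Text.Trim(
  Text.Replace(
    Text.Replace(
      Text.Replace(Text.Upper([CompanyName]), "INC.", ""),
    "CORP.", ""),
  "LLC", ""),
  // Trim spaces and the comma left behind by names like "Acme, Inc."
  {" ", ","}
)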

This strips common suffixes and normalizes case. It won't fix abbreviation problems, but it helps with formatting differences.

Use a Transformation Table

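Create a lookup table mapping common abbreviations to full names: "IBM" → "International Business Machines", "GE" → "General Electric", etc. Power Query expects this table to have two columns named From and To. Select it under "Transformation table" in the fuzzy matching options so the substitutions are applied before the fuzzy merge.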

This works if you know all the abbreviations in advance. For large, messy datasets, you usually don't.
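The effect of a lookup table is easy to show outside Power Query. The Python sketch below expands known abbreviations before scoring, again using the simplified bigram Jaccard model. `ALIASES`, `expand`, and `bigram_jaccard` are hypothetical names, and the sketch only replaces whole values, which is a simplification of what a transformation table does.

```python
# Hypothetical From -> To lookup, keyed by the uppercased abbreviation.
ALIASES = {
    "IBM": "International Business Machines",
    "GE": "General Electric",
}


def expand(name: str) -> str:
    """Swap a known abbreviation for its full name; leave other names alone."""
    cleaned = name.strip()
    return ALIASES.get(cleaned.upper(), cleaned)


def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard similarity over case-folded character bigrams."""
    def grams(s: str) -> set[str]:
        s = s.casefold()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    x, y = grams(a), grams(b)
    return len(x & y) / len(x | y) if (x | y) else 0.0


full_name = "International Business Machines"
before = bigram_jaccard("IBM", full_name)
after = bigram_jaccard(expand("ibm "), full_name)
print(f"before lookup: {before:.2f}, after lookup: {after:.2f}")
```

The score jumps from nothing to a perfect match, but only for abbreviations that are already in the table.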

Lower the Threshold, Then Filter

Set the threshold to 0.50 instead of 0.80. This gives you more matches, including wrong ones. Then manually review every match below 0.70 and delete the false positives.

This is essentially doing the deduplication manually with Power Query generating suggestions. It works for small datasets (under 500 rows) but gets painful fast.

What Actually Works for Company Name Matching

The fundamental problem with Power Query's approach is that it treats company names as character strings. But company names have structure — a name part, a suffix part, sometimes an abbreviation that maps to the full name.

Good fuzzy matching tools for company names typically:

  1. Normalize — strip suffixes, standardize case, remove punctuation
  2. Tokenize — compare individual words, not character pairs
  3. Weight tokens — "International" is less important than "Boeing"
  4. Use multiple algorithms — combine edit distance, token matching, and phonetic similarity
  5. Present results for review — show all potential matches with scores, not just the best one
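Steps 1 to 3 can be sketched in a few lines of Python. This is one possible illustration under stated assumptions, not how any particular tool works: the suffix list is a tiny hand-picked sample, token weights use a smoothed inverse document frequency, and the names `SUFFIXES`, `tokens`, `idf_weights`, and `weighted_overlap` are invented for the example.

```python
import math
import re
from collections import Counter

# Illustrative suffix list only; a real one would be much longer.
SUFFIXES = {"inc", "corp", "corporation", "llc", "ltd", "co", "company"}


def tokens(name: str) -> set[str]:
    """Normalize and tokenize: lowercase, drop punctuation, remove suffix words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return {w for w in words if w not in SUFFIXES}


def idf_weights(names: list[str]) -> dict[str, float]:
    """Weight tokens so that words appearing in many names count for less."""
    doc_freq = Counter(t for n in names for t in tokens(n))
    total = len(names)
    return {t: math.log((1 + total) / (1 + c)) + 1.0 for t, c in doc_freq.items()}


def weighted_overlap(a: str, b: str, weights: dict[str, float]) -> float:
    """Weighted Jaccard over tokens: shared weight divided by total weight."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    if not union:
        return 0.0
    shared = sum(weights.get(t, 1.0) for t in ta & tb)
    return shared / sum(weights.get(t, 1.0) for t in union)


names = [
    "Boeing International Inc.",
    "Boeing Co.",
    "Airbus International",
    "Delta International LLC",
    "Apex International Corp.",
]
weights = idf_weights(names)

same = weighted_overlap("Boeing International Inc.", "Boeing Co.", weights)
different = weighted_overlap("Boeing International Inc.", "Airbus International", weights)
print(f"same company: {same:.2f}, different company: {different:.2f}")
```

Because "international" shows up in four of the five names, it carries less weight than "boeing", so the pair that shares the distinctive word scores higher than the pair that shares the generic one. Steps 4 and 5 (combining algorithms and presenting candidates for review) are left out of this sketch.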

The technical algorithm details are covered in fuzzy matching explained for anyone who wants the deeper dive.

For a practical solution, export your two lists to CSV and run them through a dedicated matching tool. DedupFuzzy handles the company name problem specifically — it catches abbreviations, word reordering, and suffix variations that Power Query misses. Upload, select the company name column, review matches, download results. The whole process takes a few minutes.

If you code, Python's rapidfuzz library with its token_sort_ratio scorer is also solid. The approach is described in how to remove duplicate company names from CSV.
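The idea behind a token-sort scorer is simple enough to sketch with the standard library: split each name into words, sort them, rejoin, then compare the results. The version below uses difflib's SequenceMatcher as a stand-in so it runs without installing anything. It is not rapidfuzz's implementation and its scores will differ from `rapidfuzz.fuzz.token_sort_ratio`; the helper names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def token_sort_key(name: str) -> str:
    """Lowercase, drop punctuation, and put the words in alphabetical order."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(words))


def token_sort_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] that ignores word order and punctuation."""
    return SequenceMatcher(None, token_sort_key(a), token_sort_key(b)).ratio()


reordered = token_sort_similarity("Bank of America", "America, Bank of")
unrelated = token_sort_similarity("Bank of America", "Delta Air Lines")
print(f"reordered: {reordered:.2f}, unrelated: {unrelated:.2f}")
```

Word reordering stops mattering because both names reduce to the same sorted string before any character-level comparison happens.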

Should Microsoft Improve Fuzzy Merge?

Honestly, probably not. Power Query is a general-purpose data transformation tool, and its fuzzy merge handles the general case (typo correction, minor formatting) adequately.

Company name matching is a specialized problem. It needs domain-specific logic — awareness of business suffixes, abbreviation patterns, common name variations — that doesn't belong in a general-purpose tool.

The right workflow is to use specialized tools for the specialized part (matching), and Power Query for everything else (transforming, reshaping, loading). Don't fight the tool.

Frustrated with Power Query's fuzzy merge on company names? Upload your CSV and get accurate fuzzy matches in about 60 seconds — including abbreviations and word variations that Power Query misses. Free for 500 rows, no signup needed.
