Here are 15 Common Data Fallacies to Avoid

In today’s tech-driven economy, data is essential for gaining new insights, making decisions, and building products.

In fact, there is so much data that its quantity is doubling every two years, and by 2025 an estimated 175,000 exabytes of data will exist.

This is an unprecedented figure, and it’s hard to put into perspective. To give you some sense, a single exabyte is equal to 1,000,000,000 GB of data, and five exabytes has been said to be roughly equal to “all of the words ever spoken by mankind”.

Common Fallacies With Data

As you can imagine, digging through all of this data can be quite the challenge.

Data comes in many different forms, and not all of them are easy to analyze. As a result, it is tempting to take shortcuts with data, or to try to fit the data we receive into our preconceived notions of how things ought to be.

Today’s infographic comes to us from Geckoboard and it shows the common mistakes that people make in analyzing data. We’ve reformatted their PDF to fit here.

15 Common Data Fallacies

How do we avoid painting a bullseye around the arrow, so that we can interpret the meaning of data in a logical, consistent, and methodical way?

The key is to understand common mistakes that people make with data, and why these errors skew our interpretations.

Examples of Fallacies

Here are four examples of fallacies, and why each is considered a faux pas by data scientists.

1. Survivorship Bias

When we analyze the qualities it takes to be a successful entrepreneur, we typically look at the existing population of established entrepreneurs for clues. However, by limiting our sample to this “surviving” group, we run the risk of survivorship bias.

There are lessons to learn from all of the entrepreneurs who have failed; they are just much harder to find. Integrating that data into the story gives a much fuller picture.
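To make the effect concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the two founder "styles", the spread of their outcomes, and the survival bar are illustrative assumptions, not figures from the infographic. Both groups are given the same expected outcome, yet the survivors-only view makes one group look far better.

```python
import random
from statistics import mean


def simulate_founders(n_per_group=5000, seed=42):
    """Return (style, outcome) pairs for hypothetical founders."""
    rng = random.Random(seed)
    founders = []
    # Assumption: both styles have the same expected outcome (0.0);
    # "bold" founders simply have more variable results.
    for style, spread in (("bold", 3.0), ("cautious", 1.0)):
        for _ in range(n_per_group):
            founders.append((style, rng.gauss(0.0, spread)))
    return founders


def average_outcome(founders, style, survivors_only, survival_bar=1.0):
    """Mean outcome for one style, optionally counting only survivors."""
    outcomes = [
        outcome
        for s, outcome in founders
        if s == style and (not survivors_only or outcome > survival_bar)
    ]
    return mean(outcomes)


founders = simulate_founders()
for style in ("bold", "cautious"):
    print(
        style,
        "| everyone:", round(average_outcome(founders, style, False), 2),
        "| survivors only:", round(average_outcome(founders, style, True), 2),
    )
```

Looking only at survivors, the bold group appears to do much better, even though across everyone the two groups average out the same. The failures carry the missing half of the story.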

2. False Causality

Did you know that there is a 95% correlation between the marriage rate in Kentucky and the number of people who drown each year after falling out of fishing boats?

Kentucky marriages vs. people who drown

Does this mean that there is some sort of relationship between the two variables?

A high level of correlation can arise simply by chance, and inferring causality from it is one of the most basic statistical mistakes in the book.
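A quick way to see how easily chance produces strong correlations is to generate many unrelated series and search for the best-matching pair. The sketch below is an illustration under stated assumptions, not the method behind the Kentucky chart: it uses made-up random walks of 10 points each (standing in for short yearly series), and the function names are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def random_walk(rng, length):
    """A series that drifts randomly; each one is independent of the others."""
    level, walk = 0.0, []
    for _ in range(length):
        level += rng.gauss(0.0, 1.0)
        walk.append(level)
    return walk


def strongest_chance_correlation(n_series=100, length=10, seed=0):
    """Largest |correlation| found among all pairs of unrelated series."""
    rng = random.Random(seed)
    series = [random_walk(rng, length) for _ in range(n_series)]
    return max(abs(pearson(a, b)) for a, b in combinations(series, 2))


print(strongest_chance_correlation())
```

With 100 series there are 4,950 pairs to compare, so a very high correlation is almost guaranteed to turn up somewhere even though no series influences any other. Searching a large pool of statistics for a match works the same way.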

3. The Gambler’s Fallacy

If the roulette wheel turns up black 26 times in a row, does that mean that it is due to land on red on the next spin?

It’s easy to say that the odds don’t change, but imagine being in the moment. The Gambler’s Fallacy happens with data analysis as well: just because something happens unusually frequently over a period of time doesn’t mean that nature will “even it out”.
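The claim that the odds do not change can be checked with a small simulation. This is a sketch under assumptions: it models a European wheel with 18 red, 18 black, and 1 green pocket (the text does not say which wheel), and it uses streaks of 5 blacks rather than 26 so that enough streaks occur to measure.

```python
import random

POCKETS = 37  # assumed European wheel: 18 red, 18 black, 1 green


def spin(rng):
    """Return the color of one fair spin."""
    pocket = rng.randrange(POCKETS)
    if pocket < 18:
        return "red"
    if pocket < 36:
        return "black"
    return "green"


def red_rates(n_spins=300_000, streak=5, seed=1):
    """Return (overall red rate, red rate right after `streak` blacks in a row)."""
    rng = random.Random(seed)
    blacks_in_a_row = 0
    reds = reds_after_streak = spins_after_streak = 0
    for _ in range(n_spins):
        color = spin(rng)
        # Was this spin preceded by at least `streak` consecutive blacks?
        if blacks_in_a_row >= streak:
            spins_after_streak += 1
            reds_after_streak += color == "red"
        reds += color == "red"
        blacks_in_a_row = blacks_in_a_row + 1 if color == "black" else 0
    return reds / n_spins, reds_after_streak / spins_after_streak


overall, after_streak = red_rates()
print(f"red overall: {overall:.3f}, red after a black streak: {after_streak:.3f}")
```

Both rates should land close to 18/37, because each spin is independent of the ones before it. A long run of black does not make red any more likely.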

4. The Cobra Effect

Data can be used to measure progress in achieving business goals, but what if there is an incentive to game these goals?

Wells Fargo, in an effort to upsell existing clients, introduced an incentive called “eight is great”. In short, their employees were encouraged to sell eight accounts per customer, which could take the form of credit cards, savings accounts, and other financial services.

In an example of good intentions gone awry, Wells Fargo employees began breaking the rules to meet their targets. Millions of unauthorized credit card and deposit accounts were opened based on this perverse incentive, and the bank was eventually ordered to pay a $142 million settlement.

The Top 25 Nationalities of U.S. Immigrants

Mexico is the largest source of immigrants to the U.S., with almost 11 million immigrants.

Bar chart showing the top 25 nationalities of US Immigrants.

This was originally posted on our Voronoi app.

The United States is home to more than 46 million immigrants, constituting approximately 14% of its total population.

This graphic displays the top 25 countries of origin for U.S. immigrants, based on 2022 estimates. The data is sourced from the Migration Policy Institute (MPI), which analyzed information from the U.S. Census Bureau’s 2022 American Community Survey.

In this context, “immigrants” refers to individuals residing in the United States who were not U.S. citizens at birth.

Mexico Emerges as a Leading Source of Immigration

Mexico stands out as the largest contributor to U.S. immigration due to its geographical proximity and historical ties.

Various economic factors, including wage disparities and employment opportunities, motivate many Mexicans to seek better prospects north of the border.

| Country | Region | # of Immigrants |
|---|---|---|
| 🇲🇽 Mexico | Latin America & Caribbean | 10,678,502 |
| 🇮🇳 India | Asia | 2,839,618 |
| 🇨🇳 China | Asia | 2,217,894 |
| 🇵🇭 Philippines | Asia | 1,982,333 |
| 🇸🇻 El Salvador | Latin America & Caribbean | 1,407,622 |
| 🇻🇳 Vietnam | Asia | 1,331,192 |
| 🇨🇺 Cuba | Latin America & Caribbean | 1,312,510 |
| 🇩🇴 Dominican Republic | Latin America & Caribbean | 1,279,900 |
| 🇬🇹 Guatemala | Latin America & Caribbean | 1,148,543 |
| 🇰🇷 Korea | Asia | 1,045,100 |
| 🇨🇴 Colombia | Latin America & Caribbean | 928,053 |
| 🇭🇳 Honduras | Latin America & Caribbean | 843,774 |
| 🇨🇦 Canada | Northern America | 821,322 |
| 🇯🇲 Jamaica | Latin America & Caribbean | 804,775 |
| 🇭🇹 Haiti | Latin America & Caribbean | 730,780 |
| 🇬🇧 United Kingdom | Europe | 676,652 |
| 🇻🇪 Venezuela | Latin America & Caribbean | 667,664 |
| 🇧🇷 Brazil | Latin America & Caribbean | 618,525 |
| 🇩🇪 Germany | Europe | 537,484 |
| 🇪🇨 Ecuador | Latin America & Caribbean | 518,287 |
| 🇵🇪 Peru | Latin America & Caribbean | 471,988 |
| 🇳🇬 Nigeria | Africa | 448,405 |
| 🇺🇦 Ukraine | Europe | 427,163 |
| 🇮🇷 Iran | Middle East | 407,283 |
| 🇵🇰 Pakistan | Asia | 399,086 |
| Rest of World | | 11,637,634 |
| Total | | 46,182,089 |

Mexicans are followed in this ranking by Indians, Chinese, and Filipinos, though most immigrants on this list come from countries in the Latin American and Caribbean region.
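The regional claim can be checked directly against the table. The sketch below copies the 25 country rows and the "Rest of World" and "Total" figures from the table above; the variable and function names are illustrative, and the regional shares are computed over the top 25 only, since the table gives no regional breakdown for the rest of the world.

```python
from collections import defaultdict

# (country, region, immigrants), copied from the table above
TOP_25 = [
    ("Mexico", "Latin America & Caribbean", 10_678_502),
    ("India", "Asia", 2_839_618),
    ("China", "Asia", 2_217_894),
    ("Philippines", "Asia", 1_982_333),
    ("El Salvador", "Latin America & Caribbean", 1_407_622),
    ("Vietnam", "Asia", 1_331_192),
    ("Cuba", "Latin America & Caribbean", 1_312_510),
    ("Dominican Republic", "Latin America & Caribbean", 1_279_900),
    ("Guatemala", "Latin America & Caribbean", 1_148_543),
    ("Korea", "Asia", 1_045_100),
    ("Colombia", "Latin America & Caribbean", 928_053),
    ("Honduras", "Latin America & Caribbean", 843_774),
    ("Canada", "Northern America", 821_322),
    ("Jamaica", "Latin America & Caribbean", 804_775),
    ("Haiti", "Latin America & Caribbean", 730_780),
    ("United Kingdom", "Europe", 676_652),
    ("Venezuela", "Latin America & Caribbean", 667_664),
    ("Brazil", "Latin America & Caribbean", 618_525),
    ("Germany", "Europe", 537_484),
    ("Ecuador", "Latin America & Caribbean", 518_287),
    ("Peru", "Latin America & Caribbean", 471_988),
    ("Nigeria", "Africa", 448_405),
    ("Ukraine", "Europe", 427_163),
    ("Iran", "Middle East", 407_283),
    ("Pakistan", "Asia", 399_086),
]
REST_OF_WORLD = 11_637_634
REPORTED_TOTAL = 46_182_089


def region_totals(rows):
    """Sum immigrant counts per region."""
    totals = defaultdict(int)
    for _country, region, count in rows:
        totals[region] += count
    return dict(totals)


totals = region_totals(TOP_25)
top_25_total = sum(totals.values())
for region, count in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{region}: {count:,} ({count / top_25_total:.0%} of the top 25)")
```

Adding the top-25 sum to the "Rest of World" row reproduces the reported total, which is a useful sanity check that the rows were transcribed correctly.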

On the other hand, only three European countries are among the top sources of U.S. immigrants: the UK, Germany, and Ukraine.

Immigration continues to be a significant factor contributing to the overall growth of the U.S. population. Overall population growth has decelerated over the past decade primarily due to declining birth rates.

Between 2021 and 2022, the increase in the immigrant population accounted for 65% of the total population growth in the U.S., representing 912,000 individuals out of nearly 1.4 million.
