Effective data storage and management make data easy to use and create a living system that helps find and correct errors, building trust and reliability in the insights it unlocks.
With how often we see data in headlines and on social media, it can be easy to assume that decisions that impact communities are always informed and guided by the best information possible. Analysts and experts can get the data they need. Officials and public servants have the tools needed to make sense of it. Changemakers can communicate their impact.
Ideal...but not always the case. Data portals and government websites are often hard to navigate, making data exploration more of a chore than, well...exploring, and you're often only seeing a piece of the picture. Even once you get the data you need, that is only the beginning of the work. There are more questions to be asked: How was this collected or calculated? What is (or is not) represented in this data? Are these data values accurate?
To answer these questions, you need to look toward the ongoing maintenance of your data system.
Data validation and ongoing maintenance are necessary to create and sustain confidence in your data and the insights you find. A system to identify and correct flaws is essential to healthy data.
Though data can get complex, we believe it's important to continue making data easier to find, work with, and share. But how do you effectively and efficiently trust all the data you need when you don't know where to start, or just don't have the time?
From our experience and our collaborations with partners, we know a comprehensive data environment is foundational for building confidence in data. When gauging trust in data, ask yourself:
Is all of this really important? Data mistakes don’t just provide incorrect information. At their worst, data errors cause people to lose trust in all the information presented. No person or dataset is immune to flaws, and even the most trustworthy sources should be checked and validated.
Earlier this year, we ran into this issue with values from the United States Department of Agriculture (USDA).
Whether human or machine, mistakes and errors will happen--but they can be planned for.
Data helps answer questions such as: How do food choices and availability impact my community's health and diet? To provide information on community food access, the USDA's Economic Research Service (ERS) publishes the Food Environment Atlas. Data from the Atlas is included in mySidewalk's Data Library, and our internal data process detected flaws in how some data values were calculated.
In a nutshell, important information straight from the source was wrong. To understand how we fixed it, we turn to the experts that manage mySidewalk's Data Library.
What made the data wrong?
How did you fix it?
We were able to make these corrections because the mySidewalk data system is built around geography and maintains metadata about every value. That made it easy for the team to pull the needed values for over 3,000 geographies to serve as the correct denominators, fix over 70,000 precalculated percentages, and update all existing data visualizations with the corrected values.
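To illustrate the kind of correction described above, here is a minimal sketch of recalculating precalculated percentages against a trusted denominator for each geography. It is not mySidewalk's actual pipeline: the `Record` fields, the `recalculate` function, the tolerance, and the sample values are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # All field names here are hypothetical, chosen for illustration only.
    geo_id: str           # identifier for a geography (e.g. a county)
    count: float          # numerator behind the published percentage
    published_pct: float  # percentage as precalculated by the source


def recalculate(records, denominators, tolerance=0.1):
    """Recompute each percentage from its count and a trusted denominator.

    Returns (geo_id, published_pct, corrected_pct) for every record whose
    published value differs from the recomputed value by more than
    `tolerance` percentage points. Records with a missing or zero
    denominator are skipped rather than guessed at.
    """
    corrections = []
    for record in records:
        denominator = denominators.get(record.geo_id)
        if not denominator:
            continue
        corrected = round(100.0 * record.count / denominator, 1)
        if abs(corrected - record.published_pct) > tolerance:
            corrections.append((record.geo_id, record.published_pct, corrected))
    return corrections


# Made-up sample values: geo-002 was published with the wrong denominator.
records = [
    Record("geo-001", count=250, published_pct=25.0),
    Record("geo-002", count=300, published_pct=10.0),
]
denominators = {"geo-001": 1000, "geo-002": 2000}

corrections = recalculate(records, denominators)
print(corrections)  # [('geo-002', 10.0, 15.0)]
```

Keeping the numerator and the geography identifier alongside every published value is what makes this kind of check possible: without them, a wrong percentage cannot be traced back or recomputed.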
An organized, clean, and accessible data system is a must-have for making data-driven decisions with accuracy. Data isn't magic, and there's a lot under the hood, but you don't have to be an advanced data scientist to get it right. Thanks to our data process, we were able to correct the issue and update over 70,000 values, preserving data stories in our partners' communities.
Of course, data alone isn't a path to success, in process or in real-world impact. Even good data isn't helpful to anyone if nobody knows it's there.
Because the USDA data values were corrected quickly, real-world stories of food landscapes and access can be told with accuracy and stay focused on building impact.
It is the work done by organizations like Weber-Morgan Health Department, Maricopa County Department of Public Health, and Vitalyst Health Foundation that uses data to push forward change. In this instance, that means using information from the USDA to share visualizations and narratives about community food conditions with the public. By publishing with mySidewalk, they can rest assured that the information displayed is clear for their audience and that the data is extensively and continuously reviewed.
Whether it's validating and updating data or helping craft a communications strategy, we're honored to play a part in improving communities across the country.
Building and maintaining a data system requires time and expertise, but accessing data shouldn't. We launched Seek last month as the next step in our mission to democratize data. Seek allows you to browse, select, and download data from the mySidewalk Data Library, which is validated and maintained with the principles above and powers the data stories used by changemakers across the country.
Julian is Product Marketing Manager at mySidewalk, working across teams to connect mySidewalk's data tools with the challenges of today's changemakers. Inspired by using technology to help people, Julian draws on experience in strategic communications, digital marketing, and creative design to help grow the impact of democratizing data. Julian holds a Bachelor's in Business Information Systems from KU.
We're a technology company that builds data tools for people who aren't data scientists. Data in the right hands can change the world, so we're on a mission to democratize data.