Data Sandbox

A Glossary of Data Terms

  • 1.  A Glossary of Data Terms

    Pitney Bowes
    Posted 04-22-2019 10:11

    For the last couple of years, I have been looking for ways to build, maintain, and share a glossary of terms for data-related professionals. With the launch of the Data community, we have an opportunity to use the power of this collective working group to build and improve a glossary of terms that we can all use as we strive to get our exact points across in documents, presentations, proposals, and other forms of communication. The goal is a source of definitions that are free to use and that carry the certainty of having been commented on and edited by this community, drawing on the energy, knowledge, and interest of all of us. Using the format of a discussion group, we can all see where there is conflict and agreement on terms and usage. We may also find that some words elicit emotional reactions.

    It should be fun, especially if you are a word and data geek like I am.

    Please let me know your thoughts, and if you have any interest in this project.  In the meantime, here's a draft definition of a first term:

    Internal Data Sources: Data created and/or retrieved from inside a company's or organization's systems as it conducts day-to-day operations. The purpose of generating and storing this information is to inform decisions made in successful operations of that company.

    There are usually four areas which generate and gather internal data: 1) operations (e.g., supply chain and production systems); 2) sales, marketing, and support (revenue-related); 3) finance; and 4) human resources.

    Please offer suggestions on improving this definition. I've written it purely as a starting point.

    As we get a number of terms defined, we'll post a version of the glossary as a document for download.  At that time, I'll start another thread asking for some input on version control and best practices for shared documents. 



    ------------------------------
    Dan Adams
    Pitney Bowes
    White River Junction, VT
    ------------------------------


  • 2.  RE: A Glossary of Data Terms

    Pitney Bowes
    Posted 04-22-2019 14:46
    This is a great idea, Dan. The combination of creating a common business vocabulary and making it available to everyone will be powerful. As this project evolves, I'd be interested in suggestions from the community on how best to categorize or order these terms so that they can be searched and explored over time. I also think it'd be valuable to have a related-terms reference. Using your first definition of internal data sources as the example, it could help users to have a nearby link to, or definition of, external data sources that notes how the two differ, or that points out that both can be referred to as secondary data and then defines what that term means.
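
    The related-terms reference suggested above could be pictured roughly as follows. This is only a minimal sketch: the names GlossaryEntry, Glossary, relate, and related_terms are hypothetical, and the text for external data sources is a placeholder, not a proposed definition.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryEntry:
    """One glossary term, its definition, and the terms it links to."""
    term: str
    definition: str
    related: set[str] = field(default_factory=set)


class Glossary:
    """Hypothetical in-memory glossary with two-way related-term links."""

    def __init__(self) -> None:
        self._entries: dict[str, GlossaryEntry] = {}

    @staticmethod
    def _key(term: str) -> str:
        # Look terms up case-insensitively so "internal data sources" matches.
        return term.strip().lower()

    def add(self, term: str, definition: str) -> GlossaryEntry:
        entry = GlossaryEntry(term=term, definition=definition)
        self._entries[self._key(term)] = entry
        return entry

    def relate(self, term_a: str, term_b: str) -> None:
        # Record the link on both entries so it can be followed either way.
        a = self._entries[self._key(term_a)]
        b = self._entries[self._key(term_b)]
        a.related.add(b.term)
        b.related.add(a.term)

    def related_terms(self, term: str) -> list[str]:
        return sorted(self._entries[self._key(term)].related)


glossary = Glossary()
glossary.add(
    "Internal Data Sources",
    "Data created and/or retrieved from inside a company's or "
    "organization's systems as it conducts day-to-day operations.",
)
# Placeholder text only; the thread has not drafted this definition yet.
glossary.add("External Data Sources", "(definition to be drafted)")
glossary.relate("Internal Data Sources", "External Data Sources")

print(glossary.related_terms("internal data sources"))
```

    Storing the link on both entries means a reader who lands on either term can reach the other, which is the behaviour the post asks for.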

    ------------------------------
    Jessica Willis
    Knowledge Community Shared Account
    Shelton CT
    ------------------------------



  • 3.  RE: A Glossary of Data Terms

    Pitney Bowes
    Posted 04-23-2019 09:40
    Jess, I agree these are useful steps we should take. There is a lot of benefit to linking terms and showing how certain types of data are or aren't related. It also makes me that much more aware of how much work putting a complete glossary together is going to be.

    Please feel welcome to draft and post definitions for terms you want to get feedback on. The more we put our heads together, offer up definitions, and share the work, the sooner we'll have a valuable resource for the entire community.

    I've done a few web searches on these terms, and what I find is that definitions are available, but they are often very focused on a specific technology or vendor (the first result for External Data Sources is an Oracle definition, the second is SFDC). The definitions don't apply broadly, but are aimed at a specific situation, technology, or implementation. That makes it difficult to use the term with a cross-functional audience, such as in a business case or proposal, without first defining it to ensure everyone has the same meaning in mind. As you point out, when terms are related, it is all the more important to have a source of definitions that enforce those relationships.

    One more thing: As I started thinking through a definition of external data sources, I realized we need to define first-, second-, and third-party data as well.

    ------------------------------
    Dan Adams
    Pitney Bowes
    White River Junction VT
    ------------------------------



  • 4.  RE: A Glossary of Data Terms

    Pitney Bowes
    Posted 04-24-2019 11:39
    Dan, there are other types of data that I would categorize as Internal Data Sources. For example, years ago, we collected thousands of GPS control points to measure the spatial accuracy of our digital street network, and the results helped us figure out where to invest in realigning the street network to a more accurate data source. Another example would be customer feedback surveys, where we collect information from customers about their client experience and use the results to understand client pain points and define improvement projects.

    I would consider both the GPS control point database and the data collected from customer feedback surveys as Internal Data Sources, even though the data itself did not originate from internal systems. I would therefore propose a slight modification to the first part of your definition:

    Internal Data Sources: Data created by a company or organization and/or retrieved from inside its business systems as it conducts day-to-day operations.  The purpose of this information is to inform decisions made in successful operations.

    ------------------------------
    Colleen Reed
    Knowledge Community Shared Account
    ------------------------------



  • 5.  RE: A Glossary of Data Terms

    Pitney Bowes
    Posted 04-25-2019 08:46
    Colleen, I think the GPS data was operational data in that it was collected for internal operational purposes, but it certainly isn't generated within "business systems". That's a good change. I'm wondering if it can be made easier to read. What do you think of this:

    Internal Data Sources: Data created by a company or organization to inform decisions made in its successful operations.  This data may be retrieved from inside its business systems as it conducts day-to-day operations, or may be created for a specific need.

    There are usually four areas which create, generate, and gather internal data: 1) operations (e.g., supply chain and production systems); 2) sales, marketing, and support (revenue-related); 3) finance; and 4) human resources.

    Your comment makes me think of the concepts around manufacturing data for the purpose of using that data in an external product or service. Any ideas for how to handle that?

    ------------------------------
    Dan Adams
    Pitney Bowes
    White River Junction VT
    ------------------------------



  • 6.  RE: A Glossary of Data Terms

    Pitney Bowes
    Posted 04-26-2019 08:58
    Dan, yes, I would classify data that are manufactured for the purposes of creating or enhancing products or services as Internal Data Sources, as these data are created and owned by the company that manufactured them. An example would be field collection of transportation attributes used for building or enhancing a navigation data product. With this in mind, I am proposing the following modification to the definition:

    Internal Data Sources: Data created by a company or organization to inform decisions made in its successful operations, or used to create or enhance products and/or services that it offers to the market.  This data may be retrieved from inside its business systems as it conducts day-to-day operations, or may be created for a specific need.

    There are usually four areas which create, generate, and gather internal data: 1) operations (e.g., supply chain and production systems); 2) sales, marketing, and support (revenue-related); 3) finance; and 4) human resources.

    ------------------------------
    Colleen Reed
    PITNEY BOWES SOFTWARE, INC
    Maitland FL
    ------------------------------