Gutman and Goldmeier emphasize the importance of developing critical thinking skills in data-related tasks. They argue that being a "Data Head" is not just about understanding the technicalities of data science; it's also about being able to approach data with a healthy dose of skepticism and the ability to ask the right questions. This part will focus on building a robust and critical mindset to effectively engage with data-driven projects.
The authors describe the "Data Science Industrial Complex" as an industry saturated with promises, buzzwords, and products that can lead to data project failures. Data Heads address this by adopting a critical mindset that questions everything, especially in the face of authority, marketing hype, or overly optimistic expectations.
Gutman and Goldmeier advise asking five fundamental questions before starting any data initiative. Doing so ensures the project addresses a meaningful business issue and helps avoid the common pitfalls of the "Data Science Industrial Complex."
1. What makes this problem significant? Understand the rationale for the initiative. What company need does it address? Who are the stakeholders, and what value will they receive from the project's outcome? Challenge projects that seem to be focused on specific methodologies or deliverables rather than solving an essential issue for the business.
2. Whom does this issue impact? Identify who will use the data project's output, both internal and external to the company. How will the project affect their workflow? What decisions will be made from the outcomes? Answering this question can help assess the possible impact of the project and identify stakeholders who should be involved.
3. What happens if we lack the correct data? Acknowledge the limitations of existing data and create contingencies for situations where the available data cannot answer the original question. Should we gather new data? Should the project's scope be redefined?
4. At what point is the project finished? Avoid endless endeavors with unclear results. Define the outputs and milestones that signify project completion. This brings expectations into alignment, helps teams focus on actionable goals, and allows for better resource allocation.
5. What happens if we dislike the outcomes? Explore situations where the analysis contradicts the stakeholders' assumptions. How might the team react to such outcomes? What other actions could be considered? Thinking through unfavorable outcomes before they happen ensures that the group isn't pressured to produce a specific, predetermined result.
Practical Tips
- Start a "Why Journal" where you record the reasons behind each new habit or change you're trying to implement. Before adopting a new habit, write down the specific need or goal it addresses. This practice can help solidify your commitment and provide clarity when motivation wanes. If you decide to wake up an hour earlier each day, jot down that it's to create a quiet time for learning a new language, directly tying the habit to your goal of becoming bilingual.
- Use a feedback loop for small personal goals by asking friends or colleagues how they benefit from your actions. For example, if you start a fitness routine, ask your workout buddy how your commitment is motivating them. This can help you tailor your efforts to maximize the positive impact on those around you.
- Develop a "Methodology-Free Meeting" protocol for your team where discussions are centered around problems rather than processes. In these meetings, encourage team members to discuss the core issue at hand and propose solutions without defaulting to standard procedures or past strategies. This could lead to innovative approaches that are more directly targeted at the issue.
- Map out a "day in the life" of a user before and after the data project implementation. Spend a day observing or interviewing a user to understand their workflow, then create a visual comparison that highlights the changes brought by the data project. This could involve shadowing a retail manager to see how a new inventory tracking system saves time or reduces errors in ordering stock.
- Use social media polls to gauge the potential impact of your project. Post a brief description of your project on platforms like Twitter, LinkedIn, or Facebook, and ask your network to vote on aspects they find most beneficial or concerning. This can provide quick, diverse feedback and help you understand the broader implications of your project.
- Develop a "What-If" journal to practice contingency planning in everyday life. Each week, choose a different aspect of your life, like your commute, meal planning, or your workout routine. Write down what you currently expect to happen based on the data you have, then brainstorm at least three "what-if" scenarios where things don't go as planned. For each scenario, come up with a practical contingency plan. This exercise will help you become more adaptable and better prepared for unexpected events.
- Set up a "completion criteria" checklist for each project you undertake. Before starting, list out all the conditions that must be met for you to consider the project finished. As you work, tick off each criterion. This method ensures you don't add unnecessary tasks and helps you stay focused on the original objectives.
- Partner with a friend or family member for a bi-weekly "milestone meetup." During these meetups, discuss the milestones you've both set and the progress you've made. This creates a support system that can offer encouragement, share insights, and help maintain a clear focus on actionable goals.
- Create a "Stakeholder Surprise" journal where you document instances when data contradicted expectations, noting stakeholders' initial...
Gutman and Goldmeier emphasize that being a Data Head requires a solid understanding of data, statistics, and probability, whether or not you're the one performing the analysis. They believe these foundational concepts are key to understanding how information is used and misused in the modern business world.
The authors define data as encoded information, distinguishing between the terms. Information consists of knowledge derived from various sources, such as measurements, experiences, or interactions with the environment. Data is a structured representation of this information, typically encoded in numerical or categorical values for analysis and manipulation. They use the example of an organization storing marketing campaign information on a spreadsheet. Each row in the table represents a specific campaign (an observation or record), while the columns represent different features of the campaign, such as the location, advertising spending, or sales volume.
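The row-and-column layout described above can be sketched in a few lines of Python. This is only an illustration: the column names (`campaign_id`, `location`, `ad_spend`, `sales_volume`) and every value are hypothetical, not taken from the book.

```python
# Hypothetical marketing-campaign table. All names and values are invented
# for illustration; they do not come from the book.
# Each dictionary is one row (an observation or record).
# Each key is one column (a feature of the campaign).
campaigns = [
    {"campaign_id": 1, "location": "North", "ad_spend": 12000, "sales_volume": 340},
    {"campaign_id": 2, "location": "South", "ad_spend": 8000, "sales_volume": 210},
    {"campaign_id": 3, "location": "North", "ad_spend": 15000, "sales_volume": 400},
]

# The columns (features) are the keys shared by every row.
columns = list(campaigns[0].keys())

# The number of observations is the number of rows.
n_rows = len(campaigns)

# A numerical column can be summed or averaged.
total_ad_spend = sum(row["ad_spend"] for row in campaigns)

# A categorical column is better summarized by counting distinct values.
locations = sorted({row["location"] for row in campaigns})

print(columns)
print(n_rows, total_ad_spend, locations)
```

Here `ad_spend` and `sales_volume` are numerical values, while `location` is categorical, which is why one is summed and the other is counted.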
Gutman and Goldmeier emphasize that being proficient in data requires knowing data types and how these relate to the...
The authors provide clear and understandable descriptions of different machine learning methods, emphasizing their applications, limitations, and potential pitfalls. They equip data heads with the knowledge to understand and critically evaluate ML models used in the workplace, regardless of their technical background.
According to the authors, in supervised learning, you train algorithms on labeled data, where each observation has a known input and a corresponding known output. The algorithm learns how inputs and outputs relate, enabling it to predict the output for new, unseen inputs.
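As a minimal sketch of learning from labeled data, the example below fits a straight line to a handful of input/output pairs and then predicts the output for an unseen input. The choice of a least squares line, the function names `fit_line` and `predict`, and the numbers are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Predict the output for a new input using the fitted line."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical labeled data: each known input is paired with a known output.
inputs = [1.0, 2.0, 3.0, 4.0]
outputs = [3.0, 5.0, 7.0, 9.0]

model = fit_line(inputs, outputs)   # training on labeled data
prediction = predict(model, 5.0)    # prediction for an unseen input
print(model, prediction)
```

The training step sees both inputs and outputs. The prediction step sees only a new input and relies on the relationship learned during training.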
Regression models are used in supervised learning when the target variable (the value we want to predict) is a continuous numerical value. The book covers least squares linear regression extensively in chapter 9, but the general idea behind linear (and non-linear) regression models is to find an equation whose parameters minimize, on average, the difference between the model's predicted value and the actual value of the target. The authors emphasize that...
Gutman and Goldmeier emphasize that successfully navigating data projects requires understanding and mitigating potential pitfalls beyond the technicalities of fields like data science and machine learning. Here, they delve into the human elements of data projects, why communication matters, and the larger ethical considerations.
As we have learned, bias can manifest in many forms, both in datasets and among the individuals who work with them. This section focuses on the bias associated with bad data, as Data Heads will not always be able to collect new, pristine experimental data to overcome these potential issues.
Gutman and Goldmeier remind us that people who work with data must be vigilant about identifying and addressing biases both in data and in decision-making processes. Frequent biases include:
Survivor bias: Occurs when focusing only on data from successful outcomes, ignoring those that failed or were excluded, leading to an overestimation of success rates and potentially perpetuating flawed strategies.
Confirmation bias: Occurs when analyzing...
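Survivor bias, as defined in the list above, can be illustrated with a small calculation. The sketch below is hypothetical: the eight projects, the `recorded` flag, and the outcomes are invented to show how the success rate is overestimated when failed or excluded cases never reach the dataset.

```python
# Hypothetical project outcomes, invented for illustration.
# "recorded" marks whether the project made it into the dataset being analyzed.
# Here, most failed projects were abandoned and never recorded.
projects = [
    {"succeeded": True, "recorded": True},
    {"succeeded": True, "recorded": True},
    {"succeeded": True, "recorded": True},
    {"succeeded": False, "recorded": True},
    {"succeeded": False, "recorded": False},
    {"succeeded": False, "recorded": False},
    {"succeeded": False, "recorded": False},
    {"succeeded": False, "recorded": False},
]


def success_rate(rows):
    """Share of rows whose project succeeded."""
    return sum(1 for r in rows if r["succeeded"]) / len(rows)


# What an analyst sees when only the recorded projects are available.
observed_rate = success_rate([p for p in projects if p["recorded"]])

# What the rate would be if every project had been included.
true_rate = success_rate(projects)

print(observed_rate, true_rate)
```

The gap between the two rates comes entirely from which projects were left out, not from any change in how the projects performed.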
Becoming a Data Head