Latest bookmarks (page 1 of 11)

1 Jul 2024 www.phdata.io
"To achieve this, organizations should begin by establishing conventions for project structure, model naming, model testing, and documentation. As a result, automating the enforcement of these conventions becomes feasible. Once these patterns are solidified and enforced, similar activities can be applied to template dbt Cloud Projects with Terraform."
1 Jul 2024 emilyriederer.netlify.app
"Using controlled vocabularies for column names is a low-tech, low-friction approach to building a shared understanding of how each field in a data set is intended to work. In this post, I’ll introduce the concept with an example and demonstrate how controlled vocabularies can offer lightweight solutions to rote data validation, discoverability, and wrangling.“
28 Apr 2024 hakibenita.com
"How to fail loudly when you really should"
12 Apr 2024 thecorrespondent.com
Brand keyword advertising, the presentation informed him, was eBay’s most successful advertising method. Somebody googles "eBay" and for a fee, Google places a link to eBay at the top of the search results. Lots of people, apparently, click on this paid link. So many people, according to the consultants, that the auction website earns at least $12.28 for every dollar it spends on brand keyword advertising – a hefty profit!
Tadelis didn’t buy it. "I thought it was fantastic, and I don’t mean extraordinarily good or attractive. I mean imaginative, fanciful, remote from reality." His rationale? People really do click on the paid-link to eBay.com an awful lot. But if that link weren’t there, presumably they would click on the link just below it: the free link to eBay.com. The data consultants were basing their profit calculations on clicks they would be getting anyway.
29 Mar 2024 simonwillison.net
I love this idea of issue-driven development. Everything (no matter how small) gets an issue, and the steps taken in resolving that issue each get a comment (and even better a screenshot) until the issue is closed with a relevant commit.
“What goes in an issue? Background: the reasons for the change. In six months time you’ll want to know why you did this. State of play before-hand: embed existing code, link to existing docs. I like to start my issues with “I’m going to change this code right here”—that way if I come back the next day I don’t have to repeat that little piece of research. Links to things! Documentation, inspiration, clues found on StackOverflow. The idea is to capture all of the loose information floating around that topic. Code snippets illustrating potential designs and false-starts. Decisions. What did you consider? What did you decide? As programmers we make decisions constantly, all day, about everything. That work doesn’t have to be invisible. Writing them down also avoids having to re-litigate them several months later when you’ve forgotten your original reasoning. Screenshots—of everything! Animated screenshots even better. I even take screenshots of things like the AWS console to remind me what I did there. When you close it: a link to the updated documentation and demo”
29 Mar 2024 dubroy.com
“Don’t get me wrong, I love having a whole afternoon to work on something without interruption. It’s probably ideal for most people. But it’s an exaggeration to say that it’s impossible to program well in units of an hour.”
29 Mar 2024 mollyg.substack.com
“I dare you to fail”
For people to take J Curve risks and be willing to jump off the cliff with you, they need to know that you’re gonna catch them if they fall. It’s important to set them up mentally to understand that their main job is not to be perfect but to learn as fast as they can. And that comes with mistakes. If they fail, they need to know that they still have a job and you’ll still think well of them. Because many high performers are used to being perfect or exceptional at their jobs, I often use the phrase “I dare you to…” followed by “fail” or “make a mistake” or “prove me wrong” to help them see that taking risks, struggling, making mistakes, etc., is an expected part of this journey.
28 Jan 2024 simonwillison.net
"I prepared a lightning talk about Git scraping for the NICAR 2021 data journalism conference. In the talk I explain the idea of running scheduled scrapers in GitHub Actions, show …"
28 Jan 2024 octo.github.com
"GitHub Next Project: Flat explores how to make it easy to work with data in git and GitHub, offering a simple pattern for bringing working datasets into your repositories and versioning them."
28 Jan 2024 www.getdbt.com
"If you’re a hotshot junior analyst and you want a guide for breaking through this skills plateau, this guide is for you."