What’s wrong with the trash can below?
This was on a street in the historical center of Hannover, which is popular among tourists who come to see the sights and have a coffee in the leafy squares bounded by old buildings dating back hundreds of years. Isn’t it good to have a place to get rid of your banana peel or sandwich wrapper?
Inspired by a recent xkcd comic that shows how search terms trend over time, I decided to play around with Google Trends to see if I could find any interesting patterns.
“Random Obsessions” from xkcd
Here’s an attempt at classifying the different patterns that I observed.
Earlier this week I was flying through Munich airport and the plane took the scenic route over Munich and its surroundings. The sun was out and the rapeseed fields were in bloom, making patches of bright and pretty yellow all over the landscape. I saw this from the window and wondered what it could be – was it the world’s longest swimming pool?
A quick search on Google Maps after I got home gave me the answer: it’s the Regattastrecke Oberschleißheim, an artificial rowing course built for the 1972 Olympics. I really enjoyed watching the contrasting colors in the landscape from the air, especially the different blues and greens of the various water bodies, and the snowy mountain ranges that we flew over before getting to Munich. The patterns in the formal gardens at the Nymphenburg Palace could also be seen, but I wasn’t fast enough with my camera.
If only the weather were always so nice when I am in the air!
A nice blog post from Rafa Irizarry on why Interactive Data Analysis (IDA) is important, as opposed to mindlessly applying workflows.
Some points I agree with:
- IDA is necessary to discover outliers, to get a “feel” for the data, and to check if applied analyses are appropriate
- “Data generators” who produce the raw data are usually not trained data analysts
Some reservations I have about the post:
- I think that knocking mindlessly applied workflows is a bit of a crowd-pleasing, “preaching to the choir” statement. If you ask people directly, no one would sign on to the statement “We should use workflows without thinking about whether they are appropriate” (even if in practice that is what many of us are doing, myself included)
- Standardized workflows are useful for reproducibility. Outliers that screw up data analyses are like bugs in computer code. And as anybody who’s tried to get IT help knows, one of the first things we’re asked to do is to reproduce the bug.
What I especially like is his call for IDA to be a bigger part of existing workflows. That is to say, when designing a data analysis pipeline, one should think about how to incorporate diagnostic checks and interactive analysis steps along the way, as a sort of heuristic debugging process. My hunch is that most people already do this, but the challenge is to formalize it as part of the process. That’s definitely something I’ll think about as I go about analyzing my own data.
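As a rough illustration of what a built-in diagnostic check could look like, here is a minimal Python sketch. It is not taken from Rafa's post: the function names `flag_outliers` and `run_pipeline` are hypothetical, and the outlier rule (a modified z-score based on the median absolute deviation, with the commonly used cutoff of 3.5) is just one assumption about what a sensible check might be. The idea is that every step of the pipeline leaves behind a small report that tells you where to stop and look at the data interactively.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values with a modified z-score above threshold.

    Assumption: the modified z-score (0.6745 * |x - median| / MAD) with a
    cutoff of 3.5 is used here only as an example diagnostic.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median, so this rule cannot flag anything.
        return []
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


def run_pipeline(values, steps, check=flag_outliers):
    """Apply each step to every value and run the check after each step.

    Returns the transformed values plus a report of (step name, flagged
    indices), which is the cue for a closer, interactive look.
    """
    report = []
    for step in steps:
        values = [step(v) for v in values]
        report.append((step.__name__, check(values)))
    return values, report


def double(x):
    # Hypothetical stand-in for a real transformation step.
    return 2 * x


measurements = [1.0, 1.1, 0.9, 1.05, 25.0]
result, report = run_pipeline(measurements, [double])
for step_name, flagged in report:
    print(step_name, "flagged indices:", flagged)
```

Because the check runs on the same input every time, a flagged value can be reproduced later, which is where standardized workflows and interactive analysis complement each other.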
The necessity of IDA also explains why there’s no such thing as taking “a quick look” at the data to see if there’s something interesting there (also sometimes overheard: “just run it through your pipeline”). I work mostly with genomic data, and most of my time is spent on interacting with the data, determining if a particular question is even appropriate to ask for a particular set of data. “Quick and dirty” is usually more dirty than quick, when all is said and done….
I happened upon an interesting phrase in a story, “Signal” by John Lanchester, from the New Yorker (3 Apr 2017):
“Michael wasn’t my oldest friend and he wasn’t my closest friend, but he was older than any of the ones who were closer and closer than any of the ones who were older, ….”
This is an odd way to describe a friendship, but it is precise. However, the more I thought about it, the more dissatisfied I was.
… the joy of optimization combined with the regret of past inefficiencies (joygret?)
A familiar feeling as I am re-analyzing and organizing old data.
Late last year, my colleague Silke W and I went to Denmark for a short field trip to collect ciliates, where we were hosted by Lasse Riemann of the University of Copenhagen. The site where we collected our material was Nivå Bay, which is famous among environmental microbiologists for the several decades of studies there on sulfur-cycling by microorganisms.
Nivå Bay (above, view from birdwatching tower on a sunny day) is a shallow, sheltered bay where the water is only knee- to waist-height at low tide. Scattered between the tufts of seaweed and seagrass were some off-white, slimy films on the surface of the sediment. These are actually bacterial “veils”, which are sheets of mucus produced by bacteria that embed themselves in them. Like a veil made of lace, each sheet is punctuated by many holes. Unlike a wedding veil, these veils are not meant to hide anything. Instead, you can think of them as a sort of natural-born environmental engineering – the holes allow water to flow through, and the bacteria actively circulate water by beating their flagella. By working together in these colonies, the bacteria can set up a continuous flow of water through the veil. This flow mixes sulfide-rich water coming from below with oxygenated water from above, bringing together the chemicals that they use to generate energy.
There are different species of bacteria that have such behavior. One of them has the wonderful name Thioturbo danicus – the sulfur whirl of Denmark. It has flagella on both poles of its rod-shaped cells. In this video you can see what happens when a single cell is detached from the mucus veil – it ends up tumbling like a propeller, which probably was the inspiration for its name!
Here is a somewhat degraded veil that had been sitting around in a Petri dish for too long. Taken from its natural environment, it soon becomes overgrown with grazing protists and small animals that methodically eat up the bacteria:
You can read more about the veil-forming bacteria in these publications by the microbiologists at Helsingør: Thar & Kühl 2002, Muyzer et al. 2005.