The South China Morning Post had a feature this past October on Werner Burger, an expert on old Chinese cash, and his collection of seven tons (!) of coins stored in a warehouse in Hong Kong. Burger recently published a door-stopper of a book on cash in the Qing dynasty, following up on his first book from 1975.
Chinese coin from the Qianlong era of the Qing dynasty. (By Murberget Länsmuseet Västernorrland [Public domain], via Wikimedia Commons)
My father had a copy of the latter at home, and I remember thinking that it was merely a collector’s guide. I had no idea that this work was based on (just a part of) his collection of 2 million coins. The story of how Burger came into possession of these coins is also fascinating – a friend of his in Hong Kong was importing old Chinese cash from Indonesia to use as scrap metal, and let him pick out some for himself.
Since his 1975 book, which covered the Qing dynasty up to the beginning of the Qianlong emperor’s reign, new archival material from the imperial mint has been discovered. Burger used his collection and the archival material to reconstruct the fiscal history of the era, in order to analyze why Chinese currency became so devalued during the Qing.
What a wonderful story of serendipity and sheer persistence! Cash is such a fascinating thing – at once a physical artefact and an abstract idea, a ritual item (in the anthropological sense) that most people handle every day, somewhat mystical and yet mundane.
My latest paper has just been published in Proceedings of the Royal Society B! My colleagues and I describe how a partnership between a group of ciliates (a type of single-celled organism) called Kentrophoros and their bacterial symbionts had a single evolutionary origin. This is despite the fact that different species of Kentrophoros can look very different from each other and are found all over the world. The bacteria are also a lineage that is new to science, and that as far as we know is only associated with these ciliates. This means that after the first Kentrophoros and its bacterial partner got together tens or hundreds of millions of years ago, their descendants have diversified into different species and spread themselves throughout the globe, all the while remaining true to each other.
Kentrophoros sp. from the Mediterranean island of Elba. This ciliate carries a few hundred thousand bacterial symbionts (whitish mass) and is almost 2 mm long despite being a single cell.
Still have questions? Read more below…
What’s wrong with the trash can below?
This was on a street in the historical center of Hannover, which is popular among tourists who come to see the sights and have a coffee in the leafy squares bounded by old buildings dating back hundreds of years. Isn’t it good to have a place to get rid of your banana peel or sandwich wrapper?
Inspired by a recent xkcd comic that shows how search terms trend over time, I decided to play around with Google Trends to see if I could find any interesting patterns.
“Random Obsessions” from xkcd
Here’s an attempt at classifying the different patterns that I observed.
Earlier this week I was flying through Munich airport and the plane took the scenic route over Munich and its surroundings. The sun was out and the rapeseed fields were in bloom, making patches of bright and pretty yellow all over the landscape. I saw this from the window and wondered what it could be – was it the world’s longest swimming pool?
A quick search on Google Maps after I got home gave me the answer: it’s the Regattastrecke Oberschleißheim, an artificial rowing course built for the 1972 Olympics. I really enjoyed watching the contrasting colors in the landscape from the air, especially the different blues and greens of the various water bodies, and the snowy mountain ranges that we flew over before getting to Munich. The patterns in the formal gardens at the Nymphenburg Palace could also be seen, but I wasn’t fast enough with my camera.
If only the weather were always so nice when I am in the air!
Nice blog post from Rafa Irizarry on why Interactive Data Analysis (IDA) is important, as opposed to mindlessly applying workflows.
Some points I agree with:
- IDA is necessary to discover outliers, to get a “feel” for the data, to check if applied analyses are appropriate
- “Data generators” who produce the raw data are usually not trained data analysts
Some reservations I have about the post:
- I think that knocking mindlessly applied workflows is a bit of a crowd-pleasing, “preaching to the choir” statement. If you ask people directly, no one would sign on to the statement “We should use workflows without thinking about whether they are appropriate” (even if in practice that is what many of us are doing, myself included)
- Standardized workflows are useful for reproducibility. Outliers that screw up data analyses are like bugs in computer code. And as anybody who’s tried to get IT help knows, one of the first things we’re asked to do is to reproduce the bug.
What I especially like is his call for IDA to be a bigger part of existing workflows. That is to say, when designing a data analysis pipeline, one should think about how to incorporate diagnostic checks and interactive analysis steps along the way, as a sort of heuristic debugging process. My hunch is that most people already do this, but the challenge is to formalize it as part of the process. That’s definitely something I’ll think about as I go about analyzing my own data.
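To make the idea of a built-in diagnostic check a bit more concrete, here is a minimal sketch in Python. It is not taken from Rafa's post or from any real pipeline: the function names, the threshold of 3.5, and the choice of a median-based outlier score are all my own assumptions for illustration. The point is only that the pipeline reports what looks suspicious and leaves the decision to the person looking at the data.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values with a large modified z-score.

    The score is based on the median absolute deviation (MAD).
    The threshold is an assumed, commonly used rule of thumb.
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)
    if mad == 0:
        # No spread around the median, so nothing can be scored.
        return []
    return [i for i, d in enumerate(deviations)
            if 0.6745 * d / mad > threshold]


def summarize_with_diagnostics(values, threshold=3.5):
    """Hypothetical pipeline step: compute a summary, but also report
    flagged points so that they can be inspected interactively."""
    flagged = flag_outliers(values, threshold)
    kept = [v for i, v in enumerate(values) if i not in flagged]
    return {
        "mean_all": sum(values) / len(values),
        "mean_without_flagged": sum(kept) / len(kept) if kept else None,
        "flagged_indices": flagged,
    }


# Made-up measurements with one suspicious value at the end.
report = summarize_with_diagnostics([10.1, 9.8, 10.3, 9.9, 10.0, 55.0])
```

Nothing is silently thrown away here: the report carries both versions of the mean plus the positions of the flagged values, which is the cue to stop and look at the raw data before going further.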
The necessity of IDA also explains why there’s no such thing as taking “a quick look” at the data to see if there’s something interesting there (also sometimes overheard: “just run it through your pipeline”). I work mostly with genomic data, and most of my time is spent on interacting with the data, determining if a particular question is even appropriate to ask for a particular set of data. “Quick and dirty” is usually more dirty than quick, when all is said and done….
Most regular R users will have felt the influence of Hadley Wickham, whether through the widely-used ggplot2 package that implements the “grammar of graphics”, devtools, plyr, … the list goes on. I was astounded when I first realized that the same person was responsible for all these really useful things.
Most software packages aim at providing tools to make particular tasks easier in a certain language. In comparison, many of the tools that he has developed are in effect streamlining the grammar of the language itself. Once you use ggplot2 and see how intuitive it is to deal with statistical graphics in that way, then the base R plot commands feel impossibly clunky. Similarly, his paper on tidy data and the accompanying tidyr and plyr packages articulate basic ideas about how data should be organized in tables. These are ideas that sound very simple, and most of us have probably had some similar thoughts cross our minds as we struggled to reshape raw data into analyzable form, but I certainly would not have been able to formulate the concepts so clearly or implement solutions to change our relationship to data wrangling.
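For anyone who has not come across the tidy data idea, here is a rough illustration of its core: one observation per row, one variable per column. This is a plain Python sketch rather than the tidyr interface, and the helper `to_long`, the table, and the column names are all made up for the example.

```python
def to_long(rows, id_key, var_name, value_name):
    """Reshape 'wide' records (one column per measurement) into
    'long' records with one observation per row."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue
            long_rows.append({
                id_key: row[id_key],
                var_name: key,
                value_name: value,
            })
    return long_rows


# Hypothetical wide table: the year is hidden in the column names.
wide = [
    {"sample": "A", "2015": 3, "2016": 5},
    {"sample": "B", "2015": 4, "2016": 1},
]

# After reshaping, "year" is an explicit variable with its own column.
tidy = to_long(wide, id_key="sample", var_name="year", value_name="count")
```

In the wide version the year is stored as a column header, so it cannot be filtered or grouped on like any other variable. In the long version every row is a single (sample, year, count) observation.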
The various packages seem to have evolved towards a common style and design philosophy, and late last year most of them were bundled together in a ‘super-package’ called tidyverse. It makes installation much easier, because now you can make sure all these inter-dependent packages are up to date with a single command, and probably makes development easier for him and his team. It also goes together with a book titled R for Data Science that he and a coauthor have just released, which is also available online. Noted here for future reference!