A nice-sounding worship session in Frankfurt, Germany caused over 100 people to get covid-19 because of the singing. This is a stub for an entry on the problem of Signal and Noise. The problem is growing. It started around 1500. Before then, you could read every book in your town, and read them all carefully. By 1900, Nietzsche was warning about the timewaste and crud of newspapers. By 1950, Leo Strauss was telling us to become friends with a few Big Books rather than acquaintances with many little ones. Now in 2020 we are frazzled by Information Overload.
This comes up in all kinds of ways. It is killing people now, literally, via covid-19. We demand info from the media and the web, and they give it to us: way too much. We’d be better off with just the best 1%. We’d probably be better off with a random sample of 1%. Possibly we’d be better off with the worst 1%. Explaining the worst 1% is enough to understand my point about the rest. If we just had the worst 1%, we who know how to think critically (admittedly a tiny minority) would process it more slowly and critically and would extract what is correct from the dross. Our common sense and our IQ would combine with the actual information. And then we’d tell our less intellectual friends what to do. Instead, we are too busy reading to do any thinking.
This is, of course, the role of society’s leaders, including our formal government. I should be writing on the Centers for Disease Control today (and I will, I promise, Mark!). One of the main functions of the CDC in an epidemic is to process information. In fact, processing it is a more important function than creating it. Just this week, The Atlantic revealed that the US case count published by the CDC is completely messed up. The CDC took in case counts from the state health departments, which is the way it ought to be set up, just as the FBI takes in criminal reports from city police departments in creating the Uniform Crime Reports. But state health departments reported the data differently, and the idiots at the CDC seem in many cases not to have noticed and in all cases not to have cared. What some states did was to lump together (a) people who tested positive for the virus (the nose test, which tests for people with the germ in their sputum, people whom you might loosely say “are sick” even if they have no symptoms), and (b) people who tested positive for the antibody (the blood test, which tests for people with antibodies to the particular germ in their blood, people whom one can say “have recovered”). To confuse those things is unbelievably stupid, though I guess I shouldn’t say that since actually I can believe a small-state health bureaucrat might do it. What is really hard to believe, but apparently happened, is that the professional epidemiologists at the CDC didn’t notice. Again we see that epidemiologists are, as a group, no better scholars than chiropractors or climate scientists, though less political than climate scientists and with theory less silly than the chiropractors. (It pains me to say this, since my late father-in-law was an epidemiologist, after all, but it clearly is a sick field, however clever and brilliant many individuals in it may be. Same for climate science. I don’t know about chiropractic.)
The result, though, is that the CDC has been mixing apples and oranges when it reports covid-19 prevalence. It’s like adding up everybody who has a cold and everybody who had a cold sometime in their life and reporting that 99.9% of Americans have colds.
Now let’s link that back to Signal and Noise. By mixing the signal (the number of people who test positive for the virus) with the noise (irrelevant stuff like how many people have the antibodies), the CDC has “corrupted the signal”, as they say. We could use fancy math techniques to extract the signal from the noise, to some extent, just as a radio extracts a Mozart symphony from the multitudes of electromagnetic waves passing through your house. In fact, this example makes for a nice problem. I think I’d tackle it by finding the correlation between current cases and past cases and deflating the combined number accordingly, with a continually changing deflator that adjusts for the different growth and decline of cases coming from different states. It would make a nice programming exercise too for my children, who are learning Python 3 this summer. Ben decided to do Codecademy’s Data Cleaning course, so it’s quite relevant. Just an hour ago I was explaining how in economics research you learn with experience that half your time is spent cleaning the data of typos and crud rather than on what seems hard and glamorous and important to the novice, applying a clever econometric method.
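Here is a rough Python 3 sketch of what such an exercise might look like. It is not the correlation-based method I described, just a simpler stand-in: it assumes that some states report virus tests and antibody tests separately, uses them to estimate what share of each week's lumped total is really current cases, and applies that week-by-week deflator to a state that lumps the two. The function names and every number are made up for illustration, and the big assumption (that the lumping state looks like the reference states) may well be wrong.

```python
def estimate_deflators(reference_weeks):
    """Share of viral positives in the lumped total, week by week.

    reference_weeks: list of (viral_positives, antibody_positives) tuples
    from states that report the two tests separately (hypothetical data).
    """
    deflators = []
    for viral, antibody in reference_weeks:
        combined = viral + antibody
        # Guard against a week with no positives at all.
        deflators.append(viral / combined if combined else 0.0)
    return deflators


def deflate(combined_counts, deflators):
    """Scale each week's lumped count by that week's deflator."""
    if len(combined_counts) != len(deflators):
        raise ValueError("need one deflator per week")
    return [count * d for count, d in zip(combined_counts, deflators)]


# Made-up numbers, for illustration only.
reference = [(900, 100), (800, 200), (600, 400)]  # states reporting separately
lumped = [1000, 1500, 2000]                       # a state that lumps the tests

deflators = estimate_deflators(reference)
estimated_current_cases = deflate(lumped, deflators)
print(deflators)
print(estimated_current_cases)
```

Notice that the deflator falls over time in this toy data: as more people recover, antibody positives make up a bigger share of the lumped number, so a single fixed correction would not do.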
This post is just a stub because CDC data is not the example I started with, and the topic of Signal and Noise might even require a book to properly address. Remember the Frankfurt singers? I’d like to find out more, but I bet they didn’t realize that there’s a lot of information out there saying that Singing is one of the worst things for spreading disease, along with Sneezing, Coughing, and Spitting. Singing, if you are enthusiastic and know how to do it properly, projects a lot of droplet-laden air from deep in your lungs. (When I do it, on the other hand, it projects just a little bit of hesitant air from the front of my mouth, but that’s why I’m not in the choir.)
Before the current epidemic, I didn’t realize that singing was dangerous this way. Did you? This is extremely useful information. To be useful, information has to change your behavior. This is something I used to teach MBA students in the second week of their core Microeconomics class, though it’s really, like present discounted value, Decision Theory. If knowing whether X is true or not wouldn’t affect your decision on whether to do Y or Z, info on X has zero value: you should not be willing to incur any cost to learn about X. If you could pay $20 to learn whether locating your new factory in Bombay instead of Calcutta has a 100% chance of being the correct decision and saving you $500 million instead of a 60% chance of being correct, you should save the $20 and buy a cigar, since either way you will end up locating the factory in Bombay. That is an example of important but useless information. There is lots of info on the covid-19 virus which is like that. I can’t think of a good example now. Say, for the ordinary person, the question of whether covid-19 started from selling sick wild bats for meat, from accidental release from the Wuhan government lab, from intentional release, or from intentional release by the US army. I’d like to know the answer, but it’s really just for my entertainment, since knowing the answer wouldn’t change a single thing I do to try to protect myself from the virus.
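The factory example can be checked with a few lines of Python. This is only a sketch under assumptions I am adding: the function name is made up, payoffs are in millions of dollars, the right site saves $500 million and the wrong one saves nothing, and the two possible answers (100% or 60%) are each given a 50% chance of being what you learn. The value of information is what you expect to gain by deciding after you learn the answer rather than before.

```python
def value_of_information(scenarios, saving):
    """Expected gain from learning which scenario holds before choosing a site.

    scenarios: list of (chance of this scenario, P(Bombay is right | scenario)).
    saving: payoff (in $ millions) from picking the right city; the wrong
            city is assumed to pay nothing.
    """
    def best_expected_saving(p_bombay):
        # Pick whichever city has the higher expected saving.
        return max(p_bombay * saving, (1 - p_bombay) * saving)

    # Without the information, decide on the average probability.
    p_average = sum(chance * p for chance, p in scenarios)
    without_info = best_expected_saving(p_average)

    # With the information, decide separately within each scenario.
    with_info = sum(chance * best_expected_saving(p) for chance, p in scenarios)
    return with_info - without_info


# The case in the text: 100% or 60%. Bombay wins either way.
useless = value_of_information([(0.5, 1.0), (0.5, 0.6)], saving=500)

# A contrasting made-up case: the answer might be 20%, which flips the choice.
useful = value_of_information([(0.5, 1.0), (0.5, 0.2)], saving=500)

print(round(useless, 6), round(useful, 6))
```

In the first case the value comes out at zero (up to rounding), so even $20 is too much to pay. In the second, learning the answer would sometimes send you to Calcutta, and only then is the information worth paying for.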
Knowing that singing is dangerous, though, is highly useful. I am a churchgoer, in normal times. Now when I sing in a group of people I will wear a mask (God can still hear me) and not sing towards other people, or position myself in front of them. I will do it outside. Maybe I won’t sing at all. There are lots of simple, easy, avoidance decisions I can make.
That’s the Signal: “singing is dangerous in a time of plague”. The Noise is all the other claims we hear, not just the false ones but true and irrelevant ones like news about covid-19’s origins, whether the vaccine will take 12 or 18 months to develop, whether Trump was wearing a mask golfing yesterday, etc. Important but useless information is Noise, not Signal. So too is information that is useless because we already know it. That people who have uncontrollable coughs should stay home is very important, but it is so obvious that when someone tells that to me it is Noise, not Signal.
The CDC should be in the signal extraction business, and should transmit Signal to us with a minimum of Noise. It should not tell us, “If you have covid-19, stay home and don’t spit on people.” It should tell us “Singing is a lot more dangerous than you might think.”
The CDC currently has three problems. First, it is so incompetent it can’t even detect the Signal: its people can’t tell true from false. Second, and not unrelated to the first problem, it has zero credibility. The White House (that is, the President’s immediate staff) and the public have noticed that the CDC doesn’t know what it’s doing and have marginalized it, kind of quarantining the CDC after the January test fiasco so as to contain the damage its further mistakes might cause our efforts to fight the epidemic. Letting the CDC stay in charge of reporting disease statistics seemed harmless enough, like singing as compared to coughing, but we were wrong. The CDC quarantine needs to be tightened. We need to put a wall around Atlanta and shoot anybody in a white coat who tries to escape, as creating too much of a risk that he might spread false information throughout the entire population. Remember, the spread of false info is exponential, especially if it gets picked up by a superspreader like the New York Times.
With so much to inspire rhetoric, it’s hard to stay on track. But the third problem is the one I started out to write about. The third problem is that the CDC, and government generally, doesn’t focus, and gives us too much information. It should tell us only what we need to know, what is useful because it is relevant to our behavior. One press release saying “Singing is dangerous” is better than 100 press releases that mix the crucial Signal in with a lot of other stuff that, though true, is Noise.