But of Course

At Washington, DC’s direction, dozens of groups operating as 501(c)(4)s were flagged for IRS surveillance, including monitoring of the groups’ activities, websites and any other publicly available information. Of these groups, 83% were right-leaning. And of the groups the IRS selected for audit, 100% were right-leaning.


E Hines said...

And on a related note, the Federal government is closed today, due to the weather.

Evil right wing extremists of the Republican Party interfering with Global Warming and shutting down government, again.

Eric Hines

Eric Blair said...

Meh. It's just some anecdotes.

Cass said...

Actually, 83% and 100% are not anecdotes. They're statistics of the kind I said would be good evidence that we are or are not living in a police state.

Grim said...

He knows that. I'm not sure I buy the distinction, since there's a smooth gear between anecdotes and data at some point. But he's pulling your chain, as he'll admit himself.

Problem is, so what? You've got your data now. What does it change? Can you run it in with the anecdote about the Ohio National Guard? Should you? To make it data, you'd have to decide what would qualify running anecdotes together. It's a problem, a conceptual problem, and a rather tricky one.

Cass said...

"...buy the distinction, since there's a smooth gear between anecdotes and data at some point."

No statistician would tell you that small, non-random samples were anything close to an accurate picture of the entire data set. The only point at which one could say there's a "smooth gear" between a sample and the population would be when the sample is random and includes enough instances to be representative of the population.

Basing analysis on all the data (as opposed to a few stories sensational enough to have made the news) controls for selection bias. That's the very first thing anyone should control for when using a few cases to draw conclusions about a larger data set.
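To make the selection-bias point concrete, here is a minimal sketch in Python. Every number in it is invented for illustration (a hypothetical population of 1,000 raids, 50 of which went wrong), and the names `population`, `news_sample`, and `bad_rate` are assumptions of the sketch, not anything taken from real data. It compares the rate you would infer from newsworthy stories alone with the rate from a random sample of the whole population.

```python
import random

# Hypothetical population: 1,000 raids, 50 of which "went wrong".
# These figures are made up for illustration; they are not real data.
population = [True] * 50 + [False] * 950  # True = raid went wrong


def bad_rate(sample):
    """Return the share of raids in the sample that went wrong."""
    return sum(sample) / len(sample)


# "Newspaper" sample: only raids that went wrong make the news,
# so every case in this sample is a bad one by construction.
news_sample = [raid for raid in population if raid][:10]

# Random sample: every raid has the same chance of being picked.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 200)

true_rate = bad_rate(population)
news_rate = bad_rate(news_sample)
random_rate = bad_rate(random_sample)

print(f"true rate in population: {true_rate:.3f}")
print(f"rate in news stories:    {news_rate:.3f}")
print(f"rate in random sample:   {random_rate:.3f}")
```

The news-only sample reports a rate of 100% no matter what the true rate is, because the selection rule guarantees it. The random sample will usually land near the true rate, though any single draw can be off by a few points.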

Grim said...

I'm not a statistician, of course. I'm talking about what must be true for the method to be true, which is to say that I'm talking about what philosophers call ontology.

What I'm suggesting is that statistics don't actually exist. What actually exists is the cases, which by nature are individual (what we are calling anecdotes). The statistics are an abstraction from what really exists, and as such don't exist in the same way.

To get to statistics, then, there must be a gear that operates between the cases and the thing being generated from them. This is called a Sorites Problem (or Sorites Paradox), which can be described as a problem of vagueness. How many hairs can a man have on his head and still be bald? One? Two? A hundred? A thousand?

The concept 'that he is bald' doesn't exist in the same way that the man and his hairs exist. Whatever level of existence it has is derived from the actual existence of the man, and his hairs. To reason from the concept isn't wrong, but it is a step removed from the actual reality that justifies it.

Statisticians fall in love with the data and the things they can do with it, and I understand that. Ideas are beautiful, sometimes. But it's helpful to remember that the data doesn't (note singular verb) exist. The data do; 'the data' doesn't. It isn't real, and our conclusions drawn by abstracting from it are two steps removed from the things that are.

Cass said...

No, statistics are NOT an abstraction. In many cases they're an accumulation or a summary-level view.

If you have only ever seen 4 men and all of them happen to be bald, you might assume that all men are bald. But you would be wrong - they're not.

If, however, you had only ever seen 4 men but you also read that 15% of all men are bald, you'd know the 4 men you know were not really representative of "all men".

The statistic is an accurate representation of "all men" in a way that your 4 anecdotal cases are not.

Grim said...

True. However, the reason justifying that claim is the existence of actual other men.

You used the word "representation." This is a word with a substantial history -- Kant relies on it in responding to Hume, for example. Hume's point is that we might have "impressions" of physical things, direct and immediate -- and anything beyond that, any concepts we have of how things might be, derive from these impressions. So we might speak (to use one of his examples) of a "Golden Mountain"; but that needn't imply that anything like a Golden Mountain exists. What's really going on, he thought, is that we have an idea of what it is to be golden (perhaps from seeing and touching a ring), and an idea of what it is to be a mountain, and we're just recombining these ideas. We are re-presenting them to ourselves. (Aristotle makes a similar point about the function of the imagination in showing us what is essential to concepts: we can re-image a table as blue instead of brown, and find it still works as a table; but if it is slanted and unable to hold things up, it isn't a table anymore. That shows us that flatness and stability are essential to 'being a table,' but color is accidental.)

Kant makes the point that representation (re-presentation) involves the mind making a unity out of several different things. I have a sight of you, I hear your voice, I may feel your touch, but my mind unifies this into a single conceptual object I call "you." This is a necessary condition for me experiencing "you" at all.

Statistics are a second re-presentation. They aren't the sight of a man with a bald head; they aren't even the unified experience of a "you" who is bald. They're a further abstraction to an idea of "men," some of whom are one way and some of whom are the other.

But if (and insofar as) the statistic has real existence, its existence is derived from the men. Its reality is only as an idea abstracted from the many actual physical men in the world.

Cass said...

Grim, I understand that lots of ordinary words have specialized meanings in philosophy (as they do in law). But I'm not a philosopher, and to take my meaning as somehow being defined by a field I know very little of is to distort my meaning.

Statistics have two broad uses:

1. To describe "what is" (descriptive stats). So, "15% of men are bald and 85% of men are not" comes from observations of - hopefully - a great many men.

Descriptive stats can only be used to draw conclusions about the group used to compile them. So, Eric's stories can reasonably be used to draw conclusions about SWAT-type raids that have gone wrong somehow. They CANNOT be reasonably extended to all SWAT-type raids because they leave out raids that were conducted properly and used appropriately.

The reason they can't be used for that purpose is that the sample is biased - if you use a sample of only bald men to draw conclusions about the scalps of all men, you'll wrongly decide no men have hair because no one in your sample did. It's incomplete.

2. To make predictions or draw inferences (inferential stats). To decide whether bad SWAT raids have reached an unacceptably high frequency, we need to know what percentage of all raids go wrong/are used inappropriately AND we need to define a threshold for that decision.

We can't use a sample of newspaper stories that doesn't include ANY raids that were conducted appropriately.

To use the term "representative" with respect to stats, we'd want to know how "representative" bad raids are of the entire population of raids. If our "wrongness" threshold is defined as "even one bad raid", then a single anecdote will suffice.

But you wouldn't accept "even one wrongful shooting" as a reasonable threshold for handgun ownership, would you? If a single death is an unacceptably bad outcome, one might point out that a failure to conduct a raid that resulted in a single death was also unacceptable. This is why these questions aren't simple.

Cass said...

On balance, I see several things wrong with my comment that I'll have to clarify/correct later. For instance, here I should have said:

Eric's stories can reasonably be used to draw conclusions about the group of anecdotes he cites. But they can't even necessarily be used to extrapolate to the larger group of all SWAT-type raids that have gone wrong somehow (much less the even larger group of wrongful and not-wrongful raids).

Never comment at the end of a long day. My eyes are just bad enough that typing into these little boxes is an error-prone endeavor. Please disregard until I have time to frame things less sloppily :)

Grim said...

I'm not trying to distort your meaning, but to explain why my view of statistics is different from yours. You take them as being closer to the truth, because they capture more of the data (or at least hopefully-unbiased random samples); I take them as being further away from it (because they are less real than the things they model).

There's something going on when you re-present a thing as something else. Kant's point about representation is very strong: we can't actually know anything about the reality of things as they are in the world, he claims, because we only have access to the compilations, the "representations." I think that's too strong, because although he's right that we only have representations, we can come at them in different ways to see if they confirm each other -- we can, for example, use scientific instruments operating on different wavelengths to gain information that our vision isn't providing.

Statistics seem to me to be a tool like the scientific instrument. They aren't necessarily wrong or bad or unreliable, but they're an artificial tool providing an artificial product. We may find it very useful, but the data about the thing isn't the thing itself.

Grim said...

There's something similar going on, really, any time we make logical objects out of actual ones. To make a thing into an object to which you can apply logic, you have to subsume it under a kind of universal category. We can do that freely with objects of thought -- "all squares have four right angles" -- because they exist in that way. But to make actual objects into objects of thought, we have to strip away some part of their character so that they fit the universal category.

(Baldness is even harder, because we can't define exactly what would cause you to be included. To say that 15% of men are bald is almost to speak nonsense, because we can't say just what it is to be bald. We could say that 15% of men are incontestably bald, perhaps; but there's a certain number we can't be sure about one way or the other.)

Grim said...

I think I've expressed these ideas quite badly, so it's probably not clear what I mean to say. I'll try again tomorrow. :)

Cass said...

Grim, you're doing exactly what frustrates me about philosophy: you're losing sight of the forest for all the trees.

No one is interested in counting hairs on scalps. We are interested in the larger question of men who have lost the hair they had when they were younger, and how many of them there are out there. Getting wrapped around the axle by quibbling about what precisely counts as "bald" is really a side issue.

You can't talk about the essential character of a large, heterogeneous population because they don't share one. It's more accurate to carefully document their differences and the relative frequency of differing states.

Statistics does that. There really is no other way to do it that I am aware of. Philosophy can't tell me anything about how many men lose their hair as they age or how likely it is that a particular man will lose his hair. That's why we have different tools for different tasks.

A hammer is not suited for backing wood screws out of a block of wood because it wasn't designed for that task.

Grim said...

I think we're in danger of talking past each other (as T99 often puts it). I'm not talking about what the tool is useful for, but about what kind of thing is real. Logic is a really useful tool also, but there's a grave danger involved in mistaking the logical object for the real object.

Thus the analogy to scientific tools, which in a way let us get at things that Kant thought we couldn't get at. We can get past our impressions of objects in part because we can test those impressions against data from sources we wouldn't normally have access to via our senses. Those are also tools that are useful for things. And they really are useful -- there are things you can't do without the right tool.

Nevertheless, to convert actual events into data subject to statistical analysis means altering the real events, turning them into a new kind of object. You have to strip out a lot of the information that the actual event had, because you can only compare the events across a large body of evidence if you can render them 'the same' in some important way. (This is what I meant by saying that they have to be made into an object with a kind of universal character.)

Since we're talking about SWAT raids, we want to know (a) how many are there, and (b) how many go bad? Now in order to do that we have to decide how to streamline all these actual events, by deciding what to cut out.

For example, we have to decide just what it means for a raid to "go bad." This is why I brought up the baldness issue, because it's a very similar issue. Do we mean that there was a plausible accusation of wrongdoing? If so, plausible to whom? Do we mean an actual investigation, and if so do we mean internal or only those that produced a criminal investigation by an independent body? Whichever we mean, do we include the ones where the department has a prudential policy of always investigating claims, or are we only interested in ones where the department's hand was forced because the facts seemed so alarming? Maybe we only want convictions?

Of course, we could set up several categories, and track for each. But we still have to decide which of the cases fits in each category.

We end up with a kind of picture, but it's not the same picture we'd get if we studied each case individually (even if we studied exactly the same cases). We lose a lot of information. The cases are altered in subtle ways by being sorted into pre-determined categories. And ultimately, we're talking about a report that is as much about our pre-determined decisions about what kinds of categories we'd take as valid, as it is about the cases as they existed before we applied our analysis.

But the thing that's real was the cases. It was the living men who got shot. That was the real thing.